
Programming and optimizing C code, part 5

Alan Anderson, Analog Devices 3/19/2007 3:00 AM EDT

Code Speed vs. Space

Small code has many benefits. Smaller code fits better into internal memory, so it can raise the speed of the application. It also reduces the need for external memory, thereby reducing the cost and power consumption of the system.

Compilation tools will often assist you in optimizing the size of your program. You may request optimization for maximum speed, for minimum code size, or for a trade-off between the two. Often you can give the compiler guidance at a lower level: you can decide that one file should be compiled for space and another for speed, and you can even control how individual functions are optimized by using pragmas. Figure 3 shows the results we obtained from switching certain optimizations on and off to uncover the differences between "compiled for speed" and "compiled for space".

Figure 3. Differences in file size for different optimization options.

Two optimization options caused significant differences in the code size. The first culprit was function inlining. That was a bit of a surprise, and a warning to discover exactly how aggressive your compiler is when it is told to go for speed at all costs. The other code-expanding optimizations mostly come from heavy optimization of loops, which relies on techniques such as loop unrolling and software pipelining. Of minor interest was the 2% gained by arranging data accesses to maximize the use of 16-bit data offsets rather than 32-bit address calculations.

A small thing to watch out for is the fact that memory has edges. Programmers tend to place data arrays at the first address of a memory region or butt them up against the last address. However, this creates problems for software pipelining. Software pipelining is the most powerful technique the compiler has for optimizing tight loops, because it allows the DSP to fetch the next set of data while processing the previous set. When a data array is placed on a memory edge, the last iteration of the loop will attempt to fetch data that lies outside of the memory space, causing an address error. To avoid this problem, compilers reduce the loop count by one and execute the last iteration as an epilog to the loop. This creates safe code, but it adds many instructions. To avoid the code bloat, the Blackfin compiler has an "-extra-loads" option, which tells the compiler it is safe to load one element past the end of an array.

Advanced compilers attempt to provide an intelligent blend of space- and speed-sensitive optimization. For example, the Blackfin compiler accepts a target anywhere from 0% to 100% space-sensitive optimization. The compiler combines the target it is given with its own understanding of how space-expensive each of its optimizations is. The compiler also evaluates the execution profile of the application as discovered under simulation: blocks of code that are infrequently executed are compiled to save space rather than to maximize speed. All of this results in a very flexible solution.

Naturally we want to know if it works. In Figure 4 we graph the response for a test program, in this case a JPEG compression (cjpeg) program. The graph shows that at the extremes of minimizing space or maximizing speed we get very poor returns. Several points are clustered near the origin, corresponding to target values of 30% through 70%; any of those give a good tradeoff between speed and space.

Figure 4. Results for speed/size compiler tradeoffs. On both axes, a lower number represents a better result.

If you are not comfortable with fully automatic optimization, or you cannot use simulation, you can approach this problem manually. Figure 5 shows what happens when you take each of the files comprising this application individually, compile them for speed and for space, and measure the effects for each file.

Figure 5. Results of optimizing individual files for speed and size.

The "% of avail" column shows how much each file contributes to the overall optimization; for example, fileio.c accounts for 24.95% of the total available speedup. In the "speed" column we see the difference between optimizing for speed and optimizing for space. Only the files highlighted in yellow show a significant performance effect, which tells us that most of the application could be compiled neutrally or to save space. Similarly, the right-hand columns show how code size varies under the same conditions; only the files highlighted in blue show a significant effect. The only files that need careful consideration are those highlighted in both yellow and blue; for the others, the best optimization settings are obvious. Interestingly, most of the files highlighted in yellow do not match up with the files highlighted in blue. This demonstrates that a little analysis can substantially reduce the complexity of the optimization choices.

We end this series where we began. To optimize C code successfully, your efforts must be applied intelligently. You should base your efforts on study of the application and the target processor, rather than optimizing indiscriminately.
