Hayes Technologies - Software Speed Optimization

Software Speed Optimization · Performance Optimization · High Performance Computing · Number Crunching · C/C++ · Assembly/Assembler · SIMD · MMX · SSE · SSE2 · SSE3 · 3DNow!

Software Speed Optimization

This page discusses and explains the key issues of Software Speed Optimization.

What is Software Speed Optimization?

Software can be optimized in a number of ways:

  • For memory consumption
  • For ease of use
  • For (execution) speed
  • For maintainability
  • For latency (time to respond to external events; important for real-time software)
  • ...

Optimizing software for execution speed, or Software Speed Optimization for short, means reducing the time a piece of code takes to execute by designing and implementing the software in an appropriate manner.

This should not be confused with another type of speed optimization: tuning a software package through its configuration settings. This kind of optimization could be called "Software Package Speed Optimization". One example is optimizing a database application by configuring the database or the data types it uses. Another example is the "tuning" of an operating system by setting the size of swap files or of file, font, and other caches.

The difference between the two is that "Software Speed Optimization" deals with the source code of the software actually being executed by the hardware, whereas "Software Package Speed Optimization" does not.

In the context of Hayes Technologies, the term "Software Speed Optimization" is used only according to the definition above.

What Influences Execution Speed?

Execution speed is essentially determined by:

  • The software configuration of the operating system, including drivers etc.
  • The tools (e.g. compilers) and libraries employed
  • The design of the software
  • The algorithms employed
  • The detailed implementation of the most time-consuming functions, including the corresponding data layout
  • Multi-threading / Multi-Processing / Parallelization

Obviously, the hardware configuration also plays a major role. That topic is outside the scope of Software Speed Optimization; however, the interaction between Software Speed Optimization and hardware optimization must always be considered.

Why Optimize Software?

This is a critical question. You may think:

  1. Our customers do not care about speed. They are happy with the performance they get.
  2. Processors get faster all the time.
  3. Optimizing costs development time; I would rather use a faster machine (when bundling hardware and software).
  4. Compilers are so good today that the software probably cannot be made any faster.

Obviously these arguments are valid in many cases. But let us look a little closer:

  1. Well, if this really is so, then that is fine. But maybe some of the points below apply?
  2. That is certainly true. On the other hand, have you thought about those of your customers with slower processors? Or have you considered that slower processors are cheaper? A number of your potential customers may well have chosen not to buy or upgrade to your software because the total cost of ownership (including hardware or hardware upgrades) would be too high.
  3. This trade-off must always be considered. The key point is that, at a high enough volume, significant cost savings can be achieved if Software Speed Optimization makes it possible to use a cheaper processor.
  4. Compilers have improved significantly over recent years and continue to do so. However, it is a modern myth that simply writing a program professionally and putting it through a good compiler will yield unbeatable software. Almost any software that has not been put through one or more explicit optimization passes by a developer with extensive experience in Software Speed Optimization can be made faster, in most cases significantly (by more than a factor of 2).

And these are not the only arguments for Software Speed Optimization. Here are a couple more:

  1. Higher speed often enables new features or more complex operations. Have you thought about what more your software could do if only it ran faster? Common examples include more complex simulations (e.g. for weather forecasting), higher resolution, or 3D instead of 2D.
  2. For many applications speed is a differentiator. All other things being equal, a faster program will be perceived as the better program.

Another important aspect applies to a whole class of applications, namely those where power consumption is critical:

  1. Faster execution equals lower power consumption (and thereby, for battery-operated devices, longer usage time). This is achieved either by spending a larger share of the time in power-down modes or by using a slower processor, which in turn uses less power and typically costs less.
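
The first of these two mechanisms (more time in power-down modes) can be illustrated with a simple duty-cycle estimate. The following is a minimal sketch in Python, chosen only for brevity; the helper `average_power_mw` and all power figures are hypothetical and serve purely as an illustration, not as measurements of any real device.

```python
def average_power_mw(active_mw: float, sleep_mw: float, active_fraction: float) -> float:
    """Average power of a device that alternates between active and power-down modes."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_mw * active_fraction + sleep_mw * (1.0 - active_fraction)


# Hypothetical figures for illustration only (not measurements):
ACTIVE_MW = 200.0  # power draw while the processor is computing
SLEEP_MW = 2.0     # power draw in power-down mode

# Before optimization the processor is busy 40% of the time.
before = average_power_mw(ACTIVE_MW, SLEEP_MW, active_fraction=0.40)
# A 2x speed-up halves the active time for the same periodic workload.
after = average_power_mw(ACTIVE_MW, SLEEP_MW, active_fraction=0.20)

print(f"before: {before:.1f} mW, after: {after:.1f} mW")
```

Under these assumed figures, doubling the speed roughly halves the average power draw, because the sleep-mode power is small compared to the active power.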

 

And finally, one of the most cost-saving and risk-reducing points has to be made:

  1. In many cases Software Speed Optimization can eliminate the need to switch to costlier, more complex hardware platforms such as multi-processor systems or dedicated hardware.

What Speed-Ups Can be Expected from Software Speed Optimization?

This depends largely on the specific context and on the effort put into the Software Speed Optimization. As a rule of thumb, speed-up factors in the range of 2 to 5 are possible in many cases, and significantly higher factors in some, provided that the functions or modules in question are not already highly tuned (e.g. math routines from the compiler manufacturer), the task is not I/O-bound (limited by the speed of the memory, hard disk, etc.), and the task is not very simple (e.g. element-wise addition of two arrays, matrix multiplication, etc.). Assuming a 2 GHz machine as the base, that is as if your software ran on a machine with a clock speed of up to 10 GHz, or in some cases even more than 20 GHz!
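
The clock-speed comparison above is simple arithmetic: the base clock multiplied by the speed-up factor. A minimal sketch in Python (used here only for brevity) follows; `equivalent_clock_ghz` is a hypothetical helper, and it assumes that run time scales inversely with clock frequency, which real machines only approximate.

```python
def equivalent_clock_ghz(base_clock_ghz: float, speedup: float) -> float:
    """Clock speed at which the unoptimized code would match the optimized code.

    Assumes run time scales inversely with clock frequency (a simplification).
    """
    if base_clock_ghz <= 0 or speedup <= 0:
        raise ValueError("clock speed and speed-up factor must be positive")
    return base_clock_ghz * speedup


if __name__ == "__main__":
    for factor in (2, 5, 10):
        ghz = equivalent_clock_ghz(2.0, factor)
        print(f"{factor}x on a 2 GHz machine is comparable to {ghz:.0f} GHz")
```

A factor of 5 on a 2 GHz base thus corresponds to 10 GHz, and factors above 10 correspond to more than 20 GHz.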

 

Why can the speed-up factor be so high? The key reasons for this are:

  • Typical software is not really optimized by the programmer
  • A compiler cannot influence a number of key aspects of optimization, such as data layout, data types, the required precision of algorithms, or the basic algorithms employed
  • A compiler does not know a number of key facts, such as how often an if expression is true or with what parameters a function is called (although a compiler using profile-guided optimization could know)
  • A compiler does not understand the true intention of the code, which often makes more complex optimizations impossible
  • A compiler always has a more limited view of which instructions can be employed (especially regarding SIMD instructions); basically, a compiler always comes up with a very schematic solution to each task
  • Compilers do not measure the performance of different code variants and so do not select the best code
  • Compilers always have fewer optimization strategies at hand than an optimization expert
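
Two of the points above, the choice of basic algorithm and the measuring of code variants, can be illustrated with a small experiment. The sketch below is written in Python only for brevity; the function names are hypothetical, the workload is a toy example, and the measured factor will differ from machine to machine. It times a straightforward loop against a closed-form variant of the same computation, a substitution that an optimizing tool cannot be relied upon to make.

```python
import timeit


def sum_of_squares_loop(n: int) -> int:
    """Straightforward variant: add up i*i for i in 0..n-1."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_closed_form(n: int) -> int:
    """Algorithmic variant: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def best_time(func, arg, repeat: int = 5, number: int = 10) -> float:
    """Best-of-`repeat` wall time in seconds for `number` calls of func(arg)."""
    return min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number))


def speedup(baseline, candidate, arg) -> float:
    """Speed-up factor of `candidate` over `baseline` for the given argument."""
    return best_time(baseline, arg) / best_time(candidate, arg)


if __name__ == "__main__":
    n = 100_000
    # Only compare timings of variants that produce the same result.
    assert sum_of_squares_loop(n) == sum_of_squares_closed_form(n)
    factor = speedup(sum_of_squares_loop, sum_of_squares_closed_form, n)
    print(f"speed-up: {factor:.0f}x")
```

Taking the minimum over several repetitions reduces the influence of other activity on the machine; checking that both variants return the same result before timing them guards against "optimizing" into a wrong answer.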

Please also see the case studies, which list the speed-up factors achieved, and the technology section, which explains some of the reasons for the speed-ups in more detail.

 

Hayes Technologies can analyze the potential for speed-up and give you an estimate of the required effort.

Why Hire an Expert for Software Speed Optimization?

In principle, nothing stops you from tackling Software Speed Optimization yourself. In reality, however, only very few teams or organizations address this task in an appropriate manner, and not without reason: Software Speed Optimization is a discipline in its own right that requires extensive knowledge and experience, similar perhaps to writing device drivers or operating system kernels.

In addition to standard programming skills, an expert in Software Speed Optimization must possess:

  • a number of years of high-level and assembly-level programming experience
  • a deep understanding of the inner workings of the relevant processor families and their assembly languages, specifically including newer instruction set extensions
  • a deep understanding of memory hierarchy / caching issues
  • knowledge of and experience with profiling and timing
  • an intimate knowledge of Software Speed Optimization techniques
  • multiple years of experience with Software Speed Optimization

 

Typically this means that either this skill set is not available in-house, or it is available but the opportunity costs are very high: the most experienced developers cannot focus on other important tasks, which probably leads to delays, sub-optimal software design, down-scaling of features, etc. and, as the final consequence, reduced earnings.

Taking all of these issues into account, hiring an external expert will in most cases be not only justified but required.

Platforms: x86 · Pentium · Pentium MMX · Pentium II · Pentium III · Pentium 4 · Core · Core 2 · Xeon · Itanium · Athlon · DSPs · Embedded CPUs · Windows · Linux · RTOSs

Especially Benefiting Application Areas: Image Processing · Signal Processing · High Performance Computing / Number Crunching · Simulations · Compression · Games · 3D Software · Device Drivers · Multi-processor Systems · Multi-Computer Systems / Clusters · Embedded Devices · Real-time Systems · Interactive Systems · And many more...