How to Optimize Code - a General Approach and Principles

In case it is of use, here are some general practices and principles to guide work on optimizing source code to perform better.

Essentially, optimization work involves a combination of:

a) working from principles - for example, examining the algorithm, working out its big-O complexity, and re-thinking the algorithm.

b) working from empirical testing - for example, running a performance test, ideally with a profiler, and observing the results before and after any code change.

Both techniques are essential!
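
As a rough illustration of combining the two techniques, the Python sketch below re-thinks an O(n^2) duplicate check as an O(n) one and then measures both versions with the standard timeit module. The function names and the test data are hypothetical, chosen only for this example.

```python
import random
import timeit


def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n) on average: one pass, remembering seen values in a set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


if __name__ == "__main__":
    # Hypothetical test data: 1,000 distinct integers (worst case: no duplicates).
    data = random.sample(range(1_000_000), 1_000)

    # Check that both versions agree before comparing their speed.
    assert has_duplicates_quadratic(data) == has_duplicates_linear(data)

    before = timeit.timeit(lambda: has_duplicates_quadratic(data), number=3)
    after = timeit.timeit(lambda: has_duplicates_linear(data), number=3)
    print(f"quadratic: {before:.4f}s  linear: {after:.4f}s")
```

The big-O analysis predicts that the second version should win on large inputs; the timing confirms (or refutes) it for the data actually used.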

1. Try to get data that is as close as possible to Production, both in terms of volume and structure.

2. Use a profiler to find the bottlenecks in the code - ideally on hardware that is close to what is in Production.

3. Review the code. A good code review can identify likely places to optimize. Ideally, work out the big-O complexity of the algorithm.

3b. One heuristic is that simpler code tends to perform better, although of course this is not guaranteed.

4. Optimize the worst performance problems first, as identified by profiling or, if that is not possible, by code review. This maximizes the return on investment and reduces the risk of introducing bugs or further performance issues with different data.

5. Avoid optimizing without first measuring the benefit - otherwise such changes can introduce accidental complexity and may in fact make performance worse under different conditions. This is somewhat similar to “avoid premature optimization”…

6. Principle: avoid unnecessary work. For example, if you only need to compare sizes, there is no need to compute the exact size. Example: to check whether the radius of Circle 1 is greater than the radius of Circle 2 when the areas are already known, it is enough to compare the areas; there is no need to calculate either radius.

7. Execute cheaper operations first. In particular, where boolean AND logic (&&) is involved, a cheaper comparison placed first will short-circuit the more expensive comparisons whenever it fails.

8. Principle: avoid repeated work. This includes simply avoiding re-iterating the same collection. It also includes iterating over shorter collections first!

9. Principle: always re-test the performance - do not make assumptions about the possible performance benefit. Even experienced developers find it difficult to predict the impact of a code change on performance, especially if a lot of legacy code is being reused or if data sets vary a lot.
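
For the advice above about using a profiler to find bottlenecks, here is a minimal Python sketch using the standard cProfile and pstats modules. The workload (process and slow_sum_of_squares) is a hypothetical stand-in for real code, and profile_report is a helper invented for this example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately wasteful: builds a throwaway list on every iteration."""
    total = 0
    for i in range(n):
        total += sum([i * i])
    return total


def process(n):
    """Hypothetical entry point standing in for the real workload."""
    return slow_sum_of_squares(n)


def profile_report(func, *args, top=5):
    """Run func under cProfile; return (result, text report of the costliest calls)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    result, report = profile_report(process, 50_000)
    print(report)
```

Sorting by cumulative time tends to surface the call chain that dominates the run, which is usually the place to look first.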
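
The circle example given for the "avoid unnecessary work" principle can be sketched in Python as follows. The function names are hypothetical. The shortcut relies on area increasing with radius for non-negative radii, so the ordering of the areas matches the ordering of the radii.

```python
import math


def radius_from_area(area):
    """Exact radius; needs a division and a square root."""
    return math.sqrt(area / math.pi)


def has_larger_radius_slow(area_1, area_2):
    """Computes both radii just to compare them."""
    return radius_from_area(area_1) > radius_from_area(area_2)


def has_larger_radius(area_1, area_2):
    """Compares the areas directly; no radius is ever calculated."""
    return area_1 > area_2
```

Both functions answer the same question, but the second skips two divisions and two square roots per comparison.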
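
The "execute cheaper operations first" advice might look like this in Python, where the short-circuiting AND operator is spelled `and` rather than `&&`. The record layout, the 254-character limit, and the regular expression are assumptions made for illustration; the regex merely stands in for any expensive check.

```python
import re

# Simplistic pattern, for illustration only (not a full e-mail validator).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(text):
    """Stand-in for an expensive check (could equally be a database or network call)."""
    return EMAIL_PATTERN.match(text) is not None


def is_candidate(record, expensive_check=looks_like_email):
    """Cheap comparisons run first; expensive_check runs only if they all pass."""
    return bool(
        record["active"]                      # cheap: flag lookup
        and len(record["email"]) <= 254       # cheap: length check
        and expensive_check(record["email"])  # expensive: reached only if the above hold
    )
```

For an inactive record the function returns after the first test, so the expensive check is never called.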
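
One possible reading of the "avoid repeated work" principle is sketched below in Python. The first pair of functions shows re-scanning a collection for every item versus iterating it once to build a lookup; the last function shows looping over the shorter of two collections. The data shapes (tuples of id and amount, id and name) and all names are hypothetical, and customer ids are assumed to be unique.

```python
def total_by_customer_rescan(orders, customers):
    """Re-scans the customer list for every order: O(len(orders) * len(customers))."""
    totals = {}
    for customer_id, amount in orders:
        for cid, name in customers:
            if cid == customer_id:
                totals[name] = totals.get(name, 0) + amount
                break
    return totals


def total_by_customer_indexed(orders, customers):
    """Iterates the customers once to build a lookup, then the orders once."""
    name_by_id = dict(customers)  # assumes customer ids are unique
    totals = {}
    for customer_id, amount in orders:
        name = name_by_id.get(customer_id)
        if name is not None:
            totals[name] = totals.get(name, 0) + amount
    return totals


def common_items(a, b):
    """Given two sets, loops over the shorter one and probes the longer one."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    return {item for item in shorter if item in longer}
```

Whether the indexed version pays off depends on the sizes involved, so it is still worth timing both on realistic data.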

If you have suggestions on improving this list, or other principles or heuristics that can be added, please comment below. Thank you!

