The Performance Myth

One of the things in computing education that intrigues me is programmer folklore about performance. Students often seem to arrive at university under three misapprehensions:

  1. speed of code is vitally important,
  2. speed of code is largely determined by performance of small aspects (i.e. individual lines) of programs, and
  3. these small aspects are usually integer operations.

There is some truth in the first point, but never as much as students think. Some applications do require speed: High-Performance Computing is built around it, web servers do need to be fast, GUIs do need to be responsive. But in many applications, maintainability of code, good software design, portability and other qualities matter more. (And quite soon, I think speed of code will become secondary to power consumption of code, which is surely an upcoming research area.)

The second point is frequently nonsense. Hopefully we educate it out of them with things like big-O notation and Hoare’s quote about premature [micro-]optimisation being the root of all evil. The lesson we should teach is that you should never put any effort (beyond the bare minimum) into optimisation until you’ve benchmarked. But unless a particular line of code is in a tightly nested loop, it simply can’t make that much difference to the program’s performance.

The third point is horribly outdated. A very common line trotted out is “in for loops, ++i is faster than i++”, based on a quirk of compilers about 20–30 years ago. You would have to go out of your way to find a compiler that generates different code for those two; the same goes for the idea that “i++ is faster than i += 1”. Lots of other optimisations that the students know involve bit twiddling: shift-left being faster than multiplication by powers of two, XOR to swap variables instead of using a temporary. All of these bits of knowledge have been invalidated by compiler optimisations, and now the reverse has become true: these odd “optimisations” can hamper the compiler’s ability to really optimise the code.
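
To make this concrete, here is a minimal C sketch of my own (not taken from any particular compiler manual) that puts the folklore versions next to the plain ones. With any mainstream optimising compiler you would expect each pair to compile to essentially the same machine code, and the plain versions are the ones the optimiser reasons about most easily:

    #include <stdio.h>

    /* The folklore versions. */
    static void swap_xor(int *a, int *b) { *a ^= *b; *b ^= *a; *a ^= *b; }
    static int  double_shift(int x)      { return x << 1; }

    /* The plain versions, which modern compilers optimise at least as well. */
    static void swap_tmp(int *a, int *b) { int t = *a; *a = *b; *b = t; }
    static int  double_mul(int x)        { return x * 2; }

    int main(void)
    {
        int a = 3, b = 7;
        swap_xor(&a, &b);
        swap_tmp(&a, &b);

        /* ++i vs i++ vs i += 1: when the value of the increment expression
           is discarded, all three loops generate the same code. */
        int sum = 0;
        for (int i = 0; i < 10; ++i)    sum += i;
        for (int i = 0; i < 10; i++)    sum += i;
        for (int i = 0; i < 10; i += 1) sum += i;

        printf("%d %d %d %d %d\n", a, b, sum, double_shift(5), double_mul(5));
        return 0;
    }

(Asking students to compare the generated assembly themselves, for example with gcc -O2 -S, tends to be more convincing than simply telling them.)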

So where does this knowledge come from? Students seem to arrive with it (and admittedly, often leave with it fairly intact). I don’t think they can have picked it up at school from a teacher, simply because most students arrive without having studied computing at school (which will hopefully change). An obvious explanation is that they learned it from the Internet, but I think the phenomenon predates the growth of the Internet (I remember being told these things by fellow students at university, and they were already myths then). I don’t know the answer, but it remains a source of idle curiosity.

I think the modern line on performance can probably be summed up as:

  1. Performance is usually not as important as other factors such as maintainability.
  2. Performance is usually determined by macro aspects of the program (e.g. choice of data structure) not by micro aspects (e.g. use of floating point).
  3. You should never optimise existing code until you have to.
  4. You should never optimise without benchmarking before and after, to check you were focused on the right part, and actually made a useful difference.

The last one probably puts it out of reach of students, at least until the later years of university education. So if possible, we should all try to dispel the performance myth, and get students focused on writing readable code instead.
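
For what the before-and-after workflow looks like in practice, here is a minimal sketch; the array size, lookup count and use of clock() are just assumptions for the example. It times a whole batch of membership queries under two macro-level choices, a linear scan versus a binary search over a sorted array, which is exactly the kind of difference that dwarfs any ++i-style tweak:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N       100000   /* elements in the sorted array   */
    #define LOOKUPS 20000    /* membership queries to time     */

    /* Macro-level choice 1: linear scan, O(n) per lookup. */
    static int contains_linear(const int *xs, int n, int key)
    {
        for (int i = 0; i < n; i++)
            if (xs[i] == key) return 1;
        return 0;
    }

    /* Macro-level choice 2: binary search on a sorted array, O(log n) per lookup. */
    static int contains_binary(const int *xs, int n, int key)
    {
        int lo = 0, hi = n - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (xs[mid] == key) return 1;
            if (xs[mid] < key)  lo = mid + 1;
            else                hi = mid - 1;
        }
        return 0;
    }

    static double time_lookups(int (*contains)(const int *, int, int),
                               const int *xs, int n, int *hits_out)
    {
        clock_t start = clock();
        int hits = 0;
        for (int i = 0; i < LOOKUPS; i++)
            hits += contains(xs, n, rand() % (2 * n));
        *hits_out = hits;   /* keep the result live so the work is not optimised away */
        return (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    int main(void)
    {
        static int xs[N];
        for (int i = 0; i < N; i++) xs[i] = 2 * i;   /* sorted, even numbers only */

        int h1, h2;
        double t_linear = time_lookups(contains_linear, xs, N, &h1);
        double t_binary = time_lookups(contains_binary, xs, N, &h2);

        printf("linear scan:   %.3f s (%d hits)\n", t_linear, h1);
        printf("binary search: %.3f s (%d hits)\n", t_binary, h2);
        return 0;
    }

The point is not the exact numbers; it is the habit of measuring the whole task, changing one macro-level decision, and measuring again before claiming any improvement.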
