Mainly Tech projects on Python and Electronic Design Automation.

Tuesday, November 08, 2011

Should you worry about a 2x speedup?

Let's take as context an implementation of a task for Rosetta Code, a site set up to compare how different programming languages are used to implement the same task, with over five hundred tasks and over four hundred languages. Suppose you have two implementations of a task and one runs twice as fast as the other: which should you use?

My short answer would be: it depends! You need to:
  • Read all of the task description,
  • Read some of the solutions in other languages,
  • And maybe skim the task's talk page.
That is, ensure you know what the task is asking for and then verify that both solutions solve the task as stated. It can be very easy to misinterpret what the task is asking for. For example, if a task asks for a particular algorithm, do both of the examples you are comparing use that algorithm?
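As a minimal sketch of that verification step, assume a made-up task of "list the primes below n" and two hypothetical solutions to it. Neither function comes from Rosetta Code; the names and the task are only for illustration. Before timing anything, check that the two agree over a range of inputs:

```python
from math import isqrt


def primes_trial(n):
    """Hypothetical solution A: primes below n by trial division."""
    return [k for k in range(2, n)
            if all(k % d for d in range(2, isqrt(k) + 1))]


def primes_sieve(n):
    """Hypothetical solution B: primes below n by the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            # Cross out every multiple of i, starting at i squared.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
    return [k for k, flag in enumerate(is_prime) if flag]


# Collect every limit for which the two solutions disagree.
mismatches = [n for n in range(200) if primes_trial(n) != primes_sieve(n)]
print("limits where the solutions disagree:", mismatches)
```

Agreement on output is only half the check. If the task had asked specifically for a sieve, then solution A would not qualify however fast it ran, and no test of outputs would tell you that.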

As well as comparing two implementations for speed, you should also compare them for readability. How well the code reads can have a large impact on how easy the code is to maintain. It has been known for task descriptions to be modified; someone tracking that modification may need to work out if and how the code needs to be updated. If an example is overly complex and/or unidiomatic then it could cause problems.

Time complexity. If one version of the code works better when given 'bigger' data then you need to know more about when that happens: it could be that the cross-over point in terms of speed of execution is never likely to be met. Maybe the size of data needed to reach the cross-over is unreasonable to expect, or other mechanisms come into play that mask the predicted gains. In other words, you might need to verify using that actual bigger data set, to account for things like swapping or caching at the OS and hardware level.
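One rough way to look for a cross-over is to time both versions over a range of input sizes. The sketch below assumes two hypothetical duplicate-detection functions, one quadratic and one set-based; the functions, the sizes, and the repeat counts are all illustrative choices, and the timings you get will depend on your machine.

```python
import timeit


def has_dup_quadratic(items):
    """Hypothetical version A: check each item against all earlier ones."""
    for i, x in enumerate(items):
        if x in items[:i]:
            return True
    return False


def has_dup_set(items):
    """Hypothetical version B: compare the list length with the set length."""
    return len(set(items)) != len(items)


def time_both(sizes, repeat=3, number=5):
    """Return (size, seconds_a, seconds_b) per call for each input size."""
    rows = []
    for n in sizes:
        data = list(range(n))  # no duplicates, so neither version exits early
        t_a = min(timeit.repeat(lambda: has_dup_quadratic(data),
                                repeat=repeat, number=number)) / number
        t_b = min(timeit.repeat(lambda: has_dup_set(data),
                                repeat=repeat, number=number)) / number
        rows.append((n, t_a, t_b))
    return rows


for n, t_a, t_b in time_both([1, 10, 100, 1000]):
    faster = "A" if t_a < t_b else "B"
    print(f"n={n:5d}  A={t_a:.2e}s  B={t_b:.2e}s  faster: {faster}")
```

If the sizes you actually expect sit well to one side of any cross-over, the asymptotically better version may never pay for itself. Small synthetic inputs like these also say nothing about swapping or caching, which is why a run on the real, bigger data set is still needed.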

How fast does it need to be? Rosetta Code doesn't usually mention absolute speed of execution, but if one example takes ten hours and the other takes five then you might want to take that into account. If one example took 0.2 seconds and the other only 0.1 seconds then the difference probably matters much less: I guess there is an unwritten expectation that examples "don't take long to run", where long is related to the expectation and patience of the user.

You need to look at the context. In the case of Rosetta Code, it may be best to give a solution using a similar algorithm to other examples, or a solution that shows accepted use of the language.

When you make your considered choice, you might want to squirrel away the losing code with notes on why it wasn't used. On Rosetta Code we sometimes add more than one solution to a task, with comments contrasting the two, if they both have merit.

It seems to me that talk about optimising for speed, and speed comparisons, tends to dominate on the web over other optimisations, usually with no extra info on the accuracy of the result. Actually, there might be more cases where a revised result showed that not even the first digit of the original answer was right, even though more than two digits of precision were shown in the answers!
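One reading of that point about accuracy is that a timing quoted to several digits should come with some idea of its spread. A minimal sketch, assuming a made-up workload function and arbitrary repeat counts, is to repeat the measurement and report the variation alongside the headline figure:

```python
import statistics
import timeit


def work():
    """Hypothetical workload standing in for the code being timed."""
    return sum(i * i for i in range(10_000))


def timing_summary(func, repeat=7, number=50):
    """Return (best, mean, stdev) in seconds per call over several runs."""
    runs = [t / number
            for t in timeit.repeat(func, repeat=repeat, number=number)]
    return min(runs), statistics.mean(runs), statistics.stdev(runs)


best, mean, stdev = timing_summary(work)
print(f"best {best:.2e} s, mean {mean:.2e} s, stdev {stdev:.1e} s per call")
```

If the standard deviation is a sizeable fraction of the mean, then quoting the time to three or four digits claims more than the measurement supports.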
