I just read an article that had a good go at explaining why Silicon Valley firms are filled with white, middle-class males, and that actually managed to take the more reasoned approach it set out to take. The author boils it down to everyone having an in-built bias - not necessarily a good or bad thing, it is just how humans operate. The article then describes a very useful experiment:
When selecting from resumes for interview, have the names, age, sex, and origin of the candidates blanked out.
The author found that his resulting selections for interview were quite different from those he had made before.
He goes on to discuss how orchestras changed from being all-male affairs to the current mix of genders by adopting a selection process in which candidates played unseen, behind a screen, and were selected on how well they played.
This got me wondering about the current controversy over Oxbridge admissions. Maybe the universities should adopt a similar scheme: candidates would be led, unseen, to a separate room, where voice changers would mask ethnicity and accent. The interview would be audio only and taped, with the selection panel in another room. After the interview, the candidate would be given a copy of the tape, and another copy kept for independent review. The idea would be to make the process a better meritocracy rather than the obvious selection of "people like me" that goes on at the moment.
Oh, here's the link to the original article.
Tuesday, November 22, 2011
Tuesday, November 08, 2011
Should you worry about a 2x speedup?
Let's take as context an implementation of a task for Rosetta Code - a site set up to compare how different programming languages are used to implement the same tasks, covering over five hundred tasks and over four hundred languages.
My short answer would be it depends! You need to:
- Read all of the task description,
- Read some of the solutions in other languages,
- And maybe skim the task's talk page.
As well as comparing two implementations for speed, you should also compare them for readability. How well the code reads can have a large impact on how easy the code is to maintain. Task descriptions have been known to be modified; someone tracking such a modification may need to work out whether, and how, the code needs to be updated, and an overly complex or unidiomatic example makes that harder.
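As a hypothetical illustration (these functions are mine, not from any particular Rosetta Code task), here are two Python versions of the same small check: one written for readability, and a bit-twiddling version that is typically faster but needs its trick explained before a maintainer can safely change it.

    # Hypothetical illustration: the same check written two ways.

    def is_power_of_two_readable(n):
        """Readable version: halve repeatedly until we can't."""
        if n < 1:
            return False
        while n % 2 == 0:
            n //= 2
        return n == 1

    def is_power_of_two_terse(n):
        """Terser, usually faster version: a power of two has exactly one
        bit set, so n & (n - 1) clears that bit and leaves zero."""
        return n > 0 and (n & (n - 1)) == 0

    if __name__ == "__main__":
        for i in range(1, 20):
            assert is_power_of_two_readable(i) == is_power_of_two_terse(i)
        print("Both versions agree on 1..19")

Either could be an acceptable entry; the point is that the speed of the second comes at a readability cost that is worth weighing explicitly.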
Time complexity. If one version of the code works better when given 'bigger' data then you need to know when that advantage actually kicks in - the cross-over point in execution speed may never be reached in practice. The size of data needed to reach the cross-over may be unreasonable to expect, or other mechanisms may come into play that mask the predicted gains (in other words, you might need to verify with the actual bigger data set to account for things like swapping or caching at the OS and hardware level).
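As a rough sketch of how you might look for such a cross-over (the sizes and repeat counts here are just assumptions for illustration), the following Python times repeated membership tests against a plain list versus building a set first; the set has an up-front conversion cost but constant-time lookups afterwards. For genuinely big data you would still want to rerun with the real data to capture OS-level effects like caching and swapping.

    # Timing sketch: look for the cross-over between a linear scan and a set.
    import timeit

    def lookups_via_list(data, queries):
        # O(len(data)) scan for every query
        return sum(1 for q in queries if q in data)

    def lookups_via_set(data, queries):
        data_set = set(data)              # one-off conversion cost
        return sum(1 for q in queries if q in data_set)  # O(1) lookups

    for n in (10, 100, 1_000, 10_000):
        data = list(range(n))
        queries = list(range(0, 2 * n, 2))   # half hits, half misses
        t_list = timeit.timeit(lambda: lookups_via_list(data, queries), number=10)
        t_set = timeit.timeit(lambda: lookups_via_set(data, queries), number=10)
        print(f"n={n:>6}: list {t_list:.4f}s  set {t_set:.4f}s")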
How fast does it need to be? Rosetta Code doesn't usually mention absolute speed of execution, but if one example takes ten hours and the other takes five then you might want to take that into account. If one example takes 0.2 seconds and the other only 0.1, then I guess both meet the unwritten expectation that examples "don't take long to run", where "long" is relative to the expectation and patience of the user.
You need to look at the context. In the case of Rosetta Code, it may be best to give a solution using a similar algorithm to the other examples, or a solution that shows accepted, idiomatic use of the language.
When you make your considered choice, you might want to squirrel away the losing code with notes on why it wasn't used. On Rosetta Code we sometimes add more than one solution to a task, with comments contrasting the two, if both have merit.
It seems to me that talk of optimising for speed, and speed comparisons, tends to dominate on the web over other kinds of optimisation, usually with no extra information on the accuracy of the result. In fact there may be more cases where a revised result showed that not even the first digit of the original answer was right, even though more than two digits of precision were shown in the answers!
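As a small Python illustration of that accuracy point (the numbers are contrived, but sum and math.fsum are both standard library functions): a quick floating-point sum can be wrong in its very first digit while still printing with apparently plenty of precision.

    # Contrived example: the naive sum loses the small terms entirely.
    import math

    values = [1e16, 1.0, -1e16] * 1000   # the 1.0 terms are swamped by the big ones

    print("naive sum:", sum(values))        # 0.0     - not even the first digit is right
    print("math.fsum:", math.fsum(values))  # 1000.0  - correctly rounded sum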