Or: If time is an issue then measure!
I was recently introduced to the Boyer-Moore MJRTY algorithm for determining which, if any, of an arbitrary
number of candidates has received a majority of the votes cast in an election.
The linkedin post by Prof. Wissam Fawaz gave an implementation of that algorithm and I gave my own version, reproduced below.
The algorithm, given a list of peoples votes such as if thirteen people are voting between parties A, B, and C then their votes might be represented by the list AAACCBBCCCBCC. The MJRTY algorithm takes just two passes through the list to determine if some party has more than half the votes, and who that is.
I decided to also code a solution using collections.Counter, which is the first consideration when a problem includes counting in Python:
Given the problem statement, it is more evident how the counted version works:
- Find the most common vote
- See if it amounts to more than half of all votes.
Testing.
Like all tests, it has holes, but I decided to concentrate on the run time for longer and longer lists that were randomised, and also had a 50/50 chance of producing a winning candidate.
First a generator of suitable data:
Then wrap each implementation and timeit (Thanks IPython/Spyder).
Results
The run times are comparable over from ten to a million voters given the restrictions of the test strategy.
the vote generator restricts the number of parties to three which favours the dict underlying Counter. A more accurate comparison might be made by taking different sized samples from actual data.
END.
No comments:
Post a Comment