Friday, April 11, 2008

That history meme, in Python!


It seemed that everyone was joining in the 'history meme' and finding out what was in their history.

They were all using Unix command-line tools, piped together to show the most frequent commands they had used:
history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}'|sort -rn|head

This being a Python blog, and knowing beforehand that Python is really awful at one-liners, I nevertheless decided that it would be of some use to try and pipe history to a Python one-liner that performed a similar task.

I came up with:
bash$ history | python  -c 'import sys,itertools,pprint; 
    pprint.pprint(sorted([
        (len(list(g)),k) for k,g in itertools.groupby(sorted([
            x.split()[1]for x in sys.stdin if len(x.split())>1])
            , lambda x:x)
        ])
        [-10:][::-1])' 
[(63, 'echo'),
 (41, 'history'),
 (30, './tst1.sh'),
 (28, 'declare'),
 (21, 'python'),
 (18, 'perl'),
 (18, 'cat'),
 (16, 'ls'),
 (15, 'xterm'),
 (15, 'history|python')]
bash$ 
I decided to break the command into multiple lines for readability above; it was developed, painfully, all on one line.
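
For anyone who wants to follow the logic without squinting, here is roughly the same thing unrolled into a small function. It is only a sketch: the name top_commands and the sample lines are made up for illustration, and it leans on groupby's default key (the identity function) rather than passing a lambda.

```python
import itertools

def top_commands(lines, n=10):
    """Return the n most frequent commands as (count, command) pairs."""
    # Field 0 is the history number, field 1 is the command name;
    # lines with fewer than two fields are skipped.
    commands = sorted(x.split()[1] for x in lines if len(x.split()) > 1)
    # groupby only groups adjacent equal items, hence the sort above.
    counts = [(len(list(g)), k) for k, g in itertools.groupby(commands)]
    # Highest counts first, then keep the top n.
    return sorted(counts, reverse=True)[:n]

# Hypothetical history lines, just to exercise the function.
sample = ["  1  ls -l", "  2  echo hi", "  3  ls", "  4  "]
print(top_commands(sample))
```

In a real pipeline the lines argument would be sys.stdin instead of the sample list.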

So readers, the above is something that Python stinks at: one-liners. (Unless you know a better way in Python ...)

- Paddy.

7 comments:

  1. You don't need the "lambda x:x". itertools.groupby defaults to the identity function for the key. That makes it a bit shorter!

  2. You might like my pyline script, which I wrote to make Python work well in command pipelines. I use it all the time, and it's one of the first things I install on a new Unix box.

  3. Hi David,
    Someone actually read that god-awful mess that I wrote - to such a degree that they understood it?!?!
    Thanks for the correction. I had not known that about groupby. (I have gone from reading about it and thinking that I would never use it, that it must be there for completeness' sake, to using groupby several times.)

    Hi fawcett,
    I had seen pyline before, but my post was just to show Python's limitations. I do know my way around standard Unix tools, as well as Perl and AWK, and tend to use them for my huge one-liners :-)

    - Paddy.

  4. P.S.
    The two list comprehensions could be replaced by generator expressions but they stayed as it was easier to find my way around the line with the occasional ']' or '[' to 'anchor' me when reading the mush.

    - Paddy.

  5. python isn't that bad, if you have helper modules lying around...

    history | python -c 'import sys,pprint,count;
    pprint.pprint(count.count(x.split()[1] for x in sys.stdin)[:10])'

    works for me :)

  6. Hi Justin,
    You just got me scrambling off searching for this unknown count module - which turns out to not be a standard module, which I guess is your point.

    No pulling magic bunnies out of a hat! :-)


    - Paddy.

  7. Now, 7 years later, Python _has_ collections.Counter, whose most_common method is what you need. ;-)

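
The suggestion in comment 7 might look something like the sketch below. It assumes a Python new enough to have collections.Counter; the function name most_used and the sample lines are invented for illustration. Note that most_common returns (command, count) pairs, the reverse of the (count, command) order printed in the post.

```python
import collections

def most_used(lines, n=10):
    """Return the n most frequent commands as (command, count) pairs."""
    # Split each history line once; field 1 is the command name.
    fields = (x.split() for x in lines)
    counter = collections.Counter(f[1] for f in fields if len(f) > 1)
    # most_common sorts by count, highest first.
    return counter.most_common(n)

# Hypothetical history lines, just to exercise the function.
sample = ["  1  ls -l", "  2  echo hi", "  3  ls", "  4  "]
print(most_used(sample))
```

Squeezed back onto the command line (assuming python3 is on the path), that would be something like:

bash$ history | python3 -c 'import sys,collections; print(collections.Counter(x.split()[1] for x in sys.stdin if len(x.split())>1).most_common(10))'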