Comments on Go deh!: Wide Finder on the command line (feed updated 2024-03-23T04:34:59.089+00:00)

[2007-10-16T19:00:00.000+01:00]
Please, anonymous people (anonimi?), could you add some sort of distinguishing name or number so I can refer to you individually?

For a parallel version, I hope to get some time one evening to try this (http://paddy3118.blogspot.com/2007/10/multi-processing-design-pattern.html) at work on their dual- and quad-core machines.

On the top 10 suggestion, I can't fully follow your example (I hate Blogger's comment box formatting too), but I tried to use the built-in sort to find the top 10 rather than explicit AWK loops, as the underlying C code should be faster.

I like the idea of using 'sort -r | head' instead of 'sort | tail'. I should time it :-)

- Paddy.
Author: Paddy3118 (https://www.blogger.com/profile/06899509753521482267)

[2007-10-16T14:32:00.000+01:00]
It seems to me that using head instead of tail and giving the second sort a "-r" should speed things up a little bit.

In addition, it would be very interesting to see how much time the different programs take (e.g., I suspect the second sort is more costly than the first, but how does the first sort compare to the grep?).
Author: Anonymous

[2007-10-16T08:46:00.000+01:00]
Perhaps you could still speed up by only keeping a top 10 of most frequent
requests against which you sort (and forgetting about the rest).

Something like this (sorry about the formatting):

    match($0, /GET [^ .]+ /) {
        # Count each matched request string
        counts[substr($0, RSTART, RLENGTH)]++
    }
    END {
        for (req in counts) {
            x = counts[req]
            y = req
            # Insert (x, y) into the descending top list; each displaced
            # entry is carried down to the next slot
            for (i = 1; i <= n; i++)
                if (top[i,0] < x) {
                    xt = top[i,0]
                    yt = top[i,1]
                    top[i,0] = x
                    top[i,1] = y
                    x = xt
                    y = yt
                }
            # Append the carried entry only while the list holds fewer than 10
            if (n < 10) {
                n++
                top[n,0] = x
                top[n,1] = y
            }
        }
        OFS = ": "
        # Print only the entries actually stored (at most 10)
        for (i = 1; i <= n; i++)
            print top[i,0], top[i,1]
    }
Author: Anonymous

[2007-10-16T08:29:00.000+01:00]
Thanks for showing that awk still rocks.

How about a parallel version now, like here (http://www.tbray.org/ongoing/When/200x/2007/09/27/WF-Meta#c1191270394.675213)?
Author: Anonymous

[2007-10-11T20:50:00.000+01:00]
Thanks Andrew for the feedback!
I'm still comfortable with awk (I learnt it well before I knew Perl or Python).
Author: Paddy3118 (https://www.blogger.com/profile/06899509753521482267)

[2007-10-11T18:51:00.000+01:00]
Cool! Until I got comfortable with perl, I wrote a lot of awk and gawk code.

I didn't have gawk on my machine so I installed the latest version, 3.1.5. My wall-clock time for clv5.awk was 3.36 seconds (2.234u 0.948s 0:03.36 94.3%). That's a lot faster than normal awk, which died after 30 seconds saying it knew nothing about asort.
But it's slower than most of the versions I tested.
Author: Andrew Dalke (https://www.blogger.com/profile/17091314849699854287)
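
The 'sort -r | head' versus 'sort | tail' idea raised in the comments can be checked on a small sample before timing it on a real log. The sketch below is only an illustration: the sample "count request" lines, the variable names, the `-n` numeric flag, and the cutoff of 3 are all assumptions made here, not taken from the original pipeline.

```shell
# Hypothetical "count request" lines standing in for the output of the
# counting stage; the real pipeline and its data are not shown here.
counts='3 GET /a
12 GET /b
7 GET /c
1 GET /d
9 GET /e'

# Variant 1: ascending numeric sort, then keep the LAST 3 lines.
# The largest counts come out in ascending order.
via_tail=$(printf '%s\n' "$counts" | sort -n | tail -n 3)

# Variant 2: descending numeric sort, then keep the FIRST 3 lines.
# The same entries come out, largest first.
via_head=$(printf '%s\n' "$counts" | sort -rn | head -n 3)

printf 'sort -n | tail:\n%s\n' "$via_tail"
printf 'sort -rn | head:\n%s\n' "$via_head"
```

Both variants select the same entries and differ only in output order. Either way sort has to read all of its input before it can emit anything, so any speed difference is likely to be small; prefixing each pipeline with `time` on the real log would be the way to find out.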