Comments on Go deh!: Wide Finder on the command line

Please anonymous peoples (anonimi )? Could you add...

2007-10-16T19:00:00.000+01:00

Please, anonymous peoples (anonimi?), could you add some sort of distinguishing name/number so I can refer to you individually?

For a parallel version, I hope to get some time one evening to try this on the dual- and quad-core machines at work.

On the top 10 suggestion, I can't fully follow your example (I hate Blogger's comment box formatting too), but I tried to use the built-in sort to find the top 10 rather than explicit AWK loops, as the underlying C code should be faster.

I like the idea of using 'sort -r | head' instead of 'sort | tail'. I should time it :-)

- Paddy.

It seems to me that using head instead of tail and...

2007-10-16T14:32:00.000+01:00

It seems to me that using head instead of tail and giving the second sort a "-r" should speed things up a little.

In addition, it would be very interesting to see how much time the different programs take (e.g., I suspect the second sort to be more costly than the first, but how does the first sort relate to the grep?)
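A minimal sketch of the two variants being compared, for anyone who wants to time them. This is not the pipeline from the post: the sample data and the function names top_tail and top_head are made up for illustration, and the counting is done with sort | uniq -c rather than awk.

```shell
# Hypothetical sample data: one request path per line (not from the original post).
sample=$(printf '%s\n' /a /b /a /c /a /b /d)

# Variant 1: ascending numeric sort, then keep the last N lines.
top_tail() { sort | uniq -c | sort -n | tail -n "$1"; }

# Variant 2: descending numeric sort, then keep the first N lines.
top_head() { sort | uniq -c | sort -rn | head -n "$1"; }

printf '%s\n' "$sample" | top_tail 2
printf '%s\n' "$sample" | top_head 2
```

Both variants select the same entries when there are no ties at the cut-off; only the order of the output differs. Prefixing each pipeline with the shell's time keyword on a real log would show whether head actually buys anything, since sort still has to read all of its input either way.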

Perhaps you could still speed up by only keeping a...

2007-10-16T08:46:00.000+01:00

Perhaps you could speed it up further by only keeping a running top 10 of the most frequent requests and inserting each count into that list, forgetting about the rest, instead of sorting everything.

Something like this (sorry about the formatting):

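match($0, /GET [^ .]+ /) {
    counts[substr($0, RSTART, RLENGTH)]++
}
END {
    # top[1..n] holds the n largest counts seen so far, largest first
    for (req in counts) {
        x = counts[req]
        y = req
        # carry (x, y) down the list, swapping with any smaller entry
        for (i = 1; i <= n; i++)
            if (top[i,0] < x) {
                xt = top[i,0]
                yt = top[i,1]
                top[i,0] = x
                top[i,1] = y
                x = xt
                y = yt
            }
        # while the list has fewer than 10 entries, append what fell off the end
        if (n < 10) {
            n++
            top[n,0] = x
            top[n,1] = y
        }
    }
    OFS = ": "
    # print only the entries that exist (there may be fewer than 10)
    for (i = 1; i <= n; i++)
        print top[i,0], top[i,1]
}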

Thanks for showing that awk still rocks. How about ...

2007-10-16T08:29:00.000+01:00

Thanks for showing that awk still rocks.

How about a parallel version now, like here?

Thanks Andrew for the feedback! I'm still comfortab...

2007-10-11T20:50:00.000+01:00

Thanks Andrew for the feedback!
I'm still comfortable with awk (I learnt it well before I knew Perl or Python).

Cool! Until I got comfortable with perl, I wrote ...

2007-10-11T18:51:00.000+01:00

Cool! Until I got comfortable with perl, I wrote a lot of awk and gawk code.

I didn't have gawk on my machine so I installed the latest version, 3.1.5. My wall-clock time for clv5.awk was 3.36 seconds (2.234u 0.948s 0:03.36 94.3%). That's a lot faster than normal awk, which died after 30 seconds saying it knew nothing about asort. But it's slower than most of the versions I tested.
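For awks that lack asort (it is a gawk extension), one possible workaround is to do only the counting in awk and hand the ranking to sort and head. The sketch below is not clv5.awk: the sample log lines and the function names count_requests and top_requests are invented for illustration, and the regex is the simplified one from the comment thread.

```shell
# Hypothetical sample log lines (not from the original data set).
log='1.2.3.4 - - "GET /ongoing/a HTTP/1.1" 200
1.2.3.4 - - "GET /ongoing/b HTTP/1.1" 200
1.2.3.4 - - "GET /ongoing/a HTTP/1.1" 200
1.2.3.4 - - "GET /style.css HTTP/1.1" 200'

# Count matching requests in plain awk (no asort needed).
# RSTART + 4 skips "GET "; RLENGTH - 5 also drops the trailing space.
count_requests() {
    awk 'match($0, /GET [^ .]+ /) { counts[substr($0, RSTART + 4, RLENGTH - 5)]++ }
         END { for (req in counts) print counts[req], req }'
}

# Rank with sort(1) and keep the first N lines (default 10).
top_requests() { count_requests | sort -rn | head -n "${1:-10}"; }

printf '%s\n' "$log" | top_requests 10
```

The /style.css line is skipped because the pattern rejects paths containing a dot. Whether this is faster or slower than asort inside gawk would need measuring on the real log.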