Monday, March 25, 2013

itertools.first ?

Note: Extended with Stop Press! section.

Note: Extended later with "Identity with next" section.
Note: Extended later with "Deviant" section.



I wrote something on Rosetta Code and found myself wanting, (again), a version of the any() function that returned the first True elements value rather than the Boolean value True.

The code was calculating the Harshad numbers defined as a positive integer that is divisible by the sum of its digits. I need to show the first 20 and the one greater than 1000.

I came up with the following:


>>> import itertools
>>> def harshad():
    n=1
    while True:
        if n % sum(int(ch) for ch in str(n)) == 0:
            yield n
        n += 1

>>> list(itertools.islice(harshad(), 0, 22))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 18, 20, 21, 24, 27, 30, 36, 40, 42, 45, 48]
>>> for n in harshad():
    if n > 1000:
        print(n)
        break


1002



It does the job, but I wanted a more functional way of finding the 1000'th element.

I tried to use any but was stuck because it only returns True or False. I could make it work :


>>> tmp = []
>>> any((tmp.append(x), ) for x in harshad() if x > 1000)
True
>>> tmp[0]
1002
>>>


But the above is an obscuring hack to get the value of x out of the genexp.

Needed

What I need is a function that I will call first, that works like any but returns the first value that is true (rather than True) on success, otherwise False.

If any can be defined similarly to:


def any(iterable):
    for value in iterable:
        if value:
            return True
    return False


Then function first would be defined as:


def first(iterable):
    for value in iterable:
        if value:
            return value
    return False


And I could then have written:


>>> first(x for x in harshad() if x > 1000)
1002
>>>


Which is so clean I would think its a candidate for adding to the itertools module.

What do you think?

(I'll add this to Reddit and g+ for discussion as well).

Stop Press!

Darn! I have just worked out that I could have written the following:


>>> (x for x in harshad() if x > 1000).__next__()
1002
>>>


It works and is not so bad looking. Not as good as a first() function, but usable.


Identity with next

aceofears on Reddit pointed me in the direction of the default value for the next() function. That got me thinking and I think I have an identity between my first() and the existing next()

first(iterator)  next((item for item in iterator if item),
                       False) 

If the above holds then I know I'd rather write first(), but could live without it {melodramatic sigh}.

Oh, these should hold too:


any(iterator)  next((True for item in iterator if item),
                     False) 


all(iterator)  next((False for item in iterator if not item),
                     True) 


Deviant

There is an identy missing. any() is to first() as all() is to ???
I would call this missing function deviant(). Its action would be to return the first item from an iterator that was false rather than the Boolean False.

So:


deviant(iterator)  next((item for item in iterator if not item),
                         True) 


(I have not worked out a use case for this as yet though).