unzip is straight-forward to calculate because:
>>> t1 = (0,1,2,3)
>>> t2 = (7,6,5,4)
>>> [t1,t2] == zip(*zip(t1,t2))
True
Explanation
In answer to a commenter, I have written a (large) program to explain the above.
unzip_explained.py:
'''
Explanation of unzip expression zip(*zip(A,B))
References:
1: Unpacking argument lists
http://www.network-theory.co.uk/docs/pytut/UnpackingArgumentLists.html
2: Zip
>>> help(zip)
Help on built-in function zip in module __builtin__:
zip(...)
zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]
Return a list of tuples, where each tuple contains the i-th element
from each of the argument sequences. The returned list is truncated
in length to the length of the shortest argument sequence.
'''
def show_args(*positional, **kwargs):
    "Straight-forward function to show its arguments"
    n = 0
    for p in positional:
        print " positional argument", n, "is", p
        n += 1
    for k,v in sorted(kwargs.items()):
        print " keyword argument", k, "is", v
A = tuple( "A%i" % n for n in range(3) )
print "\n\nTuple A is:"; print " ", A
B = tuple( "B%i" % n for n in range(3) )
print "Tuple B is:"; print " ", B
print "\nLets go slowly through the expression: [A,B] == zip(*zip(A,B))\n"
print "List [A,B] is:"
print " ", [A,B]
print "zip(A,B) has arguments:"; show_args(A,B)
print "zip(A,B) returns:"
print " ", zip(A,B)
print "The leftmost zip in zip(*zip(A,B)), due"
print " to the 'list unpacking' of the previous"
print " value has arguments of:"; show_args(*zip(A,B))
print "The outer zip therefore returns:"
print " ", zip(*zip(A,B))
print "Which is the same as [A,B]\n"
And here is the program output:
Tuple A is:
('A0', 'A1', 'A2')
Tuple B is:
('B0', 'B1', 'B2')
Lets go slowly through the expression: [A,B] == zip(*zip(A,B))
List [A,B] is:
[('A0', 'A1', 'A2'), ('B0', 'B1', 'B2')]
zip(A,B) has arguments:
positional argument 0 is ('A0', 'A1', 'A2')
positional argument 1 is ('B0', 'B1', 'B2')
zip(A,B) returns:
[('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
The leftmost zip in zip(*zip(A,B)), due
to the 'list unpacking' of the previous
value has arguments of:
positional argument 0 is ('A0', 'B0')
positional argument 1 is ('A1', 'B1')
positional argument 2 is ('A2', 'B2')
The outer zip therefore returns:
[('A0', 'A1', 'A2'), ('B0', 'B1', 'B2')]
Which is the same as [A,B]
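The program above targets Python 2, where zip returns a list. As a minimal sketch, assuming Python 3 (where zip returns a lazy iterator instead), the same check needs an explicit list() before comparing:

```python
# Python 3 sketch of the same identity; zip() is lazy here, so it is
# materialised with list() before printing or comparing.
A = tuple("A%i" % n for n in range(3))
B = tuple("B%i" % n for n in range(3))

zipped = list(zip(A, B))          # [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
unzipped = list(zip(*zipped))     # each pair becomes one positional argument

print(zipped)
print(unzipped)
print(unzipped == [A, B])         # True
```

The names zipped and unzipped are only illustrative; the expression being tested is still zip(*zip(A,B)).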
What happens in the third line? Can you explain that in more detail?
Cool!
@repei
>>> zip(*zip(A,B)) == [A, B]
True
>>> zip(*zip(*zip(A,B))) == zip(A, B)
True
zip is unzip is zip. This is a bit weird to think about.
Lisp hacks seem to consider this obvious, but us mere mortals sometimes need a little help. Thanks for a fantastic post!
I considered having this as one of the problems for the Python Lab at PyCon, but decided it was too much of a 'trick' problem.
This property of the zip function in Python is only true if you zip tuples and not lists.
>>> A = [1,2,3]
>>> B = [4,5,6]
>>> zip(*zip(A,B)) == [A,B]
False
When I give zip a tuple of lists, I want a tuple of lists back from unzip:
>>> def unzip(a):
...     return tuple(map(list,zip(*a)))
>>> unzip(zip(A,B)) == (A,B)
True
>>> zip(*unzip(zip(A,B))) == zip(A,B)
True
Now that you mention it, it's obvious! Even more so if you think of zip as (a truncating) matrix transpose.
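The transpose view can be sketched as follows. This is an illustrative example assuming Python 3 (hence the list() calls); the variable names are made up for the sketch. It also shows why "truncating" matters: with rows of unequal length the round trip loses data.

```python
# Rows of a 2x3 "matrix"; zip(*rows) yields its columns, i.e. the transpose.
rows = [(1, 2, 3),
        (4, 5, 6)]
columns = list(zip(*rows))
print(columns)                        # [(1, 4), (2, 5), (3, 6)]

# Transposing twice gives back the original rows.
print(list(zip(*columns)) == rows)    # True

# With ragged rows, zip stops at the shortest one, so data is dropped
# and the round trip no longer recovers the input.
ragged = [(1, 2, 3), (4, 5)]
truncated = list(zip(*ragged))
print(truncated)                      # [(1, 4), (2, 5)]
print(list(zip(*truncated)))          # [(1, 2), (4, 5)]
```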
I just didn't know about unpacking function arguments by '*'. It all became clear for me now. Thank you.
That is really funny... how could anyone use Python's zip() function and NOT realize that it is its own inverse???
Good to see another Python/EDA hacker out there. I'm in the same boat... Pythonista and a bit of a hardware hacker. I've got some random utilities up at http://tonquil.homeip.net/~dlenski
LOL. That's awesome. I've never thought about it. <3 self-composable functions in python library.
Unfortunately there is quite a low limit on how many values you can unpack, so this method is only applicable to small lists.
Uncloak and state this limit you found.
nice
I wrote a simple unzip in Python using list comprehensions.
def unzip(lst):
    if lst == []:
        return ()
    else:
        return ([x[0] for x in lst], [x[1] for x in lst])
Slightly inefficient because of the two loops over the same list, but I like the elegance.
Hi anonymous, That may be good for two lists, but what about three or more?
>>> t1 = (0,1,2,3)
>>> t2 = (7,6,5,4)
>>> t3 = (2,4,6,8)
>>> t4 = (7,5,3,1)
>>> z1,z2,z3,z4 = zip(*zip(t1,t2,t3,t4))
>>> (t1,t2,t3,t4) == (z1,z2,z3,z4)
True
- Paddy.
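For readers who want lists back, as in the list-comprehension version, and any number of columns, one possible generalisation is sketched below. It assumes Python 3, and the name unzip_lists is hypothetical, chosen only for this illustration.

```python
def unzip_lists(rows):
    """Split a sequence of equal-length tuples into one list per column.

    Hypothetical helper for illustration; returns () for empty input,
    mirroring the two-column list-comprehension version.
    """
    return tuple(list(column) for column in zip(*rows))


t1, t2, t3 = (0, 1, 2, 3), (7, 6, 5, 4), (2, 4, 6, 8)
zipped = list(zip(t1, t2, t3))
print(unzip_lists(zipped))
# ([0, 1, 2, 3], [7, 6, 5, 4], [2, 4, 6, 8])
print(unzip_lists([]))            # ()
```

Because it is built on zip(*rows), it walks the input once per call rather than once per column.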
Very enlightening. Thanks!
My initial thought was that this wouldn't work for large lists as well, but size doesn't seem to be a problem.
import time

def check(x):
    # Python 2: range() and zip() both return lists
    a = range(0,x,1)
    b = range(0,0-x,-1)
    t1 = time.time()
    c = zip(a,b)
    t2 = time.time()
    a2,b2 = zip(*c)
    t3 = time.time()
    assert(a == list(a2))
    assert(b == list(b2))
    print "zip:%0.2fs unzip:%0.2fs"%(t2-t1, t3-t2)
check( 100000) = zip:0.02s unzip:0.02s
check( 1000000) = zip:0.17s unzip:0.83s
check( 10000000) = zip:2.11s unzip:13.63s
check(100000000) = MemoryError
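The measurement above is Python 2 code. A rough Python 3 equivalent might look like the sketch below; it assumes time.perf_counter for timing and wraps zip in list() so the lazy iterator is actually consumed inside the timed region. No timings are quoted here because they depend on the machine.

```python
import time


def check(x):
    """Time zip and unzip on two lists of length x; return both durations."""
    a = list(range(0, x, 1))
    b = list(range(0, -x, -1))
    t1 = time.perf_counter()
    c = list(zip(a, b))           # force the lazy zip so it is actually timed
    t2 = time.perf_counter()
    a2, b2 = zip(*c)              # unpacks len(c) positional arguments
    t3 = time.perf_counter()
    assert a == list(a2)
    assert b == list(b2)
    return t2 - t1, t3 - t2


zip_s, unzip_s = check(100000)
print("zip:%0.2fs unzip:%0.2fs" % (zip_s, unzip_s))
```

The call zip(*c) passes 100000 tuples as separate arguments, so the round trip itself is a check that unpacking this many values works.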
Doing zip(*somelistoftuples) is not an inverse of zip. For example, if it were then the following would be true:
somelist = zip(range(5), "word1 word2 word3 word4 word5".split())
somelist == zip(*zip(somelist))
And it most definitely is not.
Oh good. 'Cos that's not what I state in the original t1 and t2 example of the article.
:-)
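To make the difference concrete, here is a small sketch, assuming Python 3 and illustrative variable names, comparing zip(somelist) with zip(*somelist). Only the starred form matches the t1, t2 example in the article.

```python
somelist = list(zip(range(5), "word1 word2 word3 word4 word5".split()))

# Without the star, zip receives ONE argument (the list itself), so each
# pair is simply wrapped in a 1-tuple.
wrapped = list(zip(somelist))
print(wrapped[0])                 # ((0, 'word1'),)

# Unpacking those 1-tuples as arguments yields a single tuple holding all pairs.
print(list(zip(*wrapped)) == [tuple(somelist)])   # True

# With the star on the zipped list itself, the original columns come back.
numbers, words = zip(*somelist)
print(numbers)                    # (0, 1, 2, 3, 4)
print(words)                      # ('word1', 'word2', 'word3', 'word4', 'word5')
```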