Let's say you have data and want to create a list of the running sum of the data. For example, if the data is [1, 2, 3] the running sum is [1, 3, 6].
Doing this outside a comprehension:
data = [1, 2, 3]
s = 0  # accumulator
sums = []
for x in data:
    s += x
    sums.append(s)
print(sums)  # [1, 3, 6]

This needs the accumulator s, initialised to zero. But a comprehension has its own local scope, its syntax does not allow a simple assignment statement inside it, and the expression for each item is stated first in its syntax.
The Hack
I'll just show the hack then work through it afterwards.
In [1]: data = [1, 2, 3]
In [2]: [s for s in [0] for x in data for s in [s + x]]
Out[2]: [1, 3, 6]

When converting a comprehension into equivalent for statements, think of the output expression at the start of the comprehension as moving inside the rightmost for or if clause, so we get:

# Comprehension over many lines
[s                    # output expression
 for s in [0]         # For clauses (nested)
 for x in data
 for s in [s + x]]

# Is similar to...
for s in [0]:         # For clauses (nested)
    for x in data:
        for s in [s + x]:
            print(s, end=' ')  # output expression: 1 3 6
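The correspondence above can be checked directly. The following is a minimal sketch; the function names running_sums_loops and running_sums_comprehension are made up for illustration and do not come from the post.

```python
def running_sums_loops(data):
    """Nested-for version: the output expression moves into the innermost loop."""
    out = []
    for s in [0]:                 # one-entry loop initialises s
        for x in data:            # iterate over the data
            for s in [s + x]:     # one-entry loop rebinds s to s + x
                out.append(s)     # where the output expression ends up
    return out


def running_sums_comprehension(data):
    """The same three for clauses written as a single comprehension."""
    return [s for s in [0] for x in data for s in [s + x]]


data = [1, 2, 3]
print(running_sums_loops(data))          # [1, 3, 6]
print(running_sums_comprehension(data))  # [1, 3, 6]
```

Both versions bind s in exactly the same order, which is why the results agree.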
Explanation
In the comprehension, the initial
[s for s in [0] ...
says:
- Individual items of the comprehension will be the expression s. (Remember the output expression is stated first, but is evaluated in the environment set up by the clauses to its right.)
- In the comprehension's local scope we use the one-entry outer for loop to set local s to zero.
The middle for loop of the comprehension just iterates over the data.
The final for loop of the comprehension is special:
... for s in [s + x]]
s is set to itself plus the next item of data, x, by iterating over the one-element list [s + x]:
- For the first x, s was initialised to zero in the local scope via the outermost for.
- s becomes 0 + data[0] in the inner loop and becomes the first output expression value, 1.
- For the second iteration of the middle loop, x = data[1], so s then becomes 0 + data[0] + data[1]. This is the second value of the output expression for the comprehension, 3.
- And so on...
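To watch those steps happen, the same trick can carry the previous value of s along with the new one. This is only a sketch: the extra name prev is an assumption added for illustration and is not part of the original hack.

```python
data = [1, 2, 3]

# The tuple (s, s + x) is built from the old s before the inner
# for clause rebinds prev and s, so prev holds the value s had before.
trace = [(prev, x, s)
         for s in [0]
         for x in data
         for prev, s in [(s, s + x)]]

for prev, x, s in trace:
    print(f"{prev} + {x} -> {s}")
# 0 + 1 -> 1
# 1 + 2 -> 3
# 3 + 3 -> 6
```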
Multiple local variables
We can generalise this. Here we generate running sums and running sums of the squares, which need two local variables, s and s2:

In [3]: data = [1, 2, 3]
In [4]: [(s, s2) for s, s2 in [(0, 0)] for x in data for s, s2 in [(s + x, s2 + x**2)]]
Out[4]: [(1, 1), (3, 5), (6, 14)]
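As a sanity check on the two-variable version, the result can be compared against the standard library's itertools.accumulate. That function is not used in the post; it is brought in here only as an independent way to compute running sums.

```python
from itertools import accumulate

data = [1, 2, 3]

# The two-variable hack from above.
hack = [(s, s2)
        for s, s2 in [(0, 0)]
        for x in data
        for s, s2 in [(s + x, s2 + x**2)]]

# Independent cross-check: running sums of x and of x**2, paired up.
check = list(zip(accumulate(data), accumulate(x**2 for x in data)))

print(hack)           # [(1, 1), (3, 5), (6, 14)]
print(hack == check)  # True
```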
Summary
- You can satisfy the need for local variables in comprehensions.
- It's a hard-to-understand hack!
UPDATE: Added Walrus:
In [5]: # We had:
In [6]: data = [1, 2, 3]
In [7]: [s for s in [0] for x in data for s in [s + x]]
Out[7]: [1, 3, 6]

In [9]: # With :=
In [10]: s = 0
In [11]: [s := (s + x) for x in data]
Out[11]: [1, 3, 6]

In [12]:
In [13]: # We then had:
In [14]: del s
In [15]: [(s, s2) for s, s2 in [(0, 0)] for x in data for s, s2 in [(s + x, s2 + x**2)]]
Out[15]: [(1, 1), (3, 5), (6, 14)]

In [16]: # Which becomes:
In [17]: s = s2 = 0
In [18]: [(s := s + x, s2 := s2 + x**2) for x in data]
Out[18]: [(1, 1), (3, 5), (6, 14)]
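One difference between the two styles is worth spelling out. With := the accumulator must already exist outside the comprehension, and it keeps its final value afterwards. With the for-clause hack the name stays inside the comprehension. Here is a small sketch of that, assuming Python 3.8 or later; the names compare_scopes, total and acc are made up for illustration.

```python
def compare_scopes(data):
    total = 0  # the walrus version needs this to exist beforehand
    via_walrus = [total := total + x for x in data]

    # The for-clause hack creates acc inside the comprehension only.
    via_hack = [acc for acc in [0] for x in data for acc in [acc + x]]

    try:
        acc  # not visible out here
        hack_name_leaked = True
    except NameError:
        hack_name_leaked = False

    # total was rebound by := and now holds the final running sum.
    return via_walrus, via_hack, total, hack_name_leaked


print(compare_scopes([1, 2, 3]))
# ([1, 3, 6], [1, 3, 6], 6, False)
```

So the walrus form is shorter, but it is not self-contained in the way the hack is.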
Steven D'Aprano has written a similar explanation in reply to my query here: https://discuss.python.org/t/whats-new-but-obscure-in-python-3-9/9172/5?u=paddy3118