(Best viewed on a larger screen).
There's a task that goes:
Given a list of integers and a target integer, return all "normalised" triplets of numbers from that list that sum to the target value.
Normalised triplets are a tuple of three values in sorted order, i.e. (t0, t1, t2) where t0 <= t1 <= t2.
First, simple, thoughts
A straight-forward implementation might be to:
- Generate all triplets from the numbers.
- Filter out those that sum to the target
numbers = [2, 0, 1, 3] list(combinations(numbers, 3)) = [(2, 0, 1), (2, 0, 3), (2, 1, 3), (0, 1, 3)] sorted(numbers) = [0, 1, 2, 3] list(combinations(sorted(numbers), 3)) = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
We see that the sorted numbers produce normalised triplets
triplet_sum1([2, 0, 1, 3], 4) = [(0, 1, 3)]
triplet_sum1([0, 1, 3, 2, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
triplet_sum1([0, 1, 2], 6) = []
triplet_sum1([0, 1, 2], 3) = [(0, 1, 2)]
triplet_sum1([0, 1, 2], 1) = []
triplet_sum1([0, 1], 2) = []
triplet_sum1([0, 1], 1) = []
triplet_sum1([0], 0) = []
triplet_sum1([], 99) = []
The above looks correct and triplet_sum1
is neat and small.
But that combinations can take a long time even if most of its terms are to be filtered out
CPU times: total: 6.59 s Wall time: 6.58 s
[(0, 1, 5), (0, 2, 4), (1, 2, 3)]
Explicit, short-circuited, combinations
If we generate our own triples we should be able to short circuit them if they cannot sum to the target.
Let's generate triples from numbers.
(i,j,k) = (0, 1, 2); (item, jtem, ktem) = (0, 10, 20); sum([item, jtem, ktem]) = 30
(i,j,k) = (0, 1, 3); (item, jtem, ktem) = (0, 10, 30); sum([item, jtem, ktem]) = 40
(i,j,k) = (0, 2, 3); (item, jtem, ktem) = (0, 20, 30); sum([item, jtem, ktem]) = 50
(i,j,k) = (1, 2, 3); (item, jtem, ktem) = (10, 20, 30); sum([item, jtem, ktem]) = 60
Because numbers is sorted, and the triples are produced in this sorted order then, the sum of the triples will never decrease.
We can create the sum with less additions by using partial sums:
(item, jtem, ktem) = (0, 10, 20); ksum = 30 (item, jtem, ktem) = (0, 10, 30); ksum = 40 (item, jtem, ktem) = (0, 20, 30); ksum = 50 (item, jtem, ktem) = (10, 20, 30); ksum = 60
Those (partial) sums never decrease either so if any of the partial sums are more than the target then its whole for-loop can be skipped.
numbers = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
target = 30
# iloop (item,) = (0,)
# jloop (item, jtem) = (0, 10)
# kloop (item, jtem, ktem) = (0, 10, 20)
ANSWER: (0, 10, 20)
# kloop (item, jtem, ktem) = (0, 10, 30)
# jloop (item, jtem) = (0, 20)
# kloop (item, jtem, ktem) = (0, 20, 30)
# jloop (item, jtem) = (0, 30)
# kloop (item, jtem, ktem) = (0, 30, 40)
# jloop (item, jtem) = (0, 40)
# iloop (item,) = (10,)
# jloop (item, jtem) = (10, 20)
# kloop (item, jtem, ktem) = (10, 20, 30)
# jloop (item, jtem) = (10, 30)
# iloop (item,) = (20,)
# jloop (item, jtem) = (20, 30)
# iloop (item,) = (30,)
# jloop (item, jtem) = (30, 40)
# iloop (item,) = (40,)
Look for the breaking out of inner loops when the (partial) sums exceed target
.
We now get our second implementation:
triplet_sum1([2, 0, 1, 3], 4) = [(0, 1, 3)] triplet_sum2([2, 0, 1, 3], 4) = [(0, 1, 3)] --- triplet_sum1([0, 1, 3, 2, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] triplet_sum2([0, 1, 3, 2, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] --- triplet_sum1([0, 1, 2], 6) = [] triplet_sum2([0, 1, 2], 6) = [] --- triplet_sum1([0, 1, 2], 3) = [(0, 1, 2)] triplet_sum2([0, 1, 2], 3) = [(0, 1, 2)] --- triplet_sum1([0, 1, 2], 1) = [] triplet_sum2([0, 1, 2], 1) = [] --- triplet_sum1([0, 1], 2) = [] triplet_sum2([0, 1], 2) = [] --- triplet_sum1([0, 1], 1) = [] triplet_sum2([0, 1], 1) = [] --- triplet_sum1([0], 0) = [] triplet_sum2([0], 0) = [] --- triplet_sum1([], 99) = [] triplet_sum2([], 99) = []
Check the timings:
CPU times: total: 6.59 s Wall time: 6.6 s CPU times: total: 0 ns Wall time: 0 ns
[(0, 1, 5), (0, 2, 4), (1, 2, 3)]
lets try timing multiple runs of ttriplet_sum2 as it is so fast:
29.7 µs ± 89.1 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
So that's microseconds vs seconds.
Space - the final frontier
Looking again at triplet_sum2, I see that after the necessary sorting; many sliced copies of the numbers
list are created in both enumerations.
We could use islice
to stop those copies.
13.9 µs ± 34.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Hmmm, even faster than triplet_sum2 as it does not have to create extra list slices.
Lets see if it tests the same:
triplet_sum2([2, 0, 1, 3], 4) = [(0, 1, 3)] triplet_sum3([0, 1, 2, 3], 4) = [(0, 1, 3)] --- triplet_sum2([0, 1, 3, 2, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] triplet_sum3([0, 1, 2, 3, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] --- triplet_sum2([0, 1, 2], 6) = [] triplet_sum3([0, 1, 2], 6) = [] --- triplet_sum2([0, 1, 2], 3) = [(0, 1, 2)] triplet_sum3([0, 1, 2], 3) = [(0, 1, 2)] --- triplet_sum2([0, 1, 2], 1) = [] triplet_sum3([0, 1, 2], 1) = [] --- triplet_sum2([0, 1], 2) = [] triplet_sum3([0, 1], 2) = [] --- triplet_sum2([0, 1], 1) = [] triplet_sum3([0, 1], 1) = [] --- triplet_sum2([0], 0) = [] triplet_sum3([0], 0) = [] --- triplet_sum2([], 99) = [] triplet_sum3([], 99) = []
Pairs, triplets, quads, ... Prancing Pointers
The indices i, j, k
are fixed at three variables for generating triplets, and rely on a
nesting of a static three for-loops. If I used an array of all these
indices in order, called indices
I could have a pointer called incrementing
that indexed which of these indices
to increment at any time.
Similarly I could keep isum, jsum, ksum
the (partial) summs of the elements of numbers
pointed to by indices
, in its own array called accum
- short for accumulator.
Last but not least, I will use an argument n_tuple
defaulting to 3 to sum triples, 4 to sum quads, ...
You can see why I mention prancing pointers.
Lets check it works for triplets.
triplet_sum3([2, 0, 1, 3], 4) = [(0, 1, 3)] triplet_sum4([0, 1, 2, 3], 4) = [(0, 1, 3)] --- triplet_sum3([0, 1, 3, 2, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] triplet_sum4([0, 1, 2, 3, 4, 5, 6], 11) = [(0, 5, 6), (1, 4, 6), (2, 3, 6), (2, 4, 5)] --- triplet_sum3([0, 1, 2], 6) = [] triplet_sum4([0, 1, 2], 6) = [] --- triplet_sum3([0, 1, 2], 3) = [(0, 1, 2)] triplet_sum4([0, 1, 2], 3) = [(0, 1, 2)] --- triplet_sum3([0, 1, 2], 1) = [] triplet_sum4([0, 1, 2], 1) = [] --- triplet_sum3([0, 1], 2) = [] triplet_sum4([0, 1], 2) = [] --- triplet_sum3([0, 1], 1) = [] triplet_sum4([0, 1], 1) = [] --- triplet_sum3([0], 0) = [] triplet_sum4([0], 0) = [] --- triplet_sum3([], 99) = [] triplet_sum4([], 99) = []
Can it reject unnecessary triplets?
1.42 ms ± 8.19 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
This time its milliseconds instead of the seconds of triplet_sum1, but not as fast as the microseconds of triplet_sum3.
Versatility
Lets try other than triplets with this new for-loop-less algorithm.
Help on function triplet_sum4 in module __main__: triplet_sum4(numbers: list[int], target: int, n_tuple: int = 3) -> list[tuple[int]] Generate triplets from numbers that sum to target in sorted order. Indexes the sorted numbers without otherwise copying them. n_tuple=3 for triplets, 2 for pairs, etc, for any +ve int triplet_sum4([0, 1, 2, 3, 4], 5, 2) = [(1, 4), (2, 3)] triplet_sum4([0, 1, 2, 3, 4, 5, 6], 5, 2) = [(0, 5), (1, 4), (2, 3)] triplet_sum4([0, 1, 2, 3, 4, 5, 6], 5, 3) = [(0, 1, 4), (0, 2, 3)] triplet_sum4([0, 1, 2, 3, 4, 5, 6], 5, 4) = []
END.
No comments:
Post a Comment