Best seen on a larger than landscape phone
Someone blogged about a particular problem:
From: https://theweeklychallenge.org/blog/perl-weekly-challenge-294/#TASK1
Given an unsorted array of integers, `ints`
Write a script to return the length of the longest consecutive elements sequence.
Return -1 if none found. *The algorithm must run in O(n) time.*
The solution they blogged used a sort which meant it could not be O(n) in time, but the problem looked good so I gave it some thought.
Sets! sets are O(1) in Python and are good for looking things up.
What if when looking at the inputted numbers, one at a time, you also looked for other ints in the input that would extend the int you have to form a longer range? Keep tab of the longest range so far and if you remove ints from the pool as they form ranges, when the pool is empty, you should know the longest range.
I added the printout of the longest range too.
My code
def consec_seq(ints) -> tuple[int, int, int]:
"Extract longest_seq_length, its_min, its_max"
pool = set(ints)
longest, longest_mn, longest_mx = 0, 1, 0
while pool:
this = start = pool.pop()
ln = 1
# check down
while (this:=(this - 1)) in pool:
ln += 1
pool.remove(this)
mn = this + 1
# check up
this = start
while (this:=(this + 1)) in pool:
ln += 1
pool.remove(this)
mx = this - 1
# check longest
if ln > longest:
longest, longest_mn, longest_mx = ln, mn, mx
return longest,longest_mn,longest_mx
def _test():
for ints in[(),
(69,),
(-20, 78, 79, 1, 100),
(10, 4, 20, 1, 3, 2),
(0, 6, 1, 8, 5, 2, 4, 3, 0, 7),
(10, 30, 20),
(2,4,3,1,0, 10,12,11,8,9), # two runs of five
(10,12,11,8,9, 2,4,3,1,0), # two runs of five - reversed
(2,4,3,1,0,-1, 10,12,11,8,9), # runs of 6 and 5
(2,4,3,1,0, 10,12,11,8,9,7), # runs of 5 and 6
]:
print(f"Input {ints = }")
longest, longest_mn, longest_mx = consec_seq(ints)
if longest <2:
print(" -1")
else:
print(f" The/A longest sequence has {longest} elements {longest_mn}..{longest_mx}")
# %%
if __name__ == '__main__':
_test()
Sample output
Input ints = ()
-1
Input ints = (69,)
-1
Input ints = (-20, 78, 79, 1, 100)
The/A longest sequence has 2 elements 78..79
Input ints = (10, 4, 20, 1, 3, 2)
The/A longest sequence has 4 elements 1..4
Input ints = (0, 6, 1, 8, 5, 2, 4, 3, 0, 7)
The/A longest sequence has 9 elements 0..8
Input ints = (10, 30, 20)
-1
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9)
The/A longest sequence has 5 elements 0..4
Input ints = (10, 12, 11, 8, 9, 2, 4, 3, 1, 0)
The/A longest sequence has 5 elements 0..4
Input ints = (2, 4, 3, 1, 0, -1, 10, 12, 11, 8, 9)
The/A longest sequence has 6 elements -1..4
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9, 7)
The/A longest sequence has 6 elements 7..12
Another Algorithm
What if, you kept and extended ranges untill you amassed all ranges then chose the longest? I need to keep the hash lookup. dict key lookup should also be O(1). What to look up? Look up ints that would extend a range!
If you have an existing (integer) range, say 1..3 inclusive of end points then finding 0 would extend the range to 0..3 or finding one more than the range maximum, 4 would extend the original range to 1..4
So if you have ranges then they could be extended by finding rangemin - 1 or rangemax +1. I call then extends
If you do find that the next int from the input ints is also an extends value then you need to find the range that it extends, (by lookup), so you can modify that range. - use a dict to map extends to their range and checking if an int is in the extends dict keys should also take O(1) time.
I took that sketch of an algorithm and started to code. It took two evenings to finally get something that worked and I had to work out several details that were trying. The main problem was what about coalescing ranges? if you have ranges 1..2 and 4..5 what happens when you see a 3? the resultant is the single range 1..5. It took particular test cases and extensive debugging to work out that the extends2range mapping should map to potentially more than one range and that you need to combine ranges if two of them are present for any extend value being hit.
So for 1..2 the extends being looked for are 0 and 3. For 4..5 the extends being looked for are 3, again, and 6. The extends2ranges data structure for just this should look like:
{0: [[1, 2]],
3: [[1, 2], [4, 5]],
6: [[4, 5]]}
The Code #2
from collections import defaultdict
def combine_ranges(min1, max1, min2, max2):
"Combine two overlapping ranges return the new range as [min, max], and a set of limits unused in the result"
assert (min1 <= max1 and min2 <= max2 # Well formed
and ( min1 <= max2 and min2 <= max1 )) # and ranges touch or overlap
range_limits = set([min1, max1, min2, max2])
new_mnmx = [min(range_limits), max(range_limits)]
unused_limits = range_limits - set(new_mnmx)
return new_mnmx, unused_limits
def consec_seq2(ints) -> tuple[int, int, int]:
"Extract longest_seq_length, its_min, its_max"
if not ints:
return -1, 1, -1
seen = set() # numbers seen so far
extends2ranges = defaultdict(list) # map extends to its ranges
for this in ints:
if this in seen:
continue
else:
seen.add(this)
if this not in extends2ranges:
# Start new range
mnmx = [this, this] # Range of one int
# add in the extend points
extends2ranges[this + 1].append(mnmx)
extends2ranges[this - 1].append(mnmx)
else:
# Extend an existing range
ranges = extends2ranges[this] # The range(s) that could be extended by this
if len(ranges) == 2:
# this joins the two ranges
extend_and_join_ranges(extends2ranges, this, ranges)
else:
# extend one range, copied
extend_and_join_ranges(extends2ranges, this, [ranges[0], ranges[0].copy()])
all_ranges = sum(extends2ranges.values(), start=[])
longest_mn, longest_mx = max(all_ranges, key=lambda mnmx: mnmx[1] - mnmx[0])
return (longest_mx - longest_mn + 1), longest_mn, longest_mx
def extend_and_join_ranges(extends2ranges, this, ranges):
mnmx, mnmx2 = ranges
mnmx_orig, mnmx2_orig = mnmx.copy(), mnmx2.copy() # keep copy of originals
mn, mx = mnmx
mn2, mx2 = mnmx2
if this == mn - 1:
mnmx[0] = mn = this # Extend lower limit of the range
if this == mn2 - 1:
mnmx2[0] = mn2 = this # Extend lower limit of the range
if this == mx + 1:
mnmx[1] = mx = this # Extend upper limit of the range
if this == mx2 + 1:
mnmx2[1] = mx2 = this # Extend lower limit of the range
new_mnmx, _unused_limits = combine_ranges(mn, mx, mn2, mx2)
remove_merged_from_extends(extends2ranges, this, mnmx, mnmx2)
add_combined_range_to_extends(extends2ranges, new_mnmx)
def add_combined_range_to_extends(extends2ranges, new_mnmx):
"Add in the combined of two ranges's extends"
new_mn, new_mx = new_mnmx
for extend in (new_mn - 1, new_mx + 1):
r = extends2ranges[extend] # ranges at new limit extension
if new_mnmx not in r:
r.append(new_mnmx)
def remove_merged_from_extends(extends2ranges, this, mnmx, mnmx2):
"Remove original ranges that were merged from extends"
for lohi in (mnmx, mnmx2):
lo, hi = lohi
for extend in (lo - 1, hi + 1):
if extend in extends2ranges:
r = extends2ranges[extend]
for r_old in (mnmx, mnmx2):
if r_old in r:
r.remove(r_old)
if not r:
del extends2ranges[extend]
# remove joining extend, this
del extends2ranges[this]
def _test():
for ints in[
(),
(69,),
(-20, 78, 79, 1, 100),
(4, 1, 3, 2),
(10, 4, 20, 1, 3, 2),
(0, 6, 1, 8, 5, 2, 4, 3, 0, 7),
(10, 30, 20),
(2,4,3,1,0, 10,12,11,8,9), # two runs of five
(10,12,11,8,9, 2,4,3,1,0), # two runs of five - reversed
(2,4,3,1,0,-1, 10,12,11,8,9), # runs of 6 and 5
(2,4,3,1,0, 10,12,11,8,9,7), # runs of 5 and 6
]:
print(f"Input {ints = }")
longest, longest_mn, longest_mx = consec_seq2(ints)
if longest <2:
print(" -1")
else:
print(f" The/A longest sequence has {longest} elements {longest_mn}..{longest_mx}")
# %%
if __name__ == '__main__':
_test()
Its Output
Input ints = ()
-1
Input ints = (69,)
-1
Input ints = (-20, 78, 79, 1, 100)
The/A longest sequence has 2 elements 78..79
Input ints = (4, 1, 3, 2)
The/A longest sequence has 4 elements 1..4
Input ints = (10, 4, 20, 1, 3, 2)
The/A longest sequence has 4 elements 1..4
Input ints = (0, 6, 1, 8, 5, 2, 4, 3, 0, 7)
The/A longest sequence has 9 elements 0..8
Input ints = (10, 30, 20)
-1
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9)
The/A longest sequence has 5 elements 0..4
Input ints = (10, 12, 11, 8, 9, 2, 4, 3, 1, 0)
The/A longest sequence has 5 elements 8..12
Input ints = (2, 4, 3, 1, 0, -1, 10, 12, 11, 8, 9)
The/A longest sequence has 6 elements -1..4
Input ints = (2, 4, 3, 1, 0, 10, 12, 11, 8, 9, 7)
The/A longest sequence has 6 elements 7..12
This second algorithm gives correct results but is harder to develop and explain. It's a testament to my stubbornness as I thought there was a solution there, and debugging was me flexing my skills to keep them honed.
END.