[Not suitable for viewing on phones]
Someone on Linkedin submitted a code Kata from the Codewars site:
Simple Encryption #1 - Alternating Split
Implement a pseudo-encryption algorithm which given a string S and an integer N concatenates all the odd-indexed characters of S with all the even-indexed characters of S, this process should be repeated N times.
Examples:
encrypt("012345", 1) => "135024"
encrypt("012345", 2) => "135024" -> "304152"
encrypt("012345", 3) => "135024" -> "304152" -> "012345"encrypt("01234", 1) => "13024"
encrypt("01234", 2) => "13024" -> "32104"
encrypt("01234", 3) => "13024" -> "32104" -> "20314"
Together with the encryption function, you should also implement a decryption function which reverses the process.
If the string S is an empty value or the integer N is not positive, return the first argument without changes.
The task author was user5036852
I liked that the task description was compact but that a solution might be fun so attempted it
First thoughts
Encryption
- String slicing: evens are
[0::2]
, odds are[1::2]
Decryption
- Hmm, find halfway of string then use
zip(first_half, second_half)
with extra magic. - The examples all have an even number of characters in the string - What happens for odd?
- We may need zip_longest
The code
Notice the if statements at the beginning of each function that cater for "bad" inputs.
The encoder flows directly from the task description.
The decoder did need zip_longest to grab that last odd character by pairing it with the null string, ''
, as fillvalue.
Some tests
def test_simple_functionality(encrypter, decrypter):
"Tests for using short, even and odd length strings over a small range of N"
for txt in ['012345', '0123456']: # Notice even then odd-length
for n in range(4):
enc = encrypter(txt, n)
dec = decrypter(enc, n)
print(f"{txt, n, enc, dec = }")
assert txt == decrypter(encrypter(txt, n), n)
test_simple_functionality(encrypt, decrypt)
txt, n, enc, dec = ('012345', 0, '012345', '012345') txt, n, enc, dec = ('012345', 1, '135024', '012345') txt, n, enc, dec = ('012345', 2, '304152', '012345') txt, n, enc, dec = ('012345', 3, '012345', '012345') txt, n, enc, dec = ('0123456', 0, '0123456', '0123456') txt, n, enc, dec = ('0123456', 1, '1350246', '0123456') txt, n, enc, dec = ('0123456', 2, '3041526', '0123456') txt, n, enc, dec = ('0123456', 3, '0123456', '0123456')
The code seems to work!
When submitted to Codewars for their more extensive set of tests, then the code above was accepted.
Patterns!
Looking at the test output above, it seems:
encrypter(txt, n=4) == txt
Lets look deeper:
def test_simple_functionality2(encrypter, decrypter, nmax=8):
"Tests for using short, even and odd length strings over a small range of N"
for txt in ['012345', '0123456']: # Notice even then odd-length
for n in range(nmax):
enc = encrypter(txt, n)
dec = decrypter(enc, n)
print(f"{txt, n, enc, dec = }")
assert txt == decrypter(encrypter(txt, n), n)
test_simple_functionality2(encrypt, decrypt)
txt, n, enc, dec = ('012345', 0, '012345', '012345') txt, n, enc, dec = ('012345', 1, '135024', '012345') txt, n, enc, dec = ('012345', 2, '304152', '012345') txt, n, enc, dec = ('012345', 3, '012345', '012345') txt, n, enc, dec = ('012345', 4, '135024', '012345') txt, n, enc, dec = ('012345', 5, '304152', '012345') txt, n, enc, dec = ('012345', 6, '012345', '012345') txt, n, enc, dec = ('012345', 7, '135024', '012345') txt, n, enc, dec = ('0123456', 0, '0123456', '0123456') txt, n, enc, dec = ('0123456', 1, '1350246', '0123456') txt, n, enc, dec = ('0123456', 2, '3041526', '0123456') txt, n, enc, dec = ('0123456', 3, '0123456', '0123456') txt, n, enc, dec = ('0123456', 4, '1350246', '0123456') txt, n, enc, dec = ('0123456', 5, '3041526', '0123456') txt, n, enc, dec = ('0123456', 6, '0123456', '0123456') txt, n, enc, dec = ('0123456', 7, '1350246', '0123456')
Lets cut the chaff and:
- Just show data when the encoding == the input text.
- The txt could be any string without repeats - lets show the length of the string, assuming that it has no repeats
from string import ascii_letters, digits
VALID_CHARS = f"{digits}{ascii_letters}"
def show_when_encrypt_equals_input(encrypter, decrypter, nmax=16):
"Show n when encrypter(txt, n) == txt, for longer txt strings and range of n."
for lentxt, txt in ((i, VALID_CHARS[:i])
for i in range(len(VALID_CHARS) + 1)):
print()
for n in range(max(len(txt), 1) * 4 + 1):
enc = encrypter(txt, n)
if enc == txt:
print(f"enc == txt when: {lentxt, n = }")
show_when_encrypt_equals_input(encrypt, decrypt)
enc == txt when: lentxt, n = (0, 0) enc == txt when: lentxt, n = (0, 1) enc == txt when: lentxt, n = (0, 2) enc == txt when: lentxt, n = (0, 3) enc == txt when: lentxt, n = (0, 4) enc == txt when: lentxt, n = (1, 0) enc == txt when: lentxt, n = (1, 1) enc == txt when: lentxt, n = (1, 2) enc == txt when: lentxt, n = (1, 3) enc == txt when: lentxt, n = (1, 4) enc == txt when: lentxt, n = (2, 0) enc == txt when: lentxt, n = (2, 2) enc == txt when: lentxt, n = (2, 4) enc == txt when: lentxt, n = (2, 6) enc == txt when: lentxt, n = (2, 8) enc == txt when: lentxt, n = (3, 0) enc == txt when: lentxt, n = (3, 2) enc == txt when: lentxt, n = (3, 4) enc == txt when: lentxt, n = (3, 6) enc == txt when: lentxt, n = (3, 8) enc == txt when: lentxt, n = (3, 10) enc == txt when: lentxt, n = (3, 12) enc == txt when: lentxt, n = (4, 0) enc == txt when: lentxt, n = (4, 4) enc == txt when: lentxt, n = (4, 8) enc == txt when: lentxt, n = (4, 12) enc == txt when: lentxt, n = (4, 16) enc == txt when: lentxt, n = (5, 0) enc == txt when: lentxt, n = (5, 4) enc == txt when: lentxt, n = (5, 8) enc == txt when: lentxt, n = (5, 12) enc == txt when: lentxt, n = (5, 16) enc == txt when: lentxt, n = (5, 20) enc == txt when: lentxt, n = (6, 0) enc == txt when: lentxt, n = (6, 3) enc == txt when: lentxt, n = (6, 6) enc == txt when: lentxt, n = (6, 9) enc == txt when: lentxt, n = (6, 12) enc == txt when: lentxt, n = (6, 15) enc == txt when: lentxt, n = (6, 18) enc == txt when: lentxt, n = (6, 21) enc == txt when: lentxt, n = (6, 24) enc == txt when: lentxt, n = (7, 0) enc == txt when: lentxt, n = (7, 3) enc == txt when: lentxt, n = (7, 6) enc == txt when: lentxt, n = (7, 9) enc == txt when: lentxt, n = (7, 12) enc == txt when: lentxt, n = (7, 15) enc == txt when: lentxt, n = (7, 18) enc == txt when: lentxt, n = (7, 21) enc == txt when: lentxt, n = (7, 24) enc == txt when: lentxt, n = (7, 27) enc == txt when: lentxt, n = (8, 0) enc == txt when: lentxt, n = (8, 6) enc == txt when: lentxt, n = (8, 12) enc == txt when: lentxt, n = (8, 18) enc == txt when: lentxt, n = (8, 24) enc == txt when: lentxt, n = (8, 30) enc == txt when: lentxt, n = (9, 0) enc == txt when: lentxt, n = (9, 6) enc == txt when: lentxt, n = (9, 12) enc == txt when: lentxt, n = (9, 18) enc == txt when: lentxt, n = (9, 24) enc == txt when: lentxt, n = (9, 30) enc == txt when: lentxt, n = (9, 36) enc == txt when: lentxt, n = (10, 0) enc == txt when: lentxt, n = (10, 10) enc == txt when: lentxt, n = (10, 20) enc == txt when: lentxt, n = (10, 30) enc == txt when: lentxt, n = (10, 40) ...
enc == txt when: lentxt, n = (62, 162) enc == txt when: lentxt, n = (62, 168) enc == txt when: lentxt, n = (62, 174) enc == txt when: lentxt, n = (62, 180) enc == txt when: lentxt, n = (62, 186) enc == txt when: lentxt, n = (62, 192) enc == txt when: lentxt, n = (62, 198) enc == txt when: lentxt, n = (62, 204) enc == txt when: lentxt, n = (62, 210) enc == txt when: lentxt, n = (62, 216) enc == txt when: lentxt, n = (62, 222) enc == txt when: lentxt, n = (62, 228) enc == txt when: lentxt, n = (62, 234) enc == txt when: lentxt, n = (62, 240) enc == txt when: lentxt, n = (62, 246)
It seems that for each size of input txt, encoding == input when n % some_delta == 0
Lets find those deltas
from pprint import pp
def show_when_encrypt_equals_input2(encrypter, decrypter, nmax=16):
"Show n when encrypter(txt, n) == txt, for longer txt strings and range of n."
for lentxt, txt in ((i, VALID_CHARS[:i])
for i in range(len(VALID_CHARS) + 1)):
last = None
for n in range(max(len(txt), 1) * 4 + 1):
enc = encrypter(txt, n)
if enc == txt:
diff, last = (n - last, n) if last else (n, n)
yield lentxt, n, diff
pp(list(show_when_encrypt_equals_input2(encrypt, decrypt)))
[(0, 0, 0), (0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1), (1, 0, 0), (1, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, 1), (2, 0, 0), (2, 2, 2), (2, 4, 2), (2, 6, 2), (2, 8, 2), (3, 0, 0), (3, 2, 2), (3, 4, 2), (3, 6, 2), (3, 8, 2), (3, 10, 2), (3, 12, 2), (4, 0, 0), (4, 4, 4), (4, 8, 4), (4, 12, 4), (4, 16, 4), (5, 0, 0), (5, 4, 4), (5, 8, 4), (5, 12, 4), (5, 16, 4), (5, 20, 4), (6, 0, 0), (6, 3, 3), (6, 6, 3), (6, 9, 3), (6, 12, 3), (6, 15, 3), (6, 18, 3), (6, 21, 3), (6, 24, 3), (7, 0, 0), (7, 3, 3), (7, 6, 3), (7, 9, 3), (7, 12, 3), (7, 15, 3), (7, 18, 3), (7, 21, 3), (7, 24, 3), (7, 27, 3), (8, 0, 0), (8, 6, 6), (8, 12, 6), (8, 18, 6), (8, 24, 6), (8, 30, 6), (9, 0, 0), (9, 6, 6), (9, 12, 6), (9, 18, 6), (9, 24, 6), (9, 30, 6), (9, 36, 6), (10, 0, 0), (10, 10, 10), (10, 20, 10), (10, 30, 10), (10, 40, 10), (11, 0, 0), (11, 10, 10), (11, 20, 10), ...
(62, 150, 6), (62, 156, 6), (62, 162, 6), (62, 168, 6), (62, 174, 6), (62, 180, 6), (62, 186, 6), (62, 192, 6), (62, 198, 6), (62, 204, 6), (62, 210, 6), (62, 216, 6), (62, 222, 6), (62, 228, 6), (62, 234, 6), (62, 240, 6), (62, 246, 6)]
Lets get those diffs w.r.t. the string length
lentxt2delta = {lentxt: delta
for lentxt, n, delta in show_when_encrypt_equals_input2(encrypt, decrypt)}
lentxt2delta
{0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 3, 7: 3, 8: 6, 9: 6, 10: 10, 11: 10, 12: 12, 13: 12, 14: 4, 15: 4, 16: 8, 17: 8, 18: 18, 19: 18, 20: 6, 21: 6, 22: 11, 23: 11, ... 53: 52, 54: 20, 55: 20, 56: 18, 57: 18, 58: 58, 59: 58, 60: 60, 61: 60, 62: 6}
print(list(lentxt2delta.values()))
[1, 1, 2, 2, 4, 4, 3, 3, 6, 6, 10, 10, 12, 12, 4, 4, 8, 8, 18, 18, 6, 6, 11, 11, 20, 20, 18, 18, 28, 28, 5, 5, 10, 10, 12, 12, 36, 36, 12, 12, 20, 20, 14, 14, 12, 12, 23, 23, 21, 21, 8, 8, 52, 52, 20, 20, 18, 18, 58, 58, 60, 60, 6]
I don't recognise that series. let's look it up
OEIS series lookup
Well lets get the output in OEIS form to create a search:
print(','.join(str(x) for x in list(lentxt2delta.values())[:12]))
1,1,2,2,4,4,3,3,6,6,10,10
That search found nothing.
What about removing every other value and doing another search?
print(','.join(str(x) for x in list(lentxt2delta.values())[:24:2]))
1,2,4,3,6,10,12,4,8,18,6,11
Ha, we have one match to A002326., (and subsequent values match our series too).
A002326
least m > 0 such that 2n+1 divides 2^m-1
- No code given, but how hard could coding it be :-)
[1, 2, 4, 3, 6, 10, 12, 4, 8, 18, 6, 11, 20, 18, 28]
New en/decrpters using a002326
Why?
Well for a string length of 62 we could need inner loops of no more than 6 iterations - no matter how large n
is!
It also seems that a002326(x) <= 2x, for x > 0
. I haven't proven that, but lets show it for up to some large string length:
[x for x in range(1, 1_000) if a002326(x) > x * 2]
[]
How?
Well,
- a002326 needs to be applied to the length of the input string
s
.- Not quite, as, remember, there was an initial doubling of diff values so lencth might need multiplying or dividing by 2.
Some playing around its a divide by 2 and we get:
def test_new_v_old():
fails = []
testcases = 0
for txt in [VALID_CHARS[:i] for i in range(len(VALID_CHARS) + 1)]:
# print()
lentxt = len(txt)
last = None
for n in range(max(len(txt), 1) * 5 + 1):
testcases += 1
enc = encrypt(txt, n)
dec = decrypt(enc, n)
enc2 = encrypt2(txt, n)
dec2 = decrypt2(enc2, n)
if enc != enc2 or dec != dec2:
fails.append((txt, n, enc, enc2, dec, dec2))
print(f"\n{testcases, len(fails) = }")
if not fails:
print('\n encode2() == encode() and decode2() == decode()')
%time test_new_v_old()
testcases, len(fails) = (9833, 0) encode2() == encode() and decode2() == decode() CPU times: total: 4.23 s Wall time: 4.34 s
Because of our testcases being a string of a particular length being encoded and decode with many values on N
lets cache a002326
from functools import cache
@cache
def a002326(n: int) -> int:
"Least m > 0 such that 2n+1 divides 2^m-1"
n21 = 2 * n + 1
m, m2 = 1, 2
while (m2 - 1) % n21:
m, m2 = m+1, m2 * 2
return m
%time test_new_v_old()
testcases, len(fails) = (9833, 0) encode2() == encode() and decode2() == decode() CPU times: total: 4.31 s Wall time: 4.47 s
a002326.cache_info()
CacheInfo(hits=19498, misses=32, maxsize=None, currsize=32)
%time test_new_v_old()
testcases, len(fails) = (9833, 0) encode2() == encode() and decode2() == decode() CPU times: total: 4.23 s Wall time: 4.45 s
No comments:
Post a Comment