Go deh!

Mainly Tech projects on Python and Electronic Design Automation.

Monday, April 06, 2020

Spin the table: Solution!

An answer to a puzzle set by Matt Parker on his YouTube channel:



The Puzzle:


  1. A Circular table with positions 1..7
  2. Delegates numbered 1..7 arranged arbitrarily around the table.
  3. Delegates start with only one match to a correct seating position.
  4. Is it ALWAYS possible to spin the table and get two or more matches?
Find arrangements where it is not possible to spin the table to get more than one match.

The Model.

Let's number the positions on the table with integers 1 .. 7; and also give the delegates numbers 1 .. 7. If the delegate number equals the table number then that constitutes a seating match.

Storing the table order as a fixed list of [1, 2, ..7], the all the rotations of the table  can be modelled by the additional six rotations of that initial list found by popping the last number from the last rotation and appending to the front of the list, i.e:
[[1, 2, .. 7],
 [7, 1, .. 6],
 [6, 7, 1, .. 5],
 ..
 [2, 3, .. 7, 1]]

This is stored in variable spun.

The permutations of delegate numbers 1 .. 7 are all the possible orderings of delegates.
Every "base" perm will have six other perms which are just other rotations of the "base" ordering . If only the permutations generated with an initial 1 are used, (for example), then we can assume that any of the other six rotations of the permutation will also be an answer.

The Code.



1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# -*- coding: utf-8 -*-
"""
Spin the table.

Created on Wed Mar 25 19:59:49 2020

@author: Paddy3118
"""

from itertools import permutations


table = list(range(1, 8))
spun = []                   # spinning table positions
for i in range(7):
    spun.append(table)
    table = table[-1:] + table[:-1]     # rotate once

def match_count(table, delegates):
    return sum(1 for t, d in zip(table, delegates)
               if t == d)

def max_matches(delegates):
    return max(match_count(table, delegates) for table in spun)

def spin_the_table():
    return [d for d in permutations(table)
            if d[0] == 1 and match_count(table, d) == 1 and max_matches(d) < 2]


if __name__ == '__main__':
    answer = spin_the_table()
    print(f"Found {len(answer)} initial arrangements of the delegates:")
    for a in answer:
        print(' ', str(a)[1:-1])
    print("The 6 rotations of each individual answer above are also solutions")

Answer:



Found 19 initial arrangements of the delegates:
  1, 3, 5, 7, 2, 4, 6
  1, 3, 6, 2, 7, 5, 4
  1, 3, 7, 6, 4, 2, 5
  1, 4, 2, 7, 6, 3, 5
  1, 4, 6, 3, 2, 7, 5
  1, 4, 7, 2, 6, 5, 3
  1, 4, 7, 3, 6, 2, 5
  1, 4, 7, 5, 3, 2, 6
  1, 5, 2, 6, 3, 7, 4
  1, 5, 4, 2, 7, 3, 6
  1, 5, 7, 3, 6, 4, 2
  1, 6, 2, 5, 7, 4, 3
  1, 6, 4, 2, 7, 5, 3
  1, 6, 4, 3, 7, 2, 5
  1, 6, 4, 7, 3, 5, 2
  1, 6, 5, 2, 4, 7, 3
  1, 7, 4, 6, 2, 5, 3
  1, 7, 5, 3, 6, 2, 4
  1, 7, 6, 5, 4, 3, 2
The 6 rotations of each individual answer above are also solutions


Matt Parker's Solution Video: (He mentions my solution at time 05:00).

END.

Saturday, January 25, 2020

Sharing another way?

Tickled!

Sharing came up in something I was reading that got me revisiting the Wikipedia page on the Thue Morse Sequence. Tucked away to the right is a static image of ones and zeroes with the caption:
 "When counting in binary, the digit sum modulo 2 is the Thue-Morse sequence"
Now I had already blogged about the Thue-Morse sequence with respect to sharing fairly:

If you had two people, Asia and Bain chosing from different cash bounties then if Asia went first she'd choose the largest cash bounty, then Bain would chose the next larget, ... if they alternate like this then Asia, who went first, would always get more than Bain; furthermore, Asias total accumulated bounty would increase over Bains over time like this:





In my earlier blog entry: Sharing Fairly, I looked into the Thue-Morse sequence for who gets to pick next and showed that using that sequence, and itsextension for more than two people sharing, the person with the most accumulated wealth switches over time and does not diverge like in the simple taking of turns shown above.

I showed how using a Thue-Morse sequence for multiple people sharing gave a "fairer" distribution of wealth, for example  this graph of accumulated wealth for three people:


Old Code

My code to generate the Thue Morse sequence for sharing between multiple people was this function:


def thue_morse(persons='ABC', mx=20, tutor=True)

Its algorithm works, and was straight forward to describe as a natural extension of the two-person case, but could generate more terms than requested that are then trimmed.

Back to the tickle

 "When counting in binary, the digit sum modulo 2 is the Thue-Morse sequence"
Now that method might be coded to only give the number of terms required. I might code it as a generator.  But as it stands it applies only to two people (binary).

I thought that an equivalent statement for many people might be:
"When counting base b, the digit sum modulo b is the Thue-Morse sequence of fairer sharing between b people"
I set out to code it.

Code Journey

"When counting base b" ... "digit sum" ...

I am going to need to express an integer in different bases, and sum the digits.

Base changing

Normally when expressing an integer in a different base, b,  it is for printout and the routine returns characters: The columns in the string denote, by position, the multiples of successive powers ob b.
For example:
    "123" base 10 == 3 * 10**0 + 2 * 10**1 + 1 * 10**2

If the base goes beyond 10 such as in hexadecimal, then letters A... are used to represent numbers beyond 9 as single characters.

We don't want that!

We need to sum the digits so in the _basechange_int function a list is returned of integers to make for easier summation later.
Integer 123 base 10 would return the list  [1, 2, 3]

People representation

I am used to generating the Thue-Morse sequence as a string ofupper-case characters where each individual characters corresponds to one persons turn in sharing. as in my original, I will code my new function to have a persons parameter which is a string of different characters, each representing a person.
The number of persons becomes the base, b, of my altered  statement above.
The algorithm will generate an int that will then need to be represented as a corresponding character from the persons string.

Generate

I decided to code the algorithm as a generator of successive terms from the given persons. islice is my friend :-)


Thue-Morse sequence generator from digit counts




1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
# -*- coding: utf-8 -*-
"""
Created on Sat Jan 18 08:34:29 2020

@author: Paddy3118
"""

#%%

from itertools import count, islice

def _basechange_int(num, b):
    """
    Return list of ints representing positive num in base b

    >>> b = 3
    >>> print(b, [_basechange_int(num, b) for num in range(11)])
    3 [[0], [1], [2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2], [1, 0, 0], [1, 0, 1]]
    >>>
    """
    if num == 0:
        return [0]
    result = []
    while num != 0:
        num, d = divmod(num, b)
        result.append(d)
    return result[::-1]

def _sumdigits_int(numlist, b):
    "Return the sum of numbers in numlist modulo b, in base b"
    return sum(numlist) % b

def fairshare(persons='ABC'):
    """
    Sequence of fairer sharing between persons
    We have:
      "When counting in binary, the digit sum modulo 2 is the Thue-Morse sequence"

    This computes:
        When counting base b, the digit sum modulo b is the Thue-Morse sequence
        of fairer sharing between b people
    """

    base = len(persons)     # number base
    num2person = dict(zip(range(base), persons))
    for i in count():
        yield num2person[_sumdigits_int(_basechange_int(i, base), base)]


if __name__ == '__main__':
    PERSONS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    for b in [10, 2, 3, 5, 7]:
        #print(b, [_basechange_int(num, b) for num in range(11)])
        print(b, ''.join(islice(fairshare(PERSONS[:b]), 44)))


Example runs:



In [24]: b = 3; print(b, ''.join(islice(fairshare(PERSONS[:b]), 44)))
3 ABCBCACABBCACABABCCABABCBCABCACABABCCABABCBC

In [25]: b = 2; print(b, ''.join(islice(fairshare(PERSONS[:b]), 44)))
2 ABBABAABBAABABBABAABABBAABBABAABBAABABBAABBA

In [26]: 

Checks

I can't do the maths to formally proove it, but I have compared outputs between the old and the new and they generate the same sequences.

P.S:

I decided to start the "Fairshare between two and more" task on Rosetta Code based on this.

Sunday, November 10, 2019

Quasi-random Sobol sequence



In which I revisit low-discrepancy series, Hunt for an understanding of the Sobol sequence, then translate and optimise a C++ program into Python.

Re-visit VdC

I had left a mental note to get back to the Van der Corput sequence after creating the Rosetta Code task some years ago. Van der Corput is a type of low-discrepancy or quasi random sequence often used in Financial simulations; and with ever larger digital systems on automotive System on Chips, I keep a look-out for Verification techniques to add to the toolbox.

I decided to investigate the Sobol Sequence, understand the maths and write some Python!

Sobol Maths

I followed references; I googled; I read; and I just could not gain an understanding of what is happening, enough to write some code. What code I initially browsed gave me only snippets - Those snippets were bit twiddling so I coded those up for fun.

Fun bit twiddles

Highest 1 bit in binary number

def binhi1(b):
    "highest 1 bit of binary b. (== ceil(log2(b)))"
    pos = 0
    while b > 0:
        pos, b = pos + 1, b // 2
    return pos

Example 1
for n in range(18):
    print(f"{n:>5b} {binhi1(n)}")
    
    0 0
    1 1
   10 2
   11 2
  100 3
  101 3
  110 3
  111 3
 1000 4
 1001 4
 1010 4
 1011 4
 1100 4
 1101 4
 1110 4
 1111 4
10000 5
10001 5

Lowest 0 bit in binary number

def binlo0(b):
    "lowest 0 bit from right of binary b. Always >=1"
    pos, b2 = 1, b // 2
    while b2 * 2 != b:
        pos, b, b2 = pos + 1, b2, b2 // 2
    return pos

Example 2

for n in range(18):
    print(f"{n:<05b binlo0="" font="" n="">)
00000 1
00001 2
00010 1
00011 3
00100 1
00101 2
00110 1
00111 4
01000 1
01001 2
01010 1
01011 3
01100 1
01101 2
01110 1
01111 5
10000 1
10001 2


Code to translate

One of the Sobol references was this one by S. Joe and F. Y. Kuo. They had written academic papers on Sobol and had generated constants for the use of the sequence in up to 21201 dimensions. They also had a C++ program with BSD-style license and, crucially, sample outputs from their C++ code that could aid me in writing  Python conversion of their code.

I downloaded their C++ and their latest set of direction numbers and took a look. Eager to code something, I coded up a reader for their directions file:

Initial dirctions file reader


def direction_file_reader(fname="Sobol_new-joe-kuo-6.21201"):
    """\
    Reads direction file of format:
        d       s       a       m_i     
        2       1       0       1 
        3       2       1       1 3 
        4       3       1       1 3 1 
        5       3       2       1 1 1 
        6       4       1       1 1 3 3
        ...
    """
    with open(fname) as f:
        headers = f.readline().strip().split()
        assert headers == ['d', 's', 'a', 'm_i']
        directions = []
        for line in f:
            d, s, a, *m = [int(cell) for cell in line.strip().split()]
            directions.append((d, s, a, tuple([0] + m)))
        return directions

The direction file was three single columns then all the other columns made up the list m.

C++ code direct translation.

I decided to translate the C++ code near verbatim to Python, then run the example and ensure that the Python gave the same output as stated for the C++ program.

I cut-n-pasted sections of the C++ code as comments/unused triple-quoted strings into my Python then slowly translated from C++.
The C++ used several single-character, uppercase, variable names. I took them to be references to their papers or maybe earlier Fortran programs and either kept them, or made note of their re-naming.

I finally got my first working Python code which is below, but without the C++ code comments I followed:


1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
def sobol_points(N, D):
    """From: C++ Source of Frances Y. Kuo "sobol.cc" """
    pow2_32 = 2**32

    L = binhi1(N)
    C = [binlo0(i) for i in range(N)]
    points= [[0 for j in range(D)] for i in range(N)]
    V = [1 << (32- i) for i in range(L+1)]

    X = [0]
    for i in range(1, N):
        X.append(X[i-1] ^ V[C[i-1]])
        points[i][0] = X[i]/pow2_32;
    del V, X
    
    directions = direction_file_reader()
    for j in range(1, D):
        d, s, a, m = directions[j-1]
        #print(d, s, a, m)
        
        if L <= s:
            V = [0] + [m[i] << (32 - i) for i in range(1, L+1)]
        else:
            V = [0] + [m[i] << (32 - i) for i in range(1, s+1)]
            for i in range(s+1, L+1):
                V.append(V[i-s] ^ (V[i-s] >> s))
                for k in range(1, s):
                    V[i] ^= (((a >> (s-1-k)) & 1) * V[i-k])
        X = [0]
        for i in range(1, N):
            X.append(X[i-1] ^ V[C[i-1]])
            points[i][j] = X[i]/pow2_32
    
    return points

Issues with the directly translated Python/C++:

  • The calculation ov V[] in lines 21 to 28 I thought should become a function.
  • All points are computed and returned at once. It would be better as a generator function.
  • The calculation of X[] in lines 29 to 31 seems like a very inefficient loop within a loop. Furthermore, if turned into a generator; Iwould need to unscramble the loops to try and recalculate X for each point without another loop.

Final Python generator function version


1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
# -*- coding: utf-8 -*-
"""sobol_point_gen.py

 Usage:
    sobol_point_gen.py N D <dimensions_file>
    sobol_point_gen.py -h | --help

 Options:
     -h --help     Show this screen.     

 Arguments:
     N  Number of points to generate (<2**32)
     D  Number of dimensions for each point. (D <= max D in dimensions_file)
     dimensions_file    See: https://web.maths.unsw.edu.au/~fkuo/sobol/

 Created from an original c++ source by Frances Y. Kuo and Stephen Joe

Created on Fri Nov  1 09:13:12 2019

@author: Paddy3118
"""

from functools import lru_cache

#%%
def binhi1(b):
    "highest 1 bit of binary b. (== ceil(log2(b)))"
    pos = 0
    while b > 0:
        pos, b = pos + 1, b // 2
    return pos

def binlo0(b):
    "lowest 0 bit from right of binary b. Always >=1"
    pos, b2 = 1, b // 2
    while b2 * 2 != b:
        pos, b, b2 = pos + 1, b2, b2 // 2
    return pos

#%%
def direction_file_reader(d, fname="Sobol_new-joe-kuo-6.21201"):
    """\
    Reads d+header lines from direction file of format:
        d       s       a       m_i     
        2       1       0       1 
        3       2       1       1 3 
        4       3       1       1 3 1 
        5       3       2       1 1 1 
        6       4       1       1 1 3 3
        7       4       4       1 3 5 13 
        8       5       2       1 1 5 5 17 
        9       5       4       1 1 5 5 5 
        10      5       7       1 1 7 11 19 
        11      5       11      1 1 5 1 1 
        12      5       13      1 1 1 3 11 
        13      5       14      1 3 5 5 31 
        14      6       1       1 3 3 9 7 49 
        15      6       13      1 1 1 15 21 21 
        16      6       16      1 3 1 13 27 49 
        17      6       19      1 1 1 15 7 5 
        18      6       22      1 3 1 15 13 25 
        19      6       25      1 1 5 5 19 61 
        20      7       1       1 3 7 11 23 15 103 
        21      7       4       1 3 7 13 13 15 69 
        22      7       7       1 1 3 13 7 35 63 
        23      7       8       1 3 5 9 1 25 53 
        24      7       14      1 3 1 13 9 35 107 
        25      7       19      1 3 1 5 27 61 31 
        26      7       21      1 1 5 11 19 41 61 
        27      7       28      1 3 5 3 3 13 69 
        28      7       31      1 1 7 13 1 19 1 
        29      7       32      1 3 7 5 13 19 59 
        30      7       37      1 1 3 9 25 29 41 
        31      7       41      1 3 5 13 23 1 55 
        32      7       42      1 3 7 3 13 59 17 
        ...
    """
    with open(fname) as f:
        headers = f.readline().strip().split()
        assert headers == ['d', 's', 'a', 'm_i']
        directions = []
        for i, line in zip(range(d), f):
            d, s, a, *m = [int(cell) for cell in line.strip().split()]
            directions.append((d, s, a, tuple([0] + m)))
        return directions

#%%
@lru_cache(128)
def calc_dnum(s, a, m, mxbits):                      # V, direction numbers
    if mxbits <= s:
        dirnum = [0] + [m[i] << (32 - i) for i in range(1, mxbits+1)]
    else:
        dirnum = [0] + [m[i] << (32 - i) for i in range(1, s+1)]
        for i in range(s+1, mxbits+1):
            dirnum.append(dirnum[i-s] ^ (dirnum[i-s] >> s))
            for k in range(1, s):
                dirnum[i] ^= (((a >> (s-1-k)) & 1) * dirnum[i-k])
    return dirnum

def sobol_point_gen(N, D, fname="Sobol_new-joe-kuo-6.21201"):
    pow2_32 = 2**32
    mxbits = binhi1(N)                              # L, max bits needed
    dnum = [1 << (32- i) for i in range(mxbits+1)]  # V, direction numbers
    directions = direction_file_reader(D, fname)
    xd = [0 for j in range(D)]                      # X
    pt = [0 for j in range(D)]                      # POINTS
    yield pt
    for n in range(1, N):
        nfirst0 = binlo0(n-1)                       # C
        pt = [0 for j in range(D)]
        xd[0] = xd[0] ^ dnum[nfirst0]
        pt[0] = xd[0] / pow2_32
        for jj in range(1, D):
            _, s, a, m = directions[jj-1]
            dnum2 = calc_dnum(s, a, m, mxbits)      # V, direction numbers
            xd[jj] = xd[jj] ^ dnum2[nfirst0]
            pt[jj] = xd[jj]/pow2_32
        yield pt

#%%
if __name__ == '__main__':
    first_sobol_3D_points = [
         [0, 0, 0],
         [0.5, 0.5, 0.5],
         [0.75, 0.25, 0.25],
         [0.25, 0.75, 0.75],
         [0.375, 0.375, 0.625],
         [0.875, 0.875, 0.125],
         [0.625, 0.125, 0.875],
         [0.125, 0.625, 0.375],
         [0.1875, 0.3125, 0.9375],
         [0.6875, 0.8125, 0.4375]]
    generated = list(sobol_point_gen(10, 3))
    assert generated == first_sobol_3D_points


Function def direction_file_reader is changed to only read as much of thedirections file as is necessary.
Function def sobol_point_gen is now a generator function in which successive calls to next give successive points. The loops are swapped inside with an outer loop yielding a point every iteration, and an inner loop over dimensions > 1.
Function def calc_dnum is cached. Initial suspicions of a lot of recalculation proved true:

calc_dnum cache stats


In [1]: g = list(sobol_point_gen(10000, 3))
In [2]: calc_dnum.cache_info()
Out[2]: CacheInfo(hits=20012, misses=4, maxsize=128, currsize=4)



Sunday, October 20, 2019

Indent datastructure for trees

I was browsing StackOverflow and came across a question that mentioned a new-to-me format for a datatructure for holding a tree of data.I am well used to the (name, list_of_children) set of interconnected node datastructures way of doing things, but this mentioned where you
  1. Create an empty list
  2.  then for each node starting at the root which has a depth of zero:
    1. add the (depth, name) tuple of the node to the list
    2. visit all this nodes child node.
It is a preorder traversal of the conceptual tree, aggregating (depth,  name) tuples into a list to form what I am calling the indent tree datastructure as it captures all the information of the tree but in a different datastructure than normal, and can be extended to allow data at each node and might be a useful alternative for DB storage of trees.

An example tree:





Its indent format:

[(0, 'R'),
 (1, 'a'),
 (2, 'c'),
 (3, 'd'),
 (3, 'h'),
 (1, 'b'),
 (2, 'e'),
 (3, 'f'),
 (4, 'g'),
 (2, 'i')]

Indented representation:

If you print out successive names from the indent format list above, one per line, with indent from the left of the indent value, then you get a nice textual regpresentation of the tree; expanded left-to-right rather than the top-down representation of the graphic:


R
  a
    c
      d
      h
  b
    e
      f
        g
    i


Code

I wrote some code to manipulate and traverse this kind of tree datastructure, as well as to use graphviz to draw graphical representations.


  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
import graphviz as gv
from collections import defaultdict
import re
from pprint import pprint as pp


#%%
node_attr = {at: val for at, val in 
             (line.strip().split() for line in '''
shape      record
fontsize   12
height     0.1
width      0.1
rankdir    LR
ranksep    0.25
nodesep    0.25
            '''.strip().split('\n'))}
edge_attr = {at: val for at, val in 
             (line.strip().split() for line in '''
arrowhead  none
minlen     1
            '''.strip().split('\n'))}
root_attr = {at: val for at, val in 
             (line.strip().split() for line in '''
fontcolor  green
color      brown
fontname   bold
            '''.strip().split('\n'))}
#%%


def _pc2indent(pc, root=None, indent=None, children=None):
    "parent-child dependency dict to indent format conversion"
    if root is None:
        parents = set(pc)
        kids = set(sum(pc.values(), []))
        root = parents - kids
        assert len(root) == 1, f"Need exactly one root: {root}"
        root = root.pop()
    if indent is None:
        indent = 0
    if children is None:
        children = []
    children.append((indent, root))
    if root in pc:
        for child in pc[root]:
            pc2indent(pc, child, indent+1, children)
    return children

def dot2indent(tree):
    "simple dot format to indent format translator"
    depends = defaultdict(list)
    for matchobj in re.finditer(r'\s+(\S+)\s*->\s*(\S+)\s', tree.source):
        parent, child = matchobj.groups()
        depends[parent].append(child)
    parent2child = dict(depends)
    return _pc2indent(parent2child)

#%%
def _highlight(txt):
    return f'#{txt}#'

def pp_indent(data, hlight=None, indent='  '):
    "simple printout of indent format tree with optional node highlighting"
    if hlight is None:
        hlight = set()
    for level, name in data:
        print(indent * level 
              + (_highlight(name) if name in hlight else name))

#%%
def indent2dot(lst):
    tree = gv.Digraph(name='indent2dot', node_attr=node_attr)
    tree.edge_attr.update(**edge_attr)
    levelparent = {}
    for level, name in lst:
        levelparent[level] = name
        if level - 1 in levelparent:
            tree.edge(levelparent[level-1], name)
        else:
            tree.node(name, _attributes=root_attr)
    return tree

#%%
def printwithspace(i):
    print(i, end=' ')

def preorder(tree, visitor=printwithspace):
    for indent, name in tree:
        visitor(name)

def levelorder(tree, reverse_depth=False, reverse_in_level=False, 
               visitor=printwithspace):
    if not tree:
        return
    indent2name = defaultdict(list)
    mx = -1
    for indent, name in tree:
        if indent > mx:
            mx = indent
        indent2name[indent].append(name)
    if reverse_in_level:
        for names in indent2name.values():
            names.reverse()
    if not reverse_depth:
        for indent in range(0, mx + 1):
            for name in indent2name[indent]:
                visitor(name)
    else:
        for indent in range(mx, -1, -1):
            for name in indent2name[indent]:
                visitor(name)
    
    
#%%
# Example tree
ex1    = [(0, '1'),
          (1, '2'), 
          (2, '4'), 
          (3, '7'), 
          (2, '5'), 
          (1, '3'),
          (2, '6'),
          (3, '8'), 
          (3, '9'),
          (3, '10'),
          ]

#%%
if __name__ == '__main__':
    print('A tree in indent datastructure format:')
    pp(ex1)
    print('\nSame tree, printed as indented list:')
    pp_indent(ex1)
    print('\nSame tree, drawn by graphviz:')
    display(indent2dot(ex1))  # display works in spyder/Jupyter

    print('\nSame tree, preorder traversal:')
    preorder(ex1)
    print()
    print('Same tree, levelorder traversal:')
    levelorder(ex1)
    print()
    print('Same tree, reverse_depth levelorder traversal:')
    levelorder(ex1, True)
    print()
    print('Same tree, reverse_depth, reverse_in_level levelorder traversal:')
    levelorder(ex1, True, True)
    print()
    print('Same tree, depth_first, reverse_in_level levelorder traversal:')
    levelorder(ex1, False, True)
    print()

Output:



A tree in indent datastructure format:
[(0, '1'),
(1, '2'),
(2, '4'),
(3, '7'),
(2, '5'),
(1, '3'),
(2, '6'),
(3, '8'),
(3, '9'),
(3, '10')]

Same tree, printed as indented list:

1
    2
        4
            7
        5
    3
        6
            8
            9
            10

Same tree, drawn by graphviz:


Same tree, preorder traversal:
1 2 4 7 5 3 6 8 9 10
Same tree, levelorder traversal:
1 2 3 4 5 6 7 8 9 10
Same tree, reverse_depth levelorder traversal:
7 8 9 10 4 5 6 2 3 1
Same tree, reverse_depth, reverse_in_level levelorder traversal:
10 9 8 7 6 5 4 3 2 1
Same tree, depth_first, reverse_in_level levelorder traversal:
1 3 2 6 5 4 10 9 8 7

In [47]:



Sunday, October 13, 2019

ISO26262:: Some thoughts on Specs.

It seems as if there are many engineers who can do, but who struggle with doing the documentation.
In safety flows, everything comes from the specification - the designer designs to the spec., the verification team verify that the spec. is correctly implemented.
  • The spec. is central.
  • The spec. is a document written by engineers.
  • In modern system-on-chip designs, the spec. is complex.
There's an aphorism, (yanked from the zen of Python),  that states "complex is better than complicated" which I take to mean that you need to look deeper for the structure in multi-facetted, intricate systems. How you find and present/document that structure can make a wealth of difference to how well, and how quickly the system is understood.

Structured data

Let's call those that develop  specs Concept Engineers. Concept engineers will have their modelling tools, scripts, and spreadsheets, etc that they use to converge on the correct design that they
proceed to write up as the spec. Those tools often create a wealth of structured data such as:
  • Register maps
  • Register field descriptions
  • Memory maps
  • Top level bus maps
  • ...
The Concept engineers will/can produce structured data of many formats, such as XML, JSON, SQL DB's, CSV files, REST format web DB's.

When writing the spec: Parts can be automated by using these concept pre-spec data sources to generate aspects.

When implementing and verifying the design: Parts can be automated by using these data sources.

What is often left out is: ISO26262 mandates that all data come from the spec.

What is the spec?

I assume that the specification must be print friendly and understandable. You need to convince auditors that what is delivered is an expression of the spec. I am assuming that expressing all of the concept-stage structured data in textual form and appending reams of xml as an appendix is unacceptable. You need register tables, state machine diagrams, truth tables, ...
We have standard ways of expressing our technical concepts that are expected to be used.  In the world of safety, they need to be built upon.

Round tripping

If an item is auto generated from structured date for use by the design verification team, then:
  1. The spec. should contain that data.
  2. Create a script that can scrape the end spec. format (usually a PDF), and regenrate the structured data - identical enough so that a simple textual diff will show they are the same.

Scraping aids:

(I.e. aids to scrape data from a specification. Like web-scraping does for HTML)
  • I usually take the spec in PDF as a format for reading. Luckily, if you check with your PDF generation tools, there are PDF readers that can convert PDF's to spreadsheets, In my current case, PDF text lines appear as spreadsheet rows with the text line all in the leftmost column. PDF table columns appear in multiple cells of rows, and each PDF page is a separate sheet of the workbook.
    Python, and other languages have libraries that can read spreadsheets.
  • The "original" structured data generated from concept engineering tools  should be "pretty printed" and ordered  before use in downstraem flows to allow easier textual comparison by diff'ing or other simple means.
  • When generating sections of text for a spec from structured data pre-tag that section.
    Pre tagging means adding a recognised word or sequence of words immediately before the generated spec data that denotes the format of that data chunk. For example, it could be a new line starting REGISTER:: that must allways start a register definition with its fields arranged, in-order, inside a 31-to-16, then 15-to-0 annottated horizontal table; then a table with headers of maybe "field, bitrange, type, comment"; defined text specifying register features; ...
    That pre-tag format should be used throughout the document and should not detract from how it reads. I use an example of a word followed by double colons above; a hash, '#' followed by a word, (a hashtag), would work too, but choose a format and stick to it.
    Making the tag immediately precede what is tagged and having data fields with the same tag be expressed in the same format aids scraping enormousely. (And reading too).
  • Show the structure in the items: If a pre-tagged item, such as a register name has a range of values then show a parameterised name and use a named index then show how the index links to properties of the register (in this case), that are designed to vary with the index, e.g. register offset, any of the registers bitfields, reset values, ...
    Don't expand the index in the spec. as valuable information may be lost. It may seem easier if the index has only two values, to add two sepearate "expanded" entries, but then their inter-relationships and the very fact that they are related, must then be inferred rather than being given.
    Different , separate, parameterised items may then share the same named index and index range to show extra information

Scraping benefits

  • Data duplication in specs: Explaining a topic might need the mention of a tagged item before items of that type are all shown, for example specific registers before the section where all registers are shown as part of the register map. When the spec is scraped, the scraper can make sure multiple definition are equivalent.
  • Scraped data can be used to regenerate the concept data used in design and verification flows before the spec was finalised to ensure the spec is correct. (Or those tools rerun on the specs scraped data).
  • By thinking of scraping needs, you are forced to think about finding the patterns in the data, and ensuring th spec is complete.

Diagrams

They are too much bother to try and scrape I find. I'd cut down on expressing Concept data used to generate verification and design parts as only diagrams in the spec. Most diagrams do have textul countrparts such as netlists. Creative layout of textual , parsable "code", may suffice or be used in addition to the diagram that ahows the same - scraping might then just point out an area for manual checking.

When you just can't get away from that long list

Lets say you have have 1024 identical register in all but name and "power domain". You could use a parameterised name for the register type, an alias for each indexed instance of that register type. The data format for registers needs to expand to have a field pointing to a named, pre-tagged  data table that would map each index to its alias and power domain. (Of course, if the table is never very wide, then it could be doubled-up on the page to save space and make the spec. more presentable - someone will read it.

END.

(There are other aspects of spec writing I need to write about in further posts.).

Tuesday, August 27, 2019

N-Dimensional matrix to 1-D array indexing translations.


Having done the 2-D address indexing translations, I thught about how to translate between a set of 3-D indices and a linear 1-D array index then extrapolated to n-dimensions.

I liked the idea of testing the solution and have brought that across too (with additions)

The class

1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42

# -*- coding: utf-8 -*-
"""
Created on Tue Aug 27 01:49:51 2019

@author: Paddy3118
"""

from collections import OrderedDict
from itertools import product
from functools import reduce

#%%

class ND21D_Addressing():
    """
    Convert n-dimensional indexing to/from 1-D index as if packed  
    into 1-D array. 
    All indices assumed to start from zero
    """
    def __init__(self, *extent):
        "extent is tuple of index sizes in each dimension"
        n_dim = len(extent) # Dimensionality
        self._extent = extent
        self._offsets = [reduce(int.__mul__, 
                               extent[n + 1:], 1) 
                        for n in range(n_dim)]

    # What n-dimensional index-tuple is stored at linear index.
    def i2ndim(self, index_i):
        "1-D array index to to n-D tuple of indices"
        return tuple((index_i // s) % c 
                     for s, c in zip(self._offsets, self._extent))
    
    # What linear 1-D index stores n-D tuple of indices.
    def ndim2i(self, ni):
        "n-D tuple of indices to 1-D array index"
        return sum(d * s for s, d in zip(self._offsets, ni))
    
    def __repr__(self):
        return f"{self.__class__.__name__}({str(self._extent)[1:-1]})"
#%%
    


The test

43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
def _irange(mini, maxi):
    "Integer range mini-to-maxi inclusive of _both_ endpoints"
    # Some think ranges should include _both_ endpoints, oh well.
    return range(mini, maxi+1)

def _print_n_dim(ranges_from_zero):
    "Represent the indexing of an n-D matrix"
    last = [0] * len(ranges_from_zero)
    for ni in product(*ranges_from_zero):
        for s, t in zip(last, ni):
            if s != t and t == 0: print()
        last = ni
        print(str(ni).replace(' ', ''), end=' ')
    print()

#%%
if __name__ == "__main__":
    # Dimensionality for test
    n_dim = 4
    
    # range of values in each dimension.
    dranges = [_irange(0, d+1) for d in range(n_dim)]
    # Num of values in each dim.
    extent = [len(dr) for dr in dranges]  
    
    ## The address mapper instance
    admap = ND21D_Addressing(*extent)
    
    ## A test matrix of given dimensionality
    # Optimum size of mapping to 1-dim. array
    size_1d = reduce(int.__mul__, extent)  
    # Range of all mapped to 1-dim. array index values
    range_1d = _irange(0, size_1d - 1)  

    print(f"\n## ORIGINAL {n_dim}-D ARRAY:")
    _print_n_dim(dranges)

    print(f"\n# TEST TRIAL MAP {n_dim}-D TO/FROM 1-D ARRAY ADDRESSING")
          
    # Representing a 1-D array mapped to n-D index tuple 
    dim_1 = OrderedDict((index_i, admap.i2ndim(index_i)) 
                        for index_i in range_1d)
    all_ndim = set(dim_1.values())
    all_by_dim  = [set(d1) for d1 in zip(*all_ndim)]
    assert len(all_ndim) == size_1d, "FAIL! ndim index count"
    for a_dim, its_count in zip(all_by_dim, extent):
        assert len(set(a_dim)) == its_count, \
               "FAIL! ndim individual index count"
               
    # Representing n-D index tuple mapped to 1-D index
    dim_n = OrderedDict(((ndim), admap.ndim2i(ndim))
                        for ndim in product(*dranges))
    all_i = set(dim_n.values())
    assert min(all_i) == 0, "FAIL! Min index_i not zero"
    assert max(all_i) == size_1d - 1, \
           f"FAIL! Max index_i not {size_1d - 1}"

    # Check inverse mappings
    assert all(dim_1[dim_n[ndim]] == ndim 
               for ndim in dim_n), \
           "FAIL! Mapping n-D to/from 1-D indices"
    assert all(dim_n[dim_1[index_i]] == index_i 
               for index_i in range_1d), \
           "FAIL! Mapping 1-D to/from n-D indices"

    print(f"  {admap}: PASS!")



The test output


## ORIGINAL 4-D ARRAY:
(0,0,0,0) (0,0,0,1) (0,0,0,2) (0,0,0,3) (0,0,0,4) 
(0,0,1,0) (0,0,1,1) (0,0,1,2) (0,0,1,3) (0,0,1,4) 
(0,0,2,0) (0,0,2,1) (0,0,2,2) (0,0,2,3) (0,0,2,4) 
(0,0,3,0) (0,0,3,1) (0,0,3,2) (0,0,3,3) (0,0,3,4) 

(0,1,0,0) (0,1,0,1) (0,1,0,2) (0,1,0,3) (0,1,0,4) 
(0,1,1,0) (0,1,1,1) (0,1,1,2) (0,1,1,3) (0,1,1,4) 
(0,1,2,0) (0,1,2,1) (0,1,2,2) (0,1,2,3) (0,1,2,4) 
(0,1,3,0) (0,1,3,1) (0,1,3,2) (0,1,3,3) (0,1,3,4) 

(0,2,0,0) (0,2,0,1) (0,2,0,2) (0,2,0,3) (0,2,0,4) 
(0,2,1,0) (0,2,1,1) (0,2,1,2) (0,2,1,3) (0,2,1,4) 
(0,2,2,0) (0,2,2,1) (0,2,2,2) (0,2,2,3) (0,2,2,4) 
(0,2,3,0) (0,2,3,1) (0,2,3,2) (0,2,3,3) (0,2,3,4) 


(1,0,0,0) (1,0,0,1) (1,0,0,2) (1,0,0,3) (1,0,0,4) 
(1,0,1,0) (1,0,1,1) (1,0,1,2) (1,0,1,3) (1,0,1,4) 
(1,0,2,0) (1,0,2,1) (1,0,2,2) (1,0,2,3) (1,0,2,4) 
(1,0,3,0) (1,0,3,1) (1,0,3,2) (1,0,3,3) (1,0,3,4) 

(1,1,0,0) (1,1,0,1) (1,1,0,2) (1,1,0,3) (1,1,0,4) 
(1,1,1,0) (1,1,1,1) (1,1,1,2) (1,1,1,3) (1,1,1,4) 
(1,1,2,0) (1,1,2,1) (1,1,2,2) (1,1,2,3) (1,1,2,4) 
(1,1,3,0) (1,1,3,1) (1,1,3,2) (1,1,3,3) (1,1,3,4) 

(1,2,0,0) (1,2,0,1) (1,2,0,2) (1,2,0,3) (1,2,0,4) 
(1,2,1,0) (1,2,1,1) (1,2,1,2) (1,2,1,3) (1,2,1,4) 
(1,2,2,0) (1,2,2,1) (1,2,2,2) (1,2,2,3) (1,2,2,4) 
(1,2,3,0) (1,2,3,1) (1,2,3,2) (1,2,3,3) (1,2,3,4) 

# TEST TRIAL MAP 4-D TO/FROM 1-D ARRAY ADDRESSING
  ND21D_Addressing(2, 3, 4, 5): PASS!


END.

Followers

Subscribe Now: google

Add to Google Reader or Homepage

Go deh too!

whos.amung.us

Blog Archive