Saturday, May 25, 2013

Greedy Ranking Algorithm in Python


I mentioned in an earlier post that I had written my own ranker and thought I'd revisit this with some code.

I verify and ensure the safety of microprocessors for my day job. One way that very complex CPUs are tested is to create another model of the chip which can be used to generate pseudo-random instruction streams to run on the CPU. This so-called instruction stream generator (ISG) can create thousands (millions!) of these tests in very little time, and the ISG is written in such a way that it can be 'tweaked' to give some control or steering over what the instruction streams will exercise on the CPU.

Now simulating these instruction streams and gathering information on just what parts of the CPU are exercised, called covered, by each individual test takes time, and multiple ISG-generated tests may cover the same regions of the CPU. To increase the overall coverage of the CPU we run what is called a regression - all the tests are run, and their coverage and the time they take to simulate are stored. At the end of the regression run you may have several thousands of tests that cover only part of the CPU.

If you take the regression results and rank them you can find that subset of the tests that give all the coverage. Usually thousands of pseudo-random tests might be ranked and generate a sub-list of only hundreds of tests that when run would give the same coverage. What we then usually do is look at what isn't covered and generate some more tests by the ISG or other methods to try and fill the gaps; run the new regression and rank again in a loop to fully exercise the CPU and hit some target coverage goal.

Ranking tests is an important part of the regression flow described above, and when it works well you forget about it. Unfortunately, sometimes I want to rank other data, for which the stock ranking program from the CAD tool vendors does not fit. So here are the guts of a ranking program that will scale to handling hundreds of thousands of tests and coverage points.

Input

Normally I have to parse my input from text or HTML files of results generated by other CAD programs - it is tedious work that I will skip by providing idealised inputs in the form of a Python dict. (Sometimes the code for parsing the input files can be as large as, or larger than, the ranking algorithm.)

Let us assume that each ISG test has a name, runs for a certain 'time', and when simulated is shown to 'cover' a set of numbered features of the design. After parsing, the gathered input data is represented by the results dict in the program.

  1 
  2 results = {
  3 #    'TEST': (  TIME, set([COVERED_POINT ...])),
  4   'test_00': (  2.08, set([2, 3, 5, 11, 12, 16, 19, 23, 25, 26, 29, 36, 38, 40])),
  5   'test_01': ( 58.04, set([0, 10, 13, 15, 17, 19, 20, 22, 27, 30, 31, 33, 34])),
  6   'test_02': ( 34.82, set([3, 4, 6, 12, 15, 21, 23, 25, 26, 33, 34, 40])),
  7   'test_03': ( 32.74, set([4, 5, 10, 16, 21, 22, 26, 39])),
  8   'test_04': (100.00, set([0, 1, 4, 6, 7, 8, 9, 11, 12, 18, 26, 27, 31, 36])),
  9   'test_05': (  4.46, set([1, 2, 6, 11, 14, 16, 17, 21, 22, 23, 30, 31])),
 10   'test_06': ( 69.57, set([10, 11, 15, 17, 19, 22, 26, 27, 30, 32, 38])),
 11   'test_07': ( 85.71, set([0, 2, 4, 5, 9, 10, 14, 17, 24, 34, 36, 39])),
 12   'test_08': (  5.73, set([0, 3, 8, 9, 13, 19, 23, 25, 28, 36, 38])),
 13   'test_09': ( 15.55, set([7, 15, 17, 25, 26, 30, 31, 33, 36, 38, 39])),
 14   'test_10': ( 12.05, set([0, 4, 13, 14, 15, 24, 31, 35, 39])),
 15   'test_11': ( 52.23, set([0, 3, 6, 10, 11, 13, 23, 34, 40])),
 16   'test_12': ( 26.79, set([0, 1, 4, 5, 7, 8, 10, 12, 13, 31, 32, 40])),
 17   'test_13': ( 16.07, set([2, 6, 9, 11, 13, 15, 17, 18, 34])),
 18   'test_14': ( 40.62, set([1, 2, 8, 15, 16, 19, 22, 26, 29, 31, 33, 34, 38])),
 19  }
 20 

Greedy ranking algorithm

The object of the algorithm is to select and order a subset of the tests that:

  1. Cover as many of the coverage points as possible by at least one test.
  2. After the above, reduce the number of tests needed to achieve that maximum coverage by as much as is possible.
  3. Generate a ranking of the tests selected to allow an even smaller set of tests to be selected if necessary.
  4. After all of the above, which are of decreasing importance, it would also be good to reduce the total 'time' accrued by the ranked tests.
  5. Of course it needs to work for large sets of tests and points to cover.
The greedy algorithm works by first choosing the test giving the most coverage to be the test of highest rank, then the test giving the most incremental additional coverage as the next highest ranking test, and so on...
If more than one test gives the same incremental additional coverage at any stage, then the test taking the least 'time' is picked.
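The tie-break falls out of Python's tuple comparison: store each contribution as a (delta_cover, -cost, name) tuple and a single max call picks the biggest coverage delta first, then the cheapest test among equal deltas. A minimal sketch, using values like those in the round-1 trace further down:

```python
# Tuples compare element by element, so among equal coverage deltas
# the entry with the largest -cost (i.e. the cheapest test) wins.
contributions = [(14, -2.08, 'test_00'),    # 14 new points, cost 2.08
                 (14, -100.00, 'test_04'),  # 14 new points, cost 100.00
                 (13, -40.62, 'test_14')]   # fewer new points, never wins
best = max(contributions)
print(best)  # (14, -2.08, 'test_00')
```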

The following function implements the algorithm:


 21 def greedyranker(results):
 22     results = results.copy()
 23     ranked, coveredsofar, costsofar, round = [], set(), 0, 0
 24     noncontributing = []
 25     while results:
 26         round += 1
 27         # What each test can contribute to the pool of what is covered so far
 28         contributions = [(len(cover - coveredsofar), -cost, test)
 29                          for test, (cost, cover) in sorted(results.items()) ]
 30         # Greedy ranking by taking the next greatest contributor                 
 31         delta_cover, benefit, test = max( contributions )
 32         if delta_cover > 0:
 33             ranked.append((test, delta_cover))
 34             cost, cover = results.pop(test)
 35             coveredsofar.update(cover)
 36             costsofar += cost
 37         for delta_cover, benefit, test in contributions:
 38             if delta_cover == 0:
 39                 # this test cannot contribute anything
 40                 noncontributing.append( (test, round) )
 41                 results.pop(test)
 42     return coveredsofar, ranked, costsofar, noncontributing
 43 

Each time through the while loop (line 25), the next best test is appended to the ranking, and tests that can no longer contribute any extra coverage are discarded (lines 37-41).
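To see the loop in action on something smaller, here is a condensed version of the same greedy selection applied to a made-up three-test toy example (the names and sets below are invented for illustration, not taken from the regression data above):

```python
# Condensed version of the greedy loop, run on made-up toy data.
def greedyranker(results):
    results = dict(results)            # don't mutate the caller's dict
    ranked, coveredsofar, costsofar = [], set(), 0.0
    while results:
        # (extra points, -time, name) for every remaining test
        contributions = [(len(cover - coveredsofar), -cost, test)
                         for test, (cost, cover) in sorted(results.items())]
        delta_cover, _, test = max(contributions)
        if delta_cover > 0:
            cost, cover = results.pop(test)
            ranked.append((test, delta_cover))
            coveredsofar.update(cover)
            costsofar += cost
        for delta, _, test in contributions:
            if delta == 0:             # can never add coverage again
                results.pop(test)
    return coveredsofar, ranked, costsofar

toy = {'a': (1.0, {1, 2, 3}),
       'b': (2.0, {3, 4}),
       'c': (0.5, {1, 2})}             # redundant once 'a' is picked
covered, ranked, cost = greedyranker(toy)
print(ranked)  # [('a', 3), ('b', 1)] - 'c' is dropped, not ranked
```

Test 'a' is ranked first (three new points), 'b' adds one more, and 'c' is discarded because everything it covers is already covered.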

The function above is a bit dry, so I took some time to annotate it with a tutor capability that, when run, prints out just what it is doing along the way:

The function with tutor

It implements the same thing but does it noisily:

 44 def greedyranker(results, tutor=True):
 45     results = results.copy()
 46     ranked, coveredsofar, costsofar, round = [], set(), 0, 0
 47     noncontributing = []
 48     while results:
 49         round += 1
 50         # What each test can contribute to the pool of what is covered so far
 51         contributions = [(len(cover - coveredsofar), -cost, test)
 52                          for test, (cost, cover) in sorted(results.items()) ]
 53         if tutor:
 54             print('\n## Round %i' % round)
 55             print('  Covered so far: %2i points: ' % len(coveredsofar))
 56             print('  Ranked so far: ' + repr([t for t, d in ranked]))
 57             print('  What the remaining tests can contribute, largest contributors first:')
 58             print('    # DELTA, BENEFIT, TEST')
 59             deltas = sorted(contributions, reverse=True)
 60             for delta_cover, benefit, test in deltas:
 61                 print('     %2i,    %7.2f,    %s' % (delta_cover, benefit, test))
 62             if len(deltas)>=2 and deltas[0][0] == deltas[1][0]:
 63                 print('  Note: This time around, more than one test gives the same')
 64                 print('        maximum delta contribution of %i to the coverage so far'
 65                        % deltas[0][0])
 66                 if deltas[0][1] != deltas[1][1]:
 67                     print('        we order based on the next field of minimum cost')
 68                     print('        (equivalent to maximum negative cost).')
 69                 else:
 70                     print('        the next field of minimum cost is the same so')
 71                     print('        we arbitrarily order by test name.')
 72             zeroes = [test for delta_cover, benefit, test in deltas
 73                      if delta_cover == 0]
 74             if zeroes:
 75                 print('  The following test(s) cannot contribute more to coverage')
 76                 print('  and will be dropped:')
 77                 print('    ' + ', '.join(zeroes))
 78 
 79         # Greedy ranking by taking the next greatest contributor                 
 80         delta_cover, benefit, test = max( contributions )
 81         if delta_cover > 0:
 82             ranked.append((test, delta_cover))
 83             cost, cover = results.pop(test)
 84             if tutor:
 85                 print('  Ranking %s in round %2i giving extra coverage of: %r'
 86                        % (test, round, sorted(cover - coveredsofar)))
 87             coveredsofar.update(cover)
 88             costsofar += cost
 89 
 90         for delta_cover, benefit, test in contributions:
 91             if delta_cover == 0:
 92                 # this test cannot contribute anything
 93                 noncontributing.append( (test, round) )
 94                 results.pop(test)
 95     if tutor:
 96         print('\n## ALL TESTS NOW RANKED OR DISCARDED\n')
 97     return coveredsofar, ranked, costsofar, noncontributing

Every block starting with if tutor: above contains the added code.

Sample output

The code to call the ranker and print the results is:

 98 
 99 
100 totalcoverage, ranking, totalcost, nonranked = greedyranker(results)
101 print('''
102 A total of %i points were covered, 
103 using only %i of the initial %i tests,
104 and should take %g time units to run.
105 
106 The tests in order of coverage added:
107     
108     TEST  DELTA-COVERAGE'''
109  % (len(totalcoverage), len(ranking), len(results), totalcost))
110 print('\n'.join('  %6s  %i' % r for r in ranking))

The output has a lot of stuff from the tutor followed by the result at the end.

For this pseudo-randomly generated test case of 15 tests, it shows that only seven are needed to achieve the maximum total coverage. (And if you were willing to lose the coverage of the three tests that each add only one extra point, then 4 out of 15 tests would give 92.5% of the maximum coverage possible.)

## Round 1
  Covered so far:  0 points: 
  Ranked so far: []
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
     14,      -2.08,    test_00
     14,    -100.00,    test_04
     13,     -40.62,    test_14
     13,     -58.04,    test_01
     12,      -4.46,    test_05
     12,     -26.79,    test_12
     12,     -34.82,    test_02
     12,     -85.71,    test_07
     11,      -5.73,    test_08
     11,     -15.55,    test_09
     11,     -69.57,    test_06
      9,     -12.05,    test_10
      9,     -16.07,    test_13
      9,     -52.23,    test_11
      8,     -32.74,    test_03
  Note: This time around, more than one test gives the same
        maximum delta contribution of 14 to the coverage so far
        we order based on the next field of minimum cost
        (equivalent to maximum negative cost).
  Ranking test_00 in round  1 giving extra coverage of: [2, 3, 5, 11, 12, 16, 19, 23, 25, 26, 29, 36, 38, 40]

## Round 2
  Covered so far: 14 points: 
  Ranked so far: ['test_00']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
     12,     -58.04,    test_01
     10,    -100.00,    test_04
      9,     -12.05,    test_10
      9,     -26.79,    test_12
      9,     -85.71,    test_07
      8,      -4.46,    test_05
      7,     -15.55,    test_09
      7,     -16.07,    test_13
      7,     -40.62,    test_14
      7,     -69.57,    test_06
      6,     -34.82,    test_02
      5,      -5.73,    test_08
      5,     -32.74,    test_03
      5,     -52.23,    test_11
  Ranking test_01 in round  2 giving extra coverage of: [0, 10, 13, 15, 17, 20, 22, 27, 30, 31, 33, 34]

## Round 3
  Covered so far: 26 points: 
  Ranked so far: ['test_00', 'test_01']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      7,    -100.00,    test_04
      5,     -12.05,    test_10
      5,     -26.79,    test_12
      5,     -85.71,    test_07
      4,      -4.46,    test_05
      3,      -5.73,    test_08
      3,     -16.07,    test_13
      3,     -32.74,    test_03
      3,     -34.82,    test_02
      2,     -15.55,    test_09
      2,     -40.62,    test_14
      1,     -52.23,    test_11
      1,     -69.57,    test_06
  Ranking test_04 in round  3 giving extra coverage of: [1, 4, 6, 7, 8, 9, 18]

## Round 4
  Covered so far: 33 points: 
  Ranked so far: ['test_00', 'test_01', 'test_04']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      4,     -12.05,    test_10
      3,     -85.71,    test_07
      2,      -4.46,    test_05
      2,     -32.74,    test_03
      1,      -5.73,    test_08
      1,     -15.55,    test_09
      1,     -26.79,    test_12
      1,     -34.82,    test_02
      1,     -69.57,    test_06
      0,     -16.07,    test_13
      0,     -40.62,    test_14
      0,     -52.23,    test_11
  The following test(s) cannot contribute more to coverage
  and will be dropped:
    test_13, test_14, test_11
  Ranking test_10 in round  4 giving extra coverage of: [14, 24, 35, 39]

## Round 5
  Covered so far: 37 points: 
  Ranked so far: ['test_00', 'test_01', 'test_04', 'test_10']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      1,      -4.46,    test_05
      1,      -5.73,    test_08
      1,     -26.79,    test_12
      1,     -32.74,    test_03
      1,     -34.82,    test_02
      1,     -69.57,    test_06
      0,     -15.55,    test_09
      0,     -85.71,    test_07
  Note: This time around, more than one test gives the same
        maximum delta contribution of 1 to the coverage so far
        we order based on the next field of minimum cost
        (equivalent to maximum negative cost).
  The following test(s) cannot contribute more to coverage
  and will be dropped:
    test_09, test_07
  Ranking test_05 in round  5 giving extra coverage of: [21]

## Round 6
  Covered so far: 38 points: 
  Ranked so far: ['test_00', 'test_01', 'test_04', 'test_10', 'test_05']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      1,      -5.73,    test_08
      1,     -26.79,    test_12
      1,     -69.57,    test_06
      0,     -32.74,    test_03
      0,     -34.82,    test_02
  Note: This time around, more than one test gives the same
        maximum delta contribution of 1 to the coverage so far
        we order based on the next field of minimum cost
        (equivalent to maximum negative cost).
  The following test(s) cannot contribute more to coverage
  and will be dropped:
    test_03, test_02
  Ranking test_08 in round  6 giving extra coverage of: [28]

## Round 7
  Covered so far: 39 points: 
  Ranked so far: ['test_00', 'test_01', 'test_04', 'test_10', 'test_05', 'test_08']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      1,     -26.79,    test_12
      1,     -69.57,    test_06
  Note: This time around, more than one test gives the same
        maximum delta contribution of 1 to the coverage so far
        we order based on the next field of minimum cost
        (equivalent to maximum negative cost).
  Ranking test_12 in round  7 giving extra coverage of: [32]

## Round 8
  Covered so far: 40 points: 
  Ranked so far: ['test_00', 'test_01', 'test_04', 'test_10', 'test_05', 'test_08', 'test_12']
  What the remaining tests can contribute, largest contributors first:
    # DELTA, BENEFIT, TEST
      0,     -69.57,    test_06
  The following test(s) cannot contribute more to coverage
  and will be dropped:
    test_06

## ALL TESTS NOW RANKED OR DISCARDED


A total of 40 points were covered, 
using only 7 of the initial 15 tests,
and should take 209.15 time units to run.

The tests in order of coverage added:
    
    TEST  DELTA-COVERAGE
  test_00  14
  test_01  12
  test_04  7
  test_10  4
  test_05  1
  test_08  1
  test_12  1

What should be next

There is a new Unified Coverage Interoperability Standard (UCIS) for a database for storing test coverage data. Ideally, the greedy ranker should be hooked up to that UCIS DB to get its inputs via its C interface, or maybe its XML output, instead of parsing text files.
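As a rough sketch of what such a hook-up might look like, here is XML parsing into the results dict format used above. The element and attribute names ('test', 'time', 'bin', 'id') are invented for illustration and are NOT the real UCIS schema; a real adapter would follow whatever the UCIS XML output actually emits.

```python
# Hypothetical sketch: tag/attribute names are made up, not UCIS's.
import xml.etree.ElementTree as ET

sample = """<coverage>
  <test name="test_00" time="2.08">
    <bin id="2"/><bin id="3"/><bin id="5"/>
  </test>
  <test name="test_01" time="58.04">
    <bin id="0"/><bin id="3"/>
  </test>
</coverage>"""

def parse_coverage(xml_text):
    "Turn a coverage report into the {name: (time, covered_set)} dict."
    results = {}
    for test in ET.fromstring(xml_text).iter('test'):
        covered = set(int(b.get('id')) for b in test.iter('bin'))
        results[test.get('name')] = (float(test.get('time')), covered)
    return results

print(parse_coverage(sample))
```

The returned dict can be fed straight to greedyranker().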

Addendum

Random results dict creator

As used for testing:

def cover_creator(ntests=25, maxcoverpoints=100):
    "Create a pseudo-random results dict for exercising the ranker."
    import random
    results = {}
    # Each test covers between a sixth and a third of the points
    coveredrange = (maxcoverpoints * 1 // 6, 
                    1 + maxcoverpoints * 2 // 6)
    for test in range(ntests):
        name = 'test_%02i' % test
        covered = sorted(set(random.randint(0, maxcoverpoints-1)
                         for i in range(random.randint(*coveredrange))))
        # 'time' grows with the amount covered, with up to +/-20% jitter
        time = len(covered) * (100 + (random.random() - 0.5) * 40) / 100.0
        results[name] = ( float('%6.2f' % time), set(covered))
    return results


END.


8 comments:

  1. (Using Python≥2.7 set notation)

    I believe that greedyranker({
    'test1': (1, {1, 2, 3, 4, 5, 6, 7, 8}),
    'test2a': (1, {1, 3, 5, 7, 9, 11, 13}),
    'test2b': (1, {2, 4, 6, 8, 10, 12}),
    'test3a': (1, {9}),
    'test3b': (1, {10}),
    'test3c': (1, {11}),
    'test3d': (1, {12}),
    'test3e': (1, {13}),
    })
    should optimally return 'test2a' and 'test2b' as ranked.

    ReplyDelete
  2. Replace last line of my comment with “should optimally return *only* 'test2a' and 'test2b' as ranked.”

    ReplyDelete
  3. Hi ΤΖΩΤΖΙΟΥ, optimally, you are correct. This is a greedy algorithm that can be used when there are thousands of tests and tens of thousands of cover points, where finding the optimum solution would take a lifetime (of a star or more ;-)

    ReplyDelete
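    [The commenter's example is small enough to check by brute force; this sketch enumerates subsets in increasing size until one covers everything, which is only feasible for a handful of tests and is precisely why the greedy ranker exists.]

```python
# Brute-force sketch on the commenter's example: find the smallest
# subset of tests whose union covers all points. Exponential in the
# number of tests, so only usable on tiny inputs like this one.
from itertools import combinations

results = {
    'test1':  (1, {1, 2, 3, 4, 5, 6, 7, 8}),
    'test2a': (1, {1, 3, 5, 7, 9, 11, 13}),
    'test2b': (1, {2, 4, 6, 8, 10, 12}),
    'test3a': (1, {9}), 'test3b': (1, {10}), 'test3c': (1, {11}),
    'test3d': (1, {12}), 'test3e': (1, {13}),
}
allpoints = set().union(*(cover for _, cover in results.values()))

def smallest_full_cover(results, allpoints):
    "Return the first (smallest) subset of tests covering every point."
    for size in range(1, len(results) + 1):
        for combo in combinations(sorted(results), size):
            if set().union(*(results[t][1] for t in combo)) == allpoints:
                return combo
    return None

print(smallest_full_cover(results, allpoints))  # ('test2a', 'test2b')
```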
  4. Yes, obviously attempting the full powerset could take a lifetime. :)
    I would be interested in giving it a shot as a pet project, if you gave me a sample of such input data. If I come up with a modified algorithm that improves results in tolerably increased time, I guess it'll be a win-win.

    ReplyDelete
    Replies
    1. Hi again ΤΖΩΤΖΙΟΥ, the new addendum has the code I used to create sample test data. Have fun, and tell me how you get on please. Thanks.

      Delete
  5. Hi - I got interested enough in the UCIS to mostly wrap it in Python https://bitbucket.org/verilab/pyucis

    I say 'mostly' as I haven't quite gotten the whole SWIG interface completed, but a lot of it is there and working. Might be interesting/useful. Certainly a lot less code to write using Pythonic iterators and file handling than using the C equivalent version of the API - most of the example code is about 1/3rd the size of the equivalent sample from the UCIS specification.

    ReplyDelete
  6. Can we, at the end, do the following:
    For each set, check if it is a subset of any other set, and if so exclude it from the result.

    ReplyDelete
    Replies
    1. This _could_ be done, but for my use cases, it is not likely to exclude anything more.

      Delete