The usual meaning of "significant" is "what is the smallest set of tests that would give the same overall coverage?" The tools don't try every combination of tests to find it (I have used a simple greedy algorithm). This got me thinking about other ways I might want to calculate the test rankings.
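To make the greedy idea concrete, here is a minimal sketch of one way it could work; it is not the code any particular tool uses. It assumes coverage is available as a dictionary mapping each test name to the set of cover points it hits. The test names, point names and the `greedy_rank` function are all made up for illustration.

```python
def greedy_rank(coverage):
    """Greedy ranking. `coverage` maps test name -> set of cover points hit."""
    remaining = set().union(*coverage.values())  # points still to be covered
    ranking = []
    while remaining:
        # Pick the test adding the most uncovered points; ties go to the
        # alphabetically first name because max() keeps the first maximum.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & remaining))
        ranking.append((best, len(coverage[best] & remaining)))
        remaining -= coverage[best]
    return ranking


# Hypothetical data: four tests, four cover points.
coverage = {
    "t1": {"a", "b", "c"},
    "t2": {"c", "d"},
    "t3": {"a", "d"},
    "t4": {"b"},
}
ranking = greedy_rank(coverage)
print(ranking)  # [('t1', 3), ('t2', 1)]
```

Each entry pairs a test with the number of new points it contributed when it was picked. Greedy selection is not guaranteed to find the smallest possible set, which is the price paid for not trying every combination.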
I came up with two other ways of ranking tests that might be useful:
- File centred ranking: Using coverage results expressed on a per-source-file basis, extract the ranking of tests for full coverage of each source file individually; then take only the tests that appear in those per-file rankings and rank them by how they contribute to coverage of all files in the design.
- Instance centred ranking: Using coverage results expressed on a per-instance basis, extract the ranking of tests for full coverage of each instance individually; then take only the tests that appear in those per-instance rankings and rank them by how they contribute to coverage of all instances in the design.
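Both schemes have the same two-stage shape, so one sketch can serve for either. The following is my own reading of the description above and rests on several assumptions: coverage arrives as a nested dictionary of unit (file or instance) to test to set of points, the same simple greedy ranking is used at both stages, and the file names, test names and the `centred_rank` function are hypothetical.

```python
def greedy_rank(coverage):
    """Greedy ranking. `coverage` maps test name -> set of cover points hit."""
    remaining = set().union(*coverage.values())
    ranking = []
    while remaining:
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & remaining))
        ranking.append((best, len(coverage[best] & remaining)))
        remaining -= coverage[best]
    return ranking


def centred_rank(per_unit):
    """`per_unit` maps unit (file or instance) -> {test: set of points}."""
    # Stage 1: rank the tests for each unit on its own.
    per_unit_ranking = {u: greedy_rank(tests) for u, tests in per_unit.items()}
    # Keep only the tests that appear in at least one per-unit ranking.
    contributors = {t for r in per_unit_ranking.values() for t, _ in r}
    # Stage 2: rank those tests over the whole design. Points are tagged
    # with their unit so points in different units stay distinct.
    overall = {
        t: {(u, p) for u, tests in per_unit.items() for p in tests.get(t, ())}
        for t in contributors
    }
    return per_unit_ranking, greedy_rank(overall)


# Hypothetical per-file data; cover points are numbered within each file.
per_file = {
    "alu.v": {"t1": {1, 2}, "t2": {2, 3}, "t3": {1, 2, 3}},
    "fifo.v": {"t1": {1}, "t2": {1, 2}},
}
per_file_ranking, overall_ranking = centred_rank(per_file)
print(per_file_ranking)  # {'alu.v': [('t3', 3)], 'fifo.v': [('t2', 2)]}
print(overall_ranking)   # [('t2', 4), ('t3', 1)]
```

The per-unit rankings from the first stage are the by-product a designer would look at to see which tests matter for their own file or instance.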
File and instance centred ranking leave behind useful information for the designers: which tests target the parts of the design they are working on. I guess (because this is another thought experiment) that FCR and ICR would never give smaller ranked test lists than DCR.
All three of the above methods of ranking tests could do with flagging which tests provide sole coverage for a cover point. Some tests cover only points that other tests also cover, while others cover points that no other test reaches; it might be useful to be able to show this information.
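Flagging sole coverage is straightforward once coverage is held as sets. This is a small sketch under the same assumption as before (a dictionary of test name to set of cover points); the `sole_coverage` name and the data are invented for the example.

```python
from collections import defaultdict


def sole_coverage(coverage):
    """Return {test: set of points that only that test covers}."""
    # Invert the mapping: for each point, which tests hit it?
    owners = defaultdict(set)
    for test, points in coverage.items():
        for point in points:
            owners[point].add(test)
    # A point with exactly one owner is solely covered by that test.
    sole = defaultdict(set)
    for point, tests in owners.items():
        if len(tests) == 1:
            (only_test,) = tests
            sole[only_test].add(point)
    return dict(sole)


# Hypothetical data: "b" is hit by everything, "a" and "c" by one test each.
coverage = {
    "t1": {"a", "b"},
    "t2": {"b", "c"},
    "t3": {"b"},
}
sole = sole_coverage(coverage)
print(sole)  # t1 solely covers "a", t2 solely covers "c", t3 nothing
```

A test that appears in the result cannot be dropped without losing coverage, whatever ranking method is in use.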
It's my Data
If the tool vendors supported an interface to Python, then data mining within your coverage results would be a lot simpler, and would get us part way to the kind of environment scientists have with SciPy. With test coverage expressed as Python sets, you could roll your own coverage processing.
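As a taste of what that might look like, here are a few ad hoc queries using nothing but the built-in set operators. No vendor interface like this exists as far as I know, so the data below is typed in by hand and every name in it is hypothetical.

```python
# Hypothetical data: every cover point in the design, and what two tests hit.
all_points = {"a", "b", "c", "d", "e"}
coverage = {
    "smoke": {"a", "b"},
    "regress": {"b", "c", "d"},
}

covered = set().union(*coverage.values())                # hit by any test
holes = all_points - covered                             # never hit
overlap = coverage["smoke"] & coverage["regress"]        # hit by both tests
only_regress = coverage["regress"] - coverage["smoke"]   # extra from regress
percent = 100 * len(covered) / len(all_points)

print(f"{percent:.1f}% covered, holes: {sorted(holes)}")
# 80.0% covered, holes: ['e']
```

Union, intersection and difference cover a surprising number of the questions you would otherwise need a vendor report for.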