How to use combinations of sets as test data

Asked 2/8, 2008 at 21:34 Answered 4/10, 2008 at 8:52

Solved unit-testing language-agnostic testing

I would like to test a function with tuples drawn from a set of fringe cases and normal values. For example, when testing a function that returns true whenever it is given three lengths that form a valid triangle, I would have specific cases: negative, small, and large numbers, values close to overflowing, etc. The main aim is to generate combinations of these values, with or without repetition, in order to get a set of test data.

(inf,0,-1), (5,10,1000), (10,5,5), (0,-1,5), (1000,inf,inf),
...

As a note: I actually know the answer to this, but it might be helpful for others, and a challenge for people here! --will post my answer later on.

Demonstrable answered 2/8, 2008 at 21:34 Comment(1)

Abacus (on GitHub) is a combinatorics library for Node.js, Python, PHP, and ActionScript (P.S. I'm the author) – Cubiculum 5/3, 2015 at 23:4

Absolutely, especially dealing with lots of these permutations/combinations I can definitely see that the first pass would be an issue.

Interesting implementation in Python, though I wrote a nice one in C and OCaml based on "Algorithm 515" (see below). The authors wrote theirs in Fortran, as was common back then for all the "Algorithm XX" papers (well, that, assembly, or C). I had to rewrite it and make some small improvements so that it works with arrays rather than ranges of numbers. This one does random access; I'm still working on getting some nice implementations of the ones mentioned in Knuth's Volume 4, Fascicle 2. I'll leave an explanation of how this works to the reader, though if someone is curious, I wouldn't object to writing something up.

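/* binomial coefficient C(n,k), "n choose k"; 0 when k is out of range */
static int choose(int n, int k){
    int i, r = 1;
    if (k < 0 || k > n) return 0;
    for(i=1;i<=k;i++)
        r = r * (n-k+i) / i;            /* each division is exact */
    return r;
}

/** [combination c n p x]
 * get the [x]th lexicographically ordered set of [p] elements in [n]
 * output is in [c], and should be sizeof(int)*[p]
 * [x] is 1-based, and the values stored in [c] are 1-based (1..[n]) */
void combination(int* c,int n,int p, int x){
    int i,r,k = 0;
    if (p == 1) { c[0] = x; return; }   /* c[p-2] below needs p >= 2 */
    for(i=0;i<p-1;i++){
        c[i] = (i != 0) ? c[i-1] : 0;   /* start just past the previous element */
        do {
            c[i]++;
            r = choose(n-c[i],p-(i+1)); /* combinations sharing the prefix c[0..i] */
            k = k + r;
        } while(k < x);
        k = k - r;                      /* combinations skipped so far */
    }
    c[p-1] = c[p-2] + x - k;            /* last element is the remaining offset */
}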

~"Algorithm 515: Generation of a Vector from the Lexicographical Index"; Buckles, B. P., and Lybanon, M. ACM Transactions on Mathematical Software, Vol. 3, No. 2, June 1977.
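For readers who prefer Python, here is a rough port of the C routine above. It is a sketch, not the author's code: it assumes Python 3.8+ so that `math.comb` can stand in for `choose`, it keeps the 1-based index `x` and the 1-based element values, and the range checks and the `p == 1` special case are additions.

```python
import itertools
import math


def combination(n, p, x):
    """Return the x-th (1-based) p-element subset of 1..n in lexicographic order."""
    if not 1 <= p <= n:
        raise ValueError("need 1 <= p <= n")
    if not 1 <= x <= math.comb(n, p):
        raise ValueError("x must be in 1..C(n, p)")
    if p == 1:
        return [x]                      # the C loop would read c[-1] here
    c = [0] * p
    k = 0
    for i in range(p - 1):
        c[i] = c[i - 1] if i != 0 else 0
        while True:                     # mirrors the do/while in the C version
            c[i] += 1
            r = math.comb(n - c[i], p - (i + 1))
            k += r
            if k >= x:
                break
        k -= r
    c[p - 1] = c[p - 2] + x - k
    return c


# Cross-check against itertools for a small case.
expected = [list(t) for t in itertools.combinations(range(1, 6), 3)]
actual = [combination(5, 3, x) for x in range(1, math.comb(5, 3) + 1)]
print(actual == expected)
```

Random access like this is handy when the full set of combinations is too large to enumerate: pick a few indices and build only those test tuples, mapping each 1-based value `j` to `items[j - 1]`.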

Demonstrable answered 3/8, 2008 at 19:6 Comment(5)

What does choose() do? Does that basically return n-c[i] choose p-(i+1)? – Archuleta 28/2, 2012 at 4:1

I'm sorry, I don't follow the choose behavior. It sounds like it's defined as self reflexive - choose does choose. Could you tell me what it does in simpler terms? – Annikaanniken 18/5, 2015 at 18:55

"self-reflexive" is an incorrect term for this. It's called "recursive" and it's a fundamental part of computer science. The "choose" function in question is C(N,K), en.wikipedia.org/wiki/Binomial_coefficient . – Demonstrable 19/5, 2015 at 14:14

It should be noted, that x is 1 based, not 0 based as one would expect. – Malarkey 7/6, 2015 at 10:28

Even the c array seems to be referring to set elements as 1 based and not 0 based. Considering we increment c[i] off the bat, it can never refer to 0'th element. Or am I missing something? Or did OP mean to init c[i] with -1 and say choose(n-c[i]-1,...)? – Vendace 29/10, 2015 at 17:53

With the brand new Python 2.6, you have a standard solution with the itertools module that returns the Cartesian product of iterables:

import itertools

print(list(itertools.product([1,2,3], [4,5,6])))
# [(1, 4), (1, 5), (1, 6),
#  (2, 4), (2, 5), (2, 6),
#  (3, 4), (3, 5), (3, 6)]

You can provide a "repeat" argument to perform the product of an iterable with itself:

print(list(itertools.product([1,2], repeat=3)))
# [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2),
#  (2, 1, 1), (2, 1, 2), (2, 2, 1), (2, 2, 2)]

You can also use combinations, where order does not matter:

print(list(itertools.combinations('123', 2)))
# [('1', '2'), ('1', '3'), ('2', '3')]

And if order matters, there are permutations:

print(list(itertools.permutations([1,2,3,4], 2)))
# [(1, 2), (1, 3), (1, 4),
#  (2, 1), (2, 3), (2, 4),
#  (3, 1), (3, 2), (3, 4),
#  (4, 1), (4, 2), (4, 3)]

Of course, these functions don't all do the same thing, but you can use them in one way or another to solve your problem.

Just remember that you can convert a tuple or a list to a set and vice versa using list(), tuple() and set().
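Applied to the triangle example from the question, these functions might be used roughly as follows. This is only a sketch: `is_triangle` is a hypothetical stand-in for the function under test, `edge_values` is an illustrative pool, and `math.inf` assumes Python 3.5+.

```python
import itertools
import math


def is_triangle(a, b, c):
    """Hypothetical function under test: True if the sides form a valid triangle."""
    sides = (a, b, c)
    if not all(isinstance(s, (int, float)) for s in sides):
        return False
    if not all(math.isfinite(s) and s > 0 for s in sides):
        return False
    x, y, z = sorted(sides)
    return x + y > z


# Illustrative pool of fringe and normal values (an assumption, not from the answer).
edge_values = [-1, 0, 1, 5, 10, 1000, math.inf]

# Every ordered triple, values may repeat: 7**3 tuples.
ordered = list(itertools.product(edge_values, repeat=3))
# Unordered triples, values may repeat.
unordered = list(itertools.combinations_with_replacement(edge_values, 3))
# Unordered triples of three different pool entries.
distinct = list(itertools.combinations(edge_values, 3))

print(len(ordered), len(unordered), len(distinct))
valid = [sides for sides in unordered if is_triangle(*sides)]
print(valid)
```

`product` gives the most thorough coverage, including argument order, while `combinations_with_replacement` keeps the data set small when the function is expected to be symmetric in its arguments.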

Obvious answered 4/10, 2008 at 8:52 Comment(0)

Interesting question!

I would do this by picking combinations, something like the following in python. The hardest part is probably first pass verification, i.e. if f(1,2,3) returns true, is that a correct result? Once you have verified that, then this is a good basis for regression testing.

Probably it's a good idea to make a set of test cases that you know will be all true (e.g. 3,4,5 for this triangle case), and a set of test cases that you know will be all false (e.g. 0,1,inf). Then you can more easily verify the tests are correct.

# xpermutations from http://code.activestate.com/recipes/190465
from xpermutations import *

lengths=[-1,0,1,5,10,0,1000,'inf']
for c in xselections(lengths,3):        # or xuniqueselections
    print c

(-1,-1,-1);
(-1,-1,0);
(-1,-1,1);
(-1,-1,5);
(-1,-1,10);
(-1,-1,0);
(-1,-1,1000);
(-1,-1,inf);
(-1,0,-1);
(-1,0,0);
...
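Without the recipe installed, a similar listing can be produced with the standard library, and the "known true" and "known false" sets can be checked separately. The sketch below assumes that `itertools.product(..., repeat=3)` is an acceptable stand-in for `xselections`, uses `math.inf` in place of the string `'inf'`, and treats `is_triangle` as a hypothetical function under test; the extra cases in the two known sets are illustrative.

```python
import itertools
import math


def is_triangle(a, b, c):
    """Hypothetical function under test: True if the sides form a valid triangle."""
    sides = (a, b, c)
    if not all(isinstance(s, (int, float)) for s in sides):
        return False
    if not all(math.isfinite(s) and s > 0 for s in sides):
        return False
    x, y, z = sorted(sides)
    return x + y > z


# Same pool as above (the duplicate 0 is kept as written).
lengths = [-1, 0, 1, 5, 10, 0, 1000, math.inf]
selections = list(itertools.product(lengths, repeat=3))  # stand-in for xselections

# Cases whose results are known in advance, verified by hand.
KNOWN_VALID = [(3, 4, 5), (5, 5, 5), (10, 5, 6)]
KNOWN_INVALID = [(0, 1, math.inf), (1, 2, 3), (-1, 5, 5)]


def failures(cases, expected):
    """Return the cases whose result differs from the expected value."""
    return [sides for sides in cases if is_triangle(*sides) != expected]


print(len(selections))
print(failures(KNOWN_VALID, True), failures(KNOWN_INVALID, False))
```

Once the two known sets pass, the results for the generated selections can be recorded and reused as a regression baseline.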

Jurado answered 3/8, 2008 at 0:4 Comment(0)

I think you can do this with the Row Test Attribute (available in MbUnit and later versions of NUnit) where you could specify several sets to populate one unit test.

Lapierre answered 16/8, 2008 at 13:31 Comment(0)

While it's possible to create lots of test data and see what happens, it's more efficient to try to minimize the data being used.

From a typical QA perspective, you would want to identify different classifications of inputs. Produce a set of input values for each classification and determine the appropriate outputs.

Here's a sample of classes of input values:

valid triangles with large numbers such as (1 billion, 2 billion, 2 billion)
valid triangles with small numbers such as (0.00002, 0.00003, 0.00004)
valid obtuse triangles that are 'almost' flat such as (10, 10, 19.9999)
valid acute triangles that are 'almost' flat such as (10, 10, 0.000001)
invalid triangles with at least one negative value
invalid triangles where the sum of two sides equals the third
invalid triangles where the sum of two sides is less than the third
input values that are non-numeric

...

Once you are satisfied with the list of input classifications for this function, you can create the actual test data. It would likely be helpful to test all permutations of each item, e.g. (2,3,4), (2,4,3), (3,2,4), (3,4,2), (4,2,3), (4,3,2). Typically, you'll find there are some classifications you missed (such as the concept of inf as an input parameter).
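One way to sketch this classify-then-permute approach in Python is shown below. The class names, representative values, and the `is_triangle` function under test are all illustrative assumptions; each class contributes one representative tuple plus its expected result, and every ordering of that tuple is checked.

```python
import itertools
import math


def is_triangle(a, b, c):
    """Hypothetical function under test: True if the sides form a valid triangle."""
    sides = (a, b, c)
    if not all(isinstance(s, (int, float)) for s in sides):
        return False
    if not all(math.isfinite(s) and s > 0 for s in sides):
        return False
    x, y, z = sorted(sides)
    return x + y > z


# One representative per input class: (sides, expected result).
CLASSES = {
    "valid, large numbers": ((1e9, 2e9, 2e9), True),
    "valid, small numbers": ((2e-5, 3e-5, 4e-5), True),
    "valid, almost flat (obtuse)": ((10, 10, 19.9999), True),
    "valid, almost flat (acute)": ((10, 10, 0.000001), True),
    "invalid, negative side": ((-1, 5, 5), False),
    "invalid, two sides sum to the third": ((1, 2, 3), False),
    "invalid, two sides sum to less than the third": ((1, 2, 4), False),
    "invalid, non-numeric": (("a", 1, 2), False),
    "invalid, infinite side": ((math.inf, 1, 1), False),
}


def mismatches(classes):
    """Check every distinct ordering of each representative tuple."""
    bad = []
    for name, (sides, expected) in classes.items():
        for ordering in set(itertools.permutations(sides)):
            if is_triangle(*ordering) != expected:
                bad.append((name, ordering))
    return bad


print(mismatches(CLASSES))
```

Keeping the class name next to each tuple makes a failure report say which classification broke, not just which numbers.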

Running random data for some period of time may be helpful as well; it can find strange bugs in the code, but it is generally not productive.
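If random data is used, it helps to check a property that must hold for any input rather than a hand-computed result. The sketch below assumes a hypothetical `is_triangle` under test and checks only that the answer does not depend on argument order; the value range and the fixed seed are arbitrary choices made so that a failure can be reproduced.

```python
import itertools
import math
import random


def is_triangle(a, b, c):
    """Hypothetical function under test: True if the sides form a valid triangle."""
    sides = (a, b, c)
    if not all(isinstance(s, (int, float)) for s in sides):
        return False
    if not all(math.isfinite(s) and s > 0 for s in sides):
        return False
    x, y, z = sorted(sides)
    return x + y > z


def random_order_check(trials=1000, seed=0):
    """Return the first random triple whose result depends on argument order, or None."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(trials):
        sides = tuple(rng.randint(-5, 50) for _ in range(3))
        results = {is_triangle(*p) for p in itertools.permutations(sides)}
        if len(results) != 1:
            return sides
    return None


print(random_order_check())
```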

More likely, this function is being used in some specific context where additional rules are applied (e.g. only integer values, or values must be in 0.01 increments). These add to the list of classifications of input parameters.

Platyhelminth answered 17/9, 2008 at 3:15 Comment(0)
