Off By One errors and Mutation Testing
Asked Answered
L

4

8

In the process of writing an "Off By One" mutation tester for my favourite mutation testing framework (NinjaTurtles), I wrote the following code to provide an opportunity to check the correctness of my implementation:

public int SumTo(int max)
{
    int sum = 0;
    for (var i = 1; i <= max; i++)
    {
        sum += i;
    }
    return sum;
}

now this seems simple enough, and it didn't strike me that there would be a problem trying to mutate all the literal integer constants in the IL. After all, there are only 3 (the 0, the 1, and the ++).

WRONG!

It became very obvious on the first run that it was never going to work in this particular instance. Why? Because changing the code to

public int SumTo(int max)
{
    int sum = 0;
    for (var i = 0; i <= max; i++)
    {
        sum += i;
    }
    return sum;
}

only adds 0 (zero) to the sum, and this obviously has no effect. Different story if it was the multiple set, but in this instance it was not.

Now there's a fairly easy algorithm for working out the sum of integers

sum = max * (max + 1) / 2;

which I could have fail the mutations easily, since adding or subtracting 1 from either of the constants there will result in an error. (given that max >= 0)

So, problem solved for this particular case. Although it did not do what I wanted for the test of the mutation, which was to check what would happen when I lost the ++ - effectively an infinite loop. But that's another problem.

So - My Question: Are there any trivial or non-trivial cases where a loop starting from 0 or 1 may result in a "mutation off by one" test failure that cannot be refactored (code under test or test) in a similar way? (examples please)

Note: Mutation tests fail when the test suite passes after a mutation has been applied.

Update: an example of something less trivial, but something that could still have the test refactored so that it failed would be the following

public int SumArray(int[] array)
{
    int sum = 0;
    for (var i = 0; i < array.Length; i++)
    {
        sum += array[i];
    }

    return sum;
}

Mutation testing against this code would fail when changing the var i=0 to var i=1 if the test input you gave it was new[] {0,1,2,3,4,5,6,7,8,9}. However change the test input to new[] {9,8,7,6,5,4,3,2,1,0}, and the mutation testing will fail. So a successful refactor proves the testing.

Lover answered 30/4, 2012 at 11:43 Comment(0)
H
3

One natural case of "mutation test failure" is an algorithm for matrix transposition. To make it more suitable for a single for-loop, add some constraints to this task: let the matrix be non-square and require transposition to be in-place. These constraints make one-dimensional array most suitable place to store the matrix and a for-loop (starting, usually, from index '1') may be used to process it. If you start it from index '0', nothing changes, because top-left element of the matrix always transposes to itself.

For an example of such code, see answer to other question (not in C#, sorry).

Here "mutation off by one" test fails, refactoring the test does not change it. I don't know if the code itself may be refactored to avoid this. In theory it may be possible, but should be too difficult.


The code snippet I referenced earlier is not a perfect example. It still may be refactored if the for loop is substituted by two nested loops (as if for rows and columns) and then these rows and columns are recalculated back to one-dimensional index. Still it gives an idea how to make some algorithm, which cannot be refactored (though not very meaningful).

Iterate through an array of positive integers in the order of increasing indexes, for each index compute its pair as i + i % a[i], and if it's not outside the bounds, swap these elements:

for (var i = 1; i < a.Length; i++)
{
    var j = i + i % a[i];
    if (j < a.Length)
        Swap(a[i], a[j]);
}

Here again a[0] is "unmovable", refactoring the test does not change this, and refactoring the code itself is practically impossible.


One more "meaningful" example. Let's implement an implicit Binary Heap. It is usually placed to some array, starting from index '1' (this simplifies many Binary Heap computations, compared to starting from index '0'). Now implement a copy method for this heap. "Off-by-one" problem in this copy method is undetectable because index zero is unused and C# zero-initializes all arrays. This is similar to OP's array summation, but cannot be refactored.

Strictly speaking, you can refactor the whole class and start everything from '0'. But changing only 'copy' method or the test does not prevent "mutation off by one" test failure. Binary Heap class may be treated just as a motivation to copy an array with unused first element.

int[] dst = new int[src.Length];
for (var i = 1; i < src.Length; i++)
{
    dst[i] = src[i];
}
Hodeida answered 3/5, 2012 at 13:3 Comment(7)
So, a meaningful code task, where starting from index 0 or 1 has the same effect because, like in the summation case, the operation on index 0 has no effect. And in this case, without resorting to an external library for matrix calculations, there's no refactor - assuming you take as granted the use of a one-dimensional array. Interesting - pondering this.Hellebore
@evgeny - I've only just finished getting my head around you're original post. But it was worth it. I took the C++ made it C# and fiddled. I was originally completely stumped as to how to refactor in order to achieve a result that could be provable via mutation testing. Eventually tho, I had my lightbulb moment, and it applies to this example as well. Applying some immutability to the method works wonders here. So instead of a void, return char[], and you have some provable code, and possibly a method that is eminently better for the experience.Lover
@pms1969, right, immutability helps here. But it removes "in-place" constraint to the algorithms. I don't think it is possible to treat algorithm, modified this way, as the same algorithm. For matrix transposition there is a very simple not-in-place algorithm, but all in-place algorithms are pretty advanced. By the way, I just added one more example, which is already "immutable".Hodeida
And I would agree with the in-place contraint comment, but perhaps we would need to ask ourselves why that constraint is there when writing our own code? Simpler code is easier to maintain and refactor. And the fact that mutation testing can't prove out the in-place algorithm, might make you reconsider your design (the royal "you" there). The only time I can think of where the constraint may be valid is working on huge datasets that might blow up your memory if copied.Lover
OK. Added the code for "copy" itself. Hope, writing whole class implementation is not necessary. I didn't use Array.Copy because there is nothing to test in this case. As for in-place constraint, the only thing that matters for matrix transposition is huge dataset. For other algorithms, in-place constraint may have other advantages. It is better to modify 10% of dataset than copy it all. (At least this is true for imperative languages; functional languages use linked structures, where immutability is more convenient).Hodeida
Thanks for this Evgeny. I think you've succeeded in demonstrating some non-trivial examples, which is what Paul and I were really after. Think you've earned the bonus. :)Hellebore
absolutely. Very much appreciated.Lover
H
4

I think with this particular method, there are two choices. You either admit that it's not suitable for mutation testing because of this mathematical anomaly, or you try to write it in a way that makes it safe for mutation testing, either by refactoring to the form you give, or some other way (possibly recursive?).

Your question really boils down to this: is there a real life situation where we care about whether the element 0 is included in or excluded from the operation of a loop, and for which we cannot write a test around that specific aspect? My instinct is to say no.

Your trivial example may be an example of lack of what I referred to as test-drivenness in my blog, writing about NinjaTurtles. Meaning in the case that you have not refactored this method as far as you should.

Hellebore answered 30/4, 2012 at 13:19 Comment(1)
Precisely. My instinct is also no. A non-trivial example should always fail if you mutate the starting constant of a for loop, and I suspect that my contrived example is just one of a handful of actual cases that won't fail when it is changed. I guess even a trivial example that can't be refactored would be of interest. I'll change the question.Lover
H
3

One natural case of "mutation test failure" is an algorithm for matrix transposition. To make it more suitable for a single for-loop, add some constraints to this task: let the matrix be non-square and require transposition to be in-place. These constraints make one-dimensional array most suitable place to store the matrix and a for-loop (starting, usually, from index '1') may be used to process it. If you start it from index '0', nothing changes, because top-left element of the matrix always transposes to itself.

For an example of such code, see answer to other question (not in C#, sorry).

Here "mutation off by one" test fails, refactoring the test does not change it. I don't know if the code itself may be refactored to avoid this. In theory it may be possible, but should be too difficult.


The code snippet I referenced earlier is not a perfect example. It still may be refactored if the for loop is substituted by two nested loops (as if for rows and columns) and then these rows and columns are recalculated back to one-dimensional index. Still it gives an idea how to make some algorithm, which cannot be refactored (though not very meaningful).

Iterate through an array of positive integers in the order of increasing indexes, for each index compute its pair as i + i % a[i], and if it's not outside the bounds, swap these elements:

for (var i = 1; i < a.Length; i++)
{
    var j = i + i % a[i];
    if (j < a.Length)
        Swap(a[i], a[j]);
}

Here again a[0] is "unmovable", refactoring the test does not change this, and refactoring the code itself is practically impossible.


One more "meaningful" example. Let's implement an implicit Binary Heap. It is usually placed to some array, starting from index '1' (this simplifies many Binary Heap computations, compared to starting from index '0'). Now implement a copy method for this heap. "Off-by-one" problem in this copy method is undetectable because index zero is unused and C# zero-initializes all arrays. This is similar to OP's array summation, but cannot be refactored.

Strictly speaking, you can refactor the whole class and start everything from '0'. But changing only 'copy' method or the test does not prevent "mutation off by one" test failure. Binary Heap class may be treated just as a motivation to copy an array with unused first element.

int[] dst = new int[src.Length];
for (var i = 1; i < src.Length; i++)
{
    dst[i] = src[i];
}
Hodeida answered 3/5, 2012 at 13:3 Comment(7)
So, a meaningful code task, where starting from index 0 or 1 has the same effect because, like in the summation case, the operation on index 0 has no effect. And in this case, without resorting to an external library for matrix calculations, there's no refactor - assuming you take as granted the use of a one-dimensional array. Interesting - pondering this.Hellebore
@evgeny - I've only just finished getting my head around you're original post. But it was worth it. I took the C++ made it C# and fiddled. I was originally completely stumped as to how to refactor in order to achieve a result that could be provable via mutation testing. Eventually tho, I had my lightbulb moment, and it applies to this example as well. Applying some immutability to the method works wonders here. So instead of a void, return char[], and you have some provable code, and possibly a method that is eminently better for the experience.Lover
@pms1969, right, immutability helps here. But it removes "in-place" constraint to the algorithms. I don't think it is possible to treat algorithm, modified this way, as the same algorithm. For matrix transposition there is a very simple not-in-place algorithm, but all in-place algorithms are pretty advanced. By the way, I just added one more example, which is already "immutable".Hodeida
And I would agree with the in-place contraint comment, but perhaps we would need to ask ourselves why that constraint is there when writing our own code? Simpler code is easier to maintain and refactor. And the fact that mutation testing can't prove out the in-place algorithm, might make you reconsider your design (the royal "you" there). The only time I can think of where the constraint may be valid is working on huge datasets that might blow up your memory if copied.Lover
OK. Added the code for "copy" itself. Hope, writing whole class implementation is not necessary. I didn't use Array.Copy because there is nothing to test in this case. As for in-place constraint, the only thing that matters for matrix transposition is huge dataset. For other algorithms, in-place constraint may have other advantages. It is better to modify 10% of dataset than copy it all. (At least this is true for imperative languages; functional languages use linked structures, where immutability is more convenient).Hodeida
Thanks for this Evgeny. I think you've succeeded in demonstrating some non-trivial examples, which is what Paul and I were really after. Think you've earned the bonus. :)Hellebore
absolutely. Very much appreciated.Lover
E
1

Yes, there are many, assuming I have understood your question.

One similar to your case is:

public int MultiplyTo(int max)
{
    int product = 1;
    for (var i = 1; i <= max; i++)
    {
        product *= i;
    }
    return product;
}

Here, if it starts from 0, the result will be 0, but if it starts from 1 the result should be correct. (Although it won't tell the difference between 1 and 2!).

Entrap answered 30/4, 2012 at 11:54 Comment(4)
I think the OP's after a non-trivial example that fails mutation testing - this one passes, because a loop start of i = 0 gives the result 0 and a loop start of i = 1 gives a non-zero result.Hellebore
That makes more sense...So his question is really saying, "Here is an example of what I'm looking for, but is there another case which is less trivial?"Entrap
well, actually, looking at it a mutation of the var i = 1 to var i = 2 will fail as a multiply by 1 has no effect on the answer. Still a fairly contrived answer. Is there a refactor that might work here?Lover
Yes, there is a refactor that might work here. Make the method recursive.Hellebore
C
1

Not quite sure what you are looking for exactly, but it seems to me that if you change/mutate the initial value of sum from 0 to 1, you should fail the test:

public int SumTo(int max) 
{ 
  int sum = 1; // Now we are off-by-one from the beginning!
  for (var i = 0; i <= max; i++) 
  { 
    sum += i; 
  } 
  return sum; 
}

Update based on comments:

The loop will only not fail after mutation when the loop invariant is violated in the processing of index 0 (or in the absence of it). Most such special cases can be refactored out of the loop, but consider a summation of 1/x:

for (var i = 1; i <= max; i++) {
  sum += 1/i;
}

This works fine, but if you mutate the initial bundary from 1 to 0, the test will fail as 1/0 is invalid operation.

Cheeks answered 30/4, 2012 at 12:21 Comment(7)
Yes that will, but it's the for loop that is of interest to me here.Lover
As far as I understand your question, if all operations within the loop are valid, then there is no difference between starting from 0 or 1 (the result might obviously be different, but the invariant(with the modified boundaries) will hold). Can you do a i <= max+0? If that's OK, mutating it to i<=max+1 might be invalid (e.g. overindexing an array)Cheeks
so the constants here are the things that are mutating, and you are right, over indexing may well cause a failure (but I don't think we could guarantee it with the sum example. I'll edit the post to give an example of something that is less trivial and would fail with certain inputs. The test itself could be refactored so that the inputs then would fail the test, and thus good coverage/testing is achieved.Lover
Another possibility is that the loop requires that you start from 0 (e.g. step 0 is some sort of initialization) then the mutation will cause the test to failCheeks
that could then be refactored to do the initialisation outside of the loop. Unless you can think if an example that it couldn't?Lover
If your loop invariant is such that it holds both when you start from 0 and when you start from 1, there is no way to fail with your off-by-one mutation. If the loop is such that it only holds if you are starting from 0 (equivalently: only if you start form 1), then your mutation will fail the test. Having a special case at value 0 that you must (equivalenty: must not) process will result in what you are looking for. Another special case: sum of reciprocal numbers: 1/0 must not be calculated, but 1/i i=[1,...n] is OK.Cheeks
let us continue this discussion in chatLover

© 2022 - 2024 — McMap. All rights reserved.