How to generate good code coverage of floating-point logic?

I am hand-crafting new code. I'd like to make sure I leave no stone unturned.

Is there anything specific I can do beyond specifying Code Contracts to guide Pex so it produces good coverage in numerically-intensive code?

For some background, try searching http://research.microsoft.com/en-us/projects/pex/pexconcepts.pdf for the keyword 'float':

Arithmetic constraints over floating point numbers are approximated by a translation to rational numbers, and heuristic search techniques are used outside of Z3 to find approximate solutions for floating point constraints.

...and also...

Symbolic Reasoning. Pex uses an automatic constraint solver to determine which values are relevant for the test and the code-under-test. However, the abilities of the constraint solver are, and always will be, limited. In particular, Z3 cannot reason precisely about floating point arithmetic.

Alternatively, do you know of a .NET tool that is better suited to the task of finding numerical anomalies? I am aware of http://fscheck.codeplex.com/ but it does not perform symbolic reasoning.
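For reference, here is a minimal sketch of the kind of contract/assumption guidance I mean. The [PexClass]/[PexMethod] attributes and the PexAssume/PexAssert calls come from Microsoft.Pex.Framework; SumSeries is a made-up example method, not code from my project:

    using System.Diagnostics.Contracts;
    using Microsoft.Pex.Framework;

    public static class Numerics
    {
        // Contrived code under test: an iterative sum whose rounding error grows with n.
        public static float SumSeries(float x, int n)
        {
            Contract.Requires(!float.IsNaN(x) && !float.IsInfinity(x));
            Contract.Requires(n > 0);

            float sum = 0.0f;
            for (int i = 0; i < n; i++)
                sum += x;                      // rounding error accumulates here
            return sum;
        }
    }

    [PexClass(typeof(Numerics))]
    public partial class NumericsTest
    {
        [PexMethod]
        public void SumSeriesStaysFinite(float x, int n)
        {
            // Constrain the search space so Pex does not waste effort on
            // inputs the contracts already reject.
            PexAssume.IsTrue(!float.IsNaN(x) && !float.IsInfinity(x));
            PexAssume.IsTrue(n > 0 && n < 1000000);

            float result = Numerics.SumSeries(x, n);

            // The property I actually care about: no silent overflow to infinity.
            PexAssert.IsTrue(!float.IsInfinity(result));
        }
    }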

Berardo answered 26/5, 2012 at 2:7 Comment(9)
Avoid conditionals that compare floats with ==. Use < or > instead. If you have to use ==, then use the expression Math.Abs(value - target) < epsilon for whatever tolerance epsilon you care about. Because of the approximation to rational numbers, the == relationship too often fails when you would like it to succeed, but Pex should have an easier time dealing with <.Susannahsusanne
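A minimal C# illustration of the comparison that comment suggests; the 1e-6f tolerance is just an example value, not a recommendation:

    using System;

    static class FloatCompare
    {
        // Example tolerance; choose one that is meaningful for your domain.
        const float Epsilon = 1e-6f;

        // Robust replacement for 'value == target' in branch conditions, and an
        // inequality that a constraint solver copes with better than exact
        // floating-point equality.
        public static bool ApproximatelyEqual(float value, float target)
        {
            return Math.Abs(value - target) < Epsilon;
        }
    }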
@JesseChisholm I am aware of static analysis tools that allow you to find such coding errors. I am not sure how this helps with the question.Berardo
@GregC, I'm not sure I understand what you're asking. Do you want to know (A) whether conditional statements that contain floating-point numbers use an epsilon tolerance, (B) whether your algorithms are numerically stable, or (C) simply a recommendation for a code coverage tool? Or something else?Punkie
(A) is a coding error that can be determined by static analysis. (B) could be deduced by analyzing generated test inputs. (C) Code coverage is not hard to do; generating meaningful edge conditions to drive interesting outputs is hard to do, and that's what I am looking for.Berardo
In NUnit you can do this: Assert.That(result, Is.EqualTo(expected).Within(.000001)); Don't know much about Pex, but it seems this is the type of thing you want to be doing.Wray
@Wray I feel your comment is irrelevant to the discussion. Please have a look at what Pex does; maybe you'll have some ideas. You can try Pex at pexforfun.com if you don't want to deal with installing it.Berardo
This leads me to thinking that we should avoid using floating point types unless absolutely compelled to do so.Priscian
0.0f, 0.1f, 0.9f, 1.0f, 1.1f, float.MaxValue - 0.1f, float.MaxValue, -0.1f, -0.9f, -1.0f, -1.1f, float.MinValue + 0.1f, float.MinValueTarratarradiddle
@Tarratarradiddle Your comment demonstrates a gap in understanding of floating-point error due to binary representation. Imagine an iterative process that magnifies the error at each and every step. Now consider such an iterative process that takes weeks to complete on a modern compute cluster. Error seeps into the result. I would like a tool to tell me when that has a high chance of happening.Berardo
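A tiny, contrived illustration of that kind of error growth (the loop and values are purely for demonstration):

    using System;

    static class DriftDemo
    {
        static void Main()
        {
            // 0.1 has no exact binary representation, so every addition rounds,
            // and the rounding error compounds with the iteration count.
            float sum = 0f;
            for (int i = 0; i < 1000000; i++)
                sum += 0.1f;

            Console.WriteLine(sum);            // noticeably different from the exact 100000
            Console.WriteLine(100000f - sum);  // the accumulated drift
        }
    }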

Is good coverage really what you want? Just having a test that runs every branch in a piece of code is unlikely to mean the code is correct - often it's more about corner cases, and you as the developer are best placed to know what those corner cases are. It also sounds like Pex works by saying 'here's an interesting input combination', whereas more than likely what you want is to specify the behaviour you want to see from the system - if you have written the code wrong in the first place, then the interesting inputs may be completely irrelevant to the correct code.

Maybe this isn't the answer you're looking for, but I'd say the best way to do this is by hand! Write down a spec before you start coding and turn it into a load of test cases when you know/as you are writing the API for your class/subsystem.

As you begin filling out the API/writing the code, you're likely to pick up extra bits and pieces that you need to do and find out what the difficult bits are - if you have conditionals etc. that you feel someone refactoring your code might get wrong, then write a test case that covers them. I sometimes intentionally write the code wrong at these points, get a test in that fails, and then correct it, just to make sure that the test is checking the correct path through the code.

Then try to think of any odd values you may not have covered - negative inputs, nulls, etc. Often these will be cases that are invalid and that you don't want to cater for or have to think about - in these cases I will generally write some tests to say that they should throw exceptions - that basically stops people misusing the code in cases you haven't thought about properly or with invalid data.
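For example, a short NUnit test of that 'invalid input should throw' idea; Normalize and its argument checks are made-up names for illustration:

    using System;
    using NUnit.Framework;

    public static class VectorMath
    {
        // Hypothetical method under test: it rejects inputs it cannot handle
        // instead of silently producing garbage.
        public static float Normalize(float value, float scale)
        {
            if (scale <= 0f || float.IsNaN(scale))
                throw new ArgumentOutOfRangeException(nameof(scale));
            return value / scale;
        }
    }

    [TestFixture]
    public class VectorMathTests
    {
        [TestCase(0f)]
        [TestCase(-1f)]
        [TestCase(float.NaN)]
        public void Normalize_RejectsInvalidScale(float badScale)
        {
            Assert.Throws<ArgumentOutOfRangeException>(
                () => VectorMath.Normalize(1.0f, badScale));
        }
    }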

You mentioned above that you are working with numerically intensive code. It may be worth testing a level above, so you can test the behaviours you want from the system rather than just the number crunching. Presuming the code isn't purely numerical, this will help you establish some realistic conditions of execution and also ensure that whatever the number-crunching part is doing, it interacts with the rest of the program in the way you need it to. If it's something algorithmic, you'd probably be better off writing an acceptance-test language to help characterise what the desired outputs are in different situations. That gives a clear picture of what you are trying to achieve, and it also allows you to throw large amounts of (real) data through the system, which is probably better than computer-generated input.

The other benefit of this is that if you realise the algorithm needs a drastic rewrite to meet some new requirement, then all you have to do is add the new test case and then rewrite/refactor. If your tests were just looking at the details of the algorithm and assuming its effects on the outside world, you would have a substantial headache trying to figure out how the algorithm currently influences behaviour, which parts are correct and which are not, and then trying to migrate a load of unit tests onto a new API/algorithm.
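As a sketch of the 'throw large amounts of real data through the system' idea, assuming a simple CSV of recorded input/expected pairs (the file name, tolerance, and Evaluate call are hypothetical):

    using System.Globalization;
    using System.IO;
    using NUnit.Framework;

    [TestFixture]
    public class AcceptanceTests
    {
        // Hypothetical entry point into the system being characterised.
        static float Evaluate(float input)
        {
            return input; // replace with a call into the real system
        }

        [Test]
        public void MatchesRecordedFieldData()
        {
            // Each line of the (hypothetical) file: input,expectedOutput
            foreach (var line in File.ReadLines("field-data.csv"))
            {
                var parts = line.Split(',');
                float input    = float.Parse(parts[0], CultureInfo.InvariantCulture);
                float expected = float.Parse(parts[1], CultureInfo.InvariantCulture);

                Assert.That(Evaluate(input), Is.EqualTo(expected).Within(1e-4f),
                            "Mismatch for input " + input);
            }
        }
    }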

Kinky answered 17/7, 2012 at 8:48 Comment(4)
Of course that is exactly what's done now. What I am saying is, I'd rather sift through a computer-generated set of test data than have to guess ahead of time what problems field data might bring. If you've studied Numerical Analysis, this might make sense to you. I am specifically interested in finding reasonably-formed data that causes numerical errors, such as overflow and underflow. Such problems are often computationally complex, and best left to a machine to find.Berardo
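For instance, a tiny sketch of the kind of anomaly classification I mean (the threshold constant and the idea of classifying a computed result are illustrative only):

    using System;

    static class AnomalyCheck
    {
        // Smallest positive normal float; a non-zero result below this is subnormal.
        const float SmallestNormal = 1.17549435e-38f;

        // Classifies the result of some computation under test.
        public static string Classify(float result)
        {
            if (float.IsNaN(result))      return "NaN (invalid operation)";
            if (float.IsInfinity(result)) return "overflow to infinity";
            if (result != 0f && Math.Abs(result) < SmallestNormal)
                return "underflow to a subnormal (precision loss)";
            return "normal";
        }
    }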
Here's an example of an exploration that would be simplified with the help of an automated trouble-finder: mathworks.com/matlabcentral/newsreader/view_thread/278975Berardo
I couldn't see what the solution to that guy's problem was - was it something that could've been found by generated test data? There must be a very large search space for floating-point numbers - it seems to say in that PDF that Pex can't reason precisely about them. Interesting problem, sorry I couldn't be of more help!Kinky
I marked your answer as the solution to the problem, even though it really isn't. I guess I need to take a dive into deep learning...Berardo
