Floating point equality and tolerances
Comparing two floating point numbers with something like a_float == b_float is asking for trouble, since a_float / 3.0 * 3.0 might not be equal to a_float due to round-off error.

What one normally does is something like fabs(a_float - b_float) < tol.

How does one calculate tol?

Ideally, the tolerance should be just larger than the value of one or two of the least significant figures. So for single precision floating point numbers, tol = 10E-6 should be about right. However, this does not work well in the general case, where a_float might be very small or very large.

How does one calculate tol correctly for all general cases? I am interested in C or C++ cases specifically.

Ravi answered 1/7, 2013 at 12:28 Comment(4)
Did you read this?Goodfellowship
If you want perfect precision, you can use one of the rational number libraries.Shawana
Q: How does one calculate tol correctly for all general cases? A: One doesn't. This kind of comparison is not suitable for all cases, regardless of tolerance value (and FWIW, wouldn't you know best what the appropriate tolerance is for the thing you are testing?)Taveras
unwind’s answer is correct, there is no general way to calculate an error bound. You must plan for error when designing the calculations, which means you must have a good understanding of floating-point operations.Conny
This blog post contains an example, a fairly foolproof implementation, and the detailed theory behind it: http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ It is also one of a series, so you can always read more. In short: use ULPs for most numbers and an epsilon for numbers near zero, but there are still caveats. If you want to be sure about your floating point math, I recommend reading the whole series.
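A sketch of the combined approach the post describes (an absolute check for values near zero, a relative check elsewhere). The cutoff values below are illustrative assumptions, not prescriptions; they must be tuned to the calculation at hand:

```cpp
#include <algorithm>  // std::max
#include <cmath>      // std::fabs

// Sketch only: maxAbsDiff and maxRelDiff are placeholder tolerances.
bool almostEqual(float a, float b,
                 float maxAbsDiff = 1e-6f, float maxRelDiff = 1e-5f)
{
    float diff = std::fabs(a - b);
    if (diff <= maxAbsDiff)      // absolute check handles values near zero
        return true;
    float largest = std::max(std::fabs(a), std::fabs(b));
    return diff <= largest * maxRelDiff;   // relative check everywhere else
}
```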

Crimmer answered 1/7, 2013 at 12:52 Comment(3)
While exact float comparisons are usually looking for trouble, this is not always the case. There are times when anything less than an exact comparison is sloppy, as I document here: randomascii.wordpress.com/2014/01/27/… It all depends on the context. In other words, sometimes the correct value for 'tol' is zero.Ungenerous
@BruceDawson Excellent article - not only thorough, but maximally thorough! Sadly doubles would take a bit longer... erm, 12.25 thousand years, or 440 generations according to WolframAlpha? It's probably safe to pass the buck on that one unless you're really optimistic about Moore's Law.Cashandcarry
You could knock that 12.25 thousand years down to around 2.5 years if you were testing on a gpu. :DPavier
As far as I know, one doesn't.

There is no general "right answer", since it can depend on the application's requirement for precision.

For instance, a 2D physics simulation working in screen-pixels might decide that 1/4 of a pixel is good enough, while a 3D CAD system used to design nuclear plant internals might not.

I can't see a way to programmatically decide this from the outside.

Noblesse answered 1/7, 2013 at 12:29 Comment(8)
I want as much accuracy as reasonably possible. That is the full number of significant digits for the storage type minus one or two to allow for roundoff error.Ravi
The floating point error is not bounded if you don't know the exact operations performed on it. If you have a==3e10000 and you subtract 1 100000 times you may not get a==2....Diacetylmorphine
@Diacetylmorphine I'm pretty sure that 3e10000 - 100000 is not 2, regardless of floating-point error.Lintel
@Ravi there are ways to make such a test, but it doesn't have a comparison like the one shown in your question. See en.wikipedia.org/wiki/Unit_in_the_last_placeTaveras
@Hong Ooi: The three dots were to denote the e-suffix - of course you are right in that the equation is never true. I really should have written it precisely.Diacetylmorphine
@Diacetylmorphine I don't even know of a machine where 3e10000 is representable.Naca
@JamesKanze: 3e10000 is trivially representable on any machine with at least 7 bytes of storage (which is pretty much any machine). Performing floating-point operations on the value will require some amount of software support, however.Neese
@Ravi Whether you can get what you want depends on your calculations and the floating point format you are using. Generally, that should be double unless there are really, really good reasons for using float, and good analysis indicating it is precise enough.Haynes
The C header file <float.h> gives you the constants FLT_EPSILON and DBL_EPSILON, each being the difference between 1.0 and the smallest number larger than 1.0 that a float/double can represent. You can scale that by the size of your numbers and the rounding error you wish to tolerate:

#include <float.h>
#include <math.h>   /* frexp, ldexp */
#ifndef DBL_TRUE_MIN
/* DBL_TRUE_MIN is a common non-standard extension for the minimum denormal value;
 * DBL_MIN is the minimum non-denormal value -- use that if TRUE_MIN is not defined */
#define DBL_TRUE_MIN DBL_MIN
#endif

/* return the difference between |x| and the next larger representable double */
double dbl_epsilon(double x) {
    int exp;
    if (frexp(x, &exp) == 0.0)
        return DBL_TRUE_MIN;
    return ldexp(DBL_EPSILON, exp-1);
}
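One way this helper might be used (the close_enough wrapper below is a hypothetical addition, and the helper is repeated so the sketch compiles on its own): treat two doubles as equal when they are at most a few representable steps apart at x's magnitude.

```cpp
#include <cfloat>  // DBL_EPSILON, DBL_MIN
#include <cmath>   // std::frexp, std::ldexp, std::fabs

// The answer's helper, repeated here for a self-contained sketch
// (DBL_MIN stands in for DBL_TRUE_MIN where the latter is unavailable).
double dbl_epsilon(double x) {
    int exp;
    if (std::frexp(x, &exp) == 0.0)
        return DBL_MIN;
    return std::ldexp(DBL_EPSILON, exp - 1);
}

// Hypothetical comparison: equal when x and y differ by at most
// `steps` representable increments at x's magnitude.
bool close_enough(double x, double y, int steps) {
    return std::fabs(x - y) <= steps * dbl_epsilon(x);
}
```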
Crossgarnet answered 1/7, 2013 at 16:46 Comment(1)
This technique is solid, however it does fail for numbers around zero. This is a fundamental problem because relative errors are meaningless at zero. For example sin(M_PI) 'should' be zero but isn't. The relative error is infinite. See this article for details: randomascii.wordpress.com/2012/02/25/…Ungenerous
Welcome to the world of traps, snares and loopholes. As mentioned elsewhere, a general purpose solution for floating point equality and tolerances does not exist. Given that, there are tools and axioms that a programmer may use in select cases.

fabs(a_float - b_float) < tol has the shortcoming OP mentioned: "does not work well for the general case where a_float might be very small or might be very large." fabs(a_float - ref_float) <= fabs(ref_float * tol) copes with the variant ranges much better.

OP's "single precision floating point number is use tol = 10E-6" is a bit worrisome, because C and C++ so easily promote float arithmetic to double, and then it is the "tolerance" of double, not float, that comes into play. Consider float f = 1.0; printf("%.20f\n", f/7.0); Many new programmers do not realize that the 7.0 caused a double precision calculation. Recommend using double throughout your code except where large amounts of data need float's smaller size.

C99 provides nextafter(), which can be useful in helping to gauge "tolerance". Using it, one can determine the next representable number. This helps with the OP's "... the full number of significant digits for the storage type minus one ... to allow for roundoff error": if (nextafter(x, -INF) <= y && y <= nextafter(x, +INF)) ...
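Written out as a compilable sketch (with HUGE_VAL standing in for the INF above), that test might look like:

```cpp
#include <cmath>  // std::nextafter, HUGE_VAL

// True when y lies within one representable step of x on either side.
bool within_one_ulp(double x, double y) {
    return std::nextafter(x, -HUGE_VAL) <= y &&
           y <= std::nextafter(x, HUGE_VAL);
}
```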

The kind of tol or "tolerance" used is often the crux of the matter. Most often (IMHO) a relative tolerance is important, e.g. "Are x and y within 0.0001%?" Sometimes an absolute tolerance is needed, e.g. "Are x and y within 0.0001?"

The value of the tolerance is often debatable for the best value is often situation dependent. Comparing within 0.01 may work for a financial application for Dollars but not Yen. (Hint: be sure to use a coding style that allows easy updates.)

Steinbok answered 1/7, 2013 at 16:19 Comment(0)
Although the value of the tolerance depends on the situation, if you are looking for a precision comparison you could use the machine epsilon value, std::numeric_limits<T>::epsilon() (header <limits>), as the tolerance. The function returns the difference between 1 and the smallest value greater than 1 that is representable for the data type. http://msdn.microsoft.com/en-us/library/6x7575x3.aspx

The value of epsilon differs if you are comparing floats or doubles. For instance, in my computer, if comparing floats the value of epsilon is 1.1920929e-007 and if comparing doubles the value of epsilon is 2.2204460492503131e-016.

For a relative comparison between x and y, multiply the epsilon by the maximum absolute value of x and y.

The result above could be multiplied by the ulps (units in the last place) which allows you to play with the precision.

#include <iostream>
#include <cmath>
#include <limits>
#include <algorithm>  // std::max

template<class T> bool are_almost_equal(T x, T y, int ulp)
{
    return std::abs(x - y) <= std::numeric_limits<T>::epsilon()
                              * std::max(std::abs(x), std::abs(y)) * ulp;
}
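For example (the template repeated here with the <algorithm> include and terminating semicolon it needs, so the sketch compiles on its own):

```cpp
#include <algorithm>  // std::max
#include <cmath>      // std::abs
#include <limits>     // std::numeric_limits

template<class T> bool are_almost_equal(T x, T y, int ulp)
{
    return std::abs(x - y) <= std::numeric_limits<T>::epsilon()
                              * std::max(std::abs(x), std::abs(y)) * ulp;
}

// e.g. 0.1 + 0.2 == 0.3 is false for doubles,
// but are_almost_equal(0.1 + 0.2, 0.3, 2) is true.
```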
Kreit answered 6/2, 2014 at 13:4 Comment(3)
This is a duplicate of the previous answer.Ungenerous
I added a bit of information plus code @BruceDawsonKreit
It's unnecessary to write if (cond) return true; else return false. Why don't you write return cond?Vinegary
Rounding error varies with the values used in the operations.

Instead of a fixed tolerance, you can probably use a factor of epsilon like:

#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

bool nearly_equal(double a, double b, int factor /* a factor of epsilon */)
{
  double min_a = a - (a - std::nextafter(a, std::numeric_limits<double>::lowest())) * factor;
  double max_a = a + (std::nextafter(a, std::numeric_limits<double>::max()) - a) * factor;

  return min_a <= b && max_a >= b;
}
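A quick self-contained check of this factor-of-epsilon idea (the function repeated with the <cmath> and <limits> includes it needs; the test values are illustrative):

```cpp
#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

// The answer's function, repeated so the sketch compiles on its own.
bool nearly_equal(double a, double b, int factor /* a factor of epsilon */)
{
  double min_a = a - (a - std::nextafter(a, std::numeric_limits<double>::lowest())) * factor;
  double max_a = a + (std::nextafter(a, std::numeric_limits<double>::max()) - a) * factor;

  return min_a <= b && max_a >= b;
}
```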
Identity answered 7/2, 2016 at 12:3 Comment(1)
I think a good generalization of the concept would be that "nearly equal" is subjective: nearly equal with respect to another operator/operation. After all, epsilon is used to categorize by proximity, inverse distance, and you need to define a metric for 'distance' as well as an operation to derive/define an 'inverse'. If epsilon is "hand-waving" then epsilon without a frame of reference is outright Jedi mind trickery.Cashandcarry
When I need to compare floats, I use code like this

#include <math.h>     /* fabs */
#include <stdbool.h>  /* bool */

/* Relative error with respect to a (or b, when a is zero). Note the
 * asymmetry: the tolerance scales with a's magnitude, not b's. */
bool same(double a, double b, double error) {
    double x;
    if (a == 0) {
        x = b;
    } else if (b == 0) {
        x = a;
    } else {
        x = (a - b) / a;
    }
    return fabs(x) < error;
}
Farlay answered 1/7, 2013 at 13:52 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.