What should be the epsilon value when performing double value equal comparison
Here is the output of the program below.

value is : 2.7755575615628914E-17
Double.compare with zero : 1
isEqual with zero : true

My question is: what should the epsilon value be? Is there a robust way to derive it, instead of picking a number out of thin air?


package sandbox;

/**
 *
 * @author yccheok
 */
public class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        double zero = 1.0/5.0 + 1.0/5.0 - 1.0/10.0 - 1.0/10.0 - 1.0/10.0 - 1.0/10.0;
        System.out.println("value is : " + zero);
        System.out.println("Double.compare with zero : " + Double.compare(zero, 0.0));
        System.out.println("isEqual with zero : " + isEqual(zero, 0.0));
    }

    public static boolean isEqual(double d0, double d1) {
        final double epsilon = 0.0000001;
        return d0 == d1 || Math.abs(d0 - d1) < epsilon;
    }
}
Sidekick answered 16/9, 2010 at 15:32 Comment(1)
@Cheok, your question is not clear. What are you expecting from epsilon?Pinwork

The answer to your second question is no. The error due to finite machine precision can be arbitrarily large in magnitude:

public static void main(String[] args) {
    double z = 0.0;
    double x = 0.23;
    double y = 1.0 / x;
    int N = 50000;
    for (int i = 0; i < N; i++) {
        z += x * y - 1.0;
    }
    System.out.println("z should be zero, is " + z);
}

This gives ~5.55E-12, but if you increase N you can get just about any level of error you desire.

There is a vast amount of past and current research on how to write numerically stable algorithms. It is a hard problem.
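One classic technique from that literature is compensated (Kahan) summation, which keeps the error of a long sum near a single ulp instead of letting it grow with N. A minimal sketch (the class and method names are mine):

```java
public class KahanDemo {
    // Kahan compensated summation: carries a correction term for the
    // low-order bits lost when each addend is folded into the sum.
    public static double kahanSum(double[] values) {
        double sum = 0.0;
        double c = 0.0; // running compensation
        for (double v : values) {
            double y = v - c;   // apply the stored correction
            double t = sum + y; // low-order bits of y may be lost here
            c = (t - sum) - y;  // algebraically zero; recovers what was lost
            sum = t;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] tenths = new double[10];
        java.util.Arrays.fill(tenths, 0.1);
        // Much closer to 1.0 than a naive left-to-right loop.
        System.out.println(kahanSum(tenths));
    }
}
```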

Lucretialucretius answered 16/9, 2010 at 16:13 Comment(2)
No, just an example of a number where x * (1.0/x) is not quite equal to 1.Lucretialucretius
Shouldn't we call 0.23 a magic number? I could use 0.24 as well, right? I thought that when we can pick an arbitrary number, we usually call it a magic number.Sidekick

I like (pseudocode; I don't do Java)

bool fuzzyEquals(double a, double b)
{
    return abs(a - b) < eps * max(abs(a), abs(b));
}

with epsilon being a few times the machine epsilon. Take 10^-12 if you don't know what to use.

This is quite problem-dependent, however. If the computations giving a and b are prone to round-off error, involve many operations, or are themselves only accurate to some (known) tolerance, you will want a bigger epsilon.

The point is to use relative precision, not absolute.
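For reference, here is one possible Java rendering of that pseudocode, with an explicit a == b guard so that comparing two zeros (or two identical infinities) returns true; the 1e-12 default is the fallback suggested above:

```java
public class FuzzyCompare {
    // A few times the machine epsilon; 1e-12 is the suggested fallback.
    private static final double EPS = 1e-12;

    public static boolean fuzzyEquals(double a, double b) {
        if (a == b) return true; // handles 0.0 == 0.0 and equal infinities
        // Relative tolerance, scaled to the larger operand's magnitude.
        return Math.abs(a - b) < EPS * Math.max(Math.abs(a), Math.abs(b));
    }

    public static void main(String[] args) {
        System.out.println(fuzzyEquals(0.1 + 0.2, 0.3)); // true
        System.out.println(fuzzyEquals(1.0, 1.000001));  // false
    }
}
```

Note that with a purely relative tolerance, a value very close to zero never compares equal to exactly 0.0; that is why the a == b guard (or an additional absolute tolerance) matters.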

Specimen answered 16/9, 2010 at 16:13 Comment(5)
Can I return abs(a - b) < eps * abs(b) instead?Sidekick
I think we should compare abs(a) and abs(b), and take the minimum value to multiply with eps. See essentiallyEqual: jstock.cvs.sourceforge.net/viewvc/jstock/jstock/src/org/yccheok/…Sidekick
@Yan: It depends on what you want.Specimen
@AlexandreC., @CheokYanCheng, could you elaborate? wouldn't using the minimum while having a very close to 0 and b exactly 0 never return true?Patrizio
@clausavram: this is the "depends on what you want" part. Using the minimum would imply what you say, and it could be what you want. In practice, when assessing the distance between numbers whose magnitude I don't know, I always use abs(b - a) / max(abs(a), abs(b)) (guarding also for the case a == b == 0).Specimen

There is no one right value. You need to compute it relative to the magnitude of the numbers involved. What you're basically dealing with is a number of significant digits, not a specific magnitude. If, for example, your numbers are both in the range of 1e-100, and your calculations should maintain roughly 8 significant digits, then your epsilon should be around 1e-108. If you did the same calculations on numbers in the range of 1e+200, then your epsilon would be around 1e+192 (i.e., epsilon ~= magnitude - significant digits).

I'd also note that isEqual is a poor name -- you want something like isNearlyEqual. For one thing, people quite reasonably expect "equal" to be transitive. At the very least, you need to convey the idea that the result is no longer transitive -- i.e., with your definition of isEqual, isEqual(a, c) can be false even though isEqual(a, b) and isEqual(b, c) are both true.

Edit: (responding to comments): I said "If [...] your calculations should maintain roughly 8 significant digits, then your epsilon should be...". Basically, it comes down to looking at what calculations you're doing, and how much precision you're likely to lose in the process, to arrive at a reasonable guess at how big a difference has to be before it's significant. Without knowing the calculation you're doing, I can't guess that.

As far as the magnitude of epsilon goes: no, it does not make sense for it to always be less than or equal to 1. A floating point number can only maintain limited precision. In the case of an IEEE double precision floating point, the maximum precision that can be represented is about 16 decimal digits. That means if you start with 1e+200, the absolute smallest difference from that number that the machine can represent at all is about 1e+184 (and a double can represent numbers up to ~1e+308, at which point the smallest difference that can be represented is ~1e+292).
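As a sketch of that idea, with the significant-digit count as a parameter (the helper name and the specific values are illustrative, not from the answer):

```java
public class NearlyEqual {
    // Epsilon scaled to the operands' magnitude: the comparison keeps
    // roughly `sigDigits` significant decimal digits of agreement.
    public static boolean isNearlyEqual(double a, double b, int sigDigits) {
        double magnitude = Math.max(Math.abs(a), Math.abs(b));
        double epsilon = magnitude * Math.pow(10, -sigDigits);
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        // Numbers near 1e-100 compared to 8 significant digits:
        // epsilon works out to about 1e-108, as described above.
        System.out.println(isNearlyEqual(1e-100, 1e-100 + 1e-109, 8)); // true
        System.out.println(isNearlyEqual(1e-100, 1.001e-100, 8));      // false
    }
}
```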

Cider answered 16/9, 2010 at 16:7 Comment(2)
Why, if my number is 1e-100, should epsilon be around 1e-108? Why 8 significant digits?Sidekick
@Coffin, why is epsilon 1e+192? Shouldn't epsilon be at least less than 1 and greater than 0?Sidekick

In isEqual, have something like:

epsilon = Math.max(Math.ulp(d0), Math.ulp(d1))

An ulp of a double value is the positive distance between this floating-point value and the double value next larger in magnitude. [1]

[1] http://docs.oracle.com/javase/6/docs/api/java/lang/Math.html#ulp%28double%29
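Plugged into the question's isEqual, that suggestion might look like the following sketch. Note that with a tolerance of a single ulp, the question's 2.7755575615628914E-17 still compares unequal to 0.0, so in practice you may want to scale the epsilon by a small factor:

```java
public class UlpEqual {
    // Tolerance derived from the operands themselves: the larger of the
    // two ulps, so the comparison scales automatically with magnitude.
    public static boolean isEqual(double d0, double d1) {
        double epsilon = Math.max(Math.ulp(d0), Math.ulp(d1));
        return Math.abs(d0 - d1) <= epsilon;
    }

    public static void main(String[] args) {
        System.out.println(isEqual(1.0, 1.0 + Math.ulp(1.0))); // true: 1 ulp apart
        System.out.println(isEqual(1.0, 1.0000001));           // false: millions of ulps apart
    }
}
```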

Mendiola answered 12/6, 2013 at 16:5 Comment(0)

You should definitely read https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/ first.

It discusses various ways of comparing floating point numbers: absolute tolerance, relative tolerance, and ulp distance. It makes a fairly good argument that ulp checking is the way to go. The argument hinges on the point that if you want to check whether two floating point numbers are "the same", you have to take into account the spacing between representable floats. In other words, you should check whether the two numbers are within some small number of representable floats of each other.

The algorithms are given in C, but can be translated to Java using java.lang.Double#doubleToLongBits and java.lang.Float#floatToIntBits to implement the reinterpretation of floating point values as integer types. Also, Java 5 added Math.ulp(double) and Math.ulp(float), and Java 6 added Math.nextUp(double), Math.nextUp(float), Math.nextAfter(double, double) and Math.nextAfter(float, float), which are useful for quantifying the difference between two floating point numbers.
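A sketch of such an ulp-distance check in Java, using Double.doubleToLongBits as described. It assumes neither input is NaN, and the raw subtraction can overflow for inputs of wildly opposite sign, so keep maxUlps small:

```java
public class UlpDistance {
    // Map the IEEE-754 sign-magnitude bit pattern onto a monotonically
    // increasing long, so adjacent doubles differ by exactly 1
    // (and +0.0 / -0.0 both map to 0).
    private static long orderedBits(double d) {
        long bits = Double.doubleToLongBits(d);
        return bits >= 0 ? bits : Long.MIN_VALUE - bits;
    }

    public static boolean withinUlps(double a, double b, long maxUlps) {
        return Math.abs(orderedBits(a) - orderedBits(b)) <= maxUlps;
    }

    public static void main(String[] args) {
        System.out.println(withinUlps(0.1 + 0.2, 0.3, 1)); // true: exactly 1 ulp apart
        System.out.println(withinUlps(1.0, 1.0000001, 4)); // false
    }
}
```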

Circuitry answered 17/9, 2010 at 13:57 Comment(1)
the article points out it is obsolete and this is the replacement: randomascii.wordpress.com/2012/02/25/…Patrizio

There are two concepts involved here (note that the method lives on java.lang.Math, not Double):

  1. The machine precision unit (machine epsilon): Math.ulp(1.0)
  2. The machine precision at a given double d: Math.ulp(d)

If you call Math.ulp(1.0) you obtain the machine precision unit, the spacing between 1.0 and the next larger double, which is the baseline precision you can expect from the platform's double arithmetic.

If you call Math.ulp(d), you get the machine precision at the double d. In other words, every double d has its own specific precision. This is more useful than the previous paragraph.
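A quick illustration with Math.ulp of how the per-value precision changes with magnitude:

```java
public class UlpDemo {
    public static void main(String[] args) {
        // Machine epsilon: the gap between 1.0 and the next double (2^-52).
        System.out.println(Math.ulp(1.0));   // 2.220446049250313E-16
        // The gap widens as the magnitude grows...
        System.out.println(Math.ulp(1e10));  // about 1.9e-6
        // ...and shrinks for small values.
        System.out.println(Math.ulp(1e-10)); // about 1.3e-26
    }
}
```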

You must pay particular attention when performing iterations that involve cascaded calculations, i.e., when results from previous calculations feed into the current one. Errors accumulate in these situations and may, in certain circumstances, deliver results that are way off the true value. The accumulated error may even be larger than the true value itself. See some disastrous examples here.

In certain business domains, numerical computation errors are simply not acceptable. Depending on the business domain, its regulations, requirements and characteristics, you must take alternative approaches to the simplistic choice of employing floating point arithmetic (i.e., doubles or floats).

In the case of Finance for example, never ever use floating point arithmetic. Never ever use doubles or floats when you are dealing with money. Never. Period. You can employ BigDecimal or fixed point arithmetic, depending on circumstances.

In the specific case of processing stock prices, you know that prices always have 5 digits of precision; in this case, fixed point arithmetic is plenty enough and also delivers the maximum performance you can possibly obtain, which is a very strong and common requirement in this business domain.
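To make the contrast concrete, here is the classic 0.1 + 0.2 case with doubles versus BigDecimal constructed from strings (so the decimal values are exact), plus a scaled-long fixed point price along the lines described above:

```java
import java.math.BigDecimal;

public class MoneyDemo {
    public static void main(String[] args) {
        // Binary floating point cannot represent 0.1 or 0.2 exactly.
        System.out.println(0.1 + 0.2 == 0.3); // false

        // BigDecimal built from strings keeps exact decimal values.
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(sum.compareTo(new BigDecimal("0.3")) == 0); // true

        // Fixed point: store a 5-decimal-digit price as a scaled long.
        long scaledPrice = 1234500; // represents 12.34500
        System.out.println(scaledPrice / 100000.0);
    }
}
```

Use compareTo rather than equals when checking BigDecimal values, since equals also compares scale ("0.3" vs "0.30").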

If the business domain really requires numerical computations, you must in this case make sure you keep error propagation under strict and very careful control. This is a long subject and there are a number of techniques; very frequently developers overlook the problem, simply believing there's a single magic method call that does all the hard work for them. There isn't. You have to do your research, do your homework, and do all the hard work necessary to keep errors under control. You need to understand exactly what is going on in the numerical algorithms you've implemented.

Planetoid answered 11/5, 2017 at 19:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.