The question is confusingly worded. Let's break it down into many smaller questions:
Why is it that one tenth plus two tenths does not always equal three tenths in floating point arithmetic?
Let me give you an analogy. Suppose we have a math system where all numbers are rounded off to exactly five decimal places. Suppose you say:
x = 1.00000 / 3.00000;
You would expect x to be 0.33333, right? Because that is the closest number in our system to the real answer. Now suppose you said
y = 2.00000 / 3.00000;
You'd expect y to be 0.66667, right? Because again, that is the closest number in our system to the real answer. 0.66666 is farther from two thirds than 0.66667 is.
Notice that in the first case we rounded down and in the second case we rounded up.
Now when we say
q = x + x + x + x;
r = y + x + x;
s = y + y;
what do we get? If we did exact arithmetic then each of these would obviously be four thirds and they would all be equal. But they are not equal. Even though 1.33333 is the closest number in our system to four thirds, only r has that value.
q is 1.33332 -- because x was a little bit small, every addition accumulated that error and the end result is quite a bit too small. Similarly, s is too big; it is 1.33334, because y was a little bit too big. r gets the right answer because the too-big-ness of y is cancelled out by the too-small-ness of x and the result ends up correct.
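If you want to watch this happen, the analogy is easy to simulate. Here is a minimal sketch, assuming that C#'s decimal type plus Math.Round is a fair stand-in for the imaginary five-place system. The names FivePlaceDemo, R5, Add and Compute are made up for the illustration.

```csharp
using System;

static class FivePlaceDemo
{
    // Hypothetical helper: keep only the five decimal places our imaginary system has.
    public static decimal R5(decimal value) => Math.Round(value, 5);

    // Every operation in the imaginary system rounds its result.
    public static decimal Add(decimal a, decimal b) => R5(a + b);

    public static (decimal q, decimal r, decimal s) Compute()
    {
        decimal x = R5(1m / 3m);   // 0.33333, rounded down
        decimal y = R5(2m / 3m);   // 0.66667, rounded up

        decimal q = Add(Add(Add(x, x), x), x);   // four slightly-too-small thirds
        decimal r = Add(Add(y, x), x);           // one too-big term, two too-small terms
        decimal s = Add(y, y);                   // two slightly-too-big two-thirds
        return (q, r, s);
    }

    public static void Main()
    {
        var (q, r, s) = Compute();
        Console.WriteLine(q == 1.33332m);   // True
        Console.WriteLine(r == 1.33333m);   // True
        Console.WriteLine(s == 1.33334m);   // True
    }
}
```

Only r lands on 1.33333, exactly as described above.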
Does the number of places of precision have an effect on the magnitude and direction of the error?
Yes; more precision makes the magnitude of the error smaller, but can change whether a calculation accrues a loss or a gain due to the error. For example:
b = 4.00000 / 7.00000;
b would be 0.57143, which rounds up from the true value of 0.571428571... Had we gone to eight places that would be 0.57142857, which has far, far smaller magnitude of error but in the opposite direction; it rounded down.
Because changing the precision can change whether an error is a gain or a loss in each individual calculation, this can change whether a given aggregate calculation's errors reinforce each other or cancel each other out. The net result is that sometimes a lower-precision computation is closer to the "true" result than a higher-precision computation because in the lower-precision computation you get lucky and the errors are in different directions.
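The change of direction for four sevenths can be checked the same way. This is a sketch under the assumption that a decimal quotient, good to roughly 28 digits, is close enough to the true value to decide the sign of the error. RoundingDirectionDemo and SignedError are illustrative names only.

```csharp
using System;

static class RoundingDirectionDemo
{
    // Signed error introduced by rounding 4/7 to the given number of decimal places.
    // Positive means the rounded value is too big, negative means too small.
    public static decimal SignedError(int places)
    {
        decimal reference = 4m / 7m;   // approximation of the true value, about 28 digits
        return Math.Round(reference, places) - reference;
    }

    public static void Main()
    {
        Console.WriteLine(SignedError(5) > 0);   // True: 0.57143 is above 4/7
        Console.WriteLine(SignedError(8) < 0);   // True: 0.57142857 is below 4/7
        Console.WriteLine(Math.Abs(SignedError(8)) < Math.Abs(SignedError(5)));   // True
    }
}
```

The eight-place error is smaller in magnitude but has the opposite sign, which is all it takes to turn a lucky cancellation into an accumulation or vice versa.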
We would expect that doing a calculation in higher precision always gives an answer closer to the true answer, but this argument shows otherwise. This explains why sometimes a computation in floats gives the "right" answer but a computation in doubles -- which have more than twice the precision -- gives the "wrong" answer, correct?
Yes, this is exactly what is happening in your examples, except that instead of five digits of decimal precision we have a certain number of digits of binary precision. Just as one-third cannot be accurately represented in five -- or any finite number -- of decimal digits, 0.1, 0.2 and 0.3 cannot be accurately represented in any finite number of binary digits. Some of those will be rounded up, some of them will be rounded down, and whether or not additions of them increase the error or cancel out the error depends on the specific details of how many binary digits are in each system. That is, changes in precision can change the answer for better or worse. Generally the higher the precision, the closer the answer is to the true answer, but not always.
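To see why one tenth has no finite binary representation, you can do the long division in base two using nothing but integers. The sketch below is only an illustration. BinaryExpansionDemo and BinaryDigits are hypothetical names, and the helper assumes the numerator is smaller than the denominator.

```csharp
using System;
using System.Text;

static class BinaryExpansionDemo
{
    // Long division in base 2: returns the first `count` binary digits after
    // the point of numerator / denominator (assumes 0 <= numerator < denominator).
    public static string BinaryDigits(int numerator, int denominator, int count)
    {
        var digits = new StringBuilder();
        int remainder = numerator;
        for (int i = 0; i < count; i++)
        {
            remainder *= 2;
            digits.Append(remainder / denominator);   // next binary digit, 0 or 1
            remainder %= denominator;
        }
        return digits.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("0." + BinaryDigits(1, 10, 16));   // 0.0001100110011001
        Console.WriteLine("0." + BinaryDigits(1, 4, 16));    // 0.0100000000000000
    }
}
```

For one tenth the remainder cycles through 2, 4, 8, 6 and never reaches zero, so the digits 0011 repeat forever. One quarter terminates after two digits, which is why it can be stored exactly.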
How can I get accurate decimal arithmetic computations then, if float and double use binary digits?
If you require accurate decimal math then use the decimal type; it uses decimal fractions, not binary fractions. The price you pay is that it is considerably larger and slower. And of course as we've already seen, fractions like one third or four sevenths are not going to be represented accurately. Any fraction that is actually a decimal fraction however will be represented with zero error, up to about 29 significant digits.
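A small sketch of both halves of that claim, with made-up names (DecimalDemo, TenthsAreExact, ThirdsRoundTrip):

```csharp
using System;

static class DecimalDemo
{
    // Tenths are decimal fractions, so decimal stores and adds them exactly.
    public static bool TenthsAreExact() => 0.1m + 0.2m == 0.3m;

    // One third is not a decimal fraction; the quotient is rounded,
    // so multiplying back by three does not recover exactly one.
    public static bool ThirdsRoundTrip() => (1m / 3m) * 3m == 1m;

    public static void Main()
    {
        Console.WriteLine(TenthsAreExact());    // True
        Console.WriteLine(ThirdsRoundTrip());   // False
    }
}
```

So decimal fixes the problem for numbers you could write down exactly on paper in base ten, and does nothing for the ones you could not.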
OK, I accept that all floating point schemes introduce inaccuracies due to representation error, and that those inaccuracies can sometimes accumulate or cancel each other out based on the number of bits of precision used in the calculation. Do we at least have the guarantee that those inaccuracies will be consistent?
No, you have no such guarantee for floats or doubles. The compiler and the runtime are both permitted to perform floating point calculations in higher precision than is required by the specification. In particular, the compiler and the runtime are permitted to do single-precision (32 bit) arithmetic in 64 bit or 80 bit or 128 bit or whatever bitness greater than 32 they like.
The compiler and the runtime are permitted to do so however they feel like it at the time. They need not be consistent from machine to machine, from run to run, and so on. Since this can only make calculations more accurate this is not considered a bug. It's a feature. A feature that makes it incredibly difficult to write programs that behave predictably, but a feature nevertheless.
So that means that calculations performed at compile time, like the literals 0.1 + 0.2, can give different results than the same calculation performed at runtime with variables?
Yep.
What about comparing the results of 0.1 + 0.2 == 0.3 to (0.1 + 0.2).Equals(0.3)?
Since the first one is computed by the compiler and the second one is computed by the runtime, and I just said that they are permitted to arbitrarily use more precision than required by the specification at their whim, yes, those can give different results. Maybe one of them chooses to do the calculation only in 64 bit precision whereas the other picks 80 bit or 128 bit precision for part or all of the calculation and gets a different answer.
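For concreteness, here are the three shapes of the comparison side by side. This is only a sketch with hypothetical names. I am deliberately not writing down expected output for the tenths, because the whole point is that it is allowed to vary with the compiler, the runtime and the hardware.

```csharp
using System;

static class CompileTimeVsRuntimeDemo
{
    // A constant expression: the compiler may evaluate the entire comparison itself.
    public static bool FoldedByCompiler() => 0.1 + 0.2 == 0.3;

    // Constant operands, but the call to Equals is executed by the runtime.
    public static bool EqualsAtRuntime() => (0.1 + 0.2).Equals(0.3);

    // Non-constant operands: the addition itself is executed by the runtime.
    public static bool VariablesAtRuntime(double a, double b, double c) => a + b == c;

    public static void Main()
    {
        Console.WriteLine(FoldedByCompiler());
        Console.WriteLine(EqualsAtRuntime());
        Console.WriteLine(VariablesAtRuntime(0.1, 0.2, 0.3));
    }
}
```

If you run this on your own machine, try both debug and release builds before drawing any conclusions from what it prints.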
So hold up a minute here. You're saying not only that 0.1 + 0.2 == 0.3 can be different than (0.1 + 0.2).Equals(0.3). You're saying that 0.1 + 0.2 == 0.3 can be computed to be true or false entirely at the whim of the compiler. It could produce true on Tuesdays and false on Thursdays, it could produce true on one machine and false on another, it could produce both true and false if the expression appeared twice in the same program. This expression can have either value for any reason whatsoever; the compiler is permitted to be completely unreliable here.
Correct.
The way this is usually reported to the C# compiler team is that someone has some expression that produces true when they compile in debug and false when they compile in release mode. That's the most common situation in which this crops up because the debug and release code generation changes register allocation schemes. But the compiler is permitted to do anything it likes with this expression, so long as it chooses true or false. (It cannot, say, produce a compile-time error.)
This is craziness.
Correct.
Who should I blame for this mess?
Not me, that's for darn sure.
Intel decided to make a floating point math chip in which it was far, far more expensive to produce consistent results. Small choices in the compiler about which operations to enregister and which to keep on the stack can add up to big differences in results.
How do I ensure consistent results?
Use the decimal type, as I said before. Or do all your math in integers.
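"Math in integers" usually means picking a fixed scale and storing whole numbers of that unit. A minimal sketch, assuming three decimal places are enough for your problem so that every value is a whole number of thousandths. The names ScaledIntegerDemo, Scale and FromTenths are invented for the example.

```csharp
using System;

static class ScaledIntegerDemo
{
    // Assumption for this sketch: values are stored as whole thousandths.
    public const long Scale = 1000;

    // Convert a whole number of tenths into thousandths.
    public static long FromTenths(long tenths) => checked(tenths * (Scale / 10));

    public static bool TenthPlusTwoTenthsIsThreeTenths()
    {
        long oneTenth = FromTenths(1);      // 100 thousandths
        long twoTenths = FromTenths(2);     // 200 thousandths
        long threeTenths = FromTenths(3);   // 300 thousandths
        return oneTenth + twoTenths == threeTenths;   // exact integer arithmetic
    }

    public static void Main()
    {
        Console.WriteLine(TenthPlusTwoTenthsIsThreeTenths());   // True
    }
}
```

Integer addition is exact and behaves the same everywhere; the cost is that you have to choose the scale up front and watch for overflow yourself.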
I have to use doubles or floats; can I do anything to encourage consistent results?
Yes. If you store any result into any static field, any instance field of a class or array element of type float or double then it is guaranteed to be truncated back to 32 or 64 bit precision. (This guarantee is expressly not made for stores to locals or formal parameters.) Also if you do a runtime cast to (float) or (double) on an expression that is already of that type then the compiler will emit special code that forces the result to truncate as though it had been assigned to a field or array element. (Casts which execute at compile time -- that is, casts on constant expressions -- are not guaranteed to do so.)
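In code, the two techniques look like this. Treat it as a sketch of the pattern rather than a promise about any particular build: TruncationDemo, CompareWithCasts and CompareViaArray are hypothetical names, and the casts look redundant on purpose.

```csharp
using System;

static class TruncationDemo
{
    // The operands are already double, so the casts look like no-ops,
    // but they ask for the intermediate results to be truncated to 64 bits.
    public static bool CompareWithCasts(double a, double b, double c)
        => (double)(a + b) == (double)c;

    // Stores into array elements of type double are truncated to 64 bits as well.
    public static bool CompareViaArray(double a, double b, double c)
    {
        double[] slots = new double[2];
        slots[0] = a + b;
        slots[1] = c;
        return slots[0] == slots[1];
    }

    public static void Main()
    {
        Console.WriteLine(CompareWithCasts(0.1, 0.2, 0.3));
        Console.WriteLine(CompareViaArray(0.1, 0.2, 0.3));
    }
}
```

Note that the inputs here are method parameters, not constant expressions, so the casts execute at runtime. Neither version makes one tenth plus two tenths equal three tenths; the aim is only that both comparisons are made on values truncated to 64 bits.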
To clarify that last point: does the C# language specification make those guarantees?
No. The runtime guarantees that stores into an array or field truncate. The C# specification does not guarantee that an identity cast truncates but the Microsoft implementation has regression tests that ensure that every new version of the compiler has this behaviour.
All the language spec has to say on the subject is that floating point operations may be performed in higher precision at the discretion of the implementation.
Math.Abs(.1d + .2d - .3d) < double.Epsilon This should be the better equality method. – Caudill
== is not "reference" comparison, and .Equals() is not "value" comparison. Their implementation is type-specific. – Aluminate
== case, using higher precision to calculate the result. Hence the result is the same as doing the comparison using double precision. – Veronaveronese
float + float == float or double + double == double could be false when it appears to be true. – Mandarin
0.1 + 0.2 == 0.3 that is a constant expression which can be entirely computed at compile time. In (0.1 + 0.2).Equals(0.3) the 0.1 + 0.2 and the 0.3 are all constant expressions but the equality is computed by the runtime, not by the compiler. Is that clear? – Creature