Floating point equality

It is common knowledge that one has to be careful when comparing floating point values. Usually, instead of using ==, we use some epsilon- or ULP-based equality testing.
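
For reference, a typical epsilon-based comparison looks something like this (a minimal sketch with an arbitrary tolerance, just to illustrate what I mean):

#include <algorithm>
#include <cmath>

// Treat two floats as equal when they are within a relative tolerance
// of each other. The tolerance value here is an arbitrary choice.
bool almost_equal(float x, float y, float rel_eps = 1e-5f) {
    return std::fabs(x - y) <= rel_eps * std::max(std::fabs(x), std::fabs(y));
}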

However, I wonder: are there any cases when using == is perfectly fine?

Look at this simple snippet: which cases are guaranteed to succeed?

void fn(float a, float b) {
    float l1 = a/b;
    float l2 = a/b;

    if (l1==l1) { }        // case a)
    if (l1==l2) { }        // case b)
    if (l1==a/b) { }       // case c)
    if (l1==5.0f/3.0f) { } // case d)
}

int main() {
    fn(5.0f, 3.0f);
}

Note: I've checked this and this, but they don't cover (all of) my cases.

Note 2: It seems that I have to add some further information so that answers can be useful in practice. I'd like to know:

  • what the C++ standard says
  • what happens, if a C++ implementation follows IEEE-754

This is the only relevant statement I found in the current draft standard:

The value representation of floating-point types is implementation-defined. [ Note: This document imposes no requirements on the accuracy of floating-point operations; see also [support.limits]. — end note ]

So, does this mean that even "case a)" is implementation-defined? I mean, l1==l1 is definitely a floating-point operation. So, if an implementation is "inaccurate", could l1==l1 be false?


I think this question is not a duplicate of Is floating-point == ever OK?. That question doesn't address any of the cases I'm asking about. Same subject, different question. I'd like to have answers specifically for cases a)-d), for which I cannot find answers in the duplicate question.

Autotrophic answered 2/7, 2018 at 10:26 Comment(22)
Maybe take a look here as wellLeandra
@interjay: yes, this is what I think too, but I'm not 100% sure. IEEE 754 mandates this, but I don't know what the C++ standard says about this, if it says anything at all.Autotrophic
@Pi: that's one of my links in the question. I've checked the answers there, but not all my cases are covered there.Autotrophic
So to be clear, you want a pure C++ answer without using IEEE 754 guarantees?Bocage
There is no guarantee at all.Durden
For the cases in the code section I would agree, but for the last statement 'it is not even guaranteed that a == a in floating point, you basically have to use an epsilon comparison' I would not be so sure there is no guarantee.Leandra
@BaummitAugen: I think I'm interested in both (I mean pure C++ and IEEE 754). The more we know about this subject, the better.Autotrophic
@SombreroChicken: even for case a)?Autotrophic
@interjay That's untrue.Durden
Related: https://mcmap.net/q/372838/-can-we-rely-on-op-to-binary-compare-floating-point-values/560648Trituration
Couldn't you check the type and/or byte size of the objects, and if they're the same, use '==' effectively?Vista
The very answer you'll ever need: Never ==.Achilles
@iBug: can you explain why?Autotrophic
Comments by users such as SombreroChicken and ibug are dangerously, irresponsibly misleading. Despite myths to the contrary, floating-point equality always works, and floating-point computation is both deterministic and specified in detail, with many hard guarantees about accuracy. FUD on the topic only serves to make a generation of programmers stupid, and the upvotes on those comments show that they have succeeded admirably in doing so.Brython
Scary things still happen.Marseilles
@PeterMortensen: scary things happen a lot with floating point, yet we can use it. If the advice is "Never ==", then there should be a strong justification for it. The linked answers don't provide it. If "Never ==" is the true answer, then why doesn't C++ forbid it in the first place? I'm not saying that "Never ==" is untrue, but I'd like to see some proof. For example a proof, that even my case a) is implementation defined, so a==a cannot be trusted to be true.Autotrophic
Reopened; the "duplicate" does not address a == a. It would probably improve this question to make it only about a == a, however, as the other cases are covered by the "duplicate" or other questions.Ready
Suggest adding language-lawyer tag to get answers based on the standardReady
@M.M: Thanks for reopening! Yes, "case c) and d)" can maybe be answered by the "duplicate", but I don't see answers for "case a) and b)". I think I'll leave the question as it is, as one can see the difference between them (if there is any). But if no answers arrive for "case a) and b)", I'll maybe simplify the question.Autotrophic
@Achilles "Never" Then what is "safe" on fp values?Krueger
Did you see What is the most effective way for float and double comparison?? An answer points to the Google Test Framework; check the "AlmostEqual" function.Gunar
@JHBonarius: thanks, I already know this technique :) But now the question is not about ULP/epsilon-based comparison, but the contrary: exact equality.Autotrophic

However, I wonder, are there any cases, when using == is perfectly fine?

Sure there are. One category of examples is usages that involve no computation, e.g. setters that should only execute on changes:

void setRange(float min, float max)
{
    if(min == m_fMin && max == m_fMax)
        return;

    m_fMin = min;
    m_fMax = max;

    // Do something with min and/or max
    emit rangeChanged(min, max);
}

See also Is floating-point == ever OK?.

Newsom answered 2/7, 2018 at 11:12 Comment(18)
It's not about that.Durden
@SombreroChicken: Why not?Trituration
@LightnessRacesinOrbit Since this answer focuses on the cases where using operator== would be fine as in "is it useful?" not as in "does it result in useful results?"Durden
@SombreroChicken: What's the difference?Trituration
@LightnessRacesinOrbit It answers "why would you use ==" instead of "can I safely use =="Durden
Correctly or not, it does answer this bit: "I wonder, are there any cases, when using == is perfectly fine?"Colbert
@SombreroChicken: Right, and the first one was the question. If you want the second one, look here.Trituration
Depending on en.wikibooks.org/wiki/X86_Disassembly/Floating_Point_Numbers you will get different behaviors in this function: if the value is passed in a register and the changed one is in memory, the comparison could always be false, even if you assigned this value to memory.Flair
This also hinges on something yet unconfirmed by any citation. Does a=c; b=c; make a==b true? It would be weird if it didn't, but I haven't seen it stated in any standard. Neither that, nor that functions must return the same binary representation for the same arguments. One thing I've found is that f(x,y) SHOULD be equal to f(y,x) for basic arithmetic operations and non-signalling, non-inexact floating point representations.Colbert
@Colbert The x86 FP stack has bigger precision than the stored value. The ABI on x86 allows passing floats on the FP stack. Now choose a value that has greater precision than a memory float but can still be stored on the FP stack. When you do m_max = arg; it will truncate the value. Then every test m_max == arg will fail, because in reality you check trunc(arg) == arg.Flair
@Colbert I managed to "break" code like this using the -mfpmath=387 -ffast-math GCC flags: godbolt.org/g/MDaJDu. test3 works like test2 but uses test1. I needed these flags because the float precision problems affect x86, not x86-64, which always has SSE.Flair
@Flair IOW, x86 has broken floats.Krueger
@Krueger Probably yes, but if used correctly you will not have problems, and the standard allows this. If you use a fixed compiler and hardware you can even ignore UB, but will this be portable? No.Flair
@Flair Which part of the standard allows it? How can you use broken, undefined semantics correctly?Krueger
@Krueger "3.9.1 Fundamental type": "The value representation of floating-point types is implementation-defined"Flair
@Flair And? Fp types are trivial types. And even for a non trivial type, where does it say that fp operations have unspecified results?Krueger
@Krueger "The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby", this break == and allow to < work correctly. x86 FPU obey this point.Flair
Let us continue this discussion in chat.Krueger

Contrived cases may "work". Practical cases may still fail. One additional issue is that optimisation will often cause small variations in the way the calculation is done, so that symbolically the results should be equal but numerically they are different. The example above could, theoretically, fail in such a case. Some compilers offer an option to produce more consistent results at a cost to performance. I would advise "always" avoiding equality tests on floating point numbers.
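
One concrete source of such variation (my illustration, not something this answer claims about any particular compiler) is floating-point contraction: a compiler may fuse a*b - 1.0 into a single fused multiply-add, which rounds once instead of twice:

#include <cmath>
#include <cstdio>

int main() {
    double eps = std::ldexp(1.0, -27);  // 2^-27; both operands below are exact
    double a = 1.0 + eps;
    double b = 1.0 - eps;
    // Mathematically a*b - 1.0 is -2^-54. Rounding a*b to double first
    // gives exactly 1.0, so the two-rounding version yields 0.0 ...
    double two_roundings = a * b - 1.0;
    // ... while a fused multiply-add rounds only once and keeps -2^-54.
    double one_rounding = std::fma(a, b, -1.0);
    // A compiler that contracts the first expression into an FMA makes
    // the two results agree; one that doesn't makes them differ.
    std::printf("%g vs %g\n", two_roundings, one_rounding);
}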

Equality of physical measurements, as well as digitally stored floats, is often meaningless. So if you're comparing floats for equality in your code, you are probably doing something wrong. You usually want greater than, less than, or within a tolerance. Often code can be rewritten so these types of issues are avoided.

Nidianidicolous answered 2/7, 2018 at 11:27 Comment(6)
Not always, see user2301274's answer. Equality can be used the check for a variable change. For example, if a set function uses a value which is the same as the current value, it can return right away.Autotrophic
@Autotrophic Or, more generally, float equality implies "real-world" equality, but the other way around is not necessarily true.Tice
@geza: But it's pretty easy to contrive cases where that set function wouldn't work. E.g. originally set to 1.0, later set to a computed value that's actually 1.(enough zeros for precision)1.Sulfurous
"often meaningless" Than what is guaranteed to be meaningful?Krueger
@Krueger - a meaningful solution is one that actually meets the real formal requirements that you have distilled or mined from the customer's brief. Other than for the investigation of computer science, if you are coding if(length_of_wood == 3.2), your formal requirements are probably wrong. The brief may be that the wood is 3.2m long. Coding this is "meaningless". The formal requirement may be something like, the measured length of the wood should be between 3.19 and 3.21m. Meaningful is meeting the customer brief. (NOT coding the brief as though they are requirements.)Nidianidicolous
@WilliamJBagshaw From a real world applicability POV, yes. From a language semantic integrity POV, "values of a type" containing more information than that type allows is an abomination.Krueger

Only a) and b) are guaranteed to succeed in any sane implementation (see the legalese below for details), as they compare two values that have been derived in the same way and rounded to float precision. Consequently, both compared values are guaranteed to be identical to the last bit.

Cases c) and d) may fail because the computation and the subsequent comparison may be carried out with higher precision than float. The different rounding of double should be enough to fail the test.

Note that cases a) and b) may still fail if infinities or NaNs are involved, though.
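
As an aside (a sketch of mine, not part of the original answer): you can ask the implementation whether it evaluates floating expressions in greater precision — the mechanism behind the possible failure of cases c) and d) — via the FLT_EVAL_METHOD macro from <cfloat>:

#include <cfloat>
#include <cstdio>

int main() {
    // -1: indeterminable
    //  0: evaluate in the type of the operands
    //  1: evaluate float and double operations in double
    //  2: evaluate everything in long double (classic x87 behaviour)
    std::printf("FLT_EVAL_METHOD = %d\n", FLT_EVAL_METHOD);
}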


Legalese

Using the N3242 C++11 working draft of the standard, I find the following:

In the text describing the assignment expression, it is explicitly stated that type conversion takes place, [expr.ass] 3:

If the left operand is not of class type, the expression is implicitly converted (Clause 4) to the cv-unqualified type of the left operand.

Clause 4 refers to the standard conversions [conv], which contain the following on floating point conversions, [conv.double] 1:

A prvalue of floating point type can be converted to a prvalue of another floating point type. If the source value can be exactly represented in the destination type, the result of the conversion is that exact representation. If the source value is between two adjacent destination values, the result of the conversion is an implementation-defined choice of either of those values. Otherwise, the behavior is undefined.

(Emphasis mine.)

So we have the guarantee that the result of the conversion is actually defined, unless we are dealing with values outside the representable range (like float a = 1e300, which is UB).

When people think about "internal floating point representation may be more precise than visible in code", they think about the following sentence in the standard, [expr] 11:

The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.

Note that this applies to operands and results, not to variables. This is emphasized by the attached footnote 60:

The cast and assignment operators must still perform their specific conversions as described in 5.4, 5.2.9 and 5.17.

(I guess this is the footnote that Maciej Piechotka meant in the comments - the numbering seems to have changed in the version of the standard he's been using.)

So, when I say float a = some_double_expression;, I have the guarantee that the result of the expression is actually rounded to be representable by a float (invoking UB only if the value is out-of-bounds), and a will refer to that rounded value afterwards.
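
A small sketch of what that guarantee buys (my example values, not the answer's):

#include <cassert>

int main() {
    double d = 5.0 / 3.0; // double-precision quotient
    float f1 = d;         // [conv.double]: rounded to a representable float here
    float f2 = d;         // same conversion, same rounded result
    assert(f1 == f2);     // guaranteed: both are the float rounding of d
    assert(f1 != d);      // holds for this value: the comparison promotes
                          // f1 to double, and d keeps bits that f1 lost
}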

An implementation could indeed specify that the result of the rounding is random, and thus break the cases a) and b). Sane implementations won't do that, though.

Enclitic answered 2/7, 2018 at 11:23 Comment(0)

Assuming IEEE 754 semantics, there are definitely some cases where you can do this. Conventional floating point number computations are exact whenever they can be, which for example includes (but is not limited to) all basic operations where the operands and the results are integers.

So if you know for a fact that you don't do anything that would result in something unrepresentable, you are fine. For example

#include <cassert>

int main() {
    float a = 1.0f;
    float b = 1.0f;
    float c = 2.0f;
    assert(a + b == c); // you can safely expect this to succeed
}

The situation only really gets bad if you have computations with results that aren't exactly representable (or that involve operations which aren't exact) and you change the order of operations.
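
For instance (a sketch of mine, assuming IEEE 754 single precision):

#include <cstdio>

int main() {
    float big = 1e20f, neg = -1e20f, one = 1.0f;
    float s1 = (big + neg) + one; // 0.0f + 1.0f == 1.0f
    float s2 = big + (neg + one); // 1.0f is absorbed: neg + one == -1e20f,
                                  // so the sum is 0.0f
    std::printf("%g vs %g\n", s1, s2); // prints "1 vs 0"
}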

Note that the C++ standard itself doesn't guarantee IEEE 754 semantics, but that's what you can expect to be dealing with most of the time.

Lavoisier answered 2/7, 2018 at 15:6 Comment(1)
@Ben oops, yeah I've been editing that example several times. Ended up removing it, you're right even if I fixed it it wouldn't add that much.Lavoisier

Case (a) fails if a and b are both 0.0. In that case, the division yields NaN, and by definition (IEEE, not C) NaN ≠ NaN.
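
This is easy to demonstrate (my sketch, assuming IEEE 754 semantics):

#include <cmath>
#include <cstdio>

int main() {
    float zero = 0.0f;
    float n = zero / zero;              // NaN under IEEE 754
    std::printf("%d\n", n == n);        // prints 0: NaN is unequal to itself
    std::printf("%d\n", std::isnan(n)); // prints 1: the portable check
}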

Cases (b) and (c) can fail in parallel computation when floating-point rounding modes (or other computation modes) are changed in the middle of this thread's execution. I've seen this one in practice, unfortunately.

Case (d) can be different because the compiler (on some machine) may choose to constant-fold the computation of 5.0f/3.0f and replace it with the constant result (of unspecified precision), whereas a/b must be computed at runtime on the target machine (which might be radically different). In fact, intermediate calculations may be performed in arbitrary precision. I've seen differences on old Intel architectures when intermediate computation was performed in 80-bit floating-point, a format that the language didn't even directly support.

Heisler answered 4/7, 2018 at 1:46 Comment(2)
"changed in the middle of this thread's execution" Do you mean changed by one thread with impact on others?Krueger
@Krueger — Sadly, yes. Specifically, this was on Intel architecture, where the D3D library changed the CPU's floating-point rounding mode and didn't reset it. Since the FP mode was a property of the CPU, it affected all threads. Now, the likelihood of this happening is low, but it has happened to me.Heisler

In my humble opinion, you should not rely on the == operator because it has many corner cases. The biggest problem is rounding and extended precision. In the case of x86, floating point operations can be done with bigger precision than you can store in variables (if you use the coprocessor; IIRC, SSE operations use the same precision as storage).

This is usually a good thing, but it causes problems like 1./2 != 1./2, because one value comes from a variable and the other from a floating point register. In the simplest cases it will work, but if you add other floating point operations, the compiler could decide to spill some variables to the stack, changing their values and thus changing the result of the comparison.

To have 100% certainty, you need to look at the assembly and see what operations were done on both values beforehand. Even the order can change the result in non-trivial cases.

Overall, what is the point of using ==? You should use algorithms that are stable, meaning they work even if values are not exactly equal but still give the same results. The only place I know where == could be useful is serializing/deserializing, where you know exactly what result you want and can alter the serialization to achieve your goal.
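
In the serializing/deserializing case, what you want is a bit-exact round-trip, which is exactly the situation where == is meaningful. A minimal sketch (to_bits/from_bits are hypothetical helpers, not from this answer):

#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret a float's storage as an integer
// and back, which is well-defined via memcpy.
std::uint32_t to_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

float from_bits(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

int main() {
    float x = 5.0f / 3.0f;
    // A serialize/deserialize cycle that preserves the bits preserves
    // equality exactly (barring NaN, which never compares equal).
    assert(from_bits(to_bits(x)) == x);
}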

Flair answered 2/7, 2018 at 11:44 Comment(0)
