Why is NaN not equal to NaN? [duplicate]
C

6

169

The relevant IEEE standard defines a numeric constant NaN (not a number) and prescribes that NaN should compare as not equal to itself. Why is that?

All the languages I'm familiar with implement this rule. But it often causes significant problems, for example unexpected behavior when NaN is stored in a container, when NaN is in the data that is being sorted, etc. Not to mention, the vast majority of programmers expect any object to be equal to itself (before they learn about NaN), so surprising them adds to the bugs and confusion.

IEEE standards are well thought out, so I am sure there is a good reason why NaN comparing as equal to itself would be bad. I just can't figure out what it is.

Edit: please refer to What is the rationale for all comparisons returning false for IEEE754 NaN values? as the authoritative answer.

Condottiere answered 5/4, 2012 at 18:43 Comment(1)
The IEEE standards were designed by engineers, not programmers, computer vendors, or authors of math libraries, for whom the NaN rule is a disaster.Karleen
C
40

My original answer (from 4 years ago) criticizes the decision from the modern-day perspective without understanding the context in which the decision was made. As such, it doesn't answer the question.

The correct answer is given here:

NaN != NaN originated out of two pragmatic considerations:

[...] There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years

There was one disadvantage to that approach: it made NaN less useful in many situations unrelated to numerical computation. For example, much later when people wanted to use NaN to represent missing values and put them in hash-based containers, they couldn't do it.
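To make the container problem concrete, here is a small JavaScript sketch. The `readings` array and the variable names are illustrative assumptions, not something from the quoted post. Lookups that rely on `===` cannot find a stored NaN, while lookups specified with the SameValueZero comparison can:

```javascript
// Hypothetical data set where NaN marks a missing value.
const readings = [1.5, NaN, 3.0];

// indexOf compares with ===, so the stored NaN is never found.
const foundByEquality = readings.indexOf(NaN) !== -1;

// includes and Set use SameValueZero, which treats NaN as equal to NaN.
const foundByIncludes = readings.includes(NaN);
const uniqueCount = new Set([NaN, NaN, 1.5]).size;

console.log(foundByEquality, foundByIncludes, uniqueCount); // false true 2
```

Other languages make different choices here, so this only illustrates how one language works around the rule for containers.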

If the committee foresaw future use cases, and considered them important enough, they could have gone for the more verbose !(x<x & x>x) instead of x!=x as a test for NaN. However, their focus was more pragmatic and narrow: providing the best solution for a numeric computation, and as such they saw no issue with their approach.
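As a rough illustration of the `x != x` test discussed above, the following JavaScript sketch detects NaN with nothing but a comparison. The helper name `isNaNByComparison` is made up for this example:

```javascript
// Detects NaN using only a comparison, with no isnan()-style library call.
// isNaNByComparison is a hypothetical helper name.
function isNaNByComparison(x) {
  return x !== x; // true only when x is NaN
}

console.log(isNaNByComparison(NaN));      // true
console.log(isNaNByComparison(0 / 0));    // true
console.log(isNaNByComparison(Infinity)); // false
console.log(isNaNByComparison(42));       // false
```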

===

Original answer:

I am sorry, much as I appreciate the thought that went into the top-voted answer, I disagree with it. NaN does not mean "undefined" - see http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF, page 7 (search for the word "undefined"). As that document confirms, NaN is a well-defined concept.

Furthermore, the IEEE approach was to follow the regular mathematics rules as much as possible, and when they couldn't, follow the rule of "least surprise" - see https://mcmap.net/q/24488/-what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values. Any mathematical object is equal to itself, so the rules of mathematics would imply that NaN == NaN should be True. I cannot see any valid and powerful reason to deviate from such a major mathematical principle (not to mention the less important rules of trichotomy of comparison, etc.).

As a result, my conclusion is as follows.

IEEE committee members did not think this through very clearly, and made a mistake. Since very few people understood the IEEE committee approach, or cared about what exactly the standard says about NaN (to wit: most compilers' treatment of NaN violates the IEEE standard anyway), nobody raised an alarm. Hence, this mistake is now embedded in the standard. It is unlikely to be fixed, since such a fix would break a lot of existing code.

Edit: Here is one post from a very informative discussion. Note: to get an unbiased view you have to read the entire thread, as Guido takes a different view to that of some other core developers. However, Guido is not personally interested in this topic, and largely follows Tim Peters' recommendation. If anyone has Tim Peters' arguments in favor of NaN != NaN, please add them in comments; they have a good chance to change my opinion.

Condottiere answered 8/4, 2012 at 1:50 Comment(15)
IMHO, having NaN violate trichotomy makes sense, but like you I see no reasonable semantic justification for not having == define an equivalence relation when its operands are both of the same type (going a little further, I think languages should explicitly disallow comparisons between things of different types--even when implicit conversions exist--if such comparisons cannot implement an equivalence relation). The concept of an equivalence relation is so fundamental in both programming and mathematics that it seems crazy to violate it.Catboat
You might read on; Kahan says elsewhere in that document "NaNs must conform to mathematically consistent rules that were deduced, not invented arbitrarily[.]" I will agree that he doesn't mention how NaN != NaN is deduced beyond saying it's needed to distinguish NaN from non-NaNs absent library support like isnan().Gayegayel
Note that even if we consider NaN to represent an unknown value (and hence each NaN may be unequal to another), we cannot conclude that two NaN values necessarily are unequal to one another. In short: it might make sense for NaN == NaN to itself return NaN (or some other representation for uncomputable in your language - undefined, a raised exception, etc.), but it's definitely weird to simply return false.Contagious
@EamonNerbonne: Having NaN==NaN return something other than true or false would have been problematic, but given that (a<b) does not necessarily equal !(a>=b), I see no reason that (a==b) must necessarily equal !(a!=b). Having NaN==NaN and NaN!=NaN both return false would allow code which needs either definition of equality to use the one it needs.Catboat
This answer is WRONG WRONG WRONG! See my answer below.Wingback
You generally don't run across NaNs. The only way you can compute one is to ask for something that isn't mathematically defined. Once you've decided to have an error value, the question is "what semantics are least likely to cause trouble?" I believe that always returning false is the safest thing to do. So you get mis-sorted containers, how harmful is that? Especially compared to alternatives.Lycian
I am not aware of any axiom or postulate that states a mathematical object (how do you even define a mathematical object????) has to equal itself.Slaughterhouse
Even if you're basing this on the identity function f on a set S where f(x) = x, I would argue that NaN is not part of the set of numbers; after all, it's literally not a number. So I don't see any argument from the identity function that NaN should equal itself.Slaughterhouse
With regards to your comment on trichotomy, trichotomy is an order relation on a set of numbers. Again I would argue that NaN is not in the set of numbers. I don't see anything in the IEEE specification that says it is, although I might have missed it.Slaughterhouse
@Slaughterhouse Look up the "Law of Identity". The concept of equality (and its brothers consistency and completeness) is way too complex to be handled in a comment; but I will say that if you assume x=x can be false, you have opened up a giant can of worms you have to deal with before we can even consider the identity function, since the "=" in f(x)=x is not well defined.Necrology
@chuu look up DialetheismSlaughterhouse
This is wrong. Please see the top-voted answer below by russbishopAdventure
@Adventure what exactly is wrong? My original answer? My updated answer? The comment just above yours? Or any other particular statement?Condottiere
I would not even consider the possibility that "IEEE committee members did not think this through very clearly, and made a mistake" without either investigating the committee members enough to assure myself that they are not actually experts, or an expert (ideally a person on the IEEE committee or a compiler writer) making the same complaint. IEEE is both important and a burden to implement, so discussions of it are not uncommon. You at least found a discussion involving such an expert. However, I find your reasoning was wrong, rather than merely your conclusion.Plumbo
"There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years" I don't understand this rationale. Couldn't one detect NaN values by comparing them for equality with a NaN value, provided as a constant in the language?Piker
W
204

The accepted answer is 100% without question WRONG. Not halfway wrong or even slightly wrong. I fear this issue is going to confuse and mislead programmers for a long time to come when this question pops up in searches.

NaN is designed to propagate through all calculations, infecting them like a virus, so if somewhere in your deep, complex calculations you hit upon a NaN, you don't bubble out a seemingly sensible answer. Otherwise by identity NaN/NaN should equal 1, along with all the other consequences like (NaN/NaN)==1, (NaN*1)==NaN, etc. If your calculations went wrong somewhere (for example, rounding reduced both numerator and denominator to zero, yielding NaN), then you could get wildly incorrect (or worse: subtly incorrect) results with no obvious indicator as to why.
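A minimal JavaScript sketch of this propagation, assuming nothing beyond the standard `Math` and `Number` built-ins (the variable names are illustrative):

```javascript
const nan = 0 / 0; // an invalid operation produces NaN

// Arithmetic involving NaN yields NaN again.
const results = [nan + 1, nan * 1, nan / nan, Math.sqrt(nan)];
console.log(results.every(Number.isNaN)); // true

// NaN / NaN does not collapse to 1.
console.log(nan / nan === 1); // false

// A non-zero finite value divided by zero gives an infinity, not NaN.
console.log(1 / 0); // Infinity
```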

There are also really good reasons for NaNs in calculations when probing the value of a mathematical function; one of the examples given in the linked document is finding the zeros() of a function f(). It is entirely possible that, in the process of probing the function with guess values, you will probe one where f() yields no sensible result. Returning NaN at that point allows zeros() to see the NaN and continue its work.
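The idea can be sketched as follows. This is not the zeros() routine from the linked document, just a simplified sign-change scan under assumed names (`f`, `findSignChanges`) that skips probes where the function returns NaN instead of aborting:

```javascript
// Hypothetical function under study: yields NaN for x < 0.
function f(x) {
  return Math.sqrt(x) - 1.5;
}

// Scans [lo, hi] in fixed steps and returns intervals where fn changes sign.
// Probes that yield NaN are skipped instead of aborting the search.
function findSignChanges(fn, lo, hi, steps) {
  const brackets = [];
  let prevX = null;
  let prevY = null;
  for (let i = 0; i <= steps; i++) {
    const x = lo + ((hi - lo) * i) / steps;
    const y = fn(x);
    if (Number.isNaN(y)) {
      // No sensible value here; forget the previous point and move on.
      prevX = null;
      prevY = null;
      continue;
    }
    if (prevY !== null && prevY * y < 0) {
      brackets.push([prevX, x]);
    }
    prevX = x;
    prevY = y;
  }
  return brackets;
}

// Probes x = -5, -4, ..., 10; the negative probes return NaN and are ignored.
const brackets = findSignChanges(f, -5, 10, 15);
console.log(brackets); // [ [ 2, 3 ] ]
```

A real root finder would then refine each bracket, for example by bisection; that step is left out to keep the sketch short.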

The alternative to NaN is to trigger an exception as soon as an illegal operation is encountered (also called a signal or a trap). Besides the massive performance penalties you might encounter, at the time there was no guarantee that the CPUs would support it in hardware or the OS/language would support it in software; everyone was their own unique snowflake in handling floating-point. IEEE decided to explicitly handle it in software as the NaN values so it would be portable across any OS or programming language. Correct floating point algorithms are generally correct across all floating point implementations, whether that be node.js or COBOL (hah).

In theory, you don't have to set specific #pragma directives, set crazy compiler flags, catch the correct exceptions, or install special signal handlers to make what appears to be the identical algorithm actually work correctly. Unfortunately some language designers and compiler writers have been really busy undoing this feature to the best of their abilities.

Please read some of the information about the history of IEEE 754 floating point. Also this answer on a similar question where a member of the committee responded: What is the rationale for all comparisons returning false for IEEE754 NaN values?

"An Interview with the Old Man of Floating-Point"

"History of IEEE Floating-Point Format"

What every computer scientist should know about floating point arithmetic

Wingback answered 14/5, 2014 at 23:6 Comment(17)
I also like NaN to propagate "like a virus". Unfortunately, it doesn't. The moment you compare, for example, NaN + 1 != 0, or NaN * 1 > 0, it returns True or False as if everything was fine. Therefore, you can't rely on NaN protecting you from problems if you plan to use comparison operators. Given that comparisons won't help you "propagate" NaNs, why not at least make them sensical? As things stand, they mess up the use cases of NaN in dictionaries, they make sort unstable, etc. Also, a minor mistake in your answer. NaN/NaN == 1 would not evaluate True if I had my way.Condottiere
Also, you claim that my answer is 100% positively absolutely WRONG. However, the person on the IEEE committee whom you quoted actually stated in the very post you quoted: "Many commenters have argued that it would be more useful to preserve reflexivity of equality and trichotomy on the grounds that adopting NaN != NaN doesn’t seem to preserve any familiar axiom. I confess to having some sympathy for this viewpoint, so I thought I would revisit this answer and provide a bit more context." So maybe, dear Sir, you might consider being a bit less forceful in your statements.Condottiere
I understand, but I wanted it to be 100% clear to new developers or those not interested in the details of floating point that your statement that the committee made a "mistake" was incorrect - the design was deliberate.Wingback
I never said the design wasn't deliberate. A deliberate design guided by poor logic or poor understanding of the problem is still a mistake. But this discussion is pointless. You clearly possess the knowledge of the ultimate truth, and your job is to preach it to the uneducated masses like myself. Enjoy the priesthood.Condottiere
Ignoring your snark, I explained the decision to you: at the time there was little chance of uniform handling of exceptions/signals across languages, compilers, hardware, and operating systems. The decision was made to handle NaN as a bit pattern so it would be completely portable, since it is useful in actual calculations. I think that's the correct decision, you disagree, but the decision was not guided by poor logic or poor understanding.Wingback
Spreading NaN through calculations is completely unrelated to equality comparisons with NaN. Portability and implementing NaN as a bit pattern is also immaterial for the question whether NaN should compare equal to itself or not. In fact, I can't find any rationale for NaN != NaN in this answer, except for the first linked answer at the bottom, which explains that the reason was the unavailability of isnan() at the time, which is valid reason why the decision was taken. However, I can't see any reason that is still valid today, except that it would be a very bad idea to change the semantics.Annal
@SvenMarnach That's not correct. NaN means the answer is unknown. Trying to compare NaN is semantically meaningless. Look at the second answer below... should log(-1) == acos(2)? Because if NaN == NaN then they do which is obvious nonsense. If most languages, platforms, and CPUs had supported trapping/throwing exceptions at the time then the standard might have chosen that route instead of NaN. They didn't. Most developers misuse floating point anyway, you shouldn't be comparing for direct equality most of the time anyway!Wingback
@xenadu I can see that log(-1) == acos(2) provides some argument in favour of the current behaviour. However, you noticed yourself that you shouldn't be comparing floating point numbers for equality anyway, so that's kind of a weak argument (and there are many reasons to decide the other way). However, that wasn't the point of my previous comment. My point was that the answer above, while correct, does not give any reasons why NaN shouldn't compare equal to itself. Everything you talk about is completely unrelated to that question.Annal
@SvenMarnach It does directly relate because you don't know a priori that the values in some x and y are NaN so comparing them as equal would be an invalid answer.Wingback
@xenadu but comparing them as not equal would also be an invalid answer. Should log(-1) == log(-1)? IMO == and != are both incorrect when it comes to comparing two NaNs, but for the sake of practicality == breaks a lot fewer laws, and a lot fewer containers.Honshu
Some of the consequences you describe are not true, and the rest are off topic. For instance, "Otherwise by identity NaN/NaN should equal 1." The standard itself violates this reasoning. Infinity == Infinity, and yet Infinity / Infinity == NaN, not 1. Clearly, therefore it is possible for both NaN == NaN and NaN / NaN == NaN. Secondly, the statement (NaN*1)==NaN MUST BE TRUE for NaN to "propagate through all calculations" as you yourself said. These supporting statements are wrong, and NONE of your cited sources claim them. Rather, your first source says it was to satisfy isnan(x) = (x != x)Intertidal
There's a semantic difference between Infinity and NaN though. Infinity has a mathematical meaning. Not a number quite literally means the answer is nonsense. As a programmer I understand the knee-jerk reaction to having two "apparently" equal values compare not equal. Floating point doesn't fit neatly into boolean logic (in more ways than one).Wingback
The statement "Otherwise by identity NaN/NaN should equal 1" is a mathematical statement/rule. Are you trying to say that your mathematical rule does not apply to Infinity because Infinity is a mathematical concept? Obviously that's not true. The real rule is here: math.stackexchange.com/questions/1773367/… It does not apply to infinity, because no value x times infinity equals 1. Similarly, no value x times NaN equals 1. Rather than argue with your commenters, please provide sources, as you still have none.Intertidal
"snowflake"? Really?Abdella
Note that x / 0.0 is +-Inf for any non-zero finite x. A zero denominator aka divisor only produces NaN if the numerator aka dividend is also 0 (0. / 0.). Infinities also tend to be sticky, except for x/inf == 0.0 for finite x; there are some practical justifications for that, like getting the right result when a denominator calculation overflowed, but there are also cases where 0.0 isn't really a good answer, like DBL_MAX / (DBL_MAX * 2) == 0.0. But anyway, that's infinity, not NaN; NaN does propagate more strongly.Peggy
Re: FP exceptions: it is possible in some C and C++ implementations, and in asm for most CPUs, to "unmask" the FP-invalid exception, so an "invalid" operation like sqrt(-1) or inf / inf. does trap in hardware, instead of producing NaN and setting a sticky bit in the FP environment. e.g. in GNU C++, feenableexcept() can unmask a set of FP exceptions so they trap. (en.cppreference.com/w/cpp/numeric/fenv), as in this tutorial answer What is the difference between quiet NaN and signaling NaN?Peggy
I disagree with you, as well as with the IEEE choice. Comparing log(-1) and arccos(2) doesn't make sense on a logical plane; both of those things simply do not exist. But when you decide to assign a "value" to something (where I intend the word "value" in the large sense; the word "nan" is a value), you need to pull stuff together. Of course, nan/nan = nan, not 1, because it's not an "arithmetically valid" value, pretty much like "infinity/infinity" is not 1. Most of the examples in favour of nan!=nan do not hold. == does not represent an arithmetical operation, but a logical operation.Northeastward
S
125

Well, log(-1) gives NaN, and acos(2) also gives NaN. Does that mean that log(-1) == acos(2)? Clearly not. Hence it makes perfect sense that NaN is not equal to itself.
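For reference, this is how the two expressions behave in JavaScript (a small sketch; `a` and `b` are just illustrative names):

```javascript
const a = Math.log(-1); // NaN: logarithm of a negative number
const b = Math.acos(2); // NaN: argument outside [-1, 1]

console.log(Number.isNaN(a), Number.isNaN(b)); // true true
console.log(a === b);                          // false
console.log(a === a);                          // false
```

The last line shows that the rule makes no exception for a NaN compared with itself, even when it came from the same expression.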

Revisiting this almost two years later, here's a "NaN-safe" comparison function:

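function compare(a,b) {
    // Equal values match; otherwise two values that are both NaN also match.
    // Note: the global isNaN coerces its arguments to numbers before testing.
    return a == b || (isNaN(a) && isNaN(b));
}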
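Because the global `isNaN` coerces its arguments, the function above also reports two unrelated non-numeric strings as equal. A possible stricter variant, offered here as a sketch rather than as part of the original answer (`compareStrict` is a hypothetical name), uses `Number.isNaN`, which does not coerce:

```javascript
// Variant of compare() that only treats genuine NaN values as equal.
// Number.isNaN does not coerce its argument, unlike the global isNaN.
function compareStrict(a, b) {
  return a === b || (Number.isNaN(a) && Number.isNaN(b));
}

console.log(compareStrict(NaN, NaN));     // true
console.log(compareStrict(1, 2));         // false
console.log(compareStrict("bob", "bar")); // false
console.log(Object.is(NaN, NaN));         // true
```

`Object.is` is a built-in alternative that treats NaN as equal to itself, but it also distinguishes +0 from -0, so it is not a drop-in replacement for `===`.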
Stenotypy answered 5/4, 2012 at 18:45 Comment(32)
What is the situation where assuming them to be equal (in a real application, not in mathematics) would be a problem?Condottiere
In other words, NaN just means undefined. By definition, you can’t reasonably assert that some undefined value is equal to another one.Mulch
Well, if you were looking for an intersection between the log function and the acos function, then all negative values past -1 would be considered an intersection. Interestingly, Infinity == Infinity is true, despite the fact that the same can't be said in actual mathematics.Stenotypy
Given that Inf == Inf, and given that one might just as easily argue that an object should be equal to itself, I suspect there was some other, very specific and very strong, rationale behind the IEEE choice...Condottiere
The reason inf==inf is that floating point arithmetic is not exact; it's subject to rounding (by default, round-to-nearest). Similarly, 1.0==1.0 might evaluate true in floating point even when the expressions that generated the 1.0's were not exactly equal, because the equality holds after rounding.Arterialize
1 + 3 = 4 and 2 + 2 = 4 . Does that mean that 1 + 3 = 2 + 2 ? Clearly yes. Hence your answer does not make perfect sense.Goosefish
I think it would make more sense if you look at an invertible function (such as log, sqrt, or parseInt) that is called with two different arguments - you expect f(a)=f(b) <=> a=b. Now -3!=2 <=> acos(-3)!=acos(2) => NaN!=NaNShove
But log(-1) != log(-1) does not make sense. So neither NaN equals NaN nor NaN does not equal NaN makes sense in all cases. Arguably, it'd make more sense if NaN == NaN evaluated to something representing unknown, but then == wouldn't return a boolean.Sickler
@Bergi: I certainly wouldn't expect that f(a)=f(b) <=> a=b this would only hold if f was an injective function. But f could be constant. What one can expect is that a=b => f(a)=f(b), but then your argument does not apply.Pili
Even if one thinks NaN==NaN should return false, why should that imply that NaN!=NaN should return true? Given that NaN < NaN and NaN >= NaN both return false, even though they're "opposite" conditions, why could not NaN == NaN and NaN != NaN both return true?Catboat
@Catboat Because maybe they are equal. It all depends on how the NaN was obtained.Stenotypy
@NiettheDarkAbsol: I meant "both return false". If the fact that the things might not be equal is sufficient reason for == to return false, would not the fact that they might be equal be adequate reason for != to also return false? The idea of equivalence relations is a very useful one, but I know of no nice way using only IEEE operations to test whether x and y are equivalent except (!(x < y) && !(x > y)); being able to test !(x == y) would seem much cleaner.Catboat
1/x == 2/x for large enough x, but 1 != 2. Floats aren't reals, so something has to give somewhere (actually an infinity of somewheres). x != x where x is a NaN is a truly sucky place to give, when isnan() can be implemented in other ways -- and it makes it extremely difficult to design template libraries over numerics that function according to spec.Karleen
"why could not NaN == NaN and NaN != NaN both return true?" -- that would be even worse than x = NaN; x != x because it would ban any language in which (a != b) is defined to be equivalent to !(a == b).Karleen
@JimBalter: I meant "both return false". As it is, the != operator is redundant with ==, while the other "opposite" forms allow a single test to specify the desired behavior when comparing NaN with itself [e.g. !(a > b) is equivalent to a <= b except that the former returns true for NaN and the latter false]. If NaN != NaN were false, then code which wanted to test equivalence could test !(a != b) rather than having to use !(a < b) && !(b < a).Catboat
@Catboat Same answer.Karleen
@JimBalter: What languages would compel a==b to be equivalent to !(a!=b) but not also compel a>b to be equivalent to !(a<=b)? Also, can you suggest any usage cases where NaN!=NaN is helpful, which are anywhere near as common as checking whether a value has been stored in a set? If a language can't support two different meanings for equality, specifying that code which wants to test if two things are equal non-NaNs should use (x<=y)||(x>=y) would seem less annoying than saying that what would otherwise be type-agnostic equivalence-testing code must special-case floats.Catboat
@Catboat You are neither going to change the IEEE spec nor change languages in which (a != b) is defined as !(a == b), so this discussion is moot and I'm no longer interested in it.Karleen
Your NaN-safe comparison function returns true if you supply two different numbers which aren't equal to each other. Something like return a == b || (isNaN(a) && isNaN(b)) should work?Sattler
Clearly, NaN==NaN should return NaN.Thibodeau
@Sattler Thank you, kind of embarrassing now I think about it XDStenotypy
The log(-1) and acos(2) examples make it clear why NaN shouldn't be equal to itself. I think it's better to have two separate idioms when testing for equality of two NaNs. One, is symbolic equality (or equivalence) and the other is absolute exactness (or equality or identity). In my opinion, NaN is equivalent to NaN but not equal to NaN. log(-1) and acos(2) can be considered same in meaning but not exact in value.Softspoken
I would love to see an idiom like this in programming languages: NaN == NaN; true, NaN != NaN; false, NaN === NaN; false, NaN !== NaN; false. As supercat says I don't get why PHP, JS etc have designed it like NaN == NaN; false and NaN != NaN; true. The same question raise for Nulls as well. Some platforms consider them equal, some not. If only we could have one opinion on this.Softspoken
@Softspoken While that would be cool, ideally you should be checking for NaNs and Nulls before attempting to do stuff with them :3Stenotypy
@NiettheDarkAbsol or as Bitsios says any operation on NaN should return NaN, clearly which indicates the result is not defined as well :)Softspoken
@Goosefish Firstly, there is a small difference between 2 things being equal & 2 things being considered equal. In programming when you write var x = some_func(y) we are considering x to be equal to RHS. On the other hand 1 + 3 is mathematically equal to 4. It's a side-effect that the addition function or operator returns a value that is also equal. 1 + 3 is equal to 4. log(-1) results in NaN. To make it amply clear, even though log(-1) results in NaN, it should be noted log(-1) == NaN fails in many languages. Whether this should fail is the question.Softspoken
@Goosefish Secondly, to talk of your specific eg, 1 + 3 and 2 + 2 result in 4, and 4 == 4. That's why 1 + 3 = 2 + 2. log(-1) and acos(2) result in NaN but is the value NaN equal to NaN? The == in 1st case operates on two numbers. Their behavior is well defined. NaN is not a number and the debate is on what should be the behavior. Just because some operator operates on two operands of one kind in one way doesn't mean it should operate on operands of any kind the same way. To assert NaN == NaN because 4 == 4 leads to a circular argument.Softspoken
Of course it is confusing since most things we know have followed reflexivity but it is important to know the distinction.Softspoken
Clearly, the solution is to have a tri-state boolean system, like some flavors of SQL have, where the result of any comparison on NaN is IdK (I don't Know).Emaemaciate
@Emaemaciate True/False/IdK would cover more cases, definitely.Stenotypy
Without reading all the comments - this solution fails when provided input which is not a number in any case (compare('bob', 'bar') equals true) - this is happening because neither equal but both are NaN (isNaN returns true for all things which are not a number, not solely NaN itself).Faustino
This answer does make the most sense to me, although I don't really like the compare function you provided. It can probably do more harm than good, in most cases, and should only be used when you know what you're doing. In other words, only when you'd be able to come up with such a simple function in the first place. I don't really like your addition of this code to the answer, to be honest.Nonary
C
40

My original answer (from 4 years ago) criticizes the decision from the modern-day perspective without understanding the context in which the decision was made. As such, it doesn't answer the question.

The correct answer is given here:

NaN != NaN originated out of two pragmatic considerations:

[...] There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years

There was one disadvantage to that approach: it made NaN less useful in many situations unrelated to numerical computation. For example, much later when people wanted to use NaN to represent missing values and put them in hash-based containers, they couldn't do it.

If the committee foresaw future use cases, and considered them important enough, they could have gone for the more verbose !(x<x & x>x) instead of x!=x as a test for NaN. However, their focus was more pragmatic and narrow: providing the best solution for a numeric computation, and as such they saw no issue with their approach.

===

Original answer:

I am sorry, much as I appreciate the thought that went into the top-voted answer, I disagree with it. NaN does not mean "undefined" - see http://www.cs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF, page 7 (search for the word "undefined"). As that document confirms, NaN is a well-defined concept.

Furthermore, IEEE approach was to follow the regular mathematics rules as much as possible, and when they couldn't, follow the rule of "least surprise" - see https://mcmap.net/q/24488/-what-is-the-rationale-for-all-comparisons-returning-false-for-ieee754-nan-values. Any mathematical object is equal to itself, so the rules of mathematics would imply that NaN == NaN should be True. I cannot see any valid and powerful reason to deviate from such a major mathematical principle (not to mention the less important rules of trichotomy of comparison, etc.).

As a result, my conclusion is as follows.

IEEE committee members did not think this through very clearly, and made a mistake. Since very few people understood the IEEE committee approach, or cared about what exactly the standard says about NaN (to wit: most compilers' treatment of NaN violates the IEEE standard anyway), nobody raised an alarm. Hence, this mistake is now embedded in the standard. It is unlikely to be fixed, since such a fix would break a lot of existing code.

Edit: Here is one post from a very informative discussion. Note: to get an unbiased view you have to read the entire thread, as Guido takes a different view to that of some other core developers. However, Guido is not personally interested in this topic, and largely follows Tim Peters recommendation. If anyone has Tim Peters' arguments in favor of NaN != NaN, please add them in comments; they have a good chance to change my opinion.

Condottiere answered 8/4, 2012 at 1:50 Comment(15)
IMHO, having NaN violate trichotomy makes sense, but like you I see no reasonable semantic justification for not having == define an equivalence relation when its operands are both of the same type (going a little further, I think languages should explicitly disallow comparisons between things of different types--even when implicit conversions exist--if such comparisons cannot implement an equivalence relation). The concept of an equivalence relations is so fundamental in both programming and mathematics, it seems crazy to violate it.Catboat
You might read on; Kahan says elsewhere in that document "NaNs must conform to mathematically consistent rules that were deduced, not invented arbitrarily[.]" I will agree that he doesn't mention how NaN != NaN is deduced beyond saying it's needed to distinguish NaN from non-NaNs absent library support like isnan().Gayegayel
Note that even if we consider NaN to represent an unknown value (and hence each NaN may be unequal to another), we cannot conclude that two NaN values necessarily are unequal to one another. In short: it might make sense for NaN == NaN to itself return NaN (or some other representation for uncomputable in your language - undefined, a raised exception, etc.), but it's definitely weird to simply return false.Contagious
@EamonNerbonne: Having NaN==NaN return something other than true or false would have been problematic, but given that (a<b) does not necessarily equal !(a>=b), I see no reason that (a==b) must necessarily equal !(a!=b). Having NaN==NaN and Nan!=NaN both return false would allow code which needs either definition of equality to use the one it needs.Catboat
This answer is WRONG WRONG WRONG! See my answer below.Wingback
You generally don't run across NaNs. The only way you can compute one is to ask for something that isn't mathematically defined. Once you've decided to have an error value, the question is "what semantics are least likely to cause trouble?" I believe that always returning false is the safest thing to do. So you get mis-sorted containers; how harmful is that? Especially compared to the alternatives.Lycian
I am not aware of any axiom or postulate that states a mathematical object (how do you even define a mathematical object????) has to equal itself.Slaughterhouse
Even if you base this on the identity function f on a set S where f(x) = x, I would argue that NaN is not part of the set of numbers; after all, it's literally not a number. So I don't see any argument from the identity function that NaN should equal itself.Slaughterhouse
With regards to your comment on trichotomy, trichotomy is an order relation on a set of numbers. Again I would argue that NaN is not in the set of numbers. I don't see anything in the IEEE specification that says it is, although I might have missed it.Slaughterhouse
@Slaughterhouse Look up the "Law of Identity". The concept of equality (and its brothers consistency and completeness) is way too complex to be handled in a comment, but I will say that if you assume x=x can be false, you have opened up a giant can of worms that you have to deal with before we can even consider the identity function, since the "=" in f(x)=x is then not well defined.Necrology
@chuu look up DialetheismSlaughterhouse
This is wrong. Please see the top voted question below by russbishopAdventure
@Adventure what exactly is wrong? My original answer? My updated answer? The comment just above yours? Or any other particular statement?Condottiere
I would not even consider the possibility that "IEEE committee members did not think this through very clearly, and made a mistake" without either investigating the committee members enough to assure myself that they are not actually experts, or finding an expert (ideally a person on the IEEE committee or a compiler writer) making the same complaint. IEEE is both important and a burden to implement, so discussions of it are not uncommon. You at least found a discussion involving such an expert. However, I find that your reasoning was wrong, not merely your conclusion.Plumbo
"There was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan( ) which could take many years" I don't understand this rationale. Couldn't one detect NaN values by comparing them for equality with a NaN value, provided as a constant in the language?Piker
C
13

A nice property is: if x == x returns false, then x is NaN.

(one can use this property to check if x is NaN or not.)
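
A minimal sketch of that self-comparison test, written in JavaScript because another answer on this page uses it. The helper name `isNaNBySelfCompare` is made up for illustration; `Number.isNaN` is the standard built-in it is checked against.

```javascript
// Hypothetical helper: relies only on the IEEE 754 rule that NaN is the
// one floating-point value that does not compare equal to itself.
function isNaNBySelfCompare(x) {
  return x !== x;
}

const samples = [0 / 0, Math.sqrt(-1), 1, -0, Infinity, 0.1 + 0.2];

for (const v of samples) {
  // For numbers, the self-comparison and the built-in should agree.
  console.log(v, isNaNBySelfCompare(v), Number.isNaN(v));
}
```

The trick needs no library support at all, which is the historical motivation quoted in the accepted answer.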

Cornia answered 5/4, 2012 at 18:47 Comment(2)
One could have that property and still have (NaN != NaN) also return false. Had the IEEE done that, code which wanted to test an equivalence relation between a and b could have used !(a != b).Catboat
That's a great substitute for np.isnan() and pd.isnull() ! !Strikebound
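
For context on the !(a != b) comment above: under the rules as they actually stand, != is the exact negation of ==, whereas the ordered comparisons are not negations of one another once NaN is involved. A short JavaScript sketch (the variable names are illustrative only):

```javascript
const a = NaN;
const b = 1;

// Every ordered comparison involving NaN is false...
const less = a < b;                  // false
const greaterOrEqual = a >= b;       // false
// ...so negating one ordered comparison does not give its "opposite".
const notGreaterOrEqual = !(a >= b); // true, unlike (a < b)

// Equality behaves differently: != is always the negation of ==.
const equalToSelf = a == a;          // false
const notEqualToSelf = a != a;       // true
const negatedNotEqual = !(a != a);   // false, same as (a == a)

console.log({
  less, greaterOrEqual, notGreaterOrEqual,
  equalToSelf, notEqualToSelf, negatedNotEqual,
});
```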
B
9

Try this:

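var a = 'asdf';
var b = null;

// parseInt returns NaN when the input does not start with digits;
// null is first converted to the string 'null'.
var intA = parseInt(a, 10);
var intB = parseInt(b, 10);

console.log(intA); // logs NaN
console.log(intB); // logs NaN
console.log(intA == intB); // logs false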

If intA == intB were true, that might lead you to conclude that a==b, which it clearly isn't.

Another way to look at it is that NaN just gives you information about what something ISN'T, not what it is. For example, if I say 'an apple is not a gorilla' and 'an orange is not a gorilla', would you conclude that 'an apple'=='an orange'?

Bucksaw answered 2/4, 2013 at 12:43 Comment(6)
"that might lead you to conclude that a==b" -- But that would simply be an invalid conclusion -- strtol("010") == strtol("8"), for instance.Karleen
I don't follow your logic. Given a=16777216f, b=0.25, and c=0.125, should the fact that a+b == a+c be taken to imply that b==c? Or merely that the two calculations yield indistinguishable results? Why should not sqrt(-1) and (0.0/0.0) be considered indistinguishable, absent a means of distinguishing them?Catboat
If you are implying that indistinguishable things should be considered equal, I don't agree with that. Equality implies that you DO have a means of distinguishing two subjects of comparison, not just an identical lack of knowledge about them. If you have no means of distinguishing them, then they may be equal or they may not be. I could see NaN==NaN returning 'undefined', but not true.Bucksaw
@MikeC pretty much nailed the reason without too much grammarGrum
So many answers, and I could only understand what you explained, kudos!!Benempt
Hmm this is interesting. It seems like a tradeoff: ruin the reflexivity of == in order to preserve transitivity.Ferriter
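
The single-precision example from the comments above (a=16777216f, b=0.25, c=0.125) can be reproduced in JavaScript with `Math.fround`, which rounds a double to the nearest float32 value. This is only a sketch: the sums are computed in double precision and then rounded once, which for these particular values matches float32 addition because the double sums are exact.

```javascript
const a = Math.fround(16777216); // 2^24, exactly representable in float32
const b = Math.fround(0.25);
const c = Math.fround(0.125);

// Above 2^24 adjacent float32 values are 2 apart, so both small addends are lost.
const sumAB = Math.fround(a + b);
const sumAC = Math.fround(a + c);

console.log(sumAB === sumAC); // true: the two sums are indistinguishable
console.log(b === c);         // false: the inputs are still different
```

So equal results do not imply equal inputs even without NaN, which is the objection the comment raises.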
F
2

Actually, there is a concept in mathematics known as “unity” values. These values are extensions that are carefully constructed to reconcile outlying problems in a system. For example, you can think of the ring at infinity in the complex plane as being a point or a set of points, and some formerly pretentious problems go away. There are other examples of this with respect to cardinalities of sets, where you can demonstrate that you can pick the structure of the continuum of infinities so long as |P(A)| > |A| and nothing breaks.

DISCLAIMER: I am only working from my vague memory of some interesting caveats from my math studies. I apologize if I did a woeful job of representing the concepts I alluded to above.

If you want to believe that NaN is a solitary value, then you are probably going to be unhappy with some of the results like the equality operator not working the way you expect/want. However, if you choose to believe that NaN is more of a continuum of “badness” represented by a solitary placeholder, then you are perfectly happy with the behavior of the equality operator. In other words, you lose sight of the fish you caught in the sea but you catch another that looks the same but is just as smelly.

Faucal answered 19/3, 2013 at 18:54 Comment(1)
Yes, in math you can add infinity and similar values. However, they will never break the equivalence relationship. Programmers' equality represents an equivalence relation in math, which is by definition reflexive. A bad programmer can define an == that is not reflexive, symmetric and transitive; it's unfortunate that Python won't stop him. But when Python itself makes == non-reflexive, and you can't even override it, this is a complete disaster from both a practical viewpoint (container membership) and an elegance/mental clarity viewpoint.Condottiere
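
On the container-membership point, here is how JavaScript's built-in collections treat NaN. This sketch shows JavaScript behaviour only and says nothing about Python: `indexOf` uses strict equality, while `includes`, `Set` and `Object.is` use comparisons that treat NaN as equal to itself.

```javascript
const values = [1, NaN, 3];

// Strict equality (===) never matches NaN, so indexOf cannot find it.
const positionByIndexOf = values.indexOf(NaN);    // -1

// includes uses SameValueZero, under which NaN matches NaN.
const foundByIncludes = values.includes(NaN);     // true

// Set also uses SameValueZero, so repeated NaNs collapse into one entry.
const distinctNaNs = new Set([NaN, NaN, 0 / 0]).size; // 1

// Object.is offers a reflexive comparison for code that needs one.
const sameValue = Object.is(NaN, NaN);            // true

console.log({ positionByIndexOf, foundByIncludes, distinctNaNs, sameValue });
```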
