The IEEE 754 standard defines arithmetic formats, operations, rounding rules, exceptions etc. for floating point computation. The Delphi compiler implements floating point arithmetic on top of the available hardware units. For the 32 bit Windows compiler this is the x87 unit, and for the 64 bit Windows compiler this is the SSE unit. Both of these hardware units conform to the IEEE 754 standard.
The difference that you are observing arises at the language implementation level. Let us look at the two versions in more detail.
32 bit Windows compiler
The comparison statement is compiled to this:
TestNaNs.dpr.19: if nanDouble <> zeroDouble then
0041C4C8 DD05C03E4200 fld qword ptr [$00423ec0]
0041C4CE DC1DC83E4200 fcomp qword ptr [$00423ec8]
0041C4D4 9B wait
0041C4D5 DFE0 fstsw ax
0041C4D7 9E sahf
0041C4D8 7419 jz $0041c4f3
The Intel software developers manual says that an unordered comparison is indicated by the flags C3, C2 and C0 being set to 1. The full table is here:
Condition         C3  C2  C0
ST(0) > Source    0   0   0
ST(0) < Source    0   0   1
ST(0) = Source    1   0   0
Unordered         1   1   1
When you inspect the FPU under the debugger, you can see that this is the case. The flags are then tested by this code:
0041C4D5 DFE0 fstsw ax
0041C4D7 9E sahf
0041C4D8 7419 jz $0041c4f3
This transfers various bits of the FPU status register into the CPU flags; see the manual for precise details of which flags go where. The branch is then taken if ZF is set. The value of ZF comes from the C3 FPU flag, which, reading from the table above, is set for the unordered case.
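As a rough illustration of the flag transfer described above, here is a small Python model of `fstsw ax` followed by `sahf`. The bit positions (C0, C2 and C3 at bits 8, 10 and 14 of the status word; CF, PF and ZF taken from bits 0, 2 and 6 of AH) are as I understand them from the Intel manual. The function name `fstsw_sahf` is made up for this sketch, and it is a model of the instructions, not anything the compiler emits.

```python
# Bit positions of the x87 condition codes in the FPU status word.
C0_BIT, C2_BIT, C3_BIT = 8, 10, 14


def fstsw_sahf(c3, c2, c0):
    """Model `fstsw ax` then `sahf`; return the resulting (ZF, PF, CF)."""
    status_word = (c3 << C3_BIT) | (c2 << C2_BIT) | (c0 << C0_BIT)
    ah = (status_word >> 8) & 0xFF  # fstsw ax: AH receives status bits 8..15
    zf = (ah >> 6) & 1              # sahf: ZF <- AH bit 6, which held C3
    pf = (ah >> 2) & 1              # sahf: PF <- AH bit 2, which held C2
    cf = ah & 1                     # sahf: CF <- AH bit 0, which held C0
    return zf, pf, cf


# Unordered row of the table (C3, C2, C0 all set): ZF ends up set.
print(fstsw_sahf(1, 1, 1))  # (1, 1, 1)
# Equal row of the table: ZF is set as well, so jz cannot tell the two apart.
print(fstsw_sahf(1, 0, 0))  # (1, 0, 0)
```

Both the equal row and the unordered row leave ZF set, which is why a test of ZF alone cannot distinguish them.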
In fact, the entire branching code can be expressed in pseudo code as:
jump if C3 = 1
So, looking at the table above, C3 is set both when the operands are equal and when the comparison is unordered. It is therefore clear that if one of the operands is a NaN then any floating point equality comparison evaluates as equal.
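To make the consequence concrete, the following Python sketch models the 32 bit code path: it computes the C3, C2 and C0 flags according to the table above and then applies the "jump if C3 = 1" rule to decide whether the body of the `if` statement runs. The helper names `fcomp_flags` and `not_equal_win32` are hypothetical, chosen only for this illustration.

```python
import math


def fcomp_flags(st0, source):
    """Return (C3, C2, C0) for an x87 compare, following the table above."""
    if math.isnan(st0) or math.isnan(source):
        return (1, 1, 1)  # unordered
    if st0 > source:
        return (0, 0, 0)
    if st0 < source:
        return (0, 0, 1)
    return (1, 0, 0)      # equal


def not_equal_win32(a, b):
    """Model `if a <> b` as compiled by the 32 bit compiler."""
    c3, _c2, _c0 = fcomp_flags(a, b)
    # jz skips the body of the if statement whenever ZF (taken from C3) is 1,
    # so the body only runs when C3 is 0.
    return c3 == 0


nan = float("nan")
print(not_equal_win32(nan, 0.0))  # False: the NaN is treated as equal to zero
print(not_equal_win32(1.0, 0.0))  # True
print(not_equal_win32(0.0, 0.0))  # False
```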
64 bit Windows compiler
The comparison statement is compiled to this:
TestNaNs.dpr.19: if nanDouble <> zeroDouble then
0000000000428EB8 F20F100548E50000 movsd xmm0,qword ptr [rel $0000e548]
0000000000428EC0 660F2E0548E50000 ucomisd xmm0,qword ptr [rel $0000e548]
0000000000428EC8 7A02 jp TestNaNs + $5C
0000000000428ECA 7420 jz TestNaNs + $7C
The comparison is performed by the ucomisd instruction. The manual gives this pseudo code:
RESULT ← UnorderedCompare(SRC1[63:0] <> SRC2[63:0]) {
(* Set EFLAGS *)
CASE (RESULT) OF
GREATER_THAN: ZF, PF, CF ← 000;
LESS_THAN: ZF, PF, CF ← 001;
EQUAL: ZF, PF, CF ← 100;
UNORDERED: ZF, PF, CF ← 111;
ESAC;
OF, AF, SF ← 0; }
Notice that in this instruction, the ZF, PF and CF flags are exactly analogous to the C3, C2 and C0 flags on the x87 unit.
The branching is handled by this code:
0000000000428EC8 7A02 jp TestNaNs + $5C
0000000000428ECA 7420 jz TestNaNs + $7C
Notice that there is first a test of the parity flag PF (the jp instruction), and then the zero flag ZF (the jz instruction). The compiler has therefore emitted code to handle the unordered case (i.e. one of the operands is NaN). This is handled first with the jp. Once that is handled, the compiler then checks the zero flag ZF which (because NaNs have been dealt with) is set if and only if the two operands are equal.
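The same kind of Python model can be written for the 64 bit code path. It sets ZF, PF and CF as in the manual's pseudo code and then applies the two branches in the order the compiler emitted them. One assumption is read off the listing rather than stated in it: the jp is encoded as 7A02, a jump of two bytes, so it lands just past the two byte jz, which I take to be the start of the body of the `if` statement. The helper names `ucomisd_flags` and `not_equal_win64` are hypothetical.

```python
import math


def ucomisd_flags(src1, src2):
    """Return (ZF, PF, CF) following the manual's pseudo code for ucomisd."""
    if math.isnan(src1) or math.isnan(src2):
        return (1, 1, 1)  # UNORDERED
    if src1 > src2:
        return (0, 0, 0)  # GREATER_THAN
    if src1 < src2:
        return (0, 0, 1)  # LESS_THAN
    return (1, 0, 0)      # EQUAL


def not_equal_win64(a, b):
    """Model `if a <> b` as compiled by the 64 bit compiler."""
    zf, pf, _cf = ucomisd_flags(a, b)
    if pf == 1:
        return True   # jp: unordered, jump straight into the body of the if
    if zf == 1:
        return False  # jz: operands are equal, skip the body
    return True       # fall through into the body


nan = float("nan")
print(not_equal_win64(nan, 0.0))  # True: a NaN is not equal to zero
print(not_equal_win64(0.0, 0.0))  # False
```

Because PF is tested before ZF, the unordered case never reaches the jz, which is what distinguishes this code from the 32 bit version.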
Conclusion
The different behaviour is down to the two compilers making different choices in how to implement the comparison operators. In both situations the hardware is IEEE 754 compliant, and perfectly capable of comparing NaNs as specified by the standard.
My best guess would be that the decisions for the 32 bit compiler were taken a very long time ago. Some of these decisions are questionable. In my view an equality comparison with a NaN operand should evaluate as not equal irrespective of the other operand. The weight of history, felt through a desire to maintain backwards compatibility, means that these questionable decisions have never been addressed.
When the 64 bit compiler was created, more recently, the Embarcadero engineers decided to correct some of these mistakes. They presumably felt that the break to a new architecture allowed them the freedom to do so.
In an ideal world, the 32 bit compiler could be configured to behave the same way as the 64 bit compiler, by setting a compiler switch.