Am I right that any arithmetic operation on any floating-point numbers is unambiguously defined by the IEEE floating-point standard? If yes, just out of curiosity, what is (+0)+(-0)? And is there a way to check such things in practice, in C++ or another commonly used programming language?
The IEEE 754 rules of arithmetic for signed zeros state that +0.0 + -0.0 depends on the rounding mode. In the default rounding mode, it will be +0.0. When rounding towards -∞, it will be -0.0.
You can check this in C++ like so:
#include <iostream>

int main() {
    std::cout << "+0.0 + +0.0 == " << +0.0 + +0.0 << std::endl;
    std::cout << "+0.0 + -0.0 == " << +0.0 + -0.0 << std::endl;
    std::cout << "-0.0 + +0.0 == " << -0.0 + +0.0 << std::endl;
    std::cout << "-0.0 + -0.0 == " << -0.0 + -0.0 << std::endl;
    return 0;
}
Output:
+0.0 + +0.0 == 0
+0.0 + -0.0 == 0
-0.0 + +0.0 == 0
-0.0 + -0.0 == -0
... fesetround() function in fenv.h. Offhand I don't know whether this is available in C++ as well. – Niela
... +0.0 + -0.0 (cf. GCC bug 34678). – North
fesetround is available from <cfenv> in C++ (although it seems likely that compilers which ignore it in C will also ignore it in C++). – Lura
My answer deals with IEEE 754:2008, which is the current version of the standard.
In the IEEE 754:2008 standard:
Section 4.3 deals with the rounding of values when performing arithmetic operations in order to fit the bits into the mantissa.
4.3 Rounding-direction attributes
Rounding takes a number regarded as infinitely precise and, if necessary, modifies it to fit in the destination’s format while signaling the inexact exception, underflow, or overflow when appropriate (see 7). Except where stated otherwise, every operation shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that result according to one of the attributes in this clause.
The rounding-direction attribute affects all computational operations that might be inexact. Inexact numeric floating-point results always have the same sign as the unrounded result.
The rounding-direction attribute affects the signs of exact zero sums (see 6.3), and also affects the thresholds beyond which overflow and underflow are signaled.
Section 6.3 prescribes the value of the sign bit when performing arithmetic with special values (NaN, infinities, +0, -0).
6.3 The sign bit
When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.
(emphasis mine)
In other words, (+0) + (-0) = +0 except when the rounding mode is roundTowardNegative, in which case it is (+0) + (-0) = -0.
In the context of C#:
According to §7.7.4 of the C# Language Specification (emphasis mine):
- Floating-point addition:
float operator +(float x, float y);
double operator +(double x, double y);
The sum is computed according to the rules of IEEE 754 arithmetic. The following table lists the results of all possible combinations of nonzero finite values, zeros, infinities, and NaN's. In the table, x and y are nonzero finite values, and z is the result of x + y. If x and y have the same magnitude but opposite signs, z is positive zero. If x + y is too large to represent in the destination type, z is an infinity with the same sign as x + y.
 +    •  x     +0    -0    +∞    -∞    NaN
••••••••••••••••••••••••••••••••••••••••••••
 y    •  z     y     y     +∞    -∞    NaN
 +0   •  x     +0    +0    +∞    -∞    NaN
 -0   •  x     +0    -0    +∞    -∞    NaN
 +∞   •  +∞    +∞    +∞    +∞    NaN   NaN
 -∞   •  -∞    -∞    -∞    NaN   -∞    NaN
 NaN  •  NaN   NaN   NaN   NaN   NaN   NaN
(+0) + (-0) in C#:
In other words, based on the specification, the addition of two zeros only results in negative zero if both are negative zero. Therefore, the answer to the original question
What is (+0)+(-0) by IEEE floating point standard?
is +0.
Rounding modes in C#:
In case anyone is interested in changing the rounding mode in C#, in "Is there an C# equivalent of c++ fesetround() function?", Hans Passant states:
Never tinker with the FPU control word in C#. It is the worst possible global variable you can imagine. With the standard misery that globals cause, your changes cannot last and will arbitrarily disappear. The internal exception handling code in the CLR resets it when it processes an exception.
Assume standard rounding mode (which you are using if you don't know what a rounding mode is and how to change it).
If the exact result is non-zero but so small that it gets rounded to zero, the result is +0 if the exact result is greater than 0, and -0 if the exact result is less than 0. This situation only happens for multiplication and division, not for addition and subtraction.
There are several cases where the exact result is zero. The result is -0 in the following cases:
- Adding (-0) + (-0).
- Subtracting (-0) - (+0).
- Multiplying where one factor is a zero and the other factor has the opposite sign, including (+0) * (-0).
- Dividing a zero by a non-zero number of the opposite sign, including an infinity.
In all other cases, the result is +0.
An unfortunate side effect of this rule is that x + 0.0 is not always identical to x: when x is -0, the sum is +0. On the other hand, x - 0.0 is always identical to x. Also, x * 0.0 may be +0 or -0, depending on the sign of x. This prevents some optimisations by compilers that support IEEE 754 precisely, or makes them more difficult.
The answer, by the IEEE floating point standard, is +0.
... roundTowardNegative, the result (according to section 6.3 of IEEE 754) should be -0.0. – Sprinkling
... double unambiguously defined in C++. For example, a calculation may or may not be internally done in a higher precision than double depending on availability of registers etc. – Lura