Midpoint 'rounding' when dealing with large numbers?

So I was trying to understand JavaScript's behavior when dealing with large numbers. Consider the following (tested in Firefox and Chrome):

console.log(9007199254740993) // 9007199254740992
console.log(9007199254740994) // 9007199254740994
console.log(9007199254740995) // 9007199254740996
console.log(9007199254740996) // 9007199254740996
console.log(9007199254740997) // 9007199254740996
console.log(9007199254740998) // 9007199254740998
console.log(9007199254740999) // 9007199254741000

Now, I'm aware of why it's outputting the 'wrong' numbers—it's trying to convert them to floating point representations and it's rounding off to the nearest possible floating point value—but I'm not entirely sure about why it picks these particular numbers. My guess is that it's trying to round to the nearest 'even' number, and since 9007199254740996 is divisible by 4 while 9007199254740994 is not, it considers 9007199254740996 to be more 'even'.

  • What algorithm is it using to determine the internal representation? My guess is that it's an extension of regular midpoint rounding (round to even is the default rounding mode in IEEE 754 functions).
  • Is this behavior specified as part of the ECMAScript standard, or is it implementation dependent?
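As a quick check on that "more even" intuition: above 2^53 adjacent doubles are 2 apart, so a value has an even significand exactly when it is divisible by 4. The snippet below just restates that observation, nothing engine-specific:

// Above 2^53 adjacent doubles differ by 2, so an even significand
// corresponds to divisibility by 4; the % operator is exact here.
console.log(9007199254740994 % 4) // 2 -> odd significand
console.log(9007199254740996 % 4) // 0 -> even significand; midpoints round here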
Richey answered 24/6, 2014 at 23:35 Comment(4)
@RobG: Evaluating console.log(9007199254740993) involves two conversions: one to convert the decimal literal in the JavaScript code to the internal binary floating-point format, and a second conversion from the internal binary format to a decimal string for console output when executing the log method. Each of those conversions needs quite a sophisticated algorithm (if it's done well). So it's not really true to say that there's no algorithm.Flaxman
Yes, though it's not a rounding algorithm.Furry
True, but rounding is a significant part of the algorithm: the decimal-to-binary conversion has to decide how to convert the exact decimal value to the nearest number that's representable in IEEE 754 binary64 format. Part of that decision involves choosing which way to round, and the normal convention is to use round-ties-to-even, as @PatriciaShanahan describes.Flaxman
Hmm. It turns out that it's more than just "the normal convention": the ECMA-262 standard specifies explicitly that the round-ties-to-even rounding method should be used (in section 8.5).Flaxman

As pointed out by Mark Dickinson in a comment on the question, the ECMA-262 ECMAScript Language Specification requires the use of IEEE 754 64-bit binary floating point to represent the Number Type. The relevant rounding rules are "Choose the member of this set that is closest in value to x. If two values of the set are equally close, then the one with an even significand is chosen...".

These rules are general, applying to rounding results of arithmetic as well as the values of literals.

The following are all the numbers in the relevant range for the question that are exactly representable in IEEE 754 64-bit binary floating point. Each is shown as its decimal value, and also as a hexadecimal representation of its bit pattern. A number with an even significand has an even rightmost hexadecimal digit in its bit pattern.

9007199254740992 bit pattern 0x4340000000000000
9007199254740994 bit pattern 0x4340000000000001
9007199254740996 bit pattern 0x4340000000000002
9007199254740998 bit pattern 0x4340000000000003
9007199254741000 bit pattern 0x4340000000000004

Each of the even inputs is one of these numbers, and rounds to that number. Each of the odd inputs is exactly half way between two of them, and rounds to the one with the even significand. This results in rounding the odd inputs to 9007199254740992, 9007199254740996, and 9007199254741000.
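If you want to check those bit patterns without leaving JavaScript, a DataView over an 8-byte buffer will do it. A minimal sketch (bitPattern is just a local helper name, not a built-in):

// Write a Number into an 8-byte buffer and read back its IEEE 754 bits.
function bitPattern(x) {
  const view = new DataView(new ArrayBuffer(8))
  view.setFloat64(0, x) // big-endian by default
  const hi = view.getUint32(0).toString(16).padStart(8, "0")
  const lo = view.getUint32(4).toString(16).padStart(8, "0")
  return "0x" + hi + lo
}

console.log(bitPattern(9007199254740993)) // 0x4340000000000000 (the literal rounded to 9007199254740992)
console.log(bitPattern(9007199254740995)) // 0x4340000000000002 (rounded to 9007199254740996)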

Inglebert answered 25/6, 2014 at 3:25 Comment(0)

Patricia Shanahan's answer helped a lot and explained my primary question. However, as to the second part of the question—whether or not this behavior is implementation dependent—it turns out that it is, but in a slightly different way than I originally thought. Quoting from ECMA-262 5.1 § 7.8.3:

… the rounded value must be the Number value for the MV (as specified in 8.5), unless the literal is a DecimalLiteral and the literal has more than 20 significant digits, in which case the Number value may be either the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit or the Number value for the MV of a literal produced by replacing each significant digit after the 20th with a 0 digit and then incrementing the literal at the 20th significant digit position.

In other words, when a literal has more than 20 significant digits, an implementation may either zero out everything after the 20th digit or round up at the 20th digit. Consider this:

console.log(9007199254740993.00001)

Both Chrome and Firefox will output 9007199254740994; Internet Explorer, however, will output 9007199254740992 because it ignores the digits after the 20th significant digit. On a closer reading, this is actually permitted: the literal has 21 significant digits, so under the clause quoted above an implementation may either zero out the 21st digit (giving 9007199254740993.0000, an exact tie that rounds to even, down to 9007199254740992) or round up at the 20th digit (giving 9007199254740993.0001, which lies above the midpoint and rounds up to 9007199254740994).
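To make the two permitted outcomes concrete, note that both replacement literals have at most 20 significant digits, so every engine must convert them exactly:

console.log(Number("9007199254740993.0000")) // 9007199254740992 (exact tie, rounds to even)
console.log(Number("9007199254740993.0001")) // 9007199254740994 (above the midpoint)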

Richey answered 26/6, 2014 at 18:28 Comment(0)

JavaScript represents numbers as 64-bit floating point values. This is defined in the standard.

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

So there's nothing related to midpoint rounding going on there.

As a hint, every 32-bit integer has an exact representation in double-precision floating-point format.
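That guarantee actually extends well past 32 bits: every integer with magnitude up to 2^53 is exactly representable, which is what Number.isSafeInteger (added in ES2015, after this answer was written) tests:

console.log(Number.isSafeInteger(4294967295)) // true (largest 32-bit unsigned integer)
console.log(Number.isSafeInteger(9007199254740991)) // true (2^53 - 1, Number.MAX_SAFE_INTEGER)
console.log(Number.isSafeInteger(9007199254740993)) // false (the literal already rounded to 2^53)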


Ok, since you're asking for the exact algorithm, I checked how Chrome's V8 engine does it. V8 defines a StringToDouble function, which calls InternalStringToDouble in the following file:

https://github.com/v8/v8/blob/master/src/conversions-inl.h#L415

And this, in turn, calls the Strtod function defined there:

https://github.com/v8/v8/blob/master/src/strtod.cc

Removal answered 24/6, 2014 at 23:38 Comment(3)
A 64-bit float is only guaranteed to preserve 15 significant decimal digits. 9007199254740995 has 16 digits and indeed has no exact representation.Removal
@LucasTrzesniewski Nice work on finding the source code. However even arriving at the numbers via arithmetic (instead of string conversion) gives the same rounding. E.g. 393447746243*22893 and 950026289921*9481 both return x=9007199254741000 even though the real answers are (x-1) and (x+1), respectively. I don't know what's happening either; just offering another piece to the puzzle.Weighty
@Weighty well like I said, 9007199254741000 is above the max precision of doubles, so you can't really expect the real values there. I just checked, and the two multiplications you provided give the same value in C# as in JS, so I suppose this is just how the CPU does the calculations in hardware.Removal
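For what it's worth, the arithmetic results in the comments above are round-ties-to-even at work again, this time applied by the hardware multiply to the exact product. BigInt (added to JavaScript well after this question) makes the exact products visible:

// Exact integer products:
console.log(393447746243n * 22893n) // 9007199254740999n
console.log(950026289921n * 9481n) // 9007199254741001n

// The double multiply rounds each exact product to the nearest representable
// value. Both 9007199254740999 and 9007199254741001 are exact midpoints, and
// both ties go to the neighbor with the even significand:
console.log(393447746243 * 22893) // 9007199254741000
console.log(950026289921 * 9481) // 9007199254741000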
