Double precision - decimal places

From what I have read, a value of data type double has an approximate precision of 15 decimal places. However, when I use a number whose decimal representation repeats, such as 1.0/7.0, I find that the variable holds the value of 0.14285714285714285 - which is 17 places (via the debugger).

I would like to know why it is represented with 17 places internally, and why the precision is always described as ~15 rather than 17.

Rondelet answered 3/4, 2012 at 18:31 Comment(3)
What's relevant here is that 1/7 is a rational number, but one whose denominator is not a power of 2.Potentilla
@johnnyturbo3 A rational number is one that is the ratio of two integers. 1.0/7.0 is the ratio of the integers one and seven, so clearly rational. All rational numbers are repeating decimals, in this case: .142857 142857 142857 etc. Irrational numbers are not the ratio of two integers. Maybe update your question to fix that?Ruffin
The ~15 notation is there because you can't really convert between decimal and binary digits. For instance, the decimal 0.1 has 1 decimal digit, but it is periodic in binary, which means you'd need infinite binary digits to represent it exactly. In general, any proper fraction whose denominator can be divided by 5 is periodic in binary. (the same applies to 3 and 7, but then it's periodic in decimal as well)Regressive

An IEEE double has 53 significant bits (that's the value of DBL_MANT_DIG in <cfloat>). That's approximately 15.95 decimal digits (log10(2^53)); the implementation sets DBL_DIG to 15, not 16, because it has to round down. So you have nearly an extra decimal digit of precision (beyond what's implied by DBL_DIG==15) because of that.

The nextafter() function computes the next representable number after a given number, in the direction of its second argument; it can be used to show just how precise a given number is.

This program:

#include <cstdio>
#include <cfloat>
#include <cmath>

int main() {
    double x = 1.0/7.0;
    printf("FLT_RADIX = %d\n", FLT_RADIX);
    printf("DBL_DIG = %d\n", DBL_DIG);
    printf("DBL_MANT_DIG = %d\n", DBL_MANT_DIG);
    printf("%.17g\n%.17g\n%.17g\n", nextafter(x, 0.0), x, nextafter(x, 1.0));
}

gives me this output on my system:

FLT_RADIX = 2
DBL_DIG = 15
DBL_MANT_DIG = 53
0.14285714285714282
0.14285714285714285
0.14285714285714288

(You can replace %.17g by, say, %.64g to see more digits, none of which are significant.)

As you can see, the last displayed decimal digit changes by 3 with each consecutive value. The fact that the last displayed digit of 1.0/7.0 (5) happens to match the mathematical value is largely coincidental; it was a lucky guess. And the correctly rounded digit would be 6, not 5, since the mathematical expansion continues 0.142857142857142857... Replacing 1.0/7.0 by 1.0/3.0 gives this output:

FLT_RADIX = 2
DBL_DIG = 15
DBL_MANT_DIG = 53
0.33333333333333326
0.33333333333333331
0.33333333333333337

which shows about 16 decimal digits of precision, as you'd expect.

Potentilla answered 3/4, 2012 at 18:50 Comment(2)
This answer doesn't clearly distinguish between the number of decimal digits it takes to uniquely identify a double (17 digits) and the number of decimal digits that a double can uniquely identify (15 digits). This is crucial and is covered better in other answers. I cover this in more detail here: randomascii.wordpress.com/2012/03/08/…Ruffin
Where "uniquely identify" means: a double can store said limit of digits either before or after the decimal point, not at both sides at the same time (else much precession is lost). see: https://mcmap.net/q/331069/-double-and-accuracyEnlargement

It is actually 53 binary places, which translates to 15 stable decimal places, meaning that if you start out with a number with 15 decimal places, convert it to a double, and then round the double back to 15 decimal places, you'll get the same number. To uniquely represent a double you need 17 decimal places (meaning that every double has a 17-decimal-place representation that no other double is closer to), which is why 17 places are showing up; but not all 17-decimal numbers map to different double values (like in the examples in the other answers).
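
A quick sketch of both round trips (the 15-digit literal is arbitrary, chosen purely for illustration):

#include <cstdio>
#include <cstdlib>

int main() {
    char buf[64];

    // Round trip 1: a 15-digit decimal survives decimal -> double -> decimal.
    const char *text = "0.123456789012345";   // 15 significant digits
    double d = strtod(text, nullptr);
    snprintf(buf, sizeof buf, "%.15g", d);
    printf("%s -> %s\n", text, buf);          // prints the same digits back

    // Round trip 2: 17 digits are enough to survive double -> decimal -> double.
    double x = 1.0 / 7.0;
    snprintf(buf, sizeof buf, "%.17g", x);
    double y = strtod(buf, nullptr);
    printf("17-digit round trip exact: %s\n", x == y ? "yes" : "no");
}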

Colleencollege answered 3/4, 2012 at 18:40 Comment(2)
It translates to 15 stable decimal digits before the decimal place, i.e. in an integer. Where fractions are concerned, all bets are off. For example 3 has one significant digit, but 1/3 == 0.333333333333333 which has infinite decimal places. Same thing happens in binary.Guimar
What about subnormal values?Coincident

Decimal representation of floating point numbers is kind of strange. If you have a number with 15 decimal places and convert that to a double, then print it out with exactly 15 decimal places, you should get the same number. On the other hand, if you print out an arbitrary double with 15 decimal places and then convert it back to a double, you won't necessarily get the same value back; you need 17 decimal places for that. And neither 15 nor 17 decimal places are enough to accurately display the exact decimal equivalent of an arbitrary double. In general, you need over 100 decimal places to do that precisely.
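
For instance, even the humble 0.1 stores a double whose exact decimal expansion runs to 55 fractional digits. A minimal sketch, assuming a printf that rounds its conversions correctly:

#include <cstdio>

int main() {
    // The double nearest 0.1 is 3602879701896397 / 2^55, whose exact decimal
    // expansion has 55 fractional digits; %.55f shows every one of them.
    printf("%.55f\n", 0.1);
    // Typically prints:
    // 0.1000000000000000055511151231257827021181583404541015625
}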

See the Wikipedia page for double-precision and this article on floating-point precision.

Rexanna answered 3/4, 2012 at 18:41 Comment(5)
Do you have a reference regarding "over 100 decimal places"? I can see how you might need 63, but more than that?Colleencollege
@Colleencollege My second link at the bottom is where I got that number from.Rexanna
Wow, I didn't even think about how much the exponent can contribute. Thanks!Colleencollege
@John Calsbeek The "over 100 decimal places" stat from the article you reference is for floats; doubles need more than 750 significant digits.Quinacrine
But the 100/750 digits are needed under the assumption that the floating point representation of the actual number in question is precise, otherwise the extra digits only serve posterity (i.e. they have no actual meaning).Taritariff

A double holds 53 binary digits accurately, which is ~15.9545898 decimal digits. The debugger can show as many digits as it pleases to be more accurate to the binary value. Or it might take fewer digits: 0.1, for example, takes 1 digit in base 10 but infinitely many in base 2.

This is odd, so I'll show an extreme example. If we make a super-simple floating point format that holds only 3 binary digits of accuracy, and no exponent or sign (so the range is 0-0.875), our options are:

binary - decimal
000    - 0.000
001    - 0.125
010    - 0.250
011    - 0.375
100    - 0.500
101    - 0.625
110    - 0.750
111    - 0.875

But if you do the numbers, this format is only accurate to 0.903089987 decimal digits (log10(2^3)); not even 1 digit is accurate. That's easy to see: there's no value that begins with 0.4 or 0.9. And yet to display the values exactly, we require 3 decimal digits.
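
A throwaway sketch that reproduces the table and the 0.903 figure:

#include <cstdio>
#include <cmath>

int main() {
    // Enumerate every value of the toy format: 3 fraction bits, i.e. i/8.
    for (int i = 0; i < 8; ++i)
        printf("%d%d%d    - %.3f\n", (i >> 2) & 1, (i >> 1) & 1, i & 1, i / 8.0);
    // 8 distinct values correspond to log10(2^3) decimal digits of accuracy.
    printf("accuracy: %.9f decimal digits\n", 3 * log10(2.0));
}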

tl;dr: The debugger shows you the value of the floating point variable to some arbitrary precision (17 digits in your case), which doesn't necessarily correlate with the accuracy of the floating point format (~16 digits for a double).

Nellanellda answered 3/4, 2012 at 18:48 Comment(7)
Now that I look back on this, I wonder why I did my examples in trinary rather than base 2. I can't recall the thought process that led me to that.Nellanellda
Out of curiosity, 4/3 in base 10 is repeating because 10 and 3 are relatively prime, but how can you have the same thing happen with binary, since 10 and 2 are not relatively prime?Colleencollege
Actually it depends less on the primality, and more on the ratio of log(10)/log(2) = 3.32192809. If you use base 16 (hexadecimal), log(16)/log(2) = 4. So translation between binary and hex will be exact (and easy!) every time.Nellanellda
For reference, 1/10 in base 2 is 0.00011001100110011001100110011...Nellanellda
Primality does matter. Suppose you have a number 1/2^n. Because 10 = 2*5 it has an exact decimal representation given by 5^n / 10^n. (I think I just answered my previous question right there.)Colleencollege
What I said shows that there is always an exact decimal representation for a binary number. It doesn't work the other way because 2 isn't a multiple of 5.Colleencollege
@trutheality: Oh I misunderstood. I'm not positive, but I think it's because you cannot factor 10 entirely into 2s, so going from base 10 to base 2 not all numbers have finite representations; but since 2 is a factor of 10, base 2 to base 10 will always give a finite decimal representation. Thanks for the train of thought, will fix answer.Nellanellda

IEEE 754 floating point is done in binary. There's no exact conversion from a given number of bits to a given number of decimal digits. 3 bits can hold values from 0 to 7, and 4 bits can hold values from 0 to 15. A value from 0 to 9 takes roughly 3.32 bits (log2(10)), but that's not exact either.

An IEEE 754 double precision number occupies 64 bits. Of this, 52 bits are dedicated to the significand (the rest is a sign bit and exponent). Since the significand is (usually) normalized, there's an implied 53rd bit.

Now, given 53 bits and roughly 3.32 bits per digit, simple division gives us about 15.95 digits of precision. But remember, that 3.32 bits per decimal digit is only a ratio; decimal digits don't pack evenly into bits, so it's not a perfectly accurate answer.

Many (most?) debuggers actually look at the contents of the entire register. On an x86, that's actually an 80-bit number. The x86 floating point unit will normally be adjusted to carry out calculations to 64-bit precision -- but internally, it actually uses a couple of "guard bits", which basically means internally it does the calculation with a few extra bits of precision so it can round the last one correctly. When the debugger looks at the whole register, it'll usually find at least one extra digit that's reasonably accurate -- though since that digit won't have any guard bits, it may not be rounded correctly.
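
A sketch of what that extra register precision looks like, assuming an x86 toolchain where long double is the 80-bit x87 format:

#include <cstdio>
#include <cfloat>

int main() {
    // The 80-bit x87 format has a 64-bit significand (~19 decimal digits),
    // vs. 53 bits (~16 digits) for an ordinary double.
    printf("DBL_MANT_DIG  = %d\n", DBL_MANT_DIG);
    printf("LDBL_MANT_DIG = %d\n", LDBL_MANT_DIG);
    printf("double:      %.21g\n", 1.0 / 7.0);
    printf("long double: %.21Lg\n", 1.0L / 7.0L);
}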

Cleric answered 3/4, 2012 at 18:47 Comment(1)
log2(10) is about 3.32, so 3.32 bits per decimal digit. 53/3.32 is 15.96, almost 16 but not quite.Coincident

It is because it's being converted from a binary representation. Just because it has printed all those decimal digits doesn't mean it can represent all decimal values to that precision. Take, for example, this in Python:

>>> 0.14285714285714285
0.14285714285714285
>>> 0.14285714285714286
0.14285714285714285

Notice how I changed the last digit, but it printed out the same number anyway.
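
The same check in C++, mirroring the Python session above (just a sketch):

#include <cstdio>
#include <cstdlib>

int main() {
    // Both decimal strings round to the same nearest double.
    double a = strtod("0.14285714285714285", nullptr);
    double b = strtod("0.14285714285714286", nullptr);
    printf("same double? %s\n", a == b ? "yes" : "no");   // yes
}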

Milkmaid answered 3/4, 2012 at 18:35 Comment(0)

In most contexts where double values are used, calculations will have a certain amount of uncertainty. The difference between 1.33333333333333300 and 1.33333333333333399 may be less than the amount of uncertainty that exists in the calculations. Displaying the value of "2/3 + 2/3" as "1.33333333333333" is apt to be more meaningful than displaying it as "1.33333333333333319", since the latter display implies a level of precision that doesn't really exist.

In the debugger, however, it is important to uniquely indicate the value held by a variable, including essentially-meaningless bits of precision. It would be very confusing if a debugger displayed two variables as holding the value "1.333333333333333" when one of them actually held 1.33333333333333319 and the other held 1.33333333333333294 (meaning that, while they looked the same, they weren't equal). The extra precision shown by the debugger isn't apt to represent a numerically-correct calculation result, but indicates how the code will interpret the values held by the variables.
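
A sketch of the confusion this avoids; nextafter() simply manufactures two adjacent doubles that look identical at 15 digits:

#include <cstdio>
#include <cmath>

int main() {
    double a = 4.0 / 3.0;
    double b = nextafter(a, 2.0);   // the very next representable double

    printf("%.15g\n%.15g\n", a, b);                 // look identical at 15 digits
    printf("a == b? %s\n", a == b ? "yes" : "no");  // but they are not equal
    printf("%.17g\n%.17g\n", a, b);                 // 17 digits tell them apart
}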

Misinterpret answered 9/6, 2012 at 23:32 Comment(0)
