Is the most significant decimal digits precision that can be converted to binary and back to decimal without loss of significance 6 or 7.225?

I've come across two different precision formulas for floating-point numbers.

⌊(N-1) log10(2)⌋ = 6 decimal digits (Single-precision)

and

N log10(2) ≈ 7.225 decimal digits (Single-precision)

Where N = 24 Significant bits (Single-precision)

The first formula is found at the top of page 4 of "IEEE Standard 754 for Binary Floating-Point Arithmetic", written by Professor W. Kahan.

The second formula is found on the Wikipedia article "Single-precision floating-point format" under section IEEE 754 single-precision binary floating-point format: binary32.

For the first formula, Professor W. Kahan says

If a decimal string with at most 6 sig. dec. is converted to Single and then converted back to the same number of sig. dec., then the final string should match the original.

For the second formula, Wikipedia says

...the total precision is 24 bits (equivalent to log10(2^24) ≈ 7.225 decimal digits).

The results of both formulas (6 and 7.225 decimal digits) are different, and I expected them to be the same because I assumed they both were meant to represent the most significant decimal digits which can be converted to floating-point binary and then converted back to decimal with the same number of significant decimal digits that it started with.

Why do these two numbers differ, and what is the most significant decimal digits precision that can be converted to binary and back to decimal without loss of significance?

Coltun answered 6/6, 2015 at 23:1 Comment(9)
The second doesn't contradict the first. There is no claim in the Wikipedia article about conversion back and forth. In any case Wikipedia is not a reliable source.Crabwise
There's a difference between to binary and back to decimal, and binary->decimal->binary. Good discussions here.Blandina
@WanderingFool - sorry, I didn't scroll through all the answers.Blandina
Caveat — The answer given by Hans Passant is incorrect and misleading. The correct answer is 6, as given by Jerry Coffin and myself.Coltun
possible duplicate of Decimal precision of floatsPetrous
@Petrous Nice find. I read the question Decimal precision of floats and the question and answers are less detailed than mine. According to How should duplicate questions be handled?, "The general rule is to keep the question with the best collection of answers, and close the other one as a duplicate." I vote to close the other question as a duplicate because it is less detailed and has a vague title.Coltun
True; even before your comment I'd retracted the close vote on this question :) However, I didn't vote to close the other question either, since I realized that decimal precision of floats and digits guaranteed to round-trip from string → float → string (FLT_DIG) aren't the same. You seem to have asked various questions regarding both, nice work, thanks!Petrous
@Petrous I'm happy you understand there is a difference between decimal precision and digits guaranteed to round-trip. However, you were right the first time. The question Decimal precision of floats is a duplicate question because it's asking why numeric_limits returns 6. In my answer below, I give 6 as the digits guaranteed to round-trip. Round-trip may not be asked in the question you shared, but it's given as an answer to the OP. This is a question where the OP didn't know what to ask due to lack of knowledge.Coltun
This also shares a good point about the confusion between decimal digit precision and digits guaranteed to round-trip. I'm going to edit my answer below to clarify this distinct difference between the numbers 6 and 7.225 to resolve this ambiguity.Coltun
C
6

what is the most significant decimal digits precision that can be converted to binary and back to decimal without loss of significance?

The most significant decimal digits precision that can be converted to binary and back to decimal without loss of significance (for single-precision floating-point numbers or 24-bits) is 6 decimal digits.


Why do these two numbers differ...

The numbers 6 and 7.225 differ because they measure two different things. 6 is the largest number of decimal digits that is guaranteed to survive a round trip. 7.225 is the approximate number of decimal digits of precision in a 24-bit binary integer; it is fractional because a 24-bit binary integer can have 7 or 8 decimal digits depending on its specific value.

7.225 was found using the specific binary integer formula.

dspec = b·log10(2)             (dspec = specific decimal digits, b = bits)

However, what you normally need to know are the minimum and maximum numbers of decimal digits for a b-bit integer. The following formulas give the min and max decimal digits (7 and 8 respectively for 24 bits) of a specific binary integer.

dmin = ⌈(b-1)·log10(2)⌉    (dmin = min decimal digits, b = bits, ⌈x⌉ = smallest integer ≥ x)

dmax = ⌈b·log10(2)⌉         (dmax = max decimal digits, b = bits, ⌈x⌉ = smallest integer ≥ x)

To learn more about how these formulas are derived, read Number of Decimal Digits In a Binary Integer, written by Rick Regan.
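As a rough check of the two integer formulas above, the sketch below evaluates them and provides a direct digit count to compare against. The function names are made up for this illustration and do not come from Rick Regan's article; the formulas are assumed to be used with b ≥ 2.

```cpp
#include <cmath>
#include <cstdint>
#include <string>

// dmin = ceil((b - 1) * log10(2)): digits in the smallest b-bit integer, 2^(b-1).
// Intended for bits >= 2.
int min_decimal_digits(int bits) {
    return static_cast<int>(std::ceil((bits - 1) * std::log10(2.0)));
}

// dmax = ceil(b * log10(2)): digits in the largest b-bit integer, 2^b - 1.
int max_decimal_digits(int bits) {
    return static_cast<int>(std::ceil(bits * std::log10(2.0)));
}

// Direct count of decimal digits, used only to cross-check the formulas.
int count_decimal_digits(std::uint64_t value) {
    return static_cast<int>(std::to_string(value).size());
}
```

For b = 24, `min_decimal_digits(24)` gives 7 (2^23 = 8388608 has 7 digits) and `max_decimal_digits(24)` gives 8 (2^24 - 1 = 16777215 has 8 digits).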

This is all well and good, but you may ask, why is 6 the most decimal digits for a round-trip conversion if you say that the span of decimal digits for a 24-bit number is 7 to 8?

The answer is — because the above formulas only work for integers and not floating-point numbers!

Every decimal integer has an exact value in binary. However, the same cannot be said for every decimal fraction. Take .1 for example. In binary, .1 is 0.000110011001100..., a repeating (recurring) expansion that never terminates, so it has to be rounded to fit in a finite number of bits. This produces rounding error.
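To see the rounding error for .1 concretely, here is a small sketch that prints the single-precision value that is actually stored. It assumes `float` is IEEE 754 binary32 and that the C library prints exact digits (glibc does); `stored_value` is a name invented for this example.

```cpp
#include <cstdio>
#include <string>

// Formats a float with the requested number of digits after the decimal point.
std::string stored_value(float value, int decimals) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*f", decimals,
                  static_cast<double>(value));
    return buffer;
}
```

Under those assumptions, `stored_value(0.1f, 12)` yields `0.100000001490` rather than `0.100000000000`: the nearest float to .1 is slightly larger than .1.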

Moreover, it takes one more bit to represent a decimal floating-point number than it does to represent a decimal integer of equal significance. This is because floating-point numbers are more precise the closer they are to 0, and less precise the further they are from 0. Because of this, many floating-point numbers near the minimum and maximum value ranges (emin = -126 and emax = +127 for single-precision) lose 1 bit of precision due to rounding error. To see this visually, look at What every computer programmer should know about floating point, part 1, written by Josh Haberman.

Furthermore, there are at least 784,757 positive seven-digit decimal numbers that cannot retain their original value after a round-trip conversion. An example of such a number that cannot survive the round-trip is 8.589973e9. This is the smallest positive number that does not retain its original value.
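The failing example can be reproduced with a short sketch like the one below. It assumes IEEE 754 binary32 `float` and correctly rounded `strtof` and `printf` conversions (as in glibc); `round_trip` is a hypothetical helper written for this illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Converts a decimal string to float, then back to a decimal string with the
// given number of significant digits (scientific notation).
std::string round_trip(const std::string& decimal, int significant_digits) {
    const float stored = std::strtof(decimal.c_str(), nullptr);
    char buffer[64];
    // %.Ne prints one digit before the point and N after it.
    std::snprintf(buffer, sizeof buffer, "%.*e", significant_digits - 1,
                  static_cast<double>(stored));
    return buffer;
}
```

The nearest float to 8.589973e9 is 8589973504, so `round_trip("8.589973e9", 7)` comes back as `8.589974e+09`, while the six-digit input `round_trip("8.58997e9", 6)` comes back unchanged as `8.58997e+09`.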

Here's the formula that you should be using for floating-point number precision that will give you 6 decimal digits for round-trip conversion.

dmax = ⌊(b-1)·log10(2)⌋    (dmax = max decimal digits, b = bits, ⌊x⌋ = largest integer ≤ x)

To learn more about how this formula is derived, read Number of Digits Required For Round-Trip Conversions, also written by Rick Regan. Rick does an excellent job showing the formula's derivation with references to rigorous proofs.


As a result, you can utilize the above formulas in a constructive way; if you understand how they work, you can apply them to any programming language that uses floating-point data types. All you have to know is the number of significant bits that your floating-point data type has, and you can find their respective number of decimal digits that you can count on to have no loss of significance after a round-trip conversion.
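As one way to apply this in C++, the sketch below reads the number of significant bits from `std::numeric_limits` and feeds it to the round-trip formula. The helper name `round_trip_digits` is an assumption of this example; only `std::numeric_limits` is a standard facility.

```cpp
#include <cmath>
#include <limits>

// floor((b - 1) * log10(2)), where b is the number of significand bits of T.
template <typename T>
int round_trip_digits() {
    static_assert(std::numeric_limits<T>::radix == 2,
                  "the formula assumes a binary floating-point format");
    const int bits = std::numeric_limits<T>::digits;  // includes the implicit bit
    return static_cast<int>(std::floor((bits - 1) * std::log10(2.0)));
}
```

On a platform with IEEE 754 types, `round_trip_digits<float>()` gives 6 (b = 24) and `round_trip_digits<double>()` gives 15 (b = 53), which should agree with `std::numeric_limits<T>::digits10` and with `FLT_DIG` / `DBL_DIG`.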

June 18, 2017 Update: I want to include a link to Rick Regan's new article which goes into more detail and in my opinion better answers this question than any answer provided here. His article is "Decimal Precision of Binary Floating-Point Numbers" and can be found on his website www.exploringbinary.com.

Coltun answered 15/6, 2015 at 19:24 Comment(2)
Bryan: I would have linked to this article a year ago if I had only written it then: exploringbinary.com/… . It supports your answer, but with more nuanced analysis. For example, 6 digits is the maximum guaranteed precision across the whole format, but you can get segments that provide 7 or 8 digits.Arrowroot
@RickRegan Sorry I've been on such a long hiatus. I read your article and I think it's great. I added your link to my answer and I'm honored that you were inspired to write such an informative article after my inquiries. I still have some doubts in the accuracy of my answer. If you see any mistakes, I give you permission to edit and correct my response. Thank you!Coltun
G
15

These are talking about two slightly different things.

The 7.225 digits (see note 1 below) is the precision with which a number can be stored internally. For example, if you did a computation with a double-precision number (so you were starting with something like 15 digits of precision) and then rounded it to a single-precision number, the precision you'd have left at that point would be approximately 7 digits.

The 6 digits is talking about the precision that can be maintained through a round-trip conversion from a string of decimal digits, into a floating point number, then back to another string of decimal digits.

So, let's assume I start with a number like 1.23456789 as a string, then convert that to a float32, then convert the result back to a string. When I've done this, I can expect 6 digits to match exactly. The seventh digit might be rounded though, so I can't necessarily expect it to match (though it will probably be within +/- 1 of the original string).

For example, consider the following code:

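#include <iomanip>
#include <iostream>

int main() {
    double init = 987.23456789;
    std::cout << std::setprecision(10);
    for (int i = 0; i < 100; i++) {
        // Narrow each double to single precision before printing it.
        float f = init + i / 100.0;
        std::cout << std::setw(20) << f;
        // Start a new row after every fourth value.
        if (i % 4 == 3) std::cout << '\n';
    }
}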

This produces a table like the following:

     987.2345581         987.2445679         987.2545776         987.2645874
     987.2745972         987.2845459         987.2945557         987.3045654
     987.3145752          987.324585         987.3345947         987.3445435
     987.3545532          987.364563         987.3745728         987.3845825
     987.3945923          987.404541         987.4145508         987.4245605
     987.4345703         987.4445801         987.4545898         987.4645386
     987.4745483         987.4845581         987.4945679         987.5045776
     987.5145874         987.5245972         987.5345459         987.5445557
     987.5545654         987.5645752          987.574585         987.5845947
     987.5945435         987.6045532          987.614563         987.6245728
     987.6345825         987.6445923          987.654541         987.6645508
     987.6745605         987.6845703         987.6945801         987.7045898
     987.7145386         987.7245483         987.7345581         987.7445679
     987.7545776         987.7645874         987.7745972         987.7845459
     987.7945557         987.8045654         987.8145752          987.824585
     987.8345947         987.8445435         987.8545532          987.864563
     987.8745728         987.8845825         987.8945923          987.904541
     987.9145508         987.9245605         987.9345703         987.9445801
     987.9545898         987.9645386         987.9745483         987.9845581
     987.9945679         988.0045776         988.0145874         988.0245972
     988.0345459         988.0445557         988.0545654         988.0645752
      988.074585         988.0845947         988.0945435         988.1045532
      988.114563         988.1245728         988.1345825         988.1445923
      988.154541         988.1645508         988.1745605         988.1845703
     988.1945801         988.2045898         988.2145386         988.2245483

If we look through this, we can see that the first six significant digits always follow the pattern precisely (i.e., each result is exactly 0.01 greater than its predecessor). As we can see in the original double, the value is actually 98x.xx456, but when we convert the single-precision float to decimal, we can see that the 7th digit frequently would not be read back in correctly. Since the subsequent digit is greater than 5, it should round up to 98x.xx46, but some of the values won't (e.g., the second-to-last item in the first column is 988.154541, which would be rounded down instead of up, so we'd end up with 98x.xx45 instead of 98x.xx46). So, even though the value (as stored) is precise to 7 digits (plus a little), by the time we round-trip the value through a conversion to decimal and back, we can't depend on that seventh digit matching precisely any more (even though there's enough precision that it will match a lot more often than not).


1. That basically means 7 digits, and the 8th digit will be a little more accurate than nothing, but not a whole lot. For example, if we were converting from a double of 1.2345678, the .225 digits of precision mean that the last digit would be within about +/- .775 of what started out there (whereas without the .225 digits of precision, it would be basically +/- 1 of what started out there).

Guillema answered 6/6, 2015 at 23:48 Comment(6)
@WanderingFool: No, you haven't disproved anything--it's expected that any attempt at round-tripping like this will round the number back to the number of digits that the type can represent, so your 1.009999990 should be rounded to 6 (or at most 7) digits, giving 1.001000. In this case, we're only seeing a difference between the actual and expected values after 10 digits, which is clearly more precision than we normally expect from a Float32.Guillema
I've a couple of doubts in this, would be ok if I invite you to the Sandbox chat room to discuss about this?Petrous
@legends2k: Hi--sorry, I was offline last night. I'm around right now, but can't really chat (I'm at work).Guillema
@JerryCoffin Looking at this answer again I must confess I don't understand it. How does the program show that floats have 7.22 digits (i.e., slightly more than 7 digits) of internal precision? And if we DO get 7.22 digits of internal precision, shouldn't we always get at least 7 digits of "external" precision? I guess I am struggling to see how 7.22 comes into play at all with floats (vs. integers) since precision (external at least) has more to do with the relative gap sizes in overlapping power of ten and power of two exponent ranges (which brings us back to the magic number "6").Arrowroot
I've thought through this topic a bit more -- please take a look at exploringbinary.com/…Arrowroot
I don't follow you! You are not string-float-string, so the first 7 digit are always guaranteed. Also in your own example, the pattern 98x.xx45 (which is 7 digit) is always that.Reprehend
R
2

Do keep in mind that they are the exact same formulas. Remember your high-school math book identity:

    Log(x^y) == y * Log(x)

It helps to actually calculate the values for N = 24 with your calculator:

  Kahan's:      23 * Log(2) = 6.924
  Wikipedia's:   Log(2^24)  = 7.225

Kahan was forced to truncate 6.924 down to 6 digits because of floor(), bummer. The only actual difference is that Kahan used 1 less bit of precision.

It's pretty hard to guess why; the professor might have relied on old notes, written before IEEE-754 and not taking into account that the 24th bit of precision comes for free. The format uses a trick: the most significant bit of a normalized, nonzero floating-point value is always 1, so it doesn't need to be stored. The processor adds it back before it performs a calculation, turning 23 bits of stored precision into 24 bits of effective precision.
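The implicit leading bit can be made visible with a sketch like the following, which splits a float into its stored fields and rebuilds the value by putting the hidden 1 back. It assumes `float` is IEEE 754 binary32 and only handles normal numbers (not zero, subnormals, infinities or NaNs); the struct and function names are invented for this example.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "this sketch assumes IEEE 754 binary32 floats");

struct Binary32Fields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t fraction;  // 23 stored bits; the leading 1 is not stored
};

Binary32Fields split_fields(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bits
    return {bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}

// Rebuilds a normal number from its fields by restoring the hidden bit.
float rebuild_normal(const Binary32Fields& f) {
    const std::uint32_t significand = (1u << 23) | f.fraction;  // 24 bits
    const float magnitude = std::ldexp(static_cast<float>(significand),
                                       static_cast<int>(f.exponent) - 127 - 23);
    return f.sign ? -magnitude : magnitude;
}
```

For instance, `split_fields(1.0f)` has an exponent field of 127 and a fraction of 0, yet `rebuild_normal` still returns 1.0f because the leading 1 is supplied implicitly.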

Or he took into account that the conversion from a decimal string to a binary floating point value itself generates an error. Many nice round decimal values, like 0.1, cannot be perfectly converted to binary. It has an endless number of digits, just like 1/3 in decimal. That however generates a result that is off by +/- 0.5 bits, achieved by simple rounding. So the result is accurate to 23.5 * Log(2) = 7.074 decimal digits. If he assumed that the conversion routine is clumsy and doesn't properly round then the result can be off by +/-1 bit and N-1 is appropriate. They are not clumsy.

Or he thought like a typical scientist or (heaven forbid) accountant and wants the result of a calculation converted back to decimal as well. Such as you'd get when you trivially look for a 7 digit decimal number whose conversion back-and-forth does not produce the same number. Yes, that adds another +/- 0.5 bit error, summing up to 1 bit error total.

But never, never make that mistake, you always have to include any errors you get from manipulating the number in a calculation. Some of them lose significant digits very quickly, subtraction in particular is very dangerous.

Renter answered 6/6, 2015 at 23:43 Comment(6)
@Hans Passant They are not the same formulas. The "N-1" formula is for decimal to floating-point to decimal round-trip conversions. In the single-precision case, N=24. The N-1 comes from a derivation based on work from Matula and I. Goldberg, which I've written about recently (exploringbinary.com/… ).Arrowroot
I don't see anything that disproves the pessimistic rounding assumption. I'll pass on buying the article.Renter
WanderingFool: @JerryCoffin's answer and comments explain everything well enough I think. The answer to your question is "6", meaning that every number 6 digits or less will round-trip through single precision floating-point. That's not to say that you won't find numbers with more digits round trip (0.100000001490116119384765625 round-trips, for example).Arrowroot
There is nothing in Matula's or I. Goldberg's or Kahan's writings that assume that the conversion routine is "clumsy and doesn't properly round".Arrowroot
@WanderingFool It is the number of decimal digits that can be represented by a 24-bit integer: 24*log10(2) = approx 7.22. That means between 7 and 8 digits. All 8-digit numbers up to 2^24 = 16,777,216 can be represented, but 16,777,217 - 99,999,999 can't. Yes, you may refer to my site.Arrowroot
@HansPassant please edit your answer because it's incorrect and misleading. The answer is 6, not 7. In my answer below, I give the 7 digit number 8.589973e9 which cannot be round-tripped. Try it yourself; 7 digits is clearly wrong. Also, the formulas are not the same as @RickRegan already pointed out.Coltun
