How to avoid floating point arithmetic issues?
R

3

5

Python (and almost anything else) has known limitations while working with floating point numbers (nice overview provided here).

While the problem is described well in the documentation, it avoids providing any approach to fixing it. With this question I am looking for a more or less robust way to avoid situations like the following:

import math

print(math.floor(0.09/0.015))   # >> 6
print(math.floor(0.009/0.0015)) # >> 5

print(99.99-99.973) # >> 0.016999999999825377
print(.99-.973)     # >> 0.017000000000000015

var = 0.009
step = 0.0015
print(var < math.floor(var/step)*step+step) # False
print(var < (math.floor(var/step)+1)*step)  # True

And the solution suggested in this question does not help with a problem like the next piece of code, which fails seemingly at random:

  total_bins = math.ceil((data_max - data_min) / width)  # round to upper
  new_max = data_min + total_bins * width
  assert new_max >= data_max 
  # fails. because for example 1.9459999999999997 < 1.946
Rubbing answered 12/10, 2017 at 0:43 Comment(6)
Have you looked into using the decimal module? -- Auliffe
@Christian Dean, does it mean converting every argument to Decimal for any arithmetical operation? -- Rubbing
In essence, yes. -- Auliffe
Side-note: If data_max, data_min and width are int, you can avoid the precision issues of float entirely by adapting integer floor division (//) into integer ceiling division: total_bins = ((data_max - data_min) + (width - 1)) // width. By adding one less than the divisor to the dividend, then using floor division, you get ceiling division for purely integer operands, with no floating point involved at all. -- Rese
Do you care about whether stuff like (1.0/49.0)*49.0 is exactly equal to 1.0, or do you just want things with finite decimal expansions to behave intuitively? (Do you care about things like (2**0.2)**5.0?) -- Corncob
(The Right Answer is usually to not do things that fail if a result is a tiny bit inaccurate. Whether you use decimals, fractions, or symbolic math, you can't get away from the fundamental limits of working with real numbers.) -- Corncob
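The two suggestions in these comments could be sketched roughly as follows. The helper names ceil_div and covers are made up for illustration, ceil_div assumes a positive divisor, and the tolerance of 1e-9 is an arbitrary assumption rather than something the commenters specified.

```python
import math


def ceil_div(dividend: int, divisor: int) -> int:
    """Ceiling division on integers, assuming divisor > 0 (hypothetical helper)."""
    return (dividend + divisor - 1) // divisor


def covers(new_max: float, data_max: float) -> bool:
    """True if new_max >= data_max, allowing for a tiny rounding error."""
    # rel_tol=1e-9 is an assumed tolerance; pick one that suits your data.
    return new_max >= data_max or math.isclose(new_max, data_max, rel_tol=1e-9)


# Integer data: a range of 10 units split into bins of width 3 needs 4 bins.
total_bins = ceil_div(10 - 0, 3)

# The failing case from the question: the values differ only by rounding error.
ok = covers(1.9459999999999997, 1.946)

print(total_bins, ok)  # 4 True
```

The tolerant check replaces the strict assert from the question; it does not make the arithmetic exact, it only stops the code from depending on the last bit of the result.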
K
9

If you deal in discrete quantities, use int.

Sometimes people use float in places where they definitely shouldn't. If you're counting something (like number of cars in the world) as opposed to measuring something (like how much gasoline is used per day), floating-point is probably the wrong choice. Currency is another example where floating point numbers are often abused: if you're storing your bank account balance in a database, it's really not 123.45 dollars, it's 12345 cents. (But also see below about Decimal.)
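As a minimal sketch of the cents idea, assuming amounts are always whole cents (format_cents is a hypothetical helper, not a standard library function):

```python
def format_cents(cents: int) -> str:
    """Render an integer number of cents as a dollar string."""
    sign = "-" if cents < 0 else ""
    dollars, remainder = divmod(abs(cents), 100)
    return f"{sign}${dollars}.{remainder:02d}"


balance_cents = 12345      # $123.45 stored as a whole number of cents
balance_cents += 10 + 20   # add $0.10 and $0.20 exactly
float_sum = 0.1 + 0.2      # the same sum in float is not exactly 0.3

print(format_cents(balance_cents))  # $123.75
print(float_sum == 0.3)             # False
```

Integer addition is exact, so the balance never drifts; the decimal point only appears when the value is formatted for display.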

Most of the rest of the time, use float.

Floating-point numbers are general-purpose. They're extremely accurate; they just can't represent certain fractions, like finite decimal numbers can't represent the number 1/3. Floats are generally suited for any kind of analog quantity where the measurement has error bars: length, mass, frequency, energy -- if there's uncertainty on the order of 2^(-52) or greater, there's probably no good reason not to use float.
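To put a number on "extremely accurate", here is a small sketch that assumes the usual IEEE 754 double behind Python's float. It checks the machine epsilon and compares one of the question's subtractions against its decimal answer using a tolerance (1e-9 is an arbitrary choice).

```python
import math
import sys

# Machine epsilon for a double: the gap between 1.0 and the next float.
eps = sys.float_info.epsilon

# A subtraction from the question, compared with the decimal answer 0.017.
diff = 99.99 - 99.973
relative_error = abs(diff - 0.017) / 0.017

# A tolerance-based comparison treats the two values as equal.
close_enough = math.isclose(diff, 0.017, rel_tol=1e-9)

print(eps == 2**-52)          # True
print(diff == 0.017)          # False
print(relative_error < 1e-9)  # True
print(close_enough)           # True
```

For a measured quantity with any realistic error bar, an error this small is far below the uncertainty of the measurement itself.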

If you need human-readable numbers, use float but format it.

"This number looks weird" is a bad reason not to use float. But that doesn't mean you have to display the number to arbitrary precision. If a number with only three significant figures comes out to 19.99909997918947, format it to one decimal place and be done with it.

>>> from math import e, pi
>>> print('{:0.1f}'.format(e**pi - pi))
20.0

If you need precise decimal representation, use Decimal.

Sraw's answer refers to the decimal module, which is part of the standard library. I already mentioned currency as a discrete quantity, but you may need to do calculations on amounts of currency in which not all numbers are discrete, for example calculating interest. If you're writing code for an accounting system, there will be rules that say when rounding is applied and to what accuracy various calculations are done, and those specifications will be written in terms of decimal places. In this situation and others where the decimal representation is inherent to the problem specification, you'll want to use a decimal type.

>>> from decimal import Decimal
>>> rate = Decimal('0.0345')
>>> principal = Decimal('3412.65')
>>> interest = rate*principal
>>> interest
Decimal('117.736425')
>>> interest.quantize(Decimal('0.01'))
Decimal('117.74')
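When the specification names a rounding rule, quantize accepts it through its rounding argument. The sketch below uses a made-up amount that sits exactly halfway between two cents to show how two common rules differ.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

CENT = Decimal("0.01")
amount = Decimal("2.665")  # exactly halfway between 2.66 and 2.67

# Banker's rounding: ties go to the neighbour whose last digit is even.
half_even = amount.quantize(CENT, rounding=ROUND_HALF_EVEN)

# Schoolbook rounding: ties go away from zero.
half_up = amount.quantize(CENT, rounding=ROUND_HALF_UP)

print(half_even)  # 2.66
print(half_up)    # 2.67
```

ROUND_HALF_EVEN is the default in the standard context, so it is worth passing the mode explicitly when the rules you are implementing say otherwise.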

But most importantly, use data types and operations that make sense in context.

Several of your examples use math.floor, which takes a float and rounds it down to the nearest integer (for positive numbers, that means chopping off the fractional part). In any situation where you should use math.floor, floating-point error doesn't matter. (If you want to round to the nearest integer, use round instead.) Yes, there are ways to use floating-point operations that have wrong results from a mathematical standpoint. But real-world quantities usually fall into one of these categories:

  1. Exact, and therefore should not be put in a float;
  2. Imprecise to a degree far exceeding the likely accumulation of floating-point error.

As a programmer, it's part of your job to know the quantities you're dealing with and choose appropriate data types. So there's no "fix" for floating point numbers, because there's no "problem" really -- just people using the wrong type for the wrong thing.
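If the values in the question's floor example are meant to be exact, one way to apply this advice is to keep them out of float altogether. The sketch below shows two options: fractions.Fraction from the standard library, and plain integers in an assumed unit of 0.0001 (that unit is an illustration, not something stated in the question).

```python
import math
from fractions import Fraction

# Exact rational arithmetic: the strings are parsed as exact decimal values.
var = Fraction("0.009")
step = Fraction("0.0015")
bins_exact = math.floor(var / step)

# The same quantities as whole numbers of an assumed unit of 0.0001.
var_units = 90    # 0.009  = 90 * 0.0001
step_units = 15   # 0.0015 = 15 * 0.0001
bins_int = var_units // step_units

# For comparison, the float version from the question.
bins_float = math.floor(0.009 / 0.0015)

print(bins_exact, bins_int, bins_float)  # 6 6 5
```

Both exact versions give 6 because the division never leaves rational or integer arithmetic; the float version gives 5 because the quotient lands just below 6.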

Kickoff answered 12/10, 2017 at 2:16 Comment(0)
C
4

Let's talk about decimal. Roughly speaking, this library stores a number as a sign, a sequence of decimal digits, and an exponent, and does its arithmetic on those digits rather than in binary.

So it can represent decimal values exactly and handle very large numbers, with a precision you choose (28 significant digits by default).

But because the arithmetic is done on decimal digits in software rather than by the hardware floating-point unit, it costs much more.

Further, if you want to use decimal to ensure precision, you need to use it consistently. If you mix Decimal with other types such as float, you may get unexpected problems: arithmetic between a Decimal and a float raises a TypeError.

Finally, when you construct a Decimal object, it is better to pass a string rather than a float.

>>> from decimal import Decimal
>>> print(Decimal(99.99) - Decimal(99.973))
0.01699999999999590727384202182
>>> print(Decimal("99.99") - Decimal("99.973"))
0.017
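A short sketch of the points above, using only the standard decimal module. It assumes the default context has not been changed elsewhere in the program.

```python
from decimal import Decimal, getcontext

# Arithmetic between Decimal and float is rejected rather than silently mixed.
try:
    Decimal("0.1") + 0.2
    mixed_ok = True
except TypeError:
    mixed_ok = False

# A float converted directly carries its binary rounding error along.
from_float = Decimal(0.1)
from_string = Decimal("0.1")
same_value = from_float == from_string

# Results are rounded to the context precision (28 significant digits by default).
default_precision = getcontext().prec
third = Decimal(1) / Decimal(3)

print(mixed_ok, same_value, default_precision)  # False False 28
print(third)  # 0.3333333333333333333333333333
```

The last line is a reminder that Decimal is still finite precision: it is exact for decimal fractions such as 0.1, but not for values such as 1/3.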
Cochineal answered 12/10, 2017 at 1:30 Comment(0)
E
4

It depends what your end goal is - there is no way to "perfectly" store floating point numbers. Only "good enough".

If you are working with money, for example (dollars and cents), it is common practice not to store dollars at all, only cents (1 dollar = 100 cents). This is how PayPal stores your account balance on their servers.

There is also Python's Decimal class for fixed-point arithmetic.

Extortioner answered 12/10, 2017 at 1:36 Comment(0)
