How to avoid incorrect rounding with numpy.round?
Asked Answered
H

4

9

I'm working with floating point numbers. If I do:

import numpy as np
np.round(100.045, 2)

I get:

Out[15]: 100.04

Obviously, this should be 100.05. I know about the existence of IEEE 754 and that the way that floating point numbers are stored is the cause of this rounding error.

My question is: how can I avoid this error?

Hierogram answered 16/5, 2018 at 15:25 Comment(5)
maybe check out the decimal module?Vague
Is this a floating point representation issue, or a round to the nearest even issue?Surrogate
@Surrogate A bit of both, combined with the fact that NumPy's rounding algorithm introduces errors in intermediate steps. As a result of those errors, NumPy falsely detects a halfway case and ends up applying the round-ties-to-even rule even though technically it shouldn't, because the original value being rounded isn't an exact tie (thanks to the usual binary floating-point representation issues).Midvictorian
@hpaulj: The annoying part is that NumPy's rounding algorithm more often gives users the result that satisfies their naive expectations (assuming that they're aware of round-ties-to-even), compared with Python's round operation which does do correct rounding in all cases, but ends up surprising users more often.Midvictorian
100.045 is really 100.0450000000000017053025658242404460906982421875 which , mathematically, should round to 2 digits to 100.05. To solve this problem, in general, with float64 requires higher precision. I am unfamiliar with NumPy to help more.Disrobe
S
14

You are partly right, often the cause of this "incorrect rounding" is because of the way floating point numbers are stored. Some float literals can be represented exactly as floating point numbers while others cannot.

>>> a = 100.045
>>> a.as_integer_ratio()  # not exact
(7040041011254395, 70368744177664)

>>> a = 0.25
>>> a.as_integer_ratio()  # exact
(1, 4)

It's also important to know that there is no way you can restore the literal you used (100.045) from the resulting floating point number. So the only thing you can do is to use an arbitrary precision data type instead of the literal. For example you could use Fraction or Decimal (just to mention two built-in types).

I mentioned that you cannot restore the literal once it is parsed as float - so you have to input it as string or something else that represents the number exactly and is supported by these data types:

>>> from fractions import Fraction
>>> f = Fraction(100045, 100)
>>> f
Fraction(20009, 20)

>>> f = Fraction("100.045")
>>> f
Fraction(20009, 20)

>>> from decimal import Decimal
>>> Decimal("100.045")
Decimal('100.045')

However these don't work well with NumPy and even if you get it to work at all - it will almost certainly be very slow compared to basic floating point operations.

>>> import numpy as np

>>> a = np.array([Decimal("100.045") for _ in range(1000)])
>>> np.round(a)
AttributeError: 'decimal.Decimal' object has no attribute 'rint'

In the beginning I said that you're are only partly right. There is another twist!

You mentioned that rounding 100.045 will obviously give 100.05. But that's not obvious at all, in your case it is even wrong (in the context of floating point math in programming - it would be true for "normal calculations"). In many programming languages a "half" value (where the number after the decimal you're rounding is 5) isn't always rounded up - for example Python (and NumPy) use a "round half to even" approach because it's less biased. For example 0.5 will be rounded to 0 while 1.5 will be rounded to 2.

So even if 100.045 could be represented exactly as float - it would still round to 100.04 because of that rounding rule!

>>> round(Fraction("100.045"), 1)
Fraction(5002, 5)

>>> 5002 / 5
1000.4

>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.04')

This is even mentioned in the NumPy docs for numpy.around:

Notes

For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R1011] and errors introduced when scaling by powers of ten.

(Emphasis mine.)

The only (at least that I know) numeric type in Python that allows setting the rounding rule manually is Decimal - via ROUND_HALF_UP:

>>> from decimal import Decimal, getcontext, ROUND_HALF_UP
>>> dc = getcontext()
>>> dc.rounding = ROUND_HALF_UP
>>> d = Decimal("100.045")
>>> round(d, 2)
Decimal('100.05')

Summary

So to avoid the "error" you have to:

  • Prevent Python from parsing it as floating point value and
  • use a data type that can represent it exactly
  • then you have to manually override the default rounding mode so that you will get rounding up for "halves".
  • (abandon NumPy because it doesn't have arbitrary precision data types)
Sarsenet answered 16/5, 2018 at 16:43 Comment(0)
A
2

Basically there is no general solution for this problem IMO, unless you have a general rule for all the different cases (see Floating Point Arithmetic: Issues and Limitation). However, in this case you can round the decimal part separately:

In [24]: dec, integ = np.modf(100.045)

In [25]: integ + np.round(dec, 2)
Out[25]: 100.05

The reason for such behavior is not because separating integer from decimal part makes any difference on round()'s logic. It's because when you use fmod it gives you a more realistic version of the decimal part of the number which is actually a rounded representation.

In this case here is what dec is:

In [30]: dec
Out[30]: 0.045000000000001705

And you can check that round gives same result with 0.045:

In [31]: round(0.045, 2)
Out[31]: 0.04

Now if you try with another number like 100.0333, the decimal part is a slightly smaller version which as I mentioned, the result you want depends on your rounding policies.

In [37]: dec, i = np.modf(100.0333)

In [38]: dec
Out[38]: 0.033299999999997

There are also modules like fractions and decimal that provide support for fast correctly-rounded decimal floating point and rational arithmetic, that you can use in situations as such.

Agger answered 16/5, 2018 at 15:40 Comment(2)
OK so basically fmod reduces the chance of this error occurring... or how should I interpret this? I'm actually looking to avoid these cases, not just reduce the chance of them occurring.Hierogram
@Hierogram Yes it can be. Something like np.modf(100.0333). But in these cases it still depends on your rounding policies. There is no general solution for this problem IMHO, unless you have a general rule for all the difference cases.Agger
B
1

This is not a bug, but a feature )))

you can simple use this trick:

def myround(val):
"Fix pythons round"
d,v = math.modf(val)
if d==0.5:
    val += 0.000000001
return round(val)
Bahadur answered 7/6, 2019 at 16:9 Comment(2)
This answer doesn't address NumpyTelega
it address both numpy's and native round functionsBahadur
B
1

This works.

>>> a=np.array([-0.5, 0.5, 1.5, 1.7], dtype=float)
>>> np.array(a+0.5, dtype=int)
array([0, 1, 2, 2])

In your case,

>>> a=100.045
>>> decim=2

def round(a, decim):
    return np.array((a* 10**decim) +0.5, dtype=int) / 10**decim

>>> round(a, decim)
100.05
Bixby answered 11/6 at 18:45 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.