I'm studying C, and the idea of guard digits and rounding errors came up. Do practitioners of scripting languages (I'm thinking of Python and Perl here) need to worry about this stuff? What if they are doing scientific programming?
I would have to disagree with Lutz... While the rounding errors you mentioned do exist in Python/Perl/Ruby, they have absolutely nothing to do with the languages being implemented in C. The problem goes deeper than that.
Floating-point numbers, like all data, are represented in binary on modern computers. Just as there are numbers with periodic decimal representations (e.g., 1/3 = 0.333333...), there are also numbers with periodic binary representations (e.g., 1/10 = 0.0001100110011...). Since these numbers cannot be exactly represented in (a finite amount of) computer memory, any calculations involving them will introduce error.
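A minimal sketch in Python, assuming the usual case where float is an IEEE 754 double, shows what is actually stored for the literal 0.1. The variable names are only illustrative.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding,
# so it reveals what the float literal 0.1 really holds.
stored_tenth = Decimal(0.1)
print(stored_tenth)

# The stored value is close to, but not equal to, the decimal number 0.1.
is_exact = (stored_tenth == Decimal("0.1"))
print(is_exact)  # False

# The small per-value errors show up in ordinary arithmetic.
sum_matches = (0.1 + 0.2 == 0.3)
print(sum_matches)  # False
```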
This can be worked around by using high-precision math libraries, which represent the numbers either as the two numbers of a fraction (i.e., "numerator = 1, denominator = 10") or as strings instead of using a native binary representation. However, because of the extra work involved in doing any calculations on numbers that are being stored as something else, these libraries necessarily slow down any math that has to go through them.
It depends. doubles behave the same everywhere, so if you do math with doubles, you are going to have the same problem with any language. If you use a native arbitrary precision type, then no, it's not a problem. Consider:
use Math::BigFloat;
my $big = Math::BigFloat->new("1_000_000_000_000_000_000_000");
my $small = Math::BigFloat->new("0.000000000000000000000000001");
print $big + $small;
(Or, if you really want to hide what's going on:
use bignum;
print 1_000_000_000_000_000_000_000 + 0.000000000000000000000000001
)
As expected, this yields:
1000000000000000000000.000000000000000000000000001
Also as expected, this is not done in one CPU instruction.
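For comparison, here is a rough Python counterpart of the same experiment, sketched with the standard decimal module. The precision of 60 is an arbitrary choice that is merely large enough for this sum.

```python
from decimal import Decimal, localcontext

big = Decimal("1000000000000000000000")            # 10**21
small = Decimal("0.000000000000000000000000001")   # 10**-27

# Hardware doubles carry roughly 15-17 significant decimal digits,
# so the small term is lost entirely.
float_sum_unchanged = (float(big) + float(small) == float(big))
print(float_sum_unchanged)  # True

# The default decimal context keeps 28 significant digits; this sum needs 49,
# so the working precision is raised locally.
with localcontext() as ctx:
    ctx.prec = 60
    exact_sum = big + small

print(exact_sum)  # 1000000000000000000000.000000000000000000000000001
```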
There are several types of non-integer numbers in Python:
x = 1.0 / 2
would give you the standard float. Its type is float, it's essentially the same as a double in C, it's handled by the hardware, and it has the same problems as every other float in the world.
However, there is also a fraction type:
from fractions import Fraction
x = Fraction(1, 2)
which does exact arithmetic on rational numbers.
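A short sketch of what "exact" means here, using only the standard fractions module. The explicit loop is an illustrative choice so that each addition is a plain float or Fraction addition.

```python
from fractions import Fraction

third = Fraction(1, 3)
# Three thirds are exactly one; no rounding happens at any step.
thirds_sum = third + third + third
print(thirds_sum == 1)  # True

# Building a Fraction from a string avoids passing through a binary float.
tenth = Fraction("0.1")
print(tenth)  # 1/10

# Add one tenth ten times, once exactly and once with floats.
fraction_total = Fraction(0)
float_total = 0.0
for _ in range(10):
    fraction_total += tenth
    float_total += 0.1

print(fraction_total == 1)  # True
print(float_total == 1.0)   # False
```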
If you want to perform rounding but are not satisfied with the number of significant digits your hardware gives you, or with the fact that it could differ across platforms, the decimal type is your friend:
from decimal import Decimal
x = Decimal('0.5')
You'll be able to set its precision to, say, 100 significant digits, if you want to. For bank applications, you can round results to two decimal places with quantize().
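As a hedged sketch of both uses with the standard decimal module: the price value and the rounding mode below are made up for illustration, not a recommendation for any particular accounting rule.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

# Precision counts significant digits, not digits after the decimal point.
with localcontext() as ctx:
    ctx.prec = 100
    third = Decimal(1) / Decimal(3)
digit_count = len(third.as_tuple().digits)
print(digit_count)  # 100

# A fixed number of decimal places is obtained with quantize(),
# which rounds to the exponent of its argument (two places here).
price = Decimal("19.999")
rounded_price = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
print(rounded_price)  # 20.00
```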
As long as computers are stupid, we'll probably need this many different types. At least, in accordance with Pythonic principles, Python requires you to make an explicit choice about what you want from your numbers.
Moreover, it's a big misunderstanding that exact arithmetic doesn't lead to problems with rounding. Any time you round an exact value to do something useful with it for a user (e.g. print it, or add that many dollars to the user's bank account), you encounter the "strange behavior" of rounding. This is inherent to non-integer arithmetic.
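To make that concrete, here is a small sketch of splitting one dollar three ways. The scenario and the choice of ROUND_HALF_UP are assumptions for illustration only.

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction

total = Fraction(1)        # one dollar
exact_share = total / 3    # exactly 1/3, no error so far

# Paying out requires whole cents, so each share has to be rounded.
cent = Decimal("0.01")
share_as_decimal = Decimal(exact_share.numerator) / Decimal(exact_share.denominator)
paid_share = share_as_decimal.quantize(cent, rounding=ROUND_HALF_UP)
print(paid_share)  # 0.33

# Three rounded shares no longer add up to the original dollar.
paid_total = paid_share * 3
print(paid_total)  # 0.99
```

The arithmetic was exact until the moment a cent amount was needed; the missing cent comes from the rounding step, not from the number type.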
It depends on how you represent your numbers, not the language you use.
For example, if I write all my code in 8051 assembler, but have implemented a slick rational number library, then round-off isn't a problem. 1/3 is only equal to 1/3.
However, if I am using the latest snazzy dynamic language, and it uses IEEE 754 floats, then all the limitations of IEEE 754 apply.
If you need to care about the details of the numbers you generate, then you need to understand their representation and how they are manipulated by your choice of tools.
Update:
PDL is a popular library for doing scientific computing in Perl.
Since the underlying interpreters of both CPython and Perl are implemented in C, they behave like a C program.
For Python there are SciPy and NumPy for scientific computation.
You can do multiple precision calculations with Python, with external modules. The Multi Precision Math section in the official web site lists many of them.
Well, you're not immune to floating point errors in Ruby. For example:
irb(main):033:0> (2.01 * 1000).to_i
=> 2009
irb(main):034:0> ((2.01 * 1000.0) + 0.5).floor
=> 2010
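The same behavior can be reproduced in Python, since both languages use the hardware double here. This is only a sketch of the idea; round() is shown as one common alternative to the add-0.5-and-floor trick.

```python
import math

scaled = 2.01 * 1000
# The product lands just below 2010, so truncation drops a whole unit.
truncated = int(scaled)
print(truncated)  # 2009

# Rounding to the nearest integer recovers the intended value.
nearest = round(scaled)
print(nearest)  # 2010

# Equivalent to the Ruby workaround: add 0.5, then take the floor.
floored = math.floor(scaled + 0.5)
print(floored)  # 2010
```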
Sure they do!
An example from Python 2.6:
>>> 1440.0 / 900.0
1.6000000000000001
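Newer Python versions print a shorter string for the same value, which can hide the effect without removing it. A small sketch, assuming Python 3:

```python
ratio = 1440.0 / 900.0

# Recent Python versions print the shortest string that round-trips
# to the same double.
short_form = repr(ratio)
print(short_form)  # 1.6

# Asking for 17 significant digits exposes the stored value
# seen in the Python 2.6 session.
long_form = format(ratio, ".17g")
print(long_form)  # 1.6000000000000001

# Both strings describe one and the same double.
same_double = (float(short_form) == float(long_form))
print(same_double)  # True
```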
As Lutz says, since scripting languages are often implemented in C, they inherit these "features". Compensating for them in the language would undoubtedly mean some kind of trade-off in performance or portability.
When you do scientific programming, you'll always have to worry about rounding errors, no matter which programming language or numeric library you use.
Proof: Say you want to track the movement of a molecule near the border of the universe. The size of the universe is about 93 billion light-years (as far as we know), or roughly 8.8 × 10^26 m. A molecule is pretty tiny, so you'll want at least nanometer precision (10^-9 m). That's about 36 orders of magnitude.
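A rough sketch of why a plain double cannot cover that range. The light-year constant is an approximation, and double_spacing is a hypothetical helper written for this illustration.

```python
import math

LIGHT_YEAR_M = 9.4607e15            # metres per light-year (approximate)
universe_m = 93e9 * LIGHT_YEAR_M    # about 8.8e26 m
nanometre = 1e-9

def double_spacing(x):
    """Gap between x and the next larger double, for positive normal x."""
    _, exponent = math.frexp(x)
    return math.ldexp(1.0, exponent - 53)

# At this magnitude, neighbouring doubles are over 10**11 metres apart.
spacing = double_spacing(universe_m)
print(spacing)  # 137438953472.0

# Adding a nanometre to a coordinate of that size changes nothing.
unchanged = (universe_m + nanometre == universe_m)
print(unchanged)  # True
```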
For some reason, you need to rotate that molecule. That involves sin() and cos() operations and a multiply. The multiply is not an issue since the number of valid digits is simply the sum of the length of both operands. But how about sin()?
You must create the error equation to be sure that you keep enough digits so that the final result will have a known maximum error. I don't know any "simple" numeric library which can do this operation automatically (say, as part of the call to sin()). This is where you need Matlab or something similar.