Do scripters have to consider roundoff error?
I'm studying C, and the idea of guard digits and rounding errors came up. Do practitioners of scripting languages (I'm thinking of Python and Perl here) need to worry about this stuff? What if they are doing scientific programming?

Sanasanabria answered 31/8, 2009 at 7:9 Comment(2)
Depends on which scripting language. I'm sure that 99% of PHP programmers wouldn't even know what you're talking about ... :-SForest
@Vince, if you don't mind, I interchanged parts of your post so that the title has the short words and the body the longer ones.Panlogism
R
10

It depends. Doubles behave the same everywhere, so if you do math with doubles, you are going to have the same problem in any language. If you use a native arbitrary-precision type, then no, it's not a problem. Consider:

use Math::BigFloat;
my $big   = Math::BigFloat->new("1_000_000_000_000_000_000_000");
my $small = Math::BigFloat->new("0.000000000000000000000000001"); 
print $big + $small;

(Or, if you really want to hide what's going on:

use bignum;
print 1_000_000_000_000_000_000_000 + 0.000000000000000000000000001

)

As expected, this yields:

1000000000000000000000.000000000000000000000000001

Also as expected, this is not done in one CPU instruction.

Roentgenology answered 31/8, 2009 at 7:35 Comment(0)
S
7

I would have to disagree with Lutz... While the rounding errors you mentioned do exist in Python/Perl/Ruby, they have absolutely nothing to do with the languages being implemented in C. The problem goes deeper than that.

Floating-point numbers, like all data, are represented in binary on modern computers. Just as there are numbers with periodic decimal representations (e.g., 1/3 = 0.333333...), there are also numbers with periodic binary representations (e.g., 1/10 = 0.0001100110011...). Since these numbers cannot be exactly represented in (a finite amount of) computer memory, any calculations involving them will introduce error.
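You can watch this happen from Python itself: converting a float to a Decimal or Fraction exposes the exact value the hardware actually stored (a minimal sketch using only the standard library):

from decimal import Decimal
from fractions import Fraction

# The double closest to 0.1, written out exactly:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# The same stored value as an exact ratio of integers:
print(Fraction(0.1))
# 3602879701896397/36028797018963968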

This can be worked around by using high-precision math libraries, which represent the numbers either as the two integers of a fraction (i.e., "numerator = 1, denominator = 10") or as a string, instead of using a native binary representation. However, because of the extra work involved in doing any calculations on numbers that are stored as something else, these libraries necessarily slow down any math that has to go through them.
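Python ships one such library in its standard fractions module. A small sketch of the difference, summing 1/10 (which is periodic in binary) ten times:

from fractions import Fraction

# Ten float additions of 0.1 accumulate representation error:
print(sum(0.1 for _ in range(10)))              # 0.9999999999999999

# Ten exact rational additions of 1/10 do not:
print(sum(Fraction(1, 10) for _ in range(10)))  # 1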

Sullivan answered 31/8, 2009 at 7:39 Comment(3)
Actually, they have everything to do with the language being implemented in C. Since they reuse the C operations (not the assembly operations), they have very similar semantics to C. If there were some processor with funky floating point semantics, the Python/etc implementation would still be guaranteed certain behaviour by the C compiler. If there were infinite precision floating point available on an architecture (somehow, let's assume magic), the Python/etc implementation could still only rely on what is provided by standard C.Milkmaid
If you don't rely on the underlying behaviour of libc, you don't carry around its baggage. As you say, use a different library and you don't have the same problem. Sure, they may be slower, but if you want wrong answers, I have some really fast algorithms for those :)Kickstand
@Paul: Unless I'm sorely mistaken, the IEEE floating-point spec, which defines the native binary representation of floating-point numbers, is implemented in the CPU hardware, not in libc. It is the use of the CPU's binary representation which both allows for fast math (through CPU-level floating point operations) and makes you susceptible to errors when dealing with values which cannot be represented precisely in that format.Sullivan
P
6

There are several types of non-integer numbers in Python:

x = 1.0 / 2   # note: in Python 2, plain 1 / 2 is integer division and gives 0

would give you the standard float. Its type is float, it's essentially the same as in C, it's handled by the hardware, and it has the same problems as every other float in the world.
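For example:

>>> 0.1 + 0.2
0.30000000000000004
>>> 0.1 + 0.2 == 0.3
False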

However, there is also a fractional type:

from fractions import Fraction

x = Fraction(1, 2)

which gives you exact arithmetic with rational numbers.

If you do want rounding, but are not satisfied with the number of significant digits your hardware provides, or with the fact that it could differ across platforms, the Decimal type is your friend:

from decimal import Decimal

x = Decimal('0.5')

You can set its precision to, say, 100 digits if you want to, or to 2 for banking applications.
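Setting that precision is one line on the context object; a quick sketch:

from decimal import Decimal, getcontext

getcontext().prec = 100           # 100 significant digits from here on
print(Decimal(1) / Decimal(7))    # 0.142857142857... to 100 digits

getcontext().prec = 28            # back to the default
print(Decimal('2.675').quantize(Decimal('0.01')))  # 2.68 (banker's rounding)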

As long as computers are stupid, we'll probably need this many different types. At least, in accordance with Pythonic principles, Python requires you to make an explicit choice about what you want from your numbers.

Moreover, it's a big misunderstanding to think that exact arithmetic avoids problems with rounding. Any time you round an exact value to do something useful with it for a user (e.g., print it, or add that many dollars to the user's bank account), you encounter the "strange behavior" of rounding. This is inherent to non-integer arithmetic.
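A concrete version of that pitfall, splitting an exact Decimal amount three ways (the amounts are just illustrative):

from decimal import Decimal

total = Decimal('100.00')
share = (total / 3).quantize(Decimal('0.01'))  # round each share to cents
print(share)       # 33.33
print(share * 3)   # 99.99 -- one cent disappeared at the rounding step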

Panlogism answered 31/8, 2009 at 7:28 Comment(1)
A nitpick: Python float is a C double.Shovelboard
E
4

It depends on how you represent your numbers, not the language you use.

For example, if I write all my code in 8051 assembler, but have implemented a slick rational number library, then round-off isn't a problem. 1/3 is only equal to 1/3.

However, if I am using the latest snazzy dynamic language, and it uses IEEE 754 floats, then all the limitations of IEEE 754 apply.

If you need to care about the details of the numbers you generate, then you need to understand their representation and how they are manipulated by your choice of tools.
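In Python, for instance, the standard struct module lets you peek at the underlying IEEE 754 bit pattern (a sketch, assuming the usual 64-bit double):

import struct

def double_bits(x):
    # Reinterpret the 8 bytes of a double as one 64-bit integer.
    (n,) = struct.unpack('>Q', struct.pack('>d', x))
    b = format(n, '064b')
    return b[0] + ' ' + b[1:12] + ' ' + b[12:]   # sign | exponent | fraction

print(double_bits(0.5))   # 0.5 is a power of two: all fraction bits are zero
print(double_bits(0.1))   # 0.1's repeating pattern, rounded off at 52 bits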

Update:

PDL is a popular library for doing scientific computing in Perl.

Extinction answered 31/8, 2009 at 7:36 Comment(0)
J
2

Since the underlying interpreters of both Python (CPython) and Perl are implemented in C, they behave like C programs.

For Python, there are SciPy and NumPy for scientific computation.

Jannet answered 31/8, 2009 at 7:13 Comment(2)
Note that SciPy and NumPy do not currently offer arbitrary precision calculations, though.Costermansville
It's not being implemented in C that gives them the C behaviour; it's that they use C's numerics, which they don't have to do (but it would be a lot harder not to).Kickstand
C
1

You can do multiple-precision calculations in Python with external modules. The Multi Precision Math section on the official web site lists many of them.
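For instance, with the third-party mpmath module (to pick one well-known option, assuming it is installed), the working precision is a single setting:

from mpmath import mp

mp.dps = 50          # ask for 50 significant decimal digits
print(mp.sqrt(2))    # 1.4142135623730950488... to 50 digits
print(mp.pi)         # 3.1415926535897932384... to 50 digits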

Costermansville answered 31/8, 2009 at 8:2 Comment(0)
M
0

Well, you're not immune to floating-point errors in Ruby. For example, 2.01 has no exact binary representation, so 2.01 * 1000 actually comes out just below 2010 and truncates to 2009:

irb(main):033:0> (2.01 * 1000).to_i
=> 2009
irb(main):034:0> ((2.01 * 1000.0) + 0.5).floor
=> 2010
Michaelmichaela answered 31/8, 2009 at 7:14 Comment(0)
A
0

Sure they do!

An example from Python 2.6:

>>> 1440.0 / 900.0
1.6000000000000001

As Lutz says, since scripting languages are often implemented in C, they inherit these "features". Compensating for them in the language would undoubtedly mean some kind of trade-off in performance or portability.

Allowed answered 31/8, 2009 at 7:23 Comment(2)
BTW, if you tried your example in Python 3.1, the answer you'd see is "1.6". The underlying bits of the results are the same as in Python 2.x (so all the issues with propagation of roundoff errors et al remain) but the repr method for floats in 3.x has been changed to use Gay's algorithm to produce the shortest possible fp representation.Dermott
Cool, thanks for the note :) I've updated my answer to specify the code sample is from Python 2.6.Allowed
S
0

When you do scientific programming, you'll always have to worry about rounding errors, no matter which programming language or numeric library you use.

Proof: Say you want to track the movement of a molecule near the border of the universe. The size of the universe is about 93 billion light-years, or roughly 10^27 meters (as far as we know). A molecule is pretty tiny, so you'll want at least nanometer precision (10^-9 m). That's about 36 orders of magnitude.

For some reason, you need to rotate that molecule. That involves sin() and cos() operations and a multiply. The multiply is not an issue, since the number of digits in an exact product is simply the sum of the lengths of both operands. But how about sin()?

You must work out the error equation to be sure that you keep enough digits so that the final result has a known maximum error. I don't know of any "simple" numeric library which can do this automatically (say, as part of the call to sin()). This is where you need Matlab or something similar.
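A small Python illustration of the kind of trap that such error analysis catches: two algebraically identical formulas with very different numerical behavior.

import math

x = 1e-8
# 1 - cos(x): catastrophic cancellation, since cos(x) rounds to exactly 1.0 here.
naive = 1.0 - math.cos(x)
# 2*sin(x/2)**2 is the same quantity algebraically, but numerically stable.
stable = 2.0 * math.sin(x / 2) ** 2

print(naive)    # 0.0
print(stable)   # ~5e-17, close to the true value x**2/2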

Salerno answered 31/8, 2009 at 8:8 Comment(0)
