How deterministic is floating point inaccuracy?

10

31

I understand that floating point calculations have accuracy issues and there are plenty of questions explaining why. My question is if I run the same calculation twice, can I always rely on it to produce the same result? What factors might affect this?

  • Time between calculations?
  • Current state of the CPU?
  • Different hardware?
  • Language / platform / OS?
  • Solar flares?

I have a simple physics simulation and would like to record sessions so that they can be replayed. If the calculations can be relied on, then I should only need to record the initial state plus any user input, and I should always be able to reproduce the final state exactly. If the calculations are not accurate, errors at the start may have huge implications by the end of the simulation.
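
For concreteness, here is a minimal sketch of that plan (the types and values are hypothetical, and it assumes a fixed timestep): record the initial state and the ordered user inputs, then replay them through the same update loop.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical illustration of the record-and-replay idea: if every
    // floating point operation is reproducible, feeding the same initial
    // state and the same ordered inputs through the same fixed-timestep
    // loop must yield the same final state.
    public sealed class InputEvent
    {
        public int Frame;      // fixed-timestep frame the input applies to
        public double Value;   // e.g. a force applied by the user
    }

    public static class Replay
    {
        public static double Run(double initialState, IEnumerable<InputEvent> inputs, int frames)
        {
            const double dt = 1.0 / 60.0;
            var byFrame = inputs.ToLookup(e => e.Frame);
            double state = initialState;
            for (int frame = 0; frame < frames; frame++)
            {
                foreach (var e in byFrame[frame])
                    state += e.Value * dt;   // apply recorded user input
                state += -9.81 * dt;         // same physics step every run
            }
            return state;
        }
    }

Whether the replayed final state matches bit for bit is exactly what I'm unsure about.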

I am currently working in Silverlight though would be interested to know if this question can be answered in general.

Update: The initial answers indicate yes, but apparently this isn't entirely clear cut as discussed in the comments for the selected answer. It looks like I will have to do some tests and see what happens.

Systemize answered 30/11, 2008 at 8:11 Comment(3)
In Silverlight you are dealing with a JIT compiler - that means math operations might automatically take advantage of SSE, MMX and other special instructions, and those or other changes might modify the exact order in which instructions are executed: A+B+C may not give the same result as C+B+A when using floating point values. As a result you'll get deterministic results when running on the same machine, but may get different results on another processor, or even a slightly different system configuration.Roar
floating point numbers ordered by their precision: decimal, double, float.Alpenhorn
It depends on the Lunar phase.Burford
25

From what I understand you're only guaranteed identical results provided that you're dealing with the same instruction set and compiler, and that any processors you run on adhere strictly to the relevant standards (i.e. IEEE 754). That said, unless you're dealing with a particularly chaotic system, any drift in calculation between runs isn't likely to result in buggy behavior.

Specific gotchas that I'm aware of:

  1. some operating systems allow you to set the mode of the floating point processor in ways that break compatibility.

  2. floating point intermediate results often use 80-bit precision in registers, but only 64-bit in memory. If a program is recompiled in a way that changes register spilling within a function, it may return different results compared to other versions. Most platforms will give you a way to force all results to be truncated to the in-memory precision.

  3. standard library functions may change between versions. I gather there are some fairly commonly encountered examples of this in gcc 3 vs 4.

  4. The IEEE standard itself allows some binary representations to differ... specifically NaN values, but I can't recall the details.
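
A practical note for anyone testing reproducibility (a minimal sketch, assuming BitConverter.DoubleToInt64Bits is available on the target framework; it is in full .NET, though I haven't checked Silverlight): compare recorded and replayed states by their raw bit patterns rather than with ==, so differing NaN payloads or -0.0 versus 0.0 also show up.

    using System;

    static class BitExact
    {
        // Compare two recorded simulation states by their raw 64-bit patterns;
        // == would treat 0.0 and -0.0 as equal and any NaN as unequal to
        // itself, hiding exactly the differences we care about here.
        public static bool Identical(double[] runA, double[] runB)
        {
            if (runA.Length != runB.Length) return false;
            for (int i = 0; i < runA.Length; i++)
            {
                if (BitConverter.DoubleToInt64Bits(runA[i]) !=
                    BitConverter.DoubleToInt64Bits(runB[i]))
                    return false;
            }
            return true;
        }
    }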

Cly answered 30/11, 2008 at 8:51 Comment(7)
@Jason Watkins: There are only two logical representations of NaN, quiet and signaling; however, there are many binary representations of NaN. Otherwise +1 good stuff.Am
#1 is particularly important on Windows. There have been versions of DirectX that would put the CPU in a lower-precision mode, leading to unexpected results.Hali
Re CPU implementations, it depends on the language you're using. C is rather nonspecific in what FP you'll get. C# and Java specify IEEE754 semantics and it's then the implementation's job to hide what the processor is actually capable of. If I ran Java on an old VAX or a broken Pentium, then I would expect to see IEEE754 behaviour, despite that not being what the processor implements, because the language definition mandates it. If I didn't the JVM would be broken by definition.Lashonda
Lots of additional information found here: gafferongames.com/networking-for-game-programmers/…Systemize
@ijw: Actually, JVM is intentionally broken for double/floats. You need strictfp keyword to make it strict IEEE754, but this may make the program way slower.Nautical
It's worthwhile to note that the 80-bit hardware was intended to be used with languages that would allow 80-bit values to be stored in memory. The intention was that most values stored to memory would be down-converted to 64 bits, but temporary results from common sub-expressions would be kept as 80. Unfortunately, the C standard failed to provide a means by which variable-argument functions could indicate whether they wanted 64-bit or 80-bit floats, and compiler vendors decided the easiest way to avoid compatibility problems was to have their "long double" type actually store 64 bits.Mohsen
While many people think of the 80-bit type as being an x87 quirk, it was actually designed to be faster to work with than 64-bit double on machines without floating-point units. The "classic" Macintosh never used an x87, but it did floating-point computations much the same way as the x87 did.Mohsen
21

The short answer is that FP calculations are entirely deterministic, as per the IEEE Floating Point Standard, but that doesn't mean they're entirely reproducible across machines, compilers, OS's, etc.

The long answer to these questions and more can be found in what is probably the best reference on floating point, David Goldberg's What Every Computer Scientist Should Know About Floating-Point Arithmetic. Skip to the section on the IEEE standard for the key details.

To answer your bullet points briefly:

  • Time between calculations and state of the CPU have little to do with this.

  • Hardware can affect things (e.g. some GPUs are not IEEE floating point compliant).

  • Language, platform, and OS can also affect things. For a better description of this than I can offer, see Jason Watkins's answer. If you are using Java, take a look at Kahan's rant on Java's floating point inadequacies.

  • Solar flares might matter, hopefully infrequently. I wouldn't worry too much, because if they do matter, then everything else is screwed up too. I would put this in the same category as worrying about EMP.

Finally, if you are doing the same sequence of floating point calculations on the same initial inputs, then things should replay exactly. However, the exact sequence can change depending on your compiler/OS/standard library, so you might get some small errors that way.

Where you usually run into problems in floating point is if you have a numerically unstable method and you start with FP inputs that are approximately the same but not quite. If your method's stable, you should be able to guarantee reproducibility within some tolerance. If you want more detail than this, then take a look at Goldberg's FP article linked above or pick up an intro text on numerical analysis.
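
To make the stability point concrete, here is a small illustration (a minimal sketch, using the logistic map purely as an example of a chaotic iteration): two starting values that differ by only 1e-15 end up completely unrelated after a hundred steps.

    using System;

    class StabilityDemo
    {
        static void Main()
        {
            // The logistic map x -> r*x*(1 - x) with r = 3.9 is chaotic, so a
            // tiny perturbation of the input grows exponentially per step.
            double a = 0.1;
            double b = 0.1 + 1e-15;
            for (int i = 0; i < 100; i++)
            {
                a = 3.9 * a * (1.0 - a);
                b = 3.9 * b * (1.0 - b);
            }
            Console.WriteLine(a);   // the two trajectories no longer resemble
            Console.WriteLine(b);   // each other at all
        }
    }

A stable method, by contrast, keeps such differences bounded, which is what makes "reproducible within some tolerance" a meaningful guarantee.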

Bobsledding answered 30/11, 2008 at 8:33 Comment(1)
See my response to @JaredPar, there are many things that can cause discrepancies between calculations on two IEEE-compliant implementations. Saying that calculations are deterministic isn't particularly helpful since deterministic doesn't necessarily mean reproducible.Philous
8

I think your confusion lies in the type of inaccuracy around floating point. Most languages implement the IEEE floating point standard. This standard lays out how the individual bits within a float/double are used to produce a number. Typically a float consists of four bytes, and a double eight bytes.

A mathematical operation between two floating point numbers will produce the same value every single time (as specified within the standard).

The inaccuracy comes in the precision. Consider an int vs a float. Both typically take up the same number of bytes (4). Yet the maximum value each number can store is wildly different.

  • int: roughly 2 billion
  • float: 3.40282347E38 (quite a bit larger)

The difference is in the middle. An int can represent every number between 0 and roughly 2 billion. A float, however, cannot. It can represent about 2 billion values between 0 and 3.40282347E38, but that leaves a whole range of values that cannot be represented. If a calculation hits one of these values, it will have to be rounded to a representable value and is hence considered "inaccurate". Your definition of inaccurate may vary :).
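
For example (a minimal sketch; the constant is just 2^24), above 2^24 a float can no longer represent every integer, so adding 1 can simply be rounded away:

    using System;

    class FloatGaps
    {
        static void Main()
        {
            float f = 16777216f;        // 2^24, exactly representable
            float g = (float)(f + 1f);  // 2^24 + 1 has no float representation
            Console.WriteLine(g == f);  // True: the +1 was lost to rounding
        }
    }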

Gambit answered 30/11, 2008 at 8:38 Comment(6)
This glosses over the reproducibility aspect, which isn't as clear cut. IEEE makes certain guarantees but these are based on strict assumptions and don't extend to all operations or library functions. @jason-watkins did a good job explaining the major gotchas in his answer.Philous
The bottom line is that if you use limited operations on the same implementation (computer/compiler/runtime) you will probably be able to reproduce the results exactly but it is very likely that results will differ slightly across different implementations, even those that support IEEE-754.Philous
Robert inspired by Jason's comments should be appended to this answer, I think.Underhand
Read the para. "Reproducibility" in the wiki article linked here. Summary: IEEE 754-1985 does not guarantee reproducibility between implementations. 754-2008 encourages it but still doesn't mandate it. If your language uses 754, it will almost certainly be a version prior to 2008.Sammer
This is a good explanation of floating point accuracy in general but it doesn't address the question asked at all; Jason Watkins's answer does and should be the accepted answer in my opinion.Philous
<pedant>every natural number</pedant>Lashonda
4

Also, while Goldberg is a great reference, the original text is also wrong: IEEE 754 is not guaranteed to be portable. I can't emphasize this enough given how often this statement is made based on skimming the text. Later versions of the document include a section that discusses this specifically:

Many programmers may not realize that even a program that uses only the numeric formats and operations prescribed by the IEEE standard can compute different results on different systems. In fact, the authors of the standard intended to allow different implementations to obtain different results.

Cly answered 30/11, 2008 at 9:1 Comment(0)
3

This answer in the C++ FAQ probably describes it the best:

http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18

It is not only that different architectures or compilers might give you trouble; floating point numbers already behave in weird ways within the same program. As the FAQ points out, if y == x is true, that can still mean that cos(y) == cos(x) will be false. This is because the x86 CPU calculates the value with 80 bits, while the value is stored as 64 bits in memory, so you end up comparing a truncated 64-bit value with a full 80-bit value.

The calculations are still deterministic, in the sense that running the same compiled binary will give you the same result each time, but the moment you adjust the source a bit, change the optimization flags or compile it with a different compiler, all bets are off and anything can happen.

Practically speaking, it is not quite that bad. I could reproduce simple floating point math with different versions of GCC on 32-bit Linux bit for bit, but the moment I switched to 64-bit Linux the results were no longer the same. Demo recordings created on 32-bit wouldn't work on 64-bit and vice versa, but would work fine when run on the same arch.

Joshi answered 17/6, 2009 at 16:46 Comment(0)
2

Since your question is tagged C#, it's worth emphasising the issues faced on .NET:

  1. Floating point maths is not associative; that is, (a + b) + c is not guaranteed to equal a + (b + c) (see the sketch after this list).
  2. Different compilers will optimize your code in different ways, and that may involve re-ordering arithmetic operations.
  3. In .NET the CLR's JIT compiler will compile your code on the fly, so compilation is dependent upon the version of .NET on the machine at runtime.
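
As a quick illustration of point 1 (a minimal sketch; any three values with inexact binary representations would do):

    using System;

    class Associativity
    {
        static void Main()
        {
            double a = 0.1, b = 0.2, c = 0.3;
            Console.WriteLine((a + b) + c == a + (b + c));   // False
            Console.WriteLine((a + b) + c - (a + (b + c)));  // a last-bit difference, roughly 1.1E-16
        }
    }

So if a compiler or JIT re-associates those additions between two builds, a bit-exact replay can break even though each individual operation is IEEE-correct.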

This means that you shouldn't rely upon your .NET application producing the same floating point calculation results when run on different versions of the .NET CLR.

For example, in your case, if you record the initial state and inputs to your simulation, then install a service pack that updates the CLR, your simulation may not replay identically the next time you run it.

See Shawn Hargreaves's blog post Is floating point math deterministic? for further discussion relevant to .NET.

Holden answered 19/4, 2011 at 6:57 Comment(0)
1

Sorry, but I can't help thinking that everybody is missing the point.

If the inaccuracy is significant to what you are doing then you should look for a different algorithm.

You say that if the calculations are not accurate, errors at the start may have huge implications by the end of the simulation.

That, my friend, is not a simulation. If you are getting hugely different results due to tiny differences in rounding and precision, then the chances are that none of the results has any validity. Just because you can repeat the result does not make it any more valid.

On any non-trivial real world problem that includes measurements or non-integer calculation, it is always a good idea to introduce minor errors to test how stable your algorithm is.

Alessandraalessandria answered 30/11, 2008 at 9:12 Comment(2)
No I think you've missed the point. The question is really about the repeatability not about accuracy.Conspectus
In this particular case Anthony is right, I'm looking for repeatability rather than accuracy as I'm trying to create something interesting rather than a true 'simulation'. Perhaps a better word could have been used there... playground?Systemize
0

Hm. Since the OP asked about C#:

Is the C# bytecode JIT deterministic, or does it generate different code between different runs? I don't know, but I wouldn't trust the JIT.

I can think of scenarios where the JIT has some quality-of-service features and decides to spend less time on optimization because the CPU is doing heavy number crunching somewhere else (think background DVD encoding). This could lead to subtle differences that may result in huge differences later on.

Also, if the JIT itself gets improved (maybe as part of a service pack) the generated code will change for sure. The 80-bit internal precision issue has already been mentioned.

Handbill answered 30/11, 2008 at 13:41 Comment(0)
-1

This is not a full answer to your question, but here is an example demonstrating that double calculations in C# are non-deterministic. I don't know why, but seemingly unrelated code can apparently affect the outcome of a downstream double calculation.

  1. Create a new WPF application in Visual Studio Version 12.0.40629.00 Update 5, and accept all the default options.
  2. Replace the contents of MainWindow.xaml.cs with this:

    using System;
    using System.Windows;
    
    namespace WpfApplication1
    {
        /// <summary>
        /// Interaction logic for MainWindow.xaml
        /// </summary>
        public partial class MainWindow : Window
        {
            public MainWindow()
            {
                InitializeComponent();
                Content = FooConverter.Convert(new Point(950, 500), new Point(850, 500));
            }
        }
    
        public static class FooConverter
        {
            public static string Convert(Point curIPJos, Point oppIJPos)
            {
                var ij = " Insulated Joint";
                var deltaX = oppIJPos.X - curIPJos.X;
                var deltaY = oppIJPos.Y - curIPJos.Y;
                var teta = Math.Atan2(deltaY, deltaX);
                string result;
                if (-Math.PI / 4 <= teta && teta <= Math.PI / 4)
                    result = "Left" + ij;
                else if (Math.PI / 4 < teta && teta <= Math.PI * 3 / 4)
                    result = "Top" + ij;
                else if (Math.PI * 3 / 4 < teta && teta <= Math.PI || -Math.PI <= teta && teta <= -Math.PI * 3 / 4)
                    result = "Right" + ij;
                else
                    result = "Bottom" + ij;
                return result;
            }
        }
    }
    
  3. Set build configuration to "Release" and build, but do not run in Visual Studio.

  4. Double-click the built exe to run it.
  5. Note that the window shows "Bottom Insulated Joint".
  6. Now add this line just before "string result":

    string debug = teta.ToString();
    
  7. Repeat steps 3 and 4.

  8. Note that the window shows "Right Insulated Joint".

This behavior was confirmed on a colleague's machine. Note that the window consistently shows "Right Insulated Joint" if any of the following are true: the exe is run from within Visual Studio, the exe was built using the Debug configuration, or "Prefer 32-bit" is unchecked in project properties.

It's quite difficult to figure out what's going on, since any attempt to observe the process appears to change the result.

Santosantonica answered 4/3, 2016 at 0:27 Comment(0)
-4

Very few FPUs meet the IEEE standard (despite their claims), so running the same program on different hardware will indeed give you different results. The differences are likely to be in corner cases that you should already avoid as part of using an FPU in your software.

IEEE bugs are often patched in software, and are you sure that the operating system you are running today includes the proper traps and patches from the manufacturer? What about before or after the OS has an update? Are all bugs removed and bug fixes added? Is the C compiler in sync with all of this, and is the C compiler producing the proper code?

Testing this may prove futile. You won't see the problem until you deliver the product.

Observe FP rule number 1: never use an if(something==something) comparison. And rule number two, IMO, would have to do with ASCII-to-FP or FP-to-ASCII conversion (printf, scanf, etc.). There are more accuracy and bug problems there than in the hardware.
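
Rule number 1 usually translates into a tolerance comparison along these lines (a minimal sketch; the tolerance is arbitrary and should be scaled to your data):

    using System;

    static class FloatCompare
    {
        // Compare floating point values within a tolerance instead of with ==.
        // The tolerance is scaled by the magnitude of the inputs so the check
        // works for both small and large values.
        public static bool NearlyEqual(double a, double b, double tol)
        {
            double scale = Math.Max(1.0, Math.Max(Math.Abs(a), Math.Abs(b)));
            return Math.Abs(a - b) <= tol * scale;
        }
    }

    // Usage: if (FloatCompare.NearlyEqual(x, y, 1e-9)) { ... } instead of if (x == y) { ... }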

With each new generation of hardware (density), the effects from the sun are more apparent. We already have problems with SEUs on the planet's surface, so independent of floating point calculations, you will have problems (few vendors have bothered to care, so expect crashes more often with new hardware).

By consuming enormous amounts of logic, the FPU is likely to be very fast (a single clock cycle), not any slower than an integer ALU. Do not confuse this with modern FPUs being as simple as ALUs; FPUs are expensive. (ALUs likewise consume more logic for multiply and divide to get those down to one clock cycle, but it's not nearly as much as the FPU.)

Keep to the simple rules above, study floating point a bit more, and understand the warts and traps that go with it. You may want to check for infinities or NaNs periodically. Your problems are more likely to be found in the compiler and operating system than in the hardware (in general, not just FP math). Modern hardware (and software) is, these days, by definition full of bugs, so just try to be less buggy than what your software runs on.

Portland answered 30/11, 2008 at 15:59 Comment(0)
