Can I use rounding to ensure determinism of atomic floating point operations?

I am developing a C application which needs floating-point determinism. I would also like the floating-point operations to be fairly fast. This includes standard transcendental functions not specified by IEEE754 like sine and logarithm. The software floating-point implementations I have considered are relatively slow compared to hardware floating point, so I am considering simply rounding away one or two of the least significant bits from each answer. The loss of precision is an adequate compromise for my application, but will this suffice to ensure deterministic results across platforms? All floating-point values will be doubles.

I realize order of operations is another potential source for variance in floating-point results. I have a way to address that already.

It would be terrific if there were software implementations of the major floating-point hardware implementations in use today, so I could test a hypothesis like this directly.

Girl answered 10/2, 2012 at 23:0 Comment(10)
Most compilers today have some sort of strict-FP option. That probably does what you want.Corrida
IEEE-754 already has rules for rounding, so it's unlikely that any rules you apply on top of it will improve things. Can you be more specific about what you mean by "deterministic results"?Merca
@Mark The same code will give different results on different hardware for floating point code. I imagine that's what is meant here by deterministic.Iseabal
What you want is not deterministic behavior, but identical results across platforms. Rounding will not suffice on its own. If you want identical results from a math library, you need to write it yourself (or use an existing portable library). That's simply not a design requirement that system math libraries are made to support.Annihilation
@StephenCanon, yes, the same results on different platforms, every time. Can you give an example of a standard floating-point operation or function call which has different results on different platforms, modulo the lowest couple of significant bits? I'm guessing the variation comes from normalization, since surely all competent implementations of floating-point functions agree to high precision?Girl
A fully IEEE-754 compliant floating point implementation will give the same results on all platforms. If it doesn't then it isn't 754 compliant. All modern hardware is supposed to be IEEE 754 compliant. (Obsolete hardware such as early Intel FPUs were not compliant by default, due to excess precision in the floating point registers.)Distort
@markgz: the questioner is asking about "transcendental functions not specified by IEEE754 like sine and logarithm".Annihilation
@HarryCollins: unfortunately that's not the case (though the situation is getting better!) For a simple example, consider computing the sine of a very large input; different platforms give different answers, depending on how accurate of an approximation to pi they use to compute the argument reduction -- many platforms use a fully accurate approximation to pi (requiring about 1200 bits for double), but some platforms use only 53 or 64 or 66 bits, which results in wildly different results for large inputs.Annihilation
@HarryCollins: Do note that you can work around that particular sort of behavior by defining your own argument reduction and then calling sin or cos of the reduced value yourself (see the sketch after these comments).Annihilation
@HarryCollins: "modulo the lowest couple of significant bits" is a definite no if you want floating-point operations to be deterministic. The last bit is just as important as all the other.Tralee

As I understand it, you have a software implementation of a transcendental function like sin(x), expressed in terms of IEEE standard operations such as floating point add and multiply, and you want to ensure that you get the same answer on all machines (or, at least, all the machines that you care about).

First, understand: this will not be portable to all machines. E.g. IBM mainframe hex floating point is not IEEE, and will not give the same answers. To get exact agreement there, you would need a software implementation of the IEEE standard operations like FP add and multiply.

I'm guessing that you only care about machines that implement IEEE standard floating point. And I am also guessing that you are not worried about NaNs, since NaNs were not completely standardized by IEEE 754-1985, and two opposite implementations arose: HP and MIPS, versus almost everyone else.1

With those restrictions, how can you get variability in your calculations?

(1) If the code is being parallelized, make sure that is not happening. (It's unlikely, but some machines might.) Parallelization is a major source of result variation in FP. At least one company I know of, which cares about both reproducibility and parallelism, refuses to use FP and only uses integer arithmetic.
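As a tiny illustration of why regrouping matters: IEEE addition is correctly rounded but not associative, so a reduction that sums the terms in a different order can change the low bits of the result. For example:

#include <stdio.h>

int main(void)
{
    double x = 0.1, y = 0.2, z = 0.3;
    printf("%.17g\n", (x + y) + z);  /* typically 0.60000000000000009 */
    printf("%.17g\n", x + (y + z));  /* typically 0.59999999999999998 */
    return 0;
}
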

(2) Ensure that the machine is set up appropriately.

E.g. most machines calculate in 32- or 64-bit precision (the original C standard evaluated everything in 64-bit "double"). But Intel x86/x87 can calculate in 80 bits in registers, and round to 64 or 32 bits when spilling to memory. [1] shows how to change the x86/x87 precision control from 80 bits to 64 bits using inline assembly. Note that such code is assembly level and not portable - but most other machines already do computation in 32- or 64-bit precision, and there you don't need to worry about the x87's 80 bits.

(By the way, on x86 you can only avoid all of these issues by using SSE FP; the old legacy Intel x87 FP can never give exactly the same answers (although if you set precision control (PC) to 64 bits rather than 80, you will get the same results except when there is an intermediate overflow, since the exponent width is not affected, just the mantissa).)
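A minimal sketch of setting the x87 precision control to double precision, assuming Linux/glibc (the helper name is illustrative; other systems have their own interfaces, e.g. _controlfp on Windows):

#include <fpu_control.h>  /* glibc: _FPU_GETCW / _FPU_SETCW */

/* Switch the x87 from 80-bit extended to 53-bit (double) significands.
   The exponent range is unchanged, so intermediate overflow can still
   behave differently than on pure 64-bit hardware, as noted above. */
static void set_x87_double_precision(void)
{
    fpu_control_t cw;
    _FPU_GETCW(cw);
    cw = (cw & ~_FPU_EXTENDED) | _FPU_DOUBLE;
    _FPU_SETCW(cw);
}
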

E.g. ensure that you are using the same underflow mode on all machines: either ensure that denorms are enabled everywhere, or conversely that all machines are in flush-to-zero mode. Here it is a Hobson's choice: flush-to-zero modes are not standardized, but some machines, e.g. GPUs, simply have not had denormalized numbers. I.e. many machines have IEEE standard number FORMATS, but not actual IEEE standard arithmetic (with denorms). My druthers would be to require IEEE denorms, but if I were absolutely paranoid I would go with flush to zero, and force that flushing myself in the software.
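If you do standardize on flush-to-zero, here is a sketch of forcing it on x86 with SSE (using the standard Intel intrinsics; other architectures need their own equivalent):

#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

static void force_flush_to_zero(void)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);          /* flush subnormal results to 0 */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);  /* treat subnormal inputs as 0 */
}
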

(3) Ensure that you are using the same language options. Older C programs do all calculations in "double" (64-bit), but it is now permissible to calculate in single precision. Whatever you choose, do it the same way on all machines.
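One quick way to see what a given compiler and ABI are doing is C99's FLT_EVAL_METHOD macro from <float.h> (an illustrative check, not part of your code):

#include <float.h>
#include <stdio.h>

int main(void)
{
    /* 0: evaluate each operation in its own type's precision;
       2: evaluate everything in long double (e.g. x87 excess precision);
       -1: indeterminable. */
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    return 0;
}
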

(4) Some smaller items wrt your code:

Avoid big expressions that a compiler is likely to rearrange (if it doesn't implement strict FP switches properly)

Possibly write all of your code in a simple form like

double a = ...;
double b = ...;
double c = a * b;
double d = ...;
double e = a*d;
double f = c + e;

Rather than

f = (a*b) + (a*d);

which might be optimized to

f = a*(b+d);

I'll leave talking about compiler options for the last, because it is longer.

If you do all of these things, then your calculations should be absolutely repeatable. IEEE floating point is exact - it always gives the same answers. It is the rearranging of the calculations by the compiler on the way to the IEEE FP that introduces variability.

There should be no need for you to round off low order bits. But doing so also will not hurt, and may mask some issues. Remember: you may need to mask off at least one bit for every add....
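If you do decide to round away low-order bits, one possible way to do it (the helper name is illustrative; this simply clears the lowest n significand bits, which hides small cross-platform differences in the last bits but does not change how the value was computed):

#include <stdint.h>
#include <string.h>

static double mask_low_bits(double x, unsigned n)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);      /* view the double's bit pattern */
    bits &= ~((UINT64_C(1) << n) - 1);   /* clear the n lowest mantissa bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}
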

(5) Compiler optimizations rearranging the code in different ways on different machines. As one commenter said, use whatever switches your compiler has for strict FP.

You might have to disable all optimization for the file containing your sin code.

You might have to use volatiles.

Hopefully there are compiler switches that are more specific. E.g. for gcc:

-ffp-contract=off --- disables fused multiply-add, since not all of your target machines may have it.

-fexcess-precision=standard --- disables things like Intel x86/x87 excess precision in internal registers

-std=c99 --- specifies a fairly strict C language standard. Unfortunately not completely implemented, as far as I can tell from googling today

Make sure you do not have optimizations enabled like -funsafe-math-optimizations and -fassociative-math.
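Putting those together, an illustrative gcc invocation might look like this (the file name is a placeholder):

gcc -std=c99 -O2 -ffp-contract=off -fexcess-precision=standard \
    -fno-unsafe-math-optimizations -fno-associative-math \
    -c my_sin.c
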

Compressibility answered 25/2, 2012 at 4:16 Comment(1)
I can't believe I wrote all of the above and forgot to mention: superaccumulators are a really cool way of providing bit-precise "floating point" calculations. I put "fp" in quotes because superaccumulators are really not floating point - they are really BIG fixed-point integer values. I mention them at semipublic.comp-arch.net/wiki/Superaccumulator, with pointers to some of the classic papers.Compressibility
