Strange multiplication result
In my C++ code I have these multiplications, with all variables of type double[]:

f1[0] = (f1_rot[0] * xu[0]) + (f1_rot[1] * yu[0]); 
f1[1] = (f1_rot[0] * xu[1]) + (f1_rot[1] * yu[1]); 
f1[2] = (f1_rot[0] * xu[2]) + (f1_rot[1] * yu[2]); 

f2[0] = (f2_rot[0] * xu[0]) + (f2_rot[1] * yu[0]); 
f2[1] = (f2_rot[0] * xu[1]) + (f2_rot[1] * yu[1]);
f2[2] = (f2_rot[0] * xu[2]) + (f2_rot[1] * yu[2]);

corresponding to these values

Force Rot1 : -5.39155e-07, -3.66312e-07
Force Rot2 : 4.04383e-07, -1.51852e-08

xu: 0.786857, 0.561981, 0.255018
yu: 0.534605, -0.82715, 0.173264

F1: -6.2007e-07, -4.61782e-16, -2.00963e-07
F2: 3.10073e-07, 2.39816e-07, 1.00494e-07

This multiplication in particular produces a wrong value, -4.61782e-16, instead of the expected 1.04745e-13:

f1[1] = (f1_rot[0] * xu[1]) + (f1_rot[1] * yu[1]);  

I verified the other multiplications by hand on a calculator, and they all seem to produce the correct values.

This code is compiled with Open MPI, and the result above is from a single-processor run. Running on multiple processors gives different values; for example, 40 processors produce 1.66967e-13 as the result of the F1[1] multiplication.

Is this some kind of MPI bug? Or a precision problem with the type? And why does it work fine for the other multiplications?

Cerf answered 26/6, 2014 at 14:3 Comment(6)
MPI hints at a race condition: math is almost never the bug. (Charitacharitable)
What are f1_rot[0] and f1_rot[1]? (Deraign)
Yes, I'd also say that you produced a race condition. Maybe you forgot some barrier before the code you showed (f1_rot, xu, etc. not being set reliably before the multiplications are executed). (Hidrosis)
@Deraign Force Rot1 and Force Rot2. (Cerf)
Can't reproduce. (Vassar)
@MadScienceDreams, data races in a paradigm that is modelled on processes not sharing memory, really? (Factor)

Your problem is a textbook case of what is called catastrophic cancellation. As we know, a double-precision float carries only about 16 significant decimal digits.

f1[1] = (f1_rot[0] * xu[1]) + (f1_rot[1] * yu[1])
      = -3.0299486605499998e-07 + 3.0299497080000003e-07
      = 1.0474500005332475e-13

This is what we obtain with the numbers given in your example. Notice that (-7) - (-13) = 6, which corresponds to the number of significant digits in the values you posted (e.g. -5.39155e-07, -3.66312e-07: each mantissa carries 6 significant digits). This suggests that single-precision values were used here.

I am sure that in your actual calculations the precision of the numbers is higher; that is why you find a more accurate result.

Anyway, if you use single-precision floats, you can't expect better precision. With double precision you can get up to 16 significant digits. You shouldn't trust a difference between two numbers unless their relative difference is larger than the machine epsilon:

  • Single-precision floats: (a - b) / b >= ~1e-7
  • Double-precision floats: (a - b) / b >= ~4e-16

For further information, see these examples ... or the table in this article ...

Swearingen answered 26/6, 2014 at 14:52 Comment(4)
OK, so now I understand how I get these numbers. However, does this mean there is a loss of precision when moving to a larger number of processors? How does that happen, and can it be fixed? (Cerf)
It seems like you are doing a matrix product. For big matrices, more than two terms are summed. If you are interested in the coefficients, some techniques can, in theory, be used to reduce the loss of information: for example, summing negative and positive numbers separately, from the smallest (in absolute value) to the biggest. There is also the so-called Kahan summation ... which may be better adapted to matrices. Anyway, you won't do this manually; some libraries do it well. (Swearingen)
In general, you won't need high precision for all the coefficients of your matrix; indeed, a matrix is a set of many values that have a meaning together. When you notice a loss of precision, it won't concern the biggest coefficients; only the negligible ones will be affected. You may, however, obtain an ill-conditioned matrix if the contrast between the smaller and bigger coefficients is high. (Swearingen)
To increase the precision of your numbers, there are (symbolic) libraries that do it very well. You may, however, notice a slowdown in your calculations. You can try GNU ... for instance. (Swearingen)
