Why does adding big to small in floating point introduce more error?

Not sure why, but if I keep adding a small fp increment to a big fp number, it seems that the accumulated error is bigger:

#include <iostream>
#include <cmath>

int main() {
    std::cout.precision(50);

    const int numLoops = 1000;
    const long length = 10000;
    const double rate = 0.1;

    long totalLength = length * numLoops;
    long long steps = (long long)(totalLength / rate);

    double sum = 0.0;          // accumulates over the full run, grows toward totalLength
    double sumRemainder = 0.0; // gets the same increments, but is wrapped back into [0, length)
    for (long long step = 0; step < steps; step++) {
        if (sumRemainder >= length) {
            sumRemainder = std::fmod(sumRemainder, length);
        }

        sum += rate;
        sumRemainder += rate;
    }

    std::cout << "                  length: " << length << std::endl;
    std::cout << "               num loops: " << numLoops << std::endl;
    std::cout << "                    rate: " << rate << std::endl;
    std::cout << "                   steps: " << steps << std::endl << std::endl;
    std::cout << "                     sum: " << sum << std::endl;
    std::cout << "           sum remainder: " << sumRemainder << std::endl;
    std::cout << "                   error: " << std::fabs(totalLength - sum) << std::endl;
    std::cout << "         error remainder: " << std::fabs(length - sumRemainder) << std::endl;
    std::cout << std::endl;
}

The only difference between the two sums is that one runs over all steps unchanged, while on the other I simply fmod the result once it reaches a limit (thus clamping it back to a small value):

sumRemainder = std::fmod(sumRemainder, length);

That seems to be why it introduces a lower error while summing the same amount: 1.884836819954216480255126953125e-05 vs 0.01887054927647113800048828125.
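
For reference, the per-step effect can be isolated with a separate minimal snippet; the magnitudes 1000.0 and 9.0e6 below are just stand-ins for where the two accumulators sit:

#include <iostream>
#include <iomanip>

int main() {
    const double rate = 0.1;
    const double small = 1000.0;   // roughly where sumRemainder stays
    const double big = 9.0e6;      // roughly where sum ends up

    // What one addition of `rate` actually contributes at each magnitude.
    std::cout << std::setprecision(20);
    std::cout << "increment near 1e3: " << (small + rate) - small << std::endl;
    std::cout << "increment near 9e6: " << (big + rate) - big << std::endl;
}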

Can someone please explain to me why this happens with a clever example?

Forelli asked 4/11, 2018 at 11:0 Comment(5)
@MatthieuBrucher: no!!! My question is clear. It's about why the error is heavier when adding BIG to SMALL. Please read again.Forelli
The reason is the same either way. The duplicate explains why and is valid.Pyatt
@πάνταῥεῖ: No, the marked original does not explain why. It states some rules about floating-point arithmetic but does not explain the question asked here. Asserting that it does is like asserting that the C++ standard answers all questions about C++. But we do not mark all C++ questions as duplicates of one that says there is a C++ standard and states a few things about it. We explain, in context, how the C++ rules give rise to particular behaviors. In contrast, floating-point questions are indiscriminately closed and marked as duplicates without explaining or educating. It is a shame.Rhoden
@EricPostpischil Feel free to cast your vote for reopening the question if you think it's worth it.Pyatt
@πάνταῥεῖ: I likely will when I have time to draft an answer. But my argument is not just about this one question. A number of people indiscriminately close floating-point questions as duplicates when there is significant material to address. It seems like a number of people with experience in diverse fields (C++, processors, whatever) treat floating-point as something that must be accepted as imprecise and tolerated without understanding. They close floating-point questions on that basis, which interferes with people answering with floating-point information and education. Please stop.Rhoden

To handle large or small numbers, a floating-point format scales numbers. A fixed number of digits are used for the significand of a number, and they are scaled by some base (often 2) raised to a power, called the exponent. There is also a sign, + or −, although the sign is sometimes included in the significand.
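
A small sketch of that decomposition using std::frexp, which splits a double into a significand and a power-of-two exponent (it normalizes the significand into [0.5, 1) rather than [1, 2), but the idea is the same):

#include <iostream>
#include <cmath>

int main() {
    int exponent;
    // std::frexp splits a value into significand * 2^exponent, significand in [0.5, 1).
    double significand = std::frexp(-11.0, &exponent);
    std::cout << "-11 = " << significand << " * 2^" << exponent << std::endl;  // -11 = -0.6875 * 2^4
}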

For example, using a binary format, a significand of 1.011₂ represents 1+3/8 when scaled with an exponent of zero (1.011₂•2⁰ = 1+3/8), 11 when scaled with an exponent of three (1.011₂•2³ = 11), and 11/32 when scaled with an exponent of −2 (1.011₂•2⁻² = 11/32).
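
As a quick check, std::ldexp(s, e) computes s•2ᵉ, so those three scalings can be reproduced directly (a small sketch; the 0b binary literal needs C++14):

#include <iostream>
#include <cmath>

int main() {
    const double significand = 0b1011 / 8.0;  // 1.011 in binary = 1 + 3/8 = 1.375
    // std::ldexp(s, e) computes s * 2^e.
    std::cout << std::ldexp(significand, 0) << std::endl;    // 1.375
    std::cout << std::ldexp(significand, 3) << std::endl;    // 11
    std::cout << std::ldexp(significand, -2) << std::endl;   // 0.34375 (= 11/32)
}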

There are a fixed number of digits available for the significand. So only certain numbers can be represented. When any arithmetic is performed, the exact mathematical result is rounded to the nearest representable number. (The common default rule for rounding is to round to the nearest representable value, and, in case of ties, to round so the low digit is even.)
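
In binary64 (the usual double format), for instance, the doubles near 2⁵³ are 2 apart, so an exact sum that lands between them has to be rounded. A small sketch:

#include <iostream>
#include <cmath>

int main() {
    double big = 9007199254740992.0;   // 2^53; adjacent doubles here are 2 apart
    // The exact sum 2^53 + 1 lies halfway between representable values,
    // so it is rounded (ties-to-even picks 2^53 again).
    std::cout << std::boolalpha << (big + 1.0 == big) << std::endl;        // true
    // Spacing to the next representable double above 2^53:
    std::cout << std::nextafter(big, INFINITY) - big << std::endl;         // 2
}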

For example, in a decimal format which has three decimal digits in its significand, consider adding the numbers 567 (5.67•10²) and 789 (7.89•10²). The result is 1356, but that has too many digits. So it is rounded to 1360 (1.36•10³). There is a rounding error of 4.
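
The same toy format can be mimicked in code; roundTo3Digits below is a purely illustrative helper (not a standard function) that keeps three significant decimal digits:

#include <iostream>
#include <cmath>

// Illustrative helper: round x to 3 significant decimal digits,
// mimicking a decimal format with a 3-digit significand.
double roundTo3Digits(double x) {
    if (x == 0.0) return 0.0;
    double scale = std::pow(10.0, 2.0 - std::floor(std::log10(std::fabs(x))));
    return std::round(x * scale) / scale;
}

int main() {
    double exact = 567.0 + 789.0;            // 1356, which needs four digits
    double stored = roundTo3Digits(exact);   // 1360
    std::cout << "exact: " << exact << ", stored: " << stored
              << ", rounding error: " << stored - exact << std::endl;  // error: 4
}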

So, when working with floating-point numbers, there are rounding errors that are some fraction of the position value of the least significant digit in the significand. When the numbers have bigger exponents, the possible errors are larger. The error is always between zero and half the position value of the least significant digit (because any number between two representable numbers is either at the midpoint or closer to one or the other, so it is never necessary to move a number more than half the distance between representable numbers.)

Thus, when working with larger numbers, the rounding errors are larger.
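
Tying this back to the question: near 10,000,000 (where sum ends up) the spacing between adjacent doubles is roughly a thousand times larger than near 10,000 (where sumRemainder stays), so each addition of 0.1 can be rounded by a correspondingly larger amount, which matches the roughly thousand-fold difference between the two errors shown above. A small sketch of those spacings:

#include <iostream>
#include <cmath>

int main() {
    // Distance to the next representable double at the two magnitudes involved.
    std::cout << "spacing near 1e4: " << std::nextafter(1.0e4, INFINITY) - 1.0e4 << std::endl;  // ~1.8e-12
    std::cout << "spacing near 1e7: " << std::nextafter(1.0e7, INFINITY) - 1.0e7 << std::endl;  // ~1.9e-9
}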

Rhoden answered 5/11, 2018 at 0:41 Comment(0)
