Handling extremely small numbers in C++

Asked 8/7, 2015 at 20:47 Answered 9/7, 2015 at 6:42

Let a and b be two numbers between 0 and 1. How can I calculate pow(a,10000)/(pow(a,10000)+pow(b,10000))?

Example: the following code gives -nan as output instead of 0.5:

double a = 0.5, b = 0.5;
cout << pow(a, 10000) / (pow(a, 10000) + pow(b, 10000));

Murrhine answered 8/7, 2015 at 20:47 Comment(8)

You should probably write your own class for that--the functions must all be evaluated before they're used in the operator functions, and so you're trying to retrieve information from a number that doesn't have the accuracy you want--in quite a few cases, a double can't store numbers that small. If you go sufficiently small, a long double would also be insufficient. So make your own small number class and overload the power function and operators for it, and then write a function to convert to double. – Spirant 8/7, 2015 at 20:54

Try treating your numbers as a fraction, ex: 0.5 -> 5/10, operate on the numerator and denominator separately and recalculate in the end. – Barnwell 8/7, 2015 at 20:57

Consider using Fixed Point numeric types, they may be more accurate. – Necrophobia 8/7, 2015 at 20:58

Can't you just convert this to 1 / (1 + pow(b/a, 10000)), only compute pow(b/a, 10000) and then return 1/(1+result) ? This would depend on the size of result which would depend ultimately on size of b/a. – Grimmett 8/7, 2015 at 21:10

Fixed point arithmetic will just make the problem even less accurate @Thomas :D – Stanch 8/7, 2015 at 22:21

@ThomasMatthews: Fixed point would have to be HUGE to accommodate numbers like pow(0.5, 10000). I doubt fixed point is a solution for such numbers. – Chapel 9/7, 2015 at 8:7

@user1952500: since a and b are the same (0.5), a/b = 1. – Chapel 9/7, 2015 at 8:11

In that case the answer is 1/2 = 0.5, no further computation needed. – Grimmett 9/7, 2015 at 8:47

There is no simple generic solution to your problem. Writing computer programs that deal with very small and/or very big numbers is an "art of science", often called numerical analysis. Typical tricks involve scaling before calculating.

In your case each pow(..) is rounded to zero because that is the closest representable value to the real result. After that you do 0/(0 + 0) which is NaN, i.e. Not a Number.
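To make the failure concrete, here is a minimal sketch that evaluates the expression exactly as written in the question. The function name naive_ratio is made up for illustration, and the comments assume an IEEE 754 double, which is what practically every current platform uses.

```cpp
#include <cmath>

// Hypothetical helper: a^n / (a^n + b^n) evaluated term by term.
double naive_ratio(double a, double b, int n) {
    double pa = std::pow(a, n);  // 0.5^10000 is far below the smallest double, so this is 0.0
    double pb = std::pow(b, n);  // same here
    return pa / (pa + pb);       // 0.0 / 0.0 is NaN
}
```

With a small exponent such as 10 the intermediate powers are still representable and the function returns 0.5 as expected; the NaN only appears once both powers have collapsed to zero.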

You could go for long double:

long double a = 0.5;
long double b = 0.5;
long double c =pow(a,10000);
long double d =pow(b,10000);
cout << c << endl;
cout << d << endl;
cout<<c/(c+d);

which results in:

5.01237e-3011
5.01237e-3011
0.5

but that will only help for "a while". Increase the power a bit (just one extra zero) and the problem is back.

long double a = 0.5;
long double b = 0.5;
long double c =pow(a,100000);
long double d =pow(b,100000);
cout << c << endl;
cout << d << endl;
cout<<c/(c+d);

0
0
nan

So you need to write a very complicated class yourself or study how this is handled in numerical analysis.

Start here: https://en.wikipedia.org/wiki/Numerical_analysis or https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic

If you know the exponent is the same in all three places, then you can do:

double a = 0.5,b = 0.5; 
int exp = 10000;
//cout<<pow(a,exp)/(pow(a,exp)+pow(b,exp)); is the same as:
cout<<1/(1+pow(b/a,exp));

It will work better for most a and b values, but don't expect any precision. If a and b differ just a little bit, you'll get 0 (for a less than b) or 1 (for b less than a). But the NaN part will be solved.
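As a rough sketch, the rewritten form can be wrapped in a function like the one below. The name ratio_form is hypothetical, and the code assumes a > 0 and b >= 0 with IEEE 754 doubles, so that an overflowing pow returns +infinity.

```cpp
#include <cmath>

// Hypothetical helper: a^n / (a^n + b^n) rewritten as 1 / (1 + (b/a)^n).
// Assumes a > 0 and b >= 0.
double ratio_form(double a, double b, int n) {
    double r = std::pow(b / a, n);  // may underflow to 0 (b < a) or overflow to +inf (b > a)
    return 1.0 / (1.0 + r);         // both extremes still give a finite answer: 1 or 0
}
```

For a = b = 0.5 this returns 0.5. For a = 0.5, b = 0.6 and an exponent of 10000 it returns 0, and with the two values swapped it returns 1, which is the saturation described above.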

Lapwing answered 8/7, 2015 at 21:37 Comment(0)

The standard approach for such problems is to work in log-space, that is, to represent each number as e^x where x is your standard floating point type:

addition/subtraction (and summation more generally) can be performed using the log-sum-exp trick, i.e.
- e^x + e^y = e^x (1 + e^{y-x}) = e^{x + log(1 + exp(y-x))}
multiplication/division become addition/subtraction
- e^x × e^y = e^{x+y}
raising to a power is multiplication by an exponent:
- pow(e^x, e^y) = e^{x exp(y)}
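The rules above could be applied to the expression from the question roughly as follows. This is only a sketch: log_add_exp and log_space_ratio are names invented here, and the code assumes a > 0 and b > 0 so that both logarithms are finite.

```cpp
#include <algorithm>
#include <cmath>

// log(e^x + e^y) computed without ever forming e^x or e^y (log-sum-exp).
double log_add_exp(double x, double y) {
    double hi = std::max(x, y);
    double lo = std::min(x, y);
    // Factor out the larger term so the remaining exponent is <= 0 and cannot overflow.
    return hi + std::log1p(std::exp(lo - hi));
}

// Hypothetical helper for a^n / (a^n + b^n), assuming a > 0 and b > 0.
double log_space_ratio(double a, double b, double n) {
    double la = n * std::log(a);                // log(a^n)
    double lb = n * std::log(b);                // log(b^n)
    return std::exp(la - log_add_exp(la, lb));  // division becomes subtraction of logs
}
```

For a = b = 0.5 the result is 0.5 up to rounding error, whether the exponent is 10000 or 100000, because only the logarithms (about -6931 and -69315) are ever stored.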

But in your particular case, you're probably better off using the approach suggested at the end of StillLearning's answer

Gamaliel answered 9/7, 2015 at 6:42 Comment(1)

exp(x) is the same as e^x. It is sad that you are inconsistent. – Milt 14/7, 2015 at 22:56

-2

Compilers don't know enough math to be able to simplify complex code like this (the compiler has no idea what pow actually does; it just knows it's a function that takes in one type and returns another). So it actually has to go through the calculations step by step. Unfortunately, the result is so small that it won't fit into a double. That's why it returns NaN: you triggered an underflow of the variable.

If you really need to do this (and I'd question why you need that level of accuracy), you'll need to either do some math tricks to work with a non-standard number system (for example, instead of storing dollars you can store pennies- but use a much higher factor) or you can write your own math class and do all of your own math libraries.

Or you can move to a platform like Mathematica or MATLAB that's better for this kind of work and has handling for a lot of these types of issues built in.

Chloramine answered 8/7, 2015 at 21:20 Comment(6)

I don't think there is any underflow involved. It is a simple "round-to-zero" or "round-to-nearest" which causes the intermediate results to be zero. And then 0/(0+0), which is NaN. – Lapwing 8/7, 2015 at 21:45

Depends on the exact implementation of the pow function. But you're right, it could be only getting NAN on the last step. The answer doesn't change though- he'd need a special number class, some math tricks, or move to a platform made for symbolic math. – Chloramine 8/7, 2015 at 22:1

Well - maybe - of course I don't know all implementations of pow and floating point processors/libs. Maybe there are some dealing with "underflow" in this case even though I haven't heard about it :-) Anyway - the only correct answer is - study the math (numerical analysis) and then use its tricks. But writing code to deal safely with such numbers is really, really difficult. And going into such details is out of scope for this place. – Lapwing 8/7, 2015 at 22:16

pow(0.5,10000) does underflow, in the sense that the true answer is non-zero, but the result will be zero due to inability to express a sufficiently small number in the given format. – Gamaliel 9/7, 2015 at 8:36

@SimonByrne - seems this is actually called underflow. Quote:

The interval between −fminN and fminN, where fminN is the smallest positive normal floating point value, is called the underflow gap.

(source: en.wikipedia.org/wiki/Arithmetic_underflow ) Actually I didn't know, as I considered it "normal rounding", but it seems to be called flush to zero - my mistake :) - that's why I got the name: – Lapwing 9/7, 2015 at 10:19

Most compilers know quite a bit about what pow does; it's not just an arbitrary function. For example, many compilers will optimize pow(x,2) to x*x. The particular optimization that would be required here, however, is (a) of extremely limited use and (b) would drastically change results (as in this case, from NaN to a finite number). Compilers don't make optimizations that would result in radically different results unless they're explicitly told to do so. – Bib 9/7, 2015 at 12:1
