Trying to write a code for finding the machine epsilon
Asked Answered
H

6

5

I am trying to find out the precision level for various floating point formats in C (i.e. float, double and long double). Here is the code I'm using at the moment:

#include <stdio.h>
#define N 100000

int main(void)
{
   float max = 1.0, min = 0.0, test;
   int i;                              /* Counter for the conditional loop */

   for (i = 0; i < N; i++) {
      test = (max + min) / 2.0;
      if( (1.0 + test) != 1.0)         /* If too high, set max to test and try again */
     max = test;
  if( (1.0 + test) == 1.0)     /* If too low, set min to test and try again */
         min = test;
   }
   printf("The epsilon machine is %.50lf\n", max);
   return 0;
}

This gives the value of roughly ~2^-64 as expected. However when I change the decelerations to doubles or 'long doubles' I get the same answer I should get a smaller value but I don't. Anybody got any ideas?

Habitue answered 13/8, 2010 at 16:14 Comment(1)
Doesn't float have a 23-bit mantissa? Why would you expect 2^-64?Impair
I
2

A guess why you're getting the same answer:

if( (1.0 + test) != 1.0)

Here 1.0 is a double constant, so it's promoting your float to a double and performing the addition as a double. You probably want to declare a temporary float here to perform the addition, or make those float numeric constants (1.0f IIRC).

You might also be falling into the excess-precision-in-temporary-floats problem and might need to force it to store the intermediates in memory to reduce to the correct precision.


Here's a quick go at redoing your range search method but computing the test in the correct type. I get an answer that's slightly too large, though.

#include <stdio.h>
#define N 100000
#define TYPE float

int main(void)
{
   TYPE max = 1.0, min = 0.0, test;
   int i;

   for (i = 0; i < N; i++)
   {
      TYPE one_plus_test;

      test = (max + min) / ((TYPE)2.0);
      one_plus_test = ((TYPE)1.0) + test;
      if (one_plus_test == ((TYPE)1.0))
      {
         min = test;
      }
      else
      {
         max = test;
      }
   }
   printf("The epsilon machine is %.50lf\n", max);
   return 0;
}
Impair answered 13/8, 2010 at 16:22 Comment(3)
How do I 'cast' it as a float? I'll give it a go and see what happensHabitue
Yeah I tried that but its still giving me and epsilon value of 2^-64Habitue
OK, try storing it to, then reading it from, a volatile float variable. That is: volatile float tmp = 1.0 + test; if (tmp == 1.0) ...Adiell
P
10

It depends upon what you mean by "precision level".

Floating-point numbers have "regular" (normal) values, but there are special, sub-normal numbers as well. If you want to find out different limits, the C standard has predefined constants:

#include <math.h>
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("%30s: %g\n", "FLT_EPSILON", FLT_EPSILON);
    printf("%30s: %g\n", "FLT_MIN", FLT_MIN);
    printf("%30s: %g\n", "nextafterf(0.0, 1.0)", nextafterf(0.0, 1.0));
    printf("%30s: %g\n", "nextafterf(1.0, 2.0)-1", (nextafterf(1.0, 2.0) - 1.0f));
    puts("");
    printf("%30s: %g\n", "DBL_EPSILON", DBL_EPSILON);
    printf("%30s: %g\n", "DBL_MIN", DBL_MIN);
    printf("%30s: %g\n", "nextafter(0.0, 1.0)", nextafter(0.0, 1.0));
    printf("%30s: %g\n", "nextafter(1.0, 2.0)-1", (nextafter(1.0, 2.0) - 1.0));
    puts("");
    printf("%30s: %Lg\n", "LDBL_EPSILON", LDBL_EPSILON);
    printf("%30s: %Lg\n", "LDBL_MIN", LDBL_MIN);
    printf("%30s: %Lg\n", "nextafterl(0.0, 1.0)", nextafterl(0.0, 1.0));
    printf("%30s: %Lg\n", "nextafterl(1.0, 2.0)-1", (nextafterl(1.0, 2.0) - 1.0));
    return 0;
}

The above program prints 4 values for each type:

  • the difference between 1 and the least value greater than 1 in that type (TYPE_EPSILON),
  • the minimum positive normalized value in a given type (TYPE_MIN). This does not include subnormal numbers,
  • the minimum positive value in a given type (nextafter*(0...)). This includes subnormal numbers,
  • the minimum number greater than 1. This is the same as TYPE_EPSILON, but calculated in a different way.

Depending upon what you mean by "precision", any or none of the above can be useful to you.

Here is the output of the above program on my computer:

               FLT_EPSILON: 1.19209e-07
                   FLT_MIN: 1.17549e-38
      nextafterf(0.0, 1.0): 1.4013e-45
    nextafterf(1.0, 2.0)-1: 1.19209e-07

               DBL_EPSILON: 2.22045e-16
                   DBL_MIN: 2.22507e-308
       nextafter(0.0, 1.0): 4.94066e-324
     nextafter(1.0, 2.0)-1: 2.22045e-16

              LDBL_EPSILON: 1.0842e-19
                  LDBL_MIN: 3.3621e-4932
      nextafterl(0.0, 1.0): 3.6452e-4951
    nextafterl(1.0, 2.0)-1: 1.0842e-19
Plunger answered 13/8, 2010 at 19:10 Comment(3)
Cheers, im glad C has in built thing for this. However my task is to write a code to find the smallest value greater than one that can be represented by a floating number. The first number there: 1.19209e-07 is the one i expected but for some reason my code doesnt give me that. MAny ThanksHabitue
@Jack: OK. Then you should make sure that all the floating point numbers used in your calculation are float values. So instead of doing 1.0 + test != 1.0, I would do: float try = 1.0 + test; if (try != 1.0), etc.Plunger
Cheers guys, its starting to give sensible answers nowHabitue
I
2

A guess why you're getting the same answer:

if( (1.0 + test) != 1.0)

Here 1.0 is a double constant, so it's promoting your float to a double and performing the addition as a double. You probably want to declare a temporary float here to perform the addition, or make those float numeric constants (1.0f IIRC).

You might also be falling into the excess-precision-in-temporary-floats problem and might need to force it to store the intermediates in memory to reduce to the correct precision.


Here's a quick go at redoing your range search method but computing the test in the correct type. I get an answer that's slightly too large, though.

#include <stdio.h>
#define N 100000
#define TYPE float

int main(void)
{
   TYPE max = 1.0, min = 0.0, test;
   int i;

   for (i = 0; i < N; i++)
   {
      TYPE one_plus_test;

      test = (max + min) / ((TYPE)2.0);
      one_plus_test = ((TYPE)1.0) + test;
      if (one_plus_test == ((TYPE)1.0))
      {
         min = test;
      }
      else
      {
         max = test;
      }
   }
   printf("The epsilon machine is %.50lf\n", max);
   return 0;
}
Impair answered 13/8, 2010 at 16:22 Comment(3)
How do I 'cast' it as a float? I'll give it a go and see what happensHabitue
Yeah I tried that but its still giving me and epsilon value of 2^-64Habitue
OK, try storing it to, then reading it from, a volatile float variable. That is: volatile float tmp = 1.0 + test; if (tmp == 1.0) ...Adiell
M
2

I'm not sure how your algorithm is supposed to work. This one (C++) gives the correct answers:

#include <iostream>

template<typename T>
int epsilon() {
    int pow = 0;
    T eps = 1;
    while (eps + 1 != 1) {
        eps /= 2;
        --pow;
    }
    return pow + 1;
}

int main() {
    std::cout << "Epsilon for float: 2^" << epsilon<float>() << '\n';
    std::cout << "Epsilon for double: 2^" << epsilon<double>() << '\n';
}

This computes the smallest value such that, when added to 1, is still distinguishable from 1.

Output:

Epsilon for float: 2^-23
Epsilon for double: 2^-52
Meisel answered 13/8, 2010 at 16:32 Comment(2)
I dont know any C++ Im afraidHabitue
It shouldn't be too hard to kunderstand if you know C. The template just lets me write T and substitute either float or double for that. And the printing works differently, but don't worry about that.Meisel
G
2

IEEE 754 floating-point formats have the property that, when reinterpreted as a two's complement integer of the same width, they monotonically increase over positive values and monotonically decrease over negative values (see the binary representation of 32 bit floats). They also have the property that 0 < |f(x)| < ∞, and |f(x+1) − f(x)| ≥ |f(x) − f(x−1)| (where f(x) is the aforementioned integer reinterpretation of x). In languages that allow type punning and always use IEEE 754-1985, we can exploit this to compute a machine epsilon in constant time. For example, in C:

typedef union {
  long long i64;
  double d64;
} dbl_64;

double machine_eps (double value)
{
    dbl_64 s;
    s.d64 = value;
    s.i64++;
    return s.d64 - value;
}

From https://en.wikipedia.org/wiki/Machine_epsilon

Gertie answered 22/5, 2016 at 20:5 Comment(0)
S
1

I would like to add that you can get the highest precision from a floating point calculation by using long double.

To apply that to @Rup's solution, just change the TYPE to long double and the printf statement to:

printf("The epsilon machine is %.50Lf\n", max);

This is the Epsilon on my machine using float:

0.00000005960465188081798260100185871124267578125000

And using long double:

0.00000000000000000005421010862427522170625011179761

The difference is quite significant.

Stasiastasis answered 25/1, 2016 at 0:31 Comment(0)
L
0

A problem with such code is that the compiler will load the floating point variables into floating point registers of the microprocessor. And if your microprocessor only has double precision floating point registers, the precision will be the same for float and double.

You'll need to find a way to force the compiler to store the floating point value back to the memory between each two calculations (into a variable of the correct type). That way it has to throw away the additional precision of the registers. But today's compilers are clever in optimizing your code. So this can be difficult to achieve.

Lyse answered 13/8, 2010 at 18:38 Comment(1)
Why not just compile in debug mode or similar, that doesn't perform optimizations?Cystotomy

© 2022 - 2024 — McMap. All rights reserved.