How to alter a float by its smallest increment (or close to it)?

Asked 30/9, 2008 at 22:29 Answered 1/10, 2008 at 2:36

I have a double value f and would like a way to nudge it very slightly larger (or smaller) to get a new value that will be as close as possible to the original but still strictly greater than (or less than) the original.

It doesn't have to be close down to the last bit—it's more important that whatever change I make is guaranteed to produce a different value and not round back to the original.

Preiser answered 30/9, 2008 at 22:29 Comment(2)

A double or a float? Depending on which you have, the smallest value will be different. – Benjie 30/9, 2008 at 22:34

Yeah, I realize my question title and description were inconsistent. I figured the answers could address both cases, which the accepted answer does. – Preiser 30/9, 2008 at 22:48

Check your math.h file. If you're lucky you have the nextafter and nextafterf functions defined. They do exactly what you want in a portable and platform independent way and are part of the C99 standard.

Another way to do it (could be a fallback solution) is to decompose your float into the mantissa and exponent part. Incrementing is easy: Just add one to the mantissa. If you get an overflow you have to handle this by incrementing your exponent. Decrementing works the same way.

EDIT: As pointed out in the comments, it is sufficient to just increment the float in its binary representation. The mantissa overflow will carry into the exponent, and that's exactly what we want.

That's in a nutshell the same thing that nextafter does.

This won't be completely portable though. You would have to deal with endianness and the fact that not all machines have IEEE floats (OK, the last reason is more academic).

Also, handling NaNs and infinities can be a bit tricky. You cannot simply increment them, as they are by definition not numbers.
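The fallback could be sketched roughly as follows. This is not what nextafter does internally; it is a simplified illustration that assumes IEEE 754 binary64 doubles with the same byte order as uint64_t, and the name next_up_bits is hypothetical. It copies the bits with memcpy instead of casting pointers, and treats zero, negative values, NaN and infinity separately:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: next representable double above f.
   Assumes IEEE 754 binary64 and matching byte order for double/uint64_t. */
double next_up_bits(double f)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    const uint64_t exp_mask  = UINT64_C(0x7FF) << 52;
    uint64_t bits;

    memcpy(&bits, &f, sizeof bits);      /* copy the bit pattern out */

    if ((bits & exp_mask) == exp_mask)
        return f;                        /* NaN or infinity: returned unchanged */

    if ((bits & ~sign_mask) == 0)
        bits = 1;                        /* +0.0 or -0.0: smallest positive subnormal */
    else if (bits & sign_mask)
        bits--;                          /* negative: a smaller magnitude is a larger value */
    else
        bits++;                          /* positive: a mantissa carry rolls into the exponent */

    memcpy(&f, &bits, sizeof f);         /* copy the new bit pattern back */
    return f;
}
```

With this sketch the largest finite double steps up to +infinity, which is the carry-into-exponent behaviour described above. Where nextafter is available it remains the simpler choice.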

Marasmus answered 30/9, 2008 at 22:34 Comment(11)

You specifically do NOT want to handle mantissa overflow, since the overflow will roll over onto the exponent which is what you want. – Embonpoint 30/9, 2008 at 22:40

Cool - I never thought about that. Incrementing the float as an integer will do exactly what is needed. – Marasmus 30/9, 2008 at 22:47

It is cool :) Now could the idiot who downvoted my answer saying so please undo it? – Embonpoint 30/9, 2008 at 22:49

How would that work if the exponent increment overflowed into the sign bit? – Reparative 30/9, 2008 at 22:51

Yep, you would need special cases for inf and nan. – Embonpoint 30/9, 2008 at 22:53

I think negative values have to be treated differently as well. If you increment those, the result will be more negative than the original value. And btw - I just disassembled nextafter in the VS.NET 2008 implementation. They do quite a bit more than I would have expected. – Marasmus 30/9, 2008 at 22:57

Nils: Ah, v. true about -ve numbers. – Embonpoint 30/9, 2008 at 23:8

You need extra care around 0 and -0 too – Depreciation 13/3, 2015 at 20:52

In Visual C++ 2008 I found _nextafter() in float.h. – Iphigenia 21/5, 2015 at 16:21

GNU libc source for nextafterf, and nextafter (using only 32bit integers, so it's clunky). These functions handle the +/-0.0 special case, negative floats, and going towards a given 2nd arg, not always +Inf. Remember, that's LGPLed code. Even though it's linked from SO, you can only copy-paste it into GPL-compatible projects. – Haematite 15/2, 2016 at 20:38

@NilsPipenbrinck I don't think there is a portable way to increment a float as an integer unless you use C++'s bit_cast. – Wengert 21/2 at 6:44

u64 &x = *(u64*)(&f);
x++;

Yes, seriously.

Edit: As someone pointed out, this does not deal with negative numbers, Inf, NaN or overflow properly. A safer version of the above is:

u64 &x = *(u64*)(&f);
if( ((x>>52) & 2047) != 2047 )    //if exponent is all 1's then f is a nan or inf.
{
    x += f>0 ? 1 : -1;
}

Embonpoint answered 30/9, 2008 at 22:38 Comment(10)

I wonder if the downvoter could comment as to why this wasn't helpful... Myself, having learned of the nextafter() function, I'd prefer those but if this one would work then I figure it's noteworthy in its own right. – Preiser 30/9, 2008 at 22:47

Mike, what's the include file / compiler that will make this run? – Everick 30/9, 2008 at 22:55

This is totally implementation dependent and non-portable. Works ok if you have EEMMM, but if you have MMMEE won't give you the result you want. – Daphinedaphna 30/9, 2008 at 22:58

@David: Try replacing u64 with 'long long'. @Benoit: yep, it assumes ieee754, I think that's a fairly safe bet nowadays. – Embonpoint 30/9, 2008 at 23:6

undefined behaviour, violation of strict aliasing rules – Ectomorph 26/10, 2009 at 16:36

@sellibitze: possibly undefined but in practice C compilers that do not handle this kind of type-casting will not be able to build real-world code. So it isn't a problem unless you get over happy with optimization settings. – Geminate 7/1, 2010 at 0:6

can't you just cast it through a void pointer *(u64*)(void*)(&f) you are then telling the compiler you know about the aliasing issues and you're OK with it? – Norinenorita 12/12, 2011 at 14:41

+1 If doing a x86-64 build, the float would be stored in a SSE register. The fastest way to do this would then be: _mm_cvtss_f32(_mm_castsi128_ps(_mm_add_epi32(_mm_castps_si128(_mm_set_ss(val)), _mm_set1_epi32(1)))). Note that cast/cvt/set_ss will not generate any instructions. – Connotative 30/1, 2013 at 0:48

This doesn't handle -0 either. – Depreciation 13/3, 2015 at 20:53

It's not quite this simple. See glibc's implementation of nextafterf. Note how you have to decrease the magnitude when x is negative and non-zero. (And note that integer compares of FP numbers compare them as 2's complement, so you get the opposite result with two negative numbers). (The implementation of double nextafter is the same, but way more clunky because it only uses 32bit integers.) – Haematite 15/2, 2016 at 15:29

In absolute terms, the smallest amount you can add to a floating-point value to make a new distinct value depends on the current magnitude of the value; it is roughly the type's machine epsilon multiplied by two raised to the value's current exponent.

Check out the IEEE spec for floating point representation. The simplest way would be to reinterpret the value as an integer type, add 1, then check (if you care) that you haven't flipped the sign or generated a NaN by examining the sign and exponent bits.

Alternatively, you could use frexp to obtain the current mantissa and exponent, and hence calculate a value to add.
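The dependence of the step size on magnitude can be sketched with frexp and ldexp. The helper name ulp_size is hypothetical, and the sketch only claims to be right for finite, nonzero, normal doubles:

```c
#include <float.h>  /* DBL_EPSILON */
#include <math.h>   /* frexp, ldexp */

/* Hypothetical helper: gap between x and the next double of larger
   magnitude. Only meant for finite, nonzero, normal x. */
double ulp_size(double x)
{
    int exponent;
    /* frexp writes x as m * 2^exponent with 0.5 <= |m| < 1 */
    (void)frexp(x, &exponent);
    /* DBL_EPSILON is the gap at 1.0 = 0.5 * 2^1, so scale by 2^(exponent - 1) */
    return ldexp(DBL_EPSILON, exponent - 1);
}
```

Here ulp_size(1.0) is DBL_EPSILON and ulp_size(1024.0) is 1024 times larger, so x + ulp_size(x) is a distinct value for positive normal x.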

Pulpiteer answered 30/9, 2008 at 22:38 Comment(0)

I needed to do the exact same thing and came up with this code:

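#include <float.h>  /* DBL_EPSILON, DBL_MIN */
#include <math.h>   /* frexp, ldexp */

/* Returns a double slightly greater than value (finite, normal inputs). */
double DoubleIncrement(double value)
{
  int exponent;
  /* frexp splits value into mantissa * 2^exponent with 0.5 <= |mantissa| < 1 */
  double mantissa = frexp(value, &exponent);
  if(mantissa == 0)
    return DBL_MIN;  /* smallest positive normal double */

  /* one unit in the last place for a mantissa in [0.5, 1) */
  mantissa += DBL_EPSILON/2.0;
  value = ldexp(mantissa, exponent);
  return value;
}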

Puss answered 30/9, 2008 at 22:47 Comment(0)

For what it's worth, the value at which standard ++ incrementing of a double ceases to function is 9,007,199,254,740,992 (2^53).
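A small check of that threshold, assuming IEEE 754 doubles with the default round-to-nearest mode (the helper name plus_one_changes is hypothetical):

```c
#include <stdbool.h>

/* Hypothetical helper: true when x + 1.0 is a different double than x. */
bool plus_one_changes(double x)
{
    /* volatile forces the sum to be rounded to double precision */
    volatile double sum = x + 1.0;
    return sum != x;
}
```

Under those assumptions plus_one_changes(9007199254740991.0) is true, while plus_one_changes(9007199254740992.0) is false because the exact sum falls halfway between two representable values and rounds back down.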

Morocco answered 30/9, 2008 at 23:1 Comment(0)

This may not be exactly what you want, but you still might find numeric_limits in <limits> of use, particularly the members min() and epsilon().

I don't believe that something like mydouble + numeric_limits<double>::epsilon() will do what you want unless mydouble is already close to 1.0, since epsilon() is the gap between 1.0 and the next representable value. If it is, then you're in luck.
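To see where adding epsilon helps and where it does not, here is a short sketch in C using DBL_EPSILON from float.h, the C counterpart of the C++ epsilon() member. The helper name epsilon_changes is hypothetical, and IEEE 754 doubles with round-to-nearest are assumed:

```c
#include <float.h>    /* DBL_EPSILON */
#include <stdbool.h>

/* Hypothetical helper: true when x + DBL_EPSILON differs from x. */
bool epsilon_changes(double x)
{
    /* volatile forces the sum to be rounded to double precision */
    volatile double sum = x + DBL_EPSILON;
    return sum != x;
}
```

Under those assumptions epsilon_changes(1.0) is true, but epsilon_changes(1000.0) is false: at that magnitude the gap between neighbouring doubles is far larger than epsilon, so the sum rounds back to the original.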

Enravish answered 1/10, 2008 at 2:36 Comment(0)


I found this code a while back; maybe it will help you determine the smallest amount you can push it up by, and then you can just increment it by that value. Unfortunately I can't remember the reference for this code:

#include <stdio.h>

int main()
{
    /* two numbers to work with */
    double number1, number2;
    double result;      // result of calculation
    int counter;        // loop counter and accuracy check

    number1 = 1.0;
    number2 = 1.0;
    counter = 0;

    while (number1 + number2 != number1) {
        ++counter;
        number2 = number2 / 10;
    }
    printf("%2d digits accuracy in calculations\n", counter);

    number2 = 1.0;
    counter = 0;

    while (1) {
        result = number1 + number2;
        if (result == number1)
            break;
        ++counter;
        number2 = number2 / 10.0;
    }

    printf("%2d digits accuracy in storage\n", counter );

    return (0);
}

Trevortrevorr answered 30/9, 2008 at 22:38 Comment(0)
