Detect FPU rounding mode on a GPU

Asked 16/7, 2014 at 11:20 Answered 18/7, 2014 at 14:46

Solved floating-point floating-point-precision fpu

I was delving into multi-precision arithmetics, and there is a nice fast class of algorithms, described in Jonathan Richard Shewchuk, "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates", 1997, Discrete & Computational Geometry, pages: 305–363. However, these algorithms rely on the FPU using round-to-even tiebreaking.

On CPU, it would be easy, one would just check or set the FPU state word and would be sure. However, there is no such instruction (yet?) for GPU programming.

That is why I was wondering if there is a dependable way of detecting (not setting) the rounding mode on an unknown FPU, perhaps by calculating a couple of tests and looking at the bit patterns of the resulting floating point numbers?

EDIT:

Just to summarize, the accepted code indeed seems to work, you can try:

#include <stdio.h>
#include <stdlib.h>
#include <float.h> // _controlfp()
#include <stdint.h>

int is_round_to_nearest()
{
    union {
        double f;
        uint64_t n;
    } special;

    special.n = 0 | (((uint64_t)(-0x100 + 1023) & 0x7ff) << 52) | 0; // no sign, 1.0 mantissa is expressed as zeroes, the 1 is implicit
    //const double special.f = atof("0x1.0p-100");

    if( 1.0 + special.f !=  1.0)
        return 0;
    if( 1.0 - special.f !=  1.0)
        return 0;
    if(-1.0 + special.f != -1.0)
        return 0;
    if(-1.0 - special.f != -1.0)
        return 0;
    return 1;
}

void main()
{
    printf("default : %d\n", is_round_to_nearest());
    _controlfp(_RC_CHOP, _MCW_RC);
    printf("_RC_CHOP : %d\n", is_round_to_nearest());
    _controlfp(_RC_UP, _MCW_RC);
    printf("_RC_UP : %d\n", is_round_to_nearest());
    _controlfp(_RC_DOWN, _MCW_RC);
    printf("_RC_DOWN : %d\n", is_round_to_nearest());
    _controlfp(_RC_NEAR, _MCW_RC);
    printf("_RC_NEAR : %d\n", is_round_to_nearest());
}

And that gives the following answers:

default : 1
_RC_CHOP : 0
_RC_UP : 0
_RC_DOWN : 0
_RC_NEAR : 1

Note that I was unable to set round to nearest, ties away from zero mode on my machine. In Visual Studio, floating point mode needs to be set to strict (/fp:strict), otherwise this will not work in release mode (all modes identified as nearest).

The following code seems to work even in release, even with the default or fast (/fp:precise, /fp:fast) rounding modes, but there is still no guarantee how will your compiler optimize the code:

int is_round_to_nearest()
{
    union {
        double f;
        uint64_t n;
    } special;

    special.n = 0 | (((uint64_t)(-0x100 + 1023) & 0x7ff) << 52) | 0; // no sign, 1.0 mantissa is expressed as zeroes, the 1 is implicit
    //const double special.f = atof("0x1.0p-100");

    volatile double v;
    v = 1.0; v += special.f;
    if(v !=  1.0)
        return 0;
    v = 1.0; v -= special.f;
    if(v !=  1.0)
        return 0;
    v = -1.0; v += special.f;
    if(v != -1.0)
        return 0;
    v = -1.0; v -= special.f;
    if(v != -1.0)
        return 0;
    return 1;
}

Brahman answered 16/7, 2014 at 11:20 Comment(0)

This C code tells you that you are either in round-to-nearest-even or using a strange floating-point architecture indeed:

int is_round_to_nearest(void)
{
  if ( 1.0 + 0x1.0p-100 !=  1.0) return 0;
  if ( 1.0 - 0x1.0p-100 !=  1.0) return 0;
  if (-1.0 + 0x1.0p-100 != -1.0) return 0;
  if (-1.0 - 0x1.0p-100 != -1.0) return 0;
  return 1;
}

You can add an f suffix to all twelve floating-point constants above to obtain a single-precision function.

Garrison answered 16/7, 2014 at 13:25 Comment(8)

Wouldn't an FPU set to roundTiesToAway would still pass this test? – Thermit 16/7, 2014 at 14:40

@MarkDickinson Such a rounding mode would qualify as “strange” for the first sentence of this answer. I was expecting the four IEEE 754 rounding modes (nearest-even, up, down, toward zero), and added the third and fourth test for good measure. I don't know what exotic rounding mode should be expected and should be tested for. Do you have a list of them? – Garrison 16/7, 2014 at 14:44

Well, roundTiesToAway is one of the IEEE 754 rounding modes (there are five specified in IEEE 754-2008). But as far as I can tell, it's not supported by any popular architecture. – Thermit 16/7, 2014 at 14:45

... and reading the standard more closely, roundTiesToAway seems to be aimed mainly at the decimal formats. Sorry for the noise. – Thermit 16/7, 2014 at 14:49

@MarkDickinson Oh, yes, that makes sense now. The rounding that could be called “grade school rounding”. – Garrison 16/7, 2014 at 14:58

@PascalCuoq thanks for the answer, this is what I was hoping for. However, what is the 0x1.0p-100 number literal format? This is the first time I see it, my compiler won't accept it (accepts 0x1.0e-100, not sure if that's the same thing). Is there perhaps another way to write those constants? Is this way of writing numbers documented somewhere? – Brahman 16/7, 2014 at 16:31

@PascalCuoq Nevermind, found it in https://mcmap.net/q/430232/-p-in-constant-declaration. – Brahman 16/7, 2014 at 16:32

@theswine Sorry for the late answer. You can use any small number instead of 0x1.0p-100. 1e-30 should do nicely. – Garrison 16/7, 2014 at 21:22

I ended up developing a slightly modified routine, which tests how the ties are handled, instead of which rounding mode is in place, as in case it is round to nearest (as detected correctly by code of Pascal Cuoq), the tie-breaking can still be ties away from zero - but usually will not be, at least on x86 machines.

The code to detect ties to nearest even is:

int b_TieBreak_ToEven()
{
    //                                      <- 16B double ->
    //                                         <- fraction->
    const double special = f_ParseXDouble("0x0.00000000000008p+0"); // one, at the position one past the LSB
    const double oddone =  f_ParseXDouble("0x1.0000000000001p+0"); // one, ending with a single one at LSB
    const double evenone = f_ParseXDouble("0x1.0000000000002p+0"); // one, ending with a single one to the left of LSB

    volatile double v;
    v = 1.0; v += special;
    if(v != 1.0)
        return 0;
    v = oddone; v += special;
    if(v != evenone) // odd + half rounds to even
        return 0;
    v = evenone; v += special;
    if(v != evenone) // even + half rounds to the same even
        return 0;
    v = -1.0; v -= special;
    if(v != -1.0)
        return 0;
    v = -oddone; v -= special;
    if(v != -evenone) // -odd - half rounds to -even
        return 0;
    v = -evenone; v -= special;
    if(v != -evenone) // -even - half rounds to the same -even
        return 0;

    return 1;
}

I tested this on Windows, on Linux, on BSD and on Raspbian, it seems to work pretty nice. It contains a small routine that parses doubles and floats in the hexadecimal format (the f_ParseXDouble). You can download the source codes here.

On my AMD-based windows machine, this says:

all unit tests passed
default : mode : to nearest, ties to even (ties to even: 1)
_RC_CHOP : mode : towards zero (ties to even: 0)
_RC_UP : mode : towards positive infinity (ties to even: 0)
_RC_DOWN : mode : towards negative infinity (ties to even: 0)
_RC_NEAR : mode : to nearest, ties to even (ties to even: 1)

On Raspberry PI:

all unit tests passed
default : mode : to nearest, ties to even (ties to even: 1)

On NVIDIA GPUs (480, 680 and 780, in OpenCL):

OpenCL platform 'NVIDIA CUDA' by NVIDIA Corporation, version OpenCL 1.1 CUDA 6.0.1, FULL_PROFILE
device: NVIDIA Corporation 'GeForce GTX 680' (driver version: 331.65)
        OpenCL version: OpenCL 1.1 CUDA
        OpenCL "C" version: OpenCL C 1.1

GPU mode: round to nearest, ties to even

Brahman answered 18/7, 2014 at 14:46 Comment(0)

Recommended topics

Hot tags