Endianness for floating point
C

I'm writing and reading binary data (std::ios::binary) in C++ using std::fstream; this includes integer and floating-point values. While my code works on my own architecture, I want to make sure that it is portable, i.e. that binary files written on my machine can still be read properly on a machine with a different endianness. So my idea was to store, as the first byte of the binary file, a value that indicates the endianness of the file.

As there is no guarantee that the endianness of integers and floating-point values is the same, I need to get the information for both data types separately. While getting the endianness for integers was rather simple by inspecting the first byte through a pointer, I'm lost on how to get the endianness for float at runtime. Any ideas?

My code looks like:

#include <cstdint>

#define INT_LITTLE_ENDIAN     0x01u
#define INT_BIG_ENDIAN        0x02u
#define FLOAT_LITTLE_ENDIAN   0x04u
#define FLOAT_BIG_ENDIAN      0x08u

uint8_t getEndianess(){
  uint8_t endianess = 0x00;
  uint16_t integerNumber = 0x1;
  uint8_t *numPtr = (uint8_t*)&integerNumber;
  if (numPtr[0] == 1) {
    endianess |= INT_LITTLE_ENDIAN;
  } else {
    endianess |= INT_BIG_ENDIAN;
  }
  /* TODO: check endianess for float */
  return endianess;
}
Countersignature answered 3/3, 2016 at 5:19 Comment(1)
"getting the endianness for integer was rather simple" --> The posted code does not handle less common integer byte orders, such as PDP-endian, well. – Vaporous

A check for float endianness is also a check for the encoding.

If the encoding is not IEEE-754 binary32 (float32), detect that as well.

Instead of checking with a byte pattern like 0xBF800000 (-1.0f) with multiple zero bytes, consider using a pattern where the expected byte pattern is different for every byte. Also check every byte.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
  // Bit pattern 0x87654321 in IEEE-754 binary32: every byte is distinct.
  const float f = -0x1.ca8642p-113f;
  if (sizeof(float) != 4) {
    printf("float is not 4 bytes\n");
  } else if (memcmp(&f, (uint8_t[4]){0x87, 0x65, 0x43, 0x21}, sizeof f) == 0) {
    printf("Big\n");
  } else if (memcmp(&f, (uint8_t[4]){0x21, 0x43, 0x65, 0x87}, sizeof f) == 0) {
    printf("Little\n");
  } else {
    printf("Unknown\n");  // TBD endian or float encoding
  }
  return 0;
}
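
The snippet in this answer relies on C compound literals, which are not part of standard C++. Since the question is about C++, here is a rough sketch of the same every-byte comparison using std::array and std::memcpy. The names FloatOrder and detectFloatOrder are hypothetical and only serve the illustration, and the hexadecimal float literal assumes a C++17 compiler.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical result type, not part of the original answer.
enum class FloatOrder { Big, Little, Unknown };

// Compares every byte of a float whose IEEE-754 binary32 bit pattern is
// 0x87654321 against the two expected byte sequences.
FloatOrder detectFloatOrder() {
  const float f = -0x1.ca8642p-113f;  // hex float literal, C++17
  std::array<std::uint8_t, 4> bytes{};
  if (sizeof f != bytes.size()) {
    return FloatOrder::Unknown;  // float is not 4 bytes
  }
  std::memcpy(bytes.data(), &f, bytes.size());

  const std::array<std::uint8_t, 4> big{{0x87, 0x65, 0x43, 0x21}};
  const std::array<std::uint8_t, 4> little{{0x21, 0x43, 0x65, 0x87}};
  if (bytes == big) return FloatOrder::Big;
  if (bytes == little) return FloatOrder::Little;
  return FloatOrder::Unknown;  // other byte order or other float encoding
}
```

A caller could map FloatOrder::Big and FloatOrder::Little to the question's FLOAT_BIG_ENDIAN and FLOAT_LITTLE_ENDIAN flags and refuse to write the file when the result is FloatOrder::Unknown.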
Vaporous answered 2/8, 2022 at 18:25 Comment(3)
Example non-IEEE-754 binary32 where sign bit is in the 2nd byte. – Vaporous
Would checking the endianness of a double be similar to this code? Or would that require additional considerations? Great answer by the way. – Pucka
@Pucka Yes, integer endian check is similar. The integer size check can easily be done at compile time, e.g. with #if INT_MAX != 0x7FFFFFFF, then #error Unexpected int size, then #endif. A compile-time size check is tricky with FP. With C23, there are __STDC_ENDIAN_LITTLE__, __STDC_ENDIAN_BIG__ and __STDC_ENDIAN_NATIVE__ to simplify endian testing. I suspect current and future systems where integer and FP have different endianness will be increasingly uncommon. – Vaporous

If we assume that floats keep the sign in the topmost bit (as IEEE-754 does) and are not stored in, for example, something like two's complement, you can simply take a number, negate it, and check whether the first or the last byte changed.
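
A minimal sketch of this negate-and-compare idea, assuming a sign-magnitude float representation in which negation flips exactly one bit. The helper name signByteIndex is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the index of the only byte that differs
// between x and -x, or -1 if the number of differing bytes is not exactly one.
int signByteIndex() {
  const float x = 1.0f;
  const float y = -x;
  unsigned char a[sizeof(float)];
  unsigned char b[sizeof(float)];
  std::memcpy(a, &x, sizeof x);
  std::memcpy(b, &y, sizeof y);

  int index = -1;
  for (std::size_t i = 0; i < sizeof(float); ++i) {
    if (a[i] != b[i]) {
      if (index != -1) return -1;  // more than one byte changed
      index = static_cast<int>(i);
    }
  }
  return index;
}
```

Under the stated assumption, a result of 0 suggests that the most significant byte is stored first (big-endian), a result of sizeof(float) - 1 suggests little-endian, and any other value means the layout should be treated as unknown.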

Pea answered 3/3, 2016 at 5:24 Comment(0)

Well, besides just endianness, you also have the potential for non-IEEE-754 formats, but that is admittedly rare.

If you can assume an IEEE-754 binary format, then floats almost certainly use the same endianness as integers. You can readily check this by using a floating-point value that is the negative of a power of two (such as -1.0): its most significant byte (containing the sign and part of the exponent) is non-zero, and its least significant byte (containing the least significant mantissa bits) is zero.

float floatNumber = -1.0;
uint8_t *numPtr = (uint8_t*)&floatNumber;
if (numPtr[0] == 0) {
  endianess |= FLOAT_LITTLE_ENDIAN;
} else {
  endianess |= FLOAT_BIG_ENDIAN;
}
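
Put together with the question's function, the check might look like the following sketch. The flag values are copied from the question. Using std::memcpy instead of a pointer cast, and testing both the first and the last byte against the binary32 pattern 0xBF800000 of -1.0f, are additions made for this illustration and are not part of the answer above.

```cpp
#include <cstdint>
#include <cstring>

#define INT_LITTLE_ENDIAN     0x01u
#define INT_BIG_ENDIAN        0x02u
#define FLOAT_LITTLE_ENDIAN   0x04u
#define FLOAT_BIG_ENDIAN      0x08u

std::uint8_t getEndianess() {
  std::uint8_t endianess = 0x00;

  // Integer check, as in the question.
  const std::uint16_t integerNumber = 0x1;
  unsigned char intBytes[sizeof integerNumber];
  std::memcpy(intBytes, &integerNumber, sizeof integerNumber);
  endianess |= (intBytes[0] == 1) ? INT_LITTLE_ENDIAN : INT_BIG_ENDIAN;

  // Float check: -1.0f is 0xBF800000 in IEEE-754 binary32.
  const float floatNumber = -1.0f;
  unsigned char floatBytes[sizeof floatNumber];
  std::memcpy(floatBytes, &floatNumber, sizeof floatNumber);
  const unsigned char first = floatBytes[0];
  const unsigned char last = floatBytes[sizeof floatNumber - 1];
  if (first == 0x00 && last == 0xBF) {
    endianess |= FLOAT_LITTLE_ENDIAN;  // least significant byte comes first
  } else if (first == 0xBF && last == 0x00) {
    endianess |= FLOAT_BIG_ENDIAN;     // most significant byte comes first
  }
  // If neither pattern matches, no float flag is set.
  return endianess;
}
```

A reader of the file can then compare the stored byte with the value of getEndianess() on its own machine and swap bytes only for the data types whose flags differ.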
Logwood answered 3/3, 2016 at 6:26 Comment(0)
