Finding endian-ness programmatically at compile-time using C++11
P

2

14

I have gone through many questions on SO on this topic, but couldn't find a solution so far. One natural solution was mentioned here: Determining endianness at compile time.
However, problems with it are pointed out in the comments and in that same answer.

With some modifications, I am able to compile a similar solution with g++ & clang++ (-std=c++11) without any warning.

static_assert(sizeof(char) == 1, "sizeof(char) != 1");
union U1
{
  int i;
  char c[sizeof(int)];
};  
union U2
{ 
  char c[sizeof(int)];
  int i;
};  

constexpr U1 u1 = {1};
constexpr U2 u2 = {{1}};
constexpr bool IsLittleEndian ()
{ 
  return u1.i == u2.c[0];  // ignore different type comparison
}   

static_assert(IsLittleEndian(), "The machine is BIG endian");

Demo.

Can this be considered a deterministic method to decide the endian-ness or does it miss type-punning or something else?

Pluralize answered 29/9, 2016 at 7:5 Comment(13)
Doesn't uint8_t(u2.i) produce the same value on either endianness? A cast should be value preserving, not just pick the first byte.Weanling
There are 24 possible orderings of bytes within a 4-byte integer. At least three have been used by real computers. Also, it is not entirely clear that the exception to the strict aliasing rules granted to [[un]signed] char applies to uint8_t.Louiselouisette
@BoPersson, I wanted to avoid any possible compiler warning related to "comparison of different size types" (as I try to claim in the Q!). Since 1 is representable in the smallest type here, I found the typecast acceptable. Or did I misunderstand your concern? I will modify the code a bit.Pluralize
I believe that if you actually run this on a big endian machine, it would still test if 1 == 1 and return true.Weanling
sizeof(char) == 1 is true by definition. sizeof is given in units of char, so this assertion can literally never fail.Gynandrous
Quite sure there is no way to use constexpr to do this, since any union/reinterpret_cast approach invokes UB (which is caught at compile time inside a constexpr), and memcpy is not constexpr. Compiler specific macros are the only way around it (look for __BYTE_ORDER).Johathan
@sbabbi, not sure why union will cause UB. It doesn't generate any warning in either g++/clang++. BTW, regarding compiler specific macros, there is a platform specific file supported, <endian.h>, as mentioned in this answer: C Macro definition to determine big endian or little endian machine?Pluralize
@Pluralize See #11373703 . IIRC gcc defines the behavior (of accessing a non-active union member, basically they promise they are not going to optimize on this), but it is UB in the standard.Johathan
One easy way to find endianness at compile time in C++, is to just use OS macro sniffing. There's a nice collection of OS-indicator macros over at some SourceForge project. Yes, as the site indicates it's old, so maybe all the Androids are not covered, but it should be doable.Deflagrate
Planned proposal: howardhinnant.github.io/endian.htmlHards
@HowardHinnant, nice to see the proposal. As a common C++ coder, I find it a little complex though. Maybe you can explain in that blog why 'simply defining __ORDER_LITTLE_ENDIAN__ (& big, native)' is not enough? IMO, the enum trick is trivial and hence not needed.Pluralize
Afaik, what you want is not possible. If you must have that information at compile time, consider using a trivial, short test program that outputs either const char* const ENDIANESS = "little"; or const char* const ENDIANESS = "big"; into a file "endianess.h", which is then used by your actual source code.Tidemark
Also note, that there are other byte orders than just little endian or big endian out there. Braindead stuff like 0x01020304 being stored as 0x03 0x04 0x01 0x02. So, if I were you, I would write the test with char[8] = {1, 2, 3, 4, 5, 6, 7, 8};, copy over to at least a uint64_t, and then check for equality with either 0x0102030405060708 or 0x0807060504030201. If neither test succeeds, you should probably error out hard.Tidemark
M
3

Since C++20 you can use std::endian from the <bit> header (earlier drafts placed it in <type_traits>):

#include <bit>

int main()
{
    static_assert(std::endian::native==std::endian::big,
                  "Not a big endian platform!");
}

See it live
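For compilers without C++20 support, the compiler-specific macros mentioned in the comments are one option. Below is a minimal sketch, assuming GCC or Clang, which predefine __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__ (other compilers may not). The names kIsLittleEndian and kIsBigEndian are made up for illustration.

```cpp
// Sketch only: relies on GCC/Clang predefined macros, not on standard C++.
#if !defined(__BYTE_ORDER__) || !defined(__ORDER_LITTLE_ENDIAN__) || \
    !defined(__ORDER_BIG_ENDIAN__)
#error "This compiler does not predefine the byte-order macros"
#endif

// Hypothetical names, evaluated entirely by the preprocessor and compiler.
constexpr bool kIsLittleEndian = (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__);
constexpr bool kIsBigEndian    = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__);

// Mixed orderings (e.g. PDP-endian) are rejected rather than guessed at.
static_assert(kIsLittleEndian || kIsBigEndian,
              "Unsupported (mixed) byte order");
```

This compiles as C++11, but it is a property of the toolchain rather than of the language, so a port to another compiler may need a different set of macros.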

Maze answered 2/11, 2018 at 15:17 Comment(1)
Are you from the future?Oscillator
P
2

Your attempt is no different from this obviously non-working one (where IsLittleEndian() is identical to true):

constexpr char c[sizeof(int)] = {1};
constexpr int i = {1};
constexpr bool IsLittleEndian ()
{ 
  return i == c[0];  // ignore different type comparison
}   

static_assert(IsLittleEndian(), "The machine is BIG endian");
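To make the equivalence concrete, here is a small sketch that reuses the unions from the question. Each union is only ever read through the member that was initialized, so both operands are the constant 1 on any platform and byte order never enters the comparison.

```cpp
union U1 { int i; char c[sizeof(int)]; };
union U2 { char c[sizeof(int)]; int i; };

constexpr U1 u1 = {1};    // brace-init sets the first member: i == 1
constexpr U2 u2 = {{1}};  // brace-init sets the first member: c[0] == 1

// Both assertions hold regardless of endianness, because no byte of i
// is ever inspected through c (or the other way round).
static_assert(u1.i == 1, "u1.i is the value it was initialized with");
static_assert(u2.c[0] == 1, "u2.c[0] is the value it was initialized with");
static_assert(u1.i == u2.c[0], "so this is just 1 == 1");
```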

I believe that C++11 doesn't provide a means to programmatically determine the endianness of the target platform at compile time. My argument is that the only valid way to perform that check at runtime is to examine an int variable through an unsigned char pointer (since other ways of type punning inevitably involve undefined behavior):

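#include <cstdint>

const std::uint32_t i = 0xffff0000;

bool isLittleEndian() {
    // The lowest-addressed byte is 0x00 on little endian and 0xff on big endian.
    return 0 == *reinterpret_cast<const unsigned char*>(&i);
}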

C++11 doesn't allow this function to be constexpr (reinterpret_cast is not permitted in a constant expression), therefore this check cannot be performed at compile time.
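A runtime check can also be written with memcpy, which avoids the pointer cast and makes it easy to flag the unusual byte orders mentioned in the comments. This is only a sketch: the names ByteOrder and DetectByteOrder are hypothetical, and it assumes 8-bit bytes so that std::uint64_t occupies exactly eight chars.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical names for illustration; not part of any standard API.
enum class ByteOrder { Little, Big, Other };

inline ByteOrder DetectByteOrder()
{
    const unsigned char bytes[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    std::uint64_t value = 0;
    static_assert(sizeof(value) == sizeof(bytes),
                  "sketch assumes 8-bit bytes");

    // Copying the object representation is well-defined, unlike reading
    // an inactive union member.
    std::memcpy(&value, bytes, sizeof(value));

    if (value == 0x0807060504030201ULL) return ByteOrder::Little;  // bytes[0] is least significant
    if (value == 0x0102030405060708ULL) return ByteOrder::Big;     // bytes[0] is most significant
    return ByteOrder::Other;  // mixed ordering: better to fail loudly
}
```

Like the pointer version, this cannot be constexpr in C++11, since std::memcpy is not a constexpr function.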

Phenyl answered 29/9, 2016 at 8:59 Comment(7)
Can you explain in more detail why the attempted solution will not work, or why it is the same as the non-working one at the beginning of your answer? I haven't had a chance to check on a big-endian machine, but I assume that what you said is true. The question is: why?Pluralize
@Pluralize What purpose do the unions serve in your "solution"? You never access/use U1::c and you never access/use U2::i. Hence, after eliminating them we arrive at my version.Phenyl
The two unions are there just to eliminate the compiler error that otherwise occurs with a single union. If you check u1.i == u1.c[0] (same for u2), it works fine. This solution appears in many other answers on SO, but those are limited to runtime; they don't work at compile time because of constexpr limitations. Here U1 & U2 act as mirrors of each other. I have attempted to trick the compiler into allowing it at compile time. Maybe it is wrong, but it would be good if someone checked it on a big-endian machine.Pluralize
@Pluralize I understand that. But if you forget the prehistory of how you arrived at your version, doesn't the existence of U1::c and U2::i appear to be completely artificial?Phenyl
@iammilind: The way I understand the initialization in U2 u2 = {{1}} is that it always sets first item of array to 1. Doesn't it?Deflagrate
@Cheers, correct. But now I do understand that without a comparison against the other member of the union, the purpose is not served. And according to C++11, accessing a union member other than the one most recently set is undefined, though many compilers support it. For that part we need a .c file/function, which will at least give the information at runtime.Pluralize
BTW, C++ 20 finally addresses this : en.cppreference.com/w/cpp/types/endianInstitutionalism
