Detecting Endianness

N

18

32

I'm currently trying to write C source code that handles I/O correctly whatever the endianness of the target system.

I've selected "little endian" as my I/O convention, which means that, for big-endian CPUs, I need to convert data when writing or reading.

Conversion is not the issue. The problem I face is detecting endianness, preferably at compile time (since CPUs do not change endianness in the middle of execution...).

Up to now, I've been using this :

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
...
#else
...
#endif

It's documented as a GCC predefined macro, and Visual Studio seems to understand it too.

However, I've received a report that the check fails on some big-endian systems (PowerPC).

So I'm looking for a foolproof solution that ensures endianness is correctly detected, whatever the compiler and the target system. Well, most of them at least...

[Edit] : Most of the solutions proposed rely on "run-time tests". These tests may sometimes be properly evaluated by compilers during compilation, and therefore cost no real runtime performance.

However, branching with some kind of << if (0) { ... } else { ... } >> is not enough. In the current implementation, variable and function declarations depend on big-endian detection. These cannot be changed with an if statement.

Well, obviously, there is a fallback plan, which is to rewrite the code...

I would prefer to avoid that, but, well, it looks like a diminishing hope...

[Edit 2] : I have tested "run-time tests", by deeply modifying the code. Although they do their job correctly, these tests also impact performance.

I was expecting that, since the tests have a predictable outcome, the compiler would eliminate the dead branches. Unfortunately, that doesn't work all the time. MSVC is a good compiler and succeeds in eliminating the dead branches, but GCC gives mixed results, depending on the version and the kind of test, with a greater impact on 64-bit than on 32-bit targets.

It's strange. And it also means that the compiler cannot be relied upon to resolve the run-time tests at compile time.

Edit 3 : These days, I'm using a compile-time constant union, expecting the compiler to solve it to a clear yes/no signal. And it works pretty well : https://godbolt.org/g/DAafKo
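
For readers who cannot open the link, here is a minimal sketch of what a constant-union test along these lines might look like. It is not necessarily the code behind the godbolt link; the names is_little_endian, swap32 and to_le32 are made up for illustration. The hope is that an optimizing compiler folds is_little_endian() into a constant, but that is not guaranteed.

```c
#include <stdint.h>

/* Returns 1 on a little-endian target, 0 on a big-endian one.
   The union is constant, so an optimizer can usually fold the result. */
static inline int is_little_endian(void)
{
    const union { uint32_t u; unsigned char c[4]; } one = { 1 };
    return one.c[0];
}

/* Plain byte swap of a 32-bit value. */
static inline uint32_t swap32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8)  | ((x & 0xFF000000u) >> 24);
}

/* Converts a native value to the little-endian I/O convention:
   a no-op on little-endian targets, a byte swap on big-endian ones. */
static inline uint32_t to_le32(uint32_t x)
{
    return is_little_endian() ? x : swap32(x);
}
```

With this, the in-memory bytes of to_le32(0x11223344u) are 44 33 22 11 on either kind of machine. Note that this is still an expression, not a preprocessor symbol, so it cannot select between declarations.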

Niobous answered 23/1, 2012 at 21:35 Comment(8)

@BoPersson - this is not a compile time detection – Pascoe 23/1, 2012 at 21:40

Run time is your best bet, but compile time is included in the below answers: 1. https://mcmap.net/q/18517/-detecting-endianness-programmatically-in-a-c-program 2. https://mcmap.net/q/18518/-macro-definition-to-determine-big-endian-or-little-endian-machine – Delorsedelos 23/1, 2012 at 21:43

Some CPUs actually can have different endianness for different executables. en.wikipedia.org/wiki/Endianness#Bi-endian_hardware – Stork 23/1, 2012 at 21:47

@bo : indeed, i've checked this question too, but unfortunately it's a run-time detection. It advises to depend on the result of a function. This might be okay for some other use, but I need a compile (preprocessor) detection. – Niobous 23/1, 2012 at 21:47

@Niobous, bar the ones you've mentioned, there isn't one. So either compile a small program that detects the endianness and feed the result into your build system so it defines a preprocessor macro, or write the code so it is independent of the host endianness. – Snag 23/1, 2012 at 22:2

Run-time detection is a great interview question :) – Driftage 23/1, 2012 at 22:10

The reason your preprocessor-based test can fail (false positive) is that undefined symbols get replaced with 0 in #if directives. – Viviparous 24/1, 2012 at 1:23

@R. : +1, yes, I guess this is why it seems to work for Visual. – Niobous 24/1, 2012 at 10:58

S

19

At compile time in C you can't do much more than trusting preprocessor #defines, and there are no standard solutions because the C standard isn't concerned with endianness.

Still, you could add an assertion that runs at the start of the program to make sure that the assumption made at compile time was correct:

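#include <assert.h>

static inline int IsBigEndian(void)
{
    int i = 1;
    /* On a big-endian machine the first byte of the int 1 is 0. */
    return !*((unsigned char *)&i);
}

/* ... */

/* Inside a function that runs at startup: */
#if defined(COMPILED_FOR_BIG_ENDIAN)
assert(IsBigEndian());
#elif defined(COMPILED_FOR_LITTLE_ENDIAN)
assert(!IsBigEndian());
#else
#error "No endianness macro defined"
#endif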

(where COMPILED_FOR_BIG_ENDIAN and COMPILED_FOR_LITTLE_ENDIAN are macros #defined previously according to your preprocessor endianness checks)
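
To make the idea concrete, here is a minimal standalone sketch. It assumes a GCC/Clang-style compiler that predefines __BYTE_ORDER__ in order to set the two macros; that choice is only an assumption for the demo, and on other compilers the macros would have to come from your own checks or from the build system. check_endianness_assumption is a hypothetical helper name.

```c
#include <assert.h>

/* Assumption for this sketch: derive the macros from __BYTE_ORDER__.
   Replace this block with your own preprocessor checks or a -D flag. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#  define COMPILED_FOR_BIG_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#  define COMPILED_FOR_LITTLE_ENDIAN 1
#else
#  error "No endianness macro defined"
#endif

/* Run-time test: the first byte of the int 1 is 0 only on big endian. */
static inline int IsBigEndian(void)
{
    int i = 1;
    return !*((unsigned char *)&i);
}

/* Call once at program start; aborts if the compile-time
   assumption does not match the machine the program runs on. */
static void check_endianness_assumption(void)
{
#if defined(COMPILED_FOR_BIG_ENDIAN)
    assert(IsBigEndian());
#else
    assert(!IsBigEndian());
#endif
}
```

Calling check_endianness_assumption() as the first statement of main() is enough. Keep in mind that assert() compiles to nothing when NDEBUG is defined, so release builds would need an explicit check instead.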

Skeens answered 23/1, 2012 at 21:43 Comment(9)

The value of a union member other than the last one stored into is an unspecified behavior in C. – Growl 23/1, 2012 at 21:53

@ouah: the C standard knows nothing about endianness, so we are already going out of the standard domain and working on implementation-specific behavior (and I don't think you'll ever find a compiler implementing unions differently or an optimizer messing with them). Although, I agree that the other "classic method" (cast of the pointer to char *) does not exhibit UB problems due to the exceptions to the aliasing rules. – Skeens 23/1, 2012 at 22:0

@ouah: also, §6.7.2.1 doesn't mention UB, it just says that "The value of at most one of the members can be stored in a union object at any time"; also, I dare to say that §6.7.2.1 ¶14 implicitly allows the use of unions as a replacement for that cast, since "A pointer to a union object, suitably converted, points to each of its members [...] and vice versa.". So, &u.i = &u = &u.c (with the appropriate casts), thus u.c[0] = (*(&u.c))[0]=*((char *)&u.i), which is as legal as the "other method". – Skeens 23/1, 2012 at 22:13

In C99, Annex J (non-normative) "J.1 Unspecified behavior. The following are unspecified: The value of a union member other than the last one stored into (6.2.6.1)." and 6.2.6.1p7 says "When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values." – Growl 23/1, 2012 at 22:20

@ouah: the first one is solved by working on §6.7.2.1 ¶14 as I already wrote before (it's still unspecified behavior, but exactly as it is the cast - and hey, that code is there to understand exactly how the compiler implements that "unspecified behavior"). Your second quotation is irrelevant, since the two members in my union are of the same size, so both members "completely fill" the union (and this would still hold even if I declared a single char, because the biggest member is stored first). – Skeens 23/1, 2012 at 22:28

yes the second quote is actually irrelevant. +1 for the addition of the pedantic version I just see now;) – Growl 23/1, 2012 at 22:55

That's a good idea. Unfortunately, the number of functions to duplicate in this case is a bit too high (8), and will hamper future code maintenance. The basic idea was to embed the small differences due to endianness into macros (or inlined functions), keeping the code lean and common to all platforms. – Niobous 28/1, 2012 at 10:39

Better not call your macros BIG_ENDIAN and LITTLE_ENDIAN -- <endian.h> on Linux/*BSD defines macros by such names, and they will therefore both be always defined if you happen to include <endian.h>. – Axial 8/5, 2018 at 11:52

@LauriNurmi: woa, well spotted; I'll change them to something else. – Skeens 8/5, 2018 at 12:42

N

24

As stated earlier, the only "real" way to detect Big Endian is to use runtime tests.

However, sometimes, a macro might be preferred.

Unfortunately, I've not found a single "test" to detect this situation, rather a collection of them.

For example, GCC recommends : __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ . However, this only works with recent versions. Earlier versions (and other compilers) will evaluate this test as true, a false positive, because undefined identifiers are replaced with 0 in #if expressions, so the comparison becomes 0 == 0. So you need the more complete version : defined(__BYTE_ORDER__)&&(__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
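
The pitfall is easy to reproduce on any compiler. In the sketch below, HYPOTHETICAL_BYTE_ORDER and HYPOTHETICAL_ORDER_BIG_ENDIAN are deliberately undefined, made-up names standing in for macros that a given compiler does not provide; NAIVE_TEST_RESULT and GUARDED_TEST_RESULT are likewise just illustrative.

```c
/* Neither HYPOTHETICAL_* macro is defined anywhere. In an #if expression
   both are replaced with 0, so the naive comparison is 0 == 0. */
#if HYPOTHETICAL_BYTE_ORDER == HYPOTHETICAL_ORDER_BIG_ENDIAN
#  define NAIVE_TEST_RESULT 1   /* taken: a false positive */
#else
#  define NAIVE_TEST_RESULT 0
#endif

/* Guarding with defined() rejects the case where the macro is missing. */
#if defined(HYPOTHETICAL_BYTE_ORDER) && \
    (HYPOTHETICAL_BYTE_ORDER == HYPOTHETICAL_ORDER_BIG_ENDIAN)
#  define GUARDED_TEST_RESULT 1
#else
#  define GUARDED_TEST_RESULT 0   /* taken: macro is not defined */
#endif
```

Here NAIVE_TEST_RESULT ends up as 1 and GUARDED_TEST_RESULT as 0. Compilers that support a warning for undefined identifiers in #if (for example -Wundef on GCC) can flag the first form.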

OK, now this works for newest GCC, but what about other compilers ?

You may try __BIG_ENDIAN__ or __BIG_ENDIAN or _BIG_ENDIAN which are often defined on big endian compilers.

This will improve detection. But if you specifically target PowerPC platforms, you can add a few more tests to improve detection further. Try _ARCH_PPC or __PPC__ or __PPC or PPC or __powerpc__ or __powerpc or even powerpc. Combine all these defines, and you have a pretty fair chance of detecting big-endian systems, and PowerPC in particular, whatever the compiler and its version.

So, to summarize, there is no single standard predefined macro that is guaranteed to detect a big-endian CPU on all platforms and compilers, but there are many predefined macros which, collectively, give a high probability of correctly detecting big endian under most circumstances.
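
Put together, the checks listed above might look like the sketch below. DETECTED_BIG_ENDIAN is a made-up name. The sketch asks __BYTE_ORDER__ first and only falls back to the other macros when it is missing, because the PowerPC fallback assumes that PowerPC means big endian, which does not hold for little-endian PowerPC targets. Also beware that some system headers define names such as __BIG_ENDIAN unconditionally, so this is best evaluated before including them.

```c
/* 1. Prefer the GCC/Clang predefined byte-order macros when they exist. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    defined(__ORDER_LITTLE_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define DETECTED_BIG_ENDIAN 1
#  elif __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#    define DETECTED_BIG_ENDIAN 0
#  endif
#endif

/* 2. Otherwise fall back to the heuristic list of macros from the answer.
      Assumption: any PowerPC macro implies a big-endian target. */
#if !defined(DETECTED_BIG_ENDIAN)
#  if defined(__BIG_ENDIAN__) || defined(__BIG_ENDIAN) || defined(_BIG_ENDIAN) \
   || defined(_ARCH_PPC) || defined(__PPC__) || defined(__PPC) || defined(PPC) \
   || defined(__powerpc__) || defined(__powerpc) || defined(powerpc)
#    define DETECTED_BIG_ENDIAN 1
#  else
#    define DETECTED_BIG_ENDIAN 0
#  endif
#endif
```

Since the result is a preprocessor symbol with the value 0 or 1, it can be used with #if DETECTED_BIG_ENDIAN to choose between declarations, which a run-time test cannot do.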

Niobous answered 5/2, 2012 at 23:18 Comment(2)

Writing for other people who find this answer useful. gcc supports __BYTE_ORDER__ from about 4.6 and clang from 3.2 – Bastinado 10/7, 2019 at 5:14

You can write static_assert(__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__); which works on GCC, and will otherwise fail on compilers that don't define these. – Piezoelectricity 26/5, 2022 at 19:56
