Is there a safe, portable way to determine (during compile time) the endianness of the platform that my program is being compiled on? I'm writing in C.
[EDIT] Thanks for the answers, I decided to stick with the runtime solution!
Is there a safe, portable way to determine (during compile time) the endianness of the platform that my program is being compiled on? I'm writing in C.
[EDIT] Thanks for the answers, I decided to stick with the runtime solution!
This is for compile time checking
You could use information from the boost header file endian.hpp
, which covers many platforms.
edit for runtime checking
bool isLittleEndian()
{
short int number = 0x1;
char *numPtr = (char*)&number;
return (numPtr[0] == 1);
}
Create an integer, and read its first byte (least significant byte). If that byte is 1, then the system is little endian, otherwise it's big endian.
edit Thinking about it
Yes you could run into a potential issue in some platforms (can't think of any) where sizeof(char) == sizeof(short int)
. You could use fixed width multi-byte integral types available in <stdint.h>
, or if your platform doesn't have it, again you could adapt a boost header for your use: stdint.hpp
sizeof(char)==sizeof(short)
, then uint8_t
cannot exist on the implementation. C99 requires uint8_t
to have no padding and be exactly 8 bits, and also defines the representation of types in terms of char
/bytes, so uint8_t
can only exist if CHAR_BIT==8
. But then short
could not hold the required minimum range. :-) –
Spandau endian.hpp
: it's not compile time checking the endianess. it's extracting the endianess from headers, if they're exposed. so it's not guaranteed to work. –
Inclination detail
header. –
Leuctra sizeof(char) == sizeof(short int)
even possible? –
Slinky char
is the smallest unit in the implementation, and that a short int
is at least as large as char
and could hold the range [-32768, 32767]
. –
Biffin CHAR_BIT > 8
–
Slinky CHAR_BIT == 16
. Some of them are pretty widely used. –
Sacramental To answer the original question of a compile-time check, there's no standardized way to do it that will work across all existing and all future compilers, because none of the existing C, C++, and POSIX standards define macros for detecting endianness.
But, if you're willing to limit yourself to some known set of compilers, you can look up each of those compilers' documentations to find out which predefined macros (if any) they use to define endianness. This page lists several macros you can look for, so here's some code which would work for those:
#if defined(__BYTE_ORDER) && __BYTE_ORDER == __BIG_ENDIAN || \
defined(__BIG_ENDIAN__) || \
defined(__ARMEB__) || \
defined(__THUMBEB__) || \
defined(__AARCH64EB__) || \
defined(_MIBSEB) || defined(__MIBSEB) || defined(__MIBSEB__)
// It's a big-endian target architecture
#elif defined(__BYTE_ORDER) && __BYTE_ORDER == __LITTLE_ENDIAN || \
defined(__LITTLE_ENDIAN__) || \
defined(__ARMEL__) || \
defined(__THUMBEL__) || \
defined(__AARCH64EL__) || \
defined(_MIPSEL) || defined(__MIPSEL) || defined(__MIPSEL__)
// It's a little-endian target architecture
#else
#error "I don't know what architecture this is!"
#endif
If you can't find what predefined macros your compiler uses from its documentation, you can also try coercing it to spit out its full list of predefined macros and guess from there what will work (look for anything with ENDIAN, ORDER, or the processor architecture name in it). This page lists a number of methods for doing that in different compilers:
Compiler C macros C++ macros
Clang/LLVM clang -dM -E -x c /dev/null clang++ -dM -E -x c++ /dev/null
GNU GCC/G++ gcc -dM -E -x c /dev/null g++ -dM -E -x c++ /dev/null
Hewlett-Packard C/aC++ cc -dM -E -x c /dev/null aCC -dM -E -x c++ /dev/null
IBM XL C/C++ xlc -qshowmacros -E /dev/null xlc++ -qshowmacros -E /dev/null
Intel ICC/ICPC icc -dM -E -x c /dev/null icpc -dM -E -x c++ /dev/null
Microsoft Visual Studio (none) (none)
Oracle Solaris Studio cc -xdumpmacros -E /dev/null CC -xdumpmacros -E /dev/null
Portland Group PGCC/PGCPP pgcc -dM -E (none)
Finally, to round it out, the Microsoft Visual C/C++ compilers are the odd ones out and don't have any of the above. Fortunately, they have documented their predefined macros here, and you can use the target processor architecture to infer the endianness. While all of the currently supported processors in Windows are little-endian (_M_IX86
, _M_X64
, _M_IA64
, and _M_ARM
are little-endian), some historically supported processors like the PowerPC (_M_PPC
) were big-endian. But more relevantly, the Xbox 360 is a big-endian PowerPC machine, so if you're writing a cross-platform library header, it can't hurt to check for _M_PPC
.
#define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
–
Shelia This is for compile time checking
You could use information from the boost header file endian.hpp
, which covers many platforms.
edit for runtime checking
bool isLittleEndian()
{
short int number = 0x1;
char *numPtr = (char*)&number;
return (numPtr[0] == 1);
}
Create an integer, and read its first byte (least significant byte). If that byte is 1, then the system is little endian, otherwise it's big endian.
edit Thinking about it
Yes you could run into a potential issue in some platforms (can't think of any) where sizeof(char) == sizeof(short int)
. You could use fixed width multi-byte integral types available in <stdint.h>
, or if your platform doesn't have it, again you could adapt a boost header for your use: stdint.hpp
sizeof(char)==sizeof(short)
, then uint8_t
cannot exist on the implementation. C99 requires uint8_t
to have no padding and be exactly 8 bits, and also defines the representation of types in terms of char
/bytes, so uint8_t
can only exist if CHAR_BIT==8
. But then short
could not hold the required minimum range. :-) –
Spandau endian.hpp
: it's not compile time checking the endianess. it's extracting the endianess from headers, if they're exposed. so it's not guaranteed to work. –
Inclination detail
header. –
Leuctra sizeof(char) == sizeof(short int)
even possible? –
Slinky char
is the smallest unit in the implementation, and that a short int
is at least as large as char
and could hold the range [-32768, 32767]
. –
Biffin CHAR_BIT > 8
–
Slinky CHAR_BIT == 16
. Some of them are pretty widely used. –
Sacramental With C99, you can perform the check as:
#define I_AM_LITTLE (((union { unsigned x; unsigned char c; }){1}).c)
Conditionals like if (I_AM_LITTLE)
will be evaluated at compile-time and allow the compiler to optimize out whole blocks.
I don't have the reference right off for whether this is strictly speaking a constant expression in C99 (which would allow it to be used in initializers for static-storage-duration data), but if not, it's the next best thing.
const
for the type. –
Ministerial .
(member access) operator. However, it violates the rules on operands; compound literals are not one of the operand types allowed in arithmetic constant expressions. –
Spandau Interesting read from the C FAQ:
You probably can't. The usual techniques for detecting endianness involve pointers or arrays of char, or maybe unions, but preprocessor arithmetic uses only long integers, and there is no concept of addressing. Another tempting possibility is something like
#if 'ABCD' == 0x41424344
but this isn't reliable, either.
if( (char[4])(U'A')[0] == 65)
would do, if the cast was legal. But the multi-character constant is indeed completely legal as-is. –
Hideaway I would like to extend the answers for providing a constexpr
function for C++
union Mix {
int sdat;
char cdat[4];
};
static constexpr Mix mix { 0x1 };
constexpr bool isLittleEndian() {
return mix.cdat[0] == 1;
}
Since mix
is constexpr
too it is compile time and can be used in constexpr bool isLittleEndian()
. Should be safe to use.
As @Cheersandhth pointed out below, these seems to be problematic.
The reason is, that it is not C++11-Standard conform, where type punning is forbidden. There can always only one union member be active at a time. With a standard conforming compiler you will get an error.
So, don't use it in C++. It seems, you can do it in C though. I leave my answer in for educational purposes :-) and because the question is about C...
This assumes that int
has the size of 4 char
s, which is not always given as @PetrVepřek correctly pointed out below. To make your code truly portable you have to be more clever here. This should suffice for many cases though. Note that sizeof(char)
is always 1
, by definition. The code above assumes sizeof(int)==4
.
(data[0] << 8) | data[1]
). –
Conspecific "read of member 'cdat' of union with active member 'sdat' is not allowed in a constant expression"
. I wasn't aware of "active members" of unions. I wonder if it is in the Standard? See here, goo.gl/Gs6qrG. Oh yes, https://mcmap.net/q/15754/-accessing-inactive-union-member-and-undefined-behavior explains, that C++11 disallows what C11 allows. Thanks. I'll update my answer. –
Lanni Use CMake TestBigEndian as
INCLUDE(TestBigEndian)
TEST_BIG_ENDIAN(ENDIAN)
IF (ENDIAN)
# big endian
ELSE (ENDIAN)
# little endian
ENDIF (ENDIAN)
CMAKE_<LANG>_BYTE_ORDER
variable. –
Thyroxine Not during compile time, but perhaps during runtime. Here's a C function I wrote to determine endianness:
/* Returns 1 if LITTLE-ENDIAN or 0 if BIG-ENDIAN */
#include <inttypes.h>
int endianness()
{
union { uint8_t c[4]; uint32_t i; } data;
data.i = 0x12345678;
return (data.c[0] == 0x78);
}
char
array. –
Spandau char
that would have been fine, but uint8_t
isn't (necessarily) a char
. (Does that mean the behavior in this case is strictly implementation-defined, instead of undefined?) –
Weisler From Finally, one-line endianness detection in the C preprocessor:
#include <stdint.h>
#define IS_BIG_ENDIAN (*(uint16_t *)"\0\xff" < 0x100)
Any decent optimizer will resolve this at compile-time. gcc does at -O1
.
Of course stdint.h
is C99. For ANSI/C89 portability see Doug Gwyn's Instant C9x library.
I took it from rapidjson library:
#define BYTEORDER_LITTLE_ENDIAN 0 // Little endian machine.
#define BYTEORDER_BIG_ENDIAN 1 // Big endian machine.
//#define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
#ifndef BYTEORDER_ENDIAN
// Detect with GCC 4.6's macro.
# if defined(__BYTE_ORDER__)
# if (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
// Detect with GLIBC's endian.h.
# elif defined(__GLIBC__)
# include <endian.h>
# if (__BYTE_ORDER == __LITTLE_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif (__BYTE_ORDER == __BIG_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
// Detect with _LITTLE_ENDIAN and _BIG_ENDIAN macro.
# elif defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
// Detect with architecture macros.
# elif defined(__sparc) || defined(__sparc__) || defined(_POWER) || defined(__powerpc__) || defined(__ppc__) || defined(__hpux) || defined(__hppa) || defined(_MIPSEB) || defined(_POWER) || defined(__s390__)
# define BYTEORDER_ENDIAN BYTEORDER_BIG_ENDIAN
# elif defined(__i386__) || defined(__alpha__) || defined(__ia64) || defined(__ia64__) || defined(_M_IX86) || defined(_M_IA64) || defined(_M_ALPHA) || defined(__amd64) || defined(__amd64__) || defined(_M_AMD64) || defined(__x86_64) || defined(__x86_64__) || defined(_M_X64) || defined(__bfin__)
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# elif defined(_MSC_VER) && (defined(_M_ARM) || defined(_M_ARM64))
# define BYTEORDER_ENDIAN BYTEORDER_LITTLE_ENDIAN
# else
# error "Unknown machine byteorder endianness detected. User needs to define BYTEORDER_ENDIAN."
# endif
#endif
I once used a construct like this one:
uint16_t HI_BYTE = 0,
LO_BYTE = 1;
uint16_t s = 1;
if(*(uint8_t *) &s == 1) {
HI_BYTE = 1;
LO_BYTE = 0;
}
pByte[HI_BYTE] = 0x10;
pByte[LO_BYTE] = 0x20;
gcc with -O2 was able to make it completely compile time. That means, the HI_BYTE
and LO_BYTE
variables were replaced entirely and even the pByte acces was replaced in the assembler by the equivalent of *(unit16_t *pByte) = 0x1020;
.
It's as compile time as it gets.
To my knowledge no, not during compile time.
At run-time, you can do trivial checks such as setting a multi-byte value to a known bit string and inspect what bytes that results in. For instance using a union,
typedef union {
uint32_t word;
uint8_t bytes[4];
} byte_check;
or casting,
uint32_t word;
uint8_t * bytes = &word;
Please note that for completely portable endianness checks, you need to take into account both big-endian, little-endian and mixed-endian systems.
For my part, I decided to use an intermediate approach: try the macros, and if they don't exist, or if we can't find them, then do it in runtime. Here is one that works on the GNU-compiler:
#define II 0x4949 // arbitrary values != 1; examples are
#define MM 0x4D4D // taken from the TIFF standard
int
#if defined __BYTE_ORDER__ && __BYTE_ORDER__ == __LITTLE_ENDIAN
const host_endian = II;
# elif defined __BYTE_ORDER__ && __BYTE_ORDER__ == __BIG__ENDIAN
const host_endian = MM;
#else
#define _no_BYTE_ORDER
host_endian = 1; // plain "int", not "int const" !
#endif
and then, in the actual code:
int main(int argc, char **argv) {
#ifdef _no_BYTE_ORDER
host_endian = * (char *) &host_endian ? II : MM;
#undef _no_BYTE_ORDER
#endif
// .... your code here, for instance:
printf("Endedness: %s\n", host_endian == II ? "little-endian"
: "big-endian");
return 0;
}
On the other hand, as the original poster noted, the overhead of a runtime check is so little (two lines of code, and micro-seconds of time) that it's hardly worth the bother to try and do it in the preprocessor.
© 2022 - 2024 — McMap. All rights reserved.
#ifdef __LITTLE_ENDIAN__
etc ? – Warrigal__LITTLE_ENDIAN__
is an indicator that the machine is little endian and not one of two macros (along with__BIG_ENDIAN__
) which are possible values for__BYTE_ORDER__
? You can't know. As soon as you start inspecting macro names that were reserved for the implementation, your're on the road to the dark world of UB. Good code never directly inspects macros beginning with_[A-Z_]
but instead uses aconfigure
script or similar to work out its environment then uses#include "config.h"
and#ifdef HAVE_FOO
etc. – Spandau