Is there a way to do a C++ style compile-time assertion to determine machine's endianness?
I have some low-level serialization code that is templated, and I need to know the system's endianness at compile time (because the templates specialize based on the system's endianness).

Right now I have a header with some platform defines, but I'd rather have some way to make assertions about endianness with a templated test (like a static_assert or boost_if). The reason is that my code will need to be compiled and run on a wide range of machines from many specialized vendors, probably including devices that don't exist in 2008, so I can't really guess what might need to go into that header years down the road. The code base has an expected lifetime of about 10 years, so I can't follow the code forever.

Hopefully this makes my situation clear.

So does anyone know of a compile-time test that can determine endianness, without relying on vendor specific defines?
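To make the intent concrete, here is a rough sketch of the kind of specialization involved (the Serializer name and the put16 interface are made up purely for illustration); the missing piece is a portable, compile-time value for the HostIsLittle flag:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical illustration only: a serializer that emits 16-bit values
// in little-endian wire order, specialized on a host-endianness flag.
template <bool HostIsLittle>
struct Serializer;

// Little-endian host: the in-memory layout already matches the wire
// order, so a plain copy suffices.
template <>
struct Serializer<true> {
    static void put16(std::uint16_t v, unsigned char* out)
    {
        std::memcpy(out, &v, sizeof v);
    }
};

// Big-endian host: emit the bytes explicitly, low byte first.
template <>
struct Serializer<false> {
    static void put16(std::uint16_t v, unsigned char* out)
    {
        out[0] = static_cast<unsigned char>(v & 0xFF);
        out[1] = static_cast<unsigned char>(v >> 8);
    }
};
```

The big-endian specialization is byte-order independent by construction; the little-endian one relies on knowing, at compile time, that the plain copy is safe.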

Wherry answered 11/11, 2008 at 6:29 Comment(0)

If you're using autoconf, you can use the AC_C_BIGENDIAN macro, which is fairly well guaranteed to work (it sets the WORDS_BIGENDIAN define by default).

Alternatively, you could try something like the following (taken from autoconf) to get a test that will probably be optimized away (GCC, at least, removes the branch that isn't taken):

int is_big_endian()
{
    union {
        long int l;
        char c[sizeof (long int)];
    } u;

    u.l = 1;

    /* on a big-endian machine, the 1 is stored in the last byte */
    return u.c[sizeof (long int) - 1] == 1;
}
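As a comment below notes, reading u.c after assigning u.l is type punning, which standard C++ does not strictly permit. A variant of the same test (my sketch, not from autoconf) that copies the object representation into a byte buffer with memcpy is well-defined in both C and C++:

```cpp
#include <cassert>
#include <cstring>

// Same idea as the union trick, but inspecting the object representation
// via memcpy, which is well-defined in both C and C++.
int is_big_endian()
{
    long int l = 1;
    unsigned char bytes[sizeof l];
    std::memcpy(bytes, &l, sizeof l);

    // On a big-endian machine the 1 ends up in the last byte.
    return bytes[sizeof l - 1] == 1;
}
```

Like the original, this is still a run-time expression, so it does not by itself satisfy the question's compile-time requirement.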
Rumanian answered 11/11, 2008 at 10:28 Comment(2)
Thanks, that is a nice trick, and I think I can apply it with a little reengineering. Not sure yet, but it's a good lead.Wherry
This causes undefined behaviour in Standard C++. Also, the result is not usable in a compile-time test.Quotha

There is no portable way to do this at compile time; your best bet is probably to use the Boost endian macros or to emulate the methods they use.
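For illustration, here is a sketch of the macro-emulation approach (the MY_LITTLE_ENDIAN name is made up; the __BYTE_ORDER__ family of macros is predefined by GCC and Clang, and other compilers would need their own branches, which is exactly the maintenance burden the question wants to avoid):

```cpp
#include <cassert>

// Endianness via compiler-predefined macros (GCC/Clang only; other
// compilers need additional #elif branches checking their own defines).
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#  define MY_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#  define MY_LITTLE_ENDIAN 0
#else
#  error "Unknown byte order; add a branch for this compiler"
#endif

// The macro is a genuine compile-time constant, so it can drive template
// specialization or a static assertion.
static const bool my_is_little_endian = (MY_LITTLE_ENDIAN == 1);
```

The #error branch makes an unrecognized platform fail loudly at compile time instead of silently picking a wrong default.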

Swirly answered 11/11, 2008 at 6:38 Comment(1)
Thanks for the link to the header. It's a good reference. And similar to what I have.Wherry

Hmm, that's an interesting question. My bet is that this is not possible. I think you have to continue using macros and go with BOOST_STATIC_ASSERT(!BIG_ENDIAN);, or static_assert in C++0x. The reason I think this is that endianness is a property of your execution environment, whereas static_assert is evaluated at compile time.

I suggest you look into the code of the new GNU gold ELF linker. Ian Lance Taylor, its author, used templates to select the right endianness at compile time, to ensure optimal performance at run time. He explicitly instantiates all possible endiannesses, so that he still gets separate compilation of the template definitions and declarations (rather than keeping all templates in headers). His code is excellent.

Abebi answered 11/11, 2008 at 6:33 Comment(3)
Yes, I've not been able to figure it out myself, and as you say, it might not even be possible. But I've seen the impossible done before in other situations... so maybe, just maybe, someone will come up with a great idea here :)Wherry
Why can't this work: #define IS_LITTLE_ENDIAN char(0x00ff)? This solution is a bit inspired by @MichaelBurr's answer. It works well with templates as well, demo. Assume uint8_t in place of char to be precise.Backflow
@iamm I don't think it's possible to construct bit patterns in C or C++. You are constructing the value 255 and then casting it to char. Even if C allowed you to specify a bit representation, you would still have to convince the compiler to interpret that bit pattern as the contents of memory and load it into a register, rather than treat it as an immediate value. For register-exclusive operations such as truncation and zero extension, endianness is irrelevant, IIRC.Abebi

This answer is based on the following setup (listed for clarity):

Language: C++17, 64-bit
Compilers: g++ v8 (the GNU Compiler Collection, https://www.gnu.org/software/gcc/) & the MinGW-w64 8.1.0 toolchain (https://sourceforge.net/projects/mingw-w64/files/)
OSes: Linux Mint & Windows

The following two lines of code can be used to successfully detect a processor's endianness:

const uint8_t IsLittleEndian = char (0x0001);

or

#define IsLittleEndian char (0x0001)

These two little magical statement gems take advantage of how the processor stores a 16-Bit value in memory.

On a "Little Endian" processor, like the Intel and AMD chipsets, a 16-Bit value is stored in a [low order/least significant byte][high order/most significant byte] fashion (the brackets represent a byte in memory).

On a "Big Endian" processor, like the PowerPC, Sun Sparc, and IBM S/390 chipsets, a 16-Bit value is stored in a [high order/most significant byte][low order/least significant byte] fashion.

For example, when we store a 16-Bit (two byte) value, let's say 0x1234, into a C++ uint16_t (a type defined in C++11 and later, https://en.cppreference.com/w/cpp/types/integer) variable on a "Little Endian" processor, and then peer into the memory block where the value is stored, you will find the byte sequence [34][12].

On a "Big Endian processor", the 0x1234 value is stored as [12][34].

Here is a little demo to help demonstrate how various size C++ integer variables are stored in memory on little and big endian processors:

#define __STDC_FORMAT_MACROS // Required for the MingW toolchain
#include <iostream>
#include <inttypes.h>

const uint8_t IsLittleEndian = char (0x0001);
//#define IsLittleEndian char (0x0001)

std::string CurrentEndianMsg;
std::string OppositeEndianMsg;

template <typename IntegerType>
void PrintIntegerDetails(IntegerType IntegerValue)
{
    uint16_t SizeOfIntegerValue = sizeof(IntegerValue);
    int8_t i;

    std::cout << "Integer size (in bytes): " << SizeOfIntegerValue << "\n";
    std::cout << "Integer value (Decimal): " << IntegerValue << "\n";
    std::cout << "Integer value (Hexadecimal): ";

    switch (SizeOfIntegerValue)
    {
        case 2: printf("0x%04X\n", (unsigned int) IntegerValue);
                break;
        case 4: printf("0x%08X\n", (unsigned int) IntegerValue);
                break;
        case 8: printf("0x%016" PRIX64 "\n", (uint64_t) IntegerValue);
                break;
    }

    std::cout << "Integer stored in memory in byte order:\n";
    std::cout << "        " << CurrentEndianMsg << " processor [current]: ";

    for(i = 0; i < SizeOfIntegerValue; i++)
    {
        printf("%02X ", (((unsigned char*) &IntegerValue)[i]));
    }

    std::cout << "\n        " << OppositeEndianMsg << " processor  [simulated]: ";

    for(i = SizeOfIntegerValue - 1; i >= 0; i--)
    {
        printf("%02X ", (((unsigned char*) &IntegerValue)[i]));
    }

    std::cout << "\n\n";
}


int main()
{
    uint16_t ValueUInt16a = 0x0001;
    uint16_t ValueUInt16b = 0x1234;
    uint32_t ValueUInt32a = 0x00000001;
    uint32_t ValueUInt32b = 0x12345678;
    uint64_t ValueUInt64a = 0x0000000000000001;
    uint64_t ValueUInt64b = 0x123456789ABCDEF0;

    std::cout << "Current processor endianness: ";

    switch (IsLittleEndian) {
        case 0: CurrentEndianMsg = "Big Endian";
                OppositeEndianMsg = "Little Endian";
                break;
        case 1: CurrentEndianMsg = "Little Endian";
                OppositeEndianMsg = "Big Endian";
                break;
    }

    std::cout << CurrentEndianMsg << "\n\n";

    PrintIntegerDetails(ValueUInt16a);
    PrintIntegerDetails(ValueUInt16b);
    PrintIntegerDetails(ValueUInt32a);
    PrintIntegerDetails(ValueUInt32b);
    PrintIntegerDetails(ValueUInt64a);
    PrintIntegerDetails(ValueUInt64b);

    return 0;
}

Here is the output of the demo on my machine:

Current processor endianness: Little Endian

Integer size (in bytes): 2
Integer value (Decimal): 1
Integer value (Hexadecimal): 0x0001
Integer stored in memory in byte order:
        Little Endian processor [current]: 01 00
        Big Endian processor  [simulated]: 00 01

Integer size (in bytes): 2
Integer value (Decimal): 4660
Integer value (Hexadecimal): 0x1234
Integer stored in memory in byte order:
        Little Endian processor [current]: 34 12
        Big Endian processor  [simulated]: 12 34

Integer size (in bytes): 4
Integer value (Decimal): 1
Integer value (Hexadecimal): 0x00000001
Integer stored in memory in byte order:
        Little Endian processor [current]: 01 00 00 00
        Big Endian processor  [simulated]: 00 00 00 01

Integer size (in bytes): 4
Integer value (Decimal): 305419896
Integer value (Hexadecimal): 0x12345678
Integer stored in memory in byte order:
        Little Endian processor [current]: 78 56 34 12
        Big Endian processor  [simulated]: 12 34 56 78

Integer size (in bytes): 8
Integer value (Decimal): 1
Integer value (Hexadecimal): 0x0000000000000001
Integer stored in memory in byte order:
        Little Endian processor [current]: 01 00 00 00 00 00 00 00
        Big Endian processor  [simulated]: 00 00 00 00 00 00 00 01

Integer size (in bytes): 8
Integer value (Decimal): 1311768467463790320
Integer value (Hexadecimal): 0x123456789ABCDEF0
Integer stored in memory in byte order:
        Little Endian processor [current]: F0 DE BC 9A 78 56 34 12
        Big Endian processor  [simulated]: 12 34 56 78 9A BC DE F0

I wrote this demo with the GNU C++ toolchain on Linux Mint and don't have the means to test it with other toolchains such as Visual Studio or MinGW, so I do not know what is required for it to compile with them; nor do I have access to Windows at the moment.

However, a friend of mine tested the code with MinGW, 64-Bit (x86_64-8.1.0-release-win32-seh-rt_v6-rev0), and it had errors. After a little research, I discovered I needed to add the line #define __STDC_FORMAT_MACROS at the top of the code for it to compile with MinGW.

Now that we can visually see how a 16-Bit value is stored in memory, let's see how we can use that to our advantage to determine the endianness of a processor.

To give a little extra help in visualizing the way that 16-Bit values are stored in memory, let's look at the following chart:

16-Bit Value (Hex):  0x1234

Memory Offset:       [00] [01]
                     ---------
Memory Byte Values:  [34] [12]  <Little Endian>
                     [12] [34]  <Big Endian>

================================================

16-Bit Value (Hex):  0x0001

Memory Offset:       [00] [01]
                     ---------
Memory Byte Values:  [01] [00]  <Little Endian>
                     [00] [01]  <Big Endian>

When we convert the 16-Bit value 0x0001 to a char (8-Bit) with the snippet char (0x0001), the compiler uses the first memory offset of the 16-Bit value for the new value. Here is another chart that shows what happens on both the "Little Endian" and "Big Endian" processors:

Original 16-Bit Value: 0x0001

Stored in memory as: [01][00]  <-- Little Endian
                     [00][01]  <-- Big Endian

Truncate to char:    [01][xx]  <-- Little Endian
                     [01]      Final Result
                     [00][xx]  <-- Big Endian
                     [00]      Final Result

As you can see, we can easily determine the endianness of a processor.
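As the comments and the update below point out, however, char (0x0001) is a value conversion, not a memory read, so it yields 1 on every platform. A small sketch (the helper names are made up for illustration) contrasts the two operations:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Always 1: char(x) converts the *value*, not the representation,
// so the result does not depend on byte order at all.
inline char first_byte_by_conversion(std::uint16_t v)
{
    return char(v);
}

// Endianness-dependent: reads the first byte actually stored in memory
// (1 on a little-endian host, 0 on a big-endian host for v == 0x0001).
inline unsigned char first_byte_in_memory(std::uint16_t v)
{
    unsigned char b;
    std::memcpy(&b, &v, 1);
    return b;
}
```

Only the second helper inspects the object representation, which is the operation an endianness test actually needs.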


UPDATE:

I am unable to test the demo above on a "Big Endian" processor, so I based the code on information I found on the web. Thanks to M.M for pointing out the obvious to me.

I have updated the demo code (as seen below) to test for the endianness of a processor correctly.

#define __STDC_FORMAT_MACROS // Required for the MingW toolchain
#include <iostream>
#include <inttypes.h>

std::string CurrentEndianMsg;
std::string OppositeEndianMsg;

template <typename IntegerType>
void PrintIntegerDetails(IntegerType IntegerValue)
{
    uint16_t SizeOfIntegerValue = sizeof(IntegerValue);
    int8_t i;

    std::cout << "Integer size (in bytes): " << SizeOfIntegerValue << "\n";
    std::cout << "Integer value (Decimal): " << IntegerValue << "\n";
    std::cout << "Integer value (Hexadecimal): ";

    switch (SizeOfIntegerValue)
    {
        case 2: printf("0x%04X\n", (unsigned int) IntegerValue);
                break;
        case 4: printf("0x%08X\n", (unsigned int) IntegerValue);
                break;
        case 8: printf("0x%016" PRIX64 "\n", (uint64_t) IntegerValue);
                break;
    }

    std::cout << "Integer stored in memory in byte order:\n";
    std::cout << "        " << CurrentEndianMsg << " processor [current]: ";

    for(i = 0; i < SizeOfIntegerValue; i++)
    {
        printf("%02X ", (((unsigned char*) &IntegerValue)[i]));
    }

    std::cout << "\n        " << OppositeEndianMsg << " processor  [simulated]: ";

    for(i = SizeOfIntegerValue - 1; i >= 0; i--)
    {
        printf("%02X ", (((unsigned char*) &IntegerValue)[i]));
    }

    std::cout << "\n\n";
}


int main()
{
    uint16_t ValueUInt16a = 0x0001;
    uint16_t ValueUInt16b = 0x1234;
    uint32_t ValueUInt32a = 0x00000001;
    uint32_t ValueUInt32b = 0x12345678;
    uint64_t ValueUInt64a = 0x0000000000000001;
    uint64_t ValueUInt64b = 0x123456789ABCDEF0;

    uint16_t EndianTestValue = 0x0001;
    uint8_t IsLittleEndian = ((unsigned char*) &EndianTestValue)[0];

    std::cout << "Current processor endianness: ";

    switch (IsLittleEndian) {
        case 0: CurrentEndianMsg = "Big Endian";
                OppositeEndianMsg = "Little Endian";
                break;
        case 1: CurrentEndianMsg = "Little Endian";
                OppositeEndianMsg = "Big Endian";
                break;
    }

    std::cout << CurrentEndianMsg << "\n\n";

    PrintIntegerDetails(ValueUInt16a);
    PrintIntegerDetails(ValueUInt16b);
    PrintIntegerDetails(ValueUInt32a);
    PrintIntegerDetails(ValueUInt32b);
    PrintIntegerDetails(ValueUInt64a);
    PrintIntegerDetails(ValueUInt64b);

    return 0;
}

This updated demo creates a 16-Bit value, 0x0001, and then reads the first byte of the variable's memory. As seen in the output shown above, on "Little Endian" processors that value will be 0x01; on "Big Endian" processors, it will be 0x00.

Caterwaul answered 14/1, 2019 at 3:37 Comment(3)
char (0x0001) produces 1 regardless of the actual endianness; char(x) is not the same as *(char *)&x.Quotha
@M.M, Thank you for pointing out the obvious. I have updated my post to use ((uint8_t*) &TestUint16_tVar)[0] to accurately get the byte value needed to check for processor endianness.Caterwaul
The updated version doesn't meet the requirements of the question to be a compile-time test.Quotha
