How do I convert between big-endian and little-endian values in C++?
How do I convert between big-endian and little-endian values in C++?

For clarity, I have to translate binary data (double-precision floating point values and 32-bit and 64-bit integers) from one CPU architecture to another. This doesn't involve networking, so ntoh() and similar functions won't work here.


Note: The answer I accepted applies directly to compilers I'm targeting (which is why I chose it). However, there are other very good, more portable answers here.

Chef answered 19/9, 2008 at 20:23 Comment(8)
Do you need to convert between big-endian and little-endian, or between one of these and your native format, for other processing?Matrilineage
It would be helpful to include the platform you're talking about.Froehlich
ntoh hton will work fine, even if it doesn't have anything to do with networking.Foss
The best way to deal with endianness in general is to make sure that the code runs on both little- and big-endian host machines. If that works, you probably did it right. Assuming you are on x86/be is dangerous as a practice.Newberry
hton ntoh will not work if the machine is big-endian, because the question asker explicitly wants to perform the conversion.Vermicular
@Newberry is the only person to mention this. Almost all of the examples on this page use concepts like "swap" bytes instead of doing it agnostic of the underlying endianness. If you are dealing with external file formats (which have well-defined endianness) then the most portable thing to do is treat the external data as a byte stream, and convert the byte stream to and from the native integers. I cringe every time I see short swap(short x) code, since it will break if you move to a platform with different endianness. Matthieu M has the only right answer below.Chimney
You are thinking about the problem completely wrong. The task is not "how do I convert between big-endian and little-endian values". The task is "how do I convert floating point and integer values in a particular format to my platform's native format". If you do it right, the native format can be big endian, little endian, mixed endian, or ternary for all your code cares.Rivkarivkah
htons is host-to-network-short.Gaudreau
H
219

If you're using Visual C++ do the following: You include intrin.h and call the following functions:

For 16 bit numbers:

unsigned short _byteswap_ushort(unsigned short value);

For 32 bit numbers:

unsigned long _byteswap_ulong(unsigned long value);

For 64 bit numbers:

unsigned __int64 _byteswap_uint64(unsigned __int64 value);

8 bit numbers (chars) don't need to be converted.

Although these are only defined for unsigned values, they work for signed integers as well.

For floats and doubles it's more difficult than with plain integers, as these may or may not be in the host machine's byte order. You can get little-endian floats on big-endian machines and vice versa.

Other compilers have similar intrinsics as well.

In GCC for example you can directly call some builtins as documented here:

uint32_t __builtin_bswap32 (uint32_t x)
uint64_t __builtin_bswap64 (uint64_t x)

(no need to include anything). Afaik bits.h declares the same function in a non gcc-centric way as well.

A 16-bit swap is just a bit-rotate.

Calling the intrinsics instead of rolling your own gives you the best performance and code density, btw.
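A rough sketch of how the intrinsics above might be wrapped behind one set of names. The wrapper names bswap16, bswap32 and bswap64 are made up for this illustration, and the plain shift-and-mask fallback is an assumption for compilers that offer neither family of intrinsics. Microsoft's documentation lists <stdlib.h> as the header for the _byteswap_ functions, so that is what the sketch includes.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>   // _byteswap_ushort, _byteswap_ulong, _byteswap_uint64
#endif

// Hypothetical wrapper names; they are not part of any library.
inline std::uint16_t bswap16(std::uint16_t v)
{
#if defined(_MSC_VER)
    return _byteswap_ushort(v);
#else
    // A 16-bit swap is a rotate by 8 bits.
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
#endif
}

inline std::uint32_t bswap32(std::uint32_t v)
{
#if defined(_MSC_VER)
    return _byteswap_ulong(v);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);
#else
    // Portable fallback: move each byte to its mirrored position.
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
#endif
}

inline std::uint64_t bswap64(std::uint64_t v)
{
#if defined(_MSC_VER)
    return _byteswap_uint64(v);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(v);
#else
    // Swap each 32-bit half, then exchange the halves.
    return (static_cast<std::uint64_t>(bswap32(static_cast<std::uint32_t>(v))) << 32) |
           bswap32(static_cast<std::uint32_t>(v >> 32));
#endif
}
```

These always swap, so the caller still has to decide from context whether a swap is needed at all.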

Hyps answered 19/9, 2008 at 20:31 Comment(14)
Do you know if the gcc intrinsics are available on other platforms as well? i.e., are they tied to x86 host or would they work on PPC, SPARC, etc?Newberry
They should work on all gcc supported platforms. If the target CPU does not support the byteswap as a single instruction the compiler will either inline optimized code or call a runtime function.Hyps
With GCC, I might use: #include <byteswap.h> int32_t bswap_32(int32_t x) int64_t bswap_64(int64_t x)Romeyn
__builtin_bswapX is only available from GCC-4.3 onwardsSchiffman
It's also worth noting that these intrinsics /always/ swap bytes, they aren't like htonl, htons, etc. You have to know from the context of your situation when to actually swap the bytes.Osteology
Wait, what about 8-bit numbers? Why are there no built in functions for those?Scintilla
@Scintilla because 8 bit numbers are the same in big and little endian. :-)Hyps
@BrianVandenberg Right; using htonl and ntohl without worrying about the context would work when writing portable code since the platform defining these functions would swap it if it's little/mid-endian and on big-endian it'd be a no-op. However, when decoding a standard file type which is defined as little-endian (say BMP), one still has to know the context and can't just rely on htonl and ntohl.Darwen
FYI, I'm using GCC 4.8.1 and there is also a __builtin_bswap16.Diactinic
Boost 1.58 now has Endian library.Berezina
I suppose that the GCC solution does not work for llvm / mac. Is there an equivalent?Caballero
It's also worth checking the effective implementation speed-wise hardwarebug.org/2010/01/14/beware-the-builtinsNoddy
@Caballero the builtin functions are also in clang, see for example usage in boost.org/doc/libs/1_60_0/boost/endian/detail/intrinsic.hppNoddy
This answer should really say something about detecting whether you're on a big-endian host or not. (Windows+MSVC can target big-endian xbox360, according to this attempt at portable_endian.h, which I don't totally recommend since it uses ntohl and so on even on Windows where it's a non-inlined call to the Winsock DLL). Anyway, detecting when to byte-swap is the other hard problem in a portable C++ program, since AFAIK the ISO C++ standard doesn't define macros for host byte-order detection. Just a link to a good SO Q&A about that would be good.Obrian
B
114

Simply put:

#include <climits>
#include <cstddef>
#include <cstdint>

template <typename T>
T swap_endian(T u)
{
    static_assert (CHAR_BIT == 8, "CHAR_BIT != 8");

    union
    {
        T u;
        unsigned char u8[sizeof(T)];
    } source, dest;

    source.u = u;

    for (size_t k = 0; k < sizeof(T); k++)
        dest.u8[k] = source.u8[sizeof(T) - k - 1];

    return dest.u;
}

usage: swap_endian<uint32_t>(42).

Broadtail answered 10/2, 2011 at 11:27 Comment(14)
Have an upvote. I just used uchars, and assigned 4 to 1, 3 to 2, 2 to 3, and 1 to 4, but this is more flexible if you have different sizes. 6 clocks on a 1st Gen Pentium IIRC. BSWAP is 1 clock, but is platform specific.Fiddlededee
@RocketRoy: Yes, and if speed turns out to be an issue, it is very simple to write overloads with platform- and type-specific intrinsics.Broadtail
This has got to be the most creative use of a union that I've seen. Thank you for providing this code!Antoineantoinetta
@MihaiTodor: This use of unions for typecasting through an array of chars is explicitly allowed by the standard. See eg. this question.Broadtail
@AlexandreC. Not in the C++ standard -- only in C. In C++ (which this code is) this code is undefined behaviour.Humeral
@Rapptz: 3.10 seems clear: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: [...] a char or unsigned char type.". Maybe I'm missing something here, but it was pretty clear to me that accessing any type through char pointers was explicitly allowed.Broadtail
I think this is an overly complicated solution to a simple problem. In fact, I'd bet that most programmers out there (including me) would need some time to figure out what this function does (and to tell whether the code is correct).Garrot
@FrerichRaabe: There are not many ways to access individual bytes from a variable in C++. Type punning rules are quite complex, I agree, but I don't think this can get any simpler than the code I wrote, except perhaps if you restrain yourself to types with known fixed size (which int, long, float, etc are not).Broadtail
This is undefined behavior. 9.5.1: "In a union, at most one of the non-static data members can be active at any time."Prototrophic
Make this function constexpr please.Salesmanship
@AlexandreC. This function can be made UB free in 1 minute by just interpreting everything as uint8_t or byte array.Itemized
So instead of returning dest.u;, returning *reinterpret_cast<T*>(dest.u8) would make it defined behaviour since the last written-to member is accessed?Covenant
The standard excerpts quoted here (https://mcmap.net/q/16455/-what-is-the-strict-aliasing-rule) seem to say it's ok as long as you alias via a char type (which I guess unsigned char[] falls under). Am I wrong?Covenant
Answering my own question maybe: It would be ok to access dest.u via dest.u8, but not the other way around.Covenant
A
98

From The Byte Order Fallacy by Rob Pike:

Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | ((unsigned)data[3]<<24);

If it's big-endian, here's how to extract it:

i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | ((unsigned)data[0]<<24);

TL;DR: don't worry about your platform's native order; all that counts is the byte order of the stream you are reading from, and you'd better hope it's well defined.

Note 1: It is expected that int and unsigned int be 32 bits here, types may require adjustment otherwise.

Note 2: The last byte must be explicitly cast to unsigned before shifting, as by default it's promoted to int, and a shift by 24 bits means manipulating the sign bit which is Undefined Behavior.
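The two expressions above could be packaged as small helpers along these lines. This is only a sketch: the names load_le32, load_be32, store_le32 and store_be32 are invented for the example, and each function assumes the buffer holds at least four bytes. Casting every byte to std::uint32_t before shifting sidesteps the promotion-to-int problem described in Note 2 and removes the dependence on int being 32 bits from Note 1.

```cpp
#include <cstdint>

// Hypothetical helpers: decode a 32-bit value from a byte stream whose
// byte order is known, without caring about the host's byte order.
inline std::uint32_t load_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])        |
           (static_cast<std::uint32_t>(data[1]) <<  8) |
           (static_cast<std::uint32_t>(data[2]) << 16) |
           (static_cast<std::uint32_t>(data[3]) << 24);
}

inline std::uint32_t load_be32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[3])        |
           (static_cast<std::uint32_t>(data[2]) <<  8) |
           (static_cast<std::uint32_t>(data[1]) << 16) |
           (static_cast<std::uint32_t>(data[0]) << 24);
}

// The reverse direction: encode a native value into a byte stream.
inline void store_le32(unsigned char* out, std::uint32_t v)
{
    out[0] = static_cast<unsigned char>( v        & 0xFFu);
    out[1] = static_cast<unsigned char>((v >>  8) & 0xFFu);
    out[2] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    out[3] = static_cast<unsigned char>((v >> 24) & 0xFFu);
}

inline void store_be32(unsigned char* out, std::uint32_t v)
{
    out[0] = static_cast<unsigned char>((v >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((v >>  8) & 0xFFu);
    out[3] = static_cast<unsigned char>( v        & 0xFFu);
}
```

Nothing here tests the host's endianness, which is the point of the approach: the same source gives the same result on any host.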

Azazel answered 27/4, 2012 at 6:51 Comment(24)
This is cool, but it seems to me that it only applies to integers and the variants. What to do with floats/doubles?Permafrost
@Brett: strictly no idea :)Azazel
this is true only because your i variable was created by the compiler on the stack. When you use pointers that refers to mapped data (or file that you entirely pulled in memory) you need to care about host endianness as well, AND alignment which makes it dual difficult.Torbert
@v.oddou: yes and no, memory mapped files are exactly the same as network frames; if you accept not to read them directly, all that matters is their endianness: if little-endian, use the first formula, if it's big-endian, use the second. Any compiler worth its salt will optimize-out unneeded transformations if the endianness match.Azazel
@meowsqueak: Yes, I would expect it does work, because only the order of bytes change, not the order of bits within each byte.Azazel
On a loosely related note, the linked post is some unpleasant read... The guy seems to value brevity, yet he preferred to write a long rant about all those bad programmers that are not as enlightened as he is regarding endianness, instead of actually explaining the situation and WHY his solution always works.Cherlycherlyn
If you're using this method, make sure you cast your data to (unsigned char*)Pellucid
@joseph: why unsigned char*? char* should work just as well, as does uint8_t*.Divisionism
@Divisionism : i didn't know about uint8_t, that's why ;) . My main point was to use unsigned. I tried this answer without unsigned and got some weird answers.Pellucid
@joseph: That's an important point indeed; if you using a signed char, then data[x] is promoted to int, which is signed, and data[x] << 24 may therefore shift a 1 into the sign bit, which is undefined behavior.Azazel
@Permafrost Then it will be harder, but still possible. First, ensure that __STDC_IEC_559__ is defined, FLT_RADIX == 2, sizeof(float) == 4 and sizeof(double) == 8. Scan uint32_t, extract sign, exponent and mantissa. Then use ldexp to get the float/double. Tricky parts: exponent = scanned_exponent - 127 // or 1023 for double and mantissa = 1.0f + scanned_mantissa / 8388608.0f // or 4503599627370496.0 for double. For reading you use frexp instead.Malarkey
@AdN Even if you find it unpleasant, Rob is still right. If you let your code depend on your machine's endianness even though you don't need to, it's as bad as if you make your code depend on the phase of the moon.Edify
@MatthieuM. C99§6.3.1.1 says: "If an int can represent all values of the original type, the value is converted to an int;". So even if data[x] is an uint8_t, the value is going to go as a signed int, and the << 24 is then UB if bit 7 is set. Isn't that right?Catalysis
@Gauthier: That's... a good question. The conversion from uint8_t to int will preserve the value (it will be positive), and then the shift could indeed make the int negative which I do think is technically UB. I'm not sure whether Pike considered it.Azazel
@MatthieuM. I would ask him, but can't seem to be able to comment on his article. My guess is that although UB, most reasonable compilers do what we mean even without the cast. Also, I think I've seen you answer someone else some time, with that into consideration.Catalysis
@Gauthier: So, in 2017 I made an edit adding the note that the type needed to be unsigned (unsigned char) to avoid promotion to int... but I am not sure what I was thinking since as you mention in this case int is big enough and therefore will be selected regardless. I'll amend the formula with an extra unsigned cast and tweak the note.Azazel
@MatthieuM. and for the little-endian too :)Catalysis
@Catalysis Oopsie, fixed!Azazel
I really do think that opinions like Rob Pike's here are on the nose... EXCEPT for the case of floating-point numbers, which is what the question specified as the use-case (and it's also my own use-case.)Handknit
@JamesTheAwesomeDude: Floating points are just bytes under the hood, you need to create an integer of the appropriate size (std::uint32_t or std::uint64_t) and then std::bit_cast it into the floating point of the appropriate type.Azazel
I found this answer useful to handle floating point values.Fishhook
@LudovicKuty: Be careful, the answer you linked is good for C, but Undefined Behavior in C++. In C++, use a memcpy to copy the bytes between integral and floating point.Azazel
Oops ok, I didn't know. I will dig deeperFishhook
Undefined Behavior in C++: Type Punning is interesting.Fishhook
K
62

If you are doing this for purposes of network/host compatibility you should use:

ntohl() //Network to Host byte order (Long)
htonl() //Host to Network byte order (Long)

ntohs() //Network to Host byte order (Short)
htons() //Host to Network byte order (Short)

If you are doing this for some other reason one of the byte_swap solutions presented here would work just fine.

Kenya answered 19/9, 2008 at 20:38 Comment(12)
network byte ordering is big endian I believe. These functions can be used with that in mind even if you're not using network code. However there are no float versions ntohf or htonfFerrotype
Matt H. that is only mostly correct. Not all computer systems have little-endian byte order. If you were working on, say, a Motorola 68k, a PowerPC, or another big-endian architecture, these functions will not swap bytes at all because they are already in "network byte order".Kenya
Unfortunately, htonl and ntohl can't go to little endian on a big-endian platform.Osteology
@BrianVandenberg: That's not what they are made for. They are made for giving a consistent external format. I'd say unless you're actually implementing those functions, you normally shouldn't even care about what that format actually is.Fission
@Matt: As long as you use it only to give a consistent external format, but don't need to care what that format actually is, you don't need to have in mind what that ordering actually is.Fission
@celtschk, understood; however, the OP wants a way to switch endianness, even in a big-endian environment.Osteology
To head off the inevitable question: there are a number of reasons to need LE for a BE platform; a number of file formats (bmp, fli, pcx, qtm, rtf, tga to name a few) use little endian values ... or at least, some version of the format did at one time anyway.Osteology
@BrianVandenberg Wow, didn't see this comment of yours before replying to that one :) Upvoted both!Darwen
Does POSIX guarantee that network order is big endian?Olva
@CiroSantilli新疆改造中心996ICU六四事件 No, the convention comes from elsewhere, and you'll need knowledge of the protocol you're working with in order to determine whether this definition of "network order" is applicable (but it will be with e.g. Berkeley sockets parameters)Prairial
@LightnessRacesinOrbit POSIX does guarantee "network byte order" is big-endian: "The convention is that all such values are stored with 8 bits in each octet, and with the first (lowest-addressed) octet holding the most-significant bits. This is called "network byte order"."Thomasenathomasin
@CiroSantilliOurBigBook.com See my previous comment...Thomasenathomasin
S
30

I took a few suggestions from this post and put them together to form this:

#include <boost/type_traits.hpp>
#include <boost/static_assert.hpp>
#include <boost/detail/endian.hpp>
#include <stdexcept>
#include <cstdint>

enum endianness
{
    little_endian,
    big_endian,
    network_endian = big_endian,
    
    #if defined(BOOST_LITTLE_ENDIAN)
        host_endian = little_endian
    #elif defined(BOOST_BIG_ENDIAN)
        host_endian = big_endian
    #else
        #error "unable to determine system endianness"
    #endif
};

namespace detail {

template<typename T, size_t sz>
struct swap_bytes
{
    inline T operator()(T val)
    {
        throw std::out_of_range("data size");
    }
};

template<typename T>
struct swap_bytes<T, 1>
{
    inline T operator()(T val)
    {
        return val;
    }
};

template<typename T>
struct swap_bytes<T, 2>
{
    inline T operator()(T val)
    {
        return ((((val) >> 8) & 0xff) | (((val) & 0xff) << 8));
    }
};

template<typename T>
struct swap_bytes<T, 4>
{
    inline T operator()(T val)
    {
        return ((((val) & 0xff000000) >> 24) |
                (((val) & 0x00ff0000) >>  8) |
                (((val) & 0x0000ff00) <<  8) |
                (((val) & 0x000000ff) << 24));
    }
};

template<>
struct swap_bytes<float, 4>
{
    inline float operator()(float val)
    {
        uint32_t mem =swap_bytes<uint32_t, sizeof(uint32_t)>()(*(uint32_t*)&val);
        return *(float*)&mem;
    }
};

template<typename T>
struct swap_bytes<T, 8>
{
    inline T operator()(T val)
    {
        return ((((val) & 0xff00000000000000ull) >> 56) |
                (((val) & 0x00ff000000000000ull) >> 40) |
                (((val) & 0x0000ff0000000000ull) >> 24) |
                (((val) & 0x000000ff00000000ull) >> 8 ) |
                (((val) & 0x00000000ff000000ull) << 8 ) |
                (((val) & 0x0000000000ff0000ull) << 24) |
                (((val) & 0x000000000000ff00ull) << 40) |
                (((val) & 0x00000000000000ffull) << 56));
    }
};

template<>
struct swap_bytes<double, 8>
{
    inline double operator()(double val)
    {
        uint64_t mem =swap_bytes<uint64_t, sizeof(uint64_t)>()(*(uint64_t*)&val);
        return *(double*)&mem;
    }
};

template<endianness from, endianness to, class T>
struct do_byte_swap
{
    inline T operator()(T value)
    {
        return swap_bytes<T, sizeof(T)>()(value);
    }
};
// specialisations when attempting to swap to the same endianess
template<class T> struct do_byte_swap<little_endian, little_endian, T> { inline T operator()(T value) { return value; } };
template<class T> struct do_byte_swap<big_endian,    big_endian,    T> { inline T operator()(T value) { return value; } };

} // namespace detail

template<endianness from, endianness to, class T>
inline T byte_swap(T value)
{
    // ensure the data is only 1, 2, 4 or 8 bytes
    BOOST_STATIC_ASSERT(sizeof(T) == 1 || sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);
    // ensure we're only swapping arithmetic types
    BOOST_STATIC_ASSERT(boost::is_arithmetic<T>::value);

    return detail::do_byte_swap<from, to, T>()(value);
}

You would then use it as follows:

// swaps val from host-byte-order to network-byte-order
auto swapped = byte_swap<host_endian, network_endian>(val);

and vice-versa

// swap a value received from the network into host-byte-order
auto val = byte_swap<network_endian, host_endian>(val_from_network);
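The code above depends on Boost's BOOST_LITTLE_ENDIAN and BOOST_BIG_ENDIAN macros to learn the host order. Where Boost is not available, one possible substitute is a run-time probe, sketched below. The names host_is_little_endian and host_to_big32 are hypothetical, and the sketch assumes the host is either little- or big-endian (mixed-endian machines are not handled). On C++20 compilers, std::endian::native from <bit> gives the same answer at compile time.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the least significant byte is stored first.
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);   // inspect the lowest-addressed byte
    return first_byte == 1;
}

// Hypothetical helper: converts a host-order value to big-endian (network) order.
inline std::uint32_t host_to_big32(std::uint32_t v)
{
    if (!host_is_little_endian())
        return v;                          // assumed to be big-endian already
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}
```

Compilers can usually fold the probe into a constant, but unlike the template version above that is an optimization rather than a guarantee.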
Salad answered 19/8, 2010 at 14:36 Comment(1)
you also have to include <cstdint> or <stdint.h>, for example, for uint32_tAlpinist
D
23

The procedure for going from big-endian to little-endian is the same as going from little-endian to big-endian.

Here's some example code:

void swapByteOrder(unsigned short& us)
{
    us = (us >> 8) |
         (us << 8);
}

void swapByteOrder(unsigned int& ui)
{
    ui = (ui >> 24) |
         ((ui<<8) & 0x00FF0000) |
         ((ui>>8) & 0x0000FF00) |
         (ui << 24);
}

void swapByteOrder(unsigned long long& ull)
{
    ull = (ull >> 56) |
          ((ull<<40) & 0x00FF000000000000) |
          ((ull<<24) & 0x0000FF0000000000) |
          ((ull<<8) & 0x000000FF00000000) |
          ((ull>>8) & 0x00000000FF000000) |
          ((ull>>24) & 0x0000000000FF0000) |
          ((ull>>40) & 0x000000000000FF00) |
          (ull << 56);
}
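The question also asks about double-precision values, which the integer overloads above do not cover directly. A minimal sketch of one way to extend the idea, assuming double is 64 bits wide: copy the bits into a std::uint64_t with memcpy, swap the integer, and keep the swapped form as an integer rather than as a double. The function names swap_u64, double_to_swapped_bits and double_from_swapped_bits are invented for this example.

```cpp
#include <cstdint>
#include <cstring>

// Reverses the eight bytes of a 64-bit integer.
inline std::uint64_t swap_u64(std::uint64_t v)
{
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFFu);   // lowest remaining byte moves toward the top
        v >>= 8;
    }
    return r;
}

// Hypothetical helper: the byte-swapped bit pattern of a double, as an integer.
inline std::uint64_t double_to_swapped_bits(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);  // well-defined, unlike a pointer cast
    return swap_u64(bits);
}

// Hypothetical helper: rebuilds a native double from a byte-swapped pattern.
inline double double_from_swapped_bits(std::uint64_t swapped)
{
    const std::uint64_t bits = swap_u64(swapped);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Holding the swapped pattern in an integer avoids ever loading a meaningless bit pattern into a floating-point register.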
Drone answered 19/9, 2008 at 20:31 Comment(2)
The last function posted here is incorrect, and should be edited to: void swapByteOrder(unsigned long long& ull) { ull = (ull >> 56) | ... (ull << 56); }Lost
I don't think it's correct to be using logical-and (&&) as opposed to bitwise-and (&). According to the C++ spec, both operands are implicitly converted to bool, which is not what you want.Woolgrower
B
17

There is an assembly instruction called BSWAP that will do the swap for you, extremely fast. You can read about it here.

Visual Studio, or more precisely the Visual C++ runtime library, has platform intrinsics for this, called _byteswap_ushort(), _byteswap_ulong(), and _byteswap_uint64(). Similar should exist for other platforms, but I'm not aware of what they would be called.

Blesbok answered 19/9, 2008 at 20:34 Comment(2)
That's a great link. It's rekindled my interest in x86 assembler.Mansell
Timing results for BSWAP are presented here. gmplib.org/~tege/x86-timing.pdf ... and here ... agner.org/optimize/instruction_tables.pdfFiddlededee
K
12

We've done this with templates. You could do something like this:

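#include <cstddef>
#include <cstdint>

// Primary template: only the specializations below are defined.
template< std::size_t N >
inline void endian_byte_swapper(char* dest, char const* src);

// Specialization for 2-byte types.
template<>
inline void endian_byte_swapper< 2 >(char* dest, char const* src)
{
    // Use bit manipulations instead of accessing individual bytes from memory, much faster.
    // Note: the casts assume src and dest are suitably aligned for a 2-byte integer.
    std::uint16_t* p_dest = reinterpret_cast< std::uint16_t* >(dest);
    std::uint16_t const* const p_src = reinterpret_cast< std::uint16_t const* >(src);
    *p_dest = static_cast< std::uint16_t >((*p_src >> 8) | (*p_src << 8));
}

// Specialization for 4-byte types.
template<>
inline void endian_byte_swapper< 4 >(char* dest, char const* src)
{
    // Use bit manipulations instead of accessing individual bytes from memory, much faster.
    // Note: the casts assume src and dest are suitably aligned for a 4-byte integer.
    std::uint32_t* p_dest = reinterpret_cast< std::uint32_t* >(dest);
    std::uint32_t const* const p_src = reinterpret_cast< std::uint32_t const* >(src);
    *p_dest = (*p_src >> 24) | ((*p_src & 0x00ff0000) >> 8) | ((*p_src & 0x0000ff00) << 8) | (*p_src << 24);
}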
Kilocycle answered 19/9, 2008 at 20:29 Comment(0)
T
9

The same way you do in C:

unsigned short big = 0xdead;
unsigned short little = (((big & 0xff)<<8) | ((big & 0xff00)>>8));

You could also declare a vector of unsigned chars, memcpy the input value into it, reverse the bytes into another vector and memcpy the bytes out, but that'll take orders of magnitude longer than bit-twiddling, especially with 64-bit values.
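For reference, the memcpy-and-reverse alternative described above might look roughly like this. The sketch uses a fixed-size array instead of a vector so nothing is allocated, and the name reverse_bytes is made up for the example. How it compares in speed with bit-twiddling depends on the compiler and optimization level, so measure before relying on either.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: returns a copy of value with its bytes in reverse order.
template <typename T>
T reverse_bytes(T value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte reversal only makes sense for trivially copyable types");
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &value, sizeof(T));    // copy the object representation out
    std::reverse(bytes, bytes + sizeof(T));   // reverse the byte order
    std::memcpy(&value, bytes, sizeof(T));    // copy the reversed bytes back in
    return value;
}
```

Usage would be along the lines of reverse_bytes<std::uint32_t>(x), with the same code serving 16-, 32- and 64-bit integers.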

Thyroxine answered 19/9, 2008 at 20:30 Comment(0)
N
8

If you're doing this to transfer data between different platforms look at the ntoh and hton functions.

Nall answered 19/9, 2008 at 20:26 Comment(0)
F
8

On most POSIX systems (though it's not in the POSIX standard) there is an endian.h header, which can be used to determine what encoding your system uses. From there it's something like this:

unsigned int change_endian(unsigned int x)
{
    unsigned char *ptr = (unsigned char *)&x;
    // Cast before shifting: ptr[0] would otherwise be promoted to a signed int.
    return ((unsigned int)ptr[0] << 24) | (ptr[1] << 16) | (ptr[2] << 8) | ptr[3];
}

On a little-endian host this reverses the byte order (on a big-endian host it returns the value unchanged):

If you have the number 0xDEADBEEF (on a little-endian system stored in memory as the bytes EF BE AD DE), ptr[0] will be 0xEF, ptr[1] is 0xBE, etc.

But if you want to use it for networking, then htons, htonl and htonll (and their inverses ntohs, ntohl and ntohll) will be helpful for converting from host order to network order.

Froehlich answered 19/9, 2008 at 20:33 Comment(2)
That's funny - the POSIX standard at opengroup.org/onlinepubs/9699919799/toc.htm does not mention a header '<endian.h>`.Hypnotherapy
You can use htonl and friends regardless of whether the use-case has anything to do with networking. Network byte order is big-endian, so just treat those functions as host_to_be and be_to_host. (Doesn't help if you need host_to_le, though.)Obrian
E
7

Note that, at least for Windows, htonl() is much slower than its intrinsic counterpart _byteswap_ulong(). The former is a DLL library call into ws2_32.dll, the latter is one BSWAP assembly instruction. Therefore, if you are writing some platform-dependent code, prefer using the intrinsics for speed:

#define htonl(x) _byteswap_ulong(x)

This may be especially important for .PNG image processing, where all integers are saved in big-endian order with the explanation "One can use htonl()...", which slows down typical Windows programs if you are not prepared.

Edmonton answered 20/8, 2013 at 10:45 Comment(1)
Demo: godbolt.org/z/G79hrEPba Here you can see gcc and clang inlining htonl to a single bswap instruction, whereas msvc calls a function.Kartis
R
7

Seriously... I don't understand why all solutions are that complicated! How about the simplest, most general template function that swaps any type of any size under any circumstances in any operating system????

#include <array>
#include <cstring>
#include <type_traits>
#include <utility>

template <typename T>
void SwapEnd(T& var)
{
    static_assert(std::is_pod<T>::value, "Type must be POD type for safety");
    std::array<char, sizeof(T)> varArray;
    std::memcpy(varArray.data(), &var, sizeof(T));
    for(int i = 0; i < static_cast<int>(sizeof(var)/2); i++)
        std::swap(varArray[sizeof(var) - 1 - i],varArray[i]);
    std::memcpy(&var, varArray.data(), sizeof(T));
}

It's the magic power of C and C++ together! Simply swap the original variable character by character.

Point 1: No operators: Remember that I didn't use the simple assignment operator "=" because some objects will be messed up when the endianness is flipped and the copy constructor (or assignment operator) won't work. Therefore, it's more reliable to copy them char by char.

Point 2: Be aware of alignment issues: Notice that we're copying to and from an array, which is the right thing to do because the C++ compiler doesn't guarantee that we can access unaligned memory (this answer was updated from its original form for this). For example, if you allocate uint64_t, your compiler cannot guarantee that you can access the 3rd byte of that as a uint8_t. Therefore, the right thing to do is to copy this to a char array, swap it, then copy it back (so no reinterpret_cast). Notice that compilers are mostly smart enough to convert what you did back to a reinterpret_cast if they're capable of accessing individual bytes regardless of alignment.

To use this function:

double x = 5;
SwapEnd(x);

and now x is different in endianness.

Rubdown answered 7/8, 2014 at 8:7 Comment(9)
This will work anywhere, but assembly code produced will often be suboptimal: see my question stackoverflow.com/questions/36657895/…Christmann
You use new/delete to allocate a buffer for this?!? sizeof(var) is a compile-time constant, so you could do char varSwapped[sizeof(var)]. Or you could do char *p = reinterpret_cast<char*>(&var) and swap in-place.Obrian
@Peter this answer is quick and dirty made to prove a point. I'll implement your suggestions. However, you don't have to be a mega SO AH and down-vote the 5-line solution compared to the 50-line solutions that are given up there. I'm not gonna say more.Rubdown
This answer makes some useful points about being careful with constructors and overloaded operators on wrong-endian data, so I'd be happy to remove my downvote once the code isn't horrible, and is something that a good compiler could compile into a bswap instruction. Also, I'd suggest using for(size_t i = 0 ; i < sizeof(var) ; i++) instead of a static_cast<long>. (Or actually, in-place swap will use an ascending and descending char* so that goes away anyway).Obrian
e.g. see Mark Ransom's answer using std::swap to reverse in-place.Obrian
@PeterCordes I didn't know that answer existed. Anyway, I improved my answer.Rubdown
Thanks for taking the time to do that; looks much better now. A smart compiler might manage to compile that to an endian-swap instruction like bswap if used on integers of the right size. (But probably still not :/)Obrian
Why static_cast<long> and not simply use size_t i instead? See C++11, §5.3.3: "The result of sizeof and sizeof... is a constant of type std::size_t"Semiliterate
@Semiliterate force of habit :-) . I always use signed in loops to avoid underflows in certain cases, and cast the comparison to avoid warnings.Rubdown
V
6

Most platforms have a system header file that provides efficient byteswap functions. On Linux it is in <endian.h>. You can wrap it nicely in C++:

#include <iostream>

#include <endian.h>

template<size_t N> struct SizeT {};

#define BYTESWAPS(bits) \
template<class T> inline T htobe(T t, SizeT<bits / 8>) { return htobe ## bits(t); } \
template<class T> inline T htole(T t, SizeT<bits / 8>) { return htole ## bits(t); } \
template<class T> inline T betoh(T t, SizeT<bits / 8>) { return be ## bits ## toh(t); } \
template<class T> inline T letoh(T t, SizeT<bits / 8>) { return le ## bits ## toh(t); }

BYTESWAPS(16)
BYTESWAPS(32)
BYTESWAPS(64)

#undef BYTESWAPS

template<class T> inline T htobe(T t) { return htobe(t, SizeT<sizeof t>()); }
template<class T> inline T htole(T t) { return htole(t, SizeT<sizeof t>()); }
template<class T> inline T betoh(T t) { return betoh(t, SizeT<sizeof t>()); }
template<class T> inline T letoh(T t) { return letoh(t, SizeT<sizeof t>()); }

int main()
{
    std::cout << std::hex;
    std::cout << htobe(static_cast<unsigned short>(0xfeca)) << '\n';
    std::cout << htobe(0xafbeadde) << '\n';

    // Use ULL suffix to specify integer constant as unsigned long long 
    std::cout << htobe(0xfecaefbeafdeedfeULL) << '\n';
}

Output (on a little-endian host):

cafe
deadbeaf
feeddeafbeefcafe
Valerianaceous answered 10/2, 2011 at 12:2 Comment(2)
Change:#define BYTESWAPS(bits) \ template<class T> inline T htobe(T t, SizeT<bits / 8>) { return htobe ## bits(t); } \ template<class T> inline T htole(T t, SizeT<bits / 8>) { return htole ## bits(t); } \ template<class T> inline T betoh(T t, SizeT<bits / 8>) { return be ## bits ## toh(t); } \ template<class T> inline T letoh(T t, SizeT<bits / 8>) { return le ## bits ## toh(t); }Baler
Thanks, forgot to test betoh() and letoh().Valerianaceous
H
4

I have this code that allows me to convert from HOST_ENDIAN_ORDER (whatever it is) to LITTLE_ENDIAN_ORDER or BIG_ENDIAN_ORDER. I use a template, so if I try to convert from HOST_ENDIAN_ORDER to LITTLE_ENDIAN_ORDER and they happen to be the same for the machine for which I compile, no code will be generated.

Here is the code with some comments:

// We define some constants for little, big and host endianness. Here I use
// BOOST_LITTLE_ENDIAN/BOOST_BIG_ENDIAN to check the host endianness. If you
// don't want to use boost you will have to modify this part a bit.
enum EEndian
{
  LITTLE_ENDIAN_ORDER,
  BIG_ENDIAN_ORDER,
#if defined(BOOST_LITTLE_ENDIAN)
  HOST_ENDIAN_ORDER = LITTLE_ENDIAN_ORDER
#elif defined(BOOST_BIG_ENDIAN)
  HOST_ENDIAN_ORDER = BIG_ENDIAN_ORDER
#else
#error "Unable to determine the endianness of the target system."
#endif
};

// This function swaps the bytes of a value, given its size as a template
// parameter (could sizeof be used?).
template <class T, unsigned int size>
inline T SwapBytes(T value)
{
  union
  {
     T value;
     char bytes[size];
  } in, out;

  in.value = value;

  for (unsigned int i = 0; i < size / 2; ++i)
  {
     out.bytes[i] = in.bytes[size - 1 - i];
     out.bytes[size - 1 - i] = in.bytes[i];
  }

  return out.value;
}

// Here is the function you will use. Again there are two compile-time assertions
// that use the Boost library. You could probably comment them out, but if you
// do, be careful not to use this function for anything other than integer
// types. This function needs to be called like this:
//
//     int x = someValue;
//     int i = EndianSwapBytes<HOST_ENDIAN_ORDER, BIG_ENDIAN_ORDER>(x);
//
template<EEndian from, EEndian to, class T>
inline T EndianSwapBytes(T value)
{
  // A: the data to swap is 2, 4 or 8 bytes in size
  BOOST_STATIC_ASSERT(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8);

  // A: the data to swap is of an arithmetic type
  BOOST_STATIC_ASSERT(boost::is_arithmetic<T>::value);

  // If from and to are the same, do not swap.
  if (from == to)
     return value;

  return SwapBytes<T, sizeof(T)>(value);
}
Horsley answered 20/9, 2008 at 4:25 Comment(0)
B
4

If a 32-bit unsigned integer with the value 0xAABBCCDD (2864434397) is stored on a big-endian processor, its bytes sit in memory in the order AA BB CC DD. On a little-endian processor the same value, still equal to 2864434397, is stored as DD CC BB AA.

Likewise, a 16-bit unsigned short with the value 0xAABB (43707) is stored as AA BB on a big-endian processor and as BB AA on a little-endian processor.
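
To see the two layouts concretely, one can copy a value's bytes out of memory and look at the first one. This is only a minimal sketch; first_byte_of and host_is_little_endian are names made up for this illustration, not part of any library, and mixed-endian hosts are not considered:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the byte stored at the lowest address of v.
inline unsigned char first_byte_of(std::uint32_t v) {
    unsigned char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);   // well-defined way to inspect the stored bytes
    return bytes[0];
}

// Hypothetical helper: true when the least significant byte comes first in memory.
inline bool host_is_little_endian() {
    return first_byte_of(0x00000001u) == 0x01;
}
```

On a little-endian host first_byte_of(0xAABBCCDDu) is 0xDD, on a big-endian host it is 0xAA, while the numeric value is the same on both.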

Here are a couple of handy #define macros to swap bytes from little-endian to big-endian and vice versa:

// can be used for short, unsigned short (2-byte integer types)
#define BYTESWAP16(n) ((((n)&0xFF00)>>8)|(((n)&0x00FF)<<8))

// can be used for int or unsigned int (4-byte integer types; not float, which has no bitwise operators)
#define BYTESWAP32(n) ((BYTESWAP16(((n)&0xFFFF0000)>>16))|((BYTESWAP16((n)&0x0000FFFF))<<16))

// can be used for unsigned long long (8-byte integer types; not double)
#define BYTESWAP64(n) ((BYTESWAP32(((n)&0xFFFFFFFF00000000ULL)>>32))|((BYTESWAP32((n)&0x00000000FFFFFFFFULL))<<32))
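
Bitwise operators do not apply to float or double, so a floating-point value has to go through an integer of the same size first. The following is a sketch under the assumption that double is 64 bits wide; bswap64, double_to_foreign_bits and double_from_foreign_bits are illustrative names, not an existing API:

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes a 64-bit double");

// Reverse the eight bytes of a 64-bit unsigned integer.
inline std::uint64_t bswap64(std::uint64_t v) {
    v = (v << 32) | (v >> 32);
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFULL);
    v = ((v & 0x00FF00FF00FF00FFULL) << 8)  | ((v >> 8)  & 0x00FF00FF00FF00FFULL);
    return v;
}

// Native double -> bit pattern in the opposite byte order (ready to be written out).
inline std::uint64_t double_to_foreign_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bswap64(bits);
}

// Bit pattern read from an opposite-order source -> native double.
inline double double_from_foreign_bits(std::uint64_t foreign) {
    const std::uint64_t bits = bswap64(foreign);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

Keeping the swapped pattern in an integer avoids holding byte-reversed data in a floating-point variable, where it might not be a valid value.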
Bobker answered 6/9, 2014 at 21:6 Comment(0)
E
3

If you take the common pattern for reversing the order of bits in a word, and cull the part that reverses bits within each byte, then you're left with something which only reverses the bytes within a word. For 64-bits:

x = ((x & 0x00000000ffffffff) << 32) ^ ((x >> 32) & 0x00000000ffffffff);
x = ((x & 0x0000ffff0000ffff) << 16) ^ ((x >> 16) & 0x0000ffff0000ffff);
x = ((x & 0x00ff00ff00ff00ff) <<  8) ^ ((x >>  8) & 0x00ff00ff00ff00ff);

The compiler should clean out the superfluous bit-masking operations (I left them in to highlight the pattern), but if it doesn't you can rewrite the first line this way:

x = ( x                       << 32) ^  (x >> 32);

That should normally simplify down to a single rotate instruction on most architectures (ignoring that the whole operation is probably one instruction).

On a RISC processor the large, complicated constants may cause the compiler difficulties. You can trivially calculate each of the constants from the previous one, though. Like so:

uint64_t k = 0x00000000ffffffff; /* compiler should know a trick for this */
x = ((x & k) << 32) ^ ((x >> 32) & k);
k ^= k << 16;
x = ((x & k) << 16) ^ ((x >> 16) & k);
k ^= k << 8;
x = ((x & k) <<  8) ^ ((x >>  8) & k);

If you like, you can write that as a loop. It won't be efficient, but just for fun:

int i = sizeof(x) * CHAR_BIT / 2;
uintmax_t k = ((uintmax_t)1 << i) - 1; /* shift a uintmax_t, not an int: i can be 32 */
while (i >= 8)
{
    x = ((x & k) << i) ^ ((x >> i) & k);
    i >>= 1;
    k ^= k << i;
}

And for completeness, here's the simplified 32-bit version of the first form:

x = ( x               << 16) ^  (x >> 16);
x = ((x & 0x00ff00ff) <<  8) ^ ((x >>  8) & 0x00ff00ff);
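
Wrapped as a function, the loop form might look like the sketch below. reverse_bytes is a name chosen for this example; it assumes an unsigned integer type and 8-bit bytes, and it builds the initial mask from a uintmax_t so that the shift cannot overflow an int:

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper around the halving loop: swaps halves, then quarters,
// and so on, until adjacent bytes have been exchanged.
template <typename T>
T reverse_bytes(T x) {
    static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
    std::uintmax_t v = x;
    unsigned i = sizeof(T) * CHAR_BIT / 2;             // half the width in bits
    std::uintmax_t k = (std::uintmax_t(1) << i) - 1;   // mask covering the low half
    while (i >= 8) {
        v = ((v & k) << i) | ((v >> i) & k);           // swap neighbouring i-bit groups
        i >>= 1;
        k ^= k << i;                                   // derive the next, finer mask
    }
    return static_cast<T>(v);
}
```

For a 1-byte type the loop never runs and the value is returned unchanged.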
Extrude answered 20/8, 2013 at 11:45 Comment(0)
O
3

Just thought I'd add my own solution here since I haven't seen it anywhere. It's a small, portable C++ function template that only uses bit operations (so it works for integer types only).

template<typename T> inline static T swapByteOrder(const T& val) {
    int totalBytes = sizeof(val);
    T swapped = (T) 0;
    for (int i = 0; i < totalBytes; ++i) {
        swapped |= (val >> (8*(totalBytes-i-1)) & 0xFF) << (8*i);
    }
    return swapped;
}
Opine answered 29/1, 2016 at 19:30 Comment(0)
F
2

Here's a generalized version I came up with off the top of my head, for swapping a value in place. The other suggestions would be better if performance is a problem.

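#include <cstddef>
#include <utility>

template<typename T>
void ByteSwap(T * p)
{
    // Swap the i-th byte with its mirror image, working inward from both ends.
    for (std::size_t i = 0;  i < sizeof(T)/2;  ++i)
        std::swap(((char *)p)[i], ((char *)p)[sizeof(T)-1-i]);
}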

Disclaimer: I haven't tried to compile this or test it yet.

Faience answered 19/9, 2008 at 21:7 Comment(0)
U
2

I like this one, just for style :-)

long swap(long i) {
    char *c = (char *) &i;
    return * (long *) (char[]) {c[3], c[2], c[1], c[0] };
}
Unstriped answered 26/6, 2011 at 22:13 Comment(2)
I get an error on char[] saying 'Error: incomplete type is not allowed'Kutenai
This is undefined behavior. It's always a strict aliasing violation and can also violate any alignment requirements for long.Thomasenathomasin
B
2

Using the code below, you can easily swap between big-endian and little-endian:

#include <stdint.h> /* provides uint16_t and uint32_t */

#define swap16(x) ((((uint16_t)(x) & 0x00ff)<<8)| \
(((uint16_t)(x) & 0xff00)>>8))

#define swap32(x) ((((uint32_t)(x) & 0x000000ff)<<24)| \
(((uint32_t)(x) & 0x0000ff00)<<8)| \
(((uint32_t)(x) & 0x00ff0000)>>8)| \
(((uint32_t)(x) & 0xff000000)>>24))
Blare answered 17/9, 2016 at 15:58 Comment(0)
H
2

I am really surprised no one mentioned the htobeXX and beXXtoh functions (for example htobe32 and be32toh). They are declared in endian.h and are very similar to the network functions htonl and htons.
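
As a rough illustration, and assuming a Linux system with glibc (other platforms place these functions in a different header or do not provide them), writing and reading a 32-bit big-endian field could look like this. put_be32 and get_be32 are hypothetical helper names:

```cpp
#include <endian.h>   // glibc location of htobe32/be32toh; an assumption about the platform
#include <cstdint>
#include <cstring>

// Store a 32-bit value into buf in big-endian order, whatever the host order is.
inline void put_be32(unsigned char* buf, std::uint32_t host_value) {
    const std::uint32_t be = htobe32(host_value);
    std::memcpy(buf, &be, sizeof be);
}

// Read a 32-bit big-endian value from buf back into host order.
inline std::uint32_t get_be32(const unsigned char* buf) {
    std::uint32_t be;
    std::memcpy(&be, buf, sizeof be);
    return be32toh(be);
}
```

On a big-endian host both conversions do nothing, so the same source works unchanged on either kind of machine.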

Huertas answered 11/5, 2017 at 19:15 Comment(0)
B
1

I recently wrote a macro to do this in C, but it's equally valid in C++:

#define REVERSE_BYTES(...) do for(size_t REVERSE_BYTES=0; REVERSE_BYTES<sizeof(__VA_ARGS__)>>1; ++REVERSE_BYTES)\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES],\
    ((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES] ^= ((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES];\
while(0)

It accepts any type and reverses the bytes in the passed argument. Example usages:

int main(){
    unsigned long long x = 0xABCDEF0123456789;
    printf("Before: %llX\n",x);
    REVERSE_BYTES(x);
    printf("After : %llX\n",x);

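    char c[7]={'n','a','m','e','t','a','g'}; /* no room for a terminator, so spell out the characters (also valid in C++) */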
    printf("Before: %c%c%c%c%c%c%c\n",c[0],c[1],c[2],c[3],c[4],c[5],c[6]);
    REVERSE_BYTES(c);
    printf("After : %c%c%c%c%c%c%c\n",c[0],c[1],c[2],c[3],c[4],c[5],c[6]);
}

Which prints:

Before: ABCDEF0123456789
After : 8967452301EFCDAB
Before: nametag
After : gateman

The above is perfectly copy/paste-able, but there's a lot going on here, so I'll break down how it works piece by piece:

The first notable thing is that the entire macro is encased in a do while(0) block. This is a common idiom to allow normal semicolon use after the macro.

Next up is the use of a variable named REVERSE_BYTES as the for loop's counter. The name of the macro itself is used as a variable name to ensure that it doesn't clash with any other symbols that may be in scope wherever the macro is used. Since the name is being used within the macro's expansion, it won't be expanded again when used as a variable name here.

Within the for loop, there are two bytes being referenced and XOR swapped (so a temporary variable name is not required):

((unsigned char*)&(__VA_ARGS__))[REVERSE_BYTES]
((unsigned char*)&(__VA_ARGS__))[sizeof(__VA_ARGS__)-1-REVERSE_BYTES]

__VA_ARGS__ represents whatever was given to the macro, and is used to increase the flexibility of what may be passed in (albeit not by much). The address of this argument is then taken and cast to an unsigned char pointer to permit the swapping of its bytes via array [] subscripting.

The final peculiar point is the lack of {} braces. They aren't necessary because all of the steps in each swap are joined with the comma operator, making them one statement.

Finally, it's worth noting that this is not the ideal approach if speed is a top priority. If this is an important factor, some of the type-specific macros or platform-specific directives referenced in other answers are likely a better option. This approach, however, is portable to all types, all major platforms, and both the C and C++ languages.

Besant answered 17/12, 2016 at 7:15 Comment(1)
found this somewhere in some code. confused the heck out of me. Thanks for the explanation. However why the use of __VA_ARGS__?Whiten
U
1

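If you have C++17, add these headers:

#include <algorithm>
#include <type_traits>

Use this function template to reverse the bytes of a fixed-size buffer such as a std::array:

template <typename T>
void swapEndian(T& buffer)
{
    static_assert(std::is_pod<T>::value, "swapEndian supports POD types only");
    // sizeof(buffer) equals the data size only for containers that store their
    // elements inline, such as std::array (not std::vector).
    char* startIndex = static_cast<char*>((void*)buffer.data());
    char* endIndex = startIndex + sizeof(buffer);
    std::reverse(startIndex, endIndex);
}

Call it like this:

swapEndian(stlArray);

It reverses the whole buffer, so the buffer should hold the bytes of a single value.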
Unlawful answered 10/4, 2020 at 18:42 Comment(0)
M
0

Wow, I couldn't believe some of the answers I've read here. There's actually an instruction in assembly which does this faster than anything else. bswap. You could simply write a function like this...

__declspec(naked) uint32_t EndianSwap(uint32_t value)
{
    __asm
    {
        mov eax, dword ptr[esp + 4]
        bswap eax
        ret
    }
}

It is MUCH faster than the intrinsics that have been suggested. I've disassembled them and looked. The above function has no prologue/epilogue so virtually has no overhead at all.

unsigned long _byteswap_ulong(unsigned long value);

Doing 16-bit is just as easy, except that you'd use xchg al, ah. In 32-bit code, bswap only works on 32-bit registers.

64-bit is a little more tricky, but not overly so. Much better than all of the above examples with loops and templates etc.

There are some caveats here. Firstly, bswap is only available on 80486 CPUs and above. Is anyone planning on running it on a 386?!? If so, you can still replace bswap with...

mov ebx, eax
shr ebx, 16
xchg al, ah
xchg bl, bh
shl eax, 16
or eax, ebx

Also, inline assembly is only available for x86 code in Visual Studio. A naked function cannot be inlined and also isn't available in x64 builds. In that instance, you're going to have to use the compiler intrinsics.
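
For builds where inline assembly is not available, a wrapper around the compiler intrinsics might be sketched as follows. endian_swap32 is a hypothetical name; _byteswap_ulong is the MSVC intrinsic and __builtin_bswap32 the GCC/Clang builtin, with plain shifts as a fallback for other compilers:

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>   // declares _byteswap_ulong on MSVC
#endif

// Hypothetical wrapper: picks whichever byte-swap facility the compiler offers.
inline std::uint32_t endian_swap32(std::uint32_t v) {
#if defined(_MSC_VER)
    return _byteswap_ulong(v);        // MSVC intrinsic
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);      // GCC/Clang builtin
#else
    // Portable fallback using shifts and masks.
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
#endif
}
```

Unlike a naked function, a wrapper like this can be inlined by the compiler at the call site.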

Martica answered 24/9, 2014 at 1:47 Comment(7)
_byteswap_ulong and _uint64 (e.g. in the accepted answer) both compile to use the bswap instruction. I would be surprised but interested to know if this asm is that much faster as it only omits the prologue/epilogue -- did you benchmark it?Mcnabb
@stdcall The question didn't ask for a portable solution or even mentioned anything about a platform. As my answer said, the above is about the fastest way to endian swap. Sure, if you're writing this on a non-X86 platform then this isn't going to work, but as I also mentioned, you're then limited to compiler intrinsics, if your compiler even supports them.Martica
@Mcnabb In this particular case, I think omitting the prologue and epilogue is going to give you a decent saving because you're essentially only executing 1 instruction. The prologue is going to have to push onto the stack, do a subtraction, set the base-pointer and then similar at the end. I've not benchmarked it, but the above has a 0 dependency chain which you're simply not going to get without it being naked. Maybe a good compiler would inline it, but then you're in a different ball-park.Martica
Perhaps. But note that in the common case of swapping an array of numbers, the compiler intrinsics discussed in other answers will use SSE/AVX extensions and emit PSHUFB, which outperforms BSWAP. See wm.ite.pl/articles/reverse-array-of-bytes.htmlMcnabb
It's bad form IMHO to post a platform-specific solution, when the OP didn't specify that they only needed a solution for x86. And to disparage the other solutions, when yours is unusable on many very widely used OS's such as iOS and Android (which use ARM or MIPS CPUs.)Constitutionalism
@Jens Alfke When the question was posted, there was no mention of architecture or even what size of endian swap they wanted. I know that such a solution wouldn't work on architectures other than x86, but I did mention the architecture it was for. Futher, the accepted answer contains platform specific types and functions and wouldn't work on iOS or Android anymore that mine would. I took at punt and guessed at what the user wanted based on a vague question. It wasn't my intention to be insulting, I just thought some of the answers weren't good given that most were bit shifting and or-ing.Martica
This can't inline. A compiler intrinsic should inline to a single bswap instruction, but this answer forces the compiler to actually make a function call. Were you comparing the asm output with optimization disabled?Obrian
R
0

A portable technique for implementing optimizer-friendly, unaligned, non-in-place endian accessors. They work with every boundary alignment and every byte ordering. The listing uses old-style (K&R) C function definitions, so it needs a C compiler that still accepts them; C++ does not. These unaligned routines are supplemented, or mooted, depending on native endianness and alignment. This is a partial listing, but you get the idea. The BO* names are constant values based on the native byte ordering.

uint32_t sw_get_uint32_1234(pu32)
uint32_1234 *pu32;
{
  union {
    uint32_1234 u32_1234;
    uint32_t u32;
  } bou32;
  bou32.u32_1234[0] = (*pu32)[BO32_0];
  bou32.u32_1234[1] = (*pu32)[BO32_1];
  bou32.u32_1234[2] = (*pu32)[BO32_2];
  bou32.u32_1234[3] = (*pu32)[BO32_3];
  return(bou32.u32);
}

void sw_set_uint32_1234(pu32, u32)
uint32_1234 *pu32;
uint32_t u32;
{
  union {
    uint32_1234 u32_1234;
    uint32_t u32;
  } bou32;
  bou32.u32 = u32;
  (*pu32)[BO32_0] = bou32.u32_1234[0];
  (*pu32)[BO32_1] = bou32.u32_1234[1];
  (*pu32)[BO32_2] = bou32.u32_1234[2];
  (*pu32)[BO32_3] = bou32.u32_1234[3];
}

#if HAS_SW_INT64
int64 sw_get_int64_12345678(pi64)
int64_12345678 *pi64;
{
  union {
    int64_12345678 i64_12345678;
    int64 i64;
  } boi64;
  boi64.i64_12345678[0] = (*pi64)[BO64_0];
  boi64.i64_12345678[1] = (*pi64)[BO64_1];
  boi64.i64_12345678[2] = (*pi64)[BO64_2];
  boi64.i64_12345678[3] = (*pi64)[BO64_3];
  boi64.i64_12345678[4] = (*pi64)[BO64_4];
  boi64.i64_12345678[5] = (*pi64)[BO64_5];
  boi64.i64_12345678[6] = (*pi64)[BO64_6];
  boi64.i64_12345678[7] = (*pi64)[BO64_7];
  return(boi64.i64);
}
#endif

int32_t sw_get_int32_3412(pi32)
int32_3412 *pi32;
{
  union {
    int32_3412 i32_3412;
    int32_t i32;
  } boi32;
  boi32.i32_3412[2] = (*pi32)[BO32_0];
  boi32.i32_3412[3] = (*pi32)[BO32_1];
  boi32.i32_3412[0] = (*pi32)[BO32_2];
  boi32.i32_3412[1] = (*pi32)[BO32_3];
  return(boi32.i32);
}

void sw_set_int32_3412(pi32, i32)
int32_3412 *pi32;
int32_t i32;
{
  union {
    int32_3412 i32_3412;
    int32_t i32;
  } boi32;
  boi32.i32 = i32;
  (*pi32)[BO32_0] = boi32.i32_3412[2];
  (*pi32)[BO32_1] = boi32.i32_3412[3];
  (*pi32)[BO32_2] = boi32.i32_3412[0];
  (*pi32)[BO32_3] = boi32.i32_3412[1];
}

uint32_t sw_get_uint32_3412(pu32)
uint32_3412 *pu32;
{
  union {
    uint32_3412 u32_3412;
    uint32_t u32;
  } bou32;
  bou32.u32_3412[2] = (*pu32)[BO32_0];
  bou32.u32_3412[3] = (*pu32)[BO32_1];
  bou32.u32_3412[0] = (*pu32)[BO32_2];
  bou32.u32_3412[1] = (*pu32)[BO32_3];
  return(bou32.u32);
}

void sw_set_uint32_3412(pu32, u32)
uint32_3412 *pu32;
uint32_t u32;
{
  union {
    uint32_3412 u32_3412;
    uint32_t u32;
  } bou32;
  bou32.u32 = u32;
  (*pu32)[BO32_0] = bou32.u32_3412[2];
  (*pu32)[BO32_1] = bou32.u32_3412[3];
  (*pu32)[BO32_2] = bou32.u32_3412[0];
  (*pu32)[BO32_3] = bou32.u32_3412[1];
}

float sw_get_float_1234(pf)
float_1234 *pf;
{
  union {
    float_1234 f_1234;
    float f;
  } bof;
  bof.f_1234[0] = (*pf)[BO32_0];
  bof.f_1234[1] = (*pf)[BO32_1];
  bof.f_1234[2] = (*pf)[BO32_2];
  bof.f_1234[3] = (*pf)[BO32_3];
  return(bof.f);
}

void sw_set_float_1234(pf, f)
float_1234 *pf;
float f;
{
  union {
    float_1234 f_1234;
    float f;
  } bof;
  bof.f = (float)f;
  (*pf)[BO32_0] = bof.f_1234[0];
  (*pf)[BO32_1] = bof.f_1234[1];
  (*pf)[BO32_2] = bof.f_1234[2];
  (*pf)[BO32_3] = bof.f_1234[3];
}

double sw_get_double_12345678(pd)
double_12345678 *pd;
{
  union {
    double_12345678 d_12345678;
    double d;
  } bod;
  bod.d_12345678[0] = (*pd)[BO64_0];
  bod.d_12345678[1] = (*pd)[BO64_1];
  bod.d_12345678[2] = (*pd)[BO64_2];
  bod.d_12345678[3] = (*pd)[BO64_3];
  bod.d_12345678[4] = (*pd)[BO64_4];
  bod.d_12345678[5] = (*pd)[BO64_5];
  bod.d_12345678[6] = (*pd)[BO64_6];
  bod.d_12345678[7] = (*pd)[BO64_7];
  return(bod.d);
}

void sw_set_double_12345678(pd, d)
double_12345678 *pd;
double d;
{
  union {
    double_12345678 d_12345678;
    double d;
  } bod;
  bod.d = d;
  (*pd)[BO64_0] = bod.d_12345678[0];
  (*pd)[BO64_1] = bod.d_12345678[1];
  (*pd)[BO64_2] = bod.d_12345678[2];
  (*pd)[BO64_3] = bod.d_12345678[3];
  (*pd)[BO64_4] = bod.d_12345678[4];
  (*pd)[BO64_5] = bod.d_12345678[5];
  (*pd)[BO64_6] = bod.d_12345678[6];
  (*pd)[BO64_7] = bod.d_12345678[7];
}

These typedefs have the benefit of raising compiler errors if not used with accessors, thus mitigating forgotten accessor bugs.

typedef char int8_1[1], uint8_1[1];

typedef char int16_12[2], uint16_12[2]; /* little endian */
typedef char int16_21[2], uint16_21[2]; /* big endian */

typedef char int24_321[3], uint24_321[3]; /* Alpha Micro, PDP-11 */

typedef char int32_1234[4], uint32_1234[4]; /* little endian */
typedef char int32_3412[4], uint32_3412[4]; /* Alpha Micro, PDP-11 */
typedef char int32_4321[4], uint32_4321[4]; /* big endian */

typedef char int64_12345678[8], uint64_12345678[8]; /* little endian */
typedef char int64_34128756[8], uint64_34128756[8]; /* Alpha Micro, PDP-11 */
typedef char int64_87654321[8], uint64_87654321[8]; /* big endian */

typedef char float_1234[4]; /* little endian */
typedef char float_3412[4]; /* Alpha Micro, PDP-11 */
typedef char float_4321[4]; /* big endian */

typedef char double_12345678[8]; /* little endian */
typedef char double_78563412[8]; /* Alpha Micro? */
typedef char double_87654321[8]; /* big endian */
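
A shorter way to get accessors that do not depend on the host byte order or on alignment is to build the value from individual bytes with shifts. This is a different technique from the union-based listing above, sketched here in C++ with made-up names (read_le32, read_be32, write_be32):

```cpp
#include <cstdint>

// Read a 32-bit value stored least significant byte first.
inline std::uint32_t read_le32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Read a 32-bit value stored most significant byte first.
inline std::uint32_t read_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[2]) << 8)
         |  static_cast<std::uint32_t>(p[3]);
}

// Write a 32-bit value most significant byte first.
inline void write_be32(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>(v >> 24);
    p[1] = static_cast<unsigned char>(v >> 16);
    p[2] = static_cast<unsigned char>(v >> 8);
    p[3] = static_cast<unsigned char>(v);
}
```

Because the shifts operate on values rather than on memory layout, the same code gives the same result on little-endian and big-endian hosts.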
Ramin answered 23/5, 2016 at 8:47 Comment(1)
For this question, the C++ tag makes a difference. There is lots of undefined behavior due to C++ and the union.Jordanson
R
0

Byte swapping with ye olde 3-step XOR trick around a pivot in a function template gives a flexible solution that does not require a library. It performs sizeof(t)/2 swaps, so it is linear in the size of the type, and for 1-byte types the loop body never runs:

template<typename T>void swap(T &t){
    for(uint8_t pivot = 0; pivot < sizeof(t)/2; pivot ++){
        *((uint8_t *)&t + pivot) ^= *((uint8_t *)&t+sizeof(t)-1- pivot);
        *((uint8_t *)&t+sizeof(t)-1- pivot) ^= *((uint8_t *)&t + pivot);
        *((uint8_t *)&t + pivot) ^= *((uint8_t *)&t+sizeof(t)-1- pivot);
    }
}
Restaurant answered 11/4, 2019 at 14:18 Comment(0)
T
0

Seems like the safe way would be to use htons on each word. So, if you have...

std::vector<uint16_t> word_storage(n);  // where n is the number of words to be converted

// the following would do the trick
std::transform(word_storage.cbegin(), word_storage.cend()
  , word_storage.begin(), [](const uint16_t input)->uint16_t {
  return htons(input); });

The above would be a no-op if you were on a big-endian system, so I would look for whatever your platform uses as a compile-time condition to decide whether htons is a no-op. It is O(n) after all. On a Mac, it would be something like ...

#if (__DARWIN_BYTE_ORDER != __DARWIN_BIG_ENDIAN)
std::transform(word_storage.cbegin(), word_storage.cend()
  , word_storage.begin(), [](const uint16_t input)->uint16_t {
  return htons(input); });
#endif
Tinsel answered 19/4, 2019 at 3:40 Comment(0)
A
0

Here is a basic function to swap to/from little and big endian. It's basic but it doesn't require supplementary libraries.

void endianness_swap(uint32_t& val) {
    uint8_t a, b, c;
    a = (val & 0xFF000000) >> 24;
    b = (val & 0x00FF0000) >> 16;
    c = (val & 0x0000FF00) >> 8;
    val=(val & 0x000000FF) << 24;
    val = val + (c << 16) + (b << 8) + (a);
}
Auriculate answered 26/7, 2020 at 22:27 Comment(0)
S
0

Not as efficient as using an intrinsic function, but certainly portable. My answer:

#include <cstdint>
#include <type_traits>

/**
 * Perform an endian swap of bytes against a templatized unsigned word.
 *
 * @tparam value_type The data type to perform the endian swap against.
 * @param value       The data value to swap.
 *
 * @return value_type The resulting swapped word.
 */
template <typename value_type>
constexpr inline auto endian_swap(value_type value) -> value_type
{
    using half_type = typename std::conditional<
        sizeof(value_type) == 8u,
        uint32_t,
        typename std::conditional<sizeof(value_type) == 4u, uint16_t, uint8_t>::
            type>::type;

    size_t const    half_bits  = sizeof(value_type) * 8u / 2u;
    half_type const upper_half = static_cast<half_type>(value >> half_bits);
    half_type const lower_half = static_cast<half_type>(value);

    if (sizeof(value_type) == 2u)
    {
        return (static_cast<value_type>(lower_half) << half_bits) | upper_half;
    }

    return ((static_cast<value_type>(endian_swap(lower_half)) << half_bits) |
            endian_swap(upper_half));
}
Staging answered 25/5, 2021 at 2:34 Comment(0)
L
0

Came here looking for a Boost solution and left disappointed, but finally found it elsewhere. You can use boost::endian::endian_reverse. It's templated/overloaded for all the primitive types:

#include <iostream>
#include <iomanip>
#include "boost/endian/conversion.hpp"

int main()
{
  uint32_t word = 0x01;
  std::cout << std::hex << std::setfill('0') << std::setw(8) << word << std::endl;
  // outputs 00000001;

  uint32_t word2 = boost::endian::endian_reverse(word);
  // there's also a `void endian_reverse_inplace(...)` function
  // that reverses the value passed to it in place and returns nothing

  std::cout << std::hex << std::setfill('0') << std::setw(8) << word2 << std::endl;
  // outputs 01000000

  return 0;
}

Demonstration

That said, it looks like C++23 finally put this to bed with std::byteswap. (I'm using C++17, so this was not an option.)

Leniency answered 5/11, 2022 at 19:47 Comment(0)
P
-1

A C++20 branchless version, for use now that std::endian exists but before C++23 adds std::byteswap:

#include <bit>
#include <type_traits>
#include <concepts>
#include <array>
#include <cstring>
#include <iostream>
#include <bitset>
#include <cstddef>
#include <string>

template <int LEN, int OFF=LEN/2>
class do_swap
{
    // FOR 8 bytes:
    // LEN=8 (LEN/2==4)       <H><G><F><E><D><C><B><A>
    // OFF=4: FROM=0, TO=7 => [A]<G><F><E><D><C><B>[H]
    // OFF=3: FROM=1, TO=6 => [A][B]<F><E><D><C>[G][H]
    // OFF=2: FROM=2, TO=5 => [A][B][C]<E><D>[F][G][H]
    // OFF=1: FROM=3, TO=4 => [A][B][C][D][E][F][G][H]
    // OFF=0: FROM=4, TO=3 => DONE
public:
    enum consts {FROM=LEN/2-OFF, TO=(LEN-1)-FROM};
    using NXT=do_swap<LEN, OFF-1>;
// flip the first and last for the current iteration's range
    static void flip(std::array<std::byte, LEN>& b)
    {
        std::byte tmp=b[FROM];
        b[FROM]=b[TO];
        b[TO]=tmp;
        NXT::flip(b);
    }
};
template <int LEN>
class do_swap<LEN, 0> // STOP the template recursion
{
public:
    static void flip(std::array<std::byte, LEN>&)
    {
    }
};

template<std::integral T, std::endian TO, std::endian FROM=std::endian::native>
        requires ((TO==std::endian::big) || (TO==std::endian::little))
              && ((FROM==std::endian::big) || (FROM==std::endian::little))
class endian_swap
{
public:
    enum consts {BYTE_COUNT=sizeof(T)};
    static T cvt(const T integral)
    {
    // if FROM and TO are the same -- nothing to do
        if (TO==FROM)
        {
                return integral;
        }

    // endian::big --> endian::little is the same as endian::little --> endian::big
    // the bytes have to be reversed
    // memcpy seems to be the most supported way to do byte swaps in a defined way
        std::array<std::byte, BYTE_COUNT> bytes;
        std::memcpy(&bytes, &integral, BYTE_COUNT);
        do_swap<BYTE_COUNT>::flip(bytes);
        T ret;
        std::memcpy(&ret, &bytes, BYTE_COUNT);
        return ret;
    }
};

std::endian big()
{
    return std::endian::big;
}

std::endian little()
{
    return std::endian::little;
}

std::endian native()
{
    return std::endian::native;
}

long long swap_to_big(long long x)
{
    return endian_swap<long long, std::endian::big>::cvt(x);
}

long long swap_to_little(long long x)
{
    return endian_swap<long long, std::endian::little>::cvt(x);
}

void show(std::string label, long long x)
{
    std::cout << label << "\t: " << std::bitset<64>(x) << " (" << x << ")" << std::endl;
}

int main()
{
    long long init=0xF8FCFEFF7F3F1F0;
    long long to_big=swap_to_big(init);
    long long to_little=swap_to_little(init);
    show("Init", init);
    show(">big", to_big);
    show(">little", to_little);
}
Pigling answered 8/10, 2021 at 19:11 Comment(0)
M
-2

Here's how to read a double stored in IEEE 754 64 bit format, even if your host computer uses a different system.

/*
* read a double from a stream in ieee754 format regardless of host
*  encoding.
*  fp - the stream
*  bigendian - non-zero if the most significant byte comes first,
*              zero if the least significant byte comes first
*
*/
double freadieee754(FILE *fp, int bigendian)
{
    unsigned char buff[8];
    int i;
    double fnorm = 0.0;
    unsigned char temp;
    int sign;
    int exponent;
    double bitval;
    int maski, mask;
    int expbits = 11;
    int significandbits = 52;
    int shift;
    double answer;

    /* read the data */
    for (i = 0; i < 8; i++)
        buff[i] = fgetc(fp);
    /* just reverse if not big-endian*/
    if (!bigendian)
    {
        for (i = 0; i < 4; i++)
        {
            temp = buff[i];
            buff[i] = buff[8 - i - 1];
            buff[8 - i - 1] = temp;
        }
    }
    sign = buff[0] & 0x80 ? -1 : 1;
    /* exponent in raw (biased) format */
    exponent = ((buff[0] & 0x7F) << 4) | ((buff[1] & 0xF0) >> 4);

    /* read in the mantissa. Top bit is 0.5, each successive bit is worth half the previous one */
    bitval = 0.5;
    maski = 1;
    mask = 0x08;
    for (i = 0; i < significandbits; i++)
    {
        if (buff[maski] & mask)
            fnorm += bitval;

        bitval /= 2.0;
        mask >>= 1;
        if (mask == 0)
        {
            mask = 0x80;
            maski++;
        }
    }
    /* handle zero specially */
    if (exponent == 0 && fnorm == 0)
        return 0.0;

    shift = exponent - ((1 << (expbits - 1)) - 1); /* exponent = shift + bias */
    /* nans have exp 1024 and non-zero mantissa */
    if (shift == 1024 && fnorm != 0)
        return sqrt(-1.0);
    /*infinity*/
    if (shift == 1024 && fnorm == 0)
    {

#ifdef INFINITY
        return sign == 1 ? INFINITY : -INFINITY;
#endif
        return  (sign * 1.0) / 0.0;
    }
    if (shift > -1023)
    {
        answer = ldexp(fnorm + 1.0, shift);
        return answer * sign;
    }
    else
    {
        /* denormalised numbers */
        if (fnorm == 0.0)
            return 0.0;
        shift = -1022;
        while (fnorm < 1.0)
        {
            fnorm *= 2;
            shift--;
        }
        answer = ldexp(fnorm, shift);
        return answer * sign;
    }
}
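
If the host itself stores double as IEEE 754 binary64 and uses the same byte order for integers and doubles, the bit-by-bit decoding can be replaced by assembling the 64-bit pattern and copying it into a double. This is a shortcut under those assumptions, not the routine above; decode_ieee754_double is a hypothetical name:

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes the host double is IEEE 754 binary64");

// buff holds the 8 bytes as they appeared in the stream; bigendian tells
// whether the most significant byte came first.
inline double decode_ieee754_double(const unsigned char buff[8], bool bigendian) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        // Visit the bytes from most significant to least significant.
        const unsigned char byte = bigendian ? buff[i] : buff[7 - i];
        bits = (bits << 8) | byte;
    }
    double d;
    std::memcpy(&d, &bits, sizeof d);   // reinterpret the pattern as a double
    return d;
}
```

The longer routine remains the option for hosts whose native floating-point format is something else.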

For the rest of the suite of functions, including the write and the integer routines, see my GitHub project:

https://github.com/MalcolmMcLean/ieee754

Moulton answered 22/5, 2017 at 22:40 Comment(0)
S
-2
void writeLittleEndianToBigEndian(void* ptrLittleEndian, void* ptrBigEndian , size_t bufLen )
{
    char *pchLittleEndian = (char*)ptrLittleEndian;

    char *pchBigEndian = (char*)ptrBigEndian;

    for ( size_t i = 0 ; i < bufLen ; i++ )    
        pchBigEndian[bufLen-1-i] = pchLittleEndian[i];
}

std::uint32_t row = 0x12345678;

char buf[4]; 

writeLittleEndianToBigEndian( &row, &buf, sizeof(row) );
Sikko answered 9/5, 2021 at 14:32 Comment(0)
A
-4

Look up bit shifting, as this is basically all you need to do to swap from little -> big endian. Then depending on the bit size, you change how you do the bit shifting.
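
As an illustration of how the shifts change with the width, here is a small sketch for 16 and 32 bits. The function names are arbitrary, and unsigned types are used so that the shifts are well defined:

```cpp
#include <cstdint>

// 16 bits: the two bytes trade places with a single shift each way.
inline std::uint16_t swap_u16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// 32 bits: outer bytes move by 24 bits, inner bytes by 8 bits.
inline std::uint32_t swap_u32(std::uint32_t v) {
    return (v << 24) | ((v << 8) & 0x00FF0000u) |
           ((v >> 8) & 0x0000FF00u) | (v >> 24);
}
```

For example, swap_u16 turns 0xAABB into 0xBBAA, and swap_u32 turns 0xAABBCCDD into 0xDDCCBBAA.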

Antigua answered 19/9, 2008 at 20:30 Comment(0)
