I need to write a function to convert a big endian integer to a little endian integer in C. I cannot use any library function. How would I do this?
Assuming what you need is a simple byte swap, try something like
Unsigned 16 bit conversion:
swapped = (num>>8) | (num<<8);
Unsigned 32-bit conversion:
swapped = ((num>>24)&0xff) | // move byte 3 to byte 0
((num<<8)&0xff0000) | // move byte 1 to byte 2
((num>>8)&0xff00) | // move byte 2 to byte 1
((num<<24)&0xff000000); // byte 0 to byte 3
This swaps the byte orders from positions 1234 to 4321. If your input was 0xdeadbeef, a 32-bit endian swap might have output of 0xefbeadde.
The code above should be cleaned up with macros or at least named constants instead of magic numbers, but hopefully it helps as is.
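As a sketch of that clean-up, the same shifts can be wrapped in small functions with named masks. The names swap16, swap32 and the BYTEn constants are made up for this illustration, and it assumes the fixed-width typedefs from <stdint.h> are acceptable, since that header declares types rather than library functions.

```c
#include <stdint.h>

/* Hypothetical mask names, one per byte position of a 32-bit value. */
#define BYTE0 0x000000FFu
#define BYTE1 0x0000FF00u
#define BYTE2 0x00FF0000u
#define BYTE3 0xFF000000u

/* Swap the two bytes of a 16-bit value. The cast drops the bits that
   integer promotion leaves above bit 15 after the left shift. */
static uint16_t swap16(uint16_t num)
{
    return (uint16_t)((num >> 8) | (num << 8));
}

/* Reverse the four bytes of a 32-bit value (positions 1234 -> 4321). */
static uint32_t swap32(uint32_t num)
{
    return ((num >> 24) & BYTE0) |   /* byte 3 to byte 0 */
           ((num >> 8)  & BYTE1) |   /* byte 2 to byte 1 */
           ((num << 8)  & BYTE2) |   /* byte 1 to byte 2 */
           ((num << 24) & BYTE3);    /* byte 0 to byte 3 */
}
```

Applying swap32 twice returns the original value, which makes for a quick sanity check.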
EDIT: As another answer pointed out, there are platform-, OS-, and instruction-set-specific alternatives which can be MUCH faster than the above. In the Linux kernel there are macros (cpu_to_be32, for example) which handle endianness pretty nicely. But these alternatives are specific to their environments. In practice, endianness is best dealt with using a blend of available approaches.
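One way such a blend might look, as a rough sketch: use a compiler builtin where one is known to exist and fall back to plain shifts elsewhere. The wrapper names bswap32 and bswap32_portable are hypothetical; __builtin_bswap32 is a GCC/Clang builtin and is only used here when those compilers are detected.

```c
#include <stdint.h>

/* Portable fallback: plain shifts and masks, no library calls. */
static uint32_t bswap32_portable(uint32_t num)
{
    return ((num >> 24) & 0x000000FFu) |
           ((num >> 8)  & 0x0000FF00u) |
           ((num << 8)  & 0x00FF0000u) |
           ((num << 24) & 0xFF000000u);
}

/* Hypothetical wrapper: prefer the compiler builtin when available. */
static uint32_t bswap32(uint32_t num)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(num);   /* GCC/Clang builtin, not a library function */
#else
    return bswap32_portable(num);
#endif
}
```

Both paths should return the same value for every input, so the fallback doubles as a reference when testing on a new platform.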
((num & 0xff) >> 8) | (num << 8), gcc 4.8.3 generates a single rol instruction. And if 32 bit conversion is written as ((num & 0xff000000) >> 24) | ((num & 0x00ff0000) >> 8) | ((num & 0x0000ff00) << 8) | (num << 24), same compiler generates a single bswap instruction. – Lockyer
struct byte_t reverse(struct byte_t b) { struct byte_t rev; rev.ba = b.bh; rev.bb = b.bg; rev.bc = b.bf; rev.bd = b.be; rev.be = b.bd; rev.bf = b.bc; rev.bg = b.bb; rev.bh = b.ba; return rev;} where this is a bitfield with 8 fields 1 bit each. But I am not sure if that's as fast as the other suggestions. For ints use the union { int i; byte_t[sizeof(int)]; } to reverse byte by byte in the integer. – Overweening
By including:
#include <byteswap.h>
you can get an optimized version of machine-dependent byte-swapping functions. Then, you can easily use the following functions:
__bswap_32 (uint32_t input)
or
__bswap_16 (uint16_t input)
#include <byteswap.h>, see comment in the .h file itself. This post contains helpful information so I up-voted despite the author ignoring the OP requirement to not use a lib function. – Banksia
#include <byteswap.h> is not part of the C standard library. – Melodie
#include <stdint.h>
//! Byte swap unsigned short
uint16_t swap_uint16( uint16_t val )
{
return (val << 8) | (val >> 8 );
}
//! Byte swap short
int16_t swap_int16( int16_t val )
{
return (val << 8) | ((val >> 8) & 0xFF);
}
//! Byte swap unsigned int
uint32_t swap_uint32( uint32_t val )
{
val = ((val << 8) & 0xFF00FF00 ) | ((val >> 8) & 0xFF00FF );
return (val << 16) | (val >> 16);
}
//! Byte swap int
int32_t swap_int32( int32_t val )
{
val = ((val << 8) & 0xFF00FF00) | ((val >> 8) & 0xFF00FF );
return (val << 16) | ((val >> 16) & 0xFFFF);
}
Update : Added 64bit byte swapping
int64_t swap_int64( int64_t val )
{
val = ((val << 8) & 0xFF00FF00FF00FF00ULL ) | ((val >> 8) & 0x00FF00FF00FF00FFULL );
val = ((val << 16) & 0xFFFF0000FFFF0000ULL ) | ((val >> 16) & 0x0000FFFF0000FFFFULL );
return (val << 32) | ((val >> 32) & 0xFFFFFFFFULL);
}
uint64_t swap_uint64( uint64_t val )
{
val = ((val << 8) & 0xFF00FF00FF00FF00ULL ) | ((val >> 8) & 0x00FF00FF00FF00FFULL );
val = ((val << 16) & 0xFFFF0000FFFF0000ULL ) | ((val >> 16) & 0x0000FFFF0000FFFFULL );
return (val << 32) | (val >> 32);
}
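On the signed variants: right-shifting a negative signed value is implementation-defined and usually copies the sign bit into the vacated high bits, which is why those versions mask after the shift; left-shifting a signed value into the sign bit is formally undefined. A sketch of one way to sidestep both issues is to do the swap on the unsigned representation and convert back at the end. The names swap_u32 and swap_i32 are hypothetical, and the final conversion back to int32_t is implementation-defined for results above INT32_MAX (two's complement wrap-around on common compilers).

```c
#include <stdint.h>

/* Byte swap on the unsigned type: every shift here is well defined. */
static uint32_t swap_u32(uint32_t val)
{
    val = ((val << 8) & 0xFF00FF00u) | ((val >> 8) & 0x00FF00FFu);
    return (val << 16) | (val >> 16);
}

/* Hypothetical signed wrapper: convert to unsigned, swap, convert back.
   No sign extension can occur, so no extra masking is needed. */
static int32_t swap_i32(int32_t val)
{
    return (int32_t)swap_u32((uint32_t)val);
}
```

The same pattern extends to 16 and 64 bits by changing the types and masks.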
int32_t and int64_t variants, what is the reasoning behind the masking of ... & 0xFFFF and ... & 0xFFFFFFFFULL? Is there something going on with sign-extension here I'm not seeing? Also, why is swap_int64 returning uint64_t? Shouldn't that be int64_t? – Colligan
swap_int64 in your answer. +1 for the helpful answer, BTW! – Colligan
LL are unnecessary in (u)swap_uint64() much like an L is not needed in (u)swap_uint32(). The U is not needed in uswap_uint64() much like the U is not needed in uswap_uint32() – Melodie
Here's a fairly generic version; I haven't compiled it, so there are probably typos, but you should get the idea:
#include <assert.h>
#include <stddef.h>
void SwapBytes(void *pv, size_t n)
{
assert(n > 0);
char *p = pv;
size_t lo, hi;
for(lo=0, hi=n-1; hi>lo; lo++, hi--)
{
char tmp=p[lo];
p[lo] = p[hi];
p[hi] = tmp;
}
}
#define SWAP(x) SwapBytes(&(x), sizeof(x))
NB: This is not optimised for speed or space. It is intended to be clear (easy to debug) and portable.
Update 2018-04-04 Added the assert() to trap the invalid case of n == 0, as spotted by commenter @chux.
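For comparison, here is a compiled sketch of the same idea that treats n == 0 as a no-op instead of asserting, using a half-open [lo, hi) range so that hi never underflows. The name swap_bytes is hypothetical, and unsigned char is used so the bytes are copied as raw values.

```c
#include <stddef.h>

/* Reverse the n bytes starting at pv, in place. n == 0 does nothing. */
static void swap_bytes(void *pv, size_t n)
{
    unsigned char *p = pv;
    size_t lo = 0;
    size_t hi = n;                  /* one past the last byte */

    while (lo < hi) {
        unsigned char tmp = p[--hi];
        p[hi] = p[lo];
        p[lo++] = tmp;
    }
}
```

A call such as swap_bytes(&value, sizeof value) reverses any object in place, at the cost of a loop the compiler may or may not turn into a single instruction.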
bswap instruction by a decent X86 compiler with optimisation enabled. This version with a parameter for the size couldn't do that. – Sudden
SwapBytes() to nicely handle the corner case of SwapBytes(pv, 0). With this code, SwapBytes(pv, 0) leads to UB. – Melodie
for(lo=0, hi=n; lo<hi; ) { char tmp=a[--hi]; a[hi] = a[lo]; a[lo++] = tmp; } to handle the zero case. – Melodie
Edit: These are library functions. Following them is the manual way to do it.
I am absolutely stunned by the number of people unaware of __byteswap_ushort, __byteswap_ulong, and __byteswap_uint64. Sure they are Visual C++ specific, but they compile down to some delicious code on x86/IA-64 architectures. :)
Here's an explicit usage of the bswap instruction, pulled from this page. Note that the intrinsic form above will always be faster than this; I only added it to give an answer without a library routine.
uint32 cq_ntohl(uint32 a) {
__asm{
mov eax, a;
bswap eax;
}
}
uint32 __fastcall(uint32 a) { __asm { mov eax, edx; bswap eax; } } – Fallacy
If you need macros (e.g. embedded system):
#define SWAP_UINT16(x) (((x) >> 8) | ((x) << 8))
#define SWAP_UINT32(x) (((x) >> 24) | (((x) & 0x00FF0000) >> 8) | (((x) & 0x0000FF00) << 8) | ((x) << 24))
UINT in their name. – Philbrook
As a joke:
#include <stdio.h>
int main (int argc, char *argv[])
{
size_t sizeofInt = sizeof (int);
int i;
union
{
int x;
char c[sizeof (int)];
} original, swapped;
original.x = 0x12345678;
for (i = 0; i < sizeofInt; i++)
swapped.c[sizeofInt - i - 1] = original.c[i];
fprintf (stderr, "%x\n", swapped.x);
return 0;
}
int i, size_t sizeofInt and not the same type for both. – Melodie
Here's a way using the SSSE3 instruction pshufb using its Intel intrinsic, assuming you have a multiple of 4 ints:
#include <tmmintrin.h> /* SSSE3 intrinsics, including _mm_shuffle_epi8 */
unsigned int *bswap(unsigned int *destination, unsigned int *source, int length) {
int i;
__m128i mask = _mm_set_epi8(12, 13, 14, 15, 8, 9, 10, 11, 4, 5, 6, 7, 0, 1, 2, 3);
for (i = 0; i < length; i += 4) {
_mm_storeu_si128((__m128i *)&destination[i],
_mm_shuffle_epi8(_mm_loadu_si128((__m128i *)&source[i]), mask));
}
return destination;
}
Will this work / be faster?
uint32_t swapped, result;
((byte*)&swapped)[0] = ((byte*)&result)[3];
((byte*)&swapped)[1] = ((byte*)&result)[2];
((byte*)&swapped)[2] = ((byte*)&result)[1];
((byte*)&swapped)[3] = ((byte*)&result)[0];
char, not byte. – Bandolier
This code snippet can convert 32bit little Endian number to Big Endian number.
#include <stdio.h>
int main(void){
unsigned int i = 0xfafbfcfd;
unsigned int j;
j= ((i&0xff000000)>>24)| ((i&0xff0000)>>8) | ((i&0xff00)<<8) | ((i&0xff)<<24);
printf("unsigned int j = %x\n", j);
return 0;
}
((i>>24)&0xff) | ((i>>8)&0xff00) | ((i&0xff00)<<8) | (i<<24); might be faster on some platforms (eg. recycling the AND mask constants). Most compilers would do this, though, but some simple compilers are not able to optimize it for you. – Prosaism
EDIT: This function only swaps the endianness of aligned 16 bit words. A function often necessary for UTF-16/UCS-2 encodings. EDIT END.
If you want to change the endianness of a memory block you can use my blazingly fast approach. Your memory array should have a size that is a multiple of 8 bytes.
#include <stddef.h>
#include <limits.h>
#include <stdint.h>
void ChangeMemEndianness(uint64_t *mem, size_t size)
{
uint64_t m1 = 0xFF00FF00FF00FF00ULL, m2 = m1 >> CHAR_BIT;
size = (size + (sizeof (uint64_t) - 1)) / sizeof (uint64_t);
for(; size; size--, mem++)
*mem = ((*mem & m1) >> CHAR_BIT) | ((*mem & m2) << CHAR_BIT);
}
This kind of function is useful for changing the endianness of Unicode UCS-2/UTF-16 files.
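To see what the two masks do, here is the per-word step of the loop pulled out into a standalone sketch. The function name swap16_lanes is hypothetical, and it assumes 8-bit bytes, as the mask constant itself does.

```c
#include <stdint.h>

/* Swap the two bytes inside each of the four 16-bit lanes of a 64-bit word. */
static uint64_t swap16_lanes(uint64_t word)
{
    const uint64_t high_bytes = 0xFF00FF00FF00FF00ULL; /* upper byte of each lane */
    const uint64_t low_bytes  = 0x00FF00FF00FF00FFULL; /* lower byte of each lane */

    return ((word & high_bytes) >> 8) | ((word & low_bytes) << 8);
}
```

For example, 0x0011223344556677 becomes 0x1100332255447766: each 16-bit pair is swapped, while the order of the pairs stays the same.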
Don't know if it's as fast as the suggestions but it works: github.com/heatblazer/helpers/blob/master/utils.h – Overweening
CHAR_BIT instead of 8 is curious as 0xFF00FF00FF00FF00ULL is dependent on CHAR_BIT == 8. Note that LL is not needed in the constant. – Melodie
CHAR_BIT to augment the exposure of that macro. As for the LL, it's more an annotation than anything else. It's also a habit I caught a long time ago with buggy compilers (pre standard) which would not do the right thing. – Fraxinella
Here's a function I have been using - tested and works on any basic data type:
// SwapBytes.h
//
// Function to perform in-place endian conversion of basic types
//
// Usage:
//
// double d;
// SwapBytes(&d, sizeof(d));
//
inline void SwapBytes(void *source, int size)
{
typedef unsigned char TwoBytes[2];
typedef unsigned char FourBytes[4];
typedef unsigned char EightBytes[8];
unsigned char temp;
if(size == 2)
{
TwoBytes *src = (TwoBytes *)source;
temp = (*src)[0];
(*src)[0] = (*src)[1];
(*src)[1] = temp;
return;
}
if(size == 4)
{
FourBytes *src = (FourBytes *)source;
temp = (*src)[0];
(*src)[0] = (*src)[3];
(*src)[3] = temp;
temp = (*src)[1];
(*src)[1] = (*src)[2];
(*src)[2] = temp;
return;
}
if(size == 8)
{
EightBytes *src = (EightBytes *)source;
temp = (*src)[0];
(*src)[0] = (*src)[7];
(*src)[7] = temp;
temp = (*src)[1];
(*src)[1] = (*src)[6];
(*src)[6] = temp;
temp = (*src)[2];
(*src)[2] = (*src)[5];
(*src)[5] = temp;
temp = (*src)[3];
(*src)[3] = (*src)[4];
(*src)[4] = temp;
return;
}
}
source is aligned as needed - yet if that assumption does not hold, the code is UB. – Melodie
Stumbled upon the "Reverse an N-bit quantity in parallel in 5 * lg(N) operations" solution.
unsigned int v; // 32-bit word to reverse bit order
// swap odd and even bits
v = ((v >> 1) & 0x55555555) | ((v & 0x55555555) << 1);
// swap consecutive pairs
v = ((v >> 2) & 0x33333333) | ((v & 0x33333333) << 2);
// swap nibbles ...
v = ((v >> 4) & 0x0F0F0F0F) | ((v & 0x0F0F0F0F) << 4);
// swap bytes
v = ((v >> 8) & 0x00FF00FF) | ((v & 0x00FF00FF) << 8);
// swap 2-byte long pairs
v = ( v >> 16 ) | ( v << 16);
The same source also provides a variation that removes the dependency on the mask constants and is also O(log(n)).
#define CHAR_BIT (8)
uint64_t reverse_bits(uint64_t v) {
uint64_t mask = ~0;
uint64_t s = sizeof(uint64_t) * CHAR_BIT;
while((s >>= 1) > 0) {
mask ^= (mask << s);
v = (((v >> s) & mask) | ((v << s) & ~mask));
}
return v;
}
Looking closely at the first snippet, it swaps odd and even bits first, then consecutive pairs, then nibbles, then bytes, then 2-byte halves. The loop version performs the same steps in reverse order, starting with the largest blocks.
Now, if we stop once whole bytes have been swapped, which is exactly an endianness change, a small change to the while loop gives a quick implementation (change the (s >>= 1) > 0 loop condition). In the same way we can also reverse pairs, nibbles, bytes, 2-byte words and 4-byte dwords.
#define CHAR_BIT (8)
typedef enum {
REVERSE_BITS = 0,
REVERSE_PAIRS = 1,
REVERSE_NIBBLES = 2,
REVERSE_BYTES = 4,
REVERSE_WORDS = 8,
REVERSE_DWORDS = 16,
} eReverseOrder;
uint64_t reverse_bit_order(uint64_t v, eReverseOrder r) {
uint64_t mask = ~0;
uint64_t s = sizeof(uint64_t) * CHAR_BIT;
while((s >>= 1) > r) {
mask ^= (mask << s);
v = (((v >> s) & mask) | ((v << s) & ~mask));
}
return v;
}
uint64_t reverse_bits(uint64_t v) {
return reverse_bit_order(v, REVERSE_BITS);
}
uint64_t reverse_pairs(uint64_t v) {
return reverse_bit_order(v, REVERSE_PAIRS);
}
uint64_t reverse_nibbles(uint64_t v) {
return reverse_bit_order(v, REVERSE_NIBBLES);
}
uint64_t reverse_bytes(uint64_t v) {
return reverse_bit_order(v, REVERSE_BYTES);
}
uint64_t reverse_words(uint64_t v) {
return reverse_bit_order(v, REVERSE_WORDS);
}
uint64_t reverse_dwords(uint64_t v) {
return reverse_bit_order(v, REVERSE_DWORDS);
}
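A compact, self-contained sketch of the same loop, useful for checking the behaviour by hand. The name reverse_blocks is hypothetical; the stop argument follows the enum values above (0 reverses bits, 4 reverses bytes, 16 swaps the two 32-bit halves), and CHAR_BIT is taken from <limits.h> rather than defined locally.

```c
#include <limits.h>
#include <stdint.h>

/* Reverse the order of equal-sized blocks inside a 64-bit value.
   The loop swaps halves of 32, 16, 8, ... bits and stops once the
   shift width is no longer greater than stop. */
static uint64_t reverse_blocks(uint64_t v, unsigned stop)
{
    uint64_t mask = ~(uint64_t)0;
    unsigned s = (unsigned)(sizeof(uint64_t) * CHAR_BIT);

    while ((s >>= 1) > stop) {
        mask ^= (mask << s);   /* 0x00000000FFFFFFFF, 0x0000FFFF0000FFFF, ... */
        v = ((v >> s) & mask) | ((v << s) & ~mask);
    }
    return v;
}
```

With stop set to 4, the value 0x0102030405060708 comes back as 0x0807060504030201, which is the 64-bit endianness swap.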
#include <limits.h> to get the correct value for your compiler. – Hippocampus
I am working on an STM32 project, and this code did NOT work for me for uint16_t data type:
swapped = (num>>8) | (num<<8);
But this is working:
#define REV32(n) ( ((n&0xff000000)>>24) | (((n&0x00ff0000)<<8)>>16) | (((n&0x0000ff00)>>8)<<16) | ((n&0x000000ff) << 24) )
#define REV16(n) ((((n) >> 8) & 0xff) | (((n) & 0xff) << 8))
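One possible explanation, offered as an assumption because the post does not show how swapped was declared or used: a uint16_t operand is promoted to int before shifting, so num << 8 keeps the original high byte in bits 16 to 23. If the result lands in anything wider than 16 bits, those extra bits survive. The sketch below shows the effect and a cast that removes it; the function names are hypothetical and int is assumed to be at least 32 bits wide, as it is on STM32.

```c
#include <stdint.h>

/* The raw expression stored in a wider type: the promoted left shift
   leaves the old high byte above bit 15. */
static uint32_t swap16_raw(uint16_t num)
{
    return (uint32_t)((num >> 8) | (num << 8));
}

/* Truncating back to 16 bits discards the stray bits. */
static uint16_t swap16(uint16_t num)
{
    return (uint16_t)((num >> 8) | (num << 8));
}
```

For num equal to 0x1234, swap16_raw yields 0x123412 while swap16 yields 0x3412. The REV16 macro avoids the problem by masking with 0xff before shifting left.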
If the system is big endian:
For 16 bit values:
unsigned short big = value;
unsigned short little = ((big & 0xFF) << 8) | (big >> 8);
For 32 bit values:
unsigned int big = value;
unsigned int little = ((big & 0xFF) << 24)
| ((big & 0xFF00) << 8)
| ((big >> 8) & 0xFF00)
| (big >> 24);
This isn't the most efficient solution unless the compiler recognizes that this is byte level manipulation and generates byte swapping code. But it doesn't depend on any memory layout tricks and can be turned into a macro pretty easily.
-O3 or at least -O2. So you should write one simple function to do the swap with inline and it automatically will do the work for you. – Carpo