How do I convert a value from host byte order to little endian?
C

7

10

I need to convert a short value from the host byte order to little endian. If the target was big endian, I could use the htons() function, but alas - it's not.

I guess I could do:

swap(htons(val))

But this could potentially cause the bytes to be swapped twice, rendering the result correct but giving me a performance penalty which is not alright in my case.

Corneille answered 9/12, 2009 at 11:40 Comment(4)
Don't worry about the performance penalty. If you do redundant things (swapping an int twice) the compiler will detect that and remove the code during its optimization phase.Thorite
What Nils said, but a more careful approach is to first check the generated code at the optimization level you can afford (if you are stuck with debug then uh-oh). If the double swap is optimized away, your performance problem is solved instantly.Arsphenamine
Will the compiler really manage to optimize this? I guess if swap() and htons() are macros or inline functions, it will, but otherwise?Corneille
It depends. Sometimes compilers can inline automatically (or indirectly through link time code generation), sometimes one must hint using (forced) inline or use cross-module compilation... Exact advice is difficult without trying first.Arsphenamine
D
4

Something like the following:

unsigned short swaps( unsigned short val)
{
    return ((val & 0xff) << 8) | ((val & 0xff00) >> 8);
}

/* host to little endian */

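/* example configuration: normally the build system sets one of these for the target */
/* (htons() in the fallback branch comes from <arpa/inet.h> on POSIX systems)        */
#define PLATFORM_IS_BIG_ENDIAN 1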
#if PLATFORM_IS_LITTLE_ENDIAN
unsigned short htoles( unsigned short val)
{
    /* no-op on a little endian platform */
    return val;
}
#elif PLATFORM_IS_BIG_ENDIAN
unsigned short htoles( unsigned short val)
{
    /* need to swap bytes on a big endian platform */
    return swaps( val);
}
#else
unsigned short htoles( unsigned short val)
{
    /* the platform hasn't been properly configured for the */
    /* preprocessor to know if it's little or big endian    */

    /* use potentially less-performant, but always works option */

    return swaps( htons(val));
}
#endif

If you have a system that's properly configured (such that the preprocessor knows whether the target is little or big endian) you get an 'optimized' version of htoles(). Otherwise you get the potentially non-optimized version that depends on htons(). In any case, you get something that works.

Nothing too tricky and more or less portable.

Of course, you can further improve the optimization possibilities by implementing this with inline or as macros as you see fit.

You might want to look at something like the "Portable Open Source Harness (POSH)" for an actual implementation that defines the endianness for various compilers. Note, getting to the library requires going through a pseudo-authentication page (though you don't need to register or give any personal details): http://hookatooka.com/poshlib/

Detrain answered 9/12, 2009 at 16:46 Comment(1)
Just what I was looking for. I'm in a Linux/gcc environment, so by including <endian.h>, __BYTE_ORDER is either defined as __LITTLE_ENDIAN or __BIG_ENDIAN (or __PDP_ENDIAN)Corneille
V
9

Here is an article about endianness and how to determine it from IBM:

Writing endian-independent code in C: Don't let endianness "byte" you

It includes an example of how to determine endianness at run time (which you would only need to do once):

#include <stdio.h>

const int i = 1;
#define is_bigendian() ( (*(char*)&i) == 0 )

int main(void) {
    int val;
    unsigned char *ptr;
    ptr = (unsigned char*) &val;   /* view val one byte at a time */
    val = 0x12345678;
    /* print the bytes of val from most to least significant */
    if (is_bigendian()) {
        printf("%X.%X.%X.%X\n", ptr[0], ptr[1], ptr[2], ptr[3]);
    } else {
        printf("%X.%X.%X.%X\n", ptr[3], ptr[2], ptr[1], ptr[0]);
    }
    return 0;
}

The page also has a section on methods for reversing byte order:

short reverseShort (short s) {
    unsigned char c1, c2;

    if (is_bigendian()) {
        return s;
    } else {
        c1 = s & 255;
        c2 = (s >> 8) & 255;

        return (c1 << 8) + c2;
    }
}


short reverseShort (char *c) {
    short s;
    char *p = (char *)&s;

    if (is_bigendian()) {
        p[0] = c[0];
        p[1] = c[1];
    } else {
        p[0] = c[1];
        p[1] = c[0];
    }

    return s;
}
Voracity answered 9/12, 2009 at 11:57 Comment(5)
Is checking condition any better than doing extra swap?Juryman
The point of this is to 1. make the code portable, and 2. only do the swap on platforms that aren't already little endian. In the case where the system is already little endian you end up with one test and jump vs. 4 byte swaps.Voracity
One other thing: since the conditional #define is_bigendian() ( (*(char*)&i) == 0 ) never changes, I'm guessing the branch predictor on the CPU will probably eliminate it, resulting in this effectively becoming a no-op when the system is already little endian.Voracity
The nice idea about this technique is that it will work on systems that don't have the ntohs() functions. Also, swap generally only works with two parameters. This method allows for expansion into integer widths that have more than 2 octets.Hackney
In the first implementation of reverseShort(), c1 should be cast to short in the second return statement prior to shifting the bits. Otherwise, they'll just end up going to the big bit bucket in the sky.Hitt
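Following up on the comment about widths beyond two octets, here is a minimal sketch of the same run-time technique extended to 32 bits. The names host_is_big_endian, swap32 and host_to_le32 are made up for illustration and do not come from the IBM article.

```c
#include <stdint.h>

/* Run-time check: inspect the first byte in memory of a known value. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 0;
}

/* Reverse the four bytes of a 32-bit value using masks and shifts. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* Host order to little endian: swap only when the host is big endian. */
static uint32_t host_to_le32(uint32_t v)
{
    return host_is_big_endian() ? swap32(v) : v;
}
```

Working on unsigned fixed-width types keeps the shifts well defined, which sidesteps the promotion question raised about reverseShort().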
J
6

Then you should know your endianness and call htons() conditionally. Actually, not even htons, but just swap bytes conditionally. Compile-time, of course.

Juryman answered 9/12, 2009 at 11:46 Comment(1)
+1. If you're sweating over the performance, #ifdef is your friend here.Retired
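A rough sketch of what the compile-time approach could look like. It assumes GCC or Clang, which predefine __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__; other compilers would need their own test or a define from the build configuration. The function name host_to_le16 is hypothetical.

```c
#include <stdint.h>

/* Host order to little endian, decided entirely by the preprocessor. */
static inline uint16_t host_to_le16(uint16_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    /* Host is already little endian: nothing to do. */
    return v;
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    /* Host is big endian: exchange the two bytes. */
    return (uint16_t)((v << 8) | (v >> 8));
#else
#error "byte order not known at compile time; define it in your build configuration"
#endif
}
```

On a little-endian target the function body is a plain return, so an optimizing compiler should be able to inline it away completely.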
C
0

This trick should work: at startup, use ntohs with a dummy value and then compare the resulting value to the original value. If both values are the same, then the machine uses big endian, otherwise it is little endian.

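Then, use a ToLittleEndian method that either does nothing or swaps the bytes, depending on the result of the initial test. (Calling ntohs for the swap would not work here: on a big-endian machine, the only case that needs a swap, ntohs leaves the value unchanged.)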

(Edited with the information provided in comments)

Concerned answered 9/12, 2009 at 11:46 Comment(3)
Did you notice that OP is concerned about performance penalty? ;-)Juryman
the OP only needs to do this check once, at startup.Karren
flyfishr64, and check the results each time. Never postpone till the runtime what you can do at compile time.Juryman
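For illustration, the startup check described in this answer might be sketched as follows. It assumes a POSIX system that provides ntohs() in <arpa/inet.h>; init_endian, to_little_endian16, keep16 and swap16 are hypothetical names.

```c
#include <arpa/inet.h>  /* ntohs() on POSIX systems */
#include <stddef.h>
#include <stdint.h>

static uint16_t keep16(uint16_t v) { return v; }
static uint16_t swap16(uint16_t v) { return (uint16_t)((v << 8) | (v >> 8)); }

/* Conversion routine, chosen once by init_endian(). */
static uint16_t (*to_little_endian16)(uint16_t) = NULL;

/* Call once at startup, before the first conversion. */
static void init_endian(void)
{
    const uint16_t dummy = 0x0102;
    /* ntohs() returns its argument unchanged only on a big-endian host,
       and that is exactly the host that has to swap to get little endian. */
    to_little_endian16 = (ntohs(dummy) == dummy) ? swap16 : keep16;
}
```

Every conversion then goes through a function pointer, which is the run-time cost the comments above object to; a compile-time test avoids it.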
A
0

My rule-of-thumb performance guess is that it depends on whether you are little-endian-ising a big block of data in one go, or just one value:

If just one value, then the function call overhead is probably going to swamp the overhead of unnecessary byte-swaps, and that's even if the compiler doesn't optimise away the unnecessary byte swaps. Then you're maybe going to write the value as the port number of a socket connection, and try to open or bind a socket, which takes an age compared with any sort of bit-manipulation. So just don't worry about it.

If a large block, then you might worry the compiler won't handle it. So do something like this:

if (!is_little_endian()) {
    for (int i = 0; i < size; ++i) {
        vals[i] = swap_short(vals[i]);
    }
}

Or look into SIMD instructions on your architecture which can do it considerably faster.

Write is_little_endian() using whatever trick you like. I think the one Robert S. Barnes provides is sound, but since you usually know for a given target whether it's going to be big- or little-endian, maybe you should have a platform-specific header file, that defines it to be a macro evaluating either to 1 or 0.

As always, if you really care about performance, then look at the generated assembly to see whether pointless code has been removed or not, and time the various alternatives against each other to see what actually goes fastest.
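To make the block case concrete, here is one way the pieces could fit together. is_little_endian() uses a run-time byte probe purely as an example, and to_little_endian_block is a made-up name; a platform header could replace the probe with a constant macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Run-time probe: on a little-endian host the low byte is stored first. */
static int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

static uint16_t swap_short(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a whole buffer in place. The endianness test runs once per
   block rather than once per element. */
static void to_little_endian_block(uint16_t *vals, size_t size)
{
    if (!is_little_endian()) {
        for (size_t i = 0; i < size; ++i) {
            vals[i] = swap_short(vals[i]);
        }
    }
}
```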

Abeyta answered 9/12, 2009 at 13:38 Comment(0)
D
0

Unfortunately, there's not really a cross-platform way to determine a system's byte order at compile-time with standard C. I suggest adding a #define to your config.h (or whatever else you or your build system uses for build configuration).

A unit test to check for the correct definition of LITTLE_ENDIAN or BIG_ENDIAN could look like this:

#include <assert.h>
#include <limits.h>
#include <stdint.h>

void check_bits_per_byte(void)
{ assert(CHAR_BIT == 8); }

void check_sizeof_uint32(void)
{ assert(sizeof (uint32_t) == 4); }

void check_byte_order(void)
{
    static const union { unsigned char bytes[4]; uint32_t value; } byte_order =
        { { 1, 2, 3, 4 } };

    static const uint32_t little_endian = 0x04030201ul;
    static const uint32_t big_endian = 0x01020304ul;

    #ifdef LITTLE_ENDIAN
    assert(byte_order.value == little_endian);
    #endif

    #ifdef BIG_ENDIAN
    assert(byte_order.value == big_endian);
    #endif

    #if !defined LITTLE_ENDIAN && !defined BIG_ENDIAN
    assert(!"byte order unknown or unsupported");
    #endif
}

int main(void)
{
    check_bits_per_byte();
    check_sizeof_uint32();
    check_byte_order();
}
Doreathadoreen answered 9/12, 2009 at 19:29 Comment(0)
D
0

On many Linux systems, there is an <endian.h> or <sys/endian.h> header with conversion functions. See the man page for endian(3).
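A small sketch of how that might be used, assuming glibc, where htole16() is declared in <endian.h> once _DEFAULT_SOURCE is in effect (the BSDs use <sys/endian.h> instead). put_le16 is a hypothetical helper, not part of the header.

```c
#define _DEFAULT_SOURCE   /* glibc: make htole16() and friends visible */
#include <endian.h>       /* on the BSDs: <sys/endian.h> */
#include <stdint.h>
#include <string.h>

/* Write a 16-bit host value into a byte buffer in little-endian order. */
static void put_le16(unsigned char *out, uint16_t host_value)
{
    uint16_t le = htole16(host_value);  /* no-op on little-endian hosts */
    memcpy(out, &le, sizeof le);
}
```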

Deth answered 1/12, 2018 at 0:47 Comment(0)
