Deserialize a byte array to a struct
M

7

7

I get a transmission over the network that's an array of chars/bytes. It contains a header and some data. I'd like to map the header onto a struct. Here's an example:

#include <boost/array.hpp>
#include <cstddef>  // std::size_t
#include <cstdlib>  // system
#include <cstring>  // std::memcpy

#pragma pack(1)

struct Header
{
    unsigned short bodyLength;
    int msgID;
    unsigned short someOtherValue;
    unsigned short protocolVersion;
};

int main()
{
    boost::array<char, 128> msgBuffer;
    Header header;

    for(std::size_t x = 0; x < sizeof(Header); x++)
        msgBuffer[x] = 0x01; // assign some values

    std::memcpy(&header, msgBuffer.data(), sizeof(Header));

    system("PAUSE");    

    return 0;
}

Will this always work assuming the structure never contains any variable length fields? Is there a platform independent / idiomatic way of doing this?

Note:

I have seen quite a few libraries on the internet that let you serialize/deserialize, but I get the impression that they can only deserialize something if it has been previously serialized with the same library. I have no control over the format of the transmission. I'm definitely going to get a byte/char array where all the values simply follow one another.

Muth answered 6/2, 2009 at 11:15 Comment(0)
D
6

Some processors require that certain types are properly aligned. On those processors, an unaligned access to a field of a packed struct can raise a hardware trap instead of being honored.

And even on common x86, packed structures can cause the code to run more slowly.

You will also have to take care when working across platforms with different endianness.

By the way, if you want a simple and platform-independent communication mechanism with bindings to many programming languages, then have a look at YAMI.

Dakota answered 6/2, 2009 at 11:30 Comment(0)
V
6

Just plain copying is very likely to break, at least if the data can come from a different architecture (or even just a different compiler) than the one you are on. This is for reasons of:

Endianness
Structure packing

That second link is GCC-specific, but this applies to all compilers.

I recommend reading the fields byte-by-byte, and assembling larger fields (ints, etc.) from those bytes. This gives you control of endianness and padding.
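A minimal sketch of that byte-by-byte approach for the header in the question. It assumes the sender writes multi-byte fields most significant byte first (network byte order), which the question does not state; the helper names readU16BE, readU32BE and parseHeader are made up for illustration, and fixed-width types stand in for short/int so the sizes do not depend on the compiler.

```cpp
#include <cstddef>
#include <cstdint>

// Same fields as the Header in the question, with fixed-width types.
struct Header {
    std::uint16_t bodyLength;
    std::int32_t  msgID;
    std::uint16_t someOtherValue;
    std::uint16_t protocolVersion;
};

// Bytes the header occupies on the wire: 2 + 4 + 2 + 2.
constexpr std::size_t kWireHeaderSize = 10;

// Assumption: fields arrive most significant byte first.
inline std::uint16_t readU16BE(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

inline std::uint32_t readU32BE(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// The caller must guarantee that kWireHeaderSize bytes are readable at buf.
inline Header parseHeader(const unsigned char* buf) {
    Header h;
    h.bodyLength      = readU16BE(buf);      // wire offset 0
    h.msgID           = static_cast<std::int32_t>(readU32BE(buf + 2));
    h.someOtherValue  = readU16BE(buf + 6);  // wire offset 6
    h.protocolVersion = readU16BE(buf + 8);  // wire offset 8
    return h;
}
```

With the char buffer from the question, the call would look like parseHeader(reinterpret_cast<const unsigned char*>(msgBuffer.data())). Because the offsets are spelled out, neither the struct's in-memory padding nor the host's byte order affects the result.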

Vonvona answered 6/2, 2009 at 11:35 Comment(0)
P
2

The #pragma pack(1) directive should work on most compilers, but you can check by working out how big your data structure should be (10 in your case, if my maths is correct) and using printf("%zu", sizeof(Header)); to check that the packing is being done.

As others have said, you still need to be wary of endianness if you're going between architectures.
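The size check can also be done at compile time instead of with printf. A small sketch, assuming a C++11 compiler that understands #pragma pack(push/pop) (MSVC, GCC and Clang do; it is not standard C++). PackedHeader is a name chosen for this example.

```cpp
#include <cstdint>

#pragma pack(push, 1)   // compiler extension: no padding between members
struct PackedHeader {
    std::uint16_t bodyLength;
    std::int32_t  msgID;
    std::uint16_t someOtherValue;
    std::uint16_t protocolVersion;
};
#pragma pack(pop)       // restore the previous packing for later declarations

// 2 + 4 + 2 + 2 bytes; the build fails here if the compiler added padding.
static_assert(sizeof(PackedHeader) == 10, "PackedHeader is not packed as expected");
```

Using push/pop rather than a bare pack(1) keeps the packing from leaking into every struct declared afterwards.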

Playoff answered 6/2, 2009 at 11:35 Comment(0)
M
1

I strongly disagree with the idea of reading byte by byte. If you take care of the structure packing in the struct declaration, you can copy into the struct without a problem. As for the endianness problem, reading byte by byte does solve it, but it does not give you a generic solution. That method is very lame. I have done something like this before for a similar job and it worked all right without a glitch.

Think about this. I have a structure, and I also have a corresponding definition of that structure. You may construct this by hand, but I wrote a parser for it and have used it for other things as well.

For example, the definition of the structure you gave above is "s i s s" (s = short, i = int). I then pass the struct address, this definition, and the struct's packing option to a special function that deals with the endianness, and voila, it is done.

SwitchEndianToBig(&header, "s i s s", 4); // 4 = structure packing option
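The answer does not show how SwitchEndianToBig is implemented, so the following is only a guess at what a layout-driven swap could look like. swapFieldBytes is a hypothetical helper, not the answer's function: it assumes 's' means a 2-byte field and 'i' a 4-byte field, aligns each field to min(field size, pack) as #pragma pack(N) would, and reverses the bytes of every field unconditionally. The caller would only invoke it when host and wire byte order differ.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper, not the implementation from the answer.
// layout: 's' = 2-byte field, 'i' = 4-byte field; spaces are ignored.
// pack:   struct packing in bytes (the N of #pragma pack(N)), must be >= 1.
// Returns the offset just past the last field, or 0 on a bad argument
// (fields before an unknown layout code have already been swapped by then).
inline std::size_t swapFieldBytes(void* data, const char* layout, std::size_t pack) {
    if (pack == 0) return 0;
    unsigned char* bytes = static_cast<unsigned char*>(data);
    std::size_t offset = 0;
    for (const char* c = layout; *c != '\0'; ++c) {
        std::size_t size = 0;
        switch (*c) {
            case ' ': continue;          // separator, move to the next code
            case 's': size = 2; break;
            case 'i': size = 4; break;
            default:  return 0;          // unknown field code
        }
        // Round the offset up to the field's alignment under this packing.
        const std::size_t align = std::min(size, pack);
        offset = (offset + align - 1) / align * align;
        // Reverse the field's bytes in place.
        std::reverse(bytes + offset, bytes + offset + size);
        offset += size;
    }
    return offset;
}
```

With packing 1 the layout "s i s s" covers 10 bytes; with packing 4 the int moves to offset 4 and the fields end at offset 12. The layout string has to be kept in sync with the struct by hand (or generated), which is the price of this approach.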

Mojave answered 6/2, 2009 at 12:27 Comment(0)
M
0

I know who I'm communicating with, so I don't really have to worry about endianness. But I like to stay away from compiler specific commands anyway.

So how about this:

#include <boost/array.hpp>
#include <cstdlib> // system

const int kHeaderSizeInBytes = 6;

struct Header
{
    unsigned short bodyLength;
    unsigned short msgID;
    unsigned short protocolVersion; 

    unsigned short convertUnsignedShort(char inputArray[sizeof(unsigned short)])
        {return (((unsigned char) (inputArray[0])) << 8) + (unsigned char)(inputArray[1]);}

    void operator<<(char inputArray[kHeaderSizeInBytes])
    {
        bodyLength = convertUnsignedShort(inputArray);
        msgID = convertUnsignedShort(inputArray + sizeof(bodyLength));
        protocolVersion = convertUnsignedShort(inputArray + sizeof(bodyLength) + sizeof(msgID));
    }
};

int main()
{
    boost::array<char, 128> msgBuffer;
    Header header;

    for(int x = 0; x < kHeaderSizeInBytes; x++)
        msgBuffer[x] = x;

    header << msgBuffer.data();

    system("PAUSE");    

    return 0;
}

Gets rid of the pragma, but it isn't as general purpose as I'd like. Every time you add a field to the header you have to modify the << function. Can you iterate over struct fields somehow, get the type of the field and call the corresponding function?

Muth answered 6/2, 2009 at 12:16 Comment(1)
About iterating over struct fields: Do you have to use a struct? I'm asking because replacing it by a tuple would allow iteration of the fields. - Maryjanemaryjo
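To illustrate the tuple idea from the comment, here is a rough C++11 sketch. It assumes all fields are unsigned integers sent most significant byte first; readField, readAll and HeaderTuple are names invented for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>

// Reads one unsigned integer, most significant byte first (an assumption
// about the wire format), and advances the cursor past it.
template <typename T>
void readField(const unsigned char*& cursor, T& out) {
    static_assert(std::is_unsigned<T>::value, "sketch handles unsigned fields only");
    T value = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i)
        value = static_cast<T>((value << 8) | *cursor++);
    out = value;
}

// End of recursion: every tuple element has been filled.
template <std::size_t I = 0, typename... Fields>
typename std::enable_if<I == sizeof...(Fields)>::type
readAll(const unsigned char*, std::tuple<Fields...>&) {}

// Fills element I, then recurses to the next one, in declaration order.
template <std::size_t I = 0, typename... Fields>
typename std::enable_if<(I < sizeof...(Fields))>::type
readAll(const unsigned char* cursor, std::tuple<Fields...>& fields) {
    readField(cursor, std::get<I>(fields));
    readAll<I + 1>(cursor, fields);
}

// Hypothetical header: bodyLength, msgID, protocolVersion.
using HeaderTuple = std::tuple<std::uint16_t, std::uint16_t, std::uint16_t>;
```

Adding a field then only means changing the tuple type; readAll picks up the new element without modification. The trade-off is that fields are accessed by index (std::get<1>(header)) rather than by name.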
S
0

Tell me if I'm wrong, but AFAIK doing it this way guarantees that the data is correct, assuming the types have the same size on all your platforms:

#include <algorithm> // std::copy
#include <array>
#include <cstdlib>   // system

//#pragma pack(1) // not needed

struct Header
{
    unsigned short bodyLength;
    int msgID;
    unsigned short someOtherValue;
    unsigned short protocolVersion;
    float testFloat;

    Header() : bodyLength(42), msgID(34), someOtherValue(66), protocolVersion(69), testFloat( 3.14f ) {}
};

int main()
{
    std::array<char, 128> msgBuffer; // std::array is what <array> provides since C++11
    Header header;

    const char* rawData = reinterpret_cast< const char* >( &header );

    std::copy( rawData, rawData + sizeof(Header), msgBuffer.data()); // assuming msgBuffer is always big enough

    system("PAUSE");    

    return 0;
}

If the types differ between your target platforms, you have to use aliases (typedef) for each type to make sure every type has the same size everywhere.

Seafood answered 6/2, 2009 at 18:7 Comment(1)
Uhm, you're going the other way around (converting a struct to a byte array). And even that doesn't quite work as it should because you're copying the padding of the struct into the array. - Muth
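For the struct-to-bytes direction, the padding problem raised in this comment goes away if each field is written out individually. A possible sketch, assuming a most-significant-byte-first wire format; writeU16BE, writeU32BE and serializeHeader are illustrative names, and the float member from the answer is left out to keep it short.

```cpp
#include <cstddef>
#include <cstdint>

struct Header {
    std::uint16_t bodyLength;
    std::int32_t  msgID;
    std::uint16_t someOtherValue;
    std::uint16_t protocolVersion;
};

// Assumption: the wire format is most significant byte first.
// Each writer returns the position just past the bytes it wrote.
inline unsigned char* writeU16BE(unsigned char* out, std::uint16_t v) {
    *out++ = static_cast<unsigned char>(v >> 8);
    *out++ = static_cast<unsigned char>(v & 0xFF);
    return out;
}

inline unsigned char* writeU32BE(unsigned char* out, std::uint32_t v) {
    out = writeU16BE(out, static_cast<std::uint16_t>(v >> 16));
    return writeU16BE(out, static_cast<std::uint16_t>(v & 0xFFFF));
}

// Writes exactly 10 bytes (2 + 4 + 2 + 2) and returns the byte count;
// any padding inside Header is never copied.
inline std::size_t serializeHeader(const Header& h, unsigned char* out) {
    unsigned char* p = out;
    p = writeU16BE(p, h.bodyLength);
    p = writeU32BE(p, static_cast<std::uint32_t>(h.msgID));
    p = writeU16BE(p, h.someOtherValue);
    p = writeU16BE(p, h.protocolVersion);
    return static_cast<std::size_t>(p - out);
}
```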
L
0

My version, based on @drby's example:

#include <array>
#include <cstdint> // uint16_t, uint8_t
#include <cstdio>  // printf
const int kHeaderSizeInBytes = 6;

struct Header
{
    uint16_t bodyLength;
    uint16_t msgID;
    uint16_t protocolVersion; 

    uint16_t convertUnsignedShort(char inputArray[sizeof(uint16_t)])
        {return (((uint8_t) (inputArray[0])) << 8) + (uint8_t)(inputArray[1]);}

    void operator<<(char inputArray[kHeaderSizeInBytes])
    {
        bodyLength = convertUnsignedShort(inputArray);
        msgID = convertUnsignedShort(inputArray + sizeof(bodyLength));
        protocolVersion = convertUnsignedShort(inputArray + sizeof(bodyLength) + sizeof(msgID));
    }
};

int main()
{
    //Prepare array
    std::array<char, kHeaderSizeInBytes> msgBuffer;
    msgBuffer[0]= 0x00;
    msgBuffer[1]= 0x11;
    msgBuffer[2]= 0x22;
    msgBuffer[3]= 0x33;
    msgBuffer[4]= 0x44;
    msgBuffer[5]= 0x55;
    //Array to struct
    Header header;
    header << msgBuffer.data(); 
    
    printf("header fields 0x%04X, 0x%04X, 0x%04X", header.bodyLength, header.msgID, header.protocolVersion);

    return 0;
}

out: header fields 0x0011, 0x2233, 0x4455

Leopoldine answered 20/5 at 6:23 Comment(0)
