Boost Serialization Binary Archive giving incorrect output

I am trying to serialize a class.

Class definition:

class StartPeerSessionRequest {
public:
    StartPeerSessionRequest();
    virtual ~StartPeerSessionRequest();
    void composeRequestwithHardCodeValues();
    void save();
    stringstream serializedRequest;
    /*boost::serialization::binary_object serlreq;*/

private:
    StartPeerSessionRequest(const StartPeerSessionRequest &);

    uint16_t mProtocolVersion;
    uint16_t mSessionFlags;
    uint16_t mMaxResponseLength;
    string   mMake;
    string   mModel;
    string   mSerialNumber;
    uint8_t  mTrackDelay;
    string   mHeadUnitModel;
    string   mCarModelYear;
    string   mVin;
    uint16_t mVehicleMileage;
    uint8_t  mShoutFormat;
    uint8_t  mNotificationInterval;

    friend class boost::serialization::access;
    template <typename Archive> void serialize(Archive &ar, const unsigned int version);
};

StartPeerSessionRequest::StartPeerSessionRequest() {

    mProtocolVersion      = 1 * 10000 + 14 * 100 + 4;
    mSessionFlags         = 1;
    mMaxResponseLength    = 0;
    mMake                 = "MyMake";
    mModel                = "MyModel";
    mSerialNumber         = "10000";
    mTrackDelay           = 0;
    mHeadUnitModel        = "Headunit";
    mCarModelYear         = "2014";
    mVin                  = "1234567980";
    mVehicleMileage       = 1000;
    mShoutFormat          = 3;
    mNotificationInterval = 1;
}

template <class Archive> void StartPeerSessionRequest::serialize(Archive &ar, const unsigned int version) {
    ar & mProtocolVersion;
    ar & mSessionFlags;
    ar & mMaxResponseLength;
    ar & mMake;
    ar & mModel;
    ar & mSerialNumber;
    ar & mTrackDelay;
    ar & mHeadUnitModel;
    ar & mCarModelYear;
    ar & mVin;
    ar & mVehicleMileage;
    ar & mShoutFormat;
    ar & mNotificationInterval;
}

void StartPeerSessionRequest::save() {
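    // serlreq is not declared in the class as posted (its declaration is commented out),
    // so a local stream stands in for it here.
    stringstream serlreq;
    boost::archive::binary_oarchive oa(serlreq, boost::archive::no_header);
    oa << (*this);
    /*cout<<"\n binary_oarchive :"<<serlreq.str().size();*/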

    boost::archive::text_oarchive ota(serializedRequest, boost::archive::no_header);
    ota << (*this);
    cout << "\n text_oarchive :" << serializedRequest.str() << "size :" << serializedRequest.str().size();
}

serializedRequest.str().size() gives me a length of 87.

It should be 65 bytes. (I counted; you can work it out from the constructor.)

I suspect it is adding lengths in between the fields.

I have also tried text_oarchive; it doesn't work either.

What I need is to serialize the class members plainly, exactly as they are.

I guess I need to use some traits or wrappers.

Please let me know.

Thanks

Anemograph answered 24/10, 2014 at 20:57 Comment(8)
what is your question? - Bigamous
How do you define "incorrect" here? You seem to have a very specific idea of how the "correct" format should work--what is it? Also, there's no way to characterize the number of bytes even given a spec, because you're not showing any types in your question. - Eloiseelon
Added the class definition. I calculated the bytes from the data types and strlen. - Anemograph
Actually, on my 64-bit box with Boost 1_56, the size is 107 instead of 87. - Nolie
If possible, please try to print out a proper binary format on your machine using the binary archive. Thanks. - Anemograph
what, @MarshelAbraham, what?! I did. You basically don't know what you're doing and this is what you tell me? Also: here's my calculations: pastebin. Now, you tell me how you arrived at 65 bytes? I can see how you'd expect 57, 63, or 75 bytes. But 65? - Nolie
OK, sorry for the trouble. I have moved on to using a simple stringstream for serialization. Thanks for the help. - Anemograph
@MarshelAbraham I've just added a proof of concept that uses Boost Spirit's binary formatters and parsers. I made it reflect those minimum sizes (57 and up) with live demos. Hope that helps. - Nolie

Okay, so, just to see how I'd do, I've tried to reach the optimum sizes I calculated on the back of my napkin:

I can see how you'd expect 57, 63, or 75 bytes

mProtocolVersion      = 1*10000+14*100+4; // 2 bytes
mSessionFlags         = 1;                // 2 bytes
mMaxResponseLength    = 0;                // 2 bytes
mMake                 = "MyMake";         // 6 bytes + length
mModel                = "MyModel";        // 7 bytes + length
mSerialNumber         = "10000";          // 5 bytes + length
mTrackDelay           = 0;                // 1 byte
mHeadUnitModel        = "Headunit";       // 8 bytes + length
mCarModelYear         = "2014";           // 4 bytes + length
mVin                  = "1234567980";     // 10 bytes + length
mVehicleMileage       = 1000;             // 2 bytes
mShoutFormat          = 3;                // 1 byte
mNotificationInterval = 1;                // 1 byte
// -------------------------------------- // 51 bytes + 6 x length
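The totals quoted above (57, 63, 75 and 99 bytes) follow from that tally: 51 payload bytes plus six length prefixes. A minimal sketch of the arithmetic, assuming every string carries one length prefix of a fixed width and integers are written raw; `expected_size` is a hypothetical helper name, not part of the answer's code:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: wire size when each string is preceded by a length
// field of `length_bytes` bytes and each integer is written without padding.
std::size_t expected_size(std::size_t length_bytes) {
    // Values copied from the constructor in the question.
    const std::string strings[] = {"MyMake", "MyModel", "10000",
                                   "Headunit", "2014", "1234567980"};

    // Fixed-width members: four uint16_t fields and three uint8_t fields.
    std::size_t total = 4 * sizeof(std::uint16_t) + 3 * sizeof(std::uint8_t);

    // Each string contributes its characters plus one length prefix.
    for (const std::string& s : strings) {
        total += length_bytes + s.size();
    }
    return total;
}
```

Calling `expected_size` with 1, 2, 4 and 8 reproduces the four totals; no prefix width in that set yields 65.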

In this instance, I created binary serialization code using Boost Spirit (Karma for serialization and Qi for deserialization). I made the size of the length field configurable (8-, 16-, 32- or 64-bit unsigned).

Here's a working proof of concept: Live On Coliru

generate()

The const generate member function delegates the work to helper functions in a separate namespace:

template <typename Container>
bool generate(Container& bytes) const {
    auto out = std::back_inserter(bytes);

    using my_serialization_helpers::do_generate;
    return do_generate(out, mProtocolVersion)
        && do_generate(out, mSessionFlags)
        && do_generate(out, mMaxResponseLength)
        && do_generate(out, mMake)
        && do_generate(out, mModel)
        && do_generate(out, mSerialNumber)
        && do_generate(out, mTrackDelay)
        && do_generate(out, mHeadUnitModel)
        && do_generate(out, mCarModelYear)
        && do_generate(out, mVin)
        && do_generate(out, mVehicleMileage)
        && do_generate(out, mShoutFormat)
        && do_generate(out, mNotificationInterval);
}

Note that

  • do_generate overloads can be freely added as required for future types
  • the container can easily be switched from e.g. std::vector<unsigned char>, to e.g. boost::interprocess::containers::string<char, char_traits<char>, boost::interprocess::allocator<char, boost::interprocess::managed_shared_memory::segment_manager> >.
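To give an idea of what such overloads might look like, here is a rough sketch that does not use Karma at all. Everything in it is an assumption made for illustration: the `sketch` namespace, the `length_type` alias, the little-endian byte order and the 8-bit length prefix. The real helpers live in the linked Coliru demo and are built on Spirit generators.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <string>
#include <type_traits>
#include <vector>

namespace sketch {  // hypothetical; not the answer's my_serialization_helpers

using length_type = std::uint8_t;  // assumed width of the string length prefix

// Unsigned integers: emit sizeof(T) bytes, least significant byte first.
template <typename Out, typename T>
typename std::enable_if<std::is_unsigned<T>::value, bool>::type
do_generate(Out& out, T value) {
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        *out++ = static_cast<unsigned char>(value >> (8 * i));
    }
    return true;
}

// Strings: emit a length prefix, then the raw characters.
// Fails (without writing anything) if the length does not fit the prefix.
template <typename Out>
bool do_generate(Out& out, const std::string& s) {
    if (s.size() > std::numeric_limits<length_type>::max()) {
        return false;
    }
    do_generate(out, static_cast<length_type>(s.size()));
    for (unsigned char c : s) {
        *out++ = c;
    }
    return true;
}

}  // namespace sketch
```

Because the helpers only require an output iterator, a `std::back_inserter` over a `std::vector<unsigned char>` or over any other byte container works the same way.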

parse()

The parse method is very similar except it delegates to do_parse overloads to do the work.
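For symmetry, a hand-rolled sketch of what `do_parse` overloads could look like. As before, this is not the Qi-based code from the demo; the `sketch` namespace, `length_type`, the little-endian byte order and the 8-bit length prefix are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical; not the answer's my_serialization_helpers

using length_type = std::uint8_t;  // assumed width of the string length prefix

// Unsigned integers: read sizeof(T) bytes, least significant byte first.
// Returns false if the input ends early; `first` may have advanced by then.
template <typename It, typename T>
typename std::enable_if<std::is_unsigned<T>::value, bool>::type
do_parse(It& first, It last, T& value) {
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (first == last) {
            return false;
        }
        const T byte = static_cast<unsigned char>(*first++);
        result = static_cast<T>(result | (byte << (8 * i)));
    }
    value = result;
    return true;
}

// Strings: read the length prefix, then that many characters.
template <typename It>
bool do_parse(It& first, It last, std::string& s) {
    length_type n = 0;
    if (!do_parse(first, last, n)) {
        return false;
    }
    std::string result;
    for (std::size_t i = 0; i < n; ++i) {
        if (first == last) {
            return false;
        }
        result.push_back(static_cast<char>(*first++));
    }
    s = std::move(result);
    return true;
}

}  // namespace sketch
```

A `parse()` member built on these would chain the calls with `&&` in the same field order as `generate()`, so the first short read aborts the whole parse.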

Testing

The test program roundtrips with all possible configurations:

  • 8-bit length field, net 57 bytes, with boost serialization: 70
  • 16-bit length field, net 63 bytes, with boost serialization: 76
  • 32-bit length field, net 75 bytes, with boost serialization: 88
  • 64-bit length field, net 99 bytes, with boost serialization: 112

As you can see it's not even that outrageous that the natural Boost Serialization solution would take 107 bytes on my system (it's only 8 bytes more than my last configuration).

Note also, that since the Karma generators all take any output iterator, it should be relatively easy to wire it directly into the low-level Boost Archive operations for performance and to avoid allocating intermediate storage.

Nolie answered 25/10, 2014 at 23:12 Comment(4)
Oh, PS. I forgot to mention that you should consider not paying too much attention and just running the data through a compression algorithm if (1.) packet sizes are favourable (2.) CPU is not the bottleneck - Nolie
Thanks for your effort. If I need to send my serialized data over a network, I mean as part of a protocol, is it worth risking Boost Serialization techniques? I mean, 75 bytes is what's expected, so 107 or 88 bytes could mess it up, right? - Anemograph
@MarshelAbraham 75 bytes could mess it up just as easily. If you have an existing protocol, implement it. Period. No ifs, no buts. (Your expectation about Boost Serialization was just plain incorrect in that respect). You can of course use the Boost Spirit code I showed here as a starting point. It's not "risking" boost spirit, it's using it: You are in control of your program (or you should be!) so you cannot be the victim. Just implement the right thing correctly... - Nolie
PS. You still haven't explained how (the hell) you came up with 65 bytes. If now you "require" 75 bytes instead, it looks like you want 32-bit length fields. Here's the code for that without boost serialization (note that this has no Boost dependencies at runtime because Spirit is completely header-only) - Nolie

You seem to have some highly specific assumptions about how Boost Serialization should serialize to its proprietary, non-portable binary format.

Boost Serialization is much more high-level, more or less specifically designed to deal with non-POD data. If you insist, you should be able to serialize an array of your POD type directly. In your question, though, the class is not at all POD and hence not bitwise serializable anyway.
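The POD distinction can be checked at compile time. The sketch below uses the standard `std::is_trivially_copyable` trait as a stand-in for "bitwise serializable"; the two structs and the `raw_bytes` helper are hypothetical names made up for the illustration, and the resulting bytes depend on the platform's endianness and padding.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical POD-style header: only fixed-width integers.
struct PodHeader {
    std::uint16_t protocolVersion;
    std::uint16_t sessionFlags;
    std::uint16_t maxResponseLength;
};

// Hypothetical non-POD type: std::string owns heap memory, so copying the
// object's bytes would copy a pointer rather than the characters.
struct WithString {
    std::uint16_t protocolVersion;
    std::string   make;
};

static_assert(std::is_trivially_copyable<PodHeader>::value,
              "PodHeader can be copied byte for byte");
static_assert(!std::is_trivially_copyable<WithString>::value,
              "WithString cannot be copied byte for byte");

// Copies the object representation; refuses to compile for types like WithString.
template <typename T>
std::vector<unsigned char> raw_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw_bytes requires a trivially copyable type");
    std::vector<unsigned char> out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}
```

The class in the question is in the `WithString` category because of its `string` members (and its virtual destructor), which is why a plain memory dump is not an option there.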

For portable archives, see EOS Portable Archive.

Boost Archives have optional flags that suppress the format header:

enum archive_flags {
    no_header = 1,          // suppress archive header info
    no_codecvt = 2,         // suppress alteration of codecvt facet
    no_xml_tag_checking = 4 // suppress checking of xml tags - ignored on saving
};

See Archive Models

Here's a backgrounder to see what introduces overhead over simple bitwise serialization:

Nolie answered 24/10, 2014 at 22:42 Comment(5)
Thanks for the help. So what would you suggest? I have a whole lot of classes. I need to serialize/deserialize their members to a plain binary format corresponding to uint8_t*. I thought Boost Serialization would be smart and do it for me :( What would you suggest? I have added some more code to my question. - Anemograph
@MarshelAbraham I'd fix my assumptions about Boost Serialization or spend the time writing serialization code that matches the expected/required format - Nolie
I've seen a nice talk on using Boost Fusion to "generate" the serialization code at CppCon 2014. This is what I'd consider for large numbers of classes. - Nolie
I'm trying to figure out Boost Fusion. OMG, it's heavy, and I doubt it supports classes; I am worried about my code readability too. I would love to get it done with Boost Serialization. I think the byte sizes are getting appended. What does no_codecvt mean? Thanks. - Anemograph
If you "doubt it supports classes", then what part of Fusion did you look at? Anyways, yes this would lead to some (~200 lines?) of library code to avoid having to write repetitious code elsewhere. This was in response to "I have a whole lot of classes". - Nolie
