Is there a way to specify a simpler JSON (de-)serialization for std::map using Cereal / C++?

The project I'm working on is a C++ application that manages a large number of custom hardware devices. The app has a socket/port interface for the client (like a GUI). Each device type has its own well-defined JSON schema and we can serialize those with Cereal just fine.

But the app also needs to parse inbound JSON requests from the client. One portion of the request specifies device filter parameters, roughly analogous to a SQL 'WHERE' clause in which all the expressions are ANDed together. E.g.:

"filter": { "type": "sensor", "status": "critical" }

This would indicate that the client wants to perform an operation on every "sensor" device with a "critical" status. On the surface, it seemed like the C++ implementation for the filter parameters would be a std::map. But when we experimented with using Cereal to deserialize the object it failed. And when we serialize a hard-coded filter map it looks like this:

"filter": [
   { "key": "type", "value": "sensor" },
   { "key": "status", "value": "critical" }
]

Now I can understand why Cereal supports this kind of verbose serialization of map. After all, the key of a map could be a non-string type. But in this case the key is a string.

I'm not exactly keen on rewriting our interface spec and making our clients generate clearly non-idiomatic JSON just to satisfy Cereal. I'm new to Cereal and we're stuck on this point. Is there a way to tell Cereal to parse this filter as a std::map? Or maybe I'm asking the wrong question: is there some other STL container that we should be deserializing into?

Elmoelmore answered 21/3, 2014 at 21:26 Comment(2)
Two quick questions - do you expect your input to always be stored in a JSON object? You might have noticed that cereal uses JSON arrays for variable sized containers. Secondly, if you don't have control over this, do you have a way of knowing the number of name-value pairs that your query will return within the JSON object? - Bran
@Azoth, yes, the protocol is defined as JSON. And no, there is no pre-defined number of pairs -- that's kind of the point. I'm suspecting at this point that Cereal just won't work and we'll pick some other way of deserializing the filter. - Elmoelmore

Let me first address why cereal outputs a more verbose style than one you may desire. cereal is written to work with arbitrary serialization archives and takes a middle ground approach of satisfying all of them. Imagine that the key type is something entirely more complicated than a string or arithmetic type - how could we serialize it in a simple "key" : "value" way?

Also note that cereal expects to be the progenitor of any data it reads in.


That being said, what you want is entirely possible with cereal but there are a few obstacles:

The largest obstacle to overcome is the fact that your desired input serializes some unknown number of name-value pairs inside of a JSON object and not a JSON array. cereal was designed to use JSON arrays when dealing with containers that can hold a variable number of elements, since this made the most sense given the underlying rapidjson parser it uses.

Secondly, cereal currently does not expect the name in a name-value-pair to actually be loaded into memory - it just uses them as an organizational tool.


So rambling done, here is a fully working solution (could be made more elegant) to your problem with very minimal changes to cereal (this in fact uses a change that is slated for cereal 1.1, the current version is 1.0):

Add this function to JSONInputArchive:

//! Retrieves the current node name
/*! @return nullptr if no name exists */
const char * getNodeName() const
{
  return itsIteratorStack.back().name();
}

You can then write overloads of the serialization functions for std::map (or std::unordered_map, whichever you prefer) with string keys and string values. Make sure to put them in the cereal namespace so that the compiler can find them. This code should live in your own files somewhere:

namespace cereal
{
  //! Saving for std::map<std::string, std::string>
  template <class Archive, class C, class A> inline
  void save( Archive & ar, std::map<std::string, std::string, C, A> const & map )
  {
    for( const auto & i : map )
      ar( cereal::make_nvp( i.first, i.second ) );
  }

  //! Loading for std::map<std::string, std::string>
  template <class Archive, class C, class A> inline
  void load( Archive & ar, std::map<std::string, std::string, C, A> & map )
  {
    map.clear();

    auto hint = map.begin();
    while( true )
    {
      const auto namePtr = ar.getNodeName();

      if( !namePtr )
        break;

      std::string key = namePtr;
      std::string value; ar( value );
      hint = map.emplace_hint( hint, std::move( key ), std::move( value ) );
    }
  }
} // namespace cereal

This isn't the most elegant solution, but it does work well. I left everything generically templated but what I wrote above will only work on JSON archives given the changes made. Adding a similar getNodeName() to the XML archive would likely let it work there too, but obviously this wouldn't make sense for binary archives.

To make this clean, you'd want to put enable_if around those overloads so that they only apply to the archives they work with. You would also need to modify the JSON archives in cereal to work with variable sized JSON objects. To get an idea of how to do this, look at how cereal sets up state in the archive when it gets a SizeTag to serialize. Basically you'd have to make the archive open an object instead of an array, and then create your own version of loadSize() that would see how big the object is (it would count Members, in rapidjson parlance).


To see the save/load overloads in action, run this code:

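#include <cereal/archives/json.hpp>
#include <cereal/types/string.hpp>

#include <iostream>
#include <map>
#include <sstream>
#include <string>

// The cereal::save / cereal::load overloads shown earlier must be visible here.

int main()
{
  std::stringstream ss;
  {
    // The archive finishes writing its JSON when it goes out of scope
    cereal::JSONOutputArchive ar(ss);
    std::map<std::string, std::string> filter = {{"type", "sensor"}, {"status", "critical"}};

    ar( CEREAL_NVP(filter) );
  }

  std::cout << ss.str() << std::endl;

  {
    // Read the map back in, then write it out again to check the round trip
    cereal::JSONInputArchive ar(ss);
    cereal::JSONOutputArchive ar2(std::cout);

    std::map<std::string, std::string> filter;

    ar( CEREAL_NVP(filter) );
    ar2( CEREAL_NVP(filter) );
  }

  std::cout << std::endl;
  return 0;
}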

and you will get:

{
    "filter": {
        "status": "critical",
        "type": "sensor"
    }
}
{
    "filter": {
        "status": "critical",
        "type": "sensor"
    }
}
Bran answered 23/3, 2014 at 5:40 Comment(4)
I'm trying to test this, but I've hit a couple of snags. First snag was that in our headers JSONInputArchive doesn't have a member itsIteratorStack. The data member list looks like this:
  private:
    ReadStream itsReadStream;             //!< Rapidjson write stream
    std::vector<Iterator> itsValueStack;  //!< Stack of values
    rapidjson::Document itsDocument;      //!< Rapidjson document
I rolled the bones and tried substituting itsValueStack for itsIteratorStack, but then the compiler complains about name: - Elmoelmore
Ugh, not used to StackOverflow comment limitations. Compiler error after my substitution:
  /usr/local/include/cereal/archives/json.hpp:327:39: error: ‘const value_type’ has no member named ‘name’
  return itsValueStack.back().name();
I suspect there's either (1) a version difference (and I can't determine what version of cereal we have -- there's no identifier in the headers), or (2) maybe there's another bit of code that's missing, like a name() method for class Iterator nested within JSONInputArchive. - Elmoelmore
Are you using the latest version of cereal? 1.0 was released just a day or so ago and contains a large number of changes: github.com/USCiLab/cereal/releases. - Bran
It works! We had to upgrade our cereal headers to 1.0, then it worked. Thanks so much for your help! - Elmoelmore
