Boost Karma generator for composition of classes

I've the following class diagram:

class diagram

Some classes, such as BinaryOperator, are unused here, but my real code needs them, so I want to keep them in the example as well.

I want to use boost::karma in order to obtain a JSON representation of this. The JSON should be like the following one:

{
  "name": "Plus",
  "type": "Function",
  "arguments": [
    {
      "name": "IntegerValue",
      "type": "Value",
      "value": "4"
    },
    {
      "name": "Plus",
      "type": "Function",
      "arguments": [
        {
          "name": "IntegerValue",
          "type": "Value",
          "value": "5"
        },
        {
          "name": "IntegerValue",
          "type": "Value",
          "value": "6"
        }
      ]
    }
  ]
}

Since it's a simple example, I'd like to use the BOOST_FUSION_ADAPT_ADT macro for my classes in order to modularize the generator.

I'm new to Karma. I've read the tutorial on the Boost site, but I don't understand how to attack my problem, and I can't find a good tutorial about that macro.

I don't want to use an existing JSON library: first, because I want to learn Karma, and second, because JSON is only an example. I need to export my expressions in many formats, and I'd like to do that by simply changing generators, while the code that uses BOOST_FUSION_ADAPT_ADT for my classes stays the same.

Below is the code that creates a sample expression. Where do I need to start in order to solve my problem?

#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>
#include <vector>

class Expression {
public:

  virtual std::string getName() const = 0;
};

class Value : public Expression {
public:

  virtual std::string getValue() const = 0;
};

class IntegerValue : public Value {
public:

  IntegerValue(int value) : m_value(value) {}
  virtual std::string getName() const override { return "IntegerValue"; }
  virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }

private:

  int m_value;
};

class Function : public Expression {
public:

  void addArgument(Expression* expression) { m_arguments.push_back(expression); }
  virtual std::string getName() const override { return m_name; }

protected:

  std::vector<Expression*> m_arguments;
  std::string m_name;
};

class Plus : public Function {
public:

  Plus() : Function() { m_name = "Plus"; }
};

///////////////////////////////////////////////////////////////////////////////

int main(int argc, char **argv) {

  // Build expression 4 + 5 + 6 as 4 + (5 + 6)
  Function* plus1 = new Plus();
  Function* plus2 = new Plus();
  Value* iv4   = new IntegerValue(4);
  Value* iv5   = new IntegerValue(5);
  Value* iv6   = new IntegerValue(6);
  plus2->addArgument(iv5);
  plus2->addArgument(iv6);
  plus1->addArgument(iv4);
  plus1->addArgument(plus2);

  // Generate json string here, but how?

  return 0;
}
Trilemma answered 19/10, 2017 at 21:10 Comment(0)

I'd advise against using Karma to generate JSON. I'd advise strongly against ADAPT_ADT (it's prone to very subtle UB bugs and it means you're trying to adapt something that wasn't designed for it. Just say no).

Here's my take on it. Let's take the high road and be as unintrusive as possible. That means

  • We can't just overload operator<< to print JSON (because you may want operator<< to print the expressions in their natural form instead).
  • Whatever function is responsible for generating the JSON doesn't

    • have to bother with JSON implementation details
    • have to bother with pretty formatting
  • Finally, I wouldn't want to intrude on the expression tree with anything JSON-specific. The most that could be acceptable is an opaque friend declaration.


A simple JSON facility:

This might well be the most simplistic JSON representation, but it does the required subset and makes a number of smart choices (supporting duplicate properties, retaining property order for example):

#include <boost/variant.hpp>
#include <string>
#include <utility>
#include <vector>
namespace json {
    // adhoc JSON rep
    struct Null {};
    using String = std::string;

    using Value = boost::make_recursive_variant<
        Null,
        String,
        std::vector<boost::recursive_variant_>,
        std::vector<std::pair<String, boost::recursive_variant_> >
    >::type;

    using Property = std::pair<String, Value>;
    using Object = std::vector<Property>;
    using Array = std::vector<Value>;
}

That's all. This is fully functional. Let's prove it.


Pretty Printing JSON

Like with the Expression tree itself, let's not hardwire this, but instead create a pretty-printing IO manipulator:

#include <iomanip>
namespace json {

    // pretty print it
    struct pretty_io {
        using result_type = void;

        template <typename Ref>
        struct manip {
            Ref ref;
            friend std::ostream& operator<<(std::ostream& os, manip const& m) {
                pretty_io{os,""}(m.ref);
                return os;
            }
        };

        std::ostream& _os;
        std::string _indent;

        void operator()(Value const& v) const {
            boost::apply_visitor(*this, v);
        }
        void operator()(Null) const {
            _os << "null";
        }
        void operator()(String const& s) const {
            _os << std::quoted(s);
        }
        void operator()(Property const& p) const {
            _os << '\n' << _indent; operator()(p.first);
            _os << ": ";            operator()(p.second);
        }
        void operator()(Object const& o) const {
            pretty_io nested{_os, _indent+"  "};
            _os << "{";
            bool first = true;
            for (auto& p : o) { first||_os << ","; nested(p); first = false; }
            _os << "\n" << _indent << "}";
        }
        void operator()(Array const& o) const {
            pretty_io nested{_os, _indent+"  "};
            _os << "[\n" << _indent << "  ";
            bool first = true;
            for (auto& p : o) { first||_os << ",\n" << _indent << "  "; nested(p); first = false; }
            _os << "\n" << _indent << "]";
        }
    };

    Value to_json(Value const& v) { return v; }

    template <typename T, typename V = decltype(to_json(std::declval<T const&>()))>
    pretty_io::manip<V> pretty(T const& v) { return {to_json(v)}; }
}
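
One detail of the String overload is worth checking in isolation: std::quoted wraps the text in double quotes and backslash-escapes embedded double quotes and backslashes, but it passes control characters such as newlines through unchanged, which strict JSON would want escaped. A minimal sketch to observe that behavior (quoted_text is a hypothetical helper, not part of the code above):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: returns exactly what `os << std::quoted(s)` writes.
inline std::string quoted_text(std::string const& s) {
    std::ostringstream os;
    os << std::quoted(s); // adds surrounding '"', escapes '"' and '\' with '\'
    return os.str();
}
```

If the expression names or values can ever contain control characters, the String overload would need its own escaping loop instead of std::quoted.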

The to_json function doubles as a handy ADL-enabled extension point; you can already use it now:

std::cout << json::pretty("hello world"); // prints as a JSON String
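
To see why the unqualified to_json call acts as an extension point, here is a stripped-down sketch of the same mechanism. Everything in it is made up for illustration: the demo namespace stands in for json, a plain std::string stands in for json::Value, and app::Point is a hypothetical user type.

```cpp
#include <string>
#include <utility>

namespace demo {
    using Value = std::string; // stand-in for json::Value (assumption)

    // Identity overload, mirroring json::to_json(Value const&)
    inline Value to_json(Value const& v) { return v; }

    // The call to to_json is unqualified, so overloads living in the
    // namespace of T are found by argument-dependent lookup (ADL).
    template <typename T, typename V = decltype(to_json(std::declval<T const&>()))>
    V render(T const& v) { return to_json(v); }
}

namespace app {
    struct Point { int x, y; };

    // Declared next to Point; demo::render finds it without demo knowing about app.
    inline std::string to_json(Point const& p) {
        return "[" + std::to_string(p.x) + "," + std::to_string(p.y) + "]";
    }
}
```

Calling demo::render(app::Point{1, 2}) picks app::to_json, while demo::render(std::string("hi")) falls back to the identity overload. Types without a matching to_json drop out of overload resolution because of the decltype in the template parameter list.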

Connecting it up

To make the following work:

std::cout << json::pretty(plus1);

All we need is the appropriate to_json overload. We could jot it all in there, but we might end up needing to "friend" a function named to_json and, worse still, forward-declare types from the json namespace (json::Value at the very least). That's too intrusive. So, let's add another tiny indirection:

auto to_json(Expression const* expression) {
    return serialization::call(expression);
}

The trick is to hide the JSON stuff inside an opaque struct that we can then befriend: struct serialization. The rest is straightforward:

struct serialization {
    static json::Value call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            json::Array args;
            for (auto& a : f->m_arguments)
                args.push_back(call(a));
            return json::Object {
                { "name", f->getName() },
                { "type", "Function" },
                { "arguments", args },
            };
        }

        if (auto* v = dynamic_cast<Value const*>(e)) {
            return json::Object {
                { "name", v->getName() },
                { "type", "Value" },
                { "value", v->getValue() },
            };
        }

        return {}; // Null in case we didn't implement a node type
    }
};
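
The friend trick can be looked at separately from the JSON details. In this sketch the class only names the struct serialization, so nothing about the output format leaks into its declaration. Node, add and the bracketed text format are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Node {
  public:
    void add(int v) { m_items.push_back(v); }

  protected:
    std::vector<int> m_items;

    // Opaque friend: no output-format type is mentioned here.
    friend struct serialization;
};

// Defined later, possibly in another header; it may read the protected members.
struct serialization {
    static std::string call(Node const& n) {
        std::string out = "[";
        for (std::size_t i = 0; i < n.m_items.size(); ++i) {
            if (i != 0) out += ",";
            out += std::to_string(n.m_items[i]);
        }
        return out + "]";
    }
};
```

Swapping the body of serialization::call for another format does not require touching Node again.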

Full Demo

See it Live On Coliru

#include <boost/lexical_cast.hpp>
#include <iostream>
#include <iomanip>
#include <string>
#include <vector>

struct Expression {
    virtual std::string getName() const = 0;
};

struct Value : Expression {
    virtual std::string getValue() const = 0;
};

struct IntegerValue : Value {
    IntegerValue(int value) : m_value(value) {}
    virtual std::string getName() const override { return "IntegerValue"; }
    virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }

  private:
    int m_value;
};

struct Function : Expression {
    void addArgument(Expression *expression) { m_arguments.push_back(expression); }
    virtual std::string getName() const override { return m_name; }

  protected:
    std::vector<Expression *> m_arguments;
    std::string m_name;

    friend struct serialization;
};

struct Plus : Function {
    Plus() : Function() { m_name = "Plus"; }
};

///////////////////////////////////////////////////////////////////////////////
// A simple JSON facility
#include <boost/variant.hpp>
namespace json {
    // adhoc JSON rep
    struct Null {};
    using String = std::string;

    using Value = boost::make_recursive_variant<
        Null,
        String,
        std::vector<boost::recursive_variant_>,
        std::vector<std::pair<String, boost::recursive_variant_> >
    >::type;

    using Property = std::pair<String, Value>;
    using Object = std::vector<Property>;
    using Array = std::vector<Value>;
}

///////////////////////////////////////////////////////////////////////////////
// Pretty Print manipulator
#include <iomanip>
namespace json {

    // pretty print it
    struct pretty_io {
        using result_type = void;

        template <typename Ref>
        struct manip {
            Ref ref;
            friend std::ostream& operator<<(std::ostream& os, manip const& m) {
                pretty_io{os,""}(m.ref);
                return os;
            }
        };

        std::ostream& _os;
        std::string _indent;

        void operator()(Value const& v) const {
            boost::apply_visitor(*this, v);
        }
        void operator()(Null) const {
            _os << "null";
        }
        void operator()(String const& s) const {
            _os << std::quoted(s);
        }
        void operator()(Property const& p) const {
            _os << '\n' << _indent; operator()(p.first);
            _os << ": ";            operator()(p.second);
        }
        void operator()(Object const& o) const {
            pretty_io nested{_os, _indent+"  "};
            _os << "{";
            bool first = true;
            for (auto& p : o) { first||_os << ","; nested(p); first = false; }
            _os << "\n" << _indent << "}";
        }
        void operator()(Array const& o) const {
            pretty_io nested{_os, _indent+"  "};
            _os << "[\n" << _indent << "  ";
            bool first = true;
            for (auto& p : o) { first||_os << ",\n" << _indent << "  "; nested(p); first = false; }
            _os << "\n" << _indent << "]";
        }
    };

    Value to_json(Value const& v) { return v; }

    template <typename T, typename V = decltype(to_json(std::declval<T const&>()))>
    pretty_io::manip<V> pretty(T const& v) { return {to_json(v)}; }
}

///////////////////////////////////////////////////////////////////////////////
// Expression -> JSON
struct serialization {
    static json::Value call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            json::Array args;
            for (auto& a : f->m_arguments)
                args.push_back(call(a));
            return json::Object {
                { "name", f->getName() },
                { "type", "Function" },
                { "arguments", args },
            };
        }

        if (auto* v = dynamic_cast<Value const*>(e)) {
            return json::Object {
                { "name", v->getName() },
                { "type", "Value" },
                { "value", v->getValue() },
            };
        }

        return {};
    }
};

auto to_json(Expression const* expression) {
    return serialization::call(expression);
}

int main() {
    // Build expression 4 + 5 + 6 as 4 + (5 + 6)
    Function *plus1 = new Plus();
    Function *plus2 = new Plus();
    Value *iv4 = new IntegerValue(4);
    Value *iv5 = new IntegerValue(5);
    Value *iv6 = new IntegerValue(6);
    plus2->addArgument(iv5);
    plus2->addArgument(iv6);
    plus1->addArgument(iv4);
    plus1->addArgument(plus2);

    // Generate json string here, but how?

    std::cout << json::pretty(plus1);
}

The output matches the one in your question exactly:

{
  "name": "Plus",
  "type": "Function",
  "arguments": [
    {
      "name": "IntegerValue",
      "type": "Value",
      "value": "4"
    },
    {
      "name": "Plus",
      "type": "Function",
      "arguments": [
        {
          "name": "IntegerValue",
          "type": "Value",
          "value": "5"
        },
        {
          "name": "IntegerValue",
          "type": "Value",
          "value": "6"
        }
      ]
    }
  ]
}
Anthonyanthophore answered 20/10, 2017 at 1:29 Comment(2)
PS. I forgot to mention that, obviously, you should consider a JSON library, but the approach would still work. If you're smart you can task the serialization struct with back-end/format-agnostic accessors only, and separate all backends out into different TUs. I'll leave that as the proverbial exercise for the reader. – Anthonyanthophore
Thanks. The fact is that JSON is only one of many formats that I must use; some formats are proprietary and have no libraries, so I want a uniform approach for all of them. I chose JSON for the question because it is better known to the community than, for example, AsciiMath or other formats created by us. – Trilemma

Thanks, the fact is that json is only one of many formats that I must to use, some format is proprietary and there are no libraries, so I want to use an uniform way for all. I've decided to use json for the question because is known to community more than, for example, asciimath or other formats created by us – Jepessen 9 hours ago

This changes nothing about my recommendation. If anything, it really emphasizes that you don't want arbitrary restrictions imposed.

The problems with Karma

  • Karma is an "inline" DSL for statically generated generators. They work well for statically typed things. Your AST uses dynamic polymorphism.

    That removes any chance of writing a succinct generator barring the use of many, complicated semantic actions. I don't remember writing many explicit answers related to Karma, but the problems with both dynamic polymorphism and semantic actions are much the same on the Qi side:

    The key drawbacks all apply, except obviously that AST creation is not happening, so the performance effect of allocations is less severe than with Qi parsers.

    The same logic still stands, though: Karma generators are statically combined for efficiency, and your dynamic type hierarchy precludes most of that efficiency. In other words, you are not the target audience for Karma.

  • Karma has another structural limitation that will bite here, regardless of the way your AST is designed: it's (very) hard to make use of stateful rules to do pretty printing.

    This is, for me, a key reason to practically never use Karma. Even if pretty printing isn't a goal you can still get similar mileage just generating output visiting the AST using Boost Fusion directly (we used this in our project to generate different versions of OData XML and JSON representations of API types for use in restful APIs).

    Granted, there are some stateful generating tasks that have custom directives builtin to Karma, and sometimes they hit the sweet spot for rapid prototyping, e.g.

Let's Do It Anyways

Because I'm not a masochist, I'll borrow a concept from the other answer: creating an intermediate representation that facilitates Karma a lot better.

In this sample the intermediate representation can be exceedingly simple, but I suspect your other requirements like "for example, asciimath or other formats created by us" will require a more detailed design.

///////////////////////////////////////////////////////////////////////////////
// A simple intermediate representation
#include <boost/variant.hpp>
namespace output_ast {
    struct Function;
    struct Value;
    using Expression = boost::variant<Function, Value>;

    using Arguments = std::vector<Expression>;

    struct Value    { std::string name, value; };
    struct Function { std::string name; Arguments args; };
}

Firstly, because we're going to use Karma, we do need to actually adapt the intermediate representation:

#include <boost/fusion/include/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)

A generator

Here's the simplest generator I can think of, give or take two things:

  • I have tweaked it for considerable time to get some "readable" format. It gets simpler if you remove all insignificant whitespace.
  • I opted to not store redundant information (such as the static "type" representation) in the intermediate representation. Storing it would simplify things slightly, mostly by making the type rule more similar to name and value.

#include <boost/spirit/include/karma.hpp>
namespace karma_json {
    namespace ka = boost::spirit::karma;

    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;

            function
                = "{\n  " << ka::delimit(",\n  ") 
                   [name << type(+"Function") ]
                << arguments 
                << "\n}"
                ;

            arguments = "\"arguments\": [" << -(("\n  " << expression) % ",") << ']';

            value
                = "{\n  " << ka::delimit(",\n  ") 
                    [name << type(+"Value") ]
                << value_ 
                << "\n}"
                ;

            type   = "\"type\":\"" << ka::string(ka::_r1) << "\"";
            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            name   = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }

      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, value_;
        ka::rule<It, void(std::string)> type;
    };
}

Post Scriptum

I was making the simplified take for completeness, and ran into this excellent demonstration of completely unobvious attribute-handling quirks. The following (just stripping whitespace handling) does not work:

function = '{' << ka::delimit(',') [name << type] << arguments << '}';
value = '{' << ka::delimit(',') [name << type] << value_ << '}' ;

You can read the error novel here in case you like drama. The problem is that the delimit[] block magically consolidates the attributes into a single string (huh). The error message reflects that the string attribute has not been consumed when e.g. starting the arguments generator.

The most direct way to treat the symptom would be to break up the attribute, but there's no real way:

function = '{' << ka::delimit(',') [name << ka::eps << type] << arguments << '}';
value = '{' << ka::delimit(',') [name << ka::eps << type] << value_ << '}' ;

No difference.

function = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << arguments << '}';
value = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << value_ << '}' ;

Would be nice if it actually worked. No amount of adding includes or replacing with incantations like ka::as<std::string>()[...] made the compilation error go away.²

So, to just end this sob-story, we'll stoop to the mind-numbingly tedious:

function = '{' << name << ',' << type << ',' << arguments << '}';
arguments = "\"arguments\":[" << -(expression % ',') << ']';

See the section labeled "Simplified Version" below for the live demo.

Using it

The shortest way to generate using that grammar is to create the intermediate representation:

///////////////////////////////////////////////////////////////////////////////
// Expression -> output_ast
struct serialization {
    static output_ast::Expression call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            output_ast::Arguments args;
            for (auto& a : f->m_arguments) args.push_back(call(a));
            return output_ast::Function { f->getName(), args };
        }

        if (auto* v = dynamic_cast<Value const*>(e)) {
            return output_ast::Value { v->getName(), v->getValue() };
        }

        return {};
    }
};

auto to_output(Expression const* expression) {
    return serialization::call(expression);
}

And use that:

using It = boost::spirit::ostream_iterator;
std::cout << format(karma_json::Generator<It>{}, to_output(plus1));

Full Demo

Live On Wandbox¹

#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>
#include <vector>

struct Expression {
    virtual std::string getName() const = 0;
};

struct Value : Expression {
    virtual std::string getValue() const = 0;
};

struct IntegerValue : Value {
    IntegerValue(int value) : m_value(value) {}
    virtual std::string getName() const override { return "IntegerValue"; }
    virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }

  private:
    int m_value;
};

struct Function : Expression {
    void addArgument(Expression *expression) { m_arguments.push_back(expression); }
    virtual std::string getName() const override { return m_name; }

  protected:
    std::vector<Expression *> m_arguments;
    std::string m_name;

    friend struct serialization;
};

struct Plus : Function {
    Plus() : Function() { m_name = "Plus"; }
};

///////////////////////////////////////////////////////////////////////////////
// A simple intermediate representation
#include <boost/variant.hpp>
namespace output_ast {
    struct Function;
    struct Value;
    using Expression = boost::variant<Function, Value>;

    using Arguments = std::vector<Expression>;

    struct Value    { std::string name, value; };
    struct Function { std::string name; Arguments args; };
}

#include <boost/fusion/include/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)

#include <boost/spirit/include/karma.hpp>
namespace karma_json {
    namespace ka = boost::spirit::karma;

    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;

            function
                = "{\n  " << ka::delimit(",\n  ") 
                   [name << type(+"Function") ]
                << arguments 
                << "\n}"
                ;

            arguments = "\"arguments\": [" << -(("\n  " << expression) % ",") << ']';

            value
                = "{\n  " << ka::delimit(",\n  ") 
                    [name << type(+"Value") ]
                << value_ 
                << "\n}"
                ;

            type   = "\"type\":\"" << ka::string(ka::_r1) << "\"";
            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            name   = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }

      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, value_;
        ka::rule<It, void(std::string)> type;
    };
}

///////////////////////////////////////////////////////////////////////////////
// Expression -> output_ast
struct serialization {
    static output_ast::Expression call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            output_ast::Arguments args;
            for (auto& a : f->m_arguments) args.push_back(call(a));
            return output_ast::Function { f->getName(), args };
        }

        if (auto* v = dynamic_cast<Value const*>(e)) {
            return output_ast::Value { v->getName(), v->getValue() };
        }

        return {};
    }
};

auto to_output(Expression const* expression) {
    return serialization::call(expression);
}

int main() {
    // Build expression 4 + 5 + 6 as 4 + (5 + 6)
    Function *plus1 = new Plus();
    Function *plus2 = new Plus();
    Value *iv4 = new IntegerValue(4);
    Value *iv5 = new IntegerValue(5);
    Value *iv6 = new IntegerValue(6);
    plus2->addArgument(iv5);
    plus2->addArgument(iv6);
    plus1->addArgument(iv4);
    plus1->addArgument(plus2);

    // Generate json string here, but how?
    using It = boost::spirit::ostream_iterator;
    std::cout << format(karma_json::Generator<It>{}, to_output(plus1));
}

The Output

The generator is not as readable/robust/functional as I'd like (there are quirks related to delimiters, there are issues when type contains characters that would need to be quoted, and there's no stateful indentation).

The result doesn't look as expected, though it's valid JSON:

{
  "name":"Plus",
  "type":"Function",
  "arguments": [
  {
  "name":"IntegerValue",
  "type":"Value",
  "value":"4"
},
  {
  "name":"Plus",
  "type":"Function",
  "arguments": [
  {
  "name":"IntegerValue",
  "type":"Value",
  "value":"5"
},
  {
  "name":"IntegerValue",
  "type":"Value",
  "value":"6"
}]
}]
}

Fixing it is... a nice challenge if you want to try it.

The Simplified Version

The simplified version, complete with attribute-handling workaround documented above:

Live On Coliru

namespace karma_json {
    namespace ka = boost::spirit::karma;

    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;

            function = '{' << name << ',' << type << ',' << arguments << '}';
            arguments = "\"arguments\":[" << -(expression % ',') << ']';

            value = '{' << name << ',' << type << ',' << value_ << '}' ;

            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            type   = "\"type\":" << string;
            name   = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }

      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, type, value_;
    };
}

Yields the following output:

{"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"4"},{"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"5"},{"name":"IntegerValue","type":"Value","value":"6"}]}]}

I'm inclined to think this is a much better cost/benefit ratio than the failed attempt at "pretty" formatting. But the real story here is that the maintenance cost is through the roof anyways.
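
For a sense of the alternative, here is a minimal sketch of a hand-written visitor that produces the same kind of compact output without Karma. It assumes C++17 and swaps boost::variant for std::variant; the ir namespace and the names quote, Printer and to_compact_json are made up for this illustration.

```cpp
#include <string>
#include <variant>
#include <vector>

namespace ir {
    struct Function;
    struct Value { std::string name, value; };
    using Expression = std::variant<Function, Value>;
    // A vector of a not-yet-complete element type is fine as a member here
    struct Function { std::string name; std::vector<Expression> args; };

    // Wrap in double quotes, escaping '"' and '\' (control characters are not handled)
    inline std::string quote(std::string const& s) {
        std::string out = "\"";
        for (char c : s) {
            if (c == '"' || c == '\\') out += '\\';
            out += c;
        }
        return out + "\"";
    }

    std::string to_compact_json(Expression const& e);

    struct Printer {
        std::string operator()(Value const& v) const {
            return "{\"name\":" + quote(v.name) + ",\"type\":\"Value\",\"value\":" +
                   quote(v.value) + "}";
        }
        std::string operator()(Function const& f) const {
            std::string out =
                "{\"name\":" + quote(f.name) + ",\"type\":\"Function\",\"arguments\":[";
            bool first = true;
            for (auto const& a : f.args) {
                if (!first) out += ",";
                out += to_compact_json(a); // recurse into nested expressions
                first = false;
            }
            return out + "]}";
        }
    };

    inline std::string to_compact_json(Expression const& e) {
        return std::visit(Printer{}, e);
    }
}
```

Adding indentation here means passing one more argument through the recursion, which is exactly the kind of state that is awkward to thread through Karma rules.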


¹ Interestingly, Coliru exceeds the compilation time limit... This too could be an argument guiding your design decisions.

² makes you wonder how many people actually use Karma day-to-day

Anthonyanthophore answered 20/10, 2017 at 18:1 Comment(1)
I hesitate to say how much time I wasted battling Karma to get an ADT/polymorphic version done. Here's the thing that finally worked: wandbox.org/permlink/H4ybpbM5hEi2c78x TL;DR: you cannot use polymorphic types with Karma, certainly not abstract types. All attributes need to be copyable and default-constructible. BOOST_FUSION_ADAPT_ADT_NAMED doesn't work for immutable attributes (gasp, why) etc. Anyway, now you have all the info to base decisions on. – Anthonyanthophore
