Thanks, the fact is that json is only one of many formats that I must to use, some format is proprietary and there are no libraries, so I want to use an uniform way for all. I've decided to use json for the question because is known to community more than, for example, asciimath or other formats created by us – Jepessen 9 hours ago
This changes nothing about my recommendation. If anything, it really emphasizes that you don't want arbitrary restrictions imposed.
The problems with Karma
Karma is an "inline" DSL for statically generated generators. They work well for statically typed things. Your AST uses dynamic polymorphism.
That removes any chance of writing a succinct generator, barring the use of many complicated semantic actions. I don't remember writing many explicit answers related to Karma, but the problems with both dynamic polymorphism and semantic actions are much the same on the Qi side.
The key drawbacks all apply, except obviously that AST creation is not happening, so the performance effect of allocations is less severe than with Qi parsers.
Still, the same logic stands: Karma generators are statically combined for efficiency, but your dynamic type hierarchy precludes most of that efficiency. In other words, you are not the target audience for Karma.
Karma has another structural limitation that will bite here, regardless of the way your AST is designed: it's (very) hard to make use of stateful rules to do pretty printing.
This is, for me, a key reason to practically never use Karma. Even if pretty printing isn't a goal, you can still get similar mileage by just generating output while visiting the AST using Boost Fusion directly (we used this in our project to generate different versions of OData XML and JSON representations of API types for use in RESTful APIs).
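To make the point about state concrete: outside of Karma, indentation is just a counter carried by the visitor. The following is a minimal sketch, not the Fusion-based code from our project. It assumes C++17 and uses std::variant rather than any Boost type, and every name in it (sketch::PrettyPrinter, sketch::to_pretty_json) is made up for illustration. String escaping is left out to keep it short.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <variant>
#include <vector>

namespace sketch {
    // Hypothetical stand-in for an output AST (assumption: C++17 std::variant)
    struct Function;
    struct Value { std::string name, value; };
    using Expression = std::variant<Value, Function>;
    struct Function { std::string name; std::vector<Expression> args; };

    // The visitor carries the one piece of state that is awkward to thread
    // through statically composed rules: the current nesting depth.
    struct PrettyPrinter {
        std::ostream& out;
        std::size_t depth = 0;

        // Start a new line indented by two spaces per nesting level
        void newline() { out << '\n' << std::string(2 * depth, ' '); }

        void operator()(Value const& v) {
            out << '{'; ++depth;
            newline(); out << "\"name\": \"" << v.name << "\",";
            newline(); out << "\"type\": \"Value\",";
            newline(); out << "\"value\": \"" << v.value << '"';
            --depth; newline(); out << '}';
        }

        void operator()(Function const& f) {
            out << '{'; ++depth;
            newline(); out << "\"name\": \"" << f.name << "\",";
            newline(); out << "\"type\": \"Function\",";
            newline(); out << "\"arguments\": [";
            ++depth;
            bool first = true;
            for (auto const& arg : f.args) {
                if (!first) out << ',';
                first = false;
                newline();
                std::visit(*this, arg);   // recurse with the same state
            }
            --depth;
            if (!f.args.empty()) newline();
            out << ']';
            --depth; newline(); out << '}';
        }
    };

    std::string to_pretty_json(Expression const& e) {
        std::ostringstream os;
        PrettyPrinter printer{os};
        std::visit(printer, e);
        return os.str();
    }
}
```

Calling sketch::to_pretty_json on a Value prints one field per line, indented by two spaces; each nested argument is indented one level deeper than its parent.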
Granted, there are some stateful generating tasks that have custom directives built into Karma, and sometimes they hit the sweet spot for rapid prototyping, e.g.
Let's Do It Anyways
Because I'm not a masochist, I'll borrow a concept from the other answer: creating an intermediate representation that suits Karma a lot better.
In this sample the intermediate representation can be exceedingly simple, but I suspect your other requirements like "for example, asciimath or other formats created by us" will require a more detailed design.
///////////////////////////////////////////////////////////////////////////////
// A simple intermediate representation
#include <boost/variant.hpp>
#include <string>
#include <vector>
namespace output_ast {
    struct Function;
    struct Value;
    using Expression = boost::variant<Function, Value>;
    using Arguments = std::vector<Expression>;
    struct Value { std::string name, value; };
    struct Function { std::string name; Arguments args; };
}
Firstly, because we're going to use Karma, we do need to actually adapt the intermediate representation:
#include <boost/fusion/include/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)
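Independent of Karma, the payoff of an intermediate representation like this is that each output format becomes one more visitor over the same data. A rough sketch, assuming C++17 and std::variant instead of boost::variant; the ir_sketch namespace and the prefix notation it prints are invented for illustration and are not one of your formats:

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

namespace ir_sketch {
    // Same shape as output_ast, rebuilt on std::variant (an assumption)
    struct Function;
    struct Value { std::string name, value; };
    using Expression = std::variant<Value, Function>;
    struct Function { std::string name; std::vector<Expression> args; };

    std::string to_prefix(Expression const& e);

    // A second, hypothetical backend: prints Name(arg, arg, ...)
    struct PrefixBackend {
        std::string operator()(Value const& v) const { return v.value; }
        std::string operator()(Function const& f) const {
            std::string out = f.name + "(";
            for (std::size_t i = 0; i < f.args.size(); ++i) {
                if (i) out += ", ";
                out += to_prefix(f.args[i]);
            }
            return out + ")";
        }
    };

    std::string to_prefix(Expression const& e) {
        return std::visit(PrefixBackend{}, e);
    }
}
```

A JSON backend would be another visitor with the same two overloads, so adding a format does not touch the representation itself.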
A generator
Here's the simplest generator I can think of, give or take two things:
- I tweaked it for a considerable time to get a somewhat "readable" format. It gets simpler if you remove all insignificant whitespace.
- I opted not to store redundant information (such as the static "type" representation) in the intermediate representation. Storing it would make things slightly less complicated, mostly by making the type rule more similar to name and value.
#include <boost/spirit/include/karma.hpp>
namespace karma_json {
    namespace ka = boost::spirit::karma;
    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;
            function
                = "{\n " << ka::delimit(",\n ")
                    [name << type(+"Function") ]
                << arguments
                << "\n}"
                ;
            arguments = "\"arguments\": [" << -(("\n " << expression) % ",") << ']';
            value
                = "{\n " << ka::delimit(",\n ")
                    [name << type(+"Value") ]
                << value_
                << "\n}"
                ;
            // the type label arrives as an inherited attribute (_r1)
            type = "\"type\":\"" << ka::string(ka::_r1) << "\"";
            // quote the string, escaping backslashes and double quotes
            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            name = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }
      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, value_;
        ka::rule<It, void(std::string)> type;
    };
}
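If the string rule looks cryptic: it emits an opening quote, then each character, prefixing a backslash whenever the character is a backslash or a double quote, then a closing quote. Here is the same logic as a plain function, as a sketch only (quote_json_minimal is a made-up name). Like the rule, it does not handle control characters such as newlines, which complete JSON escaping would require.

```cpp
#include <string>

// Mirrors the Karma `string` rule above: wrap in double quotes and put a
// backslash in front of every backslash or double quote. Control characters
// are passed through unchanged, which is NOT sufficient for arbitrary input.
std::string quote_json_minimal(std::string const& raw) {
    std::string out;
    out.reserve(raw.size() + 2);
    out += '"';
    for (char c : raw) {
        if (c == '\\' || c == '"')
            out += '\\';
        out += c;
    }
    out += '"';
    return out;
}
```

For example, the five-character input a"b\c comes out as "a\"b\\c" including the surrounding quotes.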
Post Scriptum
While writing the simplified take for completeness, I ran into this excellent demonstration of completely unobvious attribute handling quirks. The following (just stripping whitespace handling) does not work:
function = '{' << ka::delimit(',') [name << type] << arguments << '}';
value = '{' << ka::delimit(',') [name << type] << value_ << '}' ;
You can read the error novel here in case you like drama. The problem is that the delimit[] block magically consolidates the attributes into a single string (huh). The error message reflects that the string attribute has not been consumed when e.g. starting the arguments generator.
The most direct way to treat the symptom would be to break up the attribute, but there's no real way:
function = '{' << ka::delimit(',') [name << ka::eps << type] << arguments << '}';
value = '{' << ka::delimit(',') [name << ka::eps << type] << value_ << '}' ;
No difference
function = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << arguments << '}';
value = '{' << ka::delimit(',') [ka::as_string[name] << ka::as_string[type]] << value_ << '}' ;
Would be nice if it actually worked. No amount of adding includes or replacing with incantations like ka::as<std::string>()[...] made the compilation error go away.²
So, to just end this sob-story, we'll stoop to the mind-numbingly tedious:
function = '{' << name << ',' << type << ',' << arguments << '}';
arguments = "\"arguments\":[" << -(expression % ',') << ']';
See the section labeled "Simplified Version" below for the live demo.
Using it
The shortest way to generate using that grammar is to create the intermediate representation:
///////////////////////////////////////////////////////////////////////////////
// Expression -> output_ast
struct serialization {
    // Recursively convert the polymorphic AST into the variant-based output_ast
    static output_ast::Expression call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            output_ast::Arguments args;
            for (auto& a : f->m_arguments) args.push_back(call(a));
            return output_ast::Function { f->getName(), args };
        }
        if (auto* v = dynamic_cast<Value const*>(e)) {
            return output_ast::Value { v->getName(), v->getValue() };
        }
        // null or unknown node type: falls back to a default-constructed Expression
        return {};
    }
};
auto to_output(Expression const* expression) {
    return serialization::call(expression);
}
And use that:
using It = boost::spirit::ostream_iterator;
std::cout << format(karma_json::Generator<It>{}, to_output(plus1));
Full Demo
Live On Wandbox¹
#include <boost/lexical_cast.hpp>
#include <iostream>
#include <string>
#include <vector>
struct Expression {
    virtual ~Expression() = default; // polymorphic base: keep deletion through a base pointer safe
    virtual std::string getName() const = 0;
};
struct Value : Expression {
    virtual std::string getValue() const = 0;
};
struct IntegerValue : Value {
    IntegerValue(int value) : m_value(value) {}
    virtual std::string getName() const override { return "IntegerValue"; }
    virtual std::string getValue() const override { return boost::lexical_cast<std::string>(m_value); }
  private:
    int m_value;
};
struct Function : Expression {
    void addArgument(Expression *expression) { m_arguments.push_back(expression); }
    virtual std::string getName() const override { return m_name; }
  protected:
    std::vector<Expression *> m_arguments;
    std::string m_name;
    friend struct serialization;
};
struct Plus : Function {
    Plus() : Function() { m_name = "Plus"; }
};
///////////////////////////////////////////////////////////////////////////////
// A simple intermediate representation
#include <boost/variant.hpp>
namespace output_ast {
    struct Function;
    struct Value;
    using Expression = boost::variant<Function, Value>;
    using Arguments = std::vector<Expression>;
    struct Value { std::string name, value; };
    struct Function { std::string name; Arguments args; };
}
#include <boost/fusion/include/struct.hpp>
BOOST_FUSION_ADAPT_STRUCT(output_ast::Value, name, value)
BOOST_FUSION_ADAPT_STRUCT(output_ast::Function, name, args)
#include <boost/spirit/include/karma.hpp>
namespace karma_json {
    namespace ka = boost::spirit::karma;
    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;
            function
                = "{\n " << ka::delimit(",\n ")
                    [name << type(+"Function") ]
                << arguments
                << "\n}"
                ;
            arguments = "\"arguments\": [" << -(("\n " << expression) % ",") << ']';
            value
                = "{\n " << ka::delimit(",\n ")
                    [name << type(+"Value") ]
                << value_
                << "\n}"
                ;
            type = "\"type\":\"" << ka::string(ka::_r1) << "\"";
            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            name = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }
      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, value_;
        ka::rule<It, void(std::string)> type;
    };
}
///////////////////////////////////////////////////////////////////////////////
// Expression -> output_ast
struct serialization {
    static output_ast::Expression call(Expression const* e) {
        if (auto* f = dynamic_cast<Function const*>(e)) {
            output_ast::Arguments args;
            for (auto& a : f->m_arguments) args.push_back(call(a));
            return output_ast::Function { f->getName(), args };
        }
        if (auto* v = dynamic_cast<Value const*>(e)) {
            return output_ast::Value { v->getName(), v->getValue() };
        }
        return {};
    }
};
auto to_output(Expression const* expression) {
    return serialization::call(expression);
}
int main() {
    // Build expression 4 + 5 + 6 as 4 + (5 + 6)
    Function *plus1 = new Plus();
    Function *plus2 = new Plus();
    Value *iv4 = new IntegerValue(4);
    Value *iv5 = new IntegerValue(5);
    Value *iv6 = new IntegerValue(6);
    plus2->addArgument(iv5);
    plus2->addArgument(iv6);
    plus1->addArgument(iv4);
    plus1->addArgument(plus2);
    // Generate json string here, but how?
    using It = boost::spirit::ostream_iterator;
    std::cout << format(karma_json::Generator<It>{}, to_output(plus1));
}
The Output
The generator is not as readable/robust/functional as I'd like (there are quirks related to delimiters, there are issues when the type contains characters that would need to be quoted, and there's no stateful indentation).
The result doesn't look as expected, though it's valid JSON:
{
"name":"Plus",
"type":"Function",
"arguments": [
{
"name":"IntegerValue",
"type":"Value",
"value":"4"
},
{
"name":"Plus",
"type":"Function",
"arguments": [
{
"name":"IntegerValue",
"type":"Value",
"value":"5"
},
{
"name":"IntegerValue",
"type":"Value",
"value":"6"
}]
}]
}
Fixing it is... a nice challenge if you want to try it.
The Simplified Version
The simplified version, complete with attribute-handling workaround documented above:
Live On Coliru
namespace karma_json {
    namespace ka = boost::spirit::karma;
    template <typename It>
    struct Generator : ka::grammar<It, output_ast::Expression()> {
        Generator() : Generator::base_type(expression) {
            expression = function|value;
            function = '{' << name << ',' << type << ',' << arguments << '}';
            arguments = "\"arguments\":[" << -(expression % ',') << ']';
            value = '{' << name << ',' << type << ',' << value_ << '}' ;
            string = '"' << *('\\' << ka::char_("\\\"") | ka::char_) << '"';
            type = "\"type\":" << string;
            name = "\"name\":" << string;
            value_ = "\"value\":" << string;
        }
      private:
        ka::rule<It, output_ast::Expression()> expression;
        ka::rule<It, output_ast::Function()> function;
        ka::rule<It, output_ast::Arguments()> arguments;
        ka::rule<It, output_ast::Value()> value;
        ka::rule<It, std::string()> string, name, type, value_;
    };
}
Yields the following output:
{"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"4"},{"name":"Plus","type":"Function","arguments":[{"name":"IntegerValue","type":"Value","value":"5"},{"name":"IntegerValue","type":"Value","value":"6"}]}]}
I'm inclined to think this is a much better cost/benefit ratio than the failed attempt at "pretty" formatting. But the real story here is that the maintenance cost is through the roof anyways.
¹ Interestingly, Coliru exceeds the compilation time... This too could be an argument guiding your design decisions
² makes you wonder how many people actually use Karma day-to-day
serialization struct with back-end/format agnostic accessors only, and separate all backends out to different TUs. I'll leave that as the proverbial exercise to the reader. – Anthonyanthophore