Boost::spirit how to parse and call c++ function-like expressions
Asked Answered
C

2

3

I want to use boost spirit to parse an expression like

function1(arg1, arg2, function2(arg1, arg2, arg3), function3(arg1,arg2))

and call corresponding c++ functions. What should be the grammar to parse above expression and call the corresponding c++ function by phoneix::bind()?

I have 2 types of functions to call

1) string functions;

wstring GetSubString(wstring stringToCut, int position, int length); wstring GetStringToken(wstring stringToTokenize, wstring seperators, int tokenNumber );

2) Functions that return integer;

int GetCount();

int GetId(wstring srcId, wstring srcType);

Consonant answered 9/6, 2013 at 18:44 Comment(6)
without code it's impossible to tell what "the corresponding c++ function" is supposed to meanBlackwood
I just updated with function declarations. Do you think its possible that I can parse and call these functions all together with an expression?Consonant
Everything is possible :) Let me see what I can demoBlackwood
I'm not sure how it would be writen in boost::sprirt, but the abstract grammar for this could be: expression = identifier ['(' [ expression [ ',' expression ] ...] ')']. You'd probably split it into several rules, though.Nuclei
I've chosen to demo a simple recursive expression grammar for functions with up to 3 arguments. This is likely a bit more generic than you required, but hey ... :)Blackwood
@korhan And I've taken the time to also demo a complete 'on-the-fly' evaluation approach (using semantic actions), in a second answerBlackwood
B
9

Second Answer (more pragmatic)

Here's a second take, for comparison:

Just in case you really didn't want to parse into an abstract syntax tree representation, but rather evaluate the functions on-the-fly during parsing, you can simplify the grammar.

It comes in at 92 lines as opposed to 209 lines in the first answer. It really depends on what you're implementing which approach is more suitable.

This shorter approach has some downsides:

  • less flexible (not reusable)
  • less robust (if functions have side effects, they will happen even if parsing fails halfway)
  • less extensible (the supported functions are hardwired into the grammar1)

Full code:

//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/phoenix/function.hpp>

namespace qi    = boost::spirit::qi;
namespace phx   = boost::phoenix;

typedef boost::variant<int, std::string> value;

//////////////////////////////////////////////////
// Demo functions:
value AnswerToLTUAE() {
    return 42;
}

value ReverseString(value const& input) {
    auto& as_string = boost::get<std::string>(input);
    return std::string(as_string.rbegin(), as_string.rend());
}

value Concatenate(value const& a, value const& b) {
    std::ostringstream oss;
    oss << a << b;
    return oss.str();
}

BOOST_PHOENIX_ADAPT_FUNCTION_NULLARY(value, AnswerToLTUAE_, AnswerToLTUAE)
BOOST_PHOENIX_ADAPT_FUNCTION(value, ReverseString_, ReverseString, 1)
BOOST_PHOENIX_ADAPT_FUNCTION(value, Concatenate_, Concatenate, 2)

//////////////////////////////////////////////////
// Parser grammar
template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, value(), Skipper>
{
    parser() : parser::base_type(expr_)
    {
        using namespace qi;

        function_call_ = 
                (lit("AnswerToLTUAE")   > '(' > ')')                     
                     [ _val = AnswerToLTUAE_() ]
              | (lit("ReverseString") > '(' > expr_ > ')')               
                     [ _val = ReverseString_(_1) ]
              | (lit("Concatenate")   > '(' > expr_ > ',' > expr_ > ')') 
                     [ _val = Concatenate_(_1, _2) ]
            ;
        string_        = as_string [ 
                            lexeme [ "'" >> *~char_("'") >> "'" ] 
                         ];
        value_         = int_ | string_;

        expr_          = function_call_ | value_;

        on_error<fail> ( expr_, std::cout
             << phx::val("Error! Expecting ") << _4 << phx::val(" here: \"")
             << phx::construct<std::string>(_3, _2) << phx::val("\"\n"));

        BOOST_SPIRIT_DEBUG_NODES((expr_)(function_call_)(value_)(string_))
    }

  private:
    qi::rule<It, value(), Skipper> value_, function_call_, expr_, string_;
};

int main()
{
    for (const std::string input: std::vector<std::string> { 
            "-99",
            "'string'",
            "AnswerToLTUAE()",
            "ReverseString('string')",
            "Concatenate('string', 987)",
            "Concatenate('The Answer Is ', AnswerToLTUAE())",
            })
    {
        auto f(std::begin(input)), l(std::end(input));
        const static parser<decltype(f)> p;

        value direct_eval;
        bool ok = qi::phrase_parse(f,l,p,qi::space,direct_eval);

        if (!ok)
            std::cout << "invalid input\n";
        else
        {
            std::cout << "input:\t" << input       << "\n";
            std::cout << "eval:\t"  << direct_eval << "\n\n";
        }

        if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
    }
}

Note how, instead of using BOOST_PHOENIX_ADAPT_FUNCTION* we could have directly used boost::phoenix::bind.

The output is still the same:

input:  -99
eval:   -99

input:  'string'
eval:   string

input:  AnswerToLTUAE()
eval:   42

input:  ReverseString('string')
eval:   gnirts

input:  Concatenate('string', 987)
eval:   string987

input:  Concatenate('The Answer Is ', AnswerToLTUAE())
eval:   The Answer Is 42

1 This last downside is easily remedied by using the 'Nabialek Trick'

Blackwood answered 9/6, 2013 at 20:53 Comment(1)
This was what I was trying to do. Thx a lot!Consonant
B
9

First Answer (complete)

I've gone and implemented a simple recursive expression grammar for functions having up-to-three parameters:

for (const std::string input: std::vector<std::string> { 
        "-99",
        "'string'",
        "AnswerToLTUAE()",
        "ReverseString('string')",
        "Concatenate('string', 987)",
        "Concatenate('The Answer Is ', AnswerToLTUAE())",
        })
{
    auto f(std::begin(input)), l(std::end(input));
    const static parser<decltype(f)> p;

    expr parsed_script;
    bool ok = qi::phrase_parse(f,l,p,qi::space,parsed_script);

    if (!ok)
        std::cout << "invalid input\n";
    else
    {
        const static generator<boost::spirit::ostream_iterator> g;
        std::cout << "input:\t" << input                           << "\n";
        std::cout << "tree:\t"  << karma::format(g, parsed_script) << "\n";
        std::cout << "eval:\t"  << evaluate(parsed_script)         << "\n";
    }

    if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
}

Which prints:

input:  -99
tree:   -99
eval:   -99

input:  'string'
tree:   'string'
eval:   string

input:  AnswerToLTUAE()
tree:   nullary_function_call()
eval:   42

input:  ReverseString('string')
tree:   unary_function_call('string')
eval:   gnirts

input:  Concatenate('string', 987)
tree:   binary_function_call('string',987)
eval:   string987

input:  Concatenate('The Answer Is ', AnswerToLTUAE())
tree:   binary_function_call('The Answer Is ',nullary_function_call())
eval:   The Answer Is 42

Some notes:

  • I separated parsing from execution (which is always a good idea IMO)
  • I implemented function evaluation for zero, one or two parameters (this should be easy to extend)
  • Values are assumed to be integers or strings (should be easy to extend)
  • I added a karma generator to display the parsed expression (with a TODO marked in the comment)

I hope this helps:

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/variant/recursive_wrapper.hpp>

namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx   = boost::phoenix;

typedef boost::variant<int, std::string> value;
typedef boost::variant<value, boost::recursive_wrapper<struct function_call> > expr;

typedef std::function<value()                          > nullary_function_impl;
typedef std::function<value(value const&)              > unary_function_impl;
typedef std::function<value(value const&, value const&)> binary_function_impl;

typedef boost::variant<nullary_function_impl, unary_function_impl, binary_function_impl> function_impl;
typedef qi::symbols<char, function_impl> function_table;

struct function_call 
{ 
    typedef std::vector<expr> arguments_t;
    function_call() = default;

    function_call(function_impl f, arguments_t const& arguments) 
        : f(f), arguments(arguments) { }

    function_impl f;
    arguments_t arguments;

};

BOOST_FUSION_ADAPT_STRUCT(function_call, (function_impl, f)(function_call::arguments_t, arguments))

#ifdef BOOST_SPIRIT_DEBUG
namespace std
{
    static inline std::ostream& operator<<(std::ostream& os, nullary_function_impl const& f) { return os << "<nullary_function_impl>"; }
    static inline std::ostream& operator<<(std::ostream& os, unary_function_impl const& f)   { return os << "<unary_function_impl>";   }
    static inline std::ostream& operator<<(std::ostream& os, binary_function_impl const& f)  { return os << "<binary_function_impl>";  }
}
static inline std::ostream& operator<<(std::ostream& os, function_call const& call) { return os << call.f << "(" << call.arguments.size() << ")"; }
#endif

//////////////////////////////////////////////////
// Evaluation
value evaluate(const expr& e);

struct eval : boost::static_visitor<value> 
{
    eval() {}

    value operator()(const value& v) const 
    { 
        return v;
    }

    value operator()(const function_call& call) const
    {
        return boost::apply_visitor(invoke(call.arguments), call.f);
    }

  private:
    struct invoke : boost::static_visitor<value> 
    {
        function_call::arguments_t const& _args;
        invoke(function_call::arguments_t const& args) : _args(args) {}

        value operator()(nullary_function_impl const& f) const { 
            return f(); 
        }
        value operator()(unary_function_impl   const& f) const { 
            auto a = evaluate(_args.at(0));
            return f(a); 
        }
        value operator()(binary_function_impl  const& f) const { 
            auto a = evaluate(_args.at(0));
            auto b = evaluate(_args.at(1));
            return f(a, b); 
        }
    };
};

value evaluate(const expr& e)
{ 
    return boost::apply_visitor(eval(), e); 
}

//////////////////////////////////////////////////
// Demo functions:
value AnswerToLTUAE() {
    return 42;
}

value ReverseString(value const& input) {
    auto& as_string = boost::get<std::string>(input);
    return std::string(as_string.rbegin(), as_string.rend());
}

value Concatenate(value const& a, value const& b) {
    std::ostringstream oss;
    oss << a << b;
    return oss.str();
}

//////////////////////////////////////////////////
// Parser grammar
template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, expr(), Skipper>
{
    parser() : parser::base_type(expr_)
    {
        using namespace qi;

        n_ary_ops.add
            ("AnswerToLTUAE", nullary_function_impl{ &::AnswerToLTUAE })
            ("ReverseString", unary_function_impl  { &::ReverseString })
            ("Concatenate"  , binary_function_impl { &::Concatenate });

        function_call_ = n_ary_ops > '(' > expr_list > ')';
        string_        = qi::lexeme [ "'" >> *~qi::char_("'") >> "'" ];
        value_         = qi::int_ | string_;

        expr_list      = -expr_ % ',';
        expr_          = function_call_ | value_;

        on_error<fail> ( expr_, std::cout
             << phx::val("Error! Expecting ") << _4 << phx::val(" here: \"")
             << phx::construct<std::string>(_3, _2) << phx::val("\"\n"));

        BOOST_SPIRIT_DEBUG_NODES((expr_)(expr_list)(function_call_)(value_)(string_))
    }

  private:
    function_table          n_ary_ops;

    template <typename Attr> using Rule = qi::rule<It, Attr(), Skipper>;
    Rule<std::string>       string_;
    Rule<value>             value_;
    Rule<function_call>     function_call_;
    Rule<std::vector<expr>> expr_list;
    Rule<expr>              expr_;
};

//////////////////////////////////////////////////
// Output generator
template <typename It>
    struct generator : karma::grammar<It, expr()>
{
    generator() : generator::base_type(expr_)
    {
        using namespace karma;

        nullary_       = eps << "nullary_function_call";              // TODO reverse lookup :)
        unary_         = eps << "unary_function_call";
        binary_        = eps << "binary_function_call";

        function_      = nullary_ | unary_ | binary_;
        function_call_ = function_ << expr_list;

        expr_list      = '(' << -(expr_ % ',') << ')';
        value_         = karma::int_ | ("'" << karma::string << "'");
        expr_          = function_call_ | value_;
    }

  private:
    template <typename Attr> using Rule = karma::rule<It, Attr()>;
    Rule<nullary_function_impl> nullary_;
    Rule<unary_function_impl>   unary_;
    Rule<binary_function_impl>  binary_;
    Rule<function_impl>         function_;
    Rule<function_call>         function_call_;
    Rule<value>                 value_;
    Rule<std::vector<expr>>     expr_list;
    Rule<expr>                  expr_;
};

int main()
{
    for (const std::string input: std::vector<std::string> { 
            "-99",
            "'string'",
            "AnswerToLTUAE()",
            "ReverseString('string')",
            "Concatenate('string', 987)",
            "Concatenate('The Answer Is ', AnswerToLTUAE())",
            })
    {
        auto f(std::begin(input)), l(std::end(input));
        const static parser<decltype(f)> p;

        expr parsed_script;
        bool ok = qi::phrase_parse(f,l,p,qi::space,parsed_script);

        if (!ok)
            std::cout << "invalid input\n";
        else
        {
            const static generator<boost::spirit::ostream_iterator> g;
            std::cout << "input:\t" << input                           << "\n";
            std::cout << "tree:\t"  << karma::format(g, parsed_script) << "\n";
            std::cout << "eval:\t"  << evaluate(parsed_script)         << "\n\n";
        }

        if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
    }
}
Blackwood answered 9/6, 2013 at 20:9 Comment(2)
Hello, sehe, your program is fine, I like the such way. I've added your program with few structures and rules and it has been fine until I started adding binary operation in this grammar. Can you help me out, please (and for other ones), how to change your grammar to make able this parser to recognize binary operations? I've put this expr_ = (function_call_ | value_ | pr_metric_) >> -( '+' > expr_ ); It doesn't work I guess because I don't understand what a structure I need to store expressions. I mean it should be that expr can also store vector of expr. Hope you'll reply!Heteromerous
@AyratArifullin You could look at some of my answers that parse arithmetic or boolean expressions: a starting search, this answer and maybe hereBlackwood
B
9

Second Answer (more pragmatic)

Here's a second take, for comparison:

Just in case you really didn't want to parse into an abstract syntax tree representation, but rather evaluate the functions on-the-fly during parsing, you can simplify the grammar.

It comes in at 92 lines as opposed to 209 lines in the first answer. It really depends on what you're implementing which approach is more suitable.

This shorter approach has some downsides:

  • less flexible (not reusable)
  • less robust (if functions have side effects, they will happen even if parsing fails halfway)
  • less extensible (the supported functions are hardwired into the grammar1)

Full code:

//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/phoenix/function.hpp>

namespace qi    = boost::spirit::qi;
namespace phx   = boost::phoenix;

typedef boost::variant<int, std::string> value;

//////////////////////////////////////////////////
// Demo functions:
value AnswerToLTUAE() {
    return 42;
}

value ReverseString(value const& input) {
    auto& as_string = boost::get<std::string>(input);
    return std::string(as_string.rbegin(), as_string.rend());
}

value Concatenate(value const& a, value const& b) {
    std::ostringstream oss;
    oss << a << b;
    return oss.str();
}

BOOST_PHOENIX_ADAPT_FUNCTION_NULLARY(value, AnswerToLTUAE_, AnswerToLTUAE)
BOOST_PHOENIX_ADAPT_FUNCTION(value, ReverseString_, ReverseString, 1)
BOOST_PHOENIX_ADAPT_FUNCTION(value, Concatenate_, Concatenate, 2)

//////////////////////////////////////////////////
// Parser grammar
template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, value(), Skipper>
{
    parser() : parser::base_type(expr_)
    {
        using namespace qi;

        function_call_ = 
                (lit("AnswerToLTUAE")   > '(' > ')')                     
                     [ _val = AnswerToLTUAE_() ]
              | (lit("ReverseString") > '(' > expr_ > ')')               
                     [ _val = ReverseString_(_1) ]
              | (lit("Concatenate")   > '(' > expr_ > ',' > expr_ > ')') 
                     [ _val = Concatenate_(_1, _2) ]
            ;
        string_        = as_string [ 
                            lexeme [ "'" >> *~char_("'") >> "'" ] 
                         ];
        value_         = int_ | string_;

        expr_          = function_call_ | value_;

        on_error<fail> ( expr_, std::cout
             << phx::val("Error! Expecting ") << _4 << phx::val(" here: \"")
             << phx::construct<std::string>(_3, _2) << phx::val("\"\n"));

        BOOST_SPIRIT_DEBUG_NODES((expr_)(function_call_)(value_)(string_))
    }

  private:
    qi::rule<It, value(), Skipper> value_, function_call_, expr_, string_;
};

int main()
{
    for (const std::string input: std::vector<std::string> { 
            "-99",
            "'string'",
            "AnswerToLTUAE()",
            "ReverseString('string')",
            "Concatenate('string', 987)",
            "Concatenate('The Answer Is ', AnswerToLTUAE())",
            })
    {
        auto f(std::begin(input)), l(std::end(input));
        const static parser<decltype(f)> p;

        value direct_eval;
        bool ok = qi::phrase_parse(f,l,p,qi::space,direct_eval);

        if (!ok)
            std::cout << "invalid input\n";
        else
        {
            std::cout << "input:\t" << input       << "\n";
            std::cout << "eval:\t"  << direct_eval << "\n\n";
        }

        if (f!=l) std::cout << "unparsed: '" << std::string(f,l) << "'\n";
    }
}

Note how, instead of using BOOST_PHOENIX_ADAPT_FUNCTION* we could have directly used boost::phoenix::bind.

The output is still the same:

input:  -99
eval:   -99

input:  'string'
eval:   string

input:  AnswerToLTUAE()
eval:   42

input:  ReverseString('string')
eval:   gnirts

input:  Concatenate('string', 987)
eval:   string987

input:  Concatenate('The Answer Is ', AnswerToLTUAE())
eval:   The Answer Is 42

1 This last downside is easily remedied by using the 'Nabialek Trick'

Blackwood answered 9/6, 2013 at 20:53 Comment(1)
This was what I was trying to do. Thx a lot!Consonant

© 2022 - 2024 — McMap. All rights reserved.