Recursive x3 parser with results passing around

(1) Say we want to parse a simple recursive block surrounded by {}.

{
    Some text.
    {
        {
            Some more text.
        }
        Some Text again.
        {}
    }
}

This recursive parser is quite simple.

x3::rule<struct idBlock1> const ruleBlock1{"Block1"};
auto const ruleBlock1_def =
    x3::lit('{') >>
    *(
        ruleBlock1 |
        (x3::char_ - x3::lit('}'))
    ) >>
    x3::lit('}');

BOOST_SPIRIT_DEFINE(ruleBlock1)

(2) Then the block becomes more complex. It could also be surrounded by [].

{
    Some text.
    [
        {
            Some more text.
        }
        Some Text again.
        []
    ]
}

We need somewhere to store which kind of opening bracket we have seen. Since x3 does not have locals, we can use the rule's attribute (x3::_val) instead.

x3::rule<struct idBlock2, char> const ruleBlock2{"Block2"};
auto const ruleBlock2_def = x3::rule<struct _, char>{} =
    (
        x3::lit('{')[([](auto& ctx){x3::_val(ctx)='}';})] |
        x3::lit('[')[([](auto& ctx){x3::_val(ctx)=']';})]
    ) >>
    *(
        ruleBlock2 |
        (
            x3::char_ - 
            (
                x3::eps[([](auto& ctx){x3::_pass(ctx)='}'==x3::_val(ctx);})] >> x3::lit('}') |
                x3::eps[([](auto& ctx){x3::_pass(ctx)=']'==x3::_val(ctx);})] >> x3::lit(']')
            )
        )
    ) >>
    (
        x3::eps[([](auto& ctx){x3::_pass(ctx)='}'==x3::_val(ctx);})] >> x3::lit('}') |
        x3::eps[([](auto& ctx){x3::_pass(ctx)=']'==x3::_val(ctx);})] >> x3::lit(']')
    );

BOOST_SPIRIT_DEFINE(ruleBlock2)

(3) The block content (the part between the brackets, which we call the argument) may be much more complicated than in this example, so we decide to create a separate rule for it. The attribute solution no longer works in this case, because the argument rule cannot see the block rule's x3::_val. Luckily we still have the x3::with directive. We can save the opening bracket (or the expected closing bracket) in a stack, pass the stack by reference through the context, and read it at the next level.

struct SBlockEndTag {};
x3::rule<struct idBlockEnd> const ruleBlockEnd{"BlockEnd"};
x3::rule<struct idArg> const ruleArg{"Arg"};
x3::rule<struct idBlock3> const ruleBlock3{"Block3"};
auto const ruleBlockEnd_def =
    x3::eps[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::_pass(ctx)='}'==x3::get<SBlockEndTag>(ctx).get().top();
    })] >> 
    x3::lit('}') 
    |
    x3::eps[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::_pass(ctx)=']'==x3::get<SBlockEndTag>(ctx).get().top();
    })] >>
    x3::lit(']');
auto const ruleArg_def =
    *(
        ruleBlock3 |
        (x3::char_ - ruleBlockEnd)
    );
auto const ruleBlock3_def =
    (
        x3::lit('{')[([](auto& ctx){x3::get<SBlockEndTag>(ctx).get().push('}');})] |
        x3::lit('[')[([](auto& ctx){x3::get<SBlockEndTag>(ctx).get().push(']');})]
    ) >>
    ruleArg >>
    ruleBlockEnd[([](auto& ctx){
        assert(!x3::get<SBlockEndTag>(ctx).get().empty());
        x3::get<SBlockEndTag>(ctx).get().pop();
    })];

BOOST_SPIRIT_DEFINE(ruleBlockEnd, ruleArg, ruleBlock3)

The code is on Coliru.

Question: is this how we should write a recursive x3 parser for this kind of problem? With Spirit Qi's locals and inherited attributes, the solution seems to be much simpler. Thanks.

Phonics answered 23/5, 2017 at 9:55 Comment(0)

You can use x3::with<>.

However, I'd just write this:

auto const block_def =
    '{' >> *( block  | (char_ - '}')) >> '}'
  | '[' >> *( block  | (char_ - ']')) >> ']';

Demo

Live On Coliru

#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <string>

namespace Parser {
    using namespace boost::spirit::x3;

    rule<struct idBlock1> const block {"Block"};
    auto const block_def =
        '{' >> *( block  | (char_ - '}')) >> '}'
      | '[' >> *( block  | (char_ - ']')) >> ']';

    BOOST_SPIRIT_DEFINE(block)
}

int main() {
    std::string const input = R"({
    Some text.
    [
        {
            Some more text.
        }
        Some Text again.
        []
    ]
})";

    std::cout << "Parsed: " << std::boolalpha << parse(input.begin(), input.end(), Parser::block) << "\n";
}

Prints:

Parsed: true

BUT - Code Duplication!

If you insist on generalizing:

auto dyna_block = [](auto open, auto close) {
    return open >> *(block | (char_ - close)) >> close;
};

auto const block_def =
    dyna_block('{', '}')
  | dyna_block('[', ']');

Simonesimoneau answered 23/5, 2017 at 10:30 Comment(1)
Added the preemptive response to the anti-code-duplication brigade :) - Simonesimoneau