Understanding Boost.Spirit's string parser

#include <iostream>
#include <string>
#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;
int main ()
{
    using qi::string;

    std::string input("a");
    std::string::iterator strbegin = input.begin();
    std::string p;
    bool ok = qi::phrase_parse(strbegin, input.end(),
            ((string("a")  >> string("a")) | string("a")), // grammar: "aa" or "a"
            qi::space,                                     // skipper
            p);                                            // attribute receiving the match

    if (ok && strbegin == input.end()) {
        std::cout << p << std::endl;
        std::cout << p.size() << std::endl;
    } else {
        std::cout << "fail" << std::endl;
        std::cout << std::string(strbegin, input.end()) << std::endl;
    }
}

This program outputs aa. How is that possible? The input string is a, and the parser should match either aa or a. I wrote string("a") only to test the operators.

The same happens when using char_ instead of string.

Fantasia answered 22/2, 2014 at 19:23 Comment(1)
Just stumbled into this very similar question again: https://mcmap.net/q/1619608/-boost-spirit-qi-duplicate-parsing-on-the-output/85371Ire

It's not the string matcher per se. It's attribute propagation plus backtracking in action.

A string attribute is a container attribute, and different parser subexpressions can each append elements to it. For efficiency reasons, Spirit doesn't roll back the values of already-emitted attributes when it backtracks.

Often this is no problem at all, but as you can see, the 'a' from the failed first branch of the alternative sticks around. The second branch then matches and appends its own 'a', which gives aa.
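The mechanism can be illustrated without Boost. The following is only a toy model, not Spirit's implementation: `Cursor`, `lit_string`, `seq_aa`, `parse_alternative`, `run` and the `"<fail>"` sentinel are all hypothetical names made up for this sketch. It shows a sequence that restores the input position on failure but leaves the shared string attribute untouched.

```cpp
#include <cstddef>
#include <string>

// Toy model only: a cursor over the input, standing in for Spirit's iterators.
struct Cursor {
    const std::string& input;
    std::size_t pos;
};

// Matches `literal` at the current position. On success, appends it to
// `attr` and advances the cursor. On failure, changes nothing.
bool lit_string(Cursor& c, const std::string& literal, std::string& attr) {
    if (c.input.compare(c.pos, literal.size(), literal) != 0) return false;
    attr += literal;
    c.pos += literal.size();
    return true;
}

// Models string("a") >> string("a"): both parts write into the same attr.
bool seq_aa(Cursor& c, std::string& attr) {
    const std::size_t saved = c.pos;
    if (lit_string(c, "a", attr) && lit_string(c, "a", attr)) return true;
    c.pos = saved;  // the input position is rolled back...
    return false;   // ...but attr keeps whatever was already appended
}

// Models (string("a") >> string("a")) | string("a") with one shared attr.
bool parse_alternative(Cursor& c, std::string& attr) {
    return seq_aa(c, attr) || lit_string(c, "a", attr);
}

// Runs the toy parser and returns the attribute, or "<fail>" (an arbitrary
// sentinel) if the input was not fully consumed.
std::string run(const std::string& input) {
    Cursor c{input, 0};
    std::string attr;
    const bool ok = parse_alternative(c, attr);
    return (ok && c.pos == input.size()) ? attr : "<fail>";
}
```

In this model, `run("a")` returns `aa`: the first branch appends one `a` before failing, and the second branch appends another.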

Either reword the grammar or employ the 'big gun', the qi::hold[] directive:

(qi::hold [ string("a")  >> string("a") ] | string("a")),

Rewording could look like:

qi::string("a") >> -qi::string("a"),

Also, if you're really just trying to match certain textual strings, consider:

(qi::raw [ qi::lit("aa") | "a" ]), 
// or even just
qi::string("aa") | qi::string("a"),

Which of these fits best depends on your grammar.

Ire answered 22/2, 2014 at 21:42 Comment(0)
