boost::spirit::qi keywords and identifiers

I've seen a few posts related to the nuances of keyword/identifier use in qi grammars, but I can't quite make sense of how the approach demonstrated in the boost examples is supposed to work...

Keywords declaration:

qi::symbols<char> keywords;

Example keyword set:

keywords.add
        ("foo")
        ("bar")
        ;

Identifier rule declaration:

qi::rule<std::string::const_iterator, std::string(), ascii::space_type> identifier;

Here's how the identifier rule is defined in the qi calc and compiler examples:

identifier = !keywords >> qi::raw[ qi::lexeme[ ( qi::alpha | '_' ) >> *( qi::alnum | '_' ) ] ];

Perhaps I'm reading the qi syntax wrong, but it seems to me that this would not accept any literal that matches or starts with a keyword. Rejecting a full keyword match is the desired behavior. But I want to accept "food" as an identifier, even though it begins with the keyword "foo". This seems like a pretty standard use case, but I'm having trouble finding documentation that really nails this down.

Can anyone offer an identifier rule that only rejects exact matches to keywords?

Thanks!

Homogenesis answered 13/12, 2014 at 20:33 Comment(1)
Actually, this question deserves some votes. The issue should be much more widely realized, and should probably be addressed in the Spirit tutorials, as it's often overlooked (obviously the compiler samples are OK). - Hegelian

Perhaps I'm reading the qi syntax wrong, but it seems to me that this would not accept any literal that matches or starts with a keyword.

That's correct. In case you spotted it in one of my own answers (quite a good chance): I tend to do this as a quick-and-dirty way to fix up grammars that didn't have proper keyword guards in the first place.

But yeah, keeping keywords and identifiers properly distinct takes some more work. I might find a link to an answer where it's done correctly (it's not hard, it's just tedious).

Meanwhile, have a look at the very relevant

If you're building a really robust general-purpose language grammar, this is about the point where you should consider using a Spirit Lexer. Then again, in my humble opinion, Spirit aims at rapid development and small, one-off grammars that are succinctly embedded using Spirit's expression template eDSL. In many respects, that is pretty much the opposite of the situation where this matters, I reckon.

Hegelian answered 13/12, 2014 at 23:21 Comment(1)
Thanks very much for your response. This clears it up. I was aware of the distinct directive, but thought I might be missing something with that quick-fix. Thanks! - Homogenesis
