I want to write a parser (as a qi extension) which can be used
via my_parser(p1, p2, ...)
where p1, p2, ...
are qi parser expressions.
Actually, I want to implement a best_match
parser which works like qi alternative, but selects not the first matching rule but the rule which 'explains' most of the input.
Given two rules simple_id = +(qi::alpha)
and complex_id = simple_id >> *(qi::string("::") > simple_id)
it would select complex_id
on the input willy::anton
. And it would not produce intermediate attributes while doing so. These benefits get payed for with runtime because lookahead parsing is required.
In my view there are several use cases for such a parser construct. benchmark(p1, ...)
as a supporting parser for optimization, just one example. I will provide that, once I know how to that.
This parser would be an non-terminal. I tried (hard), but I can not find answers to my problem within qi. My guess is, that C++ mechanisms are so tightly integrated withc qi, that there is no understandable entry point, at least for me.
It is quite easy to implement what I want introducing an operator. I append the current solution below. It compiles as expected but is incomplete.
A qi operator gets a fusion::vector/sequence (whatever) ready made and acts on it. There seem to be no library offerings to solve my problem. Even make_nary_composite
already expects the args being compiled to Elements
.
I tried a lot, everything failed, so I don't want to bore you with that.
A workaround which I can imagine could be to provide a stateful operator ,
, which would make p1, ...
to a legal qi expression. Then implement a unary_parser best_match
or directive to process that expression.
The ,
would get two modes: getting the length of input the current (successful) parser consumes and actually parsing the selected one from phase 1. The wrapper what run that comma parser first in mode 1 and then in mode 2. It might be ugly, but could work.
The operator implementation p1 |= p2 |= ...
is the most expensive regarding runtime for n > 2. I would be happy to get around that.
Minimum (but nevertheless reasonable) overhead would have best_match(p1, ...)
.
Is that possible at all?
For I am not too experienced with boost::fusion I added some questions to the code as well (how tos regarding fusion).
It feels wrong, to press something which actually is a nary non-terminal parser into the unary parser scheme. But with my poor understanding it seems to be the only way to get it done. I'd highly appreciate any help to get me out of this.
My environment: Boost 1.61, MSVC 2015 Upd 2, target win32console.
PROGRESS:
I think, I'm getting it slowly but surely. I was very happy to see an error_invalid_expression
. Compile failed because of the following proto expression:
boost::proto::exprns_::expr<
boost::proto::tagns_::tag::function,boost::proto::argsns_::list5<
const boost::proto::exprns_::expr<
boost::proto::tagns_::tag::terminal,boost::proto::argsns_::term<
mxc::qitoo::tag::best_match
>
,0> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
, const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
,const boost::spirit::qi::rule<
iterator_type,std::string (void),
boost::spirit::standard::space_type,
boost::spirit::unused_type,
boost::spirit::unused_type> &
>
,5>
This actually is the expression which describes my function-style usage of my best_match parser. My test rule looks like
qitoo::best_match(id, qualified_id, id, qualified_id)
where id
and qualified_id
are the rules as mentioned above. All that I want.
The error occurs because this expression does not have a member parse
. Okay. Digging deeper I found qi's meta-compiler does support unary, binary and nary (what means variadic) parsers. But it does not support function style syntax. Qi seems to utilize unary, binary and nary for operators only. And allthough the nary type is supported, it is more or less obsolete, because qi operators are binary max.
END PROGRESS EDIT
#include <boost/spirit/home/qi.hpp>
namespace boost {
namespace spirit
{
///////////////////////////////////////////////////////////////////////////
// Enablers
///////////////////////////////////////////////////////////////////////////
template <>
struct use_operator<qi::domain, proto::tag::bitwise_or_assign> // enables |=
: mpl::true_ {};
template <>
struct flatten_tree<qi::domain, proto::tag::bitwise_or_assign> // flattens |=
: mpl::true_ {};
}
}
namespace mxc {
namespace qitoo {
namespace spirit = boost::spirit;
namespace qi = spirit::qi;
namespace fusion = boost::fusion;
template <typename Elements>
struct best_match_parser : qi::nary_parser<mxc::qitoo::best_match_parser<Elements>>
{
// This one does a lookahead parse of each qi expression to find out which rule matches
// most of the input
template <typename Iterator, typename Context, typename Skipper>
struct best_function
{
best_function(Iterator& first, Iterator const& last, Context& context,
Skipper const& skipper, int& best, int& idx, int& size)
: first(first), last(last), context(context), skipper(skipper), best(best),
idx(idx), size(size) {};
template <typename Component>
void operator()(Component& component) const
{
Iterator f = first;
if (component.parse(f, last, context, skipper, qi::unused)) {
// please have a look on how I transport best, idx and size
// into the parser context
//
// I guess, this is a nasty hack and could be done better
// Any advice would be highliy appreciated
int l = f - first;
if (l > size) {
size = l;
best = idx++;
}
idx++;
}
}
private:
int& best;
int& idx;
int& size;
Iterator& first;
Iterator const& last;
Context& context;
Skipper const& skipper;
};
template <typename Context, typename Iterator>
struct attribute
{
// Put all the element attributes in a tuple
typedef typename spirit::traits::build_attribute_sequence <
Elements, Context, spirit::traits::alternative_attribute_transform
, Iterator, qi::domain
>::type all_attributes;
// Ok, now make a variant over the attribute sequence. Note that
// build_variant makes sure that 1) all attributes in the variant
// are unique 2) puts the unused attribute, if there is any, to
// the front and 3) collapses single element variants, variant<T>
// to T.
typedef typename
spirit::traits::build_variant<all_attributes>::type
type;
};
best_match_parser(Elements const& elements_) : elements(elements_), size(0), idx(0), best(-1) {}
template <typename Iterator, typename Context, typename Skipper, typename Attribute>
bool parse(Iterator& first, Iterator const& last
, Context& context, Skipper const& skipper
, Attribute& attr_) const
{
best_function<Iterator, Context, Skipper> f(first, last, context, skipper, best, idx, size);
// find out which parser matches most of the input
fusion::for_each(elements, f);
// best >= 0 if at least one parser was successful
if (best >= 0) {
// now that I have the best parser identified, how do I access it?
// I understand that the next line can not work, but I'm looking for something like that
// --> auto v = fusion::at<boost::mpl::int_<best>>(elements);
};
return false;
}
template <typename Context>
qi::info what(Context& context) const
{
qi::info result("best_match");
fusion::for_each(elements,
spirit::detail::what_function<Context>(result, context));
return result;
}
Elements elements;
mutable int best;
mutable int idx;
mutable int size;
};
}
}
namespace boost {
namespace spirit {
namespace qi {
///////////////////////////////////////////////////////////////////////////
// Parser generators: make_xxx function (objects)
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Modifiers>
struct make_composite<proto::tag::bitwise_or_assign, Elements, Modifiers>
: make_nary_composite < Elements, mxc::qitoo::best_match_parser >
{};
}
namespace traits {
///////////////////////////////////////////////////////////////////////////
template <typename Elements>
struct has_semantic_action<mxc::qitoo::best_match_parser<Elements> >
: nary_has_semantic_action<Elements> {};
///////////////////////////////////////////////////////////////////////////
template <typename Elements, typename Attribute, typename Context
, typename Iterator>
struct handles_container<mxc::qitoo::best_match_parser<Elements>, Attribute, Context
, Iterator>
: nary_handles_container<Elements, Attribute, Context, Iterator> {};
}
}
}
namespace qi = boost::spirit::qi;
namespace qitoo = mxc::qitoo;
using iterator_type = std::string::const_iterator;
using result_type = std::string;
template<typename Parser>
void parse(const std::string message, const std::string& input, const std::string& rule, const Parser& parser) {
iterator_type iter = input.begin(), end = input.end();
std::vector<result_type> parsed_result;
std::cout << "-------------------------\n";
std::cout << message << "\n";
std::cout << "Rule: " << rule << std::endl;
std::cout << "Parsing: \"" << input << "\"\n";
bool result = qi::phrase_parse(iter, end, parser, qi::space, parsed_result);
if (result)
{
std::cout << "Parser succeeded.\n";
std::cout << "Parsed " << parsed_result.size() << " elements:";
for (const auto& str : parsed_result)
std::cout << "[" << str << "]";
std::cout << std::endl;
}
else
{
std::cout << "Parser failed" << std::endl;
}
if (iter == end) {
std::cout << "EOI reached." << std::endl;
}
else {
std::cout << "EOI not reached. Unparsed: \"" << std::string(iter, end) << "\"" << std::endl;
}
std::cout << "-------------------------\n";
}
int main()
{
namespace qi = boost::spirit::qi;
qi::rule < iterator_type, std::string(), qi::space_type>
id = (qi::alpha | qi::char_('_')) >> *(qi::alnum | qi::char_('_'));
qi::rule < iterator_type, std::string(), qi::space_type>
qualified_id = id >> *(qi::string("::") > id);
namespace qitoo = mxc::qitoo;
namespace qi = boost::spirit::qi;
parse("best match operator, select second rule"
, "willy::anton::lutz"
, "id |= qualified_id"
, id |= qualified_id);
}