What if you try out GLL Combinators? Although it uses Scala, you may write "thin" wrappers for it by means of the JNI.
GLL Combinators is a framework designed to implement the GLL parsing algorithm
(Scott and Johnstone, LDTA 2009) in a functional manner. More specifically, the
framework makes use of atomic parser combinators to compose grammars which are
then evaluated using the GLL algorithm. The framework provides a syntax for this
task which is almost identical to that of the parser combinator framework built
into Scala. For example, we can render the classic "parentheses grammar" using
GLL combinators:
lazy val expr: Parser[Any] = (
"(" ~ expr ~ ")"
| ""
)
As the type annotation implies, expr
will reference an instance of type
Parser[Any]
. That is, an atomic parser which consumes some input and returns
a value of type Any
. We can invoke this parser against a String
input
in the following way:
expr("((()))")
This will return a value of type Stream[Result[Any]]
. The Result[A]
ADT is
defined as one of the following (for some type A
):
Success[A]
-- Represents a successful parse and contains the resulting value.
Failure
-- Represents a failed parse and contains the relevant error message
as well as the remainder of the parse stream (the characters which were not
consumed).
If any result is successful (i.e. an instance of Success[A]
), then no failures
will be returned. Thus, the Stream
returned will be completely homogeneous,
containing either Success
or Failure
, but not both. A Stream
is
returned rather than a single value to allow for ambiguity in the grammar (see
below).
It's worth mentioning that GLL is a form of recursive-descent parsing. It has
all of the advantages of conventional recursive-descent, including an intuitive
control flow, arbitrary positioning of semantic actions and superior error
messages. In fact, it is the fact that GLL is recursive-descent which allows it
to be implemented using atomic combinators. Other algorithms which share the
same capabilities as GLL (such as GLR, Earley Parsing, etc) are fundamentally
incompatible with the combinator model due to their highly-unintuitive control
flow. In GLL parsing, the control flow follows that of the grammar, as it does
in traditional parser combinators or any other form of recursive-descent.