Scala Parser Combinator, Ambiguous Grammar & Parse Forest
Asked Answered
H

1

8

I am trying to get the parser to return all possible parse results (parse forest) from an ambiguous grammar and choose from the parse forest by evaluating them against user context / history and a knowledge base. For performance reason, this should probably be done with the packrat parser and a search limit / upper-bound parameter to limite the number of recursive calls in applying production rules to avoid infinite loops.

Being new to both Scala and its Parser Combinators, I can't figure out how to do this or whether it can be done at all. Could someone please help? Much appreciated.

Best regards, Thomas Juan

Hydrocortisone answered 23/4, 2012 at 8:32 Comment(0)
M
11

You cannot do this with Scala's built-in parser combinators. Packrat combinators are restricted to only unambiguous grammars. If you try to deal with ambiguity, you lose the ability to memoize and the parse complexity becomes O(k^n) even for unambiguous trails. So, don't do that.

What you want is a parser combinator framework that correctly handles ambiguity. As far as I know, the only such framework for Scala is my GLL combinators. The syntax is almost identical to Scala's parser combinators (the main difference being that higher-arity functions work correctly), but the output is a Stream[A], where A is the result type. Thus, you get a parse forest. Note that the result is not actually a SPPF (shared packed parse forest), so if you produce an exponential number of results, the stream will have proportional size. This was done to maintain ultimate flexibility in the result type.

GLL combinators is O(n^3) in the worst case, even for pathologically ambiguous grammars. In the average case, where the parse trail is unambiguous, GLL combinators is O(n). The constant time overhead is currently a bit excessive, but optimization is an ongoing project. We use GLL combinators in production at Precog, so you can expect the library to be well maintained going forward.

Merna answered 23/4, 2012 at 15:43 Comment(2)
Thank you for the insight. As I explore my ideas, I realize that parse failures may indicate to me where to repair the grammar, and suggest possible solutions to the user, is it possible in your GLL Combinators to maintain info or a "score" for failed parse attempts as they are tried? Much appreciated!Hydrocortisone
Also, would this extra behaviour force the parser to always produce the worst-case O(n^3) complexity?Hydrocortisone

© 2022 - 2024 — McMap. All rights reserved.