Scala Parsers: Availability, Differences and Combining?

I

4

25

My question is about the Scala Parsers:

Which ones are available (in the Standard library and outside),
what's the difference between them,
do they share a common API and
can different Parsers be combined to parse one input string?

I found at least these:

Scala's "standard" parser (seems to be an LL parser)
Scala's Packrat parser (since 2.8, is a LALR parser)
The Parboiled parser (PEG parser?)
Spiewak's GLL parser combinator

Iliad answered 12/12, 2010 at 19:24 Comment(2)

be sure to look at Parboiled, github.com/sirthias/parboiled/wiki – Idealistic 13/12, 2010 at 20:57

I used recently jparsec (jparsec.codehaus.org) - it's nice (in Java project). It's Java library, but I think with several implicit conversions it can look nicely in Scala... just my 5 cents... – Glynas 23/1, 2011 at 18:33

P

4

Just wanted to update this answer with a pointer to the latest iteration of the parboiled project, called parboiled2:

https://github.com/sirthias/parboiled2

parboiled2 targets only Scala (as opposed to Scala + Java), makes use of Scala macros, and is very actively maintained.

Pontormo answered 10/3, 2014 at 17:39 Comment(0)

S

12

There's also Dan Spiewak's implementation of GLL parser combinators.

Shibboleth answered 15/12, 2010 at 1:6 Comment(0)

V

11

It's worth noting that Scala's standard parser combinators are not LL, nor are Packrat combinators LALR. Parser combinators are a form of recursive descent with infinite backtracking. You can think of them a bit like "LL(*)". The class of languages supported by this technique is precisely the class of unambiguous context-free languages, or the same class as LALR(1) and Packrat. However, the class of grammar is quite a bit different, with the most notable weakness being non-support for left-recursion.

Packrat combinators do support left-recursion, but they still fail to support many other, more subtle features of LALR. This weakness generally stems from the ordered choice operator, which can lead to some devilishly tricky grammar bugs, as well as prevents certain nice grammatical formulations. The most often-seen example of these bugs happens when you accidentally order ambiguous choices as shortest first, resulting in a greedy match that prevents the correct branch from ever being tried. LALR doesn't have this problem, since it simply tries all possible branches at once, deferring the decision point until the end of the production.

Vestigial answered 6/2, 2012 at 16:41 Comment(0)

S

8

There is also a new approach known as "parsing with derivatives". The approach is described here. There is an implementation in Scala by Daniel Spiewak.

Stavro answered 6/2, 2012 at 14:20 Comment(0)

P

4

Just wanted to update this answer with a pointer to the latest iteration of the parboiled project, called parboiled2:

https://github.com/sirthias/parboiled2

parboiled2 targets only Scala (as opposed to Scala + Java), makes use of Scala macros, and is very actively maintained.

Pontormo answered 10/3, 2014 at 17:39 Comment(0)

Recommended topics

Hot tags