What is the advantage of using a parser generator like happy as opposed to using parser combinators?
Asked Answered
H

5

28

To learn how to write and parse a context-free grammar I want to choose a tool. For Haskell, there are two big options: Happy, which generates a parser from a grammar description and *Parsec, which allows you to directly code a parser in Haskell.

What are the (dis)advantages of either approach?

Hostler answered 1/9, 2011 at 10:27 Comment(9)
Depending on what you're doing attoparsec may be a good choice. serpentine.com/blog/2010/03/03/…Ferrand
Out of all the parser combinator libraries on Hackage, parsec is possibly my least favorite. uu-parsinglib, attoparsec, and polyparse are my top choices (in that order).Hogle
My impression is that Happy has significantly better performance than almost any parser combinator library, especially for large grammars with little or no context. On the other hand, the flexibility of combinator libraries (with regards to effects, treating parsers as first class values, etc) is I think an everywhere better choice until performance becomes the dominating factor and one is dealing with a complex grammar.Angelia
@Angelia Would you consider a grammar for a (simple) programming language like PL/0 complex?Hostler
@FUZxxl -- by my standards above, yes. In this context, I'm thinking of "complex" grammars as pretty much anything beyond formats that one would use for raw data serialization (where attoparsec or a hand-tuned parser could well be better). Since Happy packs things down to neat little state machines which it can optimize, then the more states the parser can be in, the better the expected gain, I 'd imagine. Also, bear in mind that happy is happiest (most performant) when used with a separate lexer pass such as by alex.Angelia
My rule of thumb is if I have a LR grammar I go with Happy, otherwise I use Parsec. It's usually not worth the effort changing LR to Parsec (more code, slower parser), but if I have to work out the grammar myself I like the tricks that Parsec supplies (e.g. context sensitive parsing).Tenedos
@John L: what's your reasoning behind that order? (I use polyparse, have had a look at attoparsec but have never touched uu-parsinglib)Beware
@ivanm: uu-parsinglib is very full-featured (automatic error correction is nice), and simple to use. The documentation is decent also. I probably use attoparsec more because I do more work with binary stuff, for which it's very efficient. The on-demand model also matches well with enumerator-style IO (which I naturally use heavily). I listed polyparse because I admire the simplicity and elegance, although I rarely use it.Hogle
Another vote for uu-parsinglib. Our team switched from Parsec to uu-parsinglib a bit more than a year ago, and we haven't looked back. No explicit 'try' combinator needed, ability to do lazy parsing, and error correction mean that the fundamentals are more powerful than those of Parsec (even if the end-user niceties - documentation,pre-provided parsers for lexing etc aren't yet quite as good - but getting better). Happy is generally somewhat more awkward to use - but you do get the standard yacc-style guarantee of linear time parsing - otherwise you get told at compile time.Ascribe
T
25

External vs internal DSL

The parser specification format for Happy is an external DSL, whereas with Parsec you have the full power of Haskell available when defining your parsers. This means that you can for example write functions to generate parsers, use Template Haskell and so on.

Precedence rules

With Happy, you can use precedences to simplify your grammar, whereas with Parsec you have to nest the grammar rules correctly yourself. Changing the precedence of an operator is therefore much more tedious in Parsec.

Static checking

Happy will warn you about ambiguities in your grammar at compile time. (Though it's not great at telling you where they are.) With Parsec, you get no warning until your parser fails at run time.

Tera answered 1/9, 2011 at 13:18 Comment(2)
buildExpressionParser in Text.ParserCombinators.Parsec.Expr does the expression grammar nesting for you.Miscreant
@pat: Neat! I did not know about that one.Tera
R
5

This is the traditional decision: do I use lex/yacc (happy) or do I write my own (mostly recursive descent) parser, only that the parsec library is like a DSL for doing it right.

If one has experience with the yacc/lex approach, using happy will be a smaller learning curve.

Ringworm answered 1/9, 2011 at 11:6 Comment(3)
I have only experience with Parsec, which is a toolkit to write LL(1) parsers in Haskell using a specialized monad, so it's not quite "writing a parser from scratch".Hostler
If you already have experience with Parsec, then I advise to stay with it. Note that I said Parsec "is like a DSL for doing it (i.e. writing it's own parser) right"Ringworm
@FUZxxl: Slight nitpick: Parsec is LL(N) not just LL(1). You can have arbitrary lookahead using the try combinator. Actually, parsec is even more powerful than LL(N) because it can cause context sensitive grammars due to it being a monad.Humidity
D
2

In my opinion Parsec hides most of the nasty grammar details and lets you write your parsers more intuitively. If you want to learn this stuff in the first place go with some parser-generator like Happy (or even try to implement one yourself).

Doehne answered 1/9, 2011 at 11:13 Comment(0)
L
1

I'm used to the parser combinator library uu-parsinglib from utrecht university. One can have error correcting and permutations for free, and also the things that parsec has. I also like it because my implemented grammar looks like an EBNF grammar, without so much monadic stuff, and is easy to read.

Local answered 12/9, 2011 at 13:36 Comment(0)
D
1

Naive parser combinators do not allow left-recursion in grammar rules and I haven't found a library that does.

Happy does allow full BNF in language spec, and some useful staff like priority rules. So, for complicated cases Happy and parser generators in general are much better. However, in case of simple, stupid languages with LL(k) parseable grammars, I would use a parser combinator library as more maintainer-friendly.

Daphie answered 9/8, 2012 at 16:1 Comment(1)
There are parser combinator libraries that implement the GLL algorithm which can handle left-recursive grammars.Skillern

© 2022 - 2024 — McMap. All rights reserved.