Choosing a Haskell parser

There are many open-source parser implementations available to us in Haskell. Parsec seems to be the standard for text parsing and attoparsec seems to be a popular choice for binary parsing, but I don't know much beyond that. Is there a particular decision tree that you follow for choosing a parser implementation? Have you learned anything interesting about the strengths or weaknesses of the libraries?

Go answered 19/6, 2010 at 20:57

You have several good options.

For lightweight parsing of String types:

For packed bytestring parsing, e.g. of HTTP headers:

attoparsec
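
As a rough illustration of the HTTP header case, here is a minimal sketch against the current attoparsec API (`Data.Attoparsec.ByteString.Char8`). The `headerLine` and `parseHeader` names are made up for this example, and the grammar is deliberately simplified; it is not a complete HTTP header parser.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Attoparsec.ByteString.Char8 as A
import Data.ByteString (ByteString)

-- Parse one "Name: value" line terminated by "\n" or "\r\n".
-- headerLine is a hypothetical helper, not something attoparsec provides.
headerLine :: A.Parser (ByteString, ByteString)
headerLine = do
  name  <- A.takeWhile1 (\c -> c /= ':' && c /= '\r' && c /= '\n')
  _     <- A.char ':'
  A.skipWhile (== ' ')
  value <- A.takeTill (\c -> c == '\r' || c == '\n')
  A.endOfLine
  return (name, value)

-- Run the parser over a complete strict ByteString.
parseHeader :: ByteString -> Either String (ByteString, ByteString)
parseHeader = A.parseOnly headerLine

main :: IO ()
main = print (parseHeader "Content-Type: text/html\r\n")
```

The parser works directly on strict `ByteString` input and returns slices of it, which is a large part of why it suits this kind of protocol data.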

For actual binary data most people use either:

binary -- for lazy binary parsing
cereal -- for strict binary parsing
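
To give a feel for the binary style, here is a small sketch using `Data.Binary.Get` from the binary package. The record layout (a 16-bit tag followed by a 32-bit length, both big-endian) and the `RecordHeader` type are invented for the example.

```haskell
module Main where

import Data.Binary.Get (Get, getWord16be, getWord32be, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical fixed-size header: 2-byte tag, then 4-byte payload length.
data RecordHeader = RecordHeader
  { recTag    :: Word16
  , recLength :: Word32
  } deriving (Show, Eq)

-- Read the two fields in order, both big-endian.
getHeader :: Get RecordHeader
getHeader = RecordHeader <$> getWord16be <*> getWord32be

-- runGet consumes a lazy ByteString; it calls error if the input is too short.
decodeHeader :: BL.ByteString -> RecordHeader
decodeHeader = runGet getHeader

main :: IO ()
main = print (decodeHeader (BL.pack [0x00, 0x01, 0x00, 0x00, 0x00, 0x10]))
```

cereal's `Data.Serialize.Get` looks much the same, but its `runGet` takes a strict `ByteString` and returns an `Either String a` instead of throwing.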

The main question to ask yourself is: what is the underlying string type?

String?
bytestring (strict)?
bytestring (lazy)?
Unicode text?

That decision largely determines which parser toolset you'll use.

The second question to ask is: do I already have a grammar for the data type? If so, I can just use Happy, the parser generator.

And obviously for custom data types there are a variety of good existing parsers:

XML
- haxml
- xml-light
- hxt
- hexpat
CSV
- bytestring-csv
- csv
JSON
- json
rss/atom
- feed

Brimstone answered 19/6, 2010 at 21:13

Just to add to Don's post: personally, I quite like Text.ParserCombinators.ReadP (part of base) for no-nonsense quick and easy stuff, particularly when Parsec seems like overkill.
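
For a sense of how little ceremony ReadP needs, here is a minimal sketch that uses only base. The `key = number` input format and the `keyValue` and `parseKeyValue` names are made up for illustration.

```haskell
module Main where

import Data.Char (isDigit, isSpace)
import Text.ParserCombinators.ReadP

-- Parse a line such as "port = 8080" into a key and an integer value.
keyValue :: ReadP (String, Int)
keyValue = do
  key <- munch1 (\c -> c /= '=' && not (isSpace c))
  skipSpaces
  _ <- char '='
  skipSpaces
  digits <- munch1 isDigit
  skipSpaces
  eof
  return (key, read digits)

-- Accept the input only if exactly one parse consumed all of it.
parseKeyValue :: String -> Maybe (String, Int)
parseKeyValue s = case readP_to_S keyValue s of
  [(result, "")] -> Just result
  _              -> Nothing

main :: IO ()
main = do
  print (parseKeyValue "port = 8080")  -- Just ("port",8080)
  print (parseKeyValue "port = ")      -- Nothing
```

Note that a failed parse is just an empty result list, so there is no error message to report; that is the trade-off for the simplicity.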

There is a bytestringreadp library for the bytestring version, but it doesn't cover Char8 bytestrings, and I suspect attoparsec would be a better choice at this point.

Sworn answered 20/6, 2010 at 0:57

I recently converted some code from Parsec to Attoparsec. Both are quite capable.

Attoparsec wins on performance and memory footprint, but Parsec provides better error reporting and has more complete documentation.
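
To illustrate the error-reporting side, here is a small Parsec sketch. The `number` and `parseNumber` names and the `"<input>"` source label are chosen for the example. The `<?>` combinator attaches a human-readable label that shows up in the "expecting" part of the error, alongside the line and column.

```haskell
module Main where

import Text.Parsec (ParseError, digit, eof, many1, parse, (<?>))
import Text.Parsec.String (Parser)

-- One or more digits, labelled so failures read "expecting number".
number :: Parser Int
number = (read <$> many1 digit) <?> "number"

-- Require the whole input to be a number; "<input>" is just a source name
-- that Parsec includes in its error messages.
parseNumber :: String -> Either ParseError Int
parseNumber = parse (number <* eof) "<input>"

main :: IO ()
main = do
  print (parseNumber "42")
  -- The Left value carries the source position and the expected label.
  print (parseNumber "abc")
```

With attoparsec's `parseOnly`, by comparison, a failure comes back as a plain `String` with no source position attached.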

Socialistic answered 20/6, 2010 at 20:53

Bryan O’Sullivan’s blog post "What’s in a parser? Attoparsec rewired (2/2)" includes a nice performance benchmark comparing several implementations, along with some comments comparing memory usage.

Go answered 20/6, 2010 at 15:44
