Is Scalas/Haskells parser combinators sufficient?

R

6

8

I'm wondering if Scalas/Haskells parser combinators are sufficient for parsing a programming language. More specifically the language MiniJava. I'm currently reading compiller construction and jflex and java cup is quite painful to work with so I'm wondering if I could/should use parser combinators instead. The MiniJava syntax is very small. MiniJavas BNF: http://www.cambridge.org/us/features/052182060X/grammar.html

Rhapsodic answered 28/1, 2009 at 22:51 Comment(1)

I'm happy to announce that it certainly was! – Rhapsodic 22/3, 2009 at 12:5

S

4

Scala's parser is a backtracking parser, so it can deal with pretty much any BNF or EBNF. It also means, though, that there are edge cases where input can be painfully slow to be read.

If the grammar can be changed into an LL(1) grammar, you can use the ~! operator to keep backtracking to a minimum.

The grammar probably CAN be turned into LL(1), but, as written, it is not. See, for instance, that Expression and Statement have First/First conflicts (look this up at the end of the linked article).

Anyway, for an academic project, it is enough. For real life compiler stuff, you'll need faster parsers.

Semolina answered 8/7, 2009 at 21:58 Comment(0)

V

11

I've never used Scala, but the existence of a definitive BNF makes this easy.

Trivially translated into Haskell's Text.ParserCombinators.Parsec:

goal = do c <- mainClass
          cs <- many classDeclaration
          eof
          return $ c:cs
mainClass = do token "class"
               name <- identifier
               ...

etc. The PArrows translation is pretty trivial too. You'll probably find it easier to have a distinct lexing phase before the parser, but you can do without too.

Vasoinhibitor answered 29/1, 2009 at 2:6 Comment(0)

P

6

I'm using Scala's parser combinators to parse PL/SQL code, it works like a charm.

Photochemistry answered 29/1, 2009 at 20:52 Comment(1)

Ah yes, sorry the blog is in Spanish. I've also added support for C/Java parsing to the app. – Conformation 15/8, 2010 at 16:37

M

5

At least Parsec has built-in lexer for Java-like languages:

lexer = makeTokenParser javaStyle

You have to define the reserved words yourself.

Megillah answered 29/1, 2009 at 8:50 Comment(0)

S

4

Scala's parser is a backtracking parser, so it can deal with pretty much any BNF or EBNF. It also means, though, that there are edge cases where input can be painfully slow to be read.

If the grammar can be changed into an LL(1) grammar, you can use the ~! operator to keep backtracking to a minimum.

The grammar probably CAN be turned into LL(1), but, as written, it is not. See, for instance, that Expression and Statement have First/First conflicts (look this up at the end of the linked article).

Anyway, for an academic project, it is enough. For real life compiler stuff, you'll need faster parsers.

Semolina answered 8/7, 2009 at 21:58 Comment(0)

S

2

Programming in Scala (p. 647) says:

It [Scala's parser combinator framework] is much easier to understand and to adapt than a parser generator, and the difference in speed would often not matter in practice, unless you want to parse very large inputs.

As I would not classify source code as very large input (ideally), it should be sufficient.

Swallowtail answered 28/1, 2009 at 23:7 Comment(0)

V

0

I haven't dealt with the Scala or Haskell parser combinator libraries, but it looks like the grammar should be fine.

Volin answered 28/1, 2009 at 23:5 Comment(0)

Recommended topics

Hot tags