Lexer/parser to generate Scala code from BNF grammar

I'm currently looking for a lexer/parser that generates Scala code from a BNF grammar (an ocamlyacc file with precedence and associativity). I'm quite confused since I found almost nothing on how to do it.

For parsing, I found scala-bison (which I have had a lot of trouble working with). All the other tools are just Java parser generators used from Scala (like ANTLR).

For lexing, I found nothing.

I also found the famous parser combinators of Scala, but (correct me if I'm wrong), even if they are quite appealing, they consume a lot of time and memory, mainly due to backtracking.

So I have two main questions:

  • Why do people only seem to concentrate on parser combinators?
  • What is your best lexer/parser generator suggestion to use with Scala?
Expedition answered 22/6, 2010 at 14:19 Comment(0)

As one of the authors of the ScalaBison paper, I have run into this issue a few times. :-) What I would usually do for scanning in Scala is use JFlex. It works surprisingly well with ScalaBison, and all of our benchmarking was done using that combination. The unfortunate downside is that it does generate Java sources, and so compilation takes a bit of gymnastics. I believe that John Boyland (the main author of the paper) has developed a Scala output mode for JFlex, but I don't think it has been publicly released.

For my own development, I've been working a lot with scannerless parsing techniques. Scala 2.8's packrat parser combinators are quite good, though still not generalized. I've built an experimental library which implements generalized parsing within the parser combinator framework. Its asymptotic bounds are much better than those of traditional parser combinators, but in practice the constant-factor overhead is higher (I'm still working on it).
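
To illustrate what scannerless parsing looks like with the standard combinators, here is a minimal sketch (it is not the experimental generalized library mentioned above). It assumes the `scala-parser-combinators` module is on the classpath; in the Scala 2.8 era it shipped with the standard library. The object name `Calc` and the toy arithmetic grammar are made up for illustration. Tokens are matched inline by regexes and string literals, so there is no separate lexer, and `chainl1` gives left-associative operators without a left-recursive rule.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical toy grammar: integers, + - *, and parentheses.
object Calc extends RegexParsers {
  // "Tokens" are plain regexes; RegexParsers skips whitespace between them.
  def number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  def factor: Parser[Int] = number | "(" ~> expr <~ ")"

  // Higher precedence level: multiplication, left-associative via chainl1.
  def term: Parser[Int] =
    chainl1(factor, "*" ^^^ ((a: Int, b: Int) => a * b))

  // Lower precedence level: addition and subtraction, left-associative.
  def expr: Parser[Int] =
    chainl1(term,
      "+" ^^^ ((a: Int, b: Int) => a + b) |
      "-" ^^^ ((a: Int, b: Int) => a - b))

  // parseAll fails unless the whole input is consumed.
  def eval(s: String): ParseResult[Int] = parseAll(expr, s)
}
```

Precedence is encoded by layering the rules (`expr` over `term` over `factor`), which is roughly what you would do by hand when porting yacc-style `%left` declarations to combinators.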

Pansy answered 22/6, 2010 at 16:54 Comment(2)
Thanks for the answer and your gll combinators, I'll try to understand how they work :) But I think I'll try to play with JFlex and Scala together. -- Expedition
Thanks to a lot of tutorials (including some of yours on codecommit), I finally managed to write a simple lexer/parser with parser combinators, and without too much recursion. Thanks again! -- Expedition

Scala 2.8 has a packrat parser. I quote from the API docs here:

Packrat Parsing is a technique for implementing backtracking, recursive-descent parsers, with the advantage that it guarantees unlimited lookahead and a linear parse time. Using this technique, left recursive grammars can also be accepted.
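
As a rough sketch of the left-recursion point in that quote, the grammar below is written the way a yacc-style grammar would be, with `expr` and `term` referring to themselves on the left. It assumes the `scala-parser-combinators` module is available (part of the standard library in Scala 2.8); the object name `LeftRec` and the grammar itself are invented for this example.

```scala
import scala.util.parsing.combinator.{PackratParsers, RegexParsers}

// Hypothetical example grammar; only the PackratParsers API is real.
object LeftRec extends RegexParsers with PackratParsers {
  // Left-recursive rules must be lazy vals of type PackratParser so that
  // intermediate results are memoized per input position.
  lazy val expr: PackratParser[Int] =
    expr ~ ("+" ~> term) ^^ { case l ~ r => l + r } |
    expr ~ ("-" ~> term) ^^ { case l ~ r => l - r } |
    term

  lazy val term: PackratParser[Int] =
    term ~ ("*" ~> factor) ^^ { case l ~ r => l * r } |
    factor

  lazy val factor: PackratParser[Int] =
    """\d+""".r ^^ (_.toInt) | "(" ~> expr <~ ")"

  // parseAll goes through phrase, which PackratParsers overrides to wrap
  // the input in a memoizing PackratReader.
  def eval(s: String): ParseResult[Int] = parseAll(expr, s)
}
```

With plain (non-packrat) `Parser` definitions, the same left-recursive rules would recurse forever, so this is the main practical difference to keep in mind when translating an existing BNF grammar.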

Tinatinamou answered 22/6, 2010 at 16:1 Comment(0)

I know that this question is old, but for those still in search of a lexer generator that outputs Scala code, I've written a fork of JFlex that emits Scala rather than Java, including corresponding Maven and sbt plugins. All are now available on Maven Central.

We're currently using it (including the Maven/sbt plugins) to tokenize English text as part of the natural language processing pipeline in FACTORIE -- example .flex file containing Scala here.

Bohannan answered 8/4, 2015 at 22:21 Comment(2)
That's great. I had released JFlex 1.5 + Scala at github.com/moy/JFlex/releases, but it seems yours is more up-to-date, as well as easier to find. -- Insomniac
@JohnTangBoyland I wish I had found your version before writing mine! -- Bohannan
