Recursive grammars in FParsec

I've decided to check out FParsec, and tried to write a parser for λ expressions. As it turns out, eagerness makes recursive parsing difficult. How can I solve this?

Code:

open FParsec

type λExpr =
    | Variable of char
    | Application of λExpr * λExpr
    | Lambda of char * λExpr

let rec FV = function
    | Variable v -> Set.singleton v
    | Application (f, x) -> FV f + FV x
    | Lambda (x, m) -> FV m - Set.singleton x

let Λ0 = FV >> (=) Set.empty

let apply f p =
    parse
        { let! v = p
          return f v }

let λ e =

    let expr, exprR = createParserForwardedToRef()

    let var = lower |> apply Variable

    let app = tuple2 expr expr
                 |> apply Application

    let lam = pipe2 (pchar 'λ' >>. many lower)
                        (pchar '.' >>. expr) (fun vs e ->
                                                List.foldBack (fun c e -> Lambda (c, e)) vs e)

    exprR := choice [
                    lam
                    app
                    var
                    (pchar '(' >>. expr .>> pchar ')')
                    ]

    run expr e

Thanks!

Wellbeing answered 31/5, 2011 at 10:29 Comment(2)
+1 for Greek characters in your code :) - Geraud
Okay, I finally downloaded FParsec :-) - Truss

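As you pointed out, the problem is that your parser for application (app = tuple2 expr expr) calls expr as its very first step, before consuming any input. Since expr in turn tries app, the parser recurses forever without making progress (left recursion). The parser needs to be written so that it always consumes some input first and only then decides what to do.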

In the case of lambda calculus, the tricky part is telling an application apart from a variable: for input like x... the first character could start either of them.

You can merge the rules for application and variable like this:

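// `apply` is the helper from the question; `expr` is defined further down.
let rec varApp = parse {
  // Always consume a single variable first...
  let! first = lower |> apply Variable
  // ...then try another expression (an application), else keep the variable.
  let! res = 
    choice [ expr |> apply (fun e -> Application(first, e))
             parse { return first } ]
  return res }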

This first parses a variable and then either parses another expression (in which case it is an application) or it just returns the variable (if there is no expression following the variable). The rest of the rules are similar:

and lam = 
  pipe2 (pchar 'λ' >>. many lower)
        (pchar '.' >>. expr) (fun vs e ->
    List.foldBack (fun c e -> Lambda (c, e)) vs e)
and brac = pchar '(' >>. expr .>> pchar ')'
and expr = parse.Delay(fun () ->
  choice 
    [ lam; varApp; brac ])

I just avoided the need for explicit mutation by using parse.Delay() (which makes it possible to create recursive value references). In principle, it could be written as:

and expr = parse {
  return! choice [ lam; varApp; brac ] }

...but for some reason, FParsec doesn't implement the ReturnFrom member that is needed if you want to use return! in computation expressions.
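
For comparison, here is a minimal sketch of the same grammar written with createParserForwardedToRef (as in the question) and plain combinators. It is not the code from this answer: it swaps in a different technique for application, parsing one or more atoms with many1 and folding them to the left, so that xyz reads as (xy)z. The names atom and app, the use of many1 for the bound variables, the type annotations, and the #r "nuget: FParsec" line (which assumes a script run with dotnet fsi) are assumptions made for illustration.

```fsharp
#r "nuget: FParsec"
open FParsec

type λExpr =
    | Variable of char
    | Application of λExpr * λExpr
    | Lambda of char * λExpr

// Forward reference, so the rules below can mention `expr` before it is defined.
let (expr: Parser<λExpr, unit>), exprRef = createParserForwardedToRef ()

// `p |>> f` maps f over the result of p.
let var : Parser<λExpr, unit> = lower |>> Variable

let brac = between (pchar '(') (pchar ')') expr

// many1 (an assumption; the question uses many) requires at least one bound variable.
let lam =
    pipe2 (pchar 'λ' >>. many1 lower) (pchar '.' >>. expr)
          (fun vs body -> List.foldBack (fun v e -> Lambda (v, e)) vs body)

// Every atom consumes at least one character before it recurses into `expr`,
// so there is no left recursion.
let atom = choice [ lam; var; brac ]

// One or more atoms, folded to the left: [x; y; z] becomes (x y) z.
let app = many1 atom |>> List.reduce (fun f x -> Application (f, x))

exprRef.Value <- app

printfn "%A" (run expr "xyz")
```

With this sketch, run expr "xyz" succeeds with Application (Application (Variable 'x', Variable 'y'), Variable 'z'), whereas the varApp rule above nests the same input to the right.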

Truss answered 1/6, 2011 at 1:46 Comment(7)
Thanks for bringing this to my attention. The omission of the ReturnFrom member from the 'parse' builder object is an oversight. In a previous version of F#, ReturnFrom was defined implicitly. (The definition is trivial.) This was supposed to be fixed in FParsec 0.9, but I forgot about it. I've just checked the fix into the BitBucket repository. - Hemelytron
Personally I no longer use the computation expression syntax for constructing parsers due to the performance issues described here: quanttec.com/fparsec/users-guide/… - Hemelytron
@Stephan Tolksdorf - The lack of ReturnFrom bothered me in the beginning but I have started avoiding the computation expression syntax because I noticed the parsers were much faster that way. I've just started getting used to your operator precedence parser which has given me an even bigger speed boost. - Kirstinkirstyn
@Stephan - Thanks for the answer and for adding ReturnFrom :-). The points about performance are very interesting! Would it be possible to solve it using some combinator (e.g. in the style of Seq.cache)? - Truss
@Kirstinkirstyn - Nice to hear that the operator precedence parser speeds up your parser. If there's anything in FParsec that bothers you besides the previously missing ReturnFrom, please let me know! - Hemelytron
@Tomas Petricek - The equivalent to Seq.cache for FParsec would be a memoization combinator like the one described in quanttec.com/fparsec/users-guide/… However, such a combinator will only help performance in certain special situations where you have a frequently backtracking parser. To get rid of the overhead associated with the computation expression syntax you either need more advanced compiler optimizations or you need to use meta-programming techniques to essentially perform these optimizations yourself. - Hemelytron
Fortunately, the overhead associated with the computation expression syntax is usually negligible for the most important application of computation expressions in F#: async-expressions. (seq-expressions are a special case, since the F# compiler compiles them into state machines.) - Hemelytron
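
To make the two styles discussed in these comments concrete, here is a small sketch that writes the question's apply helper once with the parse computation expression and once with the |>> operator. The names applyCE, applyOp and dup are hypothetical, the user state is fixed to unit for simplicity, and the #r line assumes a script run with dotnet fsi. The sketch only shows that both forms produce the same result; it does not measure performance.

```fsharp
#r "nuget: FParsec"
open FParsec

// Computation-expression style, as in the question's `apply` helper.
let applyCE (f: 'a -> 'b) (p: Parser<'a, unit>) : Parser<'b, unit> =
    parse {
        let! v = p
        return f v
    }

// Combinator style: `|>>` maps f over the result of p.
let applyOp (f: 'a -> 'b) (p: Parser<'a, unit>) : Parser<'b, unit> = p |>> f

// Hypothetical mapping function used only for this demonstration.
let dup (c: char) = (c, c)

// Extract the parsed value, or fail loudly with the parser's error message.
let resultOf r =
    match r with
    | Success (v, _, _) -> v
    | Failure (msg, _, _) -> failwith msg

let viaCE = resultOf (run (applyCE dup lower) "a")
let viaOp = resultOf (run (applyOp dup lower) "a")

printfn "%A %A" viaCE viaOp
```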
