Parsing "x y z" with the precedence of multiply

I'm trying to write a parser for the Mathematica language in F# using FParsec.

I have written one for a MiniML that supports the syntax f x y = (f(x))(y) with high precedence for function application. Now I need the same syntax to mean f*x*y and, therefore, to have the same precedence as multiplication. In particular, x y + 2 = x*y + 2 whereas x y ^ 2 = x * y^2.

How can this be accomplished?

Poleaxe answered 28/3, 2015 at 21:34 Comment(7)
I haven't tried this, but I think you could implement this with the OperatorPrecedenceParser by making your normal whitespace parser not accept whitespace between identifiers and adding an infix operator for the " " space string with an 'after-string-parser' that fails without consuming input if the space is not followed by an identifier. - Interbreed
But that wouldn't parse (x)(y) = x*y? - Poleaxe
Maybe you could parse the second term in parens using a "(" postfix operator that parses the term and the closing paren with the after-string-parser. A cleaner approach, without the hackish " " and "(" operators, would be to parse juxtaposed terms as a sequence of terms. To correctly handle precedence, you'd probably need a separate OPP instance for all the (top-level) terms in a sequence other than the first. This other OPP would only include the operators that have a higher precedence than multiplication (and no prefix +/-). - Interbreed
Aha! Didn't occur to me to use two OPPs. I'll try it. Thanks! - Poleaxe
Argh, turns out this doesn't quite work for me because I have a high precedence "/" operator and a low precedence "/." operator. Unless they are in the same OperatorPrecedenceParser they are not disambiguated and trying to parse "/." fails with an error. - Poleaxe
This should be easy to fix: Just make the after-string-parser of the "/" operator check that the slash is not followed by a dot, e.g. by using notFollowedByString "." >>. spaces as the after-string-parser for this operator. - Interbreed
If there are multiple operators that start with "/", you obviously have to check for any of the possible chars, e.g. by using nextCharSatisfiesNot (function '.' | ':' -> true | _ -> false) instead of notFollowedByString. (The generated error message shouldn't matter when the after-string-parser fails without consuming input.) - Interbreed
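The after-string-parser suggestion from the last two comments can be sketched as follows. This is a minimal illustration, not the asker's actual grammar: the Div and ReplaceAll cases, the precedence values 10 and 20, and the #r "nuget: FParsec" reference (which assumes a dotnet fsi with NuGet support) are all assumptions made for the example.

```fsharp
#r "nuget: FParsec"  // assumption: dotnet fsi with NuGet package support

open FParsec

// Hypothetical AST used only for this sketch.
type Expr =
  | Int of int
  | Div of Expr * Expr         // high-precedence "/"
  | ReplaceAll of Expr * Expr  // low-precedence "/."

let pInt : Parser<Expr, unit> =
  many1Satisfy isDigit .>> spaces |>> fun s -> Int (int s)

let high = OperatorPrecedenceParser<Expr,unit,unit>()
let low = OperatorPrecedenceParser<Expr,unit,unit>()

high.TermParser <- pInt
low.TermParser <- high.ExpressionParser

// The after-string-parser of "/" fails without consuming input when the next
// char is ".", so the high parser backs off and leaves "/." for the low parser.
high.AddOperator(InfixOperator("/", notFollowedByString "." >>. spaces, 20, Associativity.Left, fun a b -> Div(a, b)))
low.AddOperator(InfixOperator("/.", spaces, 10, Associativity.Left, fun a b -> ReplaceAll(a, b)))

let result = run (spaces >>. low.ExpressionParser .>> eof) "8 / 4 /. 2"
printfn "%A" result
```

The intent is that "8 / 4" is grouped by the high-precedence parser first and the whole of it becomes the left operand of "/.". With several operators starting with "/", the notFollowedByString call would be swapped for the nextCharSatisfiesNot variant mentioned in the comment.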

As Stephan pointed out in a comment, you can split the operator parser into two separate parsers and put your own parser for space-separated expressions in between them. The following code demonstrates this:

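#I "../packages/FParsec.1.0.1/lib/net40-client"
#r "FParsec"
#r "FParsecCS"

open FParsec
open System.Numerics

type Expr =
  | Int of BigInteger
  | Add of Expr * Expr
  | Mul of Expr * Expr
  | Pow of Expr * Expr

// Parse a literal string and skip trailing whitespace.
let str s = pstring s >>. spaces
let pInt : Parser<_, unit> = many1Satisfy isDigit |>> BigInteger.Parse .>> spaces

// "high" holds the operators that bind tighter than multiplication,
// "low" holds the operators that bind looser than multiplication.
let high = OperatorPrecedenceParser<Expr,unit,unit>()
let low = OperatorPrecedenceParser<Expr,unit,unit>()
let pHighExpr = high.ExpressionParser .>> spaces
let pLowExpr = low.ExpressionParser .>> spaces

// A high-precedence term is an integer or a parenthesized low-precedence expression.
high.TermParser <-
  choice
    [ pInt |>> Int
      between (str "(") (str ")") pLowExpr ]

// A low-precedence term is one or more juxtaposed high-precedence expressions,
// folded from the left into Mul nodes. This is the implicit multiplication.
low.TermParser <-
  many1 pHighExpr |>> (function [f] -> f | fs -> List.reduce (fun f g -> Mul(f, g)) fs) .>> spaces

low.AddOperator(InfixOperator("+", spaces, 10, Associativity.Left, fun f g -> Add(f, g)))
high.AddOperator(InfixOperator("^", spaces, 20, Associativity.Right, fun f g -> Pow(f, g)))

// Parse a complete input string with the low-precedence (top-level) parser.
run (spaces >>. pLowExpr .>> eof) "1 2 + 3 4 ^ 5 6"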

The output is:

Add (Mul (Int 1,Int 2),Mul (Mul (Int 3,Pow (Int 4,Int 5)),Int 6))

which represents 1 * 2 + 3 * 4^5 * 6 as expected.
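To check the same two-parser layout against the cases raised in the question and comments (x y + 2, x y ^ 2 and (x)(y)), a condensed, self-contained variant might look like the sketch below. It swaps BigInteger for int to stay short, and the parse helper and the #r "nuget: FParsec" reference are assumptions added for illustration, not part of the answer above.

```fsharp
#r "nuget: FParsec"  // assumption: dotnet fsi with NuGet package support

open FParsec

type Expr =
  | Int of int
  | Add of Expr * Expr
  | Mul of Expr * Expr
  | Pow of Expr * Expr

let str s = pstring s >>. spaces
let pInt : Parser<Expr, unit> =
  many1Satisfy isDigit .>> spaces |>> fun s -> Int (int s)

let high = OperatorPrecedenceParser<Expr,unit,unit>()
let low = OperatorPrecedenceParser<Expr,unit,unit>()

// Terms above multiplication: literals and parenthesized expressions.
high.TermParser <- choice [ pInt; between (str "(") (str ")") low.ExpressionParser ]
// Juxtaposed high-precedence expressions are multiplied together, left to right.
low.TermParser <- many1 high.ExpressionParser |>> List.reduce (fun f g -> Mul(f, g))

low.AddOperator(InfixOperator("+", spaces, 10, Associativity.Left, fun f g -> Add(f, g)))
high.AddOperator(InfixOperator("^", spaces, 20, Associativity.Right, fun f g -> Pow(f, g)))

// Hypothetical helper: parse a whole string or raise with the FParsec error message.
let parse input =
  match run (spaces >>. low.ExpressionParser .>> eof) input with
  | Success (expr, _, _) -> expr
  | Failure (msg, _, _) -> failwith msg

printfn "%A" (parse "1 2 + 3")  // juxtaposition should bind tighter than +
printfn "%A" (parse "1 2 ^ 3")  // ^ should bind tighter than juxtaposition
printfn "%A" (parse "(1)(2)")   // parenthesized factors are juxtaposed terms too
```

Because parenthesized expressions are ordinary terms of the high-precedence parser, no dedicated "(" operator is needed for the (x)(y) case.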

Poleaxe answered 29/3, 2015 at 20:10 Comment(0)
