Parsing complete mathematical expressions with PEG.js
Asked Answered
P

2

5

I'm trying to extend the example grammar of PEG.js for parsing mathematical expressions with all the 4 operators for my online BASIC interpreter experiment:

http://www.dantonag.it/basicjs/basicjs.html

but not all the expressions are parsed correctly.

This is my PEG grammar:

expression = additive

additive = left:multiplicative atag:("+" / "-") right:additive { return {tag: atag, left:left, right:right}; } / multiplicative

multiplicative = left:primary atag:("*" / "/") right:multiplicative { return {tag: atag, left:left, right:right}; } / primary

primary = number / "(" additive:additive ")" { return additive; }

number = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

It parses correctly expressions like 2*3+1 (giving 7), but not an expression like 2-1-1, that gives 2 instead of 0.

Can you help me improving and debugging this?

Thanks in advance.

Edit: I've added the "number" rule to the grammar. And yes, my grammar gives as output a recursive structure that is analogue to a parse tree.

Peeper answered 15/10, 2013 at 20:7 Comment(0)
C
7

First: your grammar is missing the number rule. Also, as I'm sure you're aware, running your grammar (after adding number) on your example does not give 2, but rather something like a parse tree. Would you mind updating the question to fix those two issues?


Problem: It looks like you've run into associativity. Associativity comes into play when two operators with the same precedence are competing for an operand. In your example, - is competing with - -- so clearly it will have the same precedence as itself -- but associativity will also be important for breaking ties between + and -, and between * and /.

I assume that 2*3+1 is parsed correctly because the two operators have different precedences, meaning that associativity does not come into play, and that your grammar correctly implements precedence (although you should note that 2+3*1 is a more standard example for showing that multiplication has higher precedence than addition, since simple left-to-right parsing of 2*3+1 gives the same result as your parser).

I assume you want - to be left-associative, but it seems to be right-associative in your grammar, based on this example:

  • input:

    1-2-3
    
  • output (parsed as 1-(2-3)):

    {
       "tag": "-",
       "left": "1",
       "right": {
          "tag": "-",
          "left": "2",
          "right": "3"
       }
    }
    

The left associative tree would look like this (from (1-2)-3):

{
   "tag": "-",
   "left": {
      "tag": "-",
      "left": "1",
      "right": "2"
   },
   "right": "3"
}

You should note that your other operators also appear to be right-associative instead of left-.

Solution: I don't really know how peg.js works, but some quick googling turned up this.

Grammar-based solutions to operator precedence and associativity are often pretty nasty (see a grammar for Python for evidence), so you may want to check out [top down] operator precedence parsing for a more flexible and expressive alternative. Douglas Crockford, Vaughn Pratt, and Annika Aasa have some nice articles on this subject.

Chippy answered 16/10, 2013 at 14:33 Comment(1)
Ok thanks, solved this looking at the example grammar you posted. I think it would have been very hard to come to this solution without an hint!Peeper
A
11

Matt's answer is the correct one, but on how to implement left-associativity within pegjs:

expression = additive

additive
  = first:multiplicative rest:(("+" / "-") multiplicative)+ {
    return rest.reduce(function(memo, curr) {
      return {atag: curr[0], left: memo, right: curr[1]};
    }, first);
  }
  / multiplicative

multiplicative
  = first:primary rest:(("*" / "/") primary)+ {
    return rest.reduce(function(memo, curr) {
      return {atag: curr[0], left: memo, right: curr[1]};
    }, first);
  }
  / primary

primary
  = number
  / "(" additive:additive ")" { return additive; }

number
  = digits:[0-9]+ { return parseInt(digits.join(""), 10); }

The javascript.pegjs example Matt links to uses a similar method. The key is to process strings of operations with the same precedence as a list, which allows you to build your tree with the correct associativity.

Anam answered 12/6, 2015 at 8:34 Comment(1)
The use of reduce to roll up repeat expressions is really elegant. Great answer!Schopenhauer
C
7

First: your grammar is missing the number rule. Also, as I'm sure you're aware, running your grammar (after adding number) on your example does not give 2, but rather something like a parse tree. Would you mind updating the question to fix those two issues?


Problem: It looks like you've run into associativity. Associativity comes into play when two operators with the same precedence are competing for an operand. In your example, - is competing with - -- so clearly it will have the same precedence as itself -- but associativity will also be important for breaking ties between + and -, and between * and /.

I assume that 2*3+1 is parsed correctly because the two operators have different precedences, meaning that associativity does not come into play, and that your grammar correctly implements precedence (although you should note that 2+3*1 is a more standard example for showing that multiplication has higher precedence than addition, since simple left-to-right parsing of 2*3+1 gives the same result as your parser).

I assume you want - to be left-associative, but it seems to be right-associative in your grammar, based on this example:

  • input:

    1-2-3
    
  • output (parsed as 1-(2-3)):

    {
       "tag": "-",
       "left": "1",
       "right": {
          "tag": "-",
          "left": "2",
          "right": "3"
       }
    }
    

The left associative tree would look like this (from (1-2)-3):

{
   "tag": "-",
   "left": {
      "tag": "-",
      "left": "1",
      "right": "2"
   },
   "right": "3"
}

You should note that your other operators also appear to be right-associative instead of left-.

Solution: I don't really know how peg.js works, but some quick googling turned up this.

Grammar-based solutions to operator precedence and associativity are often pretty nasty (see a grammar for Python for evidence), so you may want to check out [top down] operator precedence parsing for a more flexible and expressive alternative. Douglas Crockford, Vaughn Pratt, and Annika Aasa have some nice articles on this subject.

Chippy answered 16/10, 2013 at 14:33 Comment(1)
Ok thanks, solved this looking at the example grammar you posted. I think it would have been very hard to come to this solution without an hint!Peeper

© 2022 - 2024 — McMap. All rights reserved.