What should the correct grammar be for correct precedence evaluation of +,-,/,*, etc
Asked Answered
S

2

6

My grammar has these rules

 expression
: expression EQ conditionalOrExpression                     #eqExpr
 | expression NEQ conditionalOrExpression                   #neqExpr
 | expression LT conditionalOrExpression                    #ltExpr
 | expression GT conditionalOrExpression                    #gtExpr
 | expression LTEQ conditionalOrExpression                  #lteqExpr
 | expression GTEQ conditionalOrExpression                  #gteqExpr
 | conditionalOrExpression                                  #next       
;

conditionalOrExpression
 : conditionalOrExpression OR conditionalAndExpression      #orExpr
 | conditionalAndExpression                                 #and
 ;

conditionalAndExpression
 : conditionalAndExpression AND additiveExpression          #andExpr
 | additiveExpression                                       #add
 ;

additiveExpression
 : additiveExpression PLUS multiplicativeExpression         #plusExpr
 | additiveExpression MINUS multiplicativeExpression        #minusExpr
 | multiplicativeExpression                                 #multiplicative
 ;

multiplicativeExpression
 : multiplicativeExpression MULT unaryExpression            #multExpr
 | multiplicativeExpression DIV unaryExpression             #divExpr
 | unaryExpression                                          #unary
 ;

unaryExpression
 : MINUS unaryExpression                                    #unaryMinusExpr
 | NOT unaryExpression                                      #notExpr
 | atom                                                     #atomExpr
 ;

function
: ID OPAR (parameter (',' parameter)*)? CPAR
;

parameter
: STRING                                                    #stringParameter
| expression                                                #exprParameter
;

atom
 : OPAR expression CPAR                                     #parExpr
 | (INT | FLOAT)                                            #numberAtom
 | (TRUE | FALSE)                                           #booleanAtom
 | ID                                                       #idAtom
 | function                                                 #functionAtom
 ;

I have implemented the appropriate Visitor.

If I evaluate "40 + 10 - (2*40) + (100/40) + 0.2" the result is -32.7. This is because the expression gets evaluated as

(40+10) - (((2*40) + (100/40)) + 0.2) 

which makes sense according to the rules (PLUS before MINUS).

However, if I evaluate the same expression in Excel or e.g. assign it to a double in C#, in both cases the result is -27.3. This is because they evaluate the rule as

(((40+10)-(2*40)) + (100/40)) + 0.2

So which is "correct"? -32.7 is technically correct since that's what the rules dictate. But how should the grammar change to match the results in Excel/C#?

Sycee answered 20/8, 2014 at 20:3 Comment(1)
The C#/Excel way of doing it is "correct" in that it is the mathematical way. Your grammar, as you say, puts "PLUS before MINUS". This is not the case in the order of operations used in standard modern mathematics. It is as follows: Bracketed Expressions -> Exponentiation -> Multiplication/Division (whichever comes first, left to right) -> Addition/Subtraction (whichever comes first, left to right). You are, of course, free to define mathematics however you want in your own language, but I would highly suggest using the standard form, as programmers will already be familiar with it.Citronellal
A
6

If you group the + and - in a single alternative, it should work. I changed the grammar in my demo evaluator on GitHub as follows:

expr
 : expr POW<assoc=right> expr           #powExpr
 | MINUS expr                           #unaryMinusExpr
 | NOT expr                             #notExpr
 | expr op=(MULT | DIV | MOD) expr      #multiplicationExpr
 | expr op=(PLUS | MINUS) expr          #additiveExpr
 | expr op=(LTEQ | GTEQ | LT | GT) expr #relationalExpr
 | expr op=(EQ | NEQ) expr              #equalityExpr
 | expr AND expr                        #andExpr
 | expr OR expr                         #orExpr
 | atom                                 #atomExpr
 ;

and then tested with the following code:

MuLexer lexer = new MuLexer(new ANTLRInputStream("log(40 + 10 - (2*40) + (100/40) + 0.2);"));
MuParser parser = new MuParser(new CommonTokenStream(lexer));
new EvalVisitor().visit(parser.parse());

the value -27.3 gets printed on my console.

Aeroembolism answered 21/8, 2014 at 8:26 Comment(1)
now how to left-factor this code? (Combining all the '| expr' into one rule) :)Incapacitate
B
1

To get the order of operations correct on expressions like this you'll want to set up nested rules. Parse the higher precedence operators deeper in, closer to the primary expressions. For an example, I'd point you to a syntax I wrote for Jison, a similar grammar tool. The code can be seen here.

You'll break the one expr expression into several smaller expressions for each "level" of operator precedence, such as:

expr
  : comp_expr
  | expr0
  ;

expr0
  : or_expr
  | expr1
  ;

expr1
  : and_expr
  | expr2
  ;

expr2
  : add_expr
  | expr3
  ;

expr3
  : mult_expr
  | expr4
  ;

expr4
  : exp_expr
  | primary
  ;
Bisutun answered 20/8, 2014 at 20:10 Comment(2)
C++ (and probably C#) would look vaguely like this: ideone.com/GyAEoB [source: en.wikipedia.org/wiki/…Wirephoto
this leads to a complicated parse treeIncapacitate

© 2022 - 2024 — McMap. All rights reserved.