Recursive expression evaluator using Java

6

10

I want to write an expression evaluator that supports only addition and subtraction. I have a simple algorithm for it, but I am having trouble implementing it.

I treat an expression as a String of the form:

"(" <expression1> <operator> <expression2> ")"

Here is my algorithm

String evaluate( String expression )

   if expression is digit
      return expression

   else if expression is "(" <expression1> <operator> <expression2> ")"
      cut the brackets out of it
      expression1 = evaluate( <expression1> )
      operator = <operator>
      expression2 = evaluate( <expression2> )

      if operator is +
         return expression1 + expression2

      else if operator is -
         return expression1 - expression2

My problem is parsing <expression1>, <operator> and <expression2> from the expression. How can I do that?

Note: I'm not asking for code. All I need is an idea of how to do it.

Zoography answered 1/11, 2010 at 21:10 Comment(1)
If you are interested in a working example of a small Java math evaluator written in precisely this way, I have one on my website: softwaremonkey.org/Code/MathEval - Twannatwattle
7

My problem is parsing <expression1>, <operator> and <expression2> from the expression

Don't do that, then :) When you see an opening bracket, make your recursive call to evaluate the expression inside it. At the end of an expression you find either another operator (so you're not at the end of the expression after all) or a right bracket, in which case you return from evaluate.
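A minimal sketch of that idea, assuming non-negative integer operands and only + and -. The class and method names (RecursiveEvaluator, expression, operand) are made up for illustration and are not from the question. A shared cursor (pos) moves through the string, so the recursion itself keeps track of which bracket matches which.

```java
// Illustrative sketch: names and error messages are assumptions.
final class RecursiveEvaluator {
    private final String input;
    private int pos = 0; // index of the next unread character

    private RecursiveEvaluator(String input) {
        this.input = input.replaceAll("\\s+", ""); // ignore whitespace
    }

    static int evaluate(String text) {
        RecursiveEvaluator e = new RecursiveEvaluator(text);
        int value = e.expression();
        if (e.pos != e.input.length()) {
            // e.g. a stray ')' after returning to the top level
            throw new IllegalArgumentException("Unexpected input at " + e.pos);
        }
        return value;
    }

    // Keeps consuming "operator operand" pairs; returns when the next
    // character is not an operator (a ')' or the end of the input).
    private int expression() {
        int value = operand();
        while (pos < input.length()
                && (input.charAt(pos) == '+' || input.charAt(pos) == '-')) {
            char op = input.charAt(pos++);
            int right = operand();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // An operand is a number or a bracketed expression (the recursive call).
    private int operand() {
        if (pos < input.length() && input.charAt(pos) == '(') {
            pos++; // consume '('
            int value = expression();
            if (pos >= input.length() || input.charAt(pos) != ')') {
                throw new IllegalArgumentException("Missing ')' at " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a number at " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }
}
```

For example, RecursiveEvaluator.evaluate("(1 - (2 + 3))") recurses once per opening bracket and returns -4.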

Tried answered 1/11, 2010 at 21:16 Comment(5)
Well, to do a recursive call to parse expression1, he would basically need to count parentheses in order to tell where expression1 ended, but otherwise I like your answer. - Flooring
Not really. The recursion does the counting for you. If you run across a ) at a point where an expression could end, that's the end of this recursive call. This is how recursive-descent parsers work... - Tried
Ah, so you recurse on the tail of the string, even if it's ill-balanced? - Flooring
I think so, if I understand you correctly. If you see a (, you recurse. Either you hit the end of the input (in which case, error) or you see a balancing ) and return from this recursion. If you see a ) after returning to the top level, that's an error too. This is what (recursive descent) parser generators will produce, but it's educational to implement one yourself. That's why they're called recursive descent, in fact! - Tried
You don't have to do any of that. Your term() and factor() and prime() methods should just return if the next token isn't something they can handle. So when expression() returns into the code that called it because of the '(', the next token should be ')'. If it isn't, it's missing. - Beyrouth
3

One option is to use a parser generator such as JavaCUP or ANTLR: write up a BNF of your expression and generate a parser. Here is a sample grammar to get you started:

Expression ::= Digit
            |  LeftBracket Expression Plus Expression RightBracket
            |  LeftBracket Expression Minus Expression RightBracket
            |  LeftBracket Expression RightBracket

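Alternatively, a "hacky" way of doing it yourself would be to look for the first ), backtrack to the closest (, look at the parenthesis-free expression in between, split it on the operator symbols, and evaluate it. Replace that sub-expression with its value and repeat until no parentheses remain.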
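The do-it-yourself approach could be sketched roughly as follows. The names (InnermostFirst, evaluateFlat) are hypothetical. One deviation from a plain split on operators: an inner result can be negative, which leaves text such as 1--3 behind, so this sketch scans the flat expression and tracks the sign instead of splitting. It assumes integer operands and does little validation.

```java
// Illustrative sketch: class and method names are assumptions.
final class InnermostFirst {

    static int evaluate(String expression) {
        String s = expression.replaceAll("\\s+", "");
        int close;
        // The first ')' always closes an innermost group.
        while ((close = s.indexOf(')')) >= 0) {
            int open = s.lastIndexOf('(', close); // closest '(' before it
            if (open < 0) {
                throw new IllegalArgumentException("Unbalanced ')'");
            }
            int value = evaluateFlat(s.substring(open + 1, close));
            // Replace "( ... )" with its value and look again.
            s = s.substring(0, open) + value + s.substring(close + 1);
        }
        if (s.indexOf('(') >= 0) {
            throw new IllegalArgumentException("Unbalanced '('");
        }
        return evaluateFlat(s);
    }

    // Evaluates a parenthesis-free expression such as "1-2+3" or "1--3".
    static int evaluateFlat(String flat) {
        int total = 0;
        int sign = 1; // sign to apply to the next number
        int i = 0;
        while (i < flat.length()) {
            char c = flat.charAt(i);
            if (c == '+') {
                i++;
            } else if (c == '-') {
                sign = -sign; // "--" cancels out
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < flat.length() && Character.isDigit(flat.charAt(i))) {
                    i++;
                }
                total += sign * Integer.parseInt(flat.substring(start, i));
                sign = 1;
            } else {
                throw new IllegalArgumentException("Unexpected '" + c + "'");
            }
        }
        return total;
    }
}
```

With the input (1 - (2 - 5)) the string is rewritten to (1--3), then to 4. Each pass rescans and copies the string, so this is simpler to write but slower than a single-pass parser.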

Flooring answered 1/11, 2010 at 21:16 Comment(3)
I think you left Number out of your grammar. - Reorganization
That’s an ambiguous grammar, since parentheses aren’t mandatory. - Genvieve
Good point. And the OP seemed to require parentheses so I added it :-) - Flooring
3

Use a StringTokenizer to split your input string into parentheses, operators and numbers, then iterate over your tokens, making a recursive call for every opening parenthesis and returning from your method for every closing parenthesis.

I know you didn't ask for code, but this works for valid input:

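import java.util.StringTokenizer;

public class Evaluator { // class name is arbitrary

    public static int eval(String expr) {
        // Return the delimiters as tokens too, so parentheses and operators are seen.
        StringTokenizer st = new StringTokenizer(expr, "()+- ", true);
        return eval(st);
    }

    // Evaluates tokens until a closing parenthesis or the end of the input.
    private static int eval(StringTokenizer st) {
        int result = 0;
        String tok;
        boolean addition = true; // operator to apply to the next operand
        while ((tok = getNextToken(st)) != null) {
            if (")".equals(tok))
                return result;
            else if ("+".equals(tok))
                addition = true;
            else if ("-".equals(tok))
                addition = false;
            else {
                // An operand is a parenthesized sub-expression or a number.
                // Apply the pending operator to it instead of overwriting result,
                // otherwise "(1 - (2 + 3))" would evaluate to 5 rather than -4.
                int value = "(".equals(tok) ? eval(st) : Integer.parseInt(tok);
                result = addition ? result + value : result - value;
            }
        }
        return result;
    }

    // Returns the next non-blank token, or null when the input is exhausted.
    private static String getNextToken(StringTokenizer st) {
        while (st.hasMoreTokens()) {
            String tok = st.nextToken().trim();
            if (tok.length() > 0)
                return tok;
        }
        return null;
    }
}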

It would need better handling of invalid input, but you get the idea...

Darrickdarrill answered 1/11, 2010 at 21:26 Comment(5)
I don't understand why you used getNextToken() instead of nextToken(). - Zoography
It doesn't handle parentheses or operator precedence correctly, and it never will until you introduce recursion into it, or an operand stack. - Beyrouth
Parentheses are handled correctly, and since addition and subtraction (the only 2 operations needed) have the same precedence, there's no need to add any additional logic for that. If you were to want multiplication and division, then yes, you'd need an operand stack. - Darrickdarrill
@ECP: Mea culpa - I see you're right about the mishandling of parentheses; my unnecessary recursive call for simple addition or subtraction was messing things up there... that's what I get for trying to hack some code together in 5 minutes :p I fixed the code to remove this unnecessary recursion. - Darrickdarrill
@alicozgo getNextToken() is used to skip whitespace, though in retrospect that could've just been ignored by eval() itself as well. And you're right, this is essentially the same solution Paul suggested. - Darrickdarrill
3

I would recommend converting the infix input to postfix and then evaluating that, rather than reducing the expression infix-wise. There are already well-defined algorithms for this, and it avoids the parsing problems inherent in multiply nested parentheses.

Take a look at the Shunting Yard Algorithm to convert to postfix/RPN, then evaluate the postfix expression using a stack. This is fast (O(n)) and reliable.
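A rough sketch of both steps, restricted to what the question needs: non-negative integers, + and - (equal precedence, left-associative), and parentheses. The class and method names (PostfixEvaluator, toPostfix, evaluatePostfix) are invented for this example, and unary minus is not handled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative sketch: names are assumptions, not from the answer.
final class PostfixEvaluator {

    // Shunting yard: numbers go straight to the output, operators wait on a stack.
    static List<String> toPostfix(String infix) {
        List<String> output = new ArrayList<>();
        Deque<Character> operators = new ArrayDeque<>();
        int i = 0;
        while (i < infix.length()) {
            char c = infix.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < infix.length() && Character.isDigit(infix.charAt(i))) {
                    i++;
                }
                output.add(infix.substring(start, i));
            } else if (c == '+' || c == '-') {
                // Equal precedence, left-associative: flush pending operators first.
                while (!operators.isEmpty() && operators.peek() != '(') {
                    output.add(String.valueOf(operators.pop()));
                }
                operators.push(c);
                i++;
            } else if (c == '(') {
                operators.push(c);
                i++;
            } else if (c == ')') {
                while (!operators.isEmpty() && operators.peek() != '(') {
                    output.add(String.valueOf(operators.pop()));
                }
                if (operators.isEmpty()) {
                    throw new IllegalArgumentException("Unbalanced ')'");
                }
                operators.pop(); // discard the matching '('
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected '" + c + "'");
            }
        }
        while (!operators.isEmpty()) {
            char op = operators.pop();
            if (op == '(') {
                throw new IllegalArgumentException("Unbalanced '('");
            }
            output.add(String.valueOf(op));
        }
        return output;
    }

    // Postfix evaluation: push numbers, pop two operands per operator.
    static int evaluatePostfix(List<String> postfix) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : postfix) {
            if (token.equals("+") || token.equals("-")) {
                int right = stack.pop();
                int left = stack.pop();
                stack.push(token.equals("+") ? left + right : left - right);
            } else {
                stack.push(Integer.parseInt(token));
            }
        }
        return stack.pop();
    }
}
```

For (1 - (2 + 3)), toPostfix yields 1 2 3 + -, which evaluatePostfix reduces to -4. Each character and each token is handled once, which is where the O(n) behaviour comes from.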

Someplace answered 1/11, 2010 at 21:51 Comment(0)
1

I would suggest taking an approach that more closely resembles the one described in this old but (in my opinion) relevant series of articles on compiler design. I found the approach of using small functions/methods that each parse one part of the expression to be highly effective.

This approach allows you to decompose your parsing method into many sub-methods whose names and order of execution closely follow the EBNF you might use to describe the expressions to be parsed.

Kempe answered 1/11, 2010 at 21:15 Comment(0)
-2

Perhaps create regular expressions for expression and operator and then use matching to identify and break out your content.
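As a hedged illustration of where matching can help: a regular expression cannot recognize arbitrarily nested parentheses, but it can split the input into tokens that a recursive or stack-based evaluator then consumes. The sketch below only does that tokenizing step. RegexTokenizer and tokenize are hypothetical names, and non-negative integer operands are assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch: tokenizing only, no evaluation.
final class RegexTokenizer {
    // Optional leading whitespace, then a number or a single symbol.
    private static final Pattern TOKEN = Pattern.compile("\\s*(\\d+|[()+-])");

    static List<String> tokenize(String input) {
        String text = input.trim();
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) { // no token starts at this position
                throw new IllegalArgumentException("Unexpected input at " + pos);
            }
            tokens.add(m.group(1));
            pos = m.end();
        }
        return tokens;
    }
}
```

Calling RegexTokenizer.tokenize("(12 + (3-4))") returns the nine tokens (, 12, +, (, 3, -, 4, ), ) in that order. Deciding which parenthesis matches which is still left to the evaluator.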

Lecia answered 1/11, 2010 at 21:23 Comment(2)
You can't create a regular expression for expression, as it involves well-balanced parentheses. - Flooring
This is not a regular language; it is context-free and so cannot be parsed by regular expressions. - Someplace
