How to build a parse tree of a mathematical expression?
Asked Answered
P

4

11

I'm learning how to write tokenizers, parsers and as an exercise I'm writing a calculator in JavaScript.

I'm using a prase tree approach (I hope I got this term right) to build my calculator. I'm building a tree of tokens based on operator precedence.

For example, given an expression a*b+c*(d*(g-f)) the correct tree would be:

         +
        / \
       *   \
      / \   \
     a   b   \
              *
             / \
            c   *
               / \
              d   -
                 / \
                g   f

Once I've the tree, I can just traverse it down and apply the operations at each root node on the left and right nodes recursively to find the value of the expression.

However, the biggest problem is actually building this tree. I just can't figure out how to do it correctly. I can't just split on operators +, -, / and * and create the tree from the left and right parts because of precedence.

What I've done so far is tokenize the expression. So given a*b+c*(d*(g-f)), I end up with an array of tokens:

[a, Operator*, b, Operator+, c, Operator*, OpenParen, d, Operator*, OpenParen, g, Operator-, f, CloseParen, CloseParen]

However I can't figure out the next step about how to go from this array of tokens to a tree that I can traverse and figure out the value. Can anyone help me with ideas about how to do that?

Press answered 1/7, 2014 at 23:45 Comment(7)
Have a look at the Shunting-yard AlgorithmLoire
Or a parser generator.Rusch
@Rusch can you explain what to look for? i can't understand what a parser generator is... there are so many different pages about this term.Press
Look up yacc, or antlr. These are tools that eat a language grammar and generate the code; you fill in the code to create the individual nodes.Rusch
This may be useful: 1052470/javascript-parser-for-simple-expressionVachel
Parser generators are used because writing good parsing code by hand can be hard an tedious. Parser generators take a formal language spec and output the code. They are good for parsers of computer languages. Maths equations are a bit of a special case, the way they are structured is different to most languages and can be tricky to get right using yacc or bison. Often the parser for these is handcoded using the en.wikipedia.org/wiki/Shunting_yard_algorithm. BTW I've written my own maths parsing library at singsurf.org/djep/GWTJep.phpQualm
See my SO answer on how to build a recursive descent parser (for lots of things, including expressions). That answer links to another that describes how to build a tree as you parse. See #2246462Bloodless
D
7

Right mate, I dont know how to make this look pretty

I wrote a similar program in C but my tree is upside down, meaning new operators become the root.

A calculator parse tree in C code, but read the readme

ex: input 2 + 3 - 4

Start with empty node

{}

Rule 1: when you read a number, append a child either left or right of the current node, whichever one is empty

      {}
     /
   {2}

Rule 2: then you read an operator, you have to climb starting at the current node {2} to an empty node, when you find one, change its value to +, if there are no empty nodes, you must create one then make it the root of the tree

      {+}
     /
   {2}

you encounter another number, go to rule 1, we are currently at {+} find an side that is empty (this time right)

      {+}
     /   \
   {2}   {3}

Now we have new operand '-' but because the parent of {3} is full, you must create new node and make it the root of everything

        {-}
        /
      {+}
     /   \
   {2}   {3}

oh look at that, another number, because we are currently pointing at the root, lets find a child of {-} that is empty, (hint the right side)

         {-}
        /   \
      {+}   {4}
     /   \
   {2}   {3}

Once a tree is built, take a look at the function getresult() to calculate everything

You are probably wondering how Parenthesization works. I did it this way:

I create brand new tree every time I encounter a '(' and build the rest of the input into that tree. If I read another '(', I create another one and continue build with the new tree.

Once input is read, I attach all the trees's roots to one another to make one final tree. Check out the code and readme, I have to draw to explain everything.

Hope this helps future readers as well

Doralyn answered 11/3, 2017 at 19:13 Comment(2)
It looks like the GitHub link is dead, maybe you deleted or renamed the repo?Glossotomy
I think it's github.com/scarface382/math-parser nowTrojan
S
0

I've seen this question more often so I just wrote this away. This should definitely push you in the right direction but be careful, this is a top-down parser so it may not interpret the expression with the precedence you would expect.

class Program
{
    static void Main()
    {
        Console.WriteLine(SimpleExpressionParser.Parse("(10+30*2)/20").ToString());
        Console.ReadLine();
        //
        // ouput: ((10+30)*2)/20
    }

    public static class SimpleExpressionParser
    {
        public static SimpleExpression Parse(string str)
        {
            if (str == null)
            {
                throw new ArgumentNullException("str");
            }

            int index = 0;

            return InternalParse(str, ref index, 0);
        }

        private static SimpleExpression InternalParse(string str, ref int index, int level)
        {
            State state = State.ExpectLeft;
            SimpleExpression _expression = new SimpleExpression();

            int startIndex = index;
            int length = str.Length;

            while (index < length)
            {
                char chr = str[index];

                if (chr == ')' && level != 0)
                {
                    break;
                }

                switch (state)
                {
                    case State.ExpectLeft:
                    case State.ExpectRight:
                        {
                            SimpleExpression expression = null;

                            if (Char.IsDigit(chr))
                            {
                                int findRep = FindRep(Char.IsDigit, str, index + 1, length - 1);

                                expression = new SimpleExpression(int.Parse(str.Substring(index, findRep + 1)));
                                index += findRep;
                            }
                            else if (chr == '(')
                            {
                                index++;
                                expression = InternalParse(str, ref index, level + 1);
                            }

                            if (expression == null)
                            {
                                throw new Exception(String.Format("Expression expected at index {0}", index));
                            }

                            if (state == State.ExpectLeft)
                            {
                                _expression.Left = expression;
                                state = State.ExpectOperator;
                            }
                            else
                            {
                                _expression.Right = expression;
                                state = State.ExpectFarOperator;
                            }
                        }
                        break;

                    case State.ExpectOperator:
                    case State.ExpectFarOperator:
                        {
                            SimpleExpressionOperator op;

                            switch (chr)
                            {
                                case '+': op = SimpleExpressionOperator.Add; break;
                                case '-': op = SimpleExpressionOperator.Subtract; break;
                                case '*': op = SimpleExpressionOperator.Multiply; break;
                                case '/': op = SimpleExpressionOperator.Divide; break;

                                default:
                                    throw new Exception(String.Format("Invalid operator encountered at index {0}", index));
                            }

                            if (state == State.ExpectOperator)
                            {
                                _expression.Operator = op;
                                state = State.ExpectRight;
                            }
                            else
                            {
                                index++;

                                return new SimpleExpression(op, _expression, InternalParse(str, ref index, level));
                            }
                        }
                        break;
                }

                index++;
            }

            if (state == State.ExpectLeft || state == State.ExpectRight)
            {
                throw new Exception("Could not complete expression");
            }

            return _expression;
        }

        private static int FindRep(Func<char, bool> validator, string str, int beginPos, int endPos)
        {
            int pos;

            for (pos = beginPos; pos <= endPos; pos++)
            {
                if (!validator.Invoke(str[pos]))
                {
                    break;
                }
            }

            return pos - beginPos;
        }

        private enum State
        {
            ExpectLeft,
            ExpectRight,
            ExpectOperator,
            ExpectFarOperator
        }
    }

    public enum SimpleExpressionOperator
    {
        Add,
        Subtract,
        Multiply,
        Divide
    }

    public class SimpleExpression
    {
        private static Dictionary<SimpleExpressionOperator, char> opEquivs = new Dictionary<SimpleExpressionOperator, char>()
        {
            { SimpleExpressionOperator.Add, '+' },
            { SimpleExpressionOperator.Subtract, '-' },
            { SimpleExpressionOperator.Multiply, '*' },
            { SimpleExpressionOperator.Divide, '/' }
        };

        public SimpleExpression() { }

        public SimpleExpression(int literal)
        {
            Literal = literal;
            IsLiteral = true;
        }

        public SimpleExpression(SimpleExpressionOperator op, SimpleExpression left, SimpleExpression right)
        {
            Operator = op;
            Left = left;
            Right = right;
        }

        public bool IsLiteral
        {
            get;
            set;
        }

        public int Literal
        {
            get;
            set;
        }

        public SimpleExpressionOperator Operator
        {
            get;
            set;
        }

        public SimpleExpression Left
        {
            get;
            set;
        }

        public SimpleExpression Right
        {
            get;
            set;
        }

        public override string ToString()
        {
            StringBuilder sb = new StringBuilder();

            AppendExpression(sb, this, 0);

            return sb.ToString();
        }

        private static void AppendExpression(StringBuilder sb, SimpleExpression expression, int level)
        {
            bool enclose = (level != 0 && !expression.IsLiteral && expression.Right != null);

            if (enclose)
            {
                sb.Append('(');
            }

            if (expression.IsLiteral)
            {
                sb.Append(expression.Literal);
            }
            else
            {
                if (expression.Left == null)
                {
                    throw new Exception("Invalid expression encountered");
                }

                AppendExpression(sb, expression.Left, level + 1);

                if (expression.Right != null)
                {
                    sb.Append(opEquivs[expression.Operator]);
                    AppendExpression(sb, expression.Right, level + 1);
                }
            }

            if (enclose)
            {
                sb.Append(')');
            }
        }
    }
}
Smollett answered 2/7, 2014 at 12:35 Comment(1)
question is about javascript, not c#Bradleigh
K
0

You may be interested in math.js, a math library for JavaScript which contains an advanced expression parser (see docs and examples. You can simply parse an expression like:

var node = math.parse('a*b+c*(d*(g-f))');

which returns a node tree (which can be compiled and evaluated).

You can study the parsers code here: https://github.com/josdejong/mathjs/blob/master/src/expression/parse.js

You can find a few examples of simpler expression parsers here (for C++ and Java): http://www.speqmath.com/tutorials/, that can be useful to study the basics of an expression parser.

Kumquat answered 2/7, 2014 at 20:0 Comment(0)
B
0

See https://github.com/carlos-chaguendo/arboles-binarios This algorithm converts an algebraic expression to a binary tree. Converts the expression to postfix and then evaluates and draws the tree


test: 2+3*4+2^3

output:

    22 =
        └── +
            ├── +
            │   ├── 2
            │   └── *
            │       ├── 3
            │       └── 4
            └── ^
                ├── 2
                └── 3
Bertrand answered 27/1, 2017 at 22:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.