Shunting yard algorithm with functions
Asked Answered
U

1

6

my first post here :)

I saw that there are a lot of questions about the Shunting yard algorithm but I hope there still are forum members that interested to help me with yet another question about this algorithm.

I did search trough the other posts to see if my answer is already answered and i did some research on other forums and the internet to:

https://mcmap.net/q/1776637/-modifying-the-shunting-yard-algorithm-c
http://www.computerhope.com/forum/index.php?topic=146535.0
http://en.wikipedia.org/wiki/Shunting-yard_algorithm
http://www.autoitscript.com/forum/topic/164627-shunting-yard-with-functions/

My code is written in vb-script because i like its simplicity and i don't know java or c like languages..

my question:

at the moment the algorithm allows wrong usage of "(" and ")" example: function((10,20,)30) is allowed but it is clearly not the right way to call a function..

also i'm not sure if my code is written correctly, the pseudo-code from Wikipedia was my reference but its not very clear :(

i'm also planning to extend it with if-else statements and nested loops and stuff because the main goal is to write some sort of interpreter in a vb like language as a learning project :)

my code [edit]:

SET PRECEDENCE = CREATEOBJECT("SCRIPTING.DICTIONARY")
WITH PRECEDENCE
    .ADD "^",3
    .ADD "*",2
    .ADD "/",2
    .ADD "%",2
    .ADD "+",1
    .ADD "-",1
    .ADD "FUNCTION",0
    .ADD "(",0
    .ADD ",",0
    .ADD ")",0
END WITH

'#############################################################################
tokenArray = split("FUNCTION ( ( A , B ) , C )")
msgbox SHUNTINGYARD(tokenArray)
'#############################################################################

FUNCTION SHUNTINGYARD(INPUT)

    TOKEN_QUEUE = ARRAY()
    TOKEN_STACK = ARRAY()

    FOR TOKEN_NUMBER = 0 TO UBOUND(INPUT)
        SELECT CASE INPUT(TOKEN_NUMBER)

            CASE "("
                CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_STACK)

            CASE ")"
                DO WHILE NOT( PRECEDENCE( PEEK(TOKEN_STACK) ) = 0 )
                    CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    IF STACKISEMPTY(TOKEN_STACK) THEN CALL ERRORS("Can't find a matching ""("".", TRUE)
                LOOP

                IF PEEK(TOKEN_STACK) = "FUNCTION" THEN
                    DISCARD = POP(TOKEN_STACK)
                    CALL PUSH("@", TOKEN_QUEUE)
                ELSE
                    DISCARD = POP(TOKEN_STACK)
                END IF

            CASE ","
                DO WHILE NOT( PRECEDENCE( PEEK(TOKEN_STACK) ) = 0 )
                    CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    IF STACKISEMPTY(TOKEN_STACK) THEN CALL ERRORS("Can't find a matching function ""("".", TRUE)
                LOOP

            CASE "+","-","*","/","^","%"
                TOKEN_A = INPUT(TOKEN_NUMBER)
                DO WHILE ISOPERATOR(PEEK(TOKEN_STACK))
                    TOKEN_B = PEEK(TOKEN_STACK)
                    IF (ASSOCIATIVITY(TOKEN_B) = "left" AND PRECEDENCE(TOKEN_A) = PRECEDENCE(TOKEN_B)) OR (PRECEDENCE(TOKEN_A) < PRECEDENCE(TOKEN_B)) THEN
                        CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    ELSE
                        EXIT DO
                    END IF
                LOOP
                CALL PUSH(TOKEN_A, TOKEN_STACK)

            CASE ELSE
                CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_QUEUE)

        END SELECT
    NEXT

    FOR ITEMCOUNT = 0 TO UBOUND(TOKEN_STACK)
        IF PEEK(TOKEN_STACK) = "(" THEN CALL ERRORS("Can't find a matching "")"".", TRUE)'(
        CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
    NEXT

    SHUNTINGYARD = JOIN(TOKEN_QUEUE,"|")

END FUNCTION

'#############################################################################

FUNCTION ASSOCIATIVITY(ASSOC)
    SELECT CASE LCASE(ASSOC)
        CASE "^","\"
            ASSOCIATIVITY = "right"
        CASE ELSE
            ASSOCIATIVITY = "left"
    END SELECT
END FUNCTION

FUNCTION ISOPERATOR(ITEM)
    ISOPERATOR = LEN(ITEM) = 1 AND INSTR("+-*/%^",ITEM)
END FUNCTION

SUB PUSH(ITEM,BYREF STACK)
    IF UBOUND(STACK) > -1 THEN
        REDIM PRESERVE STACK(UBOUND(STACK) + 1)
        STACK(UBOUND(STACK)) = ITEM
    ELSE
        STACK = ARRAY(ITEM)
    END IF
END SUB

FUNCTION POP(BYREF STACK)
    IF UBOUND(STACK) > -1 THEN
        POP = STACK(UBOUND(STACK))
        REDIM PRESERVE STACK(UBOUND(STACK) - 1)
    END IF
END FUNCTION

FUNCTION STACKISEMPTY(STACK)
    IF UBOUND(STACK) > -1 THEN
        STACKISEMPTY = FALSE
    ELSE
        STACKISEMPTY = TRUE
    END IF
END FUNCTION

FUNCTION PEEK(STACK)
    IF UBOUND(STACK) > -1 THEN
        PEEK = STACK(UBOUND(STACK))
    END IF
END FUNCTION
Unfit answered 30/9, 2014 at 19:37 Comment(1)
for the people that are following this topic: #28594487Unfit
F
7

You can treat the "FUN", "(", ")", "," similar to other operators and can be pushed onto the TOKEN_STACK. (I've shorted FUNCTION to FUN for succinctness). "FUN","(", ")", "," have a precedence lower precedence than "+" so out precedence table looks like

^                    4
*  /  %              3
+  -                 2
( ) ,   FUN          1

Consider what happens when 1+FUN(2*3) is parsed

Remaining Input     Stack             Output
1+FUN(2*3)                          
+FUN(2*3)                           1
FUN(2*3)          +                 1
(2*3)             +  FUN            1
2*3)              +  FUN  (         1
*3)               +  FUN  (         1  2
3)                +  FUN  (  *      1  2
)                 +  FUN  (  *      1  2  3
                  +  FUN  (  *  )   1  2  3          Note 1
                  +  FUN  ( )       1  2  3  *       Note 2
                  +                 1  2  3  * FUN() 
                                    1  2  3  * FUN() +

The output token "FUN()" stands for the function evaluation.

Note 1: when we try to push ")" onto the stack all operators with higher precedence are taken off the stack and moved to the output.

Note 2: When the item at the end of the stack is the matching "(", then that is removed from the stack. There are two cases to handle, simple matching of parenthesis as with 1+(2*3) or if there is a function token before the "(" on the stack. In that case pop the FUN from the stack and added a function evaluation token to the output.

"," would be treated in a similar way to ")" causing operators with higher precedence to be popped. But it would not cause the function evaluation to be output. You would need some way to record how many arguments a function has.

In my code I uses a recursion based algorithm. The parsing algorithm can be called recursively and is given a stop-token. When the stop-token is encountered it backs out of the recursion. When a function is encountered it calls the recursion waiting for either "," or ")".


I manage to get iut running using the cscript command line program. I also added some debugging code with Wscript.Echo.

Looking at the TOKEN_STACK you did not ever add the FUNCTION Token to the stack, so it was not there when you searched for it.

Adding CASE "FUNCTION" CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_STACK)

seems to give the right thing. Although I'm not sure whats supposed to happen when you match the first closing bracket. Its also going badly wrong with input like "E + FUNCTION ( ( A , B ) , C ) + F".

SET PRECEDENCE = CREATEOBJECT("SCRIPTING.DICTIONARY")
WITH PRECEDENCE
    .ADD "^",3
    .ADD "*",2
    .ADD "/",2
    .ADD "%",2
    .ADD "+",1
    .ADD "-",1
    .ADD "FUNCTION",0
    .ADD "(",0
    .ADD ",",0
    .ADD ")",0
END WITH

Wscript.Echo "Start"

'#############################################################################
tokenArray = split("FUNCTION ( ( A , B ) , C )")
'#tokenArray = split("A + B * C")


Wscript.Echo "Result " + SHUNTINGYARD(tokenArray)
'#############################################################################

FUNCTION SHUNTINGYARD(INPUT)

    TOKEN_QUEUE = ARRAY()
    TOKEN_STACK = ARRAY()

    FOR TOKEN_NUMBER = 0 TO UBOUND(INPUT)
    Wscript.Echo "Token " + INPUT(TOKEN_NUMBER)
    Wscript.Echo "Stack " + JOIN(TOKEN_STACK,"|")

        SELECT CASE INPUT(TOKEN_NUMBER)

            CASE "("
                CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_STACK)

            CASE ")"

                DO WHILE NOT( PRECEDENCE( PEEK(TOKEN_STACK) ) = 0 )

                    CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    IF STACKISEMPTY(TOKEN_STACK) THEN CALL ERRORS("Can't find a matching ""("".", TRUE)
                LOOP

                IF PEEK(TOKEN_STACK) = "FUNCTION" THEN
                    DISCARD = POP(TOKEN_STACK)
                    CALL PUSH("@", TOKEN_QUEUE)
                ELSE
                    DISCARD = POP(TOKEN_STACK)
                END IF

            CASE ","
                DO WHILE NOT( PRECEDENCE( PEEK(TOKEN_STACK) ) = 0 )
                    CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    IF STACKISEMPTY(TOKEN_STACK) THEN CALL ERRORS("Can't find a matching function ""("".", TRUE)
                LOOP

            CASE "+","-","*","/","^","%"
                TOKEN_A = INPUT(TOKEN_NUMBER)
                DO WHILE ISOPERATOR(PEEK(TOKEN_STACK))
                    TOKEN_B = PEEK(TOKEN_STACK)
                    IF (ASSOCIATIVITY(TOKEN_B) = "left" AND PRECEDENCE(TOKEN_A) = PRECEDENCE(TOKEN_B)) OR (PRECEDENCE(TOKEN_A) < PRECEDENCE(TOKEN_B)) THEN
                        CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
                    ELSE
                        EXIT DO
                    END IF
                LOOP
                CALL PUSH(TOKEN_A, TOKEN_STACK)

            CASE "FUNCTION"
                CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_STACK)

            CASE ELSE
                CALL PUSH(INPUT(TOKEN_NUMBER), TOKEN_QUEUE)

        END SELECT
    NEXT

    FOR ITEMCOUNT = 0 TO UBOUND(TOKEN_STACK)
        IF PEEK(TOKEN_STACK) = "(" THEN CALL ERRORS("Can't find a matching "")"".", TRUE)'(
        CALL PUSH(POP(TOKEN_STACK), TOKEN_QUEUE)
    NEXT

    SHUNTINGYARD = JOIN(TOKEN_QUEUE,"|")

END FUNCTION

'#############################################################################

FUNCTION ASSOCIATIVITY(ASSOC)
    SELECT CASE LCASE(ASSOC)
        CASE "^","\"
            ASSOCIATIVITY = "right"
        CASE ELSE
            ASSOCIATIVITY = "left"
    END SELECT
END FUNCTION

FUNCTION ISOPERATOR(ITEM)
    ISOPERATOR = LEN(ITEM) = 1 AND INSTR("+-*/%^",ITEM)
END FUNCTION

SUB PUSH(ITEM,BYREF STACK)
    IF UBOUND(STACK) > -1 THEN
        REDIM PRESERVE STACK(UBOUND(STACK) + 1)
        STACK(UBOUND(STACK)) = ITEM
    ELSE
        STACK = ARRAY(ITEM)
    END IF
END SUB

FUNCTION POP(BYREF STACK)
    IF UBOUND(STACK) > -1 THEN
        POP = STACK(UBOUND(STACK))
        REDIM PRESERVE STACK(UBOUND(STACK) - 1)
    END IF
END FUNCTION

FUNCTION STACKISEMPTY(STACK)
    IF UBOUND(STACK) > -1 THEN
        STACKISEMPTY = FALSE
    ELSE
        STACKISEMPTY = TRUE
    END IF
END FUNCTION

FUNCTION PEEK(STACK)
    IF UBOUND(STACK) > -1 THEN
        PEEK = STACK(UBOUND(STACK))
    END IF
END FUNCTION
Fleawort answered 2/10, 2014 at 8:14 Comment(10)
Thanks for the helpful reply Salix alba! if i find the time i will modify my code and post it here :)Unfit
Ok, Found some time^^. its been a while (apologies). I still have some troubles with the code, the function token isn't replaced by the function evaluation token and i don't know why, also i'm not sure if the code is written the right way: (trying to figure out how to repost my code s: )Unfit
Yes I'll try and help you with this. However VBscript is not a language I'm at all familiar with. Could you provide a complete working example and pointer to how I can get it running. I've got IE8 and VB express edition 2008.Fleawort
Sure and thanks! if you have windows the just make a file with .vbs as the extension with my script as the content, also the autoit version can be found on the autoit forum. if you want to test the vbscript just double click on it :) if you use mac or linux its more difficult, then you need some windows emulator.. thanks for your help! if its to hard to do it in vbscript then vb2008 or java are fine to :)Unfit
I managed to get it running using cscript command line program. I've updated my answer which highlights one problem with your code- the FUNCTION token was not being pushed onto the stack.Fleawort
"Although I'm not sure whats supposed to happen when you match the first closing bracket. Its also going badly wrong with input like "E + FUNCTION ( ( A , B ) , C ) + F""Unfit
hmmm, i see. that can be a big problem.. don't know how to solve this. maybe i have to split the functions from the input and replace them with something like "@"? it's not easy this one :(Unfit
My knowledge about modifying the Shunting Yard is not that great, i'm sorry. The reason I posted it is because it works fine with numbers and stuff but from the moment I want to add functions its one problem after the other.. How do programmers that make script languages do this? do they also use this algorithm? Can't find much about this algorithm if i google "how to make an interpreter"..Unfit
It might be easier to try with single argument functions first. Having to count the number of arguments adds another layer of complication. I'd really try doing things on paper first, after adding each token what should the state of the output queue. For example if you want to evaluate "A+SIN(B)" the final TOKEN_QUEUE output will be "A | B | sin | +"Fleawort
I think its best to split every line of code to functions and expressions, then i can split the parameters of each function by "," and count them..Unfit

© 2022 - 2024 — McMap. All rights reserved.