Why is there such a difference between parse tree visualization and my visitor/listener traversal?
Asked Answered
R

1

11

I've created this sample grammar using the ANTLR4 plugin in IntelliJ and when I use its tool chain to generate a visual representation for some invalid content (in this case an empty string), this representation seems to differ from what I'm able to get when doing actual parse tree traversal using a sample visitor/listener implementation for same input.

This is the grammar:

grammar TestParser;

THIS            : 'this';

Identifier
    : [a-zA-Z0-9]+
    ;

WS  :  [ \t\r\n\u000C]+ -> skip;

parseExpression:
    expression EOF
;

expression
    : expression bop='.' (Identifier | THIS ) #DottedExpression
    | primary                                 #PrimaryExpression
    ;

primary
        : THIS                                #This
        | Identifier                          #PrimaryIdentifier
        ;

For an empty string, I get the following tree:

parse tree

This tree indicates that the parser built a parse tree that contains "DottedExpression" and "primary:This" (assuming it uses its own visitor/listener implementation to do this). Yet when I attempt the same using the following code:

package org.example.so;

import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTreeWalker;

public class TestParser {
    public static void main(String[] args) {
        String input = "";
        TestParserLexer lexer = new TestParserLexer(CharStreams.fromString(input));
        CommonTokenStream tokenStream = new CommonTokenStream(lexer);
        TestParserParser parser = new TestParserParser(tokenStream);

        TestParserParser.ParseExpressionContext parseExpressionContext = parser.parseExpression();

        MyVisitor visitor = new MyVisitor();
        visitor.visit(parseExpressionContext);

        System.out.println("----------------");

        ParseTreeWalker walker = new ParseTreeWalker();
        walker.walk(new MyListener(), parseExpressionContext);

        System.out.println("----------------");
    }

    private static class MyVisitor extends TestParserBaseVisitor {

        @Override
        public Object visitParseExpression(TestParserParser.ParseExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitParseExpression(ctx);
        }

        @Override
        public Object visitDottedExpression(TestParserParser.DottedExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":DottedExpression");
            return super.visitDottedExpression(ctx);
        }

        @Override
        public Object visitPrimaryExpression(TestParserParser.PrimaryExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":PrimaryExpression");
            return super.visitPrimaryExpression(ctx);
        }

        @Override
        public Object visitThis(TestParserParser.ThisContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitThis(ctx);
        }

        @Override
        public Object visitPrimaryIdentifier(TestParserParser.PrimaryIdentifierContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
            return super.visitPrimaryIdentifier(ctx);
        }
    }

    private static  class MyListener extends TestParserBaseListener {

        @Override
        public void enterParseExpression(TestParserParser.ParseExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }

        @Override
        public void enterDottedExpression(TestParserParser.DottedExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":DottedExpression");
        }

        @Override
        public void enterPrimaryExpression(TestParserParser.PrimaryExpressionContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()] + ":PrimaryExpression");
        }

        @Override
        public void enterThis(TestParserParser.ThisContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }

        @Override
        public void enterPrimaryIdentifier(TestParserParser.PrimaryIdentifierContext ctx) {
            System.out.println(TestParserParser.ruleNames[ctx.getRuleIndex()]);
        }
    }
}

I get the following output:

line 1:0 mismatched input '<EOF>' expecting {'this', Identifier}
parseExpression
expression:PrimaryExpression
----------------
parseExpression
expression:PrimaryExpression
----------------

So, not only the tree depths do not match, the output even indicates a different rule was matched second ("PrimaryExpression" instead of "DottedExpression").

Why is there such a difference between what I'm shown and what I attempt to show? How do I create the same tree representation as shown by the plugin?

Using ANTLR version 4.7. Plugin version is 1.8.4.

Rosary answered 29/8, 2017 at 17:32 Comment(2)
Took a quick look at the ANTLR source. My guess is that the graphical parse tree is being created with the parser in a different prediction mode than you're getting when you run the parser yourself.Sharondasharos
Incidentally, this appears to be the code that's generating the parse tree that's rendered graphically. Maybe you can look for clues in how the parser is set up here: github.com/antlr/antlr4/blob/…Sharondasharos
I
3

This issue was fixed at version 1.8.2 of the plugin. In case that you have version 1.8.2 or later, then you probably discovered an other unknown subcase of the issue.

However (based on the issue that I am referring to), trees differ only when the parsing is resulting to an error. So if you are not interested on using the errors' information, you should be fine.

Indehiscent answered 5/9, 2017 at 14:34 Comment(4)
I'm using a newer version (1.8.4). Do you happen to have the link to the issue you mentioned? I do need to know how the parser behaves with erroneous input.Rosary
I don't have the link, but it's somewhere on the github page github.com/antlr/intellij-plugin-v4Indehiscent
Probably this.Rosary
I don't think this is the same issue (if at all one). They mention a difference between TestRig tree and the tree shown by the plugin. When I try this grammar in TestRig the tree seems to match the of the plugin, but not exactly since it does not show error nodes ((parseExpression (expression primary) <EOF>)). So I now apparently have two tools telling me that my parse tree should be different.Rosary

© 2022 - 2024 — McMap. All rights reserved.