How do I pretty-print productions and line numbers, using ANTLR4?
Asked Answered
L

1

8

I'm trying to write a piece of code that will take an ANTLR4 parser and use it to generate ASTs for inputs similar to the ones given by the -tree option on grun (misc.TestRig). However, I'd additionally like for the output to include all the line number/offset information.

For example, instead of printing

(add (int 5) '+' (int 6))

I'd like to get

(add (int 5 [line 3, offset 6:7]) '+' (int 6 [line 3, offset 8:9]) [line 3, offset 5:10])

Or something similar.

There aren't a tremendous number of visitor examples for ANTLR4 yet, but I am pretty sure I can do most of this by copying the default implementation for toStringTree (used by grun). However, I do not see any information about the line numbers or offsets.

I expected to be able to write super simple code like this:

String visit(ParseTree t) {
    return "(" + t.productionName + t.visitChildren() + t.lineNumber + ")";
}

but it doesn't seem to be this simple. I'm guessing I should be able to get line number information from the parser, but I haven't figured out how to do so. How can I grab this line number/offset information in my traversal?


To fill in the few blanks in the solution below, I used:

List<String> ruleNames = Arrays.asList(parser.getRuleNames());
parser.setBuildParseTree(true);
ParserRuleContext prc = parser.program();
ParseTree tree = prc;

to get the tree and the ruleNames. program is the name for the top production in my grammar.

Litotes answered 13/10, 2013 at 21:52 Comment(2)
There are 2 toStringTree methods. One takes a Parser instance, but the other just takes a List<String> of rule names.Ostensorium
@280Z28: You state a true fact. Calling toStringTree with a parser argument causes the implementation to grab the list of rules (recog.getRuleNames()) and pass it to the toStringTree that takes a List. At any rate, this still doesn't explain how to get at the line number/offset information while writing a visitor.Litotes
O
13

The Trees.toStringTree method can be implemented using a ParseTreeListener. The following listener produces exactly the same output as Trees.toStringTree.

public class TreePrinterListener implements ParseTreeListener {
    private final List<String> ruleNames;
    private final StringBuilder builder = new StringBuilder();

    public TreePrinterListener(Parser parser) {
        this.ruleNames = Arrays.asList(parser.getRuleNames());
    }

    public TreePrinterListener(List<String> ruleNames) {
        this.ruleNames = ruleNames;
    }

    @Override
    public void visitTerminal(TerminalNode node) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
    }

    @Override
    public void visitErrorNode(ErrorNode node) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        builder.append(Utils.escapeWhitespace(Trees.getNodeText(node, ruleNames), false));
    }

    @Override
    public void enterEveryRule(ParserRuleContext ctx) {
        if (builder.length() > 0) {
            builder.append(' ');
        }

        if (ctx.getChildCount() > 0) {
            builder.append('(');
        }

        int ruleIndex = ctx.getRuleIndex();
        String ruleName;
        if (ruleIndex >= 0 && ruleIndex < ruleNames.size()) {
            ruleName = ruleNames.get(ruleIndex);
        }
        else {
            ruleName = Integer.toString(ruleIndex);
        }

        builder.append(ruleName);
    }

    @Override
    public void exitEveryRule(ParserRuleContext ctx) {
        if (ctx.getChildCount() > 0) {
            builder.append(')');
        }
    }

    @Override
    public String toString() {
        return builder.toString();
    }
}

The class can be used as follows:

List<String> ruleNames = ...;
ParseTree tree = ...;

TreePrinterListener listener = new TreePrinterListener(ruleNames);
ParseTreeWalker.DEFAULT.walk(listener, tree);
String formatted = listener.toString();

The class can be modified to produce the information in your output by updating the exitEveryRule method:

@Override
public void exitEveryRule(ParserRuleContext ctx) {
    if (ctx.getChildCount() > 0) {
        Token positionToken = ctx.getStart();
        if (positionToken != null) {
            builder.append(" [line ");
            builder.append(positionToken.getLine());
            builder.append(", offset ");
            builder.append(positionToken.getStartIndex());
            builder.append(':');
            builder.append(positionToken.getStopIndex());
            builder.append("])");
        }
        else {
            builder.append(')');
        }
    }
}
Ostensorium answered 14/10, 2013 at 3:36 Comment(1)
This worked great. I'm updating the question to fill in the few blanks.Litotes

© 2022 - 2024 — McMap. All rights reserved.