cannot create implicit token for string literal in non-combined grammar

Asked 30/10, 2014 at 9:32 Answered 27/3, 2016 at 15:14

I found a nice grammar for a calculator and copied it, with a few small changes, from here: https://dexvis.wordpress.com/2012/11/22/a-tale-of-two-grammars/

I have two files, a parser and a lexer. The parser looks like this:

parser grammar Parser;

options{
    language = Java;
    tokenVocab = Lexer;
}

// PARSER
program : ((assignment|expression) ';')+;

assignment : ID '=' expression;

expression
    : '(' expression ')'                # parenExpression
    | expression ('*'|'/') expression   # multOrDiv
    | expression ('+'|'-') expression   # addOrSubtract
    | 'print' arg (',' arg)*            # print
    | STRING                            # string
    | ID                                # identifier
    | INT                               # integer;

arg : ID|STRING;

And the lexer:

lexer grammar WRBLexer;

STRING : '"' (' '..'~')* '"';
ID     : ('a'..'z'|'A'..'Z')+;
INT    : '0'..'9'+;
WS     : [ \t\n\r]+ -> skip ;

Basically, I just split the lexer and the parser into two files. But when I try to save, I get these errors:

error(126): Parser.g4:9:35: cannot create implicit token for string literal in non-combined grammar: ';'
error(126): Parser.g4:11:16: cannot create implicit token for string literal in non-combined grammar: '='
error(126): Parser.g4:2:13: cannot create implicit token for string literal in non-combined grammar: '('
error(126): Parser.g4:2:28: cannot create implicit token for string literal in non-combined grammar: ')'
error(126): Parser.g4:3:10: cannot create implicit token for string literal in non-combined grammar: 'print'
error(126): Parser.g4:3:23: cannot create implicit token for string literal in non-combined grammar: ','
error(126): Parser.g4:9:37: cannot create implicit token for string literal in non-combined grammar: '*'
error(126): Parser.g4:9:41: cannot create implicit token for string literal in non-combined grammar: '/'
error(126): Parser.g4:10:47: cannot create implicit token for string literal in non-combined grammar: '+'
error(126): Parser.g4:10:51: cannot create implicit token for string literal in non-combined grammar: '-'
10 error(s)

Hope someone can help me with this.

Best regards

Graeae answered 30/10, 2014 at 9:32 Comment(0)

All literal tokens inside your parser grammar ('*', '/', etc.) need to be defined in your lexer grammar:

lexer grammar WRBLexer;

ADD : '+';
MUL : '*';
...

And then in your parser grammar, you'd do:

expression
    : ...
    | expression (MUL|DIV) expression   # multOrDiv
    | expression (ADD|SUB) expression   # addOrSubtract
    | ...
    ;

Derwin answered 30/10, 2014 at 10:31 Comment(4)

@abcdabcd987 no, not in parser grammars. – Derwin 27/3, 2016 at 22:4

I believe that you can use a literal in the parser if it matches a rule you have declared somewhere in the lexer (and if there's no ambiguity). Took me a while to figure out how this grammar github.com/antlr/grammars-v4/blob/master/csharp/CSharpParser.g4 could afford to use literals... – Noreen 21/3, 2017 at 7:43

Bart, perhaps it might be good to update your answer with the comment from @simon-mourier to make it complete? – Stumpage 2/10, 2021 at 10:21

At the time I wrote this answer, it was the case. Apparently this changed for newer versions of ANTLR4. In the original question, the error message clearly states the problem, so I'll leave my answer "as is" in case someone else uses an older version of ANTLR4 and encounters the same error. – Derwin 2/10, 2021 at 10:34
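
The comments above fit with how tokenVocab seems to work: as far as I can tell, the .tokens file that ANTLR writes for the lexer lists every token name and, for rules that consist of a single literal, the literal itself, both mapped to the same type number. A literal in a parser grammar can then only be resolved if it shows up in that file. Here is a rough sketch in plain Java (no ANTLR runtime) of that lookup. The class TokenVocabCheck, its methods and the sample file content are made up for illustration and are not part of ANTLR:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TokenVocabCheck {
    // Parse lines of the form NAME=3 or 'literal'=3 into a name -> type map.
    static Map<String, Integer> parse(String tokensFile) {
        Map<String, Integer> types = new LinkedHashMap<>();
        for (String line : tokensFile.split("\n")) {
            line = line.trim();
            int eq = line.lastIndexOf('='); // last '=' so that '='=2 still splits correctly
            if (eq <= 0) {
                continue; // blank or malformed line
            }
            types.put(line.substring(0, eq), Integer.parseInt(line.substring(eq + 1)));
        }
        return types;
    }

    // Return the parser literals that have no entry in the vocabulary.
    static List<String> missing(Map<String, Integer> vocab, List<String> literals) {
        List<String> result = new ArrayList<>();
        for (String literal : literals) {
            if (!vocab.containsKey(literal)) {
                result.add(literal);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical vocabulary for a lexer that defines ADD and MUL but no ';' token.
        String sample = "ADD=1\nMUL=2\nID=3\n'+'=1\n'*'=2\n";
        Map<String, Integer> vocab = parse(sample);
        // Literals the parser grammar would like to use.
        System.out.println(missing(vocab, List.of("'+'", "'*'", "';'")));
    }
}
```

In this sketch '+' and '*' resolve because the lexer defines them, while ';' is reported as missing, which is the situation the error message in the question describes.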

Since you split the grammar into two files, all of your symbols must be defined in the lexer file.

I suggest doing this:

In the lexer file:

PRINT  : 'print';   // must come before ID, otherwise 'print' is lexed as ID
STRING : '"' (' '..'~')* '"';
ID     : ('a'..'z'|'A'..'Z')+;
INT    : '0'..'9'+;
WS     : [ \t\n\r]+ -> skip ;
ADD_SUB: '+' | '-';
MUL_DIV: '*' | '/';
COMMA  : ',';
Lb     : '(';
Rb     : ')';
COLON  : ';';   // matches a semicolon, despite the name
EQUAL  : '=';
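
One thing to watch with keyword tokens such as PRINT: when several lexer rules match the same text, ANTLR picks the longest match, and if the lengths are equal, the rule that is declared first. So a keyword rule that sits below ID never fires. A minimal sketch of that tie-breaking in plain Java, using java.util.regex in place of the ANTLR runtime (the class LexerTieBreak and the winner method are hypothetical names for illustration only):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LexerTieBreak {
    // Return the name of the rule that wins at the start of the input:
    // the longest match, and on equal length the rule declared earlier.
    static String winner(Map<String, String> rulesInOrder, String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, String> rule : rulesInOrder.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // Strict '>' keeps the earlier rule when two matches have the same length.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, String> idFirst = new LinkedHashMap<>();
        idFirst.put("ID", "[a-zA-Z]+");
        idFirst.put("PRINT", "print");

        Map<String, String> printFirst = new LinkedHashMap<>();
        printFirst.put("PRINT", "print");
        printFirst.put("ID", "[a-zA-Z]+");

        System.out.println(winner(idFirst, "print x"));    // ID: same length, ID declared first
        System.out.println(winner(printFirst, "print x")); // PRINT: same length, PRINT declared first
        System.out.println(winner(printFirst, "printer")); // ID: longer match wins
    }
}
```

The last call shows that putting the keyword first does not break identifiers that merely start with the keyword, because the longer match still wins.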

And your Parser:

parser grammar Parser;

options{
    language = Java;
    tokenVocab = Lexer;
}

// PARSER
program : ((assignment|expression) COLON)+;

assignment : ID EQUAL expression;

expression
    : Lb expression Rb                  # parenExpression
    | expression MUL_DIV expression     # multOrDiv
    | expression ADD_SUB expression     # addOrSubtract
    | PRINT arg (COMMA arg)*            # print
    | STRING                            # string
    | ID                                # identifier
    | INT                               # integer
    ;

arg : ID|STRING;
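
Note that with ADD_SUB and MUL_DIV each covering two operators, the token type alone no longer tells you which operator was matched, so a listener or visitor has to look at the token text (for example via the token's getText()). A minimal sketch of that mapping in plain Java; the class OperatorFromText, the BinaryOp enum and its fromSymbol and apply helpers are hypothetical names, and the integer arithmetic is only an assumption about what the calculator should do:

```java
public class OperatorFromText {
    enum BinaryOp {
        ADD("+"), SUB("-"), MUL("*"), DIV("/");

        final String symbol;

        BinaryOp(String symbol) {
            this.symbol = symbol;
        }

        // Map the matched token text back to an operator.
        static BinaryOp fromSymbol(String text) {
            for (BinaryOp op : values()) {
                if (op.symbol.equals(text)) {
                    return op;
                }
            }
            throw new IllegalArgumentException("Unknown binary op: " + text);
        }
    }

    // Apply the operator to two integer operands (integer division for DIV).
    static int apply(BinaryOp op, int left, int right) {
        switch (op) {
            case ADD: return left + right;
            case SUB: return left - right;
            case MUL: return left * right;
            case DIV: return left / right;
            default: throw new IllegalStateException("Unhandled op: " + op);
        }
    }

    public static void main(String[] args) {
        // In a real listener the text would come from the MUL_DIV or ADD_SUB token.
        String tokenText = "*";
        System.out.println(apply(BinaryOp.fromSymbol(tokenText), 6, 7));
    }
}
```

If you would rather switch on the token type, as in a generated listener, give each operator its own lexer rule instead of grouping them.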

Clot answered 27/3, 2016 at 15:14 Comment(0)

-1

Actually, it's okay to write literal tokens inside your rules. You can name literal tokens. For example,

expr: expr op=('*' | '/') expr  # binaryExpr
    | expr op=('+' | '-') expr  # binaryExpr
    | Number                    # number
    ;

Number: blah blah ;

Star : '*';
Div  : '/';
Plus : '+';
Minus: '-';

And you can write the listener as follows:

class BinaryExpr {
    public enum BinaryOp {
        // ...
    }
    // ...
}
public class MyListener extends YourGrammarBaseListener {
    @Override
    public void exitBinaryExpr(YourGrammarParser.BinaryExprContext ctx) {
        BinaryExpr.BinaryOp op;
        switch (ctx.op.getType()) {
            case YourGrammarParser.Star:  op = BinaryExpr.BinaryOp.MUL; break;
            case YourGrammarParser.Div:   op = BinaryExpr.BinaryOp.DIV; break;
            case YourGrammarParser.Plus:  op = BinaryExpr.BinaryOp.ADD; break;
            case YourGrammarParser.Minus: op = BinaryExpr.BinaryOp.SUB; break;
            default: throw new RuntimeException("Unknown binary op.");
        }

        // ...
    }
}

Peshawar answered 27/3, 2016 at 15:1 Comment(1)

The OP is using a parser grammar, while you are (most probably) testing with a combined grammar (combined grammars start with grammar GRAMMAR_NAME;, parser grammars start with parser grammar GRAMMAR_NAME;). In a parser grammar, you cannot use literal tokens. ANTLR will exit with the following error: "cannot create implicit token for string literal in non-combined grammar". – Derwin 27/3, 2016 at 22:4
