Using ANTLR Parser and Lexer Separately

I used ANTLR version 4 to create a compiler. The first phase was the lexer. I created a "CompilerLexer.g4" file and put the lexer rules in it. It works fine.

CompilerLexer.g4:

lexer grammar CompilerLexer;

INT         :   'int'   ;   //1
FLOAT       :   'float' ;   //2
BEGIN       :   'begin' ;   //3
END         :   'end'   ;   //4
To          :   'to'    ;   //5
NEXT        :   'next'  ;   //6
REAL        :   'real'  ;   //7
BOOLEAN     :   'bool'  ;   //8
.
.
.
NOTEQUAL    :   '!='    ;   //46
AND         :   '&&'    ;   //47
OR          :   '||'    ;   //48
POW         :   '^'     ;   //49
ID          : [a-zA-Z]+ ;   //50




WS
:   ' ' -> channel(HIDDEN)  //50
;

Now it is time for phase 2, the parser. I created a "CompilerParser.g4" file and put the grammar rules in it, but I get dozens of warnings and errors.

CompilerParser.g4:

parser grammar CompilerParser;

options {   tokenVocab = CompilerLexer; }

STATEMENT   :   EXPRESSION SEMIC
        |   IFSTMT
        |   WHILESTMT
        |   FORSTMT
        |   READSTMT SEMIC
        |   WRITESTMT SEMIC
        |   VARDEF SEMIC
        |   BLOCK
        ;

BLOCK       : BEGIN STATEMENTS END
        ;

STATEMENTS  : STATEMENT STATEMENTS*
        ;

EXPRESSION  : ID ASSIGN EXPRESSION
        | BOOLEXP
        ;

RELEXP      : MODEXP (GT | LT | EQUAL | NOTEQUAL | LE | GE | AND | OR) RELEXP
        | MODEXP
        ;

.
.
.

VARDEF      : (ID COMA)* ID COLON VARTYPE
        ;

VARTYPE     : INT
        | FLOAT
        | CHAR
        | STRING
        ;
compileUnit
:   EOF
;

Warnings and errors:

implicit definition of token 'BLOCK' in parser

implicit definition of token 'BOOLEXP' in parser

implicit definition of token 'EXP' in parser

implicit definition of token 'EXPLIST' in parser

lexer rule 'BLOCK' not allowed in parser

lexer rule 'EXP' not allowed in parser

lexer rule 'EXPLIST' not allowed in parser

lexer rule 'EXPRESSION' not allowed in parser

I get dozens of these warnings and errors. What is the cause?

General questions: What is the difference between using a combined grammar and using the lexer and parser separately? How should I join separate parser and lexer grammar files?


user2998131 wrote:

General questions: What is the difference between using a combined grammar and using the lexer and parser separately?

Separating the lexer and parser rules keeps things organized. Also, with separate lexer and parser grammars, you can't (accidentally) put literal tokens inside your parser grammar; you have to define all tokens in the lexer grammar. This makes it apparent which lexer rules get matched before others, and you can't make typos in recurring literal tokens, as can happen in a combined grammar:

grammar P;

r1 : 'foo' r2;
r2 : r3 'foo '; // added an accidental space after 'foo'

But when you have a parser grammar, you can't make that mistake. You will have to use the lexer rule that matches 'foo':

parser grammar P;

options { tokenVocab=L; }

r1 : FOO r2;
r2 : r3 FOO;

lexer grammar L;

FOO : 'foo';
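The remark about which lexer rules get matched before others can be illustrated with a minimal sketch. It extends lexer grammar L from the example; the ID and WS rules are additions for illustration only and are not part of the original example. ANTLR 4 prefers the longest match, and when two rules match the same text, the rule listed first wins.

```antlr
lexer grammar L;

// Listed first, so the input "foo" becomes a FOO token.
FOO : 'foo' ;

// Hypothetical addition: also matches "foo", but loses the tie to FOO
// because it is defined later. A longer input such as "food" still
// becomes ID, because the longest match is preferred.
ID  : [a-zA-Z]+ ;

// Hypothetical addition: discard whitespace between tokens.
WS  : [ \t\r\n]+ -> skip ;
```

With every token spelled out in one file like this, the order of the rules is visible at a glance, which is why keyword rules are usually placed before a general identifier rule.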

user2998131 wrote:

How should I join separate parser and lexer grammar files?

Just like you do in your parser grammar: you point to the proper tokenVocab inside the options { ... } block.
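As for the warnings and errors in the question: ANTLR treats any rule whose name starts with an uppercase letter as a lexer rule, and any rule whose name starts with a lowercase letter as a parser rule. A parser grammar cannot define lexer rules, so names such as STATEMENT and BLOCK trigger "lexer rule ... not allowed in parser", and using them inside other rules triggers "implicit definition of token ... in parser". Below is a minimal sketch of part of the parser grammar with lowercase rule names. It assumes that SEMIC, ASSIGN, COMA and COLON are defined in the elided part of CompilerLexer.g4; the elided rules are not reconstructed, and the shape of compileUnit is a guess at the intended entry rule.

```antlr
parser grammar CompilerParser;

options { tokenVocab = CompilerLexer; }

// Parser rule names start with a lowercase letter.
// Uppercase names (BEGIN, END, ID, ...) refer to tokens from the lexer.

// Assumed entry rule: a list of statements followed by end of input.
compileUnit : statements EOF ;

statements  : statement+ ;

statement   : expression SEMIC
            | vardef SEMIC
            | block
            ;

block       : BEGIN statements END ;

// Simplified: the boolean/relational expression rules are left out.
expression  : ID ASSIGN expression
            | ID
            ;

vardef      : (ID COMA)* ID COLON vartype ;

vartype     : INT
            | FLOAT
            ;
```

The tokenVocab option makes ANTLR read the token types from the CompilerLexer.tokens file, so the lexer grammar should be processed before (or together with) the parser grammar.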
