I am using ANTLR 4.5 to build a parser for a language with several special comment formats, which I would like to route to different channels.
It seems ANTLR 4.5 has been extended with a new construct for declaring extra lexer channels.
Extract from the documentation (https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Lexer+Rules):
As of 4.5, you can also define channel names like enumerations with the following construct above the lexer rules:
channels { WSCHANNEL, MYHIDDEN }
My lexing and parsing rules are in a single file, and my code looks like this:
channels {
ANNOT_CHANNEL,
FORMAL_SPEC_CHANNEL,
DOC_CHANNEL,
COMMENT_CHANNEL,
PRAGMAS_CHANNEL
}
... parsing rules ...
// expression annotation (sent to a special channel)
ANNOT: (EOL_ANNOT | LUS_ANNOT | C_ANNOT) -> channel(ANNOT_CHANNEL) ;
fragment LUS_ANNOT: '(*!' ( COMMENT | . )*? '*)' ;
fragment C_ANNOT: '/*!' ( COMMENT | . )*? '*/' ;
fragment EOL_ANNOT: ('--!' | '//!') .*? EOL ;
// formal specification annotations (sent to a special channel)
FORMAL_SPEC: (EOL_SPEC | LUS_SPEC | C_SPEC ) -> channel(FORMAL_SPEC_CHANNEL) ;
fragment LUS_SPEC: '(*@' ( COMMENT | . )*? '*)' ;
fragment C_SPEC: '/*@' ( COMMENT | . )*? '*/' ;
fragment EOL_SPEC: ('--@' | '//@' | '--%') .*? EOL;
// documentation annotation (sent to a special channel)
DOC: ( EOL_DOC | LUS_DOC | C_DOC ) -> channel(DOC_CHANNEL);
fragment LUS_DOC: '(**' ( COMMENT | . )*? '*)' ;
fragment C_DOC: '/**' ( COMMENT | . )*? '*/' ;
fragment EOL_DOC: ('--*' | '//*') .*? EOL;
// standard comment (sent to a special channel)
COMMENT: ( EOL_COMMENT | LUS_COMMENT | C_COMMENT ) -> channel(COMMENT_CHANNEL);
fragment LUS_COMMENT: '(*' ( COMMENT | . )*? '*)' ;
fragment C_COMMENT: '/*' ( COMMENT | . )*? '*/' ;
fragment EOL_COMMENT: ('--' | '//') .*? EOL;
// pragmas are sent to a special channel
PRAGMA: '#pragma' CHARACTER* '#end' -> channel(PRAGMAS_CHANNEL);
However, I am still getting these 4.4-style warnings:
warning(155): Scade6.g4:550:52: rule ANNOT contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output
warning(155): Scade6.g4:556:56: rule FORMAL_SPEC contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output
warning(155): Scade6.g4:562:45: rule DOC contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output
warning(155): Scade6.g4:568:62: rule COMMENT contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output
warning(155): Scade6.g4:574:47: rule PRAGMA contains a lexer command with an unrecognized constant value; lexer interpreters may produce incorrect output
If I split the lexer and parser into two distinct files and use an import statement to import the lexer into the parser, I still get the same warnings as above.
Using integer constants instead of names with a combined grammar
-> channel(10000)
yields the following error:
error(164): Scade6.g4:8:0: custom channels are not supported in combined grammars
If I split the lexer and parser into two separate files and use integer constants, I get no warning; however, this is not really satisfactory for readability.
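For illustration, here is a minimal sketch of what that split, integer-constant variant could look like. The file names, the token rules and the channel number are hypothetical and are not taken from Scade6.g4. The sketch also assumes the parser picks up the lexer's tokens through the tokenVocab option rather than through the import statement mentioned above.

```antlr
// ---- Scade6Lexer.g4 (hypothetical file name) ----
lexer grammar Scade6Lexer;

// Numeric channel instead of a named one. Channel 0 is the default channel
// and channel 1 is HIDDEN, so 2 is used here as an arbitrary custom channel.
COMMENT : '/*' .*? '*/' -> channel(2) ;
WS      : [ \t\r\n]+    -> skip ;
ID      : [a-zA-Z_]+ ;

// ---- Scade6Parser.g4 (hypothetical file name) ----
parser grammar Scade6Parser;

// Assumption: token types come from the generated lexer vocabulary.
options { tokenVocab = Scade6Lexer; }

program : ID* EOF ;
```

The drawback is the one described above: a bare number such as 2 says nothing about what the channel carries, which is what a named channel would fix.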
Is there anything I can do to have extra channels named properly? (with either combined or separate lexer/parser specs, no preference)
Regards,
channels { }
construct precisely for this, sadly it does not seem to do the job as expected, at least from my experiments. – Halsy