How do I match any text in ANTLR v4? I mean text that is unknown at the time the grammar is written.
My grammar is as follows:
grammar Anytext;
line : comment;
comment : '#' anytext;
anytext: ANY*;
WS : [ \t\r\n]+;
ANY : .;
And my code is as follows:
String line = "# This_is_a_comment";
ANTLRInputStream input = new ANTLRInputStream(line);
AnytextLexer lexer = new AnytextLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
AnytextParser parser = new AnytextParser(tokens);
ParseTree tree = parser.comment();
System.out.println(tree.toStringTree(parser)); // print LISP-style tree
The output is as follows:
line 1:1 extraneous input ' ' expecting {<EOF>, ANY}
(comment # (anytext T h i s _ i s _ a _ c o m m e n t))
If I change the ANY rule to
ANY : [ \t\r\n.];
it stops recognizing any symbol at all.
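One way to read the error above: ANTLR's lexer takes the longest match, and when two rules match the same length, the rule listed first wins. Since WS is listed before ANY, a single space becomes a WS token, which the anytext rule does not accept. The following is a minimal plain-Java model of that selection rule. It is not ANTLR's implementation; the class name LexerOrderSketch and the method tokenTypes are hypothetical, and regular expressions stand in for the three token rules ('#', WS, ANY).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of lexer rule selection, NOT ANTLR's implementation:
// the longest match wins, and on a tie the rule listed first wins.
class LexerOrderSketch {
    // Rule names and patterns mirror the grammar order: '#', WS, ANY.
    static final String[] NAMES = {"'#'", "WS", "ANY"};
    static final Pattern[] RULES = {
        Pattern.compile("#"),
        Pattern.compile("[ \\t\\r\\n]+"),
        Pattern.compile("(?s).")   // any single character, including newlines
    };

    // Returns the token type chosen for each successive token of the input.
    static List<String> tokenTypes(String input) {
        List<String> types = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int bestLen = 0;
            String bestName = null;
            for (int i = 0; i < RULES.length; i++) {
                Matcher m = RULES[i].matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = NAMES[i];
                }
            }
            types.add(bestName);
            pos += bestLen;   // ANY always matches, so bestLen >= 1 here
        }
        return types;
    }

    public static void main(String[] args) {
        // The space is classified as WS, not ANY.
        System.out.println(tokenTypes("# Hi"));
    }
}
```

The same reading would explain the second attempt: inside a lexer character set such as [ \t\r\n.], the dot is a literal '.', not a wildcard, so ordinary letters match no rule at all.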
UPDATE 1
I have no end line character at the end.
UPDATE 2
So, as I understand it, it is impossible to match arbitrary text in the lexer, because the lexer cannot assign one character to more than one token class. If I define a lexer rule that matches any symbol, it will either hide all other rules or not work.
But the question persists.
How do I match all symbols at the parser level, then?
Suppose I have table-shaped data and I want to process some fields and ignore others. If I had an anytext rule, I would write
infoline :
( codepoint WS 'field1' WS field1Value ) |
( codepoint WS 'field2' WS field2Value ) |
( codepoint WS anytext );
Here I parse rows whose 2nd column contains field1 or field2 and ignore all other rows.
How can I accomplish this?
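To pin down the intended behaviour independently of any grammar, here is a minimal sketch in plain Java. It uses String.split rather than ANTLR, so it shows only the row dispatch, not a parser-level solution. The names RowFilterSketch and process are hypothetical, and it assumes that columns are separated by whitespace and that everything after the 2nd column is the field value.

```java
import java.util.Optional;

// Hypothetical illustration of the desired row handling, not an ANTLR parser.
class RowFilterSketch {
    // Returns "field=value" for rows whose 2nd column is field1 or field2,
    // and Optional.empty() for every other row (those rows are ignored).
    static Optional<String> process(String row) {
        // Limit 3: codepoint, field name, then the rest of the line as value.
        String[] cols = row.trim().split("\\s+", 3);
        if (cols.length < 3) {
            return Optional.empty();   // too few columns to carry a value
        }
        switch (cols[1]) {
            case "field1":
            case "field2":
                return Optional.of(cols[1] + "=" + cols[2]);
            default:
                return Optional.empty();   // the "anytext" case: skip the row
        }
    }

    public static void main(String[] args) {
        System.out.println(process("0041 field1 some value"));
        System.out.println(process("0041 other text to skip"));
    }
}
```

The default branch plays the role of the anytext alternative: the rest of the row is accepted without being inspected.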
[space] (type WS). From my point of view it is also ANY? Why not? – Histoplasmosis