How to detect beginning of line, or: "The name 'getCharPositionInLine' does not exist in the current context"

I'm trying to create a Beginning-Of-Line token:

lexer grammar ScriptLexer;

BOL : {getCharPositionInLine() == 0;}; // Beginning Of Line token

But the above emits the error

The name 'getCharPositionInLine' does not exist in the current context

As it creates this code:

private void BOL_action(RuleContext _localctx, int actionIndex) {
    switch (actionIndex) {
    case 0: getCharPositionInLine() == 0; break;
    }
}

Where the getCharPositionInLine() method doesn't exist...

Gaff answered 9/8, 2015 at 11:27 Comment(6)
Maybe try GetCharPositionInLine() (PascalCase, as recommended by various C# code guidelines). - Labrie
@knittl, tried that. There is no method with a name that is even similar to that... - Gaff
Have a look at the lexer class: github.com/antlr/antlr4-csharp/blob/master/runtime/CSharp/… There is a charPositionInLine in there, but I'm not familiar enough with C# to post an answer (hence this comment). - Australopithecus
@Labrie C# has properties in the language, so you won't see many getter functions in C# code :-) The solution here is to use the Column property, so fragment BOL : { Column == 0 }? ; (or == 1, dunno) should probably work (I don't think it makes sense to have an empty lexer rule, hence the fragment). - Constrictor
@LucasTrzesniewski - that was it. Please post an answer so I can accept it. - Gaff
If anybody is looking for the TypeScript property, it's this.charPositionInLine === 0, where this refers to the Lexer superclass. - Wilber

The simplest approach is to just recognize an EOL as the corresponding BOL token.

BC  : '/*' .*? '*/' -> channel(HIDDEN) ;
LC  : '//' ~[\r\n]* -> channel(HIDDEN) ;
HWS : [ \t]+        -> channel(HIDDEN) ;
BOL : [\r\n\f]+ ;

Rules like the block comment rule consume EOLs internally, so there is no problem there. Rules like the line comment rule do not consume the EOL, so a proper BOL is emitted for the line immediately following.

A potential problem is that no BOL is emitted at the beginning of the input. The simplest way to handle this is to prefix the input text with a line terminator before feeding it to the lexer.
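
The prefixing step might look roughly like the sketch below. It is TypeScript and deliberately does not use the ANTLR runtime: prefixWithLineTerminator and countBolTokens are hypothetical helper names, and the regular expression only imitates the BOL : [\r\n\f]+ ; rule so the effect can be seen in isolation.

```typescript
// Hypothetical helper (not part of ANTLR): prepend a line terminator so that
// a lexer using the BOL rule emits a BOL token before the first real line.
function prefixWithLineTerminator(source: string): string {
    return "\n" + source;
}

// Stand-in for the BOL rule: counts maximal runs of \r, \n or \f.
// Assumption: comments are ignored here; a real lexer would let the block
// comment rule swallow any line terminators inside a comment.
function countBolTokens(source: string): number {
    const matches = source.match(/[\r\n\f]+/g);
    return matches === null ? 0 : matches.length;
}

const script = "first line\nsecond line";
console.log(countBolTokens(script));                           // 1
console.log(countBolTokens(prefixWithLineTerminator(script))); // 2
```

With the real lexer, the prefixed string would be passed to whatever input stream the target runtime provides. Keep in mind that every line number the lexer reports is then shifted by one.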

Matrilineal answered 10/8, 2015 at 0:57 Comment(1)
Excellent answer, it helped me with a similar question (I got here via https://mcmap.net/q/1775172/-how-to-recognise-start-of-line-in-an-antlr-grammar/1112244). I will add that if you don't route BOL to a hidden channel, you will have to include it in your parser everywhere you expect to encounter those characters. In my case, I use a separate lexer and parser, and I defined in my lexer the token that had to appear at the beginning of the line (it is a line label). My parser rules are not EOL-delimited otherwise, so I routed BOL to a hidden channel in order to avoid adding it as a parser rule. - Hock
