How to define tokens that can appear in multiple lexical modes in ANTLR4?


Asked 4/4, 2013 at 9:25 Answered 4/4, 2013 at 13:6

I am learning ANTLR4 and experimenting with lexical modes. How can I have the same token appear in multiple lexical modes? As a very simple example, say my grammar has two modes and I want to match whitespace and end-of-line in both of them. How can I do that without ending up with separate WS_MODE1 and WS_MODE2 tokens? Is there a way to reuse the same definition in both cases? I am hoping to get WS tokens in the output stream for all whitespace, irrespective of the mode. The same applies to EOL and to keywords that can appear in both modes.

Lakieshalakin answered 4/4, 2013 at 9:25 Comment(0)

The rules have to have different names, but you can use the -> type(...) lexer command to give them the same type.

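// Default mode: the WS token type is defined here.
WS : [ \t]+;

mode Mode1;

    // Matches the same text as WS and is emitted with token type WS.
    Mode1_WS : WS -> type(WS);

mode Mode2;

    Mode2_WS : WS -> type(WS);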

Even though Mode1_WS and Mode2_WS are not fragment rules, the code generator will see the type command and know that you reassigned their types, so it will not define tokens for them.
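For context, here is a minimal sketch of how the snippet above might sit inside a complete lexer grammar (modes are only allowed in lexer grammars, not combined ones). The grammar name SharedWs, the EOL, OPEN, CLOSE, TEXT and ID rules, and the choice of `{` and `}` as mode-switch triggers are all assumptions made for illustration; they are not part of the original answer.

```antlr
// SharedWs.g4 (hypothetical file and grammar name)
lexer grammar SharedWs;

// Default mode: WS and EOL are defined once here.
WS    : [ \t]+ ;
EOL   : '\r'? '\n' ;
OPEN  : '{' -> pushMode(Mode1) ;   // assumed trigger for entering Mode1
TEXT  : ~[{ \t\r\n]+ ;

mode Mode1;

// Separate rule names, but the emitted tokens carry the shared types.
Mode1_WS  : WS  -> type(WS) ;
Mode1_EOL : EOL -> type(EOL) ;
CLOSE     : '}' -> popMode ;       // assumed trigger for leaving Mode1
ID        : [a-zA-Z_]+ ;
```

With a grammar along these lines, whitespace inside and outside the braces should both come out of the lexer as WS tokens, and the same goes for EOL.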

Hierolatry answered 4/4, 2013 at 13:6 Comment(4)

A short question about using these lexer rules: in the parser rules, do you refer to WS or to Mode1_WS and Mode2_WS? I tried both, but it seems you only define the lexer rules without referring to them directly in the parser rules. In that sense it's more of an 'import statement' than an 'alias'. – Aberdare 12/11, 2014 at 13:54

The type command explicitly assigns the token type, which is the type the parser will see. In this case, WS would be used to reference tokens created by any of these 3 rules. – Hierolatry 12/11, 2014 at 16:25
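As a rough illustration of that point, a parser grammar would reference only WS. This sketch assumes the lexer rules from the answer live in a lexer grammar with the hypothetical name ModesLexer; the parser grammar name and the `spaces` rule are likewise made up for the example.

```antlr
// ModesParser.g4 (hypothetical file and grammar name)
parser grammar ModesParser;

// Assumes a lexer grammar named ModesLexer containing the WS,
// Mode1_WS and Mode2_WS rules from the answer.
options { tokenVocab = ModesLexer; }

// WS covers tokens produced by WS, Mode1_WS and Mode2_WS alike,
// because all three are emitted with token type WS.
spaces : WS+ ;
```

Mode1_WS and Mode2_WS are not referenced here, since the type command means no separate token types are defined for them.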

@SamHarwell what terminates the final mode spec? I noticed some lexer docs have fragment defs following the final mode spec where fragment usage shows the fragments are available to all modes including the default. – Sadomasochism 6/9, 2017 at 3:5

Tokens that can be matched in all modes would be a very welcome lexer grammar feature. I find myself "aliasing" tokens in the absence of such a feature. – Cardinal 6/8, 2019 at 21:8
