Access Channels in ANTLR 4 and Parse them separately

4

3

I have put my comments on a separate channel in ANTLR 4. In my case it is channel 2.

This is my lexer grammar.

COMMENT: '/*' .*? '*/' -> channel(2) 
       ;

I want to access channel 2 and parse its tokens to collect the comments, so I added this rule to the parser grammar:

comment
:COMMENT
;

In the program:

        string s = "parsing string";
        AntlrInputStream input = new AntlrInputStream(s);
        CSSLexer lexer = new CSSLexer(input);

        CommonTokenStream tokens = new CommonTokenStream(lexer, 2);

Then I want to parse the tokens:

var xr = parser.comment().GetRuleContexts<CommentContext>();

I want this because I need information from the CommentContext object, such as Start.Column.

EDIT:

This is the improved question

To be more specific, I want to get all the tokens on channel 2 and parse them with the comment grammar, collecting all the comments in a list (IReadOnly<CommentContext>) so that I can iterate over them and read information such as start line, start column, end line, end column, and the token text.

CommonTokenStream tokens = new CommonTokenStream(lexer,2);

This does not give me only the tokens on channel 2. I also discovered that I cannot access the tokens by calling GetTokens() until the stream has been passed to the parser constructor, XParser parser = new XParser(tokens);

In the tokens I can then see that the comments are identified as tokens and are on channel 2. But even though the CommonTokenStream specifies the channel number as above, it contains all the tokens.

  1. Why can't I access the tokens until the parser object has been created from them?

  2. I want to get a CommonTokenStream for channel 2 and pass it to the XParser constructor so that these tokens are parsed with my comment grammar. What is the best way of doing this with the ANTLR 4 API?

Audiovisual answered 4/9, 2013 at 6:18 Comment(1)
What is your specific question? - Privily
5

CommonTokenStream internally tracks all tokens from every channel. The only tokens you won't see when you call getTokens() are those matched by lexer rules where a -> skip action was executed (no token is even created for those rules).

You can look at the tokens on channel 2 by using the TokenStream.LT and IntStream.consume methods.

Java example:

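// Assumes: import org.antlr.v4.runtime.*; import java.util.*;
CommonTokenStream cts = new CommonTokenStream(tokenSource, 2);
List<Token> tokens = new ArrayList<Token>();
// LA/LT only see tokens on channel 2; stop at end of input
while (cts.LA(1) != Token.EOF) {
    tokens.add(cts.LT(1));
    cts.consume();
}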

C# example:

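// Requires: using Antlr4.Runtime; using System.Collections.Generic;
CommonTokenStream cts = new CommonTokenStream(tokenSource, 2);
IList<IToken> tokens = new List<IToken>();
// In the Antlr4.Runtime.Standard package these members are named LA, LT and TokenConstants.EOF
while (cts.La(1) != TokenConstants.Eof)
{
    tokens.Add(cts.Lt(1));
    cts.Consume();
}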
Privily answered 5/9, 2013 at 12:26 Comment(2)
Thanks for the reply. I can get the start line and start column, which are provided, but the end line and end column are not available. How can I get them? - Audiovisual
@Diode You can either infer that information from the start position of the following token, or calculate it from the end position of the token in the input stream together with the start position of each line of the input (which you must calculate or track separately). - Privily
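A simpler variant of the second option is to walk the token's own text and count newlines, which avoids tracking line starts. The Java sketch below is illustrative only: `TokenEnd` and `endPosition` are hypothetical names, not part of the ANTLR API. It assumes 1-based lines, 0-based columns, and '\n' as the only character that starts a new line, which is how ANTLR's lexer counts positions. The returned column is exclusive, that is, the position just after the token's last character.

```java
final class TokenEnd {
    /**
     * Returns {endLine, endColumn} for a token.
     * startLine is 1-based and startColumn is 0-based (assumed ANTLR convention).
     * endColumn is exclusive: the column just after the last character.
     */
    static int[] endPosition(int startLine, int startColumn, String text) {
        int line = startLine;
        int column = startColumn;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                // A newline moves to the start of the next line.
                line++;
                column = 0;
            } else {
                column++;
            }
        }
        return new int[] { line, column };
    }

    public static void main(String[] args) {
        // A two-line comment that starts at line 3, column 4.
        int[] end = endPosition(3, 4, "/* x\n y */");
        System.out.println(end[0] + ":" + end[1]); // prints 4:5
    }
}
```

With a real token, the three arguments would come from the token's line, column, and text properties.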
2

How about this:

 // Requires: using System.Linq;
 tokens.Fill(); // the stream is lazy; load every token before calling GetTokens()

 var allowedChannels = new[] { 2 }; // add more if you need to
 var tokensImInterestedIn = tokens.GetTokens().Where(token => allowedChannels.Contains(token.Channel) && token.Type != CSSLexer.Eof).ToArray();

 // if you're just interested in one particular channel, use this instead
 var tokensImInterestedIn = tokens.GetTokens().Where(token => token.Channel == 2 && token.Type != CSSLexer.Eof).ToArray();
Fredi answered 5/9, 2013 at 6:52 Comment(2)
I want to get the tokens from channel 2 and parse them using some grammar. The parser expects an IToken stream, so I was looking for a way to get an IToken stream for channel 2. I will try this approach. Thanks a lot. - Audiovisual
Depending on what you want to achieve, remember that the TokenIndex property of each token refers to its index in the original tokens array, not in tokensImInterestedIn, so beware of out-of-range errors. - Alfonzoalford
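The index mismatch described in the last comment can be shown without ANTLR at all. In the Java sketch below, `Tok` is a hypothetical stand-in for a token that carries only a token index, a channel, and text; `ChannelFilter`, `onChannel`, and `positionByTokenIndex` are likewise illustrative names, not library calls. It filters a token list by channel and then builds a lookup from the original token index to the position in the filtered list, which is one way to avoid indexing the filtered list with the original index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ChannelFilter {
    /** Hypothetical stand-in for an ANTLR token; only the fields used here. */
    static final class Tok {
        final int tokenIndex;
        final int channel;
        final String text;

        Tok(int tokenIndex, int channel, String text) {
            this.tokenIndex = tokenIndex;
            this.channel = channel;
            this.text = text;
        }
    }

    /** Keeps only the tokens on the given channel, preserving their order. */
    static List<Tok> onChannel(List<Tok> all, int channel) {
        List<Tok> result = new ArrayList<>();
        for (Tok t : all) {
            if (t.channel == channel) {
                result.add(t);
            }
        }
        return result;
    }

    /** Maps each token's original tokenIndex to its position in the filtered list. */
    static Map<Integer, Integer> positionByTokenIndex(List<Tok> filtered) {
        Map<Integer, Integer> positions = new HashMap<>();
        for (int i = 0; i < filtered.size(); i++) {
            positions.put(filtered.get(i).tokenIndex, i);
        }
        return positions;
    }

    public static void main(String[] args) {
        List<Tok> all = Arrays.asList(
                new Tok(0, 0, "a"),
                new Tok(1, 2, "/* one */"),
                new Tok(2, 0, "{"),
                new Tok(3, 2, "/* two */"));
        List<Tok> comments = onChannel(all, 2);
        Tok second = comments.get(1);
        // comments.get(second.tokenIndex) would throw: tokenIndex is 3, but the list has 2 items.
        int position = positionByTokenIndex(comments).get(second.tokenIndex);
        System.out.println(second.tokenIndex + " -> " + position); // prints 3 -> 1
    }
}
```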
1

ANTLR 4, C#:

       using Antlr4.Runtime;
       ...

       MyLexer lexer = new MyLexer(inputStream);
       // TokenConstants.HiddenChannel is channel 1; pass 2 instead for the channel used in the question
       var tokenstream = new CommonTokenStream(lexer, TokenConstants.HiddenChannel);
       IList<IToken> tokens = new List<IToken>();

       while (tokenstream.La(1) != TokenConstants.Eof)
       {
           tokens.Add(tokenstream.Lt(1));
           tokenstream.Consume();
       }
       foreach (IToken iToken in tokens)
       {
           Console.WriteLine(" Line : {0} Text : {1} ", iToken.Line, iToken.Text);
       }
Expect answered 27/11, 2017 at 11:15 Comment(0)
0

Alternatively, you could put all the other tokens on another channel and use the default channel for your parser.

Of course, this would not work if you have two parsers that expect tokens on separate channels.

Fredi answered 5/9, 2013 at 11:7 Comment(0)
