How to use antlr 4 TokenStream as iterable stream?

I have created a lexer using ANTLR 4 for tokenizing Turkish natural-language text. What I need is a token stream from which I can fetch tokens one by one. CommonTokenStream gives me a List if I use it like this:

ANTLRInputStream inputStream = new ANTLRInputStream(input);
TurkishLexer lexer = new TurkishLexer(inputStream);
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
tokenStream.fill(); // reads every token up to EOF into the buffer
List<Token> tokens = tokenStream.getTokens();
for (Token token : tokens) ...

However, I don't want to construct a list of tokens because my input could be huge. I just want something like:

for (Token token: tokenStream.next()) ...

I would iterate this way until I get an EOF token.

Is there a Token Stream that allows me to iterate over tokens?

Resnick answered 31/1, 2013 at 10:10 Comment(0)

Rather than use a CommonTokenStream, you could simply call Lexer.nextToken directly; it produces one token per call without buffering the rest of the input.

for (Token token = lexer.nextToken();
     token.getType() != Token.EOF;
     token = lexer.nextToken())
{
    ...
}
Mu answered 31/1, 2013 at 14:24 Comment(1)
Ah, thanks a lot. Still, maybe it would be nice to expose this in the token streams. As a side note, the Java version uses lexer.nextToken(). - Resnick
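If the for-each syntax from the question is what you are after, the nextToken loop can be wrapped in a small Iterable. The following is only a sketch: PullIterable is a hypothetical helper, not part of ANTLR, and it is written against plain java.util.function types so that it compiles without the ANTLR jar. It pulls one item at a time from a supplier and stops when a caller-supplied predicate recognizes the end marker.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not an ANTLR class): lazily pulls items until an end marker appears. */
final class PullIterable<T> implements Iterable<T> {
    private final Supplier<T> source;   // e.g. lexer::nextToken
    private final Predicate<T> isEnd;   // e.g. t -> t.getType() == Token.EOF

    PullIterable(Supplier<T> source, Predicate<T> isEnd) {
        this.source = source;
        this.isEnd = isEnd;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private T lookahead;      // most recently fetched item
            private boolean fetched;  // true while lookahead has not been handed out yet
            private boolean done;     // true once the end marker has been seen

            @Override
            public boolean hasNext() {
                if (!done && !fetched) {
                    lookahead = source.get();      // pull exactly one item
                    fetched = true;
                    done = isEnd.test(lookahead);  // the end marker itself is never returned
                }
                return !done;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                fetched = false;
                return lookahead;
            }
        };
    }
}
```

With the lexer from the question, usage would presumably look like: for (Token token : new PullIterable<Token>(lexer::nextToken, t -> t.getType() == Token.EOF)) ... Only one token is held at a time, so nothing is buffered. Note that the wrapper is single-pass: every iterator it returns draws from the same underlying source.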
