I built an ANTLR 4 filter using a grammar (the details aren't important here); the filters look something like "age > 30 AND height < 6.1".
I'll build this filter once and then use it to evaluate maybe a thousand docs. Each doc has an "age" and a "height" attribute.
What I'm not sure about is how to reuse the parser and lexer so that I can speed up evaluation. Building a new lexer and parser for every doc seems like a waste of time.
The Java code is something like:
```java
public Boolean createFilterVisitor(String input, DocFieldAccessor docFieldAccessor) {
    FilterLexer lexer = new FilterLexer(CharStreams.fromString(input));
    lexer.removeErrorListener(ConsoleErrorListener.INSTANCE);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    FilterParser parser = new FilterParser(tokens);
    parser.addErrorListener(new FilterErrorListener());
    parser.removeErrorListener(ConsoleErrorListener.INSTANCE);
    FilterVisitorImpl filterVisitor = new FilterVisitorImpl(docFieldAccessor);
    return filterVisitor.visit(parser.filter());
}
```
and then:

```
for doc in docs:
    createFilterVisitor(doc, someAccessor)
```
I tried building the lexer and parser once and then calling lexer.reset() and parser.reset() at the beginning of each loop iteration. It seems to work (it filters out the right docs), but I'm not sure I'm doing it correctly. I don't know what reset() actually means, or when it should be used.
So my questions are:
- How do I parse the filter string and build the parser/lexer once, rather than once per doc?
- Am I using reset() correctly?
I now have this code. Does it work?
```java
public class KalaFilter {
    private final String filterClause;
    private FilterLexer lexer;
    private FilterParser parser;
    @Getter
    private final FilterAnalyzer filterAnalyzer;

    public KalaFilter(String filterClause) {
        this.filterClause = filterClause;
        lexer = new FilterLexer(CharStreams.fromString(filterClause));
        lexer.removeErrorListener(ConsoleErrorListener.INSTANCE);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        parser = new FilterParser(tokens);
        parser.addErrorListener(new FilterErrorListener());
        parser.removeErrorListener(ConsoleErrorListener.INSTANCE);
        ParseTree parseTree = parser.filter();
        filterAnalyzer = new FilterAnalyzer();
        ParseTreeWalker walker = new ParseTreeWalker(); // create standard walker
        walker.walk(filterAnalyzer, parseTree);
    }

    // Return the filter result by visiting the parse tree
    public Boolean visitFilterResult(DocFieldAccessor docFieldAccessor) {
        //lexer.reset();
        //FilterLexer lexer = new FilterLexer(CharStreams.fromString(filterClause));
        /*
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        FilterParser parser = new FilterParser(tokens);
        parser.addErrorListener(new FilterErrorListener());
        parser.removeErrorListener(ConsoleErrorListener.INSTANCE);
        */
        parser.reset();
        FilterVisitorImpl filterVisitor = new FilterVisitorImpl(docFieldAccessor);
        return filterVisitor.visit(parser.filter());
    }
}
```
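To illustrate what I mean by "build once, evaluate many times", here's a self-contained sketch without any ANTLR classes. The names (FilterReuseSketch, compileFilter, the Map-based doc) are made up for illustration; the compiled Predicate stands in for the ParseTree that parser.filter() returns, which I believe could likewise be built once and visited per doc instead of re-parsing:

```java
import java.util.Map;
import java.util.function.Predicate;

public class FilterReuseSketch {
    // Stand-in for the expensive parse step, done ONCE per filter string.
    // In the real code this corresponds to caching the ParseTree from
    // parser.filter() rather than calling parser.reset() and re-parsing.
    static Predicate<Map<String, Double>> compileFilter() {
        // Hard-coded equivalent of "age > 30 AND height < 6.1".
        return doc -> doc.get("age") > 30 && doc.get("height") < 6.1;
    }

    public static void main(String[] args) {
        Predicate<Map<String, Double>> filter = compileFilter(); // built once

        // Evaluated many times: one cheap call per doc, no re-parsing.
        Map<String, Double> doc1 = Map.of("age", 25.0, "height", 5.8);
        Map<String, Double> doc2 = Map.of("age", 40.0, "height", 5.9);
        System.out.println(filter.test(doc1)); // false: age <= 30
        System.out.println(filter.test(doc2)); // true: both conditions hold
    }
}
```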