I'm trying to parse a nested Boolean expression and get the individual conditions within the expression separately. For example, if the input string is:
(A = a OR B = b OR C = c AND ((D = d AND E = e) OR (F = f AND G = g)))
I would like to get the conditions in the correct order, i.e.:
D = d AND E = e OR F = f AND G = g AND A = a OR B = b OR C = c
I'm using ANTLR 4 to parse the input text and here's my grammar:
grammar SimpleBoolean;
rule_set : nestedCondition* EOF;
AND : 'AND' ;
OR : 'OR' ;
NOT : 'NOT';
TRUE : 'TRUE' ;
FALSE : 'FALSE' ;
GT : '>' ;
GE : '>=' ;
LT : '<' ;
LE : '<=' ;
EQ : '=' ;
LPAREN : '(' ;
RPAREN : ')' ;
DECIMAL : '-'?[0-9]+('.'[0-9]+)? ;
IDENTIFIER : [a-zA-Z_][a-zA-Z_0-9]* ;
WS : [ \r\t\u000C\n]+ -> skip;
nestedCondition : LPAREN condition+ RPAREN (binary nestedCondition)*;
condition: predicate (binary predicate)*
| predicate (binary component)*;
component: predicate | multiAttrComp;
multiAttrComp : LPAREN predicate (and predicate)+ RPAREN;
predicate : IDENTIFIER comparator IDENTIFIER;
comparator : GT | GE | LT | LE | EQ ;
binary: AND | OR ;
unary: NOT;
and: AND;
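And here's the Java code that I'm using to parse it:
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.tree.ParseTree;

// Lex and parse the input, then print the parse tree in LISP-style form.
ANTLRInputStream inputStr = new ANTLRInputStream(input);
SimpleBooleanLexer lexer = new SimpleBooleanLexer(inputStr);
TokenStream tokens = new CommonTokenStream(lexer);
SimpleBooleanParser parser = new SimpleBooleanParser(tokens);
parser.setBuildParseTree(true); // make sure a parse tree is built (this is the default)
ParseTree tree = parser.rule_set();
System.out.println(tree.toStringTree(parser));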
The output is:
(rule_set (nestedCondition ( (condition (predicate A (comparator =) a) (binary OR) (component (predicate B (comparator =) b)) (binary OR) (component (predicate C (comparator =) c)) (binary AND) (component (multiAttrComp ( (predicate ( D (comparator =) d) (and AND) (predicate E (comparator =) e) ))) (binary OR) (component (multiAttrComp ( (predicate F (comparator =) f) (and AND) (predicate G (comparator =) g) )))) ) )) <EOF>)
I'm looking for help on how to traverse this tree to get the conditions in the correct order. In ANTLR 3, we could specify ^ and ! to decide how the tree is built (see this thread), but I learnt that this is not supported in ANTLR 4.
Can someone suggest how I can parse the String in the correct order in Java using the ParseTree created by ANTLR?
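To make the ordering I am after concrete, here is a minimal sketch that does not use ANTLR at all. It scans the raw string, records each flat run of conditions together with its parenthesis depth, and lists the deepest runs first. The class name InnermostFirst and its order method are hypothetical names made up for this illustration, and the sketch drops the AND/OR that joins one group to the next, so it only shows the ordering and is not a solution based on the ParseTree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: orders flat condition groups innermost-first.
class InnermostFirst {

    /** A flat run of conditions and the parenthesis depth it was found at. */
    private static final class Group {
        final String text;
        final int depth;

        Group(String text, int depth) {
            this.text = text;
            this.depth = depth;
        }
    }

    /** Returns the flat condition groups, deepest nesting level first. */
    static List<String> order(String input) {
        List<Group> groups = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : input.toCharArray()) {
            if (c == '(' || c == ')') {
                // A parenthesis ends the current run of conditions.
                flush(current, depth, groups);
                depth += (c == '(') ? 1 : -1;
            } else {
                current.append(c);
            }
        }
        flush(current, depth, groups);

        // List.sort is stable, so groups at the same depth keep their
        // left-to-right order.
        groups.sort((a, b) -> Integer.compare(b.depth, a.depth));

        List<String> ordered = new ArrayList<>();
        for (Group g : groups) {
            ordered.add(g.text);
        }
        return ordered;
    }

    /** Stores the pending run, minus any dangling leading/trailing AND/OR. */
    private static void flush(StringBuilder current, int depth, List<Group> groups) {
        String text = current.toString().trim()
                .replaceAll("^(?:(?:AND|OR)\\b\\s*)+", "")
                .replaceAll("(?:\\s*\\b(?:AND|OR))+$", "");
        current.setLength(0);
        if (!text.isEmpty()) {
            groups.add(new Group(text, depth));
        }
    }

    public static void main(String[] args) {
        String input =
            "(A = a OR B = b OR C = c AND ((D = d AND E = e) OR (F = f AND G = g)))";
        // prints [D = d AND E = e, F = f AND G = g, A = a OR B = b OR C = c]
        System.out.println(order(input));
    }
}
```

What I would like is the equivalent of this, but driven by the ParseTree that ANTLR produces rather than by scanning the string by hand.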
GT | GE | LT | LE | EQ all have the same precedence and they should be evaluated before AND | OR. The parsing should be based on the brackets ( ). What I'm looking for is help on how to parse the String in Java using the ParseTree shown in the code above. – Archdiocese
If there is an AND between two components, it would always be inside brackets ( ). – Archdiocese
In (A = a OR B = b OR C = c AND ((D = d AND E = e) OR (F = f AND G = g))), the first AND is not inside parens. Will A = a OR B = b OR C = c be evaluated first, or will C = c AND ((D = d AND E = e) OR (F = f AND G = g)) be evaluated first? – Wolford