How can I make a rule to match all of its alternatives only once in any order, in ANTLR?
"All of its alternatives only once" is simply rule: altA altB altC ...
. That's the easy part. Asking rule
to accept all alternatives altA
, altB
, altC
, etc. in any arrangement means accommodating every arrangement. That's easy enough to handle manually for small numbers (rule: (altA altB | altB altA);
). But there's no automatic shortcut that I'm aware of to handling all the permutations for you automatically.
Here are some approaches in case there isn't a built-in way and assuming you can't relax your requirements. Caveats: I don't know the full scope of your problem; I don't know your grammar; I don't know why you want what you're asking for; I don't know what class of solution you prefer, other than you'd probably like it easier than any of these options.
First, you could bite the bullet and produce all the permutations of matches yourself, either by hand or by running a permutation generator. Then ANTLR would have what you want in a manner that it understands. It's crude but efficient: it's straight ANTLR syntax so there's no external code involved as there is in the options below.
For example, assume you have a rule field
that processes input like "public static final x"
, with all three modifiers expected but in no particular order. The permutations would look like so:
field : modifiers ID EOF;
modifiers
: PUBLIC STATIC FINAL //ABC
| PUBLIC FINAL STATIC //ACB
| STATIC PUBLIC FINAL //BAC
| STATIC FINAL PUBLIC //BCA
| FINAL PUBLIC STATIC //CAB
| FINAL STATIC PUBLIC //CBA
;
That's the end of it. No external code, no predicates, no nothing.
Second, you could use semantic predicates in the grammar to ensure that all matches are provided and matched without duplicates. There are various ways of writing the predicates themselves, but it boils down to tracking what matches have been made (to prevent duplicates) and then testing whether the rule has matched all the parts that it expects. Here is a basic example, following the same requirements as the previous one:
field
locals [java.util.HashSet<String> names = new java.util.HashSet<String>();]
: modifiers ID EOF;
modifiers
//Ensure that the full number of modifiers have been provided
: {$field::names.size() < 3}? modifier modifiers
| {$field::names.size() == 3}? //match nothing once we have (any) three modifiers
;
modifier
//Ensure that no duplicates have been provided
: {!$field::names.contains("public")}? PUBLIC {$field::names.add("public");}
| {!$field::names.contains("static")}? STATIC {$field::names.add("static");}
| {!$field::names.contains("final")}? FINAL {$field::names.add("final");}
;
Here rule field
tracks the modifier names in local variable names
. Rule modifiers
calls rule modifier
until names
contains three values. Rule modifier
matches any name that doesn't have a corresponding key in names
. Note that the values are manually added to names
. They could be any arbitrary value as long as modifier
's alternatives add the same value to both sides of the token it's matching.
My implementation is a little crude because the modifiers end up nesting in the produced parse tree (since modifiers
contains one modifier
and one modifiers
that contains one modifier
and one modifiers
that...), but I hope you get the idea.
Third, you could leave the poor parser alone and test for completeness in the calling code. This can be done during parsing using a parser listener or done after parsing using the ParserRuleContext
object produced by the parser. This breaks the problem into two parts: let the parser solve "any X, Y, Z in any order" and let the calling code solve "all and only just X, Y, Z."
Here is an example using the listener approach:
//partial grammar
field : modifier* ID EOF; //accept any modifiers in any order
modifier
: PUBLIC
| STATIC
| FINAL
;
//snippet of calling code
//initialize lexer and parser
parser.addParseListener(new MyGrammarBaseListener() {
@Override
public void exitField(FieldContext ctx) {
// The field rule has finished. Let's verify that no modifiers
// were duplicated.
//TODO check for duplicates, ensure all modifiers are specified.
//TODO log errors accordingly.
}
});
//Call the parser.
parser.field();
The grammar is kept clean. Modifiers can appear in the input arbitrarily before ID
, in any number and in any order. The calling code performs the tests by whatever means it chooses, logging whatever errors it wants.
Here is an uberexample that pulls together the three options I mentioned, to give a clearer idea of what I'm talking about.
Modifiers.g
grammar Modifiers;
//Hard-coded version : all the permutations are specified //
permutationField : permutationModifiers ID EOF;
permutationModifiers
: PUBLIC STATIC FINAL //ABC
| PUBLIC FINAL STATIC //ACB
| STATIC PUBLIC FINAL //BAC
| STATIC FINAL PUBLIC //BCA
| FINAL PUBLIC STATIC //CAB
| FINAL STATIC PUBLIC //CBA
;
// Predicate version : use semantic predicates to prevent duplicates and ensure all the modifiers are provided //
predicateField
locals [java.util.HashSet<String> names = new java.util.HashSet<String>();]
: predicateModifiers ID EOF;
predicateModifiers
//Ensure that the full number of modifiers have been provided
: {$predicateField::names.size() < 3}? predicateModifier predicateModifiers
| {$predicateField::names.size() == 3}? //match nothing once we have (any) three modifiers
;
predicateModifier
//Ensure that no duplicates have been provided
: {!$predicateField::names.contains("public")}? PUBLIC {$predicateField::names.add("public");}
| {!$predicateField::names.contains("static")}? STATIC {$predicateField::names.add("static");}
| {!$predicateField::names.contains("final")}? FINAL {$predicateField::names.add("final");}
;
//Listener version : test everything when the parser calls the listener //
listenerField : listenerModifier* ID EOF;
listenerModifier
: PUBLIC
| STATIC
| FINAL
;
PUBLIC : 'public';
STATIC : 'static';
FINAL : 'final';
FOO : 'foo';
ID : [a-zA-Z]+;
WS : [ \r\n\t]+ -> skip;
ModifiersTest.java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.misc.Nullable;
public class ModifiersTest {
public static void main(String[] args) {
run("public static final x", "ok");
run("final static public x", "ok");
run("static public static final x", "too many modifiers");
run("static x", "missing modifiers");
run("final final x", "missing & duplicated modifiers");
}
private static void run(String code, String title) {
System.out.printf("%n---%n**Input : %s**%n%n\t%s%n%n", title, code);
System.out.println("**Permutation Output**\n");
runPermutationTest(code);
System.out.println();
System.out.println("**Predicate Output**\n");
runPredicateTest(code);
System.out.println();
System.out.println("**Listener Output**\n");
runListenerTest(code);
System.out.println();
}
private static void runPermutationTest(String code) {
ModifiersParser parser = createParser(code);
parser.permutationField();
System.out.println("\t(done)");
}
private static void runPredicateTest(String code) {
ModifiersParser parser = createParser(code);
parser.predicateField();
System.out.println("\t(done)");
}
private static void runListenerTest(String code) {
ModifiersParser parser = createParser(code);
parser.addParseListener(new ModifiersBaseListener() {
@Override
public void exitListenerField(ModifiersParser.ListenerFieldContext ctx) {
// The field rule has finished. Let's verify that no modifiers
// were duplicated.
HashSet<String> uniqueNames = new HashSet<String>();
ArrayList<String> allNames = new ArrayList<String>();
HashSet<String> expectedNames = new HashSet<String>();
expectedNames.add("public");
expectedNames.add("static");
expectedNames.add("final");
if (ctx.listenerModifier() != null && !ctx.listenerModifier().isEmpty()) {
List<ModifiersParser.ListenerModifierContext> modifiers = ctx.listenerModifier();
// Collect all the modifier names in a set.
for (ModifiersParser.ListenerModifierContext modifier : modifiers) {
uniqueNames.add(modifier.getText());
allNames.add(modifier.getText());
}
}
// Is the number of unique modifiers less than the number of
// all given modifiers? If so, then there must be duplicates.
if (uniqueNames.size() < allNames.size()) {
ArrayList<String> names = new ArrayList<String>(allNames);
for (String name : uniqueNames){
names.remove(name);
}
System.out.println("\tDetected duplicate modifiers : " + names);
} else {
System.out.println("\t(No duplicate modifiers detected)");
}
//Are we missing any expected modifiers?
if (!uniqueNames.containsAll(expectedNames)) {
ArrayList<String> names = new ArrayList<String>(expectedNames);
names.removeAll(uniqueNames);
System.out.println("\tDetected missing modifiers : " + names);
} else {
System.out.println("\t(No missing modifiers detected)");
}
}
});
parser.listenerField();
System.out.println("\t(done)");
}
private static ModifiersParser createParser(String code) {
ANTLRInputStream input = new ANTLRInputStream(code);
ModifiersLexer lexer = new ModifiersLexer(input);
ModifiersParser parser = new ModifiersParser(new CommonTokenStream(lexer));
BaseErrorListener errorListener = createErrorListener();
lexer.addErrorListener(errorListener);
parser.addErrorListener(errorListener);
return parser;
}
private static BaseErrorListener createErrorListener() {
BaseErrorListener errorListener = new BaseErrorListener() {
@Override
public void syntaxError(Recognizer<?, ?> recognizer, @Nullable Object offendingSymbol, int line,
int charPositionInLine, String msg, @Nullable RecognitionException e) {
//Print the syntax error
System.out.printf("\t%s at (%d, %d)%n", msg, line, charPositionInLine);
}
};
return errorListener;
}
}
Test Scenarios (output from the code above)
Input : ok
public static final x
Permutation Output
(done)
Predicate Output
(done)
Listener Output
(No duplicate modifiers detected)
(No missing modifiers detected)
(done)
Input : ok
final static public x
Permutation Output
(done)
Predicate Output
(done)
Listener Output
(No duplicate modifiers detected)
(No missing modifiers detected)
(done)
Input : too many modifiers
static public static final x
Permutation Output
extraneous input 'static' expecting 'final' at (1, 14)
(done)
Predicate Output
no viable alternative at input 'static' at (1, 14)
(done)
Listener Output
Detected duplicate modifiers : [static]
(No missing modifiers detected)
(done)
Input : missing modifiers
static x
Permutation Output
no viable alternative at input 'staticx' at (1, 7)
(done)
Predicate Output
no viable alternative at input 'x' at (1, 7)
(done)
Listener Output
(No duplicate modifiers detected)
Detected missing modifiers : [final, public]
(done)
Input : missing & duplicated modifiers
final final x
Permutation Output
no viable alternative at input 'finalfinal' at (1, 6)
(done)
Predicate Output
no viable alternative at input 'final' at (1, 6)
(done)
Listener Output
Detected duplicate modifiers : [final]
Detected missing modifiers : [static, public]
(done)