Generate syntactically correct sentences from an Antlr grammar

Asked 12/12, 2011 at 17:32 Answered 24/3, 2023 at 9:59

I have an Xtext/Antlr grammar that parses a subset of coffeescript. I have some test cases, but I thought of doing another sort of test:

1. Generate random, syntactically correct snippets from my Antlr grammar.
2. Feed these snippets to the original coffeescript parser (calling coffee -ne "the sentence").
3. Check that each sentence is parsed by coffeescript.
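The feeding and checking steps are the mechanical part and could look roughly like the sketch below. It assumes that `coffee` is on the PATH and that it exits with a non-zero status when a snippet fails to parse, which I have not verified for every CoffeeScript version. `find_rejected` is just an illustrative helper name.

```python
import subprocess


def find_rejected(snippets, command=("coffee", "-ne")):
    """Return the snippets that the external parser rejects.

    Each snippet is passed as the last argument of `command`.
    ASSUMPTION: a non-zero exit status means "could not be parsed".
    """
    rejected = []
    for snippet in snippets:
        result = subprocess.run(
            [*command, snippet],
            capture_output=True,  # keep the parser's output off the console
            text=True,
        )
        if result.returncode != 0:
            rejected.append(snippet)
    return rejected
```

Any snippet returned by `find_rejected` was produced from my grammar but refused by the reference parser, so it would point at a place where my grammar is too permissive.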

That way I could ensure that my parser accepts a proper subset of coffeescript and is not too permissive in some cases. However, I am stuck at the first step: how can I generate sentences from my Antlr grammar (which also makes heavy use of syntactic predicates)? In other words, I'm interested in the opposite of parsing a sentence.

I found some related attempts, but the answers do not use Antlr at all; they rely on a custom grammar written in Python, Clojure, or Ruby. I'd prefer a working solution rather than a hint about how it could be implemented.

Casady answered 12/12, 2011 at 17:32 Comment(0)

No, you can't do this. If you look at the code that ANTLR compiles into, you can see that it's only a recognizer, not a generator.

The links you provided are your best bet: take your ANTLR grammar, strip it down to its bare rules so that it becomes a plain formal grammar, and then try to run it through one of those programs.
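To give an idea of what such a program does, here is a minimal sketch in Python of random sentence generation from a plain context-free grammar. The toy `GRAMMAR`, the function names, and the depth-bounding strategy are all illustrative assumptions and are not taken from any of the linked tools. Predicates and actions have no counterpart in this representation, so a stripped grammar may generate sentences that the real ANTLR parser would reject.

```python
import math
import random

# Hypothetical toy grammar: nonterminal -> list of alternatives, each a list
# of symbols. Any symbol that is not a key of the dict is a terminal.
GRAMMAR = {
    "expr": [["term"], ["term", "+", "expr"]],
    "term": [["NUMBER"], ["(", "expr", ")"]],
    "NUMBER": [["1"], ["2"], ["3"]],
}


def alt_depth(alt, grammar, depths):
    """Minimum derivation depth needed to finish one alternative."""
    return 1 + max(
        (depths.get(s, math.inf) for s in alt if s in grammar), default=0
    )


def min_depths(grammar):
    """Minimum derivation depth per nonterminal, by fixed-point iteration."""
    depths = {}
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            best = min(alt_depth(a, grammar, depths) for a in alts)
            if best < depths.get(nt, math.inf):
                depths[nt] = best
                changed = True
    return depths


def generate(grammar, symbol, rng, max_depth, depths=None):
    """Randomly expand `symbol` into terminals, nesting at most `max_depth` deep."""
    depths = depths or min_depths(grammar)
    if symbol not in grammar:
        return [symbol]
    # Only pick alternatives that can still terminate within the remaining depth,
    # so recursive rules cannot run away.
    viable = [a for a in grammar[symbol] if alt_depth(a, grammar, depths) <= max_depth]
    if not viable:
        raise ValueError(f"cannot finish {symbol!r} within depth {max_depth}")
    tokens = []
    for s in rng.choice(viable):
        tokens.extend(generate(grammar, s, rng, max_depth - 1, depths))
    return tokens
```

A call such as `" ".join(generate(GRAMMAR, "expr", random.Random(), max_depth=8))` yields one random sentence; the depth bound is what keeps recursive rules like `expr` from expanding forever.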

Or if your coffeescript subset is very small, you could take the approach of generating strings of random tokens and throwing away all the strings that don't parse.
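That generate-and-filter idea can be sketched as follows. The `accepts` callback stands for whatever wraps your own parser; `balanced` below is only a stand-in recognizer so that the example is self-contained, and all names and default values are made up for illustration.

```python
import random


def sample_accepted(tokens, accepts, rng, max_len=6, attempts=10_000):
    """Return the distinct random token strings that `accepts` lets through."""
    kept = set()
    for _ in range(attempts):
        length = rng.randint(1, max_len)
        candidate = " ".join(rng.choice(tokens) for _ in range(length))
        if accepts(candidate):  # throw away everything that does not parse
            kept.add(candidate)
    return sorted(kept)


def balanced(text):
    """Stand-in recognizer: accepts balanced sequences of '(' and ')' tokens."""
    depth = 0
    for tok in text.split():
        depth += 1 if tok == "(" else -1
        if depth < 0:
            return False
    return depth == 0
```

With `sample_accepted(["(", ")"], balanced, random.Random())` only the balanced strings survive. The acceptance rate drops quickly as the token set and the sentence length grow, which is why this is only practical for a very small language.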

Interpreter answered 8/8, 2012 at 15:47 Comment(0)

I realize this is a very old question, but this is now possible in VS Code using Mike Lischke's amazing extension "ANTLR4 grammar syntax support".

You can right-click on any rule in your grammar and select "Generate Valid Input for Rule". The randomly generated, valid input is written to the "ANTLR4 Sentence Generation" output.

Note that if you have recursion in your grammar, you might hit the default recursion limit, which produces a "⨱" character in the generated output. Searching for the cause of this character is what actually led me to this question.

More details in the extension's docs: https://github.com/mike-lischke/vscode-antlr4/blob/master/doc/sentence-generation.md

Churinga answered 24/3, 2023 at 9:59 Comment(2)

It's so nice to see a positive answer after more than 10 years :) – Casady 27/3, 2023 at 20:10

Glad I could help bring it some closure :) (the real credit going to Mike Lischke of course)! – Churinga 27/3, 2023 at 20:29
