Is there a way to generate unit tests for my grammar?
I created my grammar using ANTLR4, but I want to test its robustness. Is there an automatic tool, or a good way to do that quickly?

Thanks :)

Imbroglio answered 17/12, 2015 at 15:52 Comment(0)
The only way I found to create unit tests for a grammar is to create a number of examples from a written spec of the given language. This is neither fast nor complete, but I see no other way.

You could be tempted to create test cases directly from the grammar (writing a tool for that isn't that hard). But think a moment about this. What would you test then? Your unit tests would always succeed, unless you use generated test cases from an earlier version of the grammar.

A special case is when you write a grammar for a language that already has a grammar for another parser generator. In that case you could use the original grammar to generate test cases, which you can then use to test your new grammar for conformity.

However, I don't know any tool that can generate the test cases for you.

Update

In the meantime I had another idea that allows for better testing: use a sentence generator that produces random sentences from your grammar (I'm currently working on one in my Visual Studio Code ANTLR4 extension). The produced sentences can then be examined for their validity using a heuristic approach:

  • Confirm the base structure.
  • Check for mandatory keywords and their correct order.
  • Check that identifiers and strings are valid.
  • Watch out for unusual constructs that are not valid according to the language.
  • ...

This would already cover a good part of the language, but it has limits. Matching code and generating it are not 1:1 operations. A grammar rule that matches certain (valid) input might generate much more than that (and so can produce invalid input).
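To sketch the idea with a toy grammar (the rule names, the terminal placeholders, and the depth bound are made up for illustration; the real generator in the vscode-antlr4 extension works from the ANTLR grammar itself):

```python
import random

# Toy grammar as production rules; strings not in GRAMMAR are terminals.
GRAMMAR = {
    "stmt": [["ID", "=", "expr", ";"]],
    "expr": [["ID"], ["NUM"], ["expr", "+", "expr"]],
}

def generate(symbol, rng, depth=0):
    """Expand `symbol` into a random sentence, bounding recursion depth."""
    if symbol not in GRAMMAR:
        return symbol                       # terminal: emit as-is
    options = GRAMMAR[symbol]
    if depth > 4:                           # avoid runaway recursion
        options = [o for o in options if symbol not in o] or options
    production = rng.choice(options)
    return " ".join(generate(s, rng, depth + 1) for s in production)

sentence = generate("stmt", random.Random(1))
# Heuristic validity checks on the generated sentence:
assert sentence.endswith(";")               # base structure
assert "=" in sentence                      # mandatory token in correct place
```

The heuristic checks at the end mirror the bullet list above: they confirm base structure and mandatory tokens without fully re-parsing the sentence.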

Cockleshell answered 18/12, 2015 at 9:36 Comment(0)
Since real unit tests for ANTLR are so hard to find, I wrote two articles about it:

A lexer test checks whether a given text is read and converted to the expected token sequence. It is useful, for instance, to catch ambiguity errors.
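In concept (with a minimal hand-rolled tokenizer standing in for the generated lexer; with ANTLR you would instead instantiate the generated lexer class, e.g. a hypothetical MyGrammarLexer, and read tokens from its token stream):

```python
import re

# Stand-in token definitions; in ANTLR these live in the .g4 file.
TOKEN_SPEC = [
    ("INT",  r"\d+"),
    ("PLUS", r"\+"),
    ("ID",   r"[a-zA-Z_]\w*"),
    ("WS",   r"\s+"),
]

def tokenize(text):
    """Return the list of token type names for `text`, skipping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        for name, pattern in TOKEN_SPEC:
            m = re.match(pattern, text[pos:])
            if m:
                if name != "WS":
                    tokens.append(name)
                pos += m.end()
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return tokens

# The lexer unit test: given text, assert the expected token type sequence.
assert tokenize("a + 42") == ["ID", "PLUS", "INT"]
```

The test pins down the exact token sequence, so a later grammar change that silently re-tokenizes the input (for example, an ambiguity between ID and a new keyword) fails loudly.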

A parser test takes a sequence of tokens (that is, it starts after the lexer part) and checks whether that token sequence traverses the expected rules (Java methods).
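The idea can be sketched with a tiny recursive-descent parser that records which rule methods it traverses (the rules `expr` and `term` are hypothetical; with ANTLR you would feed a token stream to the generated parser and assert on the resulting rule contexts or parse tree):

```python
class Parser:
    """Minimal stand-in for a generated parser; records visited rules."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.visited = []          # rule (method) names traversed

    def match(self, kind):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == kind:
            self.pos += 1
            return
        raise SyntaxError(f"expected {kind} at position {self.pos}")

    def expr(self):                # expr : term (PLUS term)* ;
        self.visited.append("expr")
        self.term()
        while self.pos < len(self.tokens) and self.tokens[self.pos] == "PLUS":
            self.match("PLUS")
            self.term()

    def term(self):                # term : INT | ID ;
        self.visited.append("term")
        kind = self.tokens[self.pos] if self.pos < len(self.tokens) else None
        self.match("INT" if kind == "INT" else "ID")

# The parser unit test: this token sequence must traverse these rules.
p = Parser(["ID", "PLUS", "INT"])
p.expr()
assert p.visited == ["expr", "term", "term"]
```

Starting from tokens rather than text keeps the parser test independent of lexer bugs, which is exactly the separation the two articles describe.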

Tm answered 21/12, 2018 at 12:30 Comment(3)
Thanks for the Lexer example! I converted it to Python here gist.github.com/nmz787/cf98aa465a4d071a937cf74788687a54 – Broadax
+1, but please edit "Lexer rules are described as LOWER_CASE". You meant UPPER_CASE probably. And only the first character needs to be upper case. – Cribbs
Great answer, this should be the accepted one. – Piperpiperaceous
In one chapter of his book 'Software Testing Techniques', Boris Beizer addresses the topic of 'syntax testing'. The basic idea is to take a grammar (mentally or actually) and represent it as a syntax diagram (aka railroad diagram). For systematic testing, this graph is then covered: good cases where the input matches the elements, but also bad cases for each node. Iterations and recursive calls are handled like loops, that is, with cases for zero, one, two, one less than max, max, and one above max iterations (i.e. occurrences of the respective syntactic element).
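The iteration-boundary part of this recipe can be sketched mechanically. Suppose, hypothetically, the grammar allows 1 to MAX_ARGS arguments in a call; the names `arg_counts` and `make_case` are made up for illustration:

```python
MAX_ARGS = 5   # hypothetical limit imposed by the language spec

def arg_counts(max_n):
    """Beizer-style boundary iteration counts: zero, one, two, max-1, max, max+1."""
    return [0, 1, 2, max_n - 1, max_n, max_n + 1]

def make_case(n):
    """Build a call with n arguments, paired with whether it should be valid."""
    text = "f(" + ", ".join(f"a{i}" for i in range(n)) + ")"
    return text, 1 <= n <= MAX_ARGS

cases = [make_case(n) for n in arg_counts(MAX_ARGS)]
assert cases[0] == ("f()", False)      # zero iterations: a bad case
assert cases[1] == ("f(a0)", True)     # one iteration: a good case
assert cases[-1][1] is False           # max+1 iterations: a bad case
```

Each generated pair is then fed to the real parser: good cases must parse, bad cases must be rejected, covering both sides of every boundary in the syntax diagram.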

Cecilececiley answered 6/8, 2019 at 20:39 Comment(0)