How to convert BNF to EBNF
Asked Answered
D

2

8

How can I convert this BNF to EBNF?

<vardec> ::= var <vardeclist>;
<vardeclist> ::= <varandtype> {;<varandtype>}
<varandtype> ::= <ident> {,<ident>} : <typespec>
<ident> ::= <letter> {<idchar>}
<idchar> ::= <letter> | <digit> | _
Deviant answered 17/2, 2013 at 14:42 Comment(2)
What exactly are you having problems with?Fan
The 'possible duplicate' question has one answer which contains two links to material off SO. It certainly asks an approximately equivalent question; it does not, however, have a very good answer, so it isn't a good duplicate.Omeromero
O
13

EBNF or Extended Backus-Naur Form is ISO 14977:1996, and is available in PDF from ISO for free*. It is not widely used by the computer language standards. There's also a paper that describes it, and that paper contains this table summarizing EBNF notation.

         Table 1: Extended BNF
Extended BNF    Operator  Meaning
-------------------------------------------------------------
unquoted words            Non-terminal symbol
" ... "                   Terminal symbol
' ... '                   Terminal symbol
( ... )                   Brackets
[ ... ]                   Optional symbols
{ ... }                   Symbols repeated zero or more times
{ ... }-                  Symbols repeated one or more times†
=               in        Defining symbol
;               post      Rule terminator
|               in        Alternative
,               in        Concatenation
-               in        Except
*               in        Occurrences of
(* ... *)                 Comment
? ... ?                   Special sequence

The * operator is used with a preceding (unsigned) integer number; it does not seem to allow for variable numbers of repetitions — such as 1-15 characters after an initial character to make identifiers up to 16 characters long.

In the standard, open parenthesis ( is called start group symbol and close parenthesis ) is called end group symbol; open square bracket [ is start option symbol and close square bracket is end option symbol; open brace { is start repeat symbol and close brace } is end repeat symbol. Single quotes ' are called first quote symbol and double quotes " are second quote symbol.

* Yes, free — even though you can also pay 74 CHF for it if you wish. Look at the Note under the box containing the chargeable items.


The question seeks to convert this 'BNF' into EBNF:

<vardec> ::= var <vardeclist>;
<vardeclist> ::= <varandtype> {;<varandtype>}
<varandtype> ::= <ident> {,<ident>} : <typespec>
<ident> ::= <letter> {<idchar>}
<idchar> ::= <letter> | <digit> | _

The BNF is not formally defined, so we have to make some (easy) guesses as to what it means. The translation is routine (it could be mechanical if the BNF is formally defined):

vardec     = 'var', vardeclist, ';';
vardeclist = varandtype, { ';', varandtype };
varandtype = ident, { ',', ident }, ':', typespec;
ident      = letter, { idchar };
idchar     = letter | digit | '_';

The angle brackets have to be removed around non-terminals; the definition symbol ::= is replaced by =; the terminals such as ; and _ are enclosed in quotes; concatenation is explicitly marked with ,; and each rule is ended with ;. The grouping and alternative operations in the original happen to coincide with the standard notation. Note that explicit concatenation with the comma means that multi-word non-terminals are unambiguous.


Casual study of the standard itself suggests that the {...}- notation is not part of the standard, just of the paper. However, as jmmut notes in a comment, the standard does define the meaning of {…}-:

§5.8 Syntactic term

When a syntactic-term is a syntactic-factor followed by an except-symbol followed by a syntactic-exception it represents any sequence of symbols that satisfies both of the conditions:

a) it is a sequence of symbols represented by the syntactic-factor,

b) it is not a sequence of symbols represented by the syntactic-exception.

NOTE - { "A" } - represents a sequence of one or more A's because it is a syntactic-term with an empty syntactic-exception.

Omeromero answered 17/2, 2013 at 15:43 Comment(4)
I'd like to note that the notation {...}- is part of the standard (end of page 4, under 5.8 syntactic-term), just not in an intuitive way. It is shown how only using the definitions of {, } and - a "one or more" rule can be made: ee = {"A"}-,"E"; defines "AE", "AAE", "AAAE", etc. > NOTE { "A" } - represents a sequence of one or more A's because it is a syntactic-term with an empty syntactic-exception.Heraclid
@jmmut: Thank you. Well spotted. 'Tis an interesting piece of minimalism. I've updated my answer. I'm not sure whether it is over-kill; it might be.Omeromero
Agree, it's not directly related to the question, but I'd say it's good to have at least the note, I didn't know it was allowed before reading this answer. Oh, and another detail, the commas (concatenation) are also needed around groups like braces (repetition). definitions like ident in the example should be ident = letter, { idchar }; I think, according to the standard from 4.5 to 4.10.Heraclid
@jmmut: Yup — fixed. Thank you. (As you can tell, I don't have an EBNF verifier to validate what I wrote.)Omeromero
H
2

Remove the angle brackets and put all terminals into quotes:

vardec ::= "var" vardeclist;
vardeclist ::= varandtype { ";" varandtype }
varandtype ::= ident { "," ident } ":" typespec
ident ::= letter { idchar }
idchar ::= letter | digit | "_"
Hunnish answered 17/2, 2013 at 14:46 Comment(1)
To a first approximation; there are some details you need to fix up.Omeromero

© 2022 - 2024 — McMap. All rights reserved.