why is `int test {}` a function definition in C language BNF
Asked Answered
C

2

6

I'm interested in the famous The syntax of C in Backus-Naur Form and studied for a while, what confuse me is that some syntax looks wrong to me but is considered right according to the BNF.

For example, int test {}, what's this? I think this is a ill syntax in C, but the truth is the BNF considered this a function definition:

int -> type_const -> type_spec -> decl_specs
test-> id -> direct_declarator -> declarator
'{' '}' -> compound_stat
decl_specs declarator compound_stat -> function_definition

I tried this with bison, it considered the input int test {} is a right form, but I tried this on a C compiler, it will not compile.

So got questions:

  1. int test {} a right syntax or not?
  2. If it is a right syntax, what is that mean and why compiler do not recognized it?
  3. If it is an ill syntax, can I say the BNF is not rigorous? And does that mean modern C compiler does not stick with this BNF?
Colorable answered 22/3, 2019 at 7:20 Comment(0)
S
3

The grammar is necessary but not sufficient to describe a valid C program. For that you need constraints from the standard too. A simpler example of this would be 0++, which follows the syntax of a C expression, but certainly isn't a valid program fragment...

C11 6.9.1p2:

  1. The identifier declared in a function definition (which is the name of the function) shall have a function type, as specified by the declarator portion of the function definition. [162]

The footnote 162 explains that the intent of the constraint is that a typedef cannot be used, i.e. that

typedef int F(void);
F f { /* ... */ }

will not be valid, even though such a typedef could be used for a function declaration, i.e.

F f;

would declare the function

int f(void);

But mere existence of this constraint also proves that the BNF grammar in itself is not sufficient in this case. Hence you are correct in that the grammar would consider such a fragment a function definition.

Saleratus answered 22/3, 2019 at 8:48 Comment(11)
Can I understand that this in this BNF, int test {} is syntax valid, but semantics wrong?Colorable
Thanks for answering. I'm writing a parser for C language, do you happen to know if there's a more 'accurate' BNF? A BNF which follows the constraints as much as possible, to give int test {} a wrong syntax error?Colorable
@reavenisadesk No, I doubt that is ever possible... At least not in BNF. You just need to go through the parsed tree and check the constraints otherwise.Bifoliate
You can consider the English language, if you'd write a BNF for all the semantically valid sentences it would be quite a bit more complicated than the BNF of syntax...Bifoliate
You're right, I guess the test-> id -> direct_declarator -> identifier path just isn't possible after all. Unless there's another exotic way of defining a function.Lint
@AnttiHaapala I understand, I just got a feeling this is not a semantically problem but a syntax problem... Just like as if "Noun Noun Verb" is allow in English but you can't say "sheep grass eat"...Colorable
But you can say "Snow sheep eat." and it is syntactically and semantically correct to a letter!Bifoliate
@antti: but in this case, the constraint really is syntactic. And it would have been possible to write it out in BNF, but they chose not to because the resulting bloat would have made the grammar even harder to read. The tagged-non-terminal formalism used by Ecmascript (which is "just" syntactic sugar over BNF) makes this sort of constraint much easier to write (and read).Decontrol
In C (but not C++), lvalue can also be defined syntactically. In theory. But practically no-one would actually want to write and validate a grammar which did that.Decontrol
@Decontrol and not only an lvalue but modifiable lvalue - foo++ does not make sense for int foo[]...Bifoliate
@antti: true, you cannot reject non-modifiable lvalues syntactically. Like you can't reject redefinitions. But you could specify that a function definition has to be based on a function declaration, if you wanted to.Decontrol
R
1

The BNF form is a precise way to describe the syntax of a language, i.e. what to do precisely to get the parse tree starting from raw input.

For each language you can define infinite many grammars that describe that language. The properties of these grammars that describe the same language can differ a lot.

If you study the grammar of the C language, take care as it is not context free but context sensitive, which means, the decision of choosing a rule or other depends on what there is around that point in input.

Read about the lexer hack to see how to correctly interpret the Backus Naur form of the C grammar.

Renettarenew answered 22/3, 2019 at 9:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.