My attempt at value initialization is interpreted as a function declaration, and why doesn't A a(()); solve it? [duplicate]
M

5

168

Among the many things Stack Overflow has taught me is what is known as the "most vexing parse", which is classically demonstrated with a line such as

A a(B()); //declares a function

While this, for most, intuitively appears to be the declaration of an object a of type A, taking a temporary B object as a constructor parameter, it's actually a declaration of a function a returning an A, taking a pointer to a function which returns B and itself takes no parameters. Similarly the line
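One way to check how a compiler actually reads these lines is to inspect the declared types at compile time. The sketch below assumes placeholder definitions of A and B (the question does not give any) and uses the standard <type_traits> header:

```cpp
#include <type_traits>

// Placeholder types, assumed here only for illustration.
struct B {};
struct A {
    A() {}
    explicit A(B) {}
};

// Parsed as a function declaration: `a` takes a pointer to a function
// returning B (with no parameters) and returns an A.
A a(B());

// The extra parentheses force B() to be read as an expression,
// so `b` is an object of type A.
A b((B()));

static_assert(std::is_function<decltype(a)>::value,
              "a names a function, not an object");
static_assert(std::is_same<decltype(a), A(B (*)())>::value,
              "the parameter B() is adjusted to B (*)()");
static_assert(std::is_same<decltype(b), A>::value,
              "b is an object of type A");
```

If the static_asserts compile, the compiler agrees with the reading described above.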

A a(); //declares a function

also falls under the same category, since instead of an object, it declares a function. Now, in the first case, the usual workaround for this issue is to add an extra set of parentheses around the B(), as the compiler will then interpret it as the declaration of an object

A a((B())); //declares an object

However, in the second case, doing the same leads to a compile error

A a(()); //compile error

My question is, why? Yes, I'm very well aware that the correct 'workaround' is to change it to A a;, but I'm curious to know what it is that the extra () does for the compiler in the first example which then doesn't work when reapplying it in the second example. Is the A a((B())); workaround a specific exception written into the standard?

Milter answered 15/9, 2009 at 0:13 Comment(10)
(B()) is just a C++ expression, nothing more. It's not any kind of exception. The only difference that it makes is that there's no way it can possibly be parsed as a type, and so it's not. - Portuguese
Ah yes, now I see it. That was along the lines of what I was interested in (i.e. what exactly it was that the extra () was doing around the B()). Just like how A a(((((((((B()))))))))); or any number of parentheses similarly works. - Milter
It should also be noted that the second case, A a(); is not of the same category. For the compiler, there is never any different way to parse it: An initializer at that place never consists of empty parentheses, so this is always a function declaration. - Silverfish
litb's excellent point is subtle yet important and worth emphasizing: the ambiguity in the declaration 'A a(B())' lies in the parsing of 'B()'. It can be both an expression and a declaration, and the compiler must pick the declaration over the expression, so if B() is a declaration then 'a' can only be a function declaration (not a variable declaration). If '()' were allowed as an initializer, 'A a()' would be ambiguous, not between expression and declaration but between variable declaration and function declaration. There is no rule to prefer one declaration over another, so '()' is simply not allowed as an initializer here, and the ambiguity does not arise. - Morula
See what Danny Kalev says about it: informit.com/guides/content.aspx?g=cplusplus&seqNum=439 - Ryder
@Faisal Vali. Your comment is the best answer I have seen. Very good. Could you please add it as an answer? - Pleione
A a(); is not an example of the most vexing parse. It is simply a function declaration, just like it is in C. - Dysfunction
"the correct 'workaround' is to change it to A a;" is wrong. That won't give you initialization of a POD type. To get initialization, write A a{};. - Sargassum
This Q&A is being discussed on meta. - Administrative
Possibly the cause of confusion regarding A a(); stems from the fact that both A *a = new A; and A *b = new A(); are legitimate ways of using new to construct an A. Whether the second of these should ever have been allowed, given the behavior of A a();, is an interesting academic question, but the reality is that particular boat sailed away many, many years ago. - Hoahoactzin
J
73

There is no enlightened answer: it's simply not defined as valid syntax by the C++ language. It is so by definition of the language.

If you do have an expression inside the parentheses, then it is valid. For example:

 ((0));//compiles

Even simpler put: because (x) is a valid C++ expression, while () is not.
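A small sketch of that distinction, with made-up variable names, could look like this. The rejected line is left commented out so the rest still compiles:

```cpp
// Redundant parentheses around an expression still form an expression.
constexpr int one_pair = (0);
constexpr int many_pairs = (((((0)))));

// An empty pair of parentheses is not an expression, so the next line
// would be rejected if uncommented:
// constexpr int nothing = ();

static_assert(one_pair == 0 && many_pairs == 0,
              "parentheses do not change the value");
```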

To learn more about how languages are defined and how compilers work, you should learn about formal language theory, or more specifically context-free grammars (CFG) and related material like finite state machines. If you are interested in that, though, the Wikipedia pages won't be enough; you'll have to get a book.

Joaquinajoash answered 15/9, 2009 at 0:24 Comment(2)
Even simpler put: because (x) is a valid C++ expression, while () is not. - Portuguese
I've accepted this answer. In addition, Pavel's comment on my initial question helped me out a lot. - Milter
O
41

The final solution to this issue is to move to the C++11 uniform initialization syntax if you can.

A a{};

http://www.stroustrup.com/C++11FAQ.html#uniform-init
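As a rough illustration, assuming a placeholder aggregate A with a single int member (not something from the question), the braces give an object where the parentheses give a function:

```cpp
#include <type_traits>

// Placeholder aggregate, assumed only for this sketch.
struct A {
    int x;
};

A a1();            // parentheses: still a function declaration
A a2{};            // braces: an object of type A
constexpr A a3{};  // same form, usable in a constant expression

static_assert(std::is_function<decltype(a1)>::value,
              "a1 is a function");
static_assert(std::is_same<decltype(a2), A>::value,
              "a2 is an object of type A");
static_assert(a3.x == 0,
              "empty braces leave the member zero-initialized");
```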

Operculum answered 21/7, 2011 at 17:35 Comment(0)
A
31

C function declarators

First of all, there is C. In C, A a() is a function declaration. For example, putchar has the following declaration. Normally, such declarations are stored in header files, but nothing stops you from writing them manually if you know what the declaration of the function looks like. Argument names are optional in declarations, so I omitted them in this example.

int putchar(int);

This allows you to write code like this.

int puts(const char *);
int main() {
    puts("Hello, world!");
}

C also allows you to define functions that take functions as arguments, with nice readable syntax that looks like a function call (well, it's readable as long as you don't return a pointer to a function).

#include <stdio.h>

int eighty_four() {
    return 84;
}

int output_result(int callback()) {
    printf("Returned: %d\n", callback());
    return 0;
}

int main() {
    return output_result(eighty_four);
}

As I mentioned, C allows omitting argument names in header files, so output_result would look like this in a header file.

int output_result(int());

One argument in constructor

Don't you recognize that one? Well, let me remind you.

A a(B());

Yep, it's exactly the same function declaration. A is int, a is output_result, and B is int.
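To make the correspondence concrete, here is a sketch (compiled as C++, using only <type_traits>) that redeclares output_result in three spellings and checks that they all name the same function type:

```cpp
#include <type_traits>

// Three spellings of one and the same function declaration.
int output_result(int callback());   // named parameter of function type
int output_result(int());            // same, parameter name omitted
int output_result(int (*)());        // same, after adjustment to a pointer

static_assert(std::is_same<decltype(output_result), int(int (*)())>::value,
              "the parameter int() is adjusted to int (*)()");
```

Because the three lines are redeclarations rather than overloads, decltype(output_result) is unambiguous.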

You can easily notice a conflict between C and the new features of C++: constructors being called with a class name and parentheses, and the alternate initialization syntax with () instead of =. By design, C++ tries to be compatible with C code, so it has to deal with this case, even if practically nobody cares. Therefore, old C features take priority over new C++ features. The grammar of declarations tries to match the name as a function first, and falls back to the new syntax with () only if that fails.

If one of those features didn't exist, or had a different syntax (like {} in C++11), this issue would never have happened for the syntax with one argument.

Now you may ask why A a((B())) works. Well, let's declare output_result with useless parentheses.

int output_result((int()));

It won't work. The grammar requires a parameter declaration to begin with a type, not with a parenthesis.

<stdin>:1:19: error: expected declaration specifiers or ‘...’ before ‘(’ token

However, C++ expects a standard expression here. In C++, you can write the following code.

int value = int();

And the following code.

int value = ((((int()))));

C++ expects what is inside the parentheses to be... well... an expression, as opposed to the type C expects. Parentheses don't mean anything here. However, by inserting useless parentheses, the C function declaration is not matched, and the new syntax can be matched properly (which simply expects an expression, such as 2 + 2).
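As a sketch of the difference (the names f and g are made up for illustration), a C++ compiler should treat the two declarations below differently, even though they differ only by one pair of parentheses:

```cpp
#include <type_traits>

// Redundant parentheses around int() leave a plain expression.
constexpr int value = ((((int()))));

int f(int());     // matches the C grammar: a function declaration
int g((int()));   // cannot be a parameter list: an int object

static_assert(value == 0,
              "int() is a value-initialized int");
static_assert(std::is_function<decltype(f)>::value,
              "f is a function");
static_assert(std::is_same<decltype(g), int>::value,
              "g is an object of type int");
```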

More arguments in constructor

Surely one argument is nice, but what about two? Constructors are not limited to one argument. One standard library class with a constructor that takes two arguments is std::string.

std::string hundred_dots(100, '.');

This is all well and fine (technically, it would be a most vexing parse if it were written as std::string wat(int(), char()), but let's be honest: who would write that?). But let's assume this code has a vexing problem. You would assume that you have to put everything in parentheses.

std::string hundred_dots((100, '.'));

Not quite so.

<stdin>:2:36: error: invalid conversion from ‘char’ to ‘const char*’ [-fpermissive]
In file included from /usr/include/c++/4.8/string:53:0,
                 from <stdin>:1:
/usr/include/c++/4.8/bits/basic_string.tcc:212:5: error:   initializing argument 1 of ‘std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits<char>; _Alloc = std::allocator<char>]’ [-fpermissive]
     basic_string<_CharT, _Traits, _Alloc>::
     ^

I'm not sure why g++ tries to convert char to const char *. Either way, the constructor was called with just one value of type char. There is no overload that takes a single argument of type char, so the compiler is confused. You may ask: why is the argument of type char?

(100, '.')

Yes, the , here is the comma operator. The comma operator takes two operands, evaluates both, and yields the right-hand one. It isn't often useful, but it needs to be known for my explanation.

Instead, to solve the most vexing parse, the following code is needed.

std::string hundred_dots((100), ('.'));

The arguments are in parentheses, not the entire expression. In fact, just one of the expressions needs to be in parentheses, as it's enough to break from the C grammar slightly to use the C++ feature. This brings us to the point of zero arguments.
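The three cases can be checked side by side. In this sketch, wat is the hypothetical vexing declaration mentioned earlier and dots is a made-up name for the fixed version:

```cpp
#include <string>
#include <type_traits>

// Both arguments look like parameter declarations, so this is a function.
std::string wat(int(), char());

// Parenthesizing each argument separately keeps two constructor arguments.
std::string dots((100), ('.'));

// Wrapping both in one pair of parentheses creates a comma expression,
// which yields only its right operand: a single char.
static_assert(std::is_same<decltype((100, '.')), char>::value,
              "(100, '.') has type char");
static_assert(std::is_function<decltype(wat)>::value,
              "wat is a function");
static_assert(std::is_same<decltype(dots), std::string>::value,
              "dots is a std::string object");
```

A compiler may warn that the left operand of the comma has no effect, which is the same observation in another form.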

Zero arguments in constructor

You may have noticed the eighty_four function in my explanation.

int eighty_four();

Yes, this is also affected by the most vexing parse. It's a valid declaration, and one you have most likely seen if you have created header files (and you should). Adding parentheses doesn't fix it.

int eighty_four(());

Why is that so? Well, () is not an expression. In C++, you have to put an expression between parentheses. You cannot write auto value = () in C++, because () doesn't mean anything (and even if it did, like the empty tuple in Python, it would be one argument, not zero). Practically, that means you cannot use the shorthand syntax without C++11's {} syntax, as there is no expression to put in the parentheses, and the C grammar for function declarations will always apply.
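A short sketch of the zero-argument case, where zero is a made-up name and the rejected form is left commented out:

```cpp
#include <type_traits>

int eighty_four();       // empty parentheses: always a function declaration
constexpr int zero{};    // C++11 braces: an int object, value-initialized

// Rejected if uncommented, because () is not an expression:
// int broken(());

static_assert(std::is_same<decltype(eighty_four), int()>::value,
              "eighty_four has function type int()");
static_assert(zero == 0,
              "empty braces value-initialize the int");
```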

Arsenault answered 18/5, 2014 at 19:53 Comment(0)
V
12

Instead of

A a(());

you could use

A a=A();
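A sketch of what this form does, assuming a placeholder aggregate A (its members x and y are invented for the example):

```cpp
#include <type_traits>

// Placeholder aggregate, assumed only for this sketch.
struct A {
    int x;
    double y;
};

// A() is a value-initialized temporary, so a.x and a.y start out as zero.
A a = A();

// The same form applied to a built-in type.
constexpr int i = int();

static_assert(std::is_same<decltype(a), A>::value,
              "a is an object, not a function");
static_assert(i == 0,
              "int() yields zero");
```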
Vladivostok answered 3/2, 2010 at 15:36 Comment(1)
The 'better workaround' is not equivalent. int a = int(); initializes a with 0, int a; leaves a uninitialized. A correct workaround is to use A a = {}; for aggregates, A a; when default-initialization does what you want, and A a = A(); in all other cases, or just use A a = A(); consistently. In C++11, just use A a {}; - Unnatural
E
6

The innermost parens in your example would be an expression, and in C++ the grammar defines an expression to be an assignment-expression or another expression followed by a comma and another assignment-expression (Appendix A.4 - Grammar summary/Expressions).

The grammar further defines an assignment-expression as one of several other types of expression, none of which can be nothing (or only whitespace).

So the reason you can't have A a(()) is simply because the grammar doesn't allow it. However, I can't answer why the people who created C++ didn't allow this particular use of empty parens as some sort of special-case - I'd guess that they'd rather not put in such a special case if there was a reasonable alternative.

Ene answered 15/9, 2009 at 0:33 Comment(0)