C++ constructors: why is this virtual function call not safe?

This is from the C++11 standard, section 12.7, paragraph 4, and I find it rather confusing.

  1. What does the last sentence in the text mean exactly?
  2. Why is the last method call in B::B undefined? Shouldn't it just call a.A::f?

4 Member functions, including virtual functions (10.3), can be called during construction or destruction (12.6.2). When a virtual function is called directly or indirectly from a constructor or from a destructor, including during the construction or destruction of the class’s non-static data members, and the object to which the call applies is the object (call it x) under construction or destruction, the function called is the final overrider in the constructor’s or destructor’s class and not one overriding it in a more-derived class. If the virtual function call uses an explicit class member access (5.2.5) and the object expression refers to the complete object of x or one of that object’s base class subobjects but not x or one of its base class subobjects, the behavior is undefined. [ Example:

struct V {
 virtual void f();
 virtual void g();
};

struct A : virtual V {
 virtual void f();
};

struct B : virtual V {
 virtual void g();
 B(V*, A*);
};

struct D : A, B {
 virtual void f();
 virtual void g();
 D() : B((A*)this, this) { }
};

B::B(V* v, A* a) {
 f(); // calls V::f, not A::f
 g(); // calls B::g, not D::g
 v->g(); // v is base of B, the call is well-defined, calls B::g
 a->f(); // undefined behavior, a’s type not a base of B
}

—end example ]

Ceil answered 7/7, 2012 at 18:32 Comment(8)
possible duplicate of Calling virtual functions inside constructors - Quathlamba
No, it's not that problem. It's a different one. The virtual functions issue is secondary. - Rank
@Harvey: it's not a dupe. This question is way beyond the scope of the one you referenced. - Ceil
@Thomas Sorry about that. I am trying to figure out a way to undo the close. - Quathlamba
@Thomas Please accept my apology. It looks like there is no way to undo the close (meta.stackexchange.com/questions/915/…), but they said the close vote will fade out in a couple of days. - Quathlamba
C++03 says "If the virtual function call uses an explicit class member access and the object-expression refers to the object under construction or destruction but its type is neither the constructor or destructor’s own class or one of its bases, the result of the call is undefined." I think it's debatable whether the C++11 language is actually clearer. Maybe a little... - Aide
@Harvey: apology accepted. It's not a big issue. - Ceil
Thinking about it, "If the virtual function call uses an explicit class member access" is probably an error in the std. - Chichi

That portion of the standard is simply telling you that when you are constructing some "large" object J whose base class hierarchy includes multiple inheritance, and you are currently sitting inside the constructor of some base subobject H, then you are only allowed to use polymorphism of H and its direct and indirect base subobjects. You are not allowed to use any polymorphism outside that subhierarchy.

For example, consider this inheritance diagram (arrows point from derived classes to base classes)

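[Image: inheritance diagram. A red oval encloses H together with its direct and indirect bases, including A, B, E and X; D, G and I lie outside the oval.]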

Let's say we are constructing a "large" object of type J. And we are currently executing the constructor of class H. Inside the constructor of H you are allowed to enjoy typical constructor-restricted polymorphism of the subhierarchy inside the red oval. For example, you can call virtual functions of base subobject of type B, and the polymorphic behavior will work as expected inside the circled subhierarchy ("as expected" means that the polymorphic behavior will go as low as H in the hierarchy, but no lower). You can also call virtual functions of A, E, X and other subobjects that fall inside the red oval.
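The constructor-restricted polymorphism described here can be sketched with a hypothetical three-class chain that borrows the names B, H and J from the diagram (the real diagram has more classes). The call_log vector and the construct_j function are assumptions added only so the result can be inspected.

```cpp
#include <string>
#include <vector>

// Hypothetical chain: J derives from H, H derives from B.
// The log exists only to make the dispatch observable.
std::vector<std::string> call_log;

struct B {
    virtual ~B() = default;
    virtual std::string who() const { return "B"; }
};

struct H : B {
    H() {
        const B* base = this;             // view the object through its B subobject
        call_log.push_back(base->who());  // goes as low as H::who, never J::who
    }
    std::string who() const override { return "H"; }
};

struct J : H {
    std::string who() const override { return "J"; }
};

// Builds a J and records what H's constructor saw, then what the
// finished object reports.
std::vector<std::string> construct_j() {
    call_log.clear();
    J j;
    const B& base = j;
    call_log.push_back(base.who());       // construction is over, so J::who
    return call_log;
}
```

Calling construct_j() yields "H" followed by "J": while H's constructor runs, the call through the B subobject stops at H, and only the completed object dispatches to J.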

However, if you somehow gain access to the hierarchy outside the oval and attempt to use polymorphism there, the behavior becomes undefined. For example, if you somehow gain access to the G subobject from the constructor of H and attempt to call a virtual function of G, the behavior is undefined. The same can be said about calling virtual functions of D and I from the constructor of H.

The only way to obtain such access to the "outside" subhierarchy is for someone to pass a pointer or reference to the G subobject into the constructor of H. Hence the reference to "explicit class member access" in the standard text (although that condition seems superfluous).

The standard includes virtual inheritance in the example to demonstrate how inclusive this rule is. In the above diagram, the base subobject X is shared by both the subhierarchy inside the oval and the subhierarchy outside it. The standard says that it is OK to call virtual functions of the X subobject from the constructor of H.

Note that this restriction applies even if the construction of D, G and I subobjects has been finished before the construction of H began.


This rule is rooted in the practical considerations of implementing polymorphism. In practical implementations, the pointer to the virtual method table (VMT) is introduced as a data field into the object layout of the most basic polymorphic classes in the hierarchy. Derived classes don't introduce their own VMT pointers; they simply provide their own specific values for the pointers introduced by the base classes (and, possibly, longer VMTs).
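As a rough illustration of that layout, here is a hand-rolled model of a VMT pointer that lives in the base class and is merely overwritten by the derived class. Every name in it (Vmt, vmt_ptr, call_name, build_derived and so on) is hypothetical; real compilers generate equivalent machinery implicitly, and their details vary.

```cpp
#include <string>
#include <vector>

// Hand-written model of a virtual method table; not what any particular
// compiler emits, just the general idea.
struct Base;
using Method = std::string (*)(const Base&);
struct Vmt { Method name; };

struct Base {
    const Vmt* vmt_ptr;                   // the only VMT pointer in the hierarchy
    Base();
};

std::string base_name(const Base&)    { return "Base"; }
std::string derived_name(const Base&) { return "Derived"; }

const Vmt base_vmt{base_name};
const Vmt derived_vmt{derived_name};

std::vector<std::string> seen;            // what each constructor stage observed

// Models a virtual call: fetch the table, then call through it.
std::string call_name(const Base& b) { return b.vmt_ptr->name(b); }

Base::Base() : vmt_ptr(&base_vmt) {
    seen.push_back(call_name(*this));     // Base's table is installed
}

struct Derived : Base {
    Derived() {
        vmt_ptr = &derived_vmt;           // no new pointer, just a new value
        seen.push_back(call_name(*this));
    }
};

std::vector<std::string> build_derived() {
    seen.clear();
    Derived d;
    return seen;
}
```

build_derived() returns "Base" then "Derived": the same pointer field is retargeted as each constructor takes over, which is why a virtual call made during construction sees only the table of the constructor currently running.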

Take a look at the example from the standard. The class A is derived from class V. This means that the VMT pointer of A physically belongs to V subobject. All calls to virtual functions introduced by V are dispatched through VMT pointer introduced by V. I.e. whenever you call

pointer_to_A->f();

it is actually translated into

V *v_subobject = (V *) pointer_to_A; // go to V
vmt = v_subobject->vmt_ptr;          // retrieve the table
vmt[index_for_f]();                  // call through the table

However, in the example from the standard, the very same V subobject is also embedded into B. To make the constructor-restricted polymorphism work correctly, the compiler will place a pointer to B's VMT into the VMT pointer stored in V (because while B's constructor is active, the V subobject has to act as part of B).

If at this moment you somehow attempt to call

a->f(); // as in the example

the above algorithm will find B's VMT pointer stored in its V subobject and will attempt to call f() through that VMT. This obviously makes no sense at all. I.e. having virtual methods of A dispatched through B's VMT makes no sense. The behavior is undefined.
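The misdirected dispatch can be replayed with a hand-written model that avoids real virtual calls, so the sketch itself stays well-defined. It assumes the implementation strategy described in this answer (a single VMT pointer stored in the shared V subobject); all names below are hypothetical.

```cpp
#include <string>

// Model of the shared virtual base and the two subobjects that reach
// their table through it.
struct VPart;
struct Vmt { std::string (*f)(const VPart&); };

struct VPart { const Vmt* vmt_ptr; };     // the shared V subobject
struct APart { VPart* v; };               // A reaches its VMT through V
struct BPart { VPart* v; };               // so does B, through the same V

std::string f_of_A(const VPart&) { return "A::f"; }
std::string f_of_B(const VPart&) { return "B::f"; }

const Vmt vmt_for_A{f_of_A};
const Vmt vmt_for_B{f_of_B};

// What a call such as pointer_to_A->f() boils down to in this model.
std::string call_f_through_A(const APart& a) {
    const VPart* v_subobject = a.v;         // go to V
    const Vmt* vmt = v_subobject->vmt_ptr;  // retrieve the table
    return vmt->f(*v_subobject);            // call through the table
}

// Replays the construction order of D: A first, then B.
std::string simulate_call_during_B_constructor() {
    VPart v{nullptr};
    APart a{&v};
    BPart b{&v};
    v.vmt_ptr = &vmt_for_A;               // A's constructor ran earlier
    b.v->vmt_ptr = &vmt_for_B;            // B's constructor is now active
    return call_f_through_A(a);           // finds B's table in the shared V
}
```

In this model simulate_call_during_B_constructor() returns "B::f": the lookup that starts from the A part lands in B's table because both parts share one V.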

This is rather simple to verify with a practical experiment. Let's give B its own version of f and do this:

#include <iostream>

struct V {
  virtual void f() { std::cout << "V" << std::endl; }
};

struct A : virtual V {
  virtual void f() { std::cout << "A" << std::endl; }
};

struct B : virtual V {
  virtual void f() { std::cout << "B" << std::endl; }
  B(V*, A*);
};

struct D : A, B {
  virtual void f() {}
  D() : B((A*)this, this) { }
};

B::B(V* v, A* a) {
  a->f(); // Undefined behavior: which f() is called here?
}

int main() {
  D d;
}

You expect A::f to be called here? I tried several compilers, and all of them actually call B::f! Meanwhile, the this pointer value B::f receives in such a call is completely bogus.

http://ideone.com/Ua332

This happens exactly for the reasons I described above (most compilers implement polymorphism the way I described above). This is the reason the language describes such calls as undefined.

One might note that in this specific example it is actually the virtual inheritance that leads to this unusual behavior. Yes, it happens exactly because the V subobject is shared between the A and B subobjects. It is quite possible that without virtual inheritance the behavior would be much more predictable. However, the language specification apparently decided to draw the line the way it is drawn in my diagram: when you are constructing H, you are not allowed to step out of the "sandbox" of H's subhierarchy, regardless of what inheritance type is used.

Interviewee answered 7/7, 2012 at 18:32 Comment(2)
"Derived classes don't introduce their own VMT pointers" Not always, but they very often need to introduce their own vptr. - Chichi
If pointer_to_A->f(); is implemented in terms of the common base class subobject, as you propose, why wouldn't it work? I bet it would. - Chichi

The last sentence of the normative text that you cite reads as follows:

If the virtual function call uses an explicit class member access and the object expression refers to the complete object of x or one of that object’s base class subobjects but not x or one of its base class subobjects, the behavior is undefined.

This is, admittedly, rather convoluted. This sentence exists to restrict what functions may be called during construction in the presence of multiple inheritance.

The example contains multiple inheritance: D derives from A and B (we'll ignore V, because it is not required to demonstrate why the behavior is undefined). During construction of a D object, both the A and B constructors will be called to construct the base class subobjects of the D object.

When the B constructor is called, the type of the complete object of x is D. In that constructor, a is a pointer to the A base class subobject of x. So, we can say the following about a->f():

  • The object under construction is the B base class subobject of a D object (because this base class subobject is the object currently under construction, it is what the text refers to as x).

  • It uses explicit class member access (via the -> operator, in this case)

  • The type of the complete object of x is D, because that is the most-derived type that is being constructed

  • The object expression (a) refers to a base class subobject of the complete object of x (it refers to the A base class subobject of the D object being constructed)

  • The base class subobject to which the object expression refers is not x and is not a base class subobject of x: A is not B and A is not a base class of B.

Therefore, the behavior of the call is undefined, per the rule we started from at the beginning.

Why is the last method call in B::B undefined? Shouldn't it just call a.A::f?

The rule you cite states that when a virtual function is called during construction, "the function called is the final overrider in the constructor’s class and not one overriding it in a more-derived class."

In this case, the constructor's class is B. Because B does not derive from A, there is no final overrider for the virtual function. Therefore the attempt to make the virtual call exhibits undefined behavior.

Aide answered 7/7, 2012 at 18:51 Comment(8)
So, for this to make sense, the compiler must somehow take into account that & a == this. Otherwise, if we were to call B::B(V *, A*) as a standalone class (not as a subclass of D), the call to a->f() would be well-defined, wouldn't it? - Ceil
If the A* points to some completely unrelated A object that has been fully constructed, then the behavior would be well-defined. The compiler doesn't have to take anything into account: it can just assume that the A* points to a fully-constructed A object, otherwise the behavior is undefined, in which case the behavior of the compiler is unconstrained. - Aide
I'm not sure that's the issue. The thing with A's part of D is that its vtable now points to D's vtable. So D's virtual functions will be called before D is initialized. This is not a problem when using B's vtable, because during construction it's still the original B's. (Yes, I know, vtables are not part of the standard etc. But it's the easiest way to explain the problem IMO). - Rank
@eran: The behavior you describe is unlikely. Both Visual C++ 2012 and g++ 4.7 generate code that updates the A vptr from the A vtable to the D vtable after all base class subobjects have been initialized but before data member initialization begins. When the B constructor is called for the base class subobject, the vptr will still point to the vtable for A. In any case, it should be possible to describe the behavior of a program without considering implementation details, so long as the program exhibits well-defined behavior. - Aide
@eran: A's vtable does not point to D's vtable at that point. In fact A does not have its own vtable pointer at all. It inherits its vtable pointer from V. A goes to V every time it needs access to vtable. That pointer in V will eventually point to D's vtable, but that will only happen when all base class constructors are finished. While B's constructor is working, that pointer actually points to B's vtable (since B and A share the same V). My answer has a practical example that confirms that. - Helman
@eran: If V were a non-virtual base of A, then A vtable (the vtable pointer stored in A's instance of V) would probably point to A's vtable (not D's yet, but A's). But in case of virtual inheritance (i.e. when V is shared by A and B) things work out differently. - Helman
"It uses explicit class member access (via the -> operator, in this case)" which BTW is a completely irrelevant detail. - Chichi
@AnT "If V were a non-virtual base of A," actually, whether or not it is a virtual base "then A vtable" A vptr "would probably point to A's vtable (not D's yet, but A's)." You mean "would point to A in D vtable". Which isn't true, in either case A will be the primary base and have the full vtable. - Chichi

Here's how I understand this: During the construction of an object, each sub-object constructs its part. In the example, it means that V::V() initializes V's members; A initializes A's members, and so on. Since V is initialized before A and B, they can both rely on V's members to be initialized.

In the example, B's constructor accepts two pointers to itself. Its V part is already constructed, so it's safe to call v->g(). However, at that point D's A part has not been initialized yet. Therefore, the call a->f() accesses uninitialized memory, which is undefined behavior.

Edit:

In the D above, A is initialized before B, so there won't be any access to A's uninitialized memory. On the other hand, once A has been fully constructed, its virtual functions are overridden by those of D (in practice: its vtable is set to A's during construction, and to D's once the construction is over). Therefore, the call to a->f() will invoke D::f(), before D has been initialized. So either way - A is constructed before B or after - you're going to call a method on an uninitialized object.

The virtual functions part has already been discussed here, but for completeness: the call to f() uses V::f because B does not override f, so as far as B is concerned, V::f is the only implementation of f. g() calls B::g because B overrides g.
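The three well-defined calls from the standard's example can be checked with a trimmed-down version of it. This is only a sketch: the calls log, the function bodies, the virtual destructor and the calls_during_construction helper are additions for illustration, and the undefined a->f() call is deliberately left out.

```cpp
#include <string>
#include <vector>

std::vector<std::string> calls;           // hypothetical log, not in the standard's example

struct V {
    virtual ~V() = default;
    virtual void f() { calls.push_back("V::f"); }
    virtual void g() { calls.push_back("V::g"); }
};

struct A : virtual V {
    void f() override { calls.push_back("A::f"); }
};

struct B : virtual V {
    void g() override { calls.push_back("B::g"); }
    explicit B(V* v);
};

struct D : A, B {
    void f() override { calls.push_back("D::f"); }
    void g() override { calls.push_back("D::g"); }
    D() : B(static_cast<A*>(this)) {}     // reaches the shared V through the A subobject
};

B::B(V* v) {
    f();      // final overrider of f in B is V::f, not A::f
    g();      // B::g, not D::g
    v->g();   // V is a base of B, so this is well-defined: B::g
}

std::vector<std::string> calls_during_construction() {
    calls.clear();
    D d;
    return calls;
}
```

calls_during_construction() returns "V::f", "B::g", "B::g", matching the comments in the standard's example.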

Rank answered 7/7, 2012 at 18:46 Comment(2)
Are you sure that A's virtual functions are overridden by D's before D's constructor is called? - Ceil
@ThomasMcLeod, quoting your quote of the standard: "under construction or destruction, the function called is the final overrider in the constructor’s or destructor’s class and not one overriding it in a more-derived class". Once A has been constructed, this no longer applies, and its virtual functions may be overridden by the more-derived class D. - Rank
