Construction/Destruction in the Presence of Multiple Inheritance
How is the above object laid out in memory while it is being constructed? And how do we ensure that a partially constructed object (and its vtable) is safe for constructors to operate on?
Fortunately, it's all handled very carefully for us. Say we're constructing a new object of type D (through, for example, new D). First, the memory for the object is allocated in the heap and a pointer returned. D's constructor is invoked, but before doing any D-specific construction it calls A's constructor on the object (after adjusting the this pointer, of course!). A's constructor fills in the A part of the D object as if it were an instance of A.
d --> +----------+
      |          |
      +----------+
      |          |
      +----------+
      |          |
      +----------+
      |          |       +-----------------------+
      +----------+       |    0 (top_offset)     |
      |          |       +-----------------------+
      +----------+       | ptr to typeinfo for A |
      |  vtable  |-----> +-----------------------+
      +----------+       |        A::v()         |
      |    a     |       +-----------------------+
      +----------+
Control is returned to D's constructor, which invokes B's constructor. (Pointer adjustment isn't needed here.) When B's constructor is done, the object looks like this:
                                   B-in-D
                          +-----------------------+
                          |   20 (vbase_offset)   |
                          +-----------------------+
                          |    0 (top_offset)     |
                          +-----------------------+
d --> +----------+        | ptr to typeinfo for B |
      |  vtable  |------> +-----------------------+
      +----------+        |        B::w()         |
      |    b     |        +-----------------------+
      +----------+        |   0 (vbase_offset)    |
      |          |        +-----------------------+
      +----------+        |   -20 (top_offset)    |
      |          |        +-----------------------+
      +----------+        | ptr to typeinfo for B |
      |          |   +--> +-----------------------+
      +----------+   |    |        A::v()         |
      |  vtable  |---+    +-----------------------+
      +----------+
      |    a     |
      +----------+
But wait... B's constructor modified the A part of the object by changing its vtable pointer! How did it know to distinguish this kind of B-in-D from a B-in-something-else (or a standalone B for that matter)? Simple. The virtual table table told it to do this. This structure, abbreviated VTT, is a table of vtables used in construction. In our case, the VTT for D looks like this:
                                                        B-in-D
                                               +-----------------------+
                                               |   20 (vbase_offset)   |
      VTT for D                                +-----------------------+
+-------------------+                          |    0 (top_offset)     |
| vtable for D      |-------------+            +-----------------------+
+-------------------+             |            | ptr to typeinfo for B |
| vtable for B-in-D |-------------|----------> +-----------------------+
+-------------------+             |            |        B::w()         |
| vtable for B-in-D |-------------|--------+   +-----------------------+
+-------------------+             |        |   |   0 (vbase_offset)    |
| vtable for C-in-D |-------------|-----+  |   +-----------------------+
+-------------------+             |     |  |   |   -20 (top_offset)    |
| vtable for C-in-D |-------------|--+  |  |   +-----------------------+
+-------------------+             |  |  |  |   | ptr to typeinfo for B |
| vtable for D      |----------+  |  |  |  +-> +-----------------------+
+-------------------+          |  |  |  |      |        A::v()         |
| vtable for D      |-------+  |  |  |  |      +-----------------------+
+-------------------+       |  |  |  |  |
                            |  |  |  |  |               C-in-D
                            |  |  |  |  |      +-----------------------+
                            |  |  |  |  |      |   12 (vbase_offset)   |
                            |  |  |  |  |      +-----------------------+
                            |  |  |  |  |      |    0 (top_offset)     |
                            |  |  |  |  |      +-----------------------+
                            |  |  |  |  |      | ptr to typeinfo for C |
                            |  |  |  |  +----> +-----------------------+
                            |  |  |  |         |        C::x()         |
                            |  |  |  |         +-----------------------+
                            |  |  |  |         |   0 (vbase_offset)    |
                            |  |  |  |         +-----------------------+
                            |  |  |  |         |   -12 (top_offset)    |
                            |  |  |  |         +-----------------------+
                            |  |  |  |         | ptr to typeinfo for C |
                            |  |  |  +-------> +-----------------------+
                            |  |  |            |        A::v()         |
                            |  |  |            +-----------------------+
                            |  |  |
                            |  |  |                        D
                            |  |  |            +-----------------------+
                            |  |  |            |   20 (vbase_offset)   |
                            |  |  |            +-----------------------+
                            |  |  |            |    0 (top_offset)     |
                            |  |  |            +-----------------------+
                            |  |  |            | ptr to typeinfo for D |
                            |  |  +----------> +-----------------------+
                            |  |               |        B::w()         |
                            |  |               +-----------------------+
                            |  |               |        D::y()         |
                            |  |               +-----------------------+
                            |  |               |   12 (vbase_offset)   |
                            |  |               +-----------------------+
                            |  |               |    -8 (top_offset)    |
                            |  |               +-----------------------+
                            |  |               | ptr to typeinfo for D |
                            +----------------> +-----------------------+
                               |               |        C::x()         |
                               |               +-----------------------+
                               |               |   0 (vbase_offset)    |
                               |               +-----------------------+
                               |               |   -20 (top_offset)    |
                               |               +-----------------------+
                               |               | ptr to typeinfo for D |
                               +-------------> +-----------------------+
                                               |        A::v()         |
                                               +-----------------------+
D's constructor passes a pointer into D's VTT to B's constructor (in this case, it passes in the address of the first B-in-D entry). And, indeed, the vtable that was used for the object layout above is a special vtable used just for the construction of B-in-D.
Control is returned to the D constructor, and it calls the C constructor (with a VTT address parameter pointing to the "C-in-D+12" entry). When C's constructor is done with the object, it looks like this:
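If you'd like to watch this happen, here is a minimal sketch, assuming the Itanium C++ ABI that GCC and Clang use on most platforms, where the first pointer-sized word of these objects holds the primary vtable pointer. The class definitions are a stand-in for the diamond hierarchy from earlier (the virtual destructor is an addition), and first_word, Snapshot and observe are hypothetical names made up for this illustration. Peeking at the vtable pointer like this is not portable C++; it only serves to show that B's constructor installs a different vtable when it runs as part of a D than when it builds a standalone B.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies out the first pointer-sized word of an object.
// Under the Itanium C++ ABI that word is the primary vtable pointer; other
// ABIs may lay objects out differently.
static std::uintptr_t first_word(const void* p) {
    std::uintptr_t w;
    std::memcpy(&w, p, sizeof w);
    return w;
}

static std::uintptr_t vptr_in_B_ctor = 0;  // written by B's constructor
static std::uintptr_t vptr_in_D_ctor = 0;  // written by D's constructor

struct A { int a = 0; virtual void v() {} virtual ~A() = default; };
struct B : virtual A {
    int b = 0;
    B() { vptr_in_B_ctor = first_word(this); }
    virtual void w() {}
};
struct C : virtual A { int c = 0; virtual void x() {} };
struct D : B, C {
    int d = 0;
    D() { vptr_in_D_ctor = first_word(this); }
    virtual void y() {}
};

struct Snapshot { std::uintptr_t b_in_d, d, b_alone; };

// Constructs a D and then a standalone B, recording the vtable pointer
// each constructor saw at offset 0 of the object.
static Snapshot observe() {
    Snapshot s{};
    { D obj; (void)obj; s.b_in_d = vptr_in_B_ctor; s.d = vptr_in_D_ctor; }
    { B obj; (void)obj; s.b_alone = vptr_in_B_ctor; }
    return s;
}
```

On such an ABI the three recorded values should all differ: one is the B-in-D construction vtable, one is D's own vtable, and one is the vtable for a standalone B.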
                                                                 B-in-D
                                                        +-----------------------+
                                                        |   20 (vbase_offset)   |
                                                        +-----------------------+
                                                        |    0 (top_offset)     |
                                                        +-----------------------+
                                                        | ptr to typeinfo for B |
                    +---------------------------------> +-----------------------+
                    |                                   |        B::w()         |
                    |                                   +-----------------------+
                    |                C-in-D             |   0 (vbase_offset)    |
                    |       +-----------------------+   +-----------------------+
d --> +----------+  |       |   12 (vbase_offset)   |   |   -20 (top_offset)    |
      |  vtable  |--+       +-----------------------+   +-----------------------+
      +----------+          |    0 (top_offset)     |   | ptr to typeinfo for B |
      |    b     |          +-----------------------+   +-----------------------+
      +----------+          | ptr to typeinfo for C |   |        A::v()         |
      |  vtable  |--------> +-----------------------+   +-----------------------+
      +----------+          |        C::x()         |
      |    c     |          +-----------------------+
      +----------+          |   0 (vbase_offset)    |
      |          |          +-----------------------+
      +----------+          |   -12 (top_offset)    |
      |  vtable  |--+       +-----------------------+
      +----------+  |       | ptr to typeinfo for C |
      |    a     |  +-----> +-----------------------+
      +----------+          |        A::v()         |
                            +-----------------------+
As you see, C's constructor again modified the embedded A's vtable pointer. The embedded C and A objects are now using the special construction C-in-D vtable, and the embedded B object is using the special construction B-in-D vtable. Finally, D's constructor finishes the job and we end up with the same diagram as before:
                                   +-----------------------+
                                   |   20 (vbase_offset)   |
                                   +-----------------------+
                                   |    0 (top_offset)     |
                                   +-----------------------+
                                   | ptr to typeinfo for D |
                      +----------> +-----------------------+
d --> +----------+    |            |        B::w()         |
      |  vtable  |----+            +-----------------------+
      +----------+                 |        D::y()         |
      |    b     |                 +-----------------------+
      +----------+                 |   12 (vbase_offset)   |
      |  vtable  |---------+       +-----------------------+
      +----------+         |       |    -8 (top_offset)    |
      |    c     |         |       +-----------------------+
      +----------+         |       | ptr to typeinfo for D |
      |    d     |         +-----> +-----------------------+
      +----------+                 |        C::x()         |
      |  vtable  |----+            +-----------------------+
      +----------+    |            |   0 (vbase_offset)    |
      |    a     |    |            +-----------------------+
      +----------+    |            |   -20 (top_offset)    |
                      |            +-----------------------+
                      |            | ptr to typeinfo for D |
                      +----------> +-----------------------+
                                   |        A::v()         |
                                   +-----------------------+
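The visible effect of all this vtable juggling can be checked in portable C++. The sketch below again uses a stand-in for the diamond hierarchy (with an added virtual destructor); note, trace and trace_construction are hypothetical names invented here. Each constructor hands the object to a helper that only knows it as an A and asks for its dynamic type, so the answer has to come from whichever vtable is installed at that moment.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

struct A;
static std::string trace;        // one letter per constructor, in call order
static void note(const A& obj);  // appends the current dynamic type of obj

struct A { int a = 0; A() { note(*this); } virtual void v() {} virtual ~A() = default; };
struct B : virtual A { int b = 0; B() { note(*this); } virtual void w() {} };
struct C : virtual A { int c = 0; C() { note(*this); } virtual void x() {} };
struct D : B, C      { int d = 0; D() { note(*this); } virtual void y() {} };

static void note(const A& obj) {
    const std::type_info& t = typeid(obj);  // resolved through the object's current vtable
    if      (t == typeid(A)) trace += 'A';
    else if (t == typeid(B)) trace += 'B';
    else if (t == typeid(C)) trace += 'C';
    else if (t == typeid(D)) trace += 'D';
    else                     trace += '?';
}

// Constructs a T and returns what the constructors recorded.
template <class T>
static std::string trace_construction() {
    trace.clear();
    T object;
    (void)object;
    return trace;
}
```

trace_construction<D>() yields "ABCD": the object really does pass for an A, then a B, then a C, and only at the end a D. For a standalone B the result is "AB".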
Destruction occurs in the same fashion but in reverse. D's destructor is invoked. After the user's destruction code runs, the destructor calls C's destructor and directs it to use the relevant portion of D's VTT. C's destructor manipulates the vtable pointers in the same way it did during construction; that is, the relevant vtable pointers now point into the C-in-D construction vtable. Then it runs the user's destruction code for C and returns control to D's destructor, which next invokes B's destructor with a reference into D's VTT. B's destructor sets up the relevant portions of the object to refer into the B-in-D construction vtable. It runs the user's destruction code for B and returns control to D's destructor, which finally invokes A's destructor. A's destructor changes the vtable for the A portion of the object to refer into the vtable for A. Finally, control returns to D's destructor and destruction of the object is complete. The memory once used by the object is returned to the system.
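The reverse walk can be observed the same way. This is a small sketch under the same assumptions as before (a stand-in hierarchy; id, note, trace and trace_destruction_of_D are hypothetical names for the illustration). Each destructor makes a virtual call through an A reference, and the call resolves according to the vtable that destructor has just installed.

```cpp
#include <cassert>
#include <string>

struct A;
static std::string trace;        // one letter per destructor, in call order
static void note(const A& obj);  // records the result of a virtual call on obj

struct A {
    virtual char id() const { return 'A'; }
    virtual ~A() { note(*this); }
};
struct B : virtual A {
    char id() const override { return 'B'; }
    ~B() override { note(*this); }
};
struct C : virtual A {
    char id() const override { return 'C'; }
    ~C() override { note(*this); }
};
struct D : B, C {
    char id() const override { return 'D'; }  // needed: B and C both override id()
    ~D() override { note(*this); }
};

static void note(const A& obj) { trace += obj.id(); }

// Builds a D on the stack, lets it go out of scope, and returns the trace.
static std::string trace_destruction_of_D() {
    trace.clear();
    { D object; (void)object; }
    return trace;
}
```

The result is "DCBA": D's code runs first, then C's, then B's, and the virtual base A goes last.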
Now, in fact, the story is somewhat more complicated. Have you ever seen those "in-charge" and "not-in-charge" constructor and destructor specifications in GCC-produced warning and error messages or in GCC-produced binaries? Well, the fact is that there can be two constructor implementations and up to three destructor implementations.
An "in-charge" (or complete object) constructor is one that constructs virtual bases, and a "not-in-charge" (or base object) constructor is one that does not. Consider our above example. If a B is constructed, its constructor needs to call A's constructor to construct it. Similarly, C's constructor needs to construct A. However, if B and C are constructed as part of a construction of a D, their constructors should not construct A, because A is a virtual base and D's constructor will take care of constructing it exactly once for the instance of D. Consider the cases:
If you do a new A, A's "in-charge" constructor is invoked to construct A.
When you do a new B, B's "in-charge" constructor is invoked. It will call the "not-in-charge" constructor for A.
new C is similar to new B.
A new D invokes D's "in-charge" constructor. We walked through this example. D's "in-charge" constructor calls the "not-in-charge" versions of A's, B's, and C's constructors (in that order).
An "in-charge" destructor is the analogue of an "in-charge" constructor: it takes charge of destructing virtual bases. Similarly, a "not-in-charge" destructor is generated. But there's a third one as well. An "in-charge deleting" destructor is one that deallocates the storage as well as destructing the object. So when is one called in preference to the other?
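The language-level consequence of the in-charge/not-in-charge split is easy to see. In this sketch (a simplified stand-in hierarchy without virtual methods; the tag argument and the a_constructions counter are assumptions added for the demonstration), B, C and D each name an initializer for the virtual base A, but only the constructor of the most-derived object actually uses its initializer.

```cpp
#include <cassert>

static int a_constructions = 0;  // counts how many times A's constructor runs

struct A {
    int a;
    explicit A(int tag) : a(tag) { ++a_constructions; }
};

// B and C each say how to construct A, but that initializer is only used
// when B or C is the complete object (the "in-charge" case).
struct B : virtual A { int b = 0; B() : A(1) {} };
struct C : virtual A { int c = 0; C() : A(2) {} };

// When a D is built, D is in charge: A is constructed once, from D's
// initializer, and the A(1) and A(2) in B and C are skipped.
struct D : B, C { int d = 0; D() : A(3) {} };
```

After D obj; the counter has gone up by exactly one and obj.a is 3, whereas a standalone B ends up with a equal to 1.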
Well, there are two kinds of objects that can be destructed---those allocated on the stack, and those allocated in the heap. Consider this code (given our diamond hierarchy with virtual-inheritance from before):
D d; // allocates a D on the stack and constructs it
D *pd = new D; // allocates a D in the heap and constructs it
/* ... */
delete pd; // calls "in-charge deleting" destructor for D
return; // calls "in-charge" destructor for stack-allocated D
We see that the actual delete operator isn't invoked by the code doing the delete, but rather by the in-charge deleting destructor for the object being deleted. Why do it this way? Why not have the caller call the in-charge destructor, then delete the object? Then you'd have only two copies of destructor implementations instead of three...
Well, the compiler could do such a thing, but it would be more complicated for other reasons. Consider this code (assuming a virtual destructor, which you always use, right?...right?!?):
D *pd = new D; // allocates a D in the heap and constructs it
C *pc = pd; // we have a pointer-to-C that points to our heap-allocated D
/* ... */
delete pc; // call destructor thunk through vtable, but what about delete?
If you didn't have an "in-charge deleting" variety of D's destructor, then the delete operation would need to adjust the pointer just like the destructor thunk does. Remember, the C object is embedded in a D, and so our pointer-to-C above is adjusted to point into the middle of our D object. We can't just delete this pointer, since it isn't the pointer that was returned by malloc() when we constructed it.
So, if we didn't have an in-charge deleting destructor, we'd have to have thunks to the delete operator (and represent them in our vtables), or something else similar.
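Here is a sketch of that scenario which records what the deallocation function actually receives. It assumes a stand-in for the diamond hierarchy with a virtual destructor on A, and gives D a class-specific operator new and operator delete purely as instrumentation; Result and delete_through_C are hypothetical names. That the C subobject sits at a nonzero offset inside D is a property of common ABIs rather than a language guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

static std::uintptr_t allocated = 0;  // address handed out by D::operator new
static std::uintptr_t freed = 0;      // address received by D::operator delete

struct A { int a = 0; virtual void v() {} virtual ~A() = default; };
struct B : virtual A { int b = 0; virtual void w() {} };
struct C : virtual A { int c = 0; virtual void x() {} };
struct D : B, C {
    int d = 0;
    virtual void y() {}
    static void* operator new(std::size_t n) {
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        allocated = reinterpret_cast<std::uintptr_t>(p);
        return p;
    }
    static void operator delete(void* p) {
        freed = reinterpret_cast<std::uintptr_t>(p);
        std::free(p);
    }
};

struct Result { bool pointer_was_adjusted; bool freed_original_block; };

static Result delete_through_C() {
    D* pd = new D;
    C* pc = pd;  // points into the middle of the D object
    bool adjusted = static_cast<void*>(pc) != static_cast<void*>(pd);
    delete pc;   // virtual destructor: D's deleting destructor does the work
    return { adjusted, freed == allocated };
}
```

With GCC, both fields come back true: pc is not the address malloc() returned, yet the block that gets freed is the original one, because the adjustment happens on the way into D's deleting destructor.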
Thunks, Virtual and Non-Virtual
This section not written yet.
Multiple Inheritance with Virtual Methods on One Side
Okay. One last exercise. What if we have a diamond inheritance hierarchy with virtual inheritance, as before, but only have virtual methods along one side of it? So:
class A {
public:
  int a;
};
class B : public virtual A {
public:
  int b;
  virtual void w();
};
class C : public virtual A {
public:
  int c;
};
class D : public B, public C {
public:
  int d;
  virtual void y();
};
In this case the object layout is the following:
                                   +-----------------------+
                                   |   20 (vbase_offset)   |
                                   +-----------------------+
                                   |    0 (top_offset)     |
                                   +-----------------------+
                                   | ptr to typeinfo for D |
                      +----------> +-----------------------+
d --> +----------+    |            |        B::w()         |
      |  vtable  |----+            +-----------------------+
      +----------+                 |        D::y()         |
      |    b     |                 +-----------------------+
      +----------+                 |   12 (vbase_offset)   |
      |  vtable  |---------+       +-----------------------+
      +----------+         |       |    -8 (top_offset)    |
      |    c     |         |       +-----------------------+
      +----------+         |       | ptr to typeinfo for D |
      |    d     |         +-----> +-----------------------+
      +----------+
      |    a     |
      +----------+
So you can see the C subobject, which has no virtual methods, still has a vtable (albeit empty). Indeed, all instances of C have an empty vtable.
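A quick way to convince yourself that C carries a vtable pointer even though it declares no virtual methods is to compare it against a version that inherits from A non-virtually. In this sketch PlainC is a hypothetical comparison class that does not appear in the hierarchy above, and the size comparison assumes an ABI, such as GCC's, that implements virtual bases with a hidden pointer in the object.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct A { int a; };

// Same shape as C in the example: a virtual base, but no virtual methods.
struct C : virtual A { int c; };

// Hypothetical comparison class: identical members, ordinary inheritance.
struct PlainC : A { int c; };

// C is not "polymorphic" as far as the language is concerned...
static_assert(!std::is_polymorphic<C>::value, "C declares no virtual functions");

// ...yet on ABIs that locate virtual bases through the vtable it still needs
// room for a vtable pointer, so it comes out larger than PlainC.
static bool c_has_hidden_pointer() {
    return sizeof(C) > sizeof(PlainC);
}
```

So the vtable for C has no function entries, but it is still where the vbase_offset and top_offset for the C subobject are kept.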