The basic problem to solve is that if you cast a pointer to the most derived type to a pointer to one of its bases, the resulting pointer must refer to an address from which every member of that base can be located by code that knows nothing about the derived types. With non-virtual inheritance, this is usually achieved by keeping the exact same layout: the derived object contains a complete base class subobject first and then adds the extra bits of the derived type:
struct base { int x; };
struct derived : base { int y; };
Layout for derived:
--------- <- base & derived start here
x
---------
y
---------
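As a rough check of the layout above, the following sketch measures where the base subobject sits inside a derived object. The helper names `read_x` and `base_offset` are made up for illustration. The zero offset is what common compilers produce, not something the language guarantees for this type.

```cpp
#include <cstddef>

struct base { int x; };
struct derived : base { int y; };

// Code that only knows about `base`: it reads x at a fixed offset from the pointer.
int read_x(const base* b) { return b->x; }

// Byte distance from the start of a derived object to its base subobject
// (hypothetical helper, used only to inspect the layout).
std::ptrdiff_t base_offset(const derived& d) {
    const base* b = &d;  // implicit derived-to-base conversion
    return reinterpret_cast<const char*>(b) - reinterpret_cast<const char*>(&d);
}
```

Calling `read_x(&d)` on a `derived d` returns `d.x`, and on typical implementations `base_offset(d)` is 0, matching the diagram where base and derived start at the same line.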
If you add a second derived type and a most derived type (again, without virtual inheritance), you get something like:
struct derived2 : base { int z; };
struct most_derived : derived, derived2 {};
With this layout:
--------- <- derived::base, derived and most_derived start here
x
---------
y
--------- <- derived2::base & derived2 start here
x
---------
z
---------
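The layout above can be probed with a small sketch like this one. The helpers `derived2_offset` and `has_two_bases` are hypothetical names introduced here, and the exact offset depends on the compiler and platform.

```cpp
#include <cstddef>

struct base { int x; };
struct derived : base { int y; };
struct derived2 : base { int z; };
struct most_derived : derived, derived2 {};

// Byte offset of the derived2 subobject inside a most_derived object
// (hypothetical helper, used only to inspect the layout).
std::ptrdiff_t derived2_offset(const most_derived& m) {
    const derived2* d2 = &m;  // the conversion adjusts the pointer
    return reinterpret_cast<const char*>(d2) - reinterpret_cast<const char*>(&m);
}

// Without virtual inheritance there are two distinct base subobjects,
// one reached through derived and one through derived2.
bool has_two_bases(const most_derived& m) {
    const base* b1 = static_cast<const derived*>(&m);
    const base* b2 = static_cast<const derived2*>(&m);
    return b1 != b2;
}
```

Because there are two copies of x, a plain `m.x` is ambiguous; each copy has to be named explicitly, as in `m.derived::x` and `m.derived2::x`.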
If you have a most_derived object and you bind a pointer/reference of type derived2 to it, it will point to the line marked with derived2::base. Now, if inheritance from base were virtual, there should be a single instance of base. For the sake of discussion, just assume that we naïvely remove the second base:
--------- <- derived::base, derived and most_derived start here
x
---------
y
--------- <- derived2 starts here??
z
---------
Now the problem is that if we obtain a pointer to derived, it has the same layout as the original, but if we tried to obtain a pointer to derived2, the layout would differ and code in derived2 would not be able to locate the x member. We need to do something smarter, and that is where the pointer comes into play. By adding a pointer to each object that inherits virtually, we get this layout:
---------      <- derived starts here
base::ptr --\
y           |  pointer to where the base object resides
--------- <-/
x
---------
Similarly for derived2. Now, at the cost of the extra indirection, we can locate the x member through the pointer. With that in place we can create a most_derived layout with a single base, which could look like this:
---------               <- derived starts here
base::ptr -----\
y              |
---------      |        <- derived2
base::ptr --\  |
z           |  |
--------- <-+--/        <- base
x
---------
Now code in derived and derived2 knows how to access the base subobject (just dereference the base::ptr member object), and at the same time you have a single instance of base. If code in either intermediate class accesses x, it can do so through this->[hidden base pointer]->x, and that will be resolved at runtime to the proper position.
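A minimal sketch of the virtual version follows. The member functions `get_x` and `set_x` and the helper `shares_single_base` are illustrative names, not part of the original example. The hidden base pointer is a model of the mechanism: how a compiler actually records the location of the virtual base is implementation-specific, and some ABIs store an offset in the vtable instead.

```cpp
struct base { int x = 0; };

struct derived : virtual base {
    int y = 0;
    int get_x() const { return x; }  // reaches x through the virtual-base lookup
};

struct derived2 : virtual base {
    int z = 0;
    void set_x(int v) { x = v; }     // writes the same, single x
};

struct most_derived : derived, derived2 {};

// Both intermediate classes lead to the same base subobject
// (hypothetical helper, used only to inspect the layout).
bool shares_single_base(const most_derived& m) {
    const base* via_derived  = static_cast<const derived*>(&m);
    const base* via_derived2 = static_cast<const derived2*>(&m);
    return via_derived == via_derived2;
}
```

A write through `derived2::set_x` is visible through `derived::get_x`, and `m.x` is no longer ambiguous because only one base subobject exists.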
The important bit here is that code compiled at the derived/derived2 layer can be used with an object of that type or of any type derived from it. If we wrote a second most_derived2 type where the order of inheritance was reversed, then the layout of y and z could be swapped, and the offsets from a pointer to the derived or derived2 subobjects to the base subobject would be different, but the code to access x would still be the same: dereference your own hidden base pointer. This guarantees that if a method in derived is the final overrider and it accesses base::x, it will find it regardless of the final layout.
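To see this in practice, the sketch below uses the same `derived::read_x` code on three different complete objects. The names `read_x` and `offset_to_base` are hypothetical, and the concrete offsets are whatever the compiler chooses; the only assumption is that the two inheritance orders place the derived subobject at different positions, which is what common implementations do.

```cpp
#include <cstddef>

struct base { int x = 0; };

struct derived : virtual base {
    int y = 0;
    // Written with no knowledge of any type that later inherits from derived.
    int read_x() const { return x; }
};

struct derived2 : virtual base { int z = 0; };

struct most_derived  : derived, derived2 {};
struct most_derived2 : derived2, derived {};  // inheritance order reversed

// Byte distance from a derived subobject to the base subobject of the
// complete object it lives in (hypothetical helper for inspection only).
std::ptrdiff_t offset_to_base(const derived& d) {
    const base* b = &d;  // for a virtual base this is resolved at run time
    return reinterpret_cast<const char*>(b) - reinterpret_cast<const char*>(&d);
}
```

On typical implementations `offset_to_base` reports different values for a `most_derived` and a `most_derived2` object, yet `read_x` returns the right x in both, as well as in a standalone `derived`.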