auto is an unknown type in a type equation; as usual, the type should be defined at some point. A virtual function needs to have a definition: it is always "used", even if the function is never called in the program.
Short description of the vtable issue
Covariant return types are an implementation issue for the vtable: covariant returns are an internally powerful feature (then castrated by arbitrary language rules). Covariance is limited to derived-to-base conversions of pointers (and references), but the internal power, and hence the difficulty of implementation, is almost that of arbitrary conversions: a derived-to-base conversion can amount to arbitrary code (derived-to-base conversion restricted to exclusive base class subobjects, aka non-virtual inheritance, would be much simpler).
Covariance in the case of conversion to shared base subobjects (aka virtual inheritance) means that the conversion not only can change the value representation of the pointer, but also changes its value in an information-losing way, in the general case.
Hence virtual covariance (a covariant return type involving a virtual inheritance conversion) means that the overrider cannot be merged with the overridden function (that is, share its vtable slot) in a primary base situation.
Detailed explanation
Basic theory of vtables and primary bases
struct Primbase {
    virtual void foo(); // new
};
struct Der : Primbase { // primary base
    void foo();         // replace Primbase::foo()
    virtual void bar(); // new slot
};
Primbase is the primary base here: it starts at the same address as the derived object. This is extremely important: for the primary base, the up/down conversions can be done with a reinterpret or C-style cast in the generated code. Single inheritance is so much easier for the implementer because there are only primary base classes. With multiple inheritance, pointer arithmetic is needed.
There is only one vptr in Der, the one of Primbase; there is one vtable for Der, layout compatible with the vtable of Primbase.
Here the usual compiler will not allocate another slot for Der::foo() in the vtable, as the derived function is actually called (in the hypothetical generated C code) with a Primbase* this pointer, not a Der*. The Der vtable has only two slots (plus the RTTI data).
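The "same address" property of a primary base can be observed directly. The following is a minimal sketch, assuming a typical ABI (for example the Itanium C++ ABI) in which the first polymorphic base becomes the primary base; the helper names primbase_offset and call_foo are made up for this illustration, and the function bodies are added only so that the code links.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Primbase {
    virtual int foo() { return 1; }   // slot introduced by Primbase
};

struct Der : Primbase {               // Primbase is expected to be the primary base
    int foo() override { return 2; }  // overrider, reached through the same slot
    virtual int bar() { return 3; }   // new slot
};

// Hypothetical helper: byte distance between a Der object and its
// Primbase subobject. Zero means the up-conversion is a no-op.
std::ptrdiff_t primbase_offset(Der& d) {
    auto whole = reinterpret_cast<std::uintptr_t>(&d);
    auto base  = reinterpret_cast<std::uintptr_t>(static_cast<Primbase*>(&d));
    return static_cast<std::ptrdiff_t>(base - whole);
}

// Hypothetical helper: virtual call through the base interface.
int call_foo(Primbase& p) {
    return p.foo();
}
```

On such an ABI, primbase_offset returns 0 for any Der object, and call_foo on a Der reaches Der::foo(). The zero offset is a property of the implementation, not a guarantee of the language.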
Primary covariance
Now we add some simple covariance:
struct Primbase {
    virtual Primbase *foo(); // new slot in vtable
};
struct Der : Primbase {      // primary base
    Der *foo();              // replaces Primbase::foo() in vtable
    virtual void bar();      // new slot
};
Here the covariance is trivial, as it involves a primary base. Nothing to see at the compiled code level.
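A runnable variant of this case, as a sketch: the function bodies and the helpers call_through_base and same_address are assumptions added for illustration. The pointer comparison through the base interface is guaranteed by the language; the raw address equality is what a typical ABI gives for a primary base.

```cpp
#include <cassert>

struct Primbase {
    virtual Primbase* foo() { return this; }
};

struct Der : Primbase {                    // primary base on typical ABIs
    Der* foo() override { return this; }   // covariant return type
    virtual void bar() {}
};

// Hypothetical helper: calls foo() through the base interface, so the
// static type of the result is Primbase* even when a Der* is produced.
Primbase* call_through_base(Primbase& p) {
    return p.foo();
}

// Hypothetical helper: compares raw addresses, ignoring static types.
bool same_address(const void* a, const void* b) {
    return a == b;
}
```

When same_address(call_through_base(d), &d) holds for a Der d, converting the returned Der* to Primbase* did not change the address, which is why the overrider needs no adjustment and no extra slot.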
Non-zero offset covariance
More complex:
struct Basebelow {
    virtual void bar();       // new slot
};
struct Primbase {
    virtual Basebelow *foo(); // new
};
struct Der : Primbase,        // primary base
             Basebelow {      // base at a non zero offset
    Der *foo();               // new slot?
};
Here the representation of a Der* isn't the same as the representation of its base class subobject pointer Basebelow*. Two implementation choices:
(settle) settle on the Basebelow *(Primbase::foo)() virtual call interface for the whole hierarchy: this is a Primbase* (compatible with Der*) but the return value type is not compatible (different representation), so the derived function implementation will convert the Der* to a Basebelow* (pointer arithmetic) and the caller will convert back when doing a virtual call on a Der;
(introduce) another virtual function slot in the Der vtable for the function returning a Der*.
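The adjustment can be made visible with a small program. This is a sketch only: the function bodies, the data member below_data and the helpers basebelow_offset and call_through_primbase are assumptions added for illustration, and it does not reveal which of the two choices a given compiler makes. It only shows that some adjustment has to happen somewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Basebelow {
    virtual void bar() {}
    int below_data = 0;
};

struct Primbase {
    virtual Basebelow* foo() { return nullptr; }
};

struct Der : Primbase,                     // primary base on typical ABIs
             Basebelow {                   // base at a non-zero offset
    Der* foo() override { return this; }   // covariant: Der* converts to Basebelow*
};

// Hypothetical helper: byte distance from the Der object to its
// Basebelow subobject.
std::ptrdiff_t basebelow_offset(Der& d) {
    auto whole = reinterpret_cast<std::uintptr_t>(&d);
    auto base  = reinterpret_cast<std::uintptr_t>(static_cast<Basebelow*>(&d));
    return static_cast<std::ptrdiff_t>(base - whole);
}

// Hypothetical helper: the caller only knows the Primbase interface and
// therefore expects a correctly adjusted Basebelow*.
Basebelow* call_through_primbase(Primbase& p) {
    return p.foo();
}
```

For a Der d, call_through_primbase(d) must equal static_cast<Basebelow*>(&d), while d.foo() called directly yields &d. With a non-zero basebelow_offset these are two different addresses produced by the same overrider.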
Generalized in a sharing hierarchy: virtual covariance
In the general case, base class subobjects are shared by different derived classes; this is the virtual "diamond":
struct B {};
struct L : virtual B {};
struct R : virtual B {};
struct D : L, R {};
Here the conversion to B* is dynamic, based on the runtime type (often using the vptr, or else internal pointers/offsets in the objects, as in MSVC).
In general, such conversions to a base class subobject lose information and cannot be undone. There is no reliable B* to L* down conversion (a static_cast from a pointer to a virtual base to a pointer to a derived class is not even allowed). Hence, the (settle) choice is not available. The implementation will have to (introduce).
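The dynamic nature of the conversion can be sketched as follows. The data members and the helpers offset_L_to_B and shares_one_B are assumptions added for illustration (the data members make the subobjects non-empty so that the offsets are observable). The distance from an L subobject to its B subobject depends on the complete object containing the L.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct B { int b = 0; };
struct L : virtual B { int l = 0; };
struct R : virtual B { int r = 0; };
struct D : L, R { int d = 0; };

// Hypothetical helper: byte distance from an L subobject to its virtual
// base B. The conversion to B* is looked up at run time.
std::ptrdiff_t offset_L_to_B(L& l) {
    auto from = reinterpret_cast<std::uintptr_t>(&l);
    auto to   = reinterpret_cast<std::uintptr_t>(static_cast<B*>(&l));
    return static_cast<std::ptrdiff_t>(to - from);
}

// Hypothetical helper: both paths of the diamond reach the same B.
bool shares_one_B(D& d) {
    return static_cast<B*>(static_cast<L*>(&d)) ==
           static_cast<B*>(static_cast<R*>(&d));
}
```

On common implementations, offset_L_to_B gives different results for a standalone L and for the L subobject of a D, so no fixed pointer arithmetic can recover an L* from a B*.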
Example: Vtable for an override with a covariant return type in the Itanium ABI
The Itanium C++ ABI describes the layout of the vtable. Here is the rule regarding the introduction of vtable entries for a derived class (in particular one with a primary base class):
There is an entry for any virtual function declared in a class,
whether it is a new function or overrides a base class function,
unless it overrides a function from the primary base, and conversion
between their return types does not require an adjustment.
(emphasis mine)
So when a function overrides a declaration in the primary base class, the return types are compared: if they are similar, that is, one is invariably a primary base class of the other (in other words, always at offset 0), no vtable entry is added.
Back to the auto issue
(introduce) is not a complicated implementation choice, but it makes the vtable grow: the layout of the vtable is determined by the number of (introduce) steps performed.
So the layout of the vtable is determined by the number of virtual functions (which we know from class definition), the presence of covariant virtual functions (which we can only know from function return types) and the type of covariance: primary covariance, non-zero offset covariance or virtual covariance.
Conclusion
The layout of the vtable can only be determined knowing the return type of virtual overriders of base class virtual functions returning a pointer (or reference) to a class type. The vtable computation would have to be delayed when there are such overriders in a class.
This would complicate the implementation.
Note: the terms like "virtual covariance" used are all made up, except "primary base" which is officially defined in the Itanium C++ ABI.
EDIT: Why I think constraint checking is not an issue
Checking of covariance constraints is not a problem; it breaks neither separate compilation nor the C++ model:
auto overrider of a function returning a pointer (or reference) to a class
struct B {
    virtual int f();
    virtual B *g();
};
struct D : B {
    auto f(); // int f()
    auto g(); // ?
};
The type of f() is fully constrained and the function definition must return an int.
The return type of g() is partially constrained: it can be B* or some derived_from_B*. The checking will occur at the definition point.
Overriding of an auto virtual function
Consider a potential derived class D2:
struct D2 : D {
    T1 f(); // T1 must be int
    T2 g(); // ?
};
Here the constraints on f() could be checked, as T1 must be int, but not the constraints on T2, because the return type of D::g() is not yet known. All we know is that T2 must be a pointer to a subclass of B (possibly just B).
The definition of D::g() can be covariant and introduce a stronger constraint:
auto D::g() {
    return new D;
} // covariant D* return
so T2 must be a pointer to a class derived from D (possibly just D).
Before seeing the definition, we cannot know this constraint.
Because the overriding declaration cannot be checked before seeing the definition, it must be rejected.
For simplicity, I think f() should also be rejected.
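For contrast with the hypothetical declarations discussed here, this is a sketch of what current C++ (since C++14) accepts: a function whose return type is deduced from a placeholder cannot be virtual, but auto with a trailing return type deduces nothing and is allowed, so the covariant return type is visible in the class definition. The function bodies and the helpers call_f and call_g are assumptions added for illustration.

```cpp
#include <cassert>

struct B {
    virtual ~B() = default;
    virtual int f() { return 0; }
    virtual B* g() { return this; }
};

struct D : B {
    // Trailing return types: the type is spelled out, nothing is deduced.
    auto f() -> int override { return 1; }
    auto g() -> D* override { return this; }   // covariant return type

    // auto h_deduced() { return 0; }          // fine: deduced, but not virtual
    // virtual auto h() { return 0; }          // ill-formed: deduced and virtual
};

// Hypothetical helpers: virtual calls through the base interface.
int call_f(B& b) { return b.f(); }
B*  call_g(B& b) { return b.g(); }
```

With the return type written in the declaration, a class derived from D can be checked against D* immediately, without waiting for the definition of D::g().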