Why can't virtual functions use return type deduction?

n3797 says:

§ 7.1.6.4/14:

A function declared with a return type that uses a placeholder type shall not be virtual (10.3).

Therefore the following program is ill-formed:

struct s
{
    virtual auto foo()
    {
    }
};

All I can find for the rationale is this vague one-liner from n3638:

virtual

It would be possible to allow return type deduction for virtual functions, but that would complicate both override checking and vtable layout, so it seems preferable to prohibit this.

Can anyone provide further rationale or give a good (code) example that agrees with the above quote?

Scruff answered 9/10, 2014 at 0:55 Comment(0)

The rationale that you included is reasonably clear: naturally, virtual functions are meant to be overridden by subclasses, so you as the designer of the base class should make it as easy as possible for people who inherit your class to provide a suitable override. However, if you use auto, figuring out the return type for the override becomes a tedious task for a programmer. Compilers would have less of a problem with it, but humans would have many opportunities to get confused.

For example, if you see a return statement that looks like this

return a * 3 + b;

you would have to trace the program back to the point of declaration of a and b, figure out the type promotions, and decide what the return type shall be.
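
To make that tracing concrete, here is a minimal sketch in which the same return expression deduces two different types depending on how a and b are declared. The function names scaled_int and scaled_mixed are made up for illustration and do not come from the question.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helpers: identical return expressions, different operand types.
auto scaled_int(short a, char b) {
    return a * 3 + b;   // integral promotions: short and char both become int
}

auto scaled_mixed(int a, double b) {
    return a * 3 + b;   // usual arithmetic conversions: int + double is double
}

static_assert(std::is_same<decltype(scaled_int(1, 'x')), int>::value,
              "short * int + char deduces int");
static_assert(std::is_same<decltype(scaled_mixed(1, 2.0)), double>::value,
              "int * int + double deduces double");
```

Someone writing an override would have to redo this reasoning by hand to spell out a matching return type.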

It appears that the language designers figured out that this would be rather confusing, and decided against allowing this feature.

Swordplay answered 9/10, 2014 at 1:2 Comment(3)
Although it's true that such a feature would make it highly confusing for some humans, it goes against Herb's opinion on the usage of auto (i.e. don't care about types, only that it's a duck). Granted this is a highly contentious issue among C++ users. Still, I'm inclined to believe the real reason is what AndreyT stated. - Placet
guh that whole duck-typing-in-C++ rubbish. Herb is losing it - Chessy
@Placet except a virtual function's type is part of its contract -- you cannot duck-type derive from a virtual function in C++. You have to match the signature nearly exactly. Now, if we expanded covariant return types like std::function does, it would be closer (as any type that could be converted to the return type of the base class would be a legal return type in the child), but even then there would still be issues about the contract being fuzzy. - Xanthin

Well, the deduced return type of the function only becomes known at the point of function definition: the return type is deduced from the return statements inside the function body.

Meanwhile, the vtable is built and override semantics are checked based purely on the function declarations present in the class definition. These checks never relied on function definitions and never needed to see them. For example, the language requires the overriding function to have the same return type as the function it overrides, or a covariant one. When a non-defining function declaration specifies a deduced return type (i.e. auto without a trailing return type), its return type is unknown at that point and remains unknown until the compiler encounters the definition of the function. It is not possible to perform the aforementioned return type check while the return type is unknown. Asking the compiler to somehow postpone the return type check to the point where the type becomes known would require a major qualitative redesign of this fundamental area of the language specification. (I'm not sure it is even possible.)

Another alternative would be to relieve the compiler of that burden under the blanket mandate of "no diagnostic is required" or "the behavior is undefined", i.e. hand the responsibility over to the user, but that would also constitute a major deviation from the former design of the language.
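
As a rough illustration of the declaration-only checking described above, the sketch below uses made-up classes Base and Derived. Every override is validated inside the class definition, before any function body is seen. It also shows that auto with a trailing return type stays legal on a virtual function, because nothing is deduced there.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual Base* clone() const;        // declaration only, no body needed yet
    virtual auto id() const -> int;     // trailing return type: not deduced, so allowed
};

struct Derived : Base {
    Derived* clone() const override;    // covariant return, checked right here
    auto id() const -> int override;    // identical return type, checked right here
    // virtual auto name() const;       // ill-formed: placeholder return type on a virtual function
};

// These definitions could just as well live in a different translation unit.
Base* Base::clone() const { return new Base(*this); }
auto Base::id() const -> int { return 1; }
Derived* Derived::clone() const { return new Derived(*this); }
auto Derived::id() const -> int { return 2; }
```

With a deduced return type, the two checks marked "checked right here" would have nothing to compare until the out-of-class definitions were visible.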

Basically, for a somewhat similar reason you cannot apply the & operator to a function declared as auto f(); but not defined yet, as the example in 7.1.6.4/11 shows.
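
A small sketch of that restriction, using a hypothetical function named answer: its address cannot be taken between the placeholder declaration and the definition, but can be taken afterwards.

```cpp
#include <cassert>

auto answer();                    // placeholder return type, not yet deduced

// int (*too_early)() = &answer;  // ill-formed here: the type of answer is still unknown

auto answer() { return 42; }      // definition: the return type is deduced as int

int (*fp)() = &answer;            // fine now: answer has type int()
```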

Outlook answered 9/10, 2014 at 1:5 Comment(10)
"It is not possible to perform override checking when return type is unknown" I'm not quite sure what you mean here: The function signature determines whether or not the function overrides a virtual function of a base class; the return type is not part of the signature (of non-template functions). - Minium
@dyp: By override checking I mean the comparison of return types, which are required to be either identical or covariant. How are you going to perform that comparison without knowing the actual types? The only option is to leave it as "no diagnostic is required" or "the behavior is undefined", but I don't see it as a good idea. - Outlook
Ah, ok. This has to be postponed as well until the compiler sees the definition. - Minium
Though this might be an argument not to make things too complicated for the compiler writers, I wonder if there's a technical reason why return type deduction cannot be used for virtual functions. (I.e., as long as you provide the definition such that all translation units that call this specific function can see it) - Minium
@dyp: very little is impossible but i think adding general dynamic library support to the language would be in this case. at least practically. - Rieger
@AnT When the function is defined, you do the checking. - Drolet
@curiousguy: It is not possible to implement within the modern C++ translation philosophy (which is fundamentally based on independent translation units). That would require a complete teardown of said philosophy and implementation of global error checking mechanisms that transcend the boundaries of translation units. C++ would become a very very very different language. - Outlook
@AnT Maybe we are talking past each other. I don't even know the proper term for an auto function that is "undefined"? and becomes "known", "defined", "reduced". Until the function declaration is "defined", it cannot be used in any way, just like a regular "undefined" auto declaration. - Drolet
I don't see how this is different from a non virtual auto: the declaration cannot be used before it is "defined" "reduced" whatever it is called. Uselessness alone is not a good reason for forbidding a particular case of a feature. - Drolet
Please see my answer, I explain my POV. - Drolet

auto is an unknown type in a type equation; as usual, the type should be defined at some point. A virtual function needs to have a definition: it is always "used", even if the function is never called in the program.

Short description of the vtable issue

Covariant return types are an implementation issue for the vtable: covariant returns are an internally powerful feature (then castrated by arbitrary language rules). Covariance is limited to derived-to-base conversions of pointers (and references), but the internal power, and hence the difficulty of implementation, is almost that of arbitrary conversions: a derived-to-base conversion can amount to arbitrary code (a derived-to-base conversion restricted to exclusive base class subobjects, aka non-virtual inheritance, would be much simpler).

Covariance in the case of conversion to shared base subobjects (aka virtual inheritance) means that the conversion can not only change the value representation of the pointer but, in the general case, also change its value in an information-losing way.

Hence virtual covariance (covariant return type involving virtual inheritance conversion) means that the overrider cannot be confused with the overridden function in a primary base situation.

Detailed explanation

Basic theory of vtables and primary bases

struct Primbase {
    virtual void foo(); // new
};

struct Der 
     : Primbase { // primary base 
    void foo(); // replace Primbase::foo()
    virtual void bar(); // new slot
};

Primbase is the primary base here: it starts at the same address as the derived object. This is extremely important: for the primary base, the up/down conversions can be done with a reinterpret or C-style cast in the generated code. Single inheritance is so much easier for the implementer because there are only primary base classes. With multiple inheritance, pointer arithmetic is needed.

There is only one vptr in Der, the one of Primbase; there is one vtable for Der, layout compatible with the vtable of Primbase.

Here the usual compiler will not allocate another slot for Der::foo() in the vtable, as the derived function is actually called (in the hypothetical generated C code) with a Primbase* this pointer, not a Der*. The Der vtable has only two slots (plus the RTTI data).
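
The "same address" property can be observed with a small sketch. It reuses the class names above but gives the functions bodies and int return values so that something can be checked; those bodies and the helper primary_base_at_offset_zero are illustrative assumptions. The C++ standard does not promise this layout; it is simply what common ABIs such as Itanium produce for this hierarchy.

```cpp
#include <cassert>

struct Primbase {
    virtual int foo() { return 1; }    // new slot
};

struct Der : Primbase {                // Primbase is the primary base
    int foo() override { return 2; }   // reuses the Primbase::foo() slot
    virtual int bar() { return 3; }    // new slot
};

// Returns true when the Primbase subobject starts at the address of the Der object.
// ABI-dependent: expected to hold on mainstream implementations, not guaranteed.
bool primary_base_at_offset_zero() {
    Der d;
    Primbase* base = &d;               // implicit derived-to-base conversion
    return static_cast<void*>(base) == static_cast<void*>(&d);
}
```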

Primary covariance

Now we add some simple covariance:

struct Primbase {
    virtual Primbase *foo(); // new slot in vtable
};

struct Der 
     : Primbase { // primary base 
    Der *foo(); // replaces Primbase::foo() in vtable
    virtual void bar(); // new slot
};

Here the covariance is trivial, as it involves a primary base. Nothing to see at the compiled code level.

Non-zero offset covariance

More complex:

struct Basebelow {
    virtual void bar(); // new slot
};

struct Primbase {
    virtual Basebelow *foo(); // new
};

struct Der 
     : Primbase, // primary base 
       Basebelow { // base at a non zero offset
    Der *foo(); // new slot?
};

Here the representation of a Der* isn't the same as the representation of its base class subobject pointer Basebelow*. Two implementation choices:

  • (settle) settle on the Basebelow *(Primbase::foo)() virtual call interface for the whole hierarchy: this is a Primbase* (compatible with Der*) but the return value type is not compatible (different representation), so the derived function implementation will convert the Der* to a Basebelow* (pointer arithmetic) and the caller will convert back when doing a virtual call on a Der;

  • (introduce) another virtual function slot in the Der vtable for the function returning a Der*.
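
Whichever choice the implementation makes, the adjustment is visible from user code. The sketch below reuses the hierarchy above with function bodies added for illustration; the helper return_value_is_adjusted is a made-up name. That the two raw addresses differ is an ABI-level expectation for this layout, not something the standard guarantees.

```cpp
#include <cassert>

struct Basebelow {
    virtual void bar() {}
};

struct Primbase {
    virtual Basebelow* foo() { return nullptr; }
};

struct Der : Primbase, Basebelow {
    Der* foo() override { return this; }   // covariant: Der* converts to Basebelow*
};

// True when the pointer returned through the base interface designates the
// Basebelow subobject and its raw address differs from the start of the Der object.
bool return_value_is_adjusted() {
    Der d;
    Primbase* p = &d;
    Basebelow* through_base = p->foo();    // virtual call via the Primbase interface
    Der* direct = d.foo();                 // call with static type Der
    return through_base == static_cast<Basebelow*>(&d)
        && static_cast<void*>(through_base) != static_cast<void*>(direct);
}
```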

Generalized in a sharing hierarchy: virtual covariance

In the general case, base class subobjects are shared by different derived classes; this is the virtual "diamond":

struct B {};
struct L : virtual B {};
struct R : virtual B {};
struct D : L, R {};

Here the conversion to B* is dynamic, based on the runtime type (often using the vptr, or else internal pointers/offsets in the objects, as in MSVC).

In general, such conversions to base class subobject lose information and cannot be undone. There is no reliable B* to L* down conversion. Hence, the (settle) choice is not available. The implementation will have to (introduce).
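
At the source level, the lack of a fixed-offset way back shows up as a rejected static_cast. The sketch below is an approximation of the diamond above: B is given a virtual destructor (an assumption, since the B shown earlier is empty) so that dynamic_cast, which consults the dynamic type at run time, is available. The helper to_left is a made-up name.

```cpp
#include <cassert>

struct B { virtual ~B() = default; };   // made polymorphic so dynamic_cast can be used
struct L : virtual B {};
struct R : virtual B {};
struct D : L, R {};

// Converts a B* back to an L* the only way the language allows across a virtual base.
L* to_left(B* b) {
    // return static_cast<L*>(b);       // ill-formed: B is a virtual base of L
    return dynamic_cast<L*>(b);         // run-time lookup based on the dynamic type
}
```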

Example: Vtable for an override with a covariant return type in the Itanium ABI

The Itanium C++ ABI describes the layout of the vtable. Here is the rule regarding the introduction of vtable entries for a derived class (in particular one with a primary base class):

There is an entry for any virtual function declared in a class, whether it is a new function or overrides a base class function, unless it overrides a function from the primary base, and conversion between their return types does not require an adjustment.

(emphasis mine)

So when a function overrides a declaration in the base class, the return type is compared: if they are similar, that is, one is invariably a primary base class of the other, in other words, always at offset 0, no vtable entry is added.

Back to auto issue

(introduce) is not a complicated implementation choice, but it makes the vtable grow: the layout of the vtable is determined by the number of (introduce) done.

So the layout of the vtable is determined by the number of virtual functions (which we know from class definition), the presence of covariant virtual functions (which we can only know from function return types) and the type of covariance: primary covariance, non-zero offset covariance or virtual covariance.

Conclusion

The layout of the vtable can only be determined knowing the return type of virtual overriders of base class virtual functions returning a pointer (or reference) to a class type. The vtable computation would have to be delayed when there are such overriders in a class.

This would complicate the implementation.

Note: the terms like "virtual covariance" used are all made up, except "primary base" which is officially defined in the Itanium C++ ABI.

EDIT: Why I think constraint checking is not an issue

Checking of covariant constraints is not a problem, doesn't break separate compilation, or the C++ model:

auto overrider of a function returning a class pointer (or reference)

struct B {
    virtual int f();
    virtual B *g();
};

struct D : B {
    auto f(); // int f() 
    auto g(); // ?
};

The type of f() is fully constrained and the function definition must return an int.

The return type of g() is partially constrained: it can be B* or some derived_from_B*. The checking will occur at the definition point.

Overriding of an auto virtual function

Consider a potential derived class D2:

struct D2 : D {
    T1 f(); // T1 must be int 
    T2 g(); // ?
};

Here the constraints on f() could be checked, as T1 must be int, but not the constraints on T2, because the definition of D::g() has not been seen yet. All we know is that T2 must be a pointer to a subclass of B (possibly just B).

The definition of D::g() can be covariant and introduce a stronger constraint:

auto D::g() { 
    return new D;
} // covariant D* return

so T2 must be a pointer to a class derived from D (possibly just D).

Before seeing the definition, we cannot know this constraint.

Because the overriding declaration cannot be checked before seeing the definition, it must be rejected.

For simplicity, I think f() should also be rejected.

Drolet answered 15/8, 2015 at 21:43 Comment(7)
This issue is unrelated to vtables. At the time vtables are generated, all auto occurrences everywhere will have already been deduced. Although you have a point in mentioning increased complexity when checking declarations, this is an issue that can be solved. Not allowing auto there is merely a design choice. - Braunite
@Braunite At what "times" are vtable generated? - Drolet
The compiler will deduce the return type of the method from the return statements. The return type of the method is not "auto", but an actual type the compiler will figure out (en.cppreference.com/w/cpp/language/…). This will happen during syntactic/semantic analysis. The vtables will be generated during code generation, thus the actual type is already known. See also en.cppreference.com/w/cpp/language/translation_phases - Braunite
Then please rephrase your question. I thought you were asking when a compiler generates virtual tables. Code generation (where vtables are generated) happens after syntactic/semantic analysis. Not sure what you mean if not this by "time". - Braunite
As I explained in my answer, the issue I see here is the generation of vtables. The full type information is needed to generate vtables. When exactly are vtables generated? - Drolet
@Braunite What ABI do you propose that supports these deduced return types? - Drolet
You can't actually reinterpret_cast through inheritance like that (except in the trivial case of standard-layout types). - Termination
