Inline a virtual function in a method when the object has value semantics
Consider the following code with a template method design pattern:

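#include <iostream>

bool runtime_condition(); // assumed to be defined elsewhere; its result is only known at run time

class A {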
    public:
        void templateMethod() {
            doSomething();
        }
    private:
        virtual void doSomething() {
            std::cout << "42\n";
        }
};
class B : public A {
    private:
        void doSomething() override {
            std::cout << "43\n";
        }
};

int main() {
    // case 1
    A a; // value semantics
    a.templateMethod(); // knows at compile time that A::doSomething() must be called

    // case 2
    B b; // value semantics
    b.templateMethod(); // knows at compile time that B::doSomething() must be called

    // case 3
    A& a_or_b_ref = runtime_condition() ? a : b;  // ref semantics 
    a_or_b_ref.templateMethod(); // does not know which doSomething() at compile time, a virtual call is needed
    return 0;
}

I am wondering whether the compiler is able to inline/devirtualize the doSomething() member function in cases 1 and 2. This would be possible if it generated three different pieces of binary code for templateMethod(): one with no inlining (needed for case 3), and two with either A::doSomething() or B::doSomething() inlined (for cases 1 and 2 respectively).

Do you know whether this optimization is required by the standard, or otherwise whether any compiler implements it? I know that I can achieve the same kind of effect with the CRTP (curiously recurring template pattern) and no virtual functions, but the intent would be less clear.
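For reference, the CRTP alternative mentioned above might look roughly like the sketch below. The names Base, A2 and B2 are hypothetical and not part of the original code, and the hook returns a string instead of printing so that the result is easy to check.

```cpp
#include <string>

// CRTP sketch: the base class is parameterised on the derived type, so
// templateMethod() resolves doSomething() at compile time, with no vtable.
template <typename Derived>
class Base {
public:
    std::string templateMethod() {
        // Valid as long as Derived really inherits from Base<Derived>.
        return static_cast<Derived*>(this)->doSomething();
    }
};

// Hypothetical counterpart of A.
class A2 : public Base<A2> {
    friend class Base<A2>;  // lets the base reach the private hook
    std::string doSomething() { return "42"; }
};

// Hypothetical counterpart of B.
class B2 : public Base<B2> {
    friend class Base<B2>;
    std::string doSomething() { return "43"; }
};
```

Calling templateMethod() on an A2 object yields "42" and on a B2 object yields "43". Note that A2 and B2 no longer share a common base class, so case 3 (choosing between the two through a single reference at run time) cannot be written this way.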

Goldfish answered 18/12, 2014 at 5:10 Comment(8)
It seems to me that even with aggressive optimization most compilers will fail to inline the static versions, because there are many examples of functions with the same signature that can't be inlined. For example, if you had some external memory access in your functions: 'cout << *p', where 'p' is a member of the class. The signature of doSomething() is the same as in your example, but inlining can't be done. But it's just an opinion.Osuna
Well maybe I am wrong but I always thought that any function could technically be inlined without condition provided it is neither virtual nor recursive. In your example, I don't see why the compiler would not be able to inline if the function is non-virtual.Spent
Hm, I always imagined that inlining is some sort of pasting the code directly, so technically you have code with a 'cout << *(this->p)' statement. You need information about the 'this' pointer in that code, but with inlining you will lose it. Am I wrong?Osuna
case 3 can also be done at compile time as b_ref is a simple alias of b. Something like A& b_ref = runtime_condition() ? a : static_cast<A&>(b); would require virtual call.Subdual
@FominArseniy For the compiler, "this" is just an argument. So except for virtual ones, member functions are technically not different from regular functions.Spent
@Subdual Yes, this is the kind of example I am referring to. I will edit my post based on your suggestion, but I think the static_cast is not necessary, right?Spent
@Bérenger: true, static_cast not needed.Subdual
Virtual function calls are not that expensive anyway.Proprietor

The standard does not require optimisations in general (occasionally it goes out of its way to allow them); it specifies the outcome and it is up to the compiler to figure out how best to achieve it.

In all three cases I would expect templateMethod to be inlined. The compiler is then free to perform further optimisations; in the first two cases it knows the dynamic type of this and so can generate a non-virtual call for doSomething. (I'd then expect it to inline those calls.)

Have a look at the generated code and see for yourself.
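As a rough aid for doing that, here is a minimal sketch that puts each case in its own function so the three call sites are easy to find in the assembly. The function names case1, case2 and case3 are hypothetical, and the hook returns an int instead of printing (a change from the question's code) to keep the output short.

```cpp
class A {
public:
    int templateMethod() { return doSomething(); }
    virtual ~A() = default;
private:
    virtual int doSomething() { return 42; }
};

class B : public A {
private:
    int doSomething() override { return 43; }
};

// Cases 1 and 2: the object is a local value, so its dynamic type is
// visible to the compiler once templateMethod() has been inlined.
int case1() { A a; return a.templateMethod(); }
int case2() { B b; return b.templateMethod(); }

// Case 3: only a reference is visible here, so the dynamic type is not
// known inside this function and a virtual call is normally required.
int case3(A& a_or_b) { return a_or_b.templateMethod(); }
```

Compiling with optimisation enabled and asking for assembly output (for example g++ -O2 -S) lets you check whether case1 and case2 reduce to returning a constant and whether case3 still contains an indirect call. The result depends on the compiler, its version and the flags used, so treat any single observation as specific to that setup.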

Proprietor answered 18/12, 2014 at 9:25 Comment(2)
Based on Jarod42's comment, I edited the code so that it is not possible for the compiler to know at compile time what to call in case 3.Spent
Well then that case won't be inlined. Edited answer.Proprietor

Optimisation is a concern for the compiler, not for the standard. It would be a major bug if an optimisation violated the semantics of virtual functions.

So in the third case:

// case 3
A& b_ref = b; // ref semantics   
b_ref.templateMethod();

the actual object is a B, and the function actually called must be the one defined in class B, whatever reference or pointer is used.

And my compiler correctly displays 43. Had it displayed anything else, I would have changed compilers immediately ...

Subdominant answered 18/12, 2014 at 10:36 Comment(8)
My point is that in cases 1 and 2, the compiler knows the object's type at compile time and therefore does not need to generate a virtual call. Of course, when it doesn't know, as in case 3, it must respect the virtual call mechanism. However, in order to handle all three cases with both the right behaviour and full optimisation, it needs to generate 3 different versions of the templateMethod() object code.Spent
@Goldfish It doesn't need to generate 3 versions of templateMethod. It inlines it, i.e. replaces the call with the function body, in three different places, and then optimises the resulting code as part of compiling main.Proprietor
@Goldfish: I admit that I consider it to be the compiler developers' problem, not mine. I tried to analyse optimised code and soon gave up: it was quicker than the unoptimised code, but I could hardly recognise what I had written in the source. Now I only attempt low-level optimisation if I have a performance problem, and only after identifying the bottleneck.Subdominant
@SergeBallesta I agree, but it is always good to know when the compiler is smart enough or not, so you don't try to optimize later on something which was optimized behind the scenes from the beginning.Spent
@AlanStokes I am concerned with the inlining of doSomething(), not templateMethod(). Suppose templateMethod() is recursive depending on a run-time condition: it won't be inlined. But will doSomething() be inlined? If so, it implies 3 different versions of the object code for templateMethod().Spent
No; it implies that only in the sense that when a function is inlined the code generated for the function is customised for that call site. Read my answer and my comment above. (Compilers will happily inline recursive functions, btw - just not to infinite depth.) You clearly find your existing answers unsatisfactory; perhaps you should clarify what you are actually asking.Proprietor
@AlanStokes "the code generated for the function is customised for that call site". Yes, but suppose that for whatever reason, templateMethod() is not inlined. Then if the compiler is naive it will generate code for this method at ONE site. Then it will fail to inline doSomething() at this site since it really needs to inline 2 different codes: A::doSomething() and B::doSomething().Spent
@AlanStokes Btw your answer is interesting, but not completely addressing the problem, hence my comments.Spent
