Why doesn't this simple function get de-virtualized?

Consider the following code:

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
};

void foo_inner(int *p) noexcept { *p += *p; }
void foo_virtual_inner(A *p) noexcept { *p += *p; }

void foo(int *p) noexcept
{
    return foo_inner(p);
}

struct Aint : public A {
    int i;
    A& operator+=(const A& other) noexcept override final
    { 
// No devirtualization of foo_virtual with:
        i += dynamic_cast<const Aint&>(other).i; 
// ... nor with:
//      i += reinterpret_cast<const Aint&>(other).i; 
        return *this;
    }
};

void foo_virtual(Aint *p) noexcept
{
    return foo_virtual_inner(p);
}

As far as I can tell, both foo() and foo_virtual() should compile to the same object code. The compiler has all the information it needs to de-virtualize the call to operator+= in foo_virtual_inner() when it is called from foo_virtual(). But none of GCC 8.3, MSVC 19.10 or clang 8 does this, even with the maximum optimization flag (-O3 or /Ox).

Why? Is this a bug, or am I missing something?


clang 8 output:

foo(int*):                               # @foo(int*)
        shl     dword ptr [rdi]
        ret
foo_virtual(Aint*):                  # @foo_virtual(Aint*)
        mov     rax, qword ptr [rdi]
        mov     rax, qword ptr [rax]
        mov     rsi, rdi
        jmp     rax                     # TAILCALL

GCC 8.3 output:

foo(int*):
        sal     DWORD PTR [rdi]
        ret
foo_virtual(Aint*):
        mov     rax, QWORD PTR [rdi]
        mov     rax, QWORD PTR [rax]
        cmp     rax, OFFSET FLAT:Aint::operator+=(A const&)
        jne     .L19
        push    rbx
        xor     ecx, ecx
        mov     edx, OFFSET FLAT:typeinfo for Aint
        mov     esi, OFFSET FLAT:typeinfo for A
        mov     rbx, rdi
        call    __dynamic_cast
        test    rax, rax
        je      .L20
        mov     eax, DWORD PTR [rax+8]
        add     DWORD PTR [rbx+8], eax
        pop     rbx
        ret
.L19:
        mov     rsi, rdi
        jmp     rax
foo_virtual(Aint*) [clone .cold.1]:
.L20:
        call    __cxa_bad_cast

MSVC 19.10 output:

p$ = 8
void foo(int * __ptr64) PROC                                    ; foo
        mov     eax, DWORD PTR [rcx]
        add     eax, eax
        mov     DWORD PTR [rcx], eax
        ret     0
void foo(int * __ptr64) ENDP                                    ; foo

p$ = 8
void foo_virtual(Aint * __ptr64) PROC                  ; foo_virtual
        mov     rax, QWORD PTR [rcx]
        mov     rdx, rcx
        rex_jmp QWORD PTR [rax]
void foo_virtual(Aint * __ptr64) ENDP 

PS - What's the explanation for all of that typeinfo business in the compiled code under GCC?

Hurried answered 1/4, 2019 at 22:42 Comment(7)
It looks like the typeinfo for GCC is because it inlined the call to operator+= (and its contained dynamic_cast) after verifying that that is the function being called. What happens at .L19? – Cavefish
Try compiling with LTO enabled and see if this helps the linker to see the error of its ways. blog.llvm.org/2017/03/devirtualization-in-llvm-and-clang.html – Nash
@MichaelDorgan: 1. Can I do that with GodBolt somehow? 2. Why would the linker need to look into the internals of a function that has no external dependency? – Hurried
@1201ProgramAlarm: Added the .L19 and .L20 lines (they were already available via the link, though). – Hurried
Does it help to make Aint itself final? – Solicitude
@DavisHerring: 1. You could check yourself through the GodBolt link. 2. No :-( – Hurried
@einpoklum: Sorry, all I could do from a small screen was get your hopes up. – Solicitude

GCC guesses that Aint *p points to an instance of Aint (but does not consider this guaranteed), and therefore it speculatively devirtualises the call to operator+=; the typeinfo checking is part of the inlined copy of that function, which contains the dynamic_cast. With -fno-devirtualize-speculatively, GCC produces the same code as Clang and MSVC:

_Z11foo_virtualP4Aint:
.LFB4:
        .cfi_startproc
        movq    (%rdi), %rax
        movq    %rdi, %rsi
        movq    (%rax), %rax
        jmp     *%rax
Diocese answered 4/4, 2019 at 10:53 Comment(2)
I'm not sure I understand what you mean. GCC follows a pointer to the vtable and from there to operator+=. That's not de-virtualized, unless I'm missing something. – Hurried
This is how speculative devirtualization works. If you think pointer ptr will point to an instance where virtual method foo is A::foo, you replace ptr->foo () by: if (ptr->foo == A::foo) A::foo (ptr); /* usually inlined */ else ptr->foo (); The point is to get the indirect call off the hot path and to enable inlining, which often enables other optimizations (nothing very interesting is happening in this particular example, but I would still guess the speculatively devirtualized code is a tiny bit faster). – Sanalda
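
To make the comment above concrete, here is a hand-written, source-level analogue of that guarded call. It is only a sketch: GCC compares the vtable slot against the address of Aint::operator+=, which standard C++ cannot express, so the sketch substitutes a typeid check. Along and add_to_self_speculative are hypothetical names that do not appear in the question, and static_cast stands in for the question's dynamic_cast to keep the example short.

```cpp
#include <cassert>
#include <typeinfo>

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
    virtual ~A() = default;
};

struct Aint : A {
    int i = 0;
    A& operator+=(const A& other) noexcept override final
    {
        // Assumption for brevity: other really is an Aint.
        i += static_cast<const Aint&>(other).i;
        return *this;
    }
};

// Hypothetical second implementation, only here to exercise the fallback path.
struct Along : A {
    long v = 0;
    A& operator+=(const A& other) noexcept override
    {
        v += static_cast<const Along&>(other).v;
        return *this;
    }
};

// Guess that *p is an Aint; verify the guess before acting on it.
void add_to_self_speculative(A* p) noexcept
{
    assert(p != nullptr);
    if (typeid(*p) == typeid(Aint)) {
        // Fast path: the qualified call is non-virtual, so it can be inlined.
        static_cast<Aint*>(p)->Aint::operator+=(*p);
    } else {
        // Slow path: ordinary virtual dispatch through the vtable.
        *p += *p;
    }
}
```

Both paths give the same result as a plain `*p += *p`; the only difference is that the fast path avoids the indirect call when the guess is right.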

Following @JanHubicka's answer, I've filed a bug against GCC:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89924

and it's being worked on (!).

edit: OK, it wasn't really being worked on after all I guess :-(

Hurried answered 27/9, 2019 at 19:54 Comment(0)

The compiler can't assume that an Aint* actually points to an Aint object until it sees some operation that would have undefined semantics otherwise, like referring to one of its non-static members. Otherwise it could be the result of a reinterpret_cast from some other pointer type, waiting to be converted back to that type with another reinterpret_cast.

It seems to me that the standard conversion to A* should be such an operation, but AFAICT the standard doesn't currently say that. Wording to that effect would need to consider converting to a non-virtual base of an object under construction, which is deliberately allowed.
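
As a rough illustration of that distinction (a sketch, not a claim about what any particular compiler version emits): in foo_virtual_direct, a hypothetical function that is not in the question, the call is made through an Aint lvalue, which is only defined if p really points to an Aint. Since operator+= is final there, compilers can typically bind that call statically. In foo_virtual, the pointer is first converted to A*, and the call is then made through A. The static_cast replaces the question's dynamic_cast only to keep the sketch short.

```cpp
#include <cassert>

struct A {
    virtual A& operator+=(const A& other) noexcept = 0;
};

struct Aint : public A {
    int i = 0;
    A& operator+=(const A& other) noexcept override final
    {
        // Assumption for brevity: other really is an Aint.
        i += static_cast<const Aint&>(other).i;
        return *this;
    }
};

// As in the question: the call expression has static type A.
void foo_virtual_inner(A* p) noexcept { *p += *p; }

// As in the question: only a pointer conversion happens here.
void foo_virtual(Aint* p) noexcept { foo_virtual_inner(p); }

// Hypothetical variant: the call expression has static type Aint,
// so the final overrider is known at the call site.
void foo_virtual_direct(Aint* p) noexcept
{
    assert(p != nullptr);
    *p += *p;
}
```

Both functions compute the same thing; comparing their generated code in a tool such as Compiler Explorer is the way to see whether a given compiler treats them differently.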

Automatic answered 27/9, 2019 at 21:45 Comment(3)
"can't assume" <- In the example given above, it doesn't need to assume anything. It knows for a fact that the pointer actually points to an Aint. – Hurried
How does it know that? – Automatic
It knows that because foo_virtual takes an Aint *, and that's what we're compiling. – Hurried
