Why does MSVC never return struct in RAX for member-functions?
I've stumbled across an oddity in MSVC's codegen regarding structures that are used as return values. Consider the following code (live demo here):

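#include <cstdint>

// Assumed definition; the original snippet does not show how NOINLINE is defined.
#define NOINLINE __declspec(noinline)

struct Result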
{
    uint64_t value;
};

Result makeResult(uint64_t value)
{
    return { value };
}

struct ResultFactory
{
    NOINLINE Result MakeResult(uint64_t value) const
    {
        return { value };
    }
};

We have a struct that perfectly fulfills the x64 ABI's conditions for being returned in RAX. And as long as the free function is used, that is what happens:

value$ = 8
Result makeResult(unsigned __int64) PROC ; makeResult, COMDAT
  mov rax, rcx
  ret 0
Result makeResult(unsigned __int64) ENDP ; makeResult

Now when we look at the member-function, it looks slightly different:

Result ResultFactory::MakeResult(unsigned __int64)const  PROC ; ResultFactory::MakeResult, COMDAT
  mov QWORD PTR [rdx], r8
  mov rax, rdx
  ret 0
Result ResultFactory::MakeResult(unsigned __int64)const  ENDP ; ResultFactory::MakeResult

Here, the compiler requires the caller to pass a pointer to storage for the "Result" as a hidden argument, writes the value through it, and returns that same pointer in RAX. The pointer arrives in RDX, the second argument register, because RCX already holds "this" for member functions.

Why would that be the case? Is there any good reason for it? It seems to needlessly pessimise code-gen, and I really see no benefit. Having RCX always be "this" kind of makes sense, but always requiring a hidden pointer, even for trivial structs? This also means that there is unfortunately a very real difference between using a member function and a free function, as long as neither can be inlined. Or, where a member function is used, it could be faster to just return a primitive type and bit_cast it across the function boundary (whether or not any of that matters is another question, but frankly it shouldn't be the case).
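To make the bit_cast idea concrete, here is a minimal sketch of what I have in mind. The names MakeResultRaw and bit_cast_compat are made up for illustration; bit_cast_compat is a memcpy-based stand-in for C++20's std::bit_cast so the snippet also compiles in older language modes. I have not measured whether this actually helps.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

struct Result
{
    uint64_t value;
};

// Hypothetical stand-in for std::bit_cast (C++20): copies the object
// representation of `from` into a fresh object of type To.
template <class To, class From>
To bit_cast_compat(const From& from)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value, "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value, "From must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

struct ResultFactory
{
    // Would be the out-of-line function: it returns a plain integer,
    // which is a scalar and is therefore returned in RAX.
    uint64_t MakeResultRaw(uint64_t value) const
    {
        return bit_cast_compat<uint64_t>(Result{ value });
    }

    // Thin wrapper, intended to be inlined, that restores the struct
    // type on the caller's side of the function boundary.
    Result MakeResult(uint64_t value) const
    {
        return bit_cast_compat<Result>(MakeResultRaw(value));
    }
};
```

Callers keep writing factory.MakeResult(x); only the non-inlined boundary changes its return type.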

Clang/GCC seem to do it "right". I'm not 100% sure whether this is just an MSVC quirk or actually part of the x64 Windows calling convention (MSDN doesn't really say anything about C++ specifically). Anyone got a clue what's going on here?

EDIT: As pointed out by @Turtlefight, this is indeed mandated by the Windows ABI. My follow-up, or rewording of this question, would then be: why does the Windows ABI make this distinction, when it seems to only lead to worse code-gen and also makes the handling of global and member functions vastly different and thus more complex? In case anyone knows why it was designed that way.

Crapulous answered 28/3 at 14:56 Comment(2)
It must be the ABI. You can try clang-cl to prove that. – Falbala
Back in the old country we just use RAX to convert the unbelievers. – Kerk

This is required by the Windows x64 ABI.

Non-static member functions never return user-defined types in RAX; the value always goes through a return buffer provided by the caller.
Only static member functions and global functions can return user-defined types by value in RAX.
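Since static member functions are grouped with global functions here, one possible workaround is to move the body into a static member that takes the object explicitly. This is only a sketch: MakeResultStatic and the bias member are invented for illustration, and whether the extra indirection pays off depends on what gets inlined.

```cpp
#include <cstdint>

struct Result
{
    uint64_t value;
};

struct ResultFactory
{
    uint64_t bias = 0; // hypothetical state, so the object is actually used

    // Static member function: treated like a global function, so a
    // Result that meets the size and POD-like requirements can come
    // back in RAX. The object is passed as an ordinary parameter.
    static Result MakeResultStatic(const ResultFactory& self, uint64_t value)
    {
        return { self.bias + value };
    }

    // Thin forwarding wrapper that keeps the member-call syntax; it is
    // meant to be inlined so only the static function is called out of line.
    Result MakeResult(uint64_t value) const
    {
        return MakeResultStatic(*this, value);
    }
};
```

Call sites can keep using factory.MakeResult(x) as before.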

x64 calling convention - return values

Return Values

User-defined types can be returned by value from global functions and static member functions. To return a user-defined type by value in RAX, it must have a length of 1, 2, 4, 8, 16, 32, or 64 bits. It must also have no user-defined constructor, destructor, or copy assignment operator. It can have no private or protected non-static data members, and no non-static data members of reference type. It can't have base classes or virtual functions. And, it can only have data members that also meet these requirements. (This definition is essentially the same as a C++03 POD type. Because the definition has changed in the C++11 standard, we don't recommend using std::is_pod for this test.)
Otherwise, the caller must allocate memory for the return value and pass a pointer to it as the first argument. The remaining arguments are then shifted one argument to the right. The same pointer must be returned by the callee in RAX.
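Written out by hand, the "otherwise" case looks roughly like the sketch below for the member function from the question. MakeResult_lowered and call_site are made-up names; this only approximates what the compiler does internally, with this in RCX, the hidden return pointer in RDX and the declared argument shifted to R8.

```cpp
#include <cstdint>

struct Result
{
    uint64_t value;
};

struct ResultFactory
{
};

// Hand-written approximation of the lowered member function:
//   self     -> RCX (the this pointer)
//   ret_slot -> RDX (hidden pointer to caller-allocated storage)
//   value    -> R8  (declared argument, shifted one slot to the right)
Result* MakeResult_lowered(const ResultFactory* self, Result* ret_slot, uint64_t value)
{
    (void)self;               // unused here, just like in the original
    ret_slot->value = value;  // mov QWORD PTR [rdx], r8
    return ret_slot;          // mov rax, rdx
}

// What a call site has to do: allocate the storage itself, pass its
// address, and read the result back through memory.
uint64_t call_site(const ResultFactory& factory, uint64_t value)
{
    Result slot;
    Result* returned = MakeResult_lowered(&factory, &slot, value);
    return returned->value;
}
```

The extra store and load through the return slot is the cost the question is about.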


Clang will also generate the same code if you ask it to compile for the Microsoft x64 ABI:
(with -target x86_64-pc-windows-msvc -fc++-abi=microsoft)

godbolt

"?MakeResult@ResultFactory@@QEBA?AUResult@@_K@Z":
    mov rax, rdx
    mov qword ptr [rdx], r8
    ret
Arawakan answered 28/3 at 16:17 Comment(3)
Oh, how in the hell did I miss that. Probably read that article so many times it didn't register anymore. Well, then my follow-up/edited question would be "why does the x64 ABI mandate that?", when it in fact creates worse code and makes the handling of global and member functions differ substantially. But that might be hard for anyone to say. – Crapulous
This also does not seem to apply to member functions with explicit object parameters (Result MakeResult(this const ResultFactory&, uint64_t value) or Result MakeResult(this ResultFactory, uint64_t value)), and does not seem to be a property of the calling convention (global __thiscall functions return in a register and member __cdecl or __stdcall functions use a pointer). – Crooks
@Crapulous Why it has been implemented this way is unfortunately hard to find out. According to this MSVC issue from 2020, it might have been done to simplify some edge cases around the construction and destruction of non-trivial objects. The x64 calling convention has been around for almost two decades now. Any change to how return values are handled for member functions would be a breaking change, so the chances of that being implemented are unfortunately close to zero. – Arawakan
