GCC/Clang x86_64 C++ ABI mismatch when returning a tuple?
When trying to optimize return values on x86_64, I noticed a strange thing. Namely, given the code:

#include <cstdint>
#include <tuple>
#include <utility>

using namespace std;

constexpr uint64_t a = 1u;
constexpr uint64_t b = 2u;

pair<uint64_t, uint64_t> f() { return {a, b}; }
tuple<uint64_t, uint64_t> g() { return tuple<uint64_t, uint64_t>{a, b}; }

Clang 3.8 outputs this assembly code for f:

movl $1, %eax
movl $2, %edx
retq

and this for g:

movl $2, %eax
movl $1, %edx
retq

which look optimal. However, when compiled with GCC 6.1, while the generated assembly for f is identical to what Clang output, the assembly generated for g is:

movq %rdi, %rax
movq $2, (%rdi)
movq $1, 8(%rdi)
ret

It looks like the type of the return value is classified as MEMORY by GCC but as INTEGER by Clang. I can confirm that linking Clang-compiled code with GCC-compiled code can result in segmentation faults (Clang-compiled code calling a GCC-compiled g(), which writes to wherever %rdi happens to point) and in an invalid value being returned (GCC-compiled code calling a Clang-compiled g()). Which compiler is at fault?

Sansbury answered 26/5, 2016 at 9:56 Comment(4)
Isn't movl 32-bit and movq 64-bit!? – Ideality
@DieterLücking: Doesn't matter for positive integer constants less than or equal to UINT32_MAX; the upper 32 bits of the register are implicitly set to zero. – Homicide
@DieterLücking The movl instruction clears the rest of the bits in the 64-bit destination register. It's just that the instruction encoding is 1 byte shorter than the equivalent using movq. – Sansbury
@jotik: 2 bytes shorter. The REX version of the 5-byte movl $imm32, %r32 encoding is the 10-byte movabs $imm64, %r64 (REX + opcode + 8-byte immediate). The sign-extending movq $imm32, %r/m64 form is 7 bytes: it needs a mod/rm byte instead of encoding the destination register into the opcode (so it can store to memory, as movq $2, (%rdi) is doing). – Stationmaster
L
4

As davmac's answer shows, the libstdc++ std::tuple is trivially copy constructible, but not trivially move constructible. The two compilers disagree on whether the move constructor should affect the argument passing conventions.

The C++ ABI thread you linked to seems to explain that disagreement: http://sourcerytools.com/pipermail/cxx-abi-dev/2016-February/002891.html

In summary, Clang implements exactly what the ABI spec says, whereas G++ implements what the spec was supposed to say but was never updated to say.
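The two readings can be sketched with type traits. This is only an illustration: TupleLike, spec_as_written and move_aware are hypothetical names, TupleLike is a hand-written stand-in rather than the real libstdc++ std::tuple, and the two predicates paraphrase the two readings discussed in the thread rather than quoting the ABI text.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in: defaulted (trivial) copy constructor,
// user-provided (non-trivial) move constructor, trivial destructor.
struct TupleLike {
    std::uint64_t first;
    std::uint64_t second;
    TupleLike(std::uint64_t a, std::uint64_t b) : first(a), second(b) {}
    TupleLike(const TupleLike&) = default;
    TupleLike(TupleLike&& other) noexcept
        : first(other.first), second(other.second) {}
};

// Reading 1 (assumed paraphrase of the spec as written): only the copy
// constructor and the destructor decide whether registers may be used.
template <class T>
constexpr bool spec_as_written() {
    return std::is_trivially_copy_constructible<T>::value &&
           std::is_trivially_destructible<T>::value;
}

// Reading 2 (assumed paraphrase of the intended rule): a non-trivial
// move constructor also forces the object into memory.
template <class T>
constexpr bool move_aware() {
    return spec_as_written<T>() &&
           std::is_trivially_move_constructible<T>::value;
}

// The two readings disagree for a type shaped like this one.
static_assert(spec_as_written<TupleLike>(), "reading 1 allows registers");
static_assert(!move_aware<TupleLike>(), "reading 2 requires memory");
```

Which convention a given compiler version actually picks for such a type is best checked by inspecting the generated assembly, as in the question.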

Lancelle answered 1/7, 2016 at 13:59 Comment(3)
I might be missing something, but, in the thread you link the suggested ABI text is: In the special case where the parameter type does not have both a trivial destructor and at least one trivial copy or move constructor that is not deleted, the caller must allocate space for a temporary copy, and pass the resulting copy by reference - which implies that tuple need not be passed by invisible reference, surely, since it is trivially movable and destructible? G++ is certainly not following that. – Mckellar
(edit: it is trivially copyable and destructible.) – Mckellar
On re-reading, the text I quoted in the comment above is (IIUC) a suggested improvement to the "rule we ended up with" (but which never made it into the spec), and what you refer to by "what it was supposed to say" is the latter and not the former. Ok, makes sense. – Mckellar
M
10

The ABI states that parameter values are classified according to a specific algorithm. Relevant here is:

  3. If the size of the aggregate exceeds a single eightbyte, each is classified separately. Each eightbyte gets initialized to class NO_CLASS.

  4. Each field of an object is classified recursively so that always two fields are considered. The resulting class is calculated according to the classes of the fields in the eightbyte:

In this case, each of the fields (for either a tuple or a pair) is of type uint64_t and so occupies an entire "eightbyte". The "two fields" to be considered in each eightbyte, then, are the initial NO_CLASS (as per 3) and the uint64_t field, which is classified as INTEGER. Merging NO_CLASS with INTEGER yields INTEGER, so both eightbytes end up as INTEGER.
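The merging step for this case can be modelled with a toy classifier. This is a simplified sketch, not the full algorithm: ArgClass, merge and classify_two_u64 are hypothetical names, and only the NO_CLASS, INTEGER, SSE and MEMORY classes are modelled.

```cpp
#include <cassert>

enum class ArgClass { NoClass, Integer, Sse, Memory };

// Simplified merge of two classes that share one eightbyte:
// equal classes stay as they are, NO_CLASS yields to the other class,
// MEMORY wins over everything, then INTEGER wins, otherwise SSE.
constexpr ArgClass merge(ArgClass a, ArgClass b) {
    return a == b ? a
         : a == ArgClass::NoClass ? b
         : b == ArgClass::NoClass ? a
         : (a == ArgClass::Memory || b == ArgClass::Memory) ? ArgClass::Memory
         : (a == ArgClass::Integer || b == ArgClass::Integer) ? ArgClass::Integer
         : ArgClass::Sse;
}

struct Classified {
    ArgClass lo;  // first eightbyte
    ArgClass hi;  // second eightbyte
};

// Two uint64_t fields: each eightbyte starts as NO_CLASS and is merged
// with the INTEGER class of the single field that occupies it.
constexpr Classified classify_two_u64() {
    return Classified{merge(ArgClass::NoClass, ArgClass::Integer),
                      merge(ArgClass::NoClass, ArgClass::Integer)};
}

static_assert(classify_two_u64().lo == ArgClass::Integer, "first eightbyte");
static_assert(classify_two_u64().hi == ArgClass::Integer, "second eightbyte");
```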

There is also, related to parameter passing:

If a C++ object has either a non-trivial copy constructor or a non-trivial destructor, it is passed by invisible reference (the object is replaced in the parameter list by a pointer that has class INTEGER)

An object with a non-trivial copy constructor or destructor must have an address, and therefore needs to be in memory, which is why the above requirement exists. The same is true for return values, though this seems to have been omitted from the specification (probably by accident).

Finally, there is:

(c) If the size of the aggregate exceeds two eightbytes and the first eight-byte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.

That doesn't apply here, obviously; the size of the aggregate is exactly two eightbytes.

On returning of values, the text says:

  1. Classify the return type with the classification algorithm

Which means, as per above, that the tuple should be classified as INTEGER. Then:

  1. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

This is quite clear.
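For comparison, here is a minimal sketch of a type for which the classification is not in dispute. U64Pair and h are hypothetical names that do not appear in the question; the struct is a plain aggregate with trivial copy, move and destruction, so on x86_64 System V both compilers would be expected to return it in %rax and %rdx. That expectation is worth confirming against the generated assembly.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical plain aggregate: exactly two eightbytes, both INTEGER.
struct U64Pair {
    std::uint64_t first;
    std::uint64_t second;
};

static_assert(sizeof(U64Pair) == 16, "two eightbytes");
static_assert(std::is_trivially_copyable<U64Pair>::value,
              "no special member function forces the object into memory");

// Returns the same constants as f() and g() in the question.
U64Pair h() { return U64Pair{1u, 2u}; }
```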

The only still-open question is whether the types are non-trivially-copy-constructible/destructible. As mentioned above, values of such type cannot be passed or returned in registers, even though the specification does not seem to recognize the problem for return values. However, we can easily show that the tuple and pair are both trivially-copy-constructible and trivially-destructible, using the following program:

Test program:

#include <utility>
#include <cstdint>
#include <tuple>
#include <type_traits>  // is_trivial, is_standard_layout, is_pod, ...
#include <iostream>

using namespace std;

int main(int argc, char **argv)
{
    cout << "pair is trivial? : " << is_trivial<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_copy_constructible? : " << is_trivially_copy_constructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is standard_layout? : " << is_standard_layout<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is pod? : " << is_pod<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_destructable? : " << is_trivially_destructible<pair<uint64_t, uint64_t> >::value << endl;
    cout << "pair is trivially_move_constructible? : " << is_trivially_move_constructible<pair<uint64_t, uint64_t> >::value << endl;

    cout << "tuple is trivial? : " << is_trivial<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_copy_constructible? : " << is_trivially_copy_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is standard_layout? : " << is_standard_layout<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is pod? : " << is_pod<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_destructable? : " << is_trivially_destructible<tuple<uint64_t, uint64_t> >::value << endl;
    cout << "tuple is trivially_move_constructible? : " << is_trivially_move_constructible<tuple<uint64_t, uint64_t> >::value << endl;
    return 0;
}

Output when compiled with GCC or Clang:

pair is trivial? : 0
pair is trivially_copy_constructible? : 1
pair is standard_layout? : 1
pair is pod? : 0
pair is trivially_destructable? : 1
pair is trivially_move_constructible? : 1
tuple is trivial? : 0
tuple is trivially_copy_constructible? : 1
tuple is standard_layout? : 0
tuple is pod? : 0
tuple is trivially_destructable? : 1
tuple is trivially_move_constructible? : 0

This implies that GCC is getting it wrong. The return value should be passed in %rax,%rdx.

(The main noticeable differences between the types are that pair is standard-layout and trivially move-constructible whereas tuple is neither, so it's possible that GCC always returns non-trivially-move-constructible values via a pointer, for example.)

Mckellar answered 27/5, 2016 at 9:35 Comment(1)
The OP links to a cxx-abi-dev thread where the effect of move-constructors on the ABI is discussed, and it's agreed that trivially-move-constructible should be considered, and so GCC is (I believe) doing the right thing for the libstdc++ std::tuple. – Lancelle