I am trying to understand the implications of the System V AMD64 ABI's calling convention and am looking at the following example:
struct Vec3{
    double x, y, z;
};
struct Vec3 do_something(void);
void use(struct Vec3 * out){
    *out = do_something();
}
A Vec3 variable is of class MEMORY, and thus the caller (use) must allocate space for the returned value and pass it as a hidden pointer to the callee (i.e. do_something). This is what we see in the resulting assembler (on godbolt, compiled with -O2):
use:
pushq %rbx
movq %rdi, %rbx ;remember out
subq $32, %rsp ;memory for returned object
movq %rsp, %rdi ;hidden pointer to %rdi
call do_something
movdqu (%rsp), %xmm0 ;copy memory to out
movq 16(%rsp), %rax
movups %xmm0, (%rbx)
movq %rax, 16(%rbx)
addq $32, %rsp ;unwind/restore
popq %rbx
ret
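As a rough sanity check on the MEMORY classification (a simplified sketch, not the ABI's full classification algorithm): for a plain struct of doubles, what matters is whether it fits into two eightbytes (16 bytes). The helper exceeds_two_eightbytes and the comparison type Vec2 are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

struct Vec3 { double x, y, z; };   /* three eightbytes on x86-64 */
struct Vec2 { double x, y; };      /* hypothetical two-eightbyte comparison type */

/* Simplified check, meant only for plain structs of doubles like the two above:
   anything larger than two eightbytes is not returned in registers. */
static int exceeds_two_eightbytes(size_t size) {
    return size > 2 * 8;
}
```

Here exceeds_two_eightbytes(sizeof(struct Vec3)) is true (24 bytes), while it is false for the 16-byte Vec2, which would be returned in %xmm0 and %xmm1 instead.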
I understand that an alias of the pointer out (e.g. a global variable) could be used in do_something, and thus out cannot be passed as the hidden pointer to do_something: if it were, the object out points to would be changed inside do_something rather than when do_something returns, so some calculations might become faulty. For example, this version of do_something would return faulty results:
struct Vec3 global; //initialized somewhere
struct Vec3 do_something(void){
    struct Vec3 res;
    res.x = 2*global.x;
    res.y = global.y+global.x;
    res.z = 0;
    return res;
}
If out were an alias for the global variable global and were used as the hidden pointer passed in %rdi, res would also be an alias of global, because the compiler would use the memory pointed to by the hidden pointer directly (a kind of RVO in C), without actually creating a temporary object and copying it on return. Then res.y would be 2*x+y (where x, y are the old values of global) and not x+y as for any other hidden pointer.
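The effect can be reproduced by hand. The following is only a sketch: do_something_lowered is a hypothetical, hand-written stand-in for what the callee would look like if it wrote its result straight through the hidden pointer, and the initial values of global are arbitrary.

```c
struct Vec3 { double x, y, z; };

struct Vec3 global = {1.0, 10.0, 100.0};  /* arbitrary sample values */

/* Hypothetical hand-lowered callee: the result is written directly
   through the hidden pointer `ret`, with no temporary in between. */
void do_something_lowered(struct Vec3 *ret) {
    ret->x = 2 * global.x;
    ret->y = global.y + global.x;  /* rereads global.x, which may already be overwritten */
    ret->z = 0;
}
```

Called with a separate temporary, this yields {2, 11, 0}. Called with &global, global.y ends up as 12, i.e. 2*x+y instead of x+y.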
It was suggested to me that using restrict should solve the problem, i.e.
void use(struct Vec3 *restrict out){
    *out = do_something();
}
because now the compiler knows that there are no aliases of out which could be used in do_something, so the assembler could be as simple as this:
use:
jmp do_something ; %rdi is now the hidden pointer
However, this is the case for neither gcc nor clang: the assembler stays unchanged (see on godbolt).
What prevents the usage of out as the hidden pointer?
NB: The desired (or very similar) behavior is achieved with a slightly different function signature:
struct Vec3 use_v2(){
    return do_something();
}
which results in (see on godbolt):
use_v2:
pushq %r12
movq %rdi, %r12
call do_something
movq %r12, %rax
popq %r12
ret
Vec3 do_something(); and force the compiler to use a hidden pointer? You need to write void do_something(Vec3*) explicitly, because this is the only way (it is impossible to return Vec3). So if you want optimized binary code, you must write optimized source code yourself from the start. – Carlcarla
It is impossible to return Vec3 (by value), so write void do_something(Vec3*) (this will be the real signature of the function anyway). – Carlcarla
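A minimal sketch of what the comments above suggest, assuming the callee's signature can be changed. The name do_something_into is hypothetical, and the initial values of global are arbitrary; the body reuses the example computation from the question.

```c
struct Vec3 { double x, y, z; };

struct Vec3 global = {1.0, 10.0, 100.0};  /* arbitrary sample values */

/* Hypothetical out-parameter variant of do_something. */
void do_something_into(struct Vec3 *out) {
    /* Read the inputs first so the result stays correct even if out == &global. */
    const double x = global.x;
    const double y = global.y;
    out->x = 2 * x;
    out->y = y + x;
    out->z = 0;
}

void use(struct Vec3 *out) {
    do_something_into(out);  /* no temporary object, no copy at the source level */
}
```

With the pointer explicit in the source, the aliasing question becomes the programmer's responsibility rather than the compiler's. Whether a given compiler then reduces use to a plain tail call is not guaranteed and would need to be checked in the generated assembler.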