x86-64 System V passes the first 6 integer args in registers RDI, RSI, RDX, RCX, R8, R9. So in main we have mov $666, %edi (which zero-extends to the full RDI) to pass the 64-bit arg long john.
push can't write registers; nothing (see footnote 1) can stop GCC from using mov to set registers, and you wouldn't want to. If you passed 7 or more args, GCC normally would use push in main to pass the 7th on the stack, because -mno-accumulate-outgoing-args is the default in modern GCC. push has been efficient on x86 since Pentium-M or so introduced a "stack engine" to track stack-pointer updates specially.
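To make the register/stack split concrete, here is a minimal sketch in C. The names sum7 and call_sum7 are hypothetical and not from the question; the comments describe what the calling convention specifies, and the remark about push describes typical GCC behaviour rather than a guarantee.

```c
/* Hypothetical helper (not from the original question): seven integer args. */
long sum7(long a, long b, long c, long d, long e, long f, long g)
{
    /* On x86-64 System V the callee receives:
     *   a in RDI, b in RSI, c in RDX, d in RCX, e in R8, f in R9,
     *   g in memory on the stack, just above the return address. */
    return a + b + c + d + e + f + g;
}

/* Hypothetical caller: the first six constants are set with mov
 * (e.g. movl $1, %edi); only the 7th has to be stored to the stack,
 * which modern GCC typically does with a push. */
long call_sum7(void)
{
    return sum7(1, 2, 3, 4, 5, 6, 7);
}
```

Compiling something like this with gcc -S and reading the caller should show six register-setting instructions plus one stack store for the last arg.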
Sunil Bojanapally's answer covers those options, which are more relevant for 32-bit code where all args are passed on the stack. If you got here from searching on the title question, see that answer or Why does gcc use movl instead of push to pass function args? This answer is about the actual question, which is about what the callee does with its incoming arg in a debug build, not about how the arg is passed to it.
You're talking about the code inside the callee that stores that incoming arg to the stack. This isn't passing an arg, it's just a consequence of a debug build - every C variable gets a memory address unless declared register with the default -O0 anti-optimization level. Compilers emit instructions to store incoming register args to the stack.
In this case movq %rdi, -8(%rbp) is storing to the red zone below RSP, since worship() is a leaf function. The stack space is already effectively reserved (down to -128(%rsp), and at this point RBP=RSP).
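As an illustration of that red-zone spill, here is a sketch of what the callee probably looks like. The C source is an assumption (the question's code is not reproduced here, so the int return type and the body are guesses based on the asm discussed), and the listing in the comment is roughly what gcc -O0 emits for such a function, not a verbatim capture.

```c
/* Assumed shape of the callee under discussion; the original source is not shown. */
int worship(long john)
{
    return 0;
}

/* Roughly what gcc -O0 emits for it on x86-64 System V (AT&T syntax):
 *
 *   worship:
 *       pushq  %rbp
 *       movq   %rsp, %rbp        # RBP = RSP from here on
 *       movq   %rdi, -8(%rbp)    # spill john into the red zone, below RSP
 *       movl   $0, %eax          # return value
 *       popq   %rbp              # plain pop is enough: RSP never moved
 *       ret
 */
```

Note that no sub $N, %rsp appears: a leaf function can use the 128 bytes below RSP without reserving them.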
And just to be clear, this is not part of the function call. Spilling incoming args to the stack inside the callee only happens in a debug build; it is not part of the calling convention.
If it had needed to sub $16, %rsp / mov-store / leave, e.g. if you'd compiled with -mno-red-zone, then yes it could have been an optimization to do that spill with push %rdi. But existing compilers don't do that optimization for initializing + creating locals.
push %rdi in worship would have required the compiler to use leave instead of just pop %rbp, which is slightly more expensive. And it would only align the stack to RSP%16 == 8 after push %rbp aligned it to RSP%16 == 0; compilers prefer to keep the stack aligned by 16 even when they're not making further function calls.
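The alignment arithmetic can be checked with a toy model. This is only a sketch: rsp_mod16_after_pushes is a hypothetical helper that tracks RSP modulo 16, assuming every push (and the return-address push done by call) moves RSP down by 8 bytes.

```c
/* Toy model: track RSP modulo 16 across 8-byte pushes.
 * rsp_mod16 is the starting value of RSP % 16 (expected to be 0 or 8). */
unsigned rsp_mod16_after_pushes(unsigned rsp_mod16, unsigned pushes)
{
    /* Each push subtracts 8 from RSP; all arithmetic is done mod 16. */
    return (rsp_mod16 + 16u - (8u * pushes) % 16u) % 16u;
}
```

Starting from RSP%16 == 0 before the call, one push (the return address) gives 8 on function entry; from there push %rbp gives 0, and a further push %rdi would give 8 again, which is the misalignment described above.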
And of course if you'd just enabled optimization, worship would just be xor %eax,%eax / ret, not wasting instructions putting the register arg anywhere.
Footnote 1: -Oz (favour code-size without caring about speed) might use 3-byte push imm8 / pop rdi instead of 5-byte mov edi, imm32 to materialize a value in a register if it was in the -128..+127 range. But 666 isn't, so mov is also the smallest way to set a register to that value without any pre-existing known register values near that. (Code golf x86-64 machine code tips).
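The byte counts in the footnote can be verified from the machine-code encodings. The sketch below stores the encodings as byte arrays; the value 42 is just an arbitrary example of something that fits in imm8, and all helper names are hypothetical.

```c
#include <stddef.h>

/* mov $666, %edi : opcode BF (mov imm32 to EDI) + 4-byte little-endian immediate */
static const unsigned char mov_edi_666[] = { 0xBF, 0x9A, 0x02, 0x00, 0x00 };

/* push $42 ; pop %rdi : 6A ib (push sign-extended imm8) + 5F (pop RDI) */
static const unsigned char push_pop_42[] = { 0x6A, 0x2A, 0x5F };

/* Hypothetical helper: can push imm8 / pop materialize this value? */
int fits_push_imm8(long value)
{
    return value >= -128 && value <= 127;
}

/* Sizes of the two encodings, in bytes. */
size_t mov_edi_666_size(void) { return sizeof mov_edi_666; }
size_t push_pop_42_size(void) { return sizeof push_pop_42; }

/* Decode the little-endian imm32 from the mov encoding as a sanity check. */
unsigned mov_edi_666_imm(void)
{
    return (unsigned)mov_edi_666[1]
         | ((unsigned)mov_edi_666[2] << 8)
         | ((unsigned)mov_edi_666[3] << 16)
         | ((unsigned)mov_edi_666[4] << 24);
}
```

Since 666 is outside -128..+127, fits_push_imm8(666) is false and the 5-byte mov is the short option for that constant.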
movq %rdi, -8(%rbp) is a parameter pushing? Any parameters must be inputted from the calling function, which is main here. The callee must read out the transferred parameter – Unshod