Couldn't stdcall functions also get a parameter of how many variables are there and do the same?
If the caller has to pass a separate arg with the number of bytes to be popped, that's more work than just doing add esp, 16 or whatever after the call (cdecl style caller-pops). It would totally defeat the purpose of stdcall, which is to save a few bytes of space at each call site, especially for naive code-gen that wouldn't defer popping args across a couple calls, or reuse the space allocated by a push with mov stores. (There are often multiple call-sites for each function, so the extra 2 bytes for ret imm16 vs. ret is amortized over that.)
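To put rough numbers on that trade-off, here is a small sketch in C. It assumes 32-bit x86 encodings, an argument block small enough for an 8-bit immediate, and a callee with a single return instruction; the helper name stdcall_bytes_saved is made up for illustration.

```c
/* Instruction sizes on 32-bit x86 (assumption: arg block fits in an imm8). */
enum {
    ADD_ESP_IMM8_BYTES = 3, /* 83 C4 ib : caller-pops cleanup after each call */
    RET_BYTES          = 1, /* C3       : plain return                        */
    RET_IMM16_BYTES    = 3  /* C2 iw    : callee-pops return                  */
};

/* Hypothetical helper: net code bytes saved by callee-pops for one
   function with the given number of call sites (one ret in the callee). */
int stdcall_bytes_saved(int call_sites)
{
    return call_sites * ADD_ESP_IMM8_BYTES - (RET_IMM16_BYTES - RET_BYTES);
}
```

With a single call site the saving is only 1 byte; with four call sites it is 10 bytes, which is the amortization described above.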
Even worse, the callee can't pop a variable number of bytes efficiently on x86 / x86-64. ret imm16 only works with an immediate (a constant embedded in the machine code), so to pop a variable number of bytes above the return address, a function would have to copy the return address higher up in the stack and do a plain ret from there. (Or defeat return-address branch prediction by popping the return address into a register and jumping to it.)
See also:
How do cdecl functions know how many arguments they've received?
They don't.
C is designed around the assumption that variadic functions don't know how many args they received, so functions need something like a format string or sentinel to know how many to iterate over. For example, the POSIX execl(3) (wrapper for the execve(2) system call) takes a NULL-terminated list of char* args.
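As a rough illustration of the sentinel approach, here is a minimal sketch of a variadic function that walks a NULL-terminated list of strings, in the style of execl(3). The function total_length is a made-up example, not a library API.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of a NULL-terminated list of
   strings. Nothing tells the callee how many args were passed; it just
   keeps reading until it reaches the sentinel. */
size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    return total;
}
```

A call such as total_length("ab", "cde", (char *)NULL) reads two strings and then stops at the sentinel. The cast on the sentinel matters because a bare NULL is not guaranteed to be passed as a pointer through "...".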
Thus calling conventions in general don't waste code-size and cycles on providing a count as a side-channel; whatever info the function needs will be part of the real C-level args.
Fun fact: printf("%d", 1, 2, 3) is well-defined behaviour in C, and is required to safely ignore args beyond the ones referenced by the format string.
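A small sketch of that behaviour, using snprintf so the result can be inspected. The wrapper format_first_only is a hypothetical name, and the format is held in a variable only to avoid the extra-argument warning that compilers typically give for a literal format.

```c
#include <stdio.h>

/* Hypothetical helper: the format references one int, but three are
   passed. The excess arguments are evaluated and then ignored. */
int format_first_only(char *buf, size_t size)
{
    const char *fmt = "%d"; /* non-literal, to sidestep compiler warnings */
    return snprintf(buf, size, fmt, 1, 2, 3);
}
```

The buffer ends up holding just the text for the first argument, and the return value is the length of that text.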
So using stdcall and calculating the pop size from the format string can't work. You're right: if you wanted to make a callee-pops convention that worked for variadic functions, you would need to pass a size somewhere, e.g. in a register. But as noted above, the caller already knows the right number, so it is far easier to let the caller manage the stack than to make the callee dig up this extra arg later. That's why no real-world calling conventions work this way, AFAIK.
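To make the mismatch concrete, here is a sketch of a toy format-driven callee. The function count_consumed and its one-letter spec syntax are invented for illustration. It reports how many variadic args it actually read, which need not match how many the caller pushed.

```c
#include <stdarg.h>

/* Hypothetical mini-formatter: each 'd' in spec consumes one int
   argument. Returns how many variadic args the callee read. */
int count_consumed(const char *spec, ...)
{
    va_list ap;
    int consumed = 0;

    va_start(ap, spec);
    for (const char *p = spec; *p != '\0'; p++) {
        if (*p == 'd') {
            (void)va_arg(ap, int); /* fetch and discard the next int */
            consumed++;
        }
    }
    va_end(ap);

    return consumed;
}
```

For count_consumed("d", 1, 2, 3) the callee reads one argument while the caller passed three, so a pop size derived from the spec would leave two arguments on the stack. Only the caller knows the true size.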
printf works internally - it will pick the next variadic argument whenever it encounters a specifier for printing a value. Which is also why it's undefined behavior to provide too few arguments – Campanulaceous