How do vararg functions find out the number of arguments in machine code?
D

4

14

How can variadic functions like printf find out the number of arguments they got?

The number of arguments obviously isn't passed as a (hidden) parameter (see the call to printf in the asm example here).

What's the trick?

Dorsey answered 11/3, 2011 at 12:8 Comment(2)
For printf et al you know how many additional arguments to expect based on the format string.Pygidium
I would expect something like the number of arguments is pushed after all the arguments, but that's not the case in listings you link to.Shults
H
14

The trick is that you tell them some other way. For printf you have to supply a format string, which even carries type information (which might be wrong, though). How this information is supplied is mostly a contract between caller and callee, and it is often error-prone.
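
To make that concrete, here is a minimal sketch of a printf-like function that learns how many arguments to read only from its format string. The names mini_format and append are made up for this illustration, and it handles just %d, %s and %%; it is not how any real libc implements printf.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append text to buf (capacity size, size >= 1),
   truncating if needed. Returns the new length. */
static size_t append(char *buf, size_t size, size_t used, const char *text)
{
    size_t room = size - used - 1;   /* bytes left, excluding the terminator */
    size_t n = strlen(text);

    if (n > room)
        n = room;
    memcpy(buf + used, text, n);
    buf[used + n] = '\0';
    return used + n;
}

/* Hypothetical printf-like function supporting only %d, %s and %%.
   Nothing tells it how many variadic arguments were passed: it reads one
   argument per conversion it finds in fmt and trusts the caller completely.
   Returns the number of variadic arguments it consumed. */
static int mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;
    int consumed = 0;
    char num[32];

    if (size == 0)
        return 0;
    buf[0] = '\0';

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; ++p) {
        if (*p != '%') {
            char one[2] = { *p, '\0' };
            used = append(buf, size, used, one);
        } else if (p[1] == 'd') {
            snprintf(num, sizeof num, "%d", va_arg(ap, int));
            used = append(buf, size, used, num);
            ++consumed;
            ++p;
        } else if (p[1] == 's') {
            const char *s = va_arg(ap, char *);
            used = append(buf, size, used, s);
            ++consumed;
            ++p;
        } else if (p[1] == '%') {
            used = append(buf, size, used, "%");
            ++p;
        }
        /* An unsupported conversion: the '%' is skipped and the next
           character is copied literally on the following iteration. */
    }
    va_end(ap);
    return consumed;
}
```

Calling mini_format(out, sizeof out, "%s has %d items", "cart", 3) fills out with "cart has 3 items" and returns 2. If the format string asked for more arguments than were passed, va_arg would read garbage, which is exactly the error-prone contract described above.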

As for calling conventions: with the usual C convention on x86 (cdecl), the arguments are pushed onto the stack from right to left, and the return address is pushed last. The caller cleans up the stack, so there is no technical need for the called routine to know the number of parameters.

EDIT: In C++0x (since standardized as C++11) there is a safe, even type-safe, way to write variadic functions: variadic templates.

Hydrothermal answered 11/3, 2011 at 12:14 Comment(2)
Ah, that's the trick, it just assumes the format string to be correct. What's that type safe way?Dorsey
@OP C++ is also "cryptic" (harder to disassemble). That's why I love this language.Lowman
F
9

Implicitly, from the format string. Note that stdarg.h doesn't contain any macros to retrieve the total "variable" number of arguments passed. This is also one of the reasons the C calling convention requires the caller to clean the stack, even though this increases code size.

Furfuran answered 11/3, 2011 at 12:32 Comment(0)
S
9

This is the reason why arguments are pushed in reverse order in the C calling convention, e.g.:

If you call:

printf("%s %s", foo, bar);

The stack ends up like:

  ...
+-------------------+
| bar               |
+-------------------+
| foo               |
+-------------------+
| "%s %s"           |
+-------------------+
| return address    |
+-------------------+
| old frame pointer | <- frame pointer
+-------------------+
  ...

Arguments are accessed indirectly, using their offsets from the frame pointer (the frame pointer can be omitted by smart compilers that know how to calculate the offsets from the stack pointer). In this scheme the first argument is always at a well-known address, and the function accesses as many further arguments as its first argument tells it to.

Try the following:

printf("%x %x %x %x %x %x\n");

This will dump part of the stack.

Soever answered 14/3, 2011 at 0:29 Comment(1)
In the x86-64 SystemV calling convention, the first 6 integer/pointer args are passed in registers, so only the last %x will get printf to take a value from stack memory.Leena
Z
5
  • The AMD64 System V ABI (Linux, Mac OS X) does pass the number of vector (SSE / AVX) varargs in al (the low byte of RAX), unlike any standard IA-32 calling convention. See also: Why is %eax zeroed before a call to printf?

    But only up to 8 (the max number of registers to use). And IIRC, the ABI allows al to be greater than the actual number of XMM/YMM/ZMM args but it must not be less. So it does not in general always tell you the number of FP args; you can't tell how many more than 8, and al is allowed to overcount.

    It's only useful as a performance optimization, to skip saving unneeded vector registers to the "Register Save Area" mentioned in "3.5.7 Variable Argument Lists". For example, GCC makes code that tests al!=0 and then either dumps XMM0..7 to the stack or dumps nothing. (Or, if the function uses va_arg with __m256 anywhere, YMM0..7.)

  • On the C level, there are also other techniques besides parsing the format string as mentioned by others. You could also:

    • pass a sentinel (void *)0 to indicate the last argument like execl does.

      You will want to use the sentinel function attribute to help GCC enforce that at compile time: C warning Missing sentinel in function call

    • pass it as an extra integer argument with the number of varargs

    • use the format function attribute to help GCC enforce format strings of known types like printf or strftime
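
The first two techniques in that list can be sketched as follows. The function names sum_counted and total_length are invented for this example; the sentinel attribute is a GCC/Clang extension, so it is guarded and the code should still compile elsewhere without the compile-time check.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Technique 1 (hypothetical example): the caller states explicitly how
   many variadic int arguments follow. */
static int sum_counted(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Technique 2 (hypothetical example): the caller ends the list with a
   null pointer, the way execl does. With GCC or Clang the attribute
   makes the compiler warn when a call lacks the terminating null. */
#if defined(__GNUC__)
__attribute__((sentinel))
#endif
static size_t total_length(const char *first, ...)
{
    va_list ap;
    size_t total = 0;

    va_start(ap, first);
    /* Walk the strings until the null sentinel is reached. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}
```

For example, sum_counted(3, 10, 20, 12) returns 42, and total_length("ab", "cde", (char *)0) returns 5. In both cases the callee still has no independent way to verify what the caller claims: a wrong count or a missing sentinel makes va_arg read past the real arguments.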

Related: How are variable arguments implemented in gcc?

Zitazitah answered 20/7, 2015 at 15:11 Comment(2)
gcc/clang currently just check if al is zero or not, and if it's non-zero they dump all 8 xmm registers to the stack. (Rather than using a loop with al counting down.)Leena
Yes, it's only for performance reasons. You can't set al to a lower value to get the callee to look on the stack for FP args instead of registers. The calling convention requires al to be >= number of FP args, up to 8. With GCC's implementation, passing 0 will end up with the callee looking at its uninitialized dump area. (The consequence of violating the ABI in asm is pretty much like C UB: anything could happen. The callee might assume that al=0 means no FP args and do some optimization that later breaks. It's up to the compiler (or human) author of the asm.)Leena
