How does argument passing work?

Asked 9/12, 2010 at 7:9 Answered 6/10, 2018 at 2:56

I want to know how passing arguments to functions in C works. Where are the values stored, and how are they retrieved? How does variadic argument passing work? Also, since it's related: what about return values?

I have a basic understanding of CPU registers and assembler, but not enough that I thoroughly understand the ASM that GCC spits back at me. Some simple annotated examples would be much appreciated.

Keck answered 9/12, 2010 at 7:9 Comment(3)

Argument passing depends on the calling convention, which depends on the CPU. What CPU are you using? MIPS? x86? x86-64? – Gaspard 9/12, 2010 at 7:11

possible duplicate of What are the different calling conventions in C/C++ and what do each mean? – Androgynous 9/12, 2010 at 7:12

@Gaspard x86. cdecl and stdcall are probably most pertinent. @Ignacio Vazquez-Abrams good link, reading through some of the stuff there now – Keck 9/12, 2010 at 8:57

Considering this code:

int foo (int a, int b) {
  return a + b;
}

int main (void) {
  foo(3, 5);
  return 0;
}

Compiling it with gcc foo.c -S for 32-bit x86 (add -m32 on an x86-64 host) gives the assembly output:

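foo:
    pushl   %ebp                # save the caller's frame pointer
    movl    %esp, %ebp          # set up foo's own frame pointer
    movl    12(%ebp), %eax      # eax = b (second argument)
    movl    8(%ebp), %edx       # edx = a (first argument)
    leal    (%edx,%eax), %eax   # eax = a + b, the return value
    popl    %ebp                # restore the caller's frame pointer
    ret                         # pop the return address and jump to it

main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $8, %esp            # reserve 8 bytes for the two int arguments
    movl    $5, 4(%esp)         # second argument at esp+4
    movl    $3, (%esp)          # first argument at esp+0
    call    foo                 # push the return address and jump to foo
    movl    $0, %eax            # main's own return value
    leave                       # esp = ebp, then pop ebp
    ret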

So basically the caller (in this case main) first allocates 8 bytes on the stack to accommodate the two arguments, then stores the two arguments at the corresponding offsets (4 and 0 from esp), and then issues the call instruction, which pushes the return address and transfers control to the foo routine. The foo routine saves the caller's frame pointer, reads its arguments at offsets 8 and 12 from ebp, puts its return value in the eax register so it's available to the caller, then restores ebp and returns. This is the cdecl convention on 32-bit x86; other platforms differ.
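
To make the offsets in the listing above concrete, here is a minimal sketch that models the same sequence in plain C. It is a toy simulation, not what a compiler does internally: the machine struct, push, pop, sim_foo, and sim_call_foo are all names invented for this illustration, and each array slot stands for one 4-byte stack word.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the 32-bit stack from the listing. All names are illustrative. */
enum { STACK_WORDS = 64 };

struct machine {
    int32_t stack[STACK_WORDS]; /* one slot = one 4-byte word */
    int esp;                    /* index of the top word; the stack grows toward index 0 */
    int ebp;                    /* frame pointer, also a word index */
    int32_t eax;                /* return-value register */
};

static void push(struct machine *m, int32_t v) {
    assert(m->esp > 0);             /* guard against overflowing the toy stack */
    m->stack[--m->esp] = v;
}

static int32_t pop(struct machine *m) {
    assert(m->esp < STACK_WORDS);   /* guard against popping an empty stack */
    return m->stack[m->esp++];
}

/* Callee: mirrors foo's prologue, body, and epilogue. */
static void sim_foo(struct machine *m) {
    push(m, m->ebp);                    /* pushl %ebp */
    m->ebp = m->esp;                    /* movl %esp, %ebp */
    int32_t b = m->stack[m->ebp + 3];   /* 12(%ebp): three words above ebp */
    int32_t a = m->stack[m->ebp + 2];   /* 8(%ebp): two words above ebp */
    m->eax = a + b;                     /* leal (%edx,%eax), %eax */
    m->ebp = pop(m);                    /* popl %ebp */
    (void)pop(m);                       /* ret: discard the return address */
}

/* Caller: stores the arguments, "calls", then releases the argument space. */
int32_t sim_call_foo(int32_t a, int32_t b) {
    struct machine m = { .esp = STACK_WORDS, .ebp = 0, .eax = 0 };
    m.esp -= 2;                 /* subl $8, %esp */
    m.stack[m.esp + 1] = b;     /* movl b, 4(%esp) */
    m.stack[m.esp] = a;         /* movl a, (%esp) */
    push(&m, 0);                /* call: push a placeholder return address */
    sim_foo(&m);
    m.esp += 2;                 /* caller releases the arguments (leave does this in the listing) */
    assert(m.esp == STACK_WORDS);
    return m.eax;
}
```

Calling sim_call_foo(3, 5) walks through the same steps as foo(3, 5) and leaves 8 in the simulated eax. The slots at ebp+0 and ebp+1 hold the saved frame pointer and the return address, which is why the arguments start at offset 8.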

Woodcock answered 9/12, 2010 at 7:28 Comment(1)

I’d suggest adding the flag for the target platform, so readers on another architecture can reproduce it. Also to acknowledge that it’s platform-specific. – Zechariah 6/10, 2018 at 1:38

That is platform specific and part of the "ABI". In fact, some compilers even allow you to choose between different conventions.

Microsoft's Visual Studio, for example, offers the __fastcall calling convention, which uses registers. Other platforms or calling conventions use the stack exclusively.

Variadic arguments work in a very similar way - they are passed via registers or stack. In case of registers, they are usually in ascending order, based on type. If you have something like (int a, int b, float c, int d), a PowerPC ABI might put a in r3, b in r4, d in r5, and c in fp1 (I forgot where float registers start, but you get the idea).
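
At the C level, the callee reads variadic arguments through <stdarg.h>, which hides whether they arrived in registers or on the stack. The sketch below is only an illustration: sum_mixed and its one-letter type string are invented here. It also shows one detail that applies on every platform: a float passed through the ... part is promoted to double before the call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: sums a mixed list of ints and doubles.
   `spec` has one character per extra argument: 'i' for int, 'd' for double.
   Any other character is ignored and consumes no argument. */
double sum_mixed(const char *spec, ...) {
    assert(spec != NULL);
    va_list ap;
    double total = 0.0;

    va_start(ap, spec);                     /* begin after the last named parameter */
    for (const char *p = spec; *p != '\0'; ++p) {
        if (*p == 'i') {
            total += va_arg(ap, int);       /* the callee must name the type itself */
        } else if (*p == 'd') {
            total += va_arg(ap, double);    /* a float argument arrives as a double */
        }
    }
    va_end(ap);
    return total;
}
```

For example, sum_mixed("idi", 1, 2.5, 3) evaluates to 6.5. Nothing in the call tells the callee how many arguments were passed or what their types are, so that information has to travel in a named parameter, just as printf uses its format string.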

Return values, again, work the same way.

Unfortunately, I don't have many examples. Most of my assembly is in PowerPC, and all you see there is the code going straight for r3, r4, r5, and placing the return value in r3 as well.

Metopic answered 9/12, 2010 at 7:17 Comment(0)

Your questions are more than anybody could reasonably try to answer in a SO post, not to mention that it's implementation defined as well.

However, if you're interested in the x86 answer, might I suggest you watch Stanford's CS107 lecture series, Programming Paradigms, where all the questions you posed are explained in great detail (and quite eloquently) in the first 6-8 lectures.

Railey answered 9/12, 2010 at 7:15 Comment(1)

These lectures look good. Definitely going to try and check them out. Just need to find the time :) – Keck 9/12, 2010 at 8:58

It depends on your compiler, the target architecture and OS you’re compiling for, and whether your compiler supports non-standard extensions that change the calling convention. But there are some commonalities.

The C calling convention is usually established by the vendor of the operating system, because they need to decide what convention the system libraries use.

More recent CPUs (such as ARM or PowerPC) tend to have their calling conventions defined by the CPU vendor and compatible across different operating systems. x86 is an exception to this: different systems use different calling conventions. There used to be a lot more calling conventions for the 16-bit 8086 and 32-bit 80386 than there are for x86_64 (although even that is not down to one). 32-bit x86 Windows programs sometimes use multiple calling conventions within the same program.

Some observations:

An example of an operating system that supports several different ABIs with different calling conventions simultaneously, some of which follow the same conventions as other OSes for the same architecture, is Linux for x86_64. This can host three different major ABIs (i386, x32 and x86_64), two of which are the same as other operating systems for the same CPU, and several variants.
An exception to the rule that there's one system calling convention used for everything is 16- and 32-bit versions of MS Windows, which inherited some of the proliferation of calling conventions from MS-DOS. The Windows C API uses a different calling convention (STDCALL, originally FAR PASCAL) than the “C” calling convention for the same platform, and also supports FORTRAN and FASTCALL conventions. All four come in NEAR and FAR variants on 16-bit OSes. Nearly all Windows programs therefore use at least two different conventions in the same program.
Architectures with a lot of registers, including classic RISC and nearly all modern ISAs, use several of those registers to pass and return function arguments.
Architectures with few or no general-purpose registers often pass arguments on the stack, pointed to by a stack pointer. CISC architectures often have instructions to call and return which store the return address on the stack. (RISC architectures typically store the return address in a "link register", which the callee can save/restore manually if it's not a leaf function.)
A common variant is the tail call: when a function's return value is simply the return value of a call it makes, it can jump to that function (which then returns directly to our caller) instead of calling it and returning after it returns. Placing the arguments in the right places has to account for the return address already being on the stack, where a call instruction would have put it. Tail-recursive calls are the simplest case, because every invocation has exactly the same stack frame: the function needs neither a new stack frame nor its own return address, so it can update the few registers or argument slots that changed and jump back to the entry point. In other words, tail recursion easily optimizes into a loop.
Some architectures with only a few registers nevertheless defined an alternative calling convention that could pass one or two arguments in registers. This was FASTCALL on MS-DOS and Windows.
A few older ISAs, such as SPARC, had a special bank of “windowed” registers, so that every function has its own bank of input and output registers, and when it made a function call, the caller’s outputs became the callee’s inputs, and the reverse when it came time to return a value. Modern superscalar designs consider this more trouble than it’s worth.
A few very old architectures used self-modifying code in their calling convention, and the first edition of The Art of Computer Programming followed this model for its abstract language. It no longer works on most modern CPUs, which have instruction caches.
A few other very old architectures had no stack and generally could not call the same function again, re-entering it, until it returned.
A function with a lot of arguments almost always puts most of them onto the stack.
C functions that put arguments on the stack almost have to push them in reverse order and have the caller clean up the stack. The called function might not even know exactly how many arguments are on the stack! That is, if you call printf("%d\n", x); the compiler will push x, then the format string, and the call instruction then pushes the return address. This guarantees that the first argument is at a known offset from the stack pointer, so <stdarg.h> (the standard successor to the pre-ANSI <varargs.h>) has the information it needs to work.
Most other languages, and therefore some operating systems that C compilers support, do it the other way around: arguments are pushed from left to right. The function being called usually cleans up its own stack frame. This used to be called the PASCAL convention on MS-DOS, and survives as the STDCALL convention on Windows. It cannot support variadic functions. (https://en.wikibooks.org/wiki/X86_Disassembly/Calling_Conventions)
Fortran and a few other languages historically passed all arguments by reference, which translates to C as pointer arguments. Compilers that might need to interface with these other languages often support these foreign calling conventions.
Because a major source of bugs was “smashing the stack,” many compilers now have a way to add canary values (which, like a canary in a coal mine, warn you that something dangerous is going on if anything happens to them) and other means of detecting when code tampers with the stack frame.
Another form of variation across different platforms is whether the stack frame will contain all the information it needs for a debugger or exception-handler to backtrace, or whether that info will be in separate metadata (or not present at all) allowing simplification of function prologue/epilogue (-fomit-frame-pointer).
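
The observation about tail recursion above can be sketched in C. The two functions below are illustrative names, not taken from any library. The first is written tail-recursively; the second is roughly the loop an optimizing compiler may produce from it. Whether a given compiler actually performs this transformation depends on the compiler and its optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Tail-recursive form: the recursive call is the very last action,
   so its result is returned unchanged to our caller. */
long sum_recursive(const int *values, size_t count, long acc) {
    assert(values != NULL || count == 0);
    if (count == 0) {
        return acc;
    }
    return sum_recursive(values + 1, count - 1, acc + values[0]);  /* tail call */
}

/* Roughly what an optimizer may turn it into: overwrite the three
   arguments in place and jump back to the top instead of calling. */
long sum_loop(const int *values, size_t count, long acc) {
    assert(values != NULL || count == 0);
    while (count != 0) {
        acc += values[0];
        values += 1;
        count -= 1;
    }
    return acc;
}
```

Both functions give the same result for the same inputs. The loop version makes it visible that no new return address or stack frame is needed: the incoming argument slots are reused on every iteration.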

You can get cross-compilers to emit code using different calling conventions, and compare them, with switches such as -S -target (on clang).

Zechariah answered 6/10, 2018 at 2:56 Comment(17)

re: the first point: a better counter-example is Windows, where there are many different 32-bit calling conventions, like cdecl (similar to i386 System V), stdcall (callee-pops), fastcall (register args), vectorcall (register args for vectors, too), and probably others. Linux only has one calling convention per mode. (x32 and x86-64 use identical calling conventions, only the type widths are different.) – Bouley 6/10, 2018 at 3:3

@PeterCordes I do give older Windows as my example in my next point. But that’s a good example, too. – Zechariah 6/10, 2018 at 3:4

Basically it's x86 that is the exception to one calling convention defined by the vendor. Maybe because pure stack-args calling conventions suck, it was worth replacing for OSes that do still care about performance for 32-bit code (i.e. Windows), even an ILP32 ABI for 64-bit mode would be better in almost every way. 7 GP registers + a stack pointer is plenty for 2 or 3 register args, like gcc uses with -m32 -mregparm=3, or with fastcall. – Bouley 6/10, 2018 at 3:6

I know you mention Windows later, but your Linux point isn't really an example of what you claim it is. :/ – Bouley 6/10, 2018 at 3:7

@PeterCordes I was more trying to give an example of how one OS can support different ABIs for different ISAs simultaneously, several of which are supported by other OSes, then an example of how the same OS for the same ISA might have many different conventions. – Zechariah 6/10, 2018 at 3:7

@PeterCordes Do you like this wording better? – Zechariah 6/10, 2018 at 3:9

Oh. Well every modern mainstream x86 OS fits that bill. Windows, Linux, OS X, *BSD, Solaris, and so on all use a different calling convention in long mode than they do for 32-bit processes, and do support both 32 and 64-bit processes running at the same time under 64-bit kernels. – Bouley 6/10, 2018 at 3:10

Ok, yeah that's better since you're just saying "an example", not making it sound like Linux is at all unique that way. – Bouley 6/10, 2018 at 3:10

@PeterCordes Yeah, It’s an illustration, certainly not the only one. Linux came to mind because it supports both the i386 SYSV cross-platform standard, and also the x86_64 cross-platform standard. – Zechariah 6/10, 2018 at 3:11

You're misusing the term "tail-call", and there's no "tail-call convention". A tailcall is when you jmp foo instead of call foo / ret. That's not a different convention; it's replacing your stack frame with the new function's stack frame, transparently to your caller and to the callee, so the ABI docs don't need to mention it. IDK if you're talking about omitting the frame pointer in leaf functions. On Linux, the i386 and x86-64 System V ABIs allow omitting the frame pointer in non-leaf funcs (by providing a metadata-based unwind mechanism). -fomit-frame-pointer is the default. – Bouley 6/10, 2018 at 3:15

(unclear because of the unfortunate ambiguity between "making a stack frame" = setting up E/RBP vs. the generic non-x86-specific meaning of allocating some stack space for use by a function.) – Bouley 6/10, 2018 at 3:18

@PeterCordes Re-worded the section on tail calls to be clearer on what they are and why they’re more efficient. It’s not an ABI issue, but I think it’s relevant to the question that was asked, which was how arguments are passed to functions. – Zechariah 6/10, 2018 at 3:40

No, in a tailcall you make sure that args are where the callee is expecting them, and the stack pointer is pointing where it's supposed to. The calling convention can affect whether a tailcall is possible: In a caller-pops convention, you can tailcall any function that uses less space for stack args than the current function. If all args fit in regs it's easy, you use no space for args. Otherwise you overwrite your incoming args. (Callee owns the args in normal conventions.) Unfortunately compilers typically miss this latter case. – Bouley 6/10, 2018 at 3:42

xD, that was a reply to a comment you just deleted. Good point, yes the caller looks different for a tailcall than a regular call, in any calling convention. (Least noticeable with only register args, of course). – Bouley 6/10, 2018 at 3:42

@PeterCordes Good time to clean up our conversation, or anything else you wanted to bring up? Great feedback! Thanks. – Zechariah 6/10, 2018 at 3:45

"but there is a single standard one for the 64-bit x86." Unfortunately nope, Microsoft extended fastcall instead of using the (superior in a few ways) x86-64 System V ABI. See: Why does Windows64 use a different calling convention from all other OSes on x86-64? – Bouley 6/10, 2018 at 3:48

@PeterCordes Whoops! – Zechariah 6/10, 2018 at 3:52

Basically, C passes arguments by pushing them on the stack. For pointer types, the pointer is pushed on the stack.
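
The remark about pointers can be shown at the source level. This is a small sketch with invented function names: the pointer itself is passed by value like any other argument, so the callee gets its own copy of the address.

```c
#include <assert.h>
#include <stddef.h>

/* The pointer parameter is a copy: reassigning it is invisible to the caller. */
void retarget(int *p, int *other) {
    assert(p != NULL && other != NULL);
    p = other;      /* changes only the callee's local copy of the pointer */
    *p = 99;        /* writes to *other, not to the object p originally pointed at */
}

/* Writing through the copied pointer does reach the caller's object. */
void set_to_42(int *p) {
    assert(p != NULL);
    *p = 42;
}
```

After int a = 1, b = 2; retarget(&a, &b); the value of a is still 1 and b is 99. A following set_to_42(&a) changes a to 42.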

One thing about C is that the caller restores the stack rather than the function being called. This way, the number of arguments can vary and the called function doesn't need to know ahead of time how many arguments will be passed.

Return values are returned in the AX register, or variations thereof.
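
A register only holds small results such as an int or a pointer. For larger results, many ABIs have the caller reserve the space and pass its address as a hidden extra argument. That is an assumption about typical targets, not something the C language specifies. The sketch below uses invented names and writes that hidden argument out by hand next to the ordinary source-level form.

```c
#include <assert.h>
#include <stddef.h>

/* Small scalar result: on 32-bit x86 this comes back in EAX. */
int answer(void) {
    return 42;
}

/* A result too large to fit in a register. */
struct big {
    int values[16];
};

/* Source-level view: the function simply returns the struct by value. */
struct big make_big(int fill) {
    struct big b;
    for (int i = 0; i < 16; ++i) {
        b.values[i] = fill;
    }
    return b;
}

/* Roughly what many ABIs do underneath (an assumption about the target):
   the caller owns the storage and passes its address as an extra argument. */
void make_big_explicit(struct big *out, int fill) {
    assert(out != NULL);
    for (int i = 0; i < 16; ++i) {
        out->values[i] = fill;
    }
}
```

From the caller's point of view, struct big x = make_big(7); and make_big_explicit(&y, 7); leave x and y with the same contents. Comparing the assembly for the two on a given target is a quick way to check what that platform's ABI actually does.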

Berkin answered 9/12, 2010 at 7:17 Comment(5)

Although what you described is a common occurrence of how things work, nowhere in the C standard is the concept of a 'stack' described. All of the things you mentioned are implementation defined details and are not tied to C in any way. – Railey 9/12, 2010 at 7:20

@SiegeX: I think this answer is making a reasonable assumption, considering that the OP didn't state the architecture they're using. – Amphi 9/12, 2010 at 7:32

It's certainly not true that C always uses caller restore. It is required for variable argument functions, but not for those with fixed parameters. Thus, callee restore is often used in the latter case. A common example is fastcall. – Trip 9/12, 2010 at 7:47

Heck, if your calling convention included a parameter count, even variadic functions would be callee-restore. – Gaspard 9/12, 2010 at 9:13

The OP was non-specific and didn't say anything about the C standard. I made some assumptions and offered my input. – Berkin 10/12, 2010 at 6:8
