stdcall and cdecl
Asked Answered
M

9

104

There are (among others) two types of calling conventions - stdcall and cdecl. I have few questions on them:

  1. When a cdecl function is called, how does a caller know if it should free up the stack ? At the call site, does the caller know if the function being called is a cdecl or a stdcall function ? How does it work ? How does the caller know if it should free up the stack or not ? Or is it the linkers responsibility ?
  2. If a function which is declared as stdcall calls a function(which has a calling convention as cdecl), or the other way round, would this be inappropriate ?
  3. In general, can we say that which call will be faster - cdecl or stdcall ?
Martainn answered 4/8, 2010 at 9:51 Comment(2)
There are many types of calling conventions, of which those are just two. en.wikipedia.org/wiki/X86_calling_conventionsSanitary
Please, mark a correct answerSamsun
P
90

Raymond Chen gives a nice overview of what __stdcall and __cdecl does.

(1) The caller "knows" to clean up the stack after calling a function because the compiler knows the calling convention of that function and generates the necessary code.

void __stdcall StdcallFunc() {}

void __cdecl CdeclFunc()
{
    // The compiler knows that StdcallFunc() uses the __stdcall
    // convention at this point, so it generates the proper binary
    // for stack cleanup.
    StdcallFunc();
}

It is possible to mismatch the calling convention, like this:

LRESULT MyWndProc(HWND hwnd, UINT msg,
    WPARAM wParam, LPARAM lParam);
// ...
// Compiler usually complains but there's this cast here...
windowClass.lpfnWndProc = reinterpret_cast<WNDPROC>(&MyWndProc);

So many code samples get this wrong it's not even funny. It's supposed to be like this:

// CALLBACK is #define'd as __stdcall
LRESULT CALLBACK MyWndProc(HWND hwnd, UINT msg
    WPARAM wParam, LPARAM lParam);
// ...
windowClass.lpfnWndProc = &MyWndProc;

However, assuming the programmer doesn't ignore compiler errors, the compiler will generate the code needed to clean up the stack properly since it'll know the calling conventions of the functions involved.

(2) Both ways should work. In fact, this happens quite frequently at least in code that interacts with the Windows API, because __cdecl is the default for C and C++ programs according to the Visual C++ compiler and the WinAPI functions use the __stdcall convention.

(3) There should be no real performance difference between the two.

Proprioceptor answered 4/8, 2010 at 9:58 Comment(2)
+1 for a good example and Raymond Chen's post about calling convention history. For anyone interested, the other parts of that are a good read as well.Blowhard
+1 for Raymond Chen. Btw (OT): Why is it that I can't find the other parts using the blog search box? Google find them, but not MSDN Blogs?Afflictive
A
50

In CDECL arguments are pushed onto the stack in revers order, the caller clears the stack and result is returned via processor registry (later I will call it "register A"). In STDCALL there is one difference, the caller doeasn't clear the stack, the calle do.

You are asking which one is faster. No one. You should use native calling convention as long as you can. Change convention only if there is no way out, when using external libraries that requires certain convention to be used.

Besides, there are other conventions that compiler may choose as default one i.e. Visual C++ compiler uses FASTCALL which is theoretically faster because of more extensive usage of processor registers.

Usually you must give a proper calling convention signature to callback functions passed to some external library i.e. callback to qsort from C library must be CDECL (if the compiler by default uses other convention then we must mark the callback as CDECL) or various WinAPI callbacks must be STDCALL (whole WinAPI is STDCALL).

Other usual case may be when you are storing pointers to some external functions i.e. to create a pointer to WinAPI function its type definition must be marked with STDCALL.

And below is an example showing how does the compiler do it:

/* 1. calling function in C++ */
i = Function(x, y, z);

/* 2. function body in C++ */
int Function(int a, int b, int c) { return a + b + c; }

CDECL:

/* 1. calling CDECL 'Function' in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then a copy of 'y', then a copy of 'x'
call (jump to function body, after function is finished it will jump back here, the address where to jump back is in registers)
move contents of register A to 'i' variable
pop all from the stack that we have pushed (copy of x, y and z)

/* 2. CDECL 'Function' body in pseudo-assembler */
/* Now copies of 'a', 'b' and 'c' variables are pushed onto the stack */
copy 'a' (from stack) to register A
copy 'b' (from stack) to register B
add A and B, store result in A
copy 'c' (from stack) to register B
add A and B, store result in A
jump back to caller code (a, b and c still on the stack, the result is in register A)

STDCALL:

/* 1. calling STDCALL in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then a copy of 'y', then a copy of 'x'
call
move contents of register A to 'i' variable

/* 2. STDCALL 'Function' body in pseaudo-assembler */
pop 'a' from stack to register A
pop 'b' from stack to register B
add A and B, store result in A
pop 'c' from stack to register B
add A and B, store result in A
jump back to caller code (a, b and c are no more on the stack, result in register A)
Aldershot answered 4/8, 2010 at 10:28 Comment(7)
Note: __fastcall is faster than __cdecl and STDCALL is default calling convention for Windows 64-bitSciomancy
ohh; so it must pop return address, add argument block size, then jump to the return address popped earlier?(once you call ret, you can no longer clean the stack, but once you clean the stack, you can no longer call ret(since you would need to bury yourself back to ret on stack, bringing you back to the issue that you haven't cleaned the stack).Domesticate
alternatively, pop return to reg1, set stack pointer to base pointer, then jump to reg1Domesticate
alternatively, move stack pointer value from top of stack to bottom, clean, then call retDomesticate
Note that on Windows x64, __fastcall, __stdcall, and __cdecl are all aliases for the same calling convention which is __fastcall. Both 32-bit and 64-bit have an alternative calling convention of __vectorcallContravention
"which one is faster. No one" Are they really the same speed? cdecl has one extra instruction compared to stdcall. I assume this has a clock cost that stdcall does not incur. Not to mention that cdecl will consume more instruction memory if used more than once in a program, eating away at cache lines.Fregoso
Since we're talking about optimization - I worked on a compiler for a the client who produced firmware with memory constraints. Since generally there were "multiple" callers for a called function, it was more memory economical to make the called function do the stack cleanup.Devine
S
16

I noticed a posting that say that it does not matter if you call a __stdcall from a __cdecl or visa versa. It does.

The reason: with __cdecl the arguments that are passed to the called functions are removed form the stack by the calling function, in __stdcall, the arguments are removed from the stack by the called function. If you call a __cdecl function with a __stdcall, the stack is not cleaned up at all, so eventually when the __cdecl uses a stacked based reference for arguments or return address will use the old data at the current stack pointer. If you call a __stdcall function from a __cdecl, the __stdcall function cleans up the arguments on the stack, and then the __cdecl function does it again, possibly removing the calling functions return information.

The Microsoft convention for C tries to circumvent this by mangling the names. A __cdecl function is prefixed with an underscore. A __stdcall function prefixes with an underscore and suffixed with an at sign “@” and the number of bytes to be removed. Eg __cdecl f(x) is linked as _f, __stdcall f(int x) is linked as _f@4 where sizeof(int) is 4 bytes)

If you manage to get past the linker, enjoy the debugging mess.

Swabber answered 17/9, 2012 at 18:29 Comment(3)
The first paragraph is (now) incorrect, as most compilers handle the differences between the calling conventions to prevent such issues from happening.Celebrated
The caller and callee both have to agree on the callee's calling convention, yes. But "call a __stdcall from a __cdecl" is just pointing out that how the caller's own parent passed args to it isn't relevant to how it manages the stack when calling other functions. __cdecl and __stdcall agree on which registers are call-preserved vs. call-clobbered so there's no need for extra saving of registers that a function doesn't already use itself, for example.Rondelle
The problem would be (as you say) if you called a __stdcall function as if it were a __cdecl function. Not from a __cdecl function. e.g. __stdcall foo(int x) can be written to call printf. Your answer is correct, except for the first paragraph stating disagreement with a different thing. >.<Rondelle
H
5

I want to improve on @adf88's answer. I feel that pseudocode for the STDCALL does not reflect the way of how it happens in reality. 'a', 'b', and 'c' aren't popped from the stack in the function body. Instead they are popped by the ret instruction (ret 12 would be used in this case) that in one swoop jumps back to the caller and at the same time pops 'a', 'b', and 'c' from the stack.

Here is my version corrected according to my understanding:

STDCALL:

/* 1. calling STDCALL in pseudo-assembler (similar to what the compiler outputs) */
push on the stack a copy of 'z', then copy of 'y', then copy of 'x'
call
move contents of register A to 'i' variable

/* 2. STDCALL 'Function' body in pseaudo-assembler */ copy 'a' (from stack) to register A copy 'b' (from stack) to register B add A and B, store result in A copy 'c' (from stack) to register B add A and B, store result in A jump back to caller code and at the same time pop 'a', 'b' and 'c' off the stack (a, b and c are removed from the stack in this step, result in register A)

Heterophyllous answered 25/10, 2015 at 1:42 Comment(0)
A
2

It's specified in the function type. When you have a function pointer, it's assumed to be cdecl if not explicitly stdcall. This means that if you get a stdcall pointer and a cdecl pointer, you can't exchange them. The two function types can call each other without issues, it's just getting one type when you expect the other. As for speed, they both perform the same roles, just in a very slightly different place, it's really irrelevant.

Aleksandr answered 4/8, 2010 at 9:59 Comment(0)
W
1

The caller and the callee need to use the same convention at the point of invokation - that's the only way it could reliably work. Both the caller and the callee follow a predefined protocol - for example, who needs to clean up the stack. If conventions mismatch your program runs into undefined behavior - likely just crashes spectacularly.

This is only required per invokation site - the calling code itself can be a function with any calling convention.

You shouldn't notice any real difference in performance between those conventions. If that becomes a problem you usually need to make less calls - for example, change the algorithm.

Washerman answered 4/8, 2010 at 9:59 Comment(0)
J
1

Those things are Compiler- and Platform-specific. Neither the C nor the C++ standard say anything about calling conventions except for extern "C" in C++.

how does a caller know if it should free up the stack ?

The caller knows the calling convention of the function and handles the call accordingly.

At the call site, does the caller know if the function being called is a cdecl or a stdcall function ?

Yes.

How does it work ?

It is part of the function declaration.

How does the caller know if it should free up the stack or not ?

The caller knows the calling conventions and can act accordingly.

Or is it the linkers responsibility ?

No, the calling convention is part of a function's declaration so the compiler knows everything it needs to know.

If a function which is declared as stdcall calls a function(which has a calling convention as cdecl), or the other way round, would this be inappropriate ?

No. Why should it?

In general, can we say that which call will be faster - cdecl or stdcall ?

I don't know. Test it.

Jamieson answered 4/8, 2010 at 10:1 Comment(0)
C
0

a) When a cdecl function is called by the caller, how does a caller know if it should free up the stack?

The cdecl modifier is part of the function prototype (or function pointer type etc.) so the caller get the info from there and acts accordingly.

b) If a function which is declared as stdcall calls a function(which has a calling convention as cdecl), or the other way round, would this be inappropriate?

No, it's fine.

c) In general, can we say that which call will be faster - cdecl or stdcall?

In general, I would refrain from any such statements. The distinction matters eg. when you want to use va_arg functions. In theory, it could be that stdcall is faster and generates smaller code because it allows to combine popping the arguments with popping the locals, but OTOH with cdecl, you can do the same thing, too, if you're clever.

The calling conventions that aim to be faster usually do some register-passing.

Counteroffensive answered 4/8, 2010 at 10:5 Comment(0)
D
0

Calling conventions have nothing to do with the C/C++ programming languages and are rather specifics on how a compiler implements the given language. If you consistently use the same compiler, you never need to worry about calling conventions.

However, sometimes we want binary code compiled by different compilers to inter-operate correctly. When we do so we need to define something called the Application Binary Interface (ABI). The ABI defines how the compiler converts the C/C++ source into machine-code. This will include calling conventions, name mangling, and v-table layout. cdelc and stdcall are two different calling conventions commonly used on x86 platforms.

By placing the information on the calling convention into the source header, the compiler will know what code needs to be generated to inter-operate correctly with the given executable.

Dockage answered 4/8, 2010 at 10:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.