Stack cleanup in stdcall (callee-pops) for variable arguments

I'm learning a bit of assembly for fun (currently using NASM on Windows), and I have a question regarding the stdcall calling convention and functions with variable numbers of arguments. For example, a sum function that takes X integers and adds them all together.

Since the callee needs to clean/reset the stack when using stdcall, but you can only use constant values with ret, I've been wondering if there's anything wrong with popping the return address, moving esp, and jumping back to the caller yourself, instead of using ret. I assume this would be slower, since it requires more instructions, but would it be acceptable?

; int sum(count, ...)
sum:
    mov ecx, [esp+4] ; count
    
    ; calc args size
    mov eax, ecx ; vars count
    inc eax      ; + count
    mov edx, 4   ; * 4 byte per var
    mul edx
    mov edx, eax
    
    xor eax, eax ; result
    
    cmp ecx, 0   ; if count == 0
    je .done
    inc ecx      ; count++, to start with last arg
    
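    .add:
        add eax, [esp+4*ecx] ; [esp] = return address, [esp+4] = count
        dec ecx              ; step down to the previous argument
        cmp ecx, 1           ; index 1 is count itself, so stop there
        jnz .add
    .done:
        pop ecx              ; return address (ecx is caller-saved; ebx must be preserved)
        add esp, edx         ; discard count and all variadic arguments
        jmp ecx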

I don't see why this wouldn't be okay, and it appears to work, but I've read articles that talked about how stdcall can't handle variable arguments, because the function can't know what value to pass to ret. Am I missing something?

Of course ret imm works if the size of the arguments is a constant. Your idea would work if the function is able to determine the size of its arguments at runtime, which in this case it does from the count argument. As ecm points out, though, it may be inefficient: a call that isn't matched by a ret defeats the CPU's return-address prediction, and the jmp has to go through the indirect branch predictor, which isn't designed for such shenanigans.
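For comparison, here is a rough C sketch of the same function, assuming a 32-bit x86 target where every int argument occupies one 4-byte stack slot. The helper callee_pop_bytes is a hypothetical name introduced only to show the arithmetic the assembly version does before adjusting esp; it is not part of any real API.

```c
#include <stdarg.h>
#include <stddef.h>

/* Adds `count` int arguments, mirroring the assembly sum in the question. */
int sum(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Hypothetical helper: bytes a callee-pops version would have to remove,
   assuming 4-byte slots: one for count plus one per variadic int. */
size_t callee_pop_bytes(int count) {
    return 4u * ((size_t)count + 1u);
}
```

With this sketch, sum(3, 10, 20, 30) returns 60, and a callee-pops version of that call would have to discard 16 bytes. A C compiler would normally use a caller-cleans convention for a variadic function like this, so the byte count is never needed in practice.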

But in some cases, the size of the arguments may not be known to the called function at all, not even at runtime. Consider printf. You might say it could deduce the size of its arguments from the format string; for instance, if the format string was "%d" then it should know that one int was passed and therefore clean up an extra 4 bytes from the stack. But it is perfectly legal under the C standard to call

printf("%d", 123, 456, 789, 2222);

The excess arguments are required to be ignored. But under your calling convention, printf would think it only had to clean up 4 bytes of variadic arguments (plus its non-variadic format string argument), whereas its caller would expect it to clean up 16. That leaves the stack pointer off by 12 bytes after the call, and the program would most likely crash.
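A small C sketch of that mismatch, using snprintf so the result can be inspected. The function names are made up for illustration, the specifier counter only understands "%d" and "%%", and 4-byte int slots are assumed as on 32-bit x86.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: passes four ints to a format that consumes only one.
   This is legal C; the three extra arguments are evaluated and ignored. */
int format_with_extras(char *buf, size_t size) {
    const char *fmt = "%d";
    return snprintf(buf, size, fmt, 123, 456, 789, 2222);
}

/* Hypothetical helper: counts "%d" conversions, skipping literal "%%".
   This is all a callee could learn about its arguments from the format. */
size_t count_d_conversions(const char *fmt) {
    size_t n = 0;
    for (; *fmt; fmt++) {
        if (fmt[0] == '%' && fmt[1] == '%') {
            fmt++;              /* literal percent sign, not a conversion */
        } else if (fmt[0] == '%' && fmt[1] == 'd') {
            n++;
            fmt++;
        }
    }
    return n;
}
```

Here format_with_extras writes "123" into the buffer, and count_d_conversions("%d") is 1, so a callee guessing from the format would pop 4 bytes of variadic arguments while the caller actually pushed 16. Some compilers may warn about the unused arguments when the format is a literal, but the call remains valid.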

So unless your calling convention is going to include a "hidden" argument that tells the called function how many bytes of arguments to clean up, it can't work. And passing such an extra argument is going to require more instructions than having the caller just do the stack cleanup itself.
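To make the "hidden" argument idea concrete, here is a sketch in C99 of what the caller side would have to do. Both sum_hidden and the SUM macro are hypothetical names; the macro stands in for the extra work a compiler would have to emit at every call site, and it assumes at least one argument, all of type int.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical callee: arg_bytes plays the role of the hidden argument
   that tells the function how many bytes of variadic ints were passed. */
int sum_hidden(size_t arg_bytes, ...) {
    size_t count = arg_bytes / sizeof(int);
    va_list args;
    int total = 0;

    va_start(args, arg_bytes);
    for (size_t i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Hypothetical call-site wrapper: computes the byte count from the argument
   list itself. sizeof does not evaluate its operand, so each argument is
   still evaluated only once. */
#define SUM(...) sum_hidden(sizeof((int[]){__VA_ARGS__}), __VA_ARGS__)
```

SUM(1, 2, 3) passes 3 * sizeof(int) as the first argument and returns 6. The point of the sketch is the cost: every call now carries one more argument, which is at least as much work as the single stack adjustment the caller would do under a caller-cleans convention.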
