The reason for having some calling convention is pretty simple: so that the caller and the callee agree on how things will work. Without it, the caller doesn't know where to put arguments when it's calling a particular function.
As for why Microsoft decided on the specific details of _stdcall
, that's largely historical. On MS-DOS, all calls were register based, so all OS calls required assembly language, or strange extensions to most higher-level languages.
When they first did Windows, they used the cdecl
calling convention, mostly because that's what the compiler did by default. At least according to rumor, shortly before they got ready to release Windows 1.0, they switched to the Pascal calling convention because it was enough more efficient that (among other things) it allowed Windows to fit on one fewer floppy disc. Regardless of the precise details, the Pascal calling convention did make code a little smaller, because the called function cleaned up the arguments from the stack instead of needing to clean them up everywhere the function was called. For any function that was called from at least 2 different places, that's a win (and if it's tie anywhere else).
Then they started work on OS/2, and invented yet another calling convention (syscall).
Then, of course, came Win32. There wasn't really a lot wrong with syscall from a technical viewpoint, but (I'd guess) everything associated with OS/2 was considered tainted, so syscall had to go. The result was something just enough different to justify a new name. In fairness, that's a little bit of an exaggeration: they did add one truly useful addition: they encoded the number of bytes of arguments into each function name, so if (for example) you supplied an incorrect prototype for a function, the code wouldn't link rather than ending up with a mismatch between caller and callee that could lead to much more serious problems.
For the most part, it really comes back to the original point though: the exact details of the calling convention don't matter all that much, as long as you don't make a complete mess of it. Most of what matters is that the caller and callee agree on the same thing, so if a compiler knows what parameters a function accepts, it knows how to generate code to get those parameters to the function correctly (and, likewise, they both agree on how stack cleanup is handled, etc.)