Why is the size of a function argument rounded up to the word size?

3

1

I read the System V ABI for i386 and AMD64. Both say that arguments must be rounded up to a multiple of the word size, and I don't understand why.

Here is the situation. If you pass 4 char arguments to a function on the i386 architecture, they take 16 bytes on the stack (4 bytes for each char argument). Wouldn't it be more efficient to allocate only 4 bytes for all 4 arguments, as can be done with local variables?
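The arithmetic behind those numbers can be sketched in C. This is only an illustration of the rounding rule as described in the question; the function names `slot_size` and `args_area` are made up for this example and do not come from the ABI documents.

```c
#include <stddef.h>

/* Hypothetical helper: round one argument's size up to the next
   multiple of the word size (4 on i386, 8 for AMD64 stack slots). */
size_t slot_size(size_t arg_size, size_t word_size)
{
    return (arg_size + word_size - 1) / word_size * word_size;
}

/* Hypothetical helper: total bytes used by `count` stack arguments
   of `arg_size` bytes each, before any extra alignment padding. */
size_t args_area(size_t count, size_t arg_size, size_t word_size)
{
    return count * slot_size(arg_size, word_size);
}
```

With these helpers, `args_area(4, 1, 4)` gives 16 for four char arguments on i386, compared with the 4 bytes a packed layout would need.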

Alignment is not the answer, because either layout could need 4-12 bytes of padding to keep the stack 16-byte aligned.

Volotta answered 4/9, 2019 at 18:44 Comment(2)
You can't pop a char. You can pop a word. - Manganin
Possible duplicate (for stack-args conventions at least): Why function parameter occupy at least 4 bytes stack on x86? - Mammy
3

Putting the 4 chars in a single register (or stack location) would require packing them before the call and extracting the individual parameters afterwards, which costs extra instructions. Note that even for stack arguments, the memory access should be very quick, since the data will most likely be in the cache.

If you really want to save that much space, you can still do it yourself using a single 4-byte argument.
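A minimal sketch of that manual approach, assuming a 32-bit word with the first char in the low byte. The names `pack4`, `unpack`, and `sum4` are hypothetical and only serve the illustration; the shifts and masks are exactly the extra packing and extracting work mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: pack four chars into one 32-bit word,
   with `a` in the lowest byte and `d` in the highest. */
static uint32_t pack4(char a, char b, char c, char d)
{
    return (uint32_t)(unsigned char)a
         | (uint32_t)(unsigned char)b << 8
         | (uint32_t)(unsigned char)c << 16
         | (uint32_t)(unsigned char)d << 24;
}

/* Hypothetical helper: extract byte `index` (0..3) from the word. */
static char unpack(uint32_t packed, unsigned index)
{
    return (char)(unsigned char)(packed >> (8u * index));
}

/* Example callee: one 4-byte argument instead of four chars. */
int sum4(uint32_t packed)
{
    return unpack(packed, 0) + unpack(packed, 1)
         + unpack(packed, 2) + unpack(packed, 3);
}
```

A caller would write `sum4(pack4(1, 2, 3, 4))`, so the stack space is saved at the price of extra instructions on both sides of the call.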

Griseofulvin answered 4/9, 2019 at 18:47 Comment(7)
I'm bad at assembly, but I don't understand why it would be costly. If you pass word-sized arguments, you read them like this: "mov -8(%ebp), %ecx; mov -12(%ebp), %edx; etc." In my example it would be: "movb -8(%ebp), %ecx; movb -9(%ebp), %edx; etc." Maybe I'm missing something. - Volotta
@yevhen: Most arguments are passed in registers. - Starbuck
Note that the calling convention actually packs 4 chars into a single register if they are members of an argument of structure type. - Wed
@YevhenGrushko AFAIK movb requires a byte-sized destination, i.e. those instructions don't exist. - Griseofulvin
@Starbuck In my example I wrote about i386. - Volotta
I made a mistake in my first comment. To read the arguments in the current calling convention you would do "mov 0x8(%ebp), %ecx; mov 0xc(%ebp), %ebx; mov 0x10(%ebp), %edx; mov 0x14(%ebp), %eax", but in my example it would be "movsbl 0x8(%ebp), %ecx; movsbl 0x9(%ebp), %ebx; movsbl 0xa(%ebp), %edx; movsbl 0xb(%ebp), %eax". - Volotta
@YevhenGrushko Yeah, that would work, but note that it is still a non-aligned access, so it is likely reading 4 bytes nevertheless and doing some massaging under the hood. I don't know the exact performance; feel free to measure on a recent CPU! - Griseofulvin
2

Isn't it more efficient to ...

You always have to say what you want to optimize:

  • Fast execution speed
  • Small program size
  • Less stack usage
  • Simpler compilers
  • ...

If you want to optimize for less stack usage, passing bytes to the function really would be more efficient.

However, normally you want to optimize for fast execution speed or small program size.

Unlike modern compilers (which mov the arguments onto the stack), most compilers from the 1990s that I know of push the arguments onto the stack. If a compiler uses push operations, putting single bytes on the stack would be rather complex: it would make the program slower and longer.

(Note that I have never seen a pop operation performed on a parameter.)

Shifty answered 5/9, 2019 at 5:51 Comment(2)
To read the arguments in the current calling convention you would do "mov 0x8(%ebp), %ecx; mov 0xc(%ebp), %ebx; mov 0x10(%ebp), %edx; mov 0x14(%ebp), %eax", but in my example it would be "movsbl 0x8(%ebp), %ecx; movsbl 0x9(%ebp), %ebx; movsbl 0xa(%ebp), %edx; movsbl 0xb(%ebp), %eax". Is it slower? - Volotta
@YevhenGrushko Reading the parameters: no. The problem is writing the parameters. In the 1980s or 1990s, when the System V calling convention was developed, most compilers used the push instruction to put the arguments on the stack, and you cannot push 8-bit values. - Shifty
0

I think the original C authors had their eye on portability and maintainability more than squeezing every byte and cycle. Not that C is careless with resources, but appropriate trade-offs were made.

Promoting each parameter to the stack granule size made sense, and really still does. If you are desperate to squeeze it in, you could always replace:

int f(int a, int b, int c, int d) { ... }

with

struct fparm { char a, b, c, d; };
int f(struct fparm a) { ... }
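A compilable version of the struct variant might look like the sketch below. The body of `f` (summing the members) is an assumption made only so the example does something observable. The C standard does not strictly guarantee a 4-byte struct, although mainstream x86 ABIs lay out four chars without padding.

```c
/* Four chars grouped into one struct; on common x86 ABIs this
   occupies 4 bytes, though the C standard does not promise it. */
struct fparm { char a, b, c, d; };

/* Placeholder body (an assumption): sum the members so the
   result can be checked. */
int f(struct fparm p)
{
    return p.a + p.b + p.c + p.d;
}
```

A call then looks like `f((struct fparm){1, 2, 3, 4})`, passing all four values as one argument.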

Modern C compilers are not so user friendly; or rather their only friend is a luser named benchmark....

Rottenstone answered 4/9, 2019 at 22:32 Comment(0)
