Why is the stack pointer moved down 4 bytes greater than the stack frame size when compiling with arm-linux-gnueabi-gcc?

Take the trivial C program below as an example. main() calls sum(), passing in 4 integers. sum() uses 4 locals.

void sum(int a, int b, int c, int d);

void main(void)
{
    sum(11, 12, 13, 14);
}

void sum(int a, int b, int c, int d)
{
    int x;
    int y;
    int z;
    int z2;

    x = a;
    y = b;
    z = c;
    z2 = d;
}

On my Ubuntu server 12.04.04 LTS I compile this program using

arm-linux-gnueabi-gcc -S -mthumb func.c

sum:
@ args = 0, pretend = 0, frame = 32
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
push    {r7}
sub sp, sp, #36    <===   why is this 36 and not 32 bytes?
add r7, sp, #0

str r0, [r7, #12]
str r1, [r7, #8]
str r2, [r7, #4]
str r3, [r7, #0]   <- parameters passed

ldr r3, [r7, #12]
str r3, [r7, #16]  <- locals
ldr r3, [r7, #8]
str r3, [r7, #20]
ldr r3, [r7, #4]
str r3, [r7, #24]
ldr r3, [r7, #0]
str r3, [r7, #28]

add r7, r7, #36
mov sp, r7
pop {r7}
bx  lr

It appears that ints are 4 bytes each. 4 locals and 4 arguments for the function make a total of (4 * 4 bytes) + (4 * 4 bytes) = 32 bytes, which matches the "frame = 32" in the assembly output.

But why does the stack pointer get decremented by 36 and not just 32?

Fletcherfletcherism answered 9/3, 2014 at 8:13 Comment(10)
The return address, I presume. – Drennen
@DavidSchwartz I think the return address is held in a dedicated register, the link register lr. – Fletcherfletcherism
The newer ARM ABI wants the stack to be 64-bit aligned. That is why you will see dummy pushes in compiled code, pushing r3 for example when it is never used and doesn't need to be preserved. Perhaps that is what is going on here: 32 would be aligned and 36 is not, but the push of r7 makes it aligned again. If this were the answer, though, I would have expected a two-word push and a 32 offset... – Gantline
I don't see the return address because there is no call here, so the compiler wouldn't waste time worrying about it... What version of gcc? Perhaps just look at the source code to see what it did and why. – Gantline
arm-linux-gnueabi-gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 – Fletcherfletcherism
See also #20071966 – Garonne
Your code is nonsensical. When I compile it, it is just bx lr. You don't use any calculated values in sum(), so the entire routine may be eliminated. You cannot be compiling with any optimizations (or have not specified any). A compiler can always reserve more than needed. – Pettitoes
For instance, I changed sum() to return x + y + z - z2; and gcc at -O3 converted it to movs r0,#22! Stack rationales (and any rationales regarding code generation) depend on the compiler options. – Pettitoes
The code is compiled using arm-linux-gnueabi-gcc -S -mthumb func.c. The code above was not intended to return any value, since return values have no bearing on the stack frame in this context. It was merely meant to demonstrate that the frame size didn't match the SP subtraction. It turns out that the AAPCS wants the stack aligned to 64-bit boundaries, as answered by @auselen, and this appears to be the correct answer. – Fletcherfletcherism
Potential duplicate of ARM: Why do I need to push/pop two registers at function calls? – Exceed

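The Procedure Call Standard for the ARM Architecture (AAPCS) requires 8-byte stack alignment. Here push {r7} already moves sp down by 4 bytes, so subtracting only 32 would leave the total at 36; the extra 4 bytes bring it to 40, which is a multiple of 8.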

5.2.1.2 Stack constraints at a public interface

The stack must also conform to the following constraint at a public interface:

  • SP mod 8 = 0. The stack must be double-word aligned.

Since you are producing assembly, everything is exported by default, so you get 8-byte alignment. (I tried this, and gcc doesn't add a .global <symbol> directive for static functions when generating assembly. I guess this means either that even a static function is treated as a public interface, or that gcc simply gives every function 8-byte stack alignment.)

You can use -fomit-frame-pointer to skip pushing r7; gcc should then leave the stack depth at 32, which is already a multiple of 8.

Gingergingerbread answered 9/3, 2014 at 18:13 Comment(6)
Does this mean the compiler never leaves shadow space for the return address, as we first thought in the previous answer? And to clarify: does a push {r7} cause the SP to auto-decrement by 4 bytes, so that the compiler decrements an extra 4 while setting up space for the stack frame in order to align to the (SP mod 8 = 0) boundary? And when calling a leaf function (as shown in the previous answer), is the push {r7, lr} 8 bytes and thus already aligned? Just checking I understood it correctly. – Fletcherfletcherism
The return value goes via r0, not via the stack. Yes, push decreases sp as you would expect. Yes, with fp (r7) + lr the total comes to 40, which is aligned. – Gingergingerbread
Thanks. By return address, I mean the address the function returns to. I watched a seminar about the x86 stack frame and the lecturer kept referring to the 'Saved PC' (presumably the x86 equivalent of the LR on ARM) being on the stack frame. It seems ARM has a dedicated register, lr, which always holds the return address. I've seen it pushed on the stack, but it would seem from your explanation of the 8-byte alignment that the lr is never part of the stack frame at all. Is it pushed separately? – Fletcherfletcherism
infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/…, and infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ihi0042e/… – Garonne
You can imagine that at the machine level there is no such thing as a frame (at least for ARM). As you say, ARM has a link register, but it also has instructions related to it: BL/BLX (Branch with Link; Branch with Link and Exchange instruction set), which "copy the address of the next instruction into LR (R14, the link register)". So the only thing that matters is that if you branch to an address (let's say a function), you can return via LR; but if you want to branch to another address and still be able to return to the first place, you should save LR first (push it to the stack). – Gingergingerbread
This looks like a nice read: #15752688 – Gingergingerbread
