How do I tell gcc that my inline assembly clobbers part of the stack?

Consider inline assembly like this:

uint64_t flags;
asm ("pushf\n\tpop %0" : "=rm"(flags) : : /* ??? */);

Notwithstanding the fact that there is probably some kind of intrinsic to get the contents of RFLAGS, how do I tell the compiler that my inline assembly clobbers one quadword of memory at the top of the stack?

Procrastinate answered 26/8, 2016 at 7:14 Comment(10)
AFAIK, you can't. I think the only safe way to write this is by modifying rsp around the pushf/pop so you don't step on the red zone. (Use add -128 / sub -128 so they can use an imm8 encoding). And then of course the output constraint has to be "=r", because a memory operand could use an rsp-relative addressing mode. Avoiding the red-zone is the best anyone's been able to come up with in discussions of the same issue on previous SO questions. - Contingence
You could also save/restore [rsp-8] into a register, but that seems worse than modifying rsp, even if it makes the stack engine insert an extra uop. - Contingence
There is no way to tell extended asm that you are clobbering the stack. That said, what exactly are you trying to accomplish? There might be some other tricks that do what you need (maybe lahf?). - Mellar
Oh and yeah, there is an intrinsic: __readeflags() (and yes, it works on x64). - Mellar
@DavidWohlferd I was thinking about how inline assembly that clobbers part of the stack would work and this is a good example for motivation as pushf is the only way to get the entire RFLAGS register. - Procrastinate
@DavidWohlferd Interestingly, the intrinsics don't seem to address this issue either. - Procrastinate
Ah, I see, that was the first stab, it seems like they reimplemented that as a builtin afterwards. - Procrastinate
related: you can't tell gcc that you want to clobber the whole red zone either (except by compiling the whole file or function with -mno-red-zone). So to make a function call in inline-asm, you have to jump through hoops: See #37503341 and #37640493. (Calling functions from inline asm is just a bad idea, but people trying to learn asm using inline-asm keep wanting to do it.) - Contingence
Or just list "rsp" as a clobber and suppress the warnings. It seems gcc handles it just fine (by avoiding the redzone and addressing local spills relative to %rbp rather than %rsp) with only a small amount of perhaps unavoidable waste (leave or similar (lea someConst(%rbp), %rsp; ...popsToRestoreNonVolatiles...; ret;) as there seems to be no way to tell the compiler that the modified rsp was restored). The mechanism is needed for VLAs/allocas anyway. I only ran into some issues w/ rsp clobb. on clang but the fix seems simple (github.com/llvm/llvm-project/issues/61898). - Insignificancy
@PetrSkocik I didn't know this is possible! Please post this approach as an answer so I can accept it. - Procrastinate
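
For reference, the intrinsic mentioned in the comments above can be called without any inline asm. This is only a minimal sketch: it assumes an x86-64 target and a GCC or clang whose <x86intrin.h> provides __readeflags(), and the wrapper name read_rflags is made up for illustration. Whether the header implements it as a builtin or as pushf/pop depends on the compiler version, as the comments note.

```c
#include <x86intrin.h>   /* declares __readeflags() on GCC and clang */

/* Hypothetical wrapper name; the intrinsic itself is the only real API here. */
static unsigned long long read_rflags(void)
{
    /* On x86-64 this returns the 64-bit RFLAGS value. */
    return __readeflags();
}
```

Bit 1 of the returned value is a reserved flag bit that always reads as 1, which makes for an easy sanity check.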

Apart from Peter Cordes's approach of skipping the redzone:

long getflags0(void){
    long f; __asm(
        "add $-128, %%rsp;\n"
        "pushf; pop %0;\n"
        "sub $-128, %%rsp\n" : "=r"(f) :: );
    return f;
}

which renders:

0000000000000000 <getflags0>:
   0:   48 83 c4 80             add    $0xffffffffffffff80,%rsp
   4:   9c                      pushfq 
   5:   58                      pop    %rax
   6:   48 83 ec 80             sub    $0xffffffffffffff80,%rsp
   a:   c3                      retq   
$sz(getflags0)=11

you can also just list rsp as a clobber and silence the deprecation warning:

long getflags(void){
    long f;
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wdeprecated"
    __asm("pushf; pop %0" : "=r"(f) :: "rsp");
    #pragma GCC diagnostic pop
    return f;
}

which renders:

000000000000000b <getflags>:
   b:   55                      push   %rbp
   c:   48 89 e5                mov    %rsp,%rbp
   f:   9c                      pushfq 
  10:   58                      pop    %rax
  11:   c9                      leaveq 
  12:   c3                      retq   
$sz(getflags)=8

From experience (I have played with this quite a bit), gcc actually handles rsp clobbers quite well: it forces a frame pointer (which it won't let you clobber alongside rsp; that's a hard assembler error), avoids the redzone, addresses locals relative to the frame pointer, and emits %rsp-restoring code at the end of the function.

The mechanism of making the compiler let go of the end of the stack is needed for VLAs and allocas anyway, so I don't think it's going anywhere.

I think such rsp clobbers are quite usable for custom stack allocations, frees, and stack switches, as long as you don't mess with what the compiler spilled below the stack pointer it gave you (or open it up to being messed with).

I only had some issues with this approach on clang, but the fix to the compiler seems trivial: https://github.com/llvm/llvm-project/issues/61898.

As for suppressing the warnings without affecting the whole compilation unit,

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wdeprecated"
//...
#pragma GCC diagnostic pop

can work well inside a function (possibly an inline one; I had no issues with rsp clobbers inside inline functions either), or you can generate the pragma with _Pragma to make it usable inside macros.
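
A minimal sketch of the _Pragma variant, assuming GCC or clang on x86-64. The macro name ASM_CLOBBERING_RSP and the function name getflags_macro are hypothetical, and the volatile qualifier is an addition that the original getflags does not have. Like the rsp clobber itself, this relies on deprecated behaviour.

```c
/* Hypothetical helper macro: wraps one asm statement so that the
   -Wdeprecated warning about the "rsp" clobber is silenced only here. */
#define ASM_CLOBBERING_RSP(...)                             \
    do {                                                    \
        _Pragma("GCC diagnostic push")                      \
        _Pragma("GCC diagnostic ignored \"-Wdeprecated\"")  \
        __asm__ volatile(__VA_ARGS__);                      \
        _Pragma("GCC diagnostic pop")                       \
    } while (0)

/* Same body as getflags above, but with the pragmas generated by the macro. */
static long getflags_macro(void)
{
    long f;
    ASM_CLOBBERING_RSP("pushf; pop %0" : "=r"(f) : : "rsp");
    return f;
}
```

The do/while wrapper only makes the macro behave like a single statement; the three _Pragma operators correspond one to one to the #pragma lines shown above.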

Clang doesn't complain about rsp clobbers unless you compile with -fstack-clash-protection; the warning is then -Wstack-protector, and it can be silenced the same way. (You will still run into issues on clang if you use rsp clobbers for memory allocation, unless you apply my fix to a custom build.)

Please keep in mind that while this happens to work, it is not officially supported. From https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Clobbers-and-Scratch-Registers-1:

The compiler requires the value of the stack pointer to be the same after an asm statement as it was on entry to the statement. However, previous versions of GCC did not enforce this rule and allowed the stack pointer to appear in the list, with unclear semantics. This behavior is deprecated and listing the stack pointer may become an error in future versions of GCC.

Insignificancy answered 23/9, 2023 at 0:28 Comment(6)
RSP clobber sounds like a "happens-to-work" behaviour. GCC's manual (gcc.gnu.org/onlinedocs/gcc/…) says: the clobber list should not contain the stack pointer ... The compiler requires the value of the stack pointer to be the same after an asm statement as it was on entry to the statement. However, previous versions of GCC did not enforce this rule and allowed the stack pointer to appear in the list, with unclear semantics. This behavior is deprecated and listing the stack pointer may become an error in future versions of GCC. - Contingence
@PeterCordes Yeah, that's a very good note that I should've probably somehow included in the answer. - Insignificancy
I added a volatile int foo = 0; to your example to verify that it didn't put it in the red-zone. It allocates stack space for it so it's above RSP even in a leaf function. I guess an "rsp" clobber is treated similarly to alloca with a runtime-variable size. Function calls after that do assume RSP is still aligned, so it doesn't cost extra instructions re-aligning RSP after you tell GCC it was clobbered. godbolt.org/z/6fzsE59ee - Contingence
@PeterCordes Yes, exactly. That's why the answer also mentions VLAs and allocas. In my experience (gcc and my very slightly modified clang) it works quite well and opens up new interesting applications (easy custom allocas+freeas, access to raw pushes/pops from C, etc.). I had originally only made a daredevil comment and, in light of the official deprecation, didn't intend to answer, but I took fuz's bait. Maybe they'll un-deprecate it if people develop enough interesting applications for it :D. - Insignificancy
@PeterCordes If not for all the useful applications, they should definitely undeprecate it to make the notorious aligned call from inline assembly simpler godbolt.org/z/KTGvMoT6d. :D - Insignificancy
They'd need to be careful about defining the semantics re: alignment, but yeah it's a gap in current GCC's features. i.e. they'd have to carefully document either per-target or generally that asm with a stack-pointer clobber means the stack will be aligned appropriately for a function call on entry, and that it should still be aligned on exit even if it's grown. Otherwise an RSP clobber means the compiler needs to and rsp, -16 after each such asm statement, if it allowed RSP modification to any arbitrary value. - Contingence

As far as I know, this is currently not possible.

Procrastinate answered 12/2, 2017 at 12:16 Comment(2)
One workaround is to do something like add $-128, %rsp before using the stack, to skip past the red-zone, and sub $-128, %rsp after. (-128 fits in an imm8 so it's smaller than sub $128). TODO: link an existing Q&A. - Contingence
Calling printf in extended inline ASM shows an example of skipping the red-zone before using the stack in inline asm. (And the inconvenience of declaring clobbers on all the call-clobbered registers, if you're going to call an arbitrary function!) - Contingence
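
One caveat when the red-zone-skipping workaround is used specifically to read RFLAGS: add and sub write the arithmetic flags themselves, so a pushf placed after add $-128, %rsp stores the flags as left by that add. The sketch below is an assumption-laden variant rather than anything from the answers: it moves rsp with lea, which does not modify any flags, and marks the asm volatile so that repeated reads are not merged. It assumes x86-64 with GCC or clang, and the name getflags_lea is hypothetical.

```c
#include <stdint.h>

/* Hypothetical name; x86-64 only. The output must be a register ("=r"),
   because a memory operand could be addressed relative to rsp. */
static uint64_t getflags_lea(void)
{
    uint64_t f;
    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t"  /* step over the 128-byte red zone; lea leaves flags untouched */
        "pushfq\n\t"                  /* store RFLAGS below the red zone */
        "pop %0\n\t"                  /* load it into the output register */
        "lea 128(%%rsp), %%rsp"       /* restore rsp to its value on entry */
        : "=r"(f));
    return f;
}
```

The compiler makes no promise about the arithmetic flags at the point of an asm statement anyway, so the difference mostly matters when the surrounding code is itself asm; system bits such as IF and DF are read correctly by either version.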