Understanding Inline assembly in a pre-processor macro vs Inline assembly in a function

GCC's inline assembly can be difficult to implement properly and is easy to get wrong. From a higher-level perspective, inline assembly has rules that must be considered beyond which instructions an inline assembly statement may emit.

The C/C++ standards treat asm as optional and implementation-defined. GCC documents its implementation-defined behaviour, which includes this:

Do not expect a sequence of asm statements to remain perfectly consecutive after compilation, even when you are using the volatile qualifier. If certain instructions need to remain consecutive in the output, put them in a single multi-instruction asm statement.

Basic inline assembly, and extended inline assembly without any output constraints, is implicitly volatile. The documentation says that being volatile doesn't guarantee that successive statements will be ordered as they appear in the source code. This code would not have a guaranteed order:

asm ("cli");
asm ("mov $'M', %%al; out %%al, $0xe9" ::: "eax");
asm ("mov $'D', %%al; out %%al, $0xe9" ::: "eax");
asm ("mov $'P', %%al; out %%al, $0xe9" ::: "eax");
asm ("sti");

If the intention is to use CLI and STI to turn external interrupts off (and back on) and to output the letters M, D, P in that order to the QEMU debug console (port 0xe9), then this isn't guaranteed. You can place all of the instructions in a single inline assembly statement, or you can use extended inline assembly templates to pass a dummy dependency to each statement to guarantee ordering.
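
A rough sketch of those two options, assuming an x86 target, a GCC-compatible compiler, and code running in ring 0 (CLI, STI and OUT are privileged). The names QEMU_DEBUG_PORT, debug_print_mdp and debug_print_mdp_chained are made up for illustration and do not come from any existing code base:

```c
#include <stdint.h>

/* Assumed name: the QEMU debug console port mentioned above. */
enum { QEMU_DEBUG_PORT = 0xe9 };

#if defined(__i386__) || defined(__x86_64__)

/* Option 1: one multi-instruction statement. The whole template is
   emitted as a single unit, in the order written. */
static inline void debug_print_mdp(void)
{
    __asm__ volatile (
        "cli\n\t"
        "mov $'M', %%al\n\t"
        "out %%al, %[port]\n\t"
        "mov $'D', %%al\n\t"
        "out %%al, %[port]\n\t"
        "mov $'P', %%al\n\t"
        "out %%al, %[port]\n\t"
        "sti"
        :
        : [port] "N" (QEMU_DEBUG_PORT)   /* "N": 8-bit immediate port */
        : "eax");
}

/* Option 2: separate statements chained through a dummy "+r" operand.
   Each statement reads and writes `order`, so each one depends on the
   statement before it. The value of `order` itself is never used. */
static inline void debug_print_mdp_chained(void)
{
    int order = 0;

    __asm__ volatile ("cli" : "+r" (order));
    __asm__ volatile ("mov $'M', %%al\n\tout %%al, %[port]"
                      : "+r" (order) : [port] "N" (QEMU_DEBUG_PORT) : "eax");
    __asm__ volatile ("mov $'D', %%al\n\tout %%al, %[port]"
                      : "+r" (order) : [port] "N" (QEMU_DEBUG_PORT) : "eax");
    __asm__ volatile ("mov $'P', %%al\n\tout %%al, %[port]"
                      : "+r" (order) : [port] "N" (QEMU_DEBUG_PORT) : "eax");
    __asm__ volatile ("sti" : "+r" (order));
}

#endif
```

The second form has output operands, so it is no longer implicitly volatile; the explicit volatile qualifier is there so that the statements are not dropped when `order` goes unused.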

To make things more manageable, OS developers in particular are known to create convenience wrappers around such code. Some do it with C pre-processor macros. In theory this looks useful:

#define outb(port, value) \
        asm ("out %0, %1" \
             : \
             : "a"((uint8_t)value), "Nd"((uint16_t)port))

#define cli() asm ("cli")

#define sti() asm ("sti")

You can then use them like this:

cli ();
outb (0xe9, 'M');
outb (0xe9, 'D');
outb (0xe9, 'P');
sti ();

Of course, the C pre-processor runs before the C compiler begins to process the code itself. The pre-processor will generate these statements all in a row, and the code generator is likewise not guaranteed to emit them in a particular order:

asm ("cli");
asm ("out %0, %1" : : "a"((uint8_t)'M'), "Nd"((uint16_t)0xe9));
asm ("out %0, %1" : : "a"((uint8_t)'D'), "Nd"((uint16_t)0xe9));
asm ("out %0, %1" : : "a"((uint8_t)'P'), "Nd"((uint16_t)0xe9));
asm ("sti");

My Questions

Some developers have taken it upon themselves to use macros that place the inline assembly statements inside a compound statement (written here as a GNU C statement expression, ({ ... })) like this:

#define outb(port, value) ({ \
        asm ("out %0, %1" \
             : \
             : "a"((uint8_t)value), "Nd"((uint16_t)port)); \
    })

#define cli() ({ \
        asm ("cli"); \
    })

#define sti() ({ \
        asm ("sti"); \
    })

Using these macros as we did before would have the C pre-processor generating this code:

({ asm ("cli"); });
({ asm ("out %0, %1" : : "a"((uint8_t)'M'), "Nd"((uint16_t)0xe9)); });
({ asm ("out %0, %1" : : "a"((uint8_t)'D'), "Nd"((uint16_t)0xe9)); });
({ asm ("out %0, %1" : : "a"((uint8_t)'P'), "Nd"((uint16_t)0xe9)); });
({ asm ("sti"); });

Question 1: Does placing asm statements inside a compound statement guarantee ordering? I don't believe so, but I'm unsure. It is one of the reasons I avoid using pre-processor macros to generate inline assembly that I may use in a sequence like this.


For years I have used static inline functions in headers for inline assembly statements. Functions provide type checking, but I also believed that inline assembly in functions does guarantee that the side effects (including inline assembly) are emitted by the next sequence point (the ; on the end of a function call).

If I were to call actual functions my expectation is that each of these functions would have the inline assembly statements generated in order relative to one another:

cli ();
outb (0xe9, 'M');
outb (0xe9, 'D');
outb (0xe9, 'P');
sti ();
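
A sketch of what such static inline header functions could look like, assuming an x86 target and GCC-compatible extended asm. The signatures, the QEMU_DEBUG_PORT constant and the print_mdp helper are illustrative assumptions; whether the call order is preserved in the output is exactly what the next question asks:

```c
#include <stdint.h>

/* Assumed name for the QEMU debug console port used in the question. */
enum { QEMU_DEBUG_PORT = 0xe9 };

#if defined(__i386__) || defined(__x86_64__)

/* Write one byte to an I/O port. "a" places the value in AL; "Nd" lets
   the compiler use an 8-bit immediate port number or fall back to DX. */
static inline void outb(uint16_t port, uint8_t value)
{
    __asm__ volatile ("outb %0, %1" : : "a" (value), "Nd" (port));
}

/* Disable external interrupts (privileged instruction). */
static inline void cli(void)
{
    __asm__ volatile ("cli");
}

/* Re-enable external interrupts (privileged instruction). */
static inline void sti(void)
{
    __asm__ volatile ("sti");
}

/* Same call sequence as above, now type-checked by the compiler. */
static inline void print_mdp(void)
{
    cli();
    outb(QEMU_DEBUG_PORT, 'M');
    outb(QEMU_DEBUG_PORT, 'D');
    outb(QEMU_DEBUG_PORT, 'P');
    sti();
}

#endif
```

Unlike the macro versions, the arguments here are converted to uint16_t and uint8_t by the function prototype rather than by casts inside the macro body.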

Question 2: Does placing the inline assembly statements in actual functions (whether external linkage or inlined) guarantee the order? My feeling is that if it didn't, code like:

printf ("hello ");
printf ("world ");

could be output as hello world or world hello. C's as-if rule says that optimizations can't alter observable behaviour. I believed the compiler couldn't know whether inline assembly alters observable behaviour, so it wouldn't be permitted to emit the inline assembly of the functions in another order.

Naidanaiditch answered 15/5, 2019 at 21:50 Comment(9)
Q1: The compiler knows which instructions have dependencies. You can expect dependent ones to stay in order. Q2: Same as Q1. - Spectra
@rAndom69: Do you have an authoritative source for the statement that GCC knows which instructions have dependencies and will not reorder them? - Dungdungan
@EricPostpischil Not really. However, this is the way it has worked since ancient times. An inline assembly block is translated into an intermediate language which is optimized by both the optimization step and the linker, which means it's "equal" to compiler-generated intermediate code. - Spectra
@rAndom69: In general, inline assembly cannot be translated into intermediate language. The “assembly” content can be arbitrary text, which is inserted into the assembly output verbatim. And manufacturers add new instructions regularly, so one would need to match the architecture specification of the instructions they are using to the compiler statement. This general unauthenticated statement about compiler knowledge of dependencies should not be trusted. - Dungdungan
@EricPostpischil Then you claim #11986911 etc. are bugs in the compiler? (Yes, they are; however, as there is no definition of how inline assembly should work - please post a link) it's essentially UB. Even VC in v6 used to rearrange inline assembly while it was not a NASM unit. When working in terms of UB, the current state is "draft". - Spectra
@rAndom69: The question you link to appears to be an example of the compiler not automatically recognizing dependencies, which is in line with what I have written and which is inconsistent with the statement that the compiler recognizes dependencies. It is not a compiler bug unless the compiler specification says the compiler will recognize these dependencies. - Dungdungan
@EricPostpischil As I said (and you ignored), there is little definition of how inline assembly should work. One block is optimized in terms of this block (if the compiler/linker feels like it). Once again, please point to a definition if there is any. Otherwise inline assembly should stay what it is, lions there. What I stated was personal experience with compilers, which is an error on my side. - Spectra
Eh, it took time but I see. You have an issue with the original statement that compilers do know. They do know, however they might give a fuck about inline assembly. UB. - Spectra
@MichaelPetch: An alternative to separate asm() statements might be to have macros that expand to just a string literal, to be used inside a single statement like asm(cli() outb(0xe9, 'M') sti()); Probably with each macro on a separate line. C string-literal concatenation takes care of joining it up into one big asm template. (Only works for compile-time-constant inputs, though; this doesn't give a mechanism for generating constraints for Extended asm.) - Enviable

Do not expect a sequence of asm statements to remain perfectly consecutive after compilation, even when you are using the volatile qualifier. If certain instructions need to remain consecutive in the output, put them in a single multi-instruction asm statement.

You're actually misreading this (or overreading it). It is NOT saying that volatile asm statements can be reordered; they can't be reordered or removed -- that's the whole point of volatile. What it is saying is that other (non-volatile) things can be reordered with respect to the asm statements, and, in particular, might be moved in between any two of these asm statements. So they might not be consecutive after the optimizer gets through with them, but they will still be in order.

Note that this only applies to volatile asm blocks (which includes all blocks with no outputs -- they're implicitly volatile). Any other non-volatile asm blocks or statements might be moved between the volatile asm blocks, if otherwise allowed.
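
A small sketch of what that means for ordinary code placed between two volatile asm statements. It is not part of the original answer: the empty templates stand in for cli and sti so the example can run in user mode, the names are hypothetical, and the "memory" clobber is an added assumption, being the usual GNU C way to tell the compiler not to move memory accesses across an asm statement:

```c
#include <stdint.h>

/* Hypothetical shared state that should only change inside the
   protected region. */
static uint32_t pending_events;

/* Empty volatile asm with a "memory" clobber. It emits no instructions,
   but the compiler must assume it may read or write any memory, so loads
   and stores are not moved across it. In kernel code the template would
   hold "cli" or "sti" instead of being empty. */
static inline void region_boundary(void)
{
    __asm__ volatile ("" : : : "memory");
}

uint32_t add_events(uint32_t n)
{
    region_boundary();      /* where cli would go */
    pending_events += n;    /* stays between the two boundaries */
    region_boundary();      /* where sti would go */
    return pending_events;
}
```

Without the clobber, the two asm statements would still be emitted in order, but the update to pending_events could legally end up before the first one or after the second.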

Khoury answered 15/5, 2019 at 23:11 Comment(10)
+1 "not consecutive" != "reordered", and it's worth mentioning that all basic asm blocks are implicitly volatile. - Meagan
@groo: I'm aware that any block that has no output constraint (which includes basic inline assembly) is implicitly volatile. I only add them into the comments to ensure people know they are volatile (some don't know the volatile rule). - Naidanaiditch
I had quickly commented and then deleted it (but not fast enough). I then upvoted and accepted the answer as it became clear that I was in fact overanalyzing what consecutive implied. I appreciate the answer! - Naidanaiditch
@MichaelPetch: People are volatile so I guess that's an invalid comment :P - Spectra
@MichaelPetch: I'm glad you posted this question; I've seen you claim that asm volatile statements can get reordered before, and I'd always wondered what basis you had for that surprising claim. Glad to know it was just this wording in the docs; I think we now agree in our understandings of what GNU C inline asm does/doesn't guarantee and how to use it correctly (in the rare cases where it can't be avoided). - Enviable
@Chris: Worth pointing out that there's no guarantee that register contents live across separate asm statements. The example in the question correctly puts mov $imm8, %al in the same statement as outb, which is necessary for correctness. But I've seen code that doesn't do that, and that's a bug waiting to bite you. - Enviable
@PeterCordes: Years ago when I was doing ARM development this issue cropped up, and there is information out there that has suggested asm volatile reordering (not just consecutive). What is even worse is that "perfectly consecutive" may imply something different than "consecutive". The concept of asm volatile reordering shows up in GCC bug reports occasionally. A couple of years ago it showed up again: gcc.gnu.org/ml/gcc-help/2017-10/msg00063.html . The problem is that many years ago the docs didn't even touch on these matters very well. - Naidanaiditch
It may have been a different story if my journey with asm volatile was a more recent one, but when you live through years where there were real questions it leaves you wondering what the documents are suggesting, since in the past it did appear asm volatile could be re-ordered on ARM targets. - Naidanaiditch
@PeterCordes: In the original question (now slightly altered) I had put the mov and the out in the same statement, which is okay (as you say). It was meant as a quick example, but I only noticed afterwards that I had neglected to add the "EAX" clobber on there. As well, I cleaned up the question to make the use of volatile consistent between examples. - Naidanaiditch
It is the execution order that is guaranteed; assembly source order and memory address order may change. - Iolaiolande
