The original question was "Is this assembly function call safe/complete?" The answer to that is: no. While it may appear to work in this simple example (especially if optimizations are disabled), you are violating rules that will eventually lead to failures (ones that are really hard to track down).
I'd like to address the (obvious) followup question of how to make it safe, but without feedback from the OP on the actual intent, I can't really do that.
So, I'll do the best I can with what we have and try to describe the things that make it unsafe and some of the things you can do about it.
Let's start by simplifying that asm:
__asm__(
"mov %0, %%edi;"
:
: "g"(a)
);
Even with this single statement, this code is already unsafe. Why? Because we are changing the value of a register (edi) without letting the compiler know.
How can the compiler not know, you ask? After all, it's right there in the asm! The answer comes from this line in the gcc docs:
GCC does not parse the assembler instructions themselves and does not
know what they mean or even whether they are valid assembler input.
In that case, how do you let gcc know what's going on? The answer lies in using the constraints (the stuff after the colons) to describe the impact of the asm.
Perhaps the simplest way to fix this code would be like this:
__asm__(
"mov %0, %%edi;"
:
: "g"(a)
: "edi"
);
This adds edi to the clobber list. In brief, this tells gcc that the value of edi is going to be changed by the code, and that gcc shouldn't assume any particular value will be in it when the asm exits.
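As a rough, testable illustration of the clobber fix, the sketch below copies a into edi and then reads edi back so the effect is visible from C. The function name load_edi is made up for this example, and the second mov and the output operand are additions that are not in the original snippet. It assumes GCC or Clang targeting x86 or x86-64; on other targets the asm is skipped.

```c
/* Hypothetical helper, not from the original question.
 * Copies `a` into edi, then reads edi back so the result is visible in C. */
int load_edi(int a)
{
    int seen = a;   /* value returned as-is on non-x86 targets */
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile (
        "mov %1, %%edi\n\t"   /* the instruction from the question */
        "mov %%edi, %0"       /* added only so the value can be checked */
        : "=r"(seen)          /* output: any general register gcc picks */
        : "g"(a)              /* input: register, memory or immediate */
        : "edi"               /* clobber: tell gcc that edi is overwritten */
    );
#endif
    return seen;
}
```

Because edi is in the clobber list, gcc will not place either operand in it and will not assume its old value survives the asm.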
Now, while that's the easiest, it's not necessarily the best way. Consider this code:
__asm__(
""
:
: "D"(a)
);
This uses a machine constraint to tell gcc to put the value of the variable a into the edi register for you. Doing it this way, gcc will load the register for you at a 'convenient' time, perhaps by always keeping a in edi.
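To see the "D" constraint doing the load, here is a minimal sketch that leaves the mov to gcc and only reads edi inside the asm. The name read_edi and the output operand are assumptions added for the demonstration; the asm never writes edi, so the operand is legitimately an input. Again this assumes GCC or Clang on x86 or x86-64.

```c
/* Hypothetical helper: gcc loads `a` into edi because of the "D" input
 * constraint; the asm only reads edi and never modifies it. */
int read_edi(int a)
{
    int seen = a;   /* value returned as-is on non-x86 targets */
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "mov %%edi, %0"   /* edi already holds `a` on entry */
        : "=r"(seen)      /* output: any general register */
        : "D"(a)          /* input: must arrive in (e)di */
    );
#endif
    return seen;
}
```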
There is one (significant) caveat to this code: by putting the parameter after the 2nd colon, we are declaring it to be an input. Input parameters are required to be read-only (i.e., they must have the same value on exiting the asm).
In your case, the call statement means that we won't be able to guarantee that edi won't be changed, so this doesn't quite work. There are a few ways to deal with this. The easiest is to move the constraint up after the first colon, making it an output, and specify "+D" to indicate that the value is read+write. But then the contents of a are going to be pretty much undefined after the asm (printf could set it to anything). If destroying a is unacceptable, there's always something like this:
int junk;
__asm__ volatile (
""
: "=D" (junk)
: "0"(a)
);
This tells gcc that on starting the asm, it should put the value of the variable a into the same place as output constraint #0 (i.e., edi). It also says that on exit, edi no longer holds a; whatever is in edi belongs to the variable junk.
Edit: Since the 'junk' variable isn't actually going to be used, we need to add the volatile qualifier; otherwise gcc is free to discard an asm whose outputs are never used. Volatile was implicit when there weren't any output parameters.
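The two options can be compared side by side. In this sketch an xor stands in for whatever code destroys edi (in the question that would be the call); the function names zero_in_place and zero_scratch are invented for the illustration. It assumes GCC or Clang on x86 or x86-64, with a plain C stand-in on other targets.

```c
/* Variant 1: "+D" makes `a` itself the read+write operand, so whatever
 * the asm leaves in edi becomes the new value of `a`. */
int zero_in_place(int a)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile (
        "xor %%edi, %%edi"   /* stand-in for code that destroys edi */
        : "+D"(a)
        :
        : "cc"               /* xor modifies the flags */
    );
#else
    a = 0;                   /* portable stand-in for other targets */
#endif
    return a;
}

/* Variant 2: edi is tied to a throwaway output, so `a` keeps its value. */
int zero_scratch(int a)
{
#if defined(__i386__) || defined(__x86_64__)
    int junk;
    __asm__ volatile (
        "xor %%edi, %%edi"
        : "=D"(junk)         /* on exit, edi belongs to junk */
        : "0"(a)             /* on entry, operand 0's register holds a */
        : "cc"
    );
    (void)junk;              /* deliberately unused */
#endif
    return a;
}
```

With "+D" the caller's copy of a is overwritten; with the junk output, gcc knows edi changed but still treats a as intact.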
One other point on that line: You end it with a semi-colon. This is legal and will work as expected. However, if you ever want to use the -S command line option to see exactly what code got generated (and if you want to get good with inline asm, you will), you will find that produces difficult-to-read code. I'd recommend using \n\t instead of a semi-colon.
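A small sketch of the \n\t style, using an invented function add_three that is unrelated to the question's code: each instruction ends up on its own indented line in the -S output. It assumes GCC or Clang on x86 or x86-64.

```c
/* Hypothetical example of separating instructions with "\n\t". */
int add_three(int x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "add $1, %0\n\t"   /* newline + tab: one instruction per line */
        "add $2, %0"       /* last instruction needs no separator */
        : "+r"(x)          /* x is read and written in a register */
        :
        : "cc"             /* add modifies the flags */
    );
#else
    x += 3;                /* portable stand-in for other targets */
#endif
    return x;
}
```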
All that and we're still on the first line...
Obviously the same would apply to the other two mov statements.
Which brings us to the call statement.
Both Michael and I have listed a number of reasons why doing a call in inline asm is difficult:
- Handling all the registers that may be clobbered by the function call's ABI.
- Handling red-zone.
- Handling alignment.
- Memory clobber.
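To make those four items concrete, here is a hedged sketch of what a call from inline asm might look like on x86-64 with the System V ABI (Linux, macOS and similar). The names triple and call_via_asm are invented for this illustration, and the clobber list is not exhaustive: x87/MMX registers and any AVX-512 registers are left out for brevity. Treat it as an experiment under those assumptions, not as a recipe for production code. The function pointer is passed as an operand so that the compiler can see the callee is used.

```c
/* Hypothetical callee following the normal calling convention. */
long triple(long x) { return 3 * x; }

/* Hypothetical wrapper: calls fn(arg) from inside inline asm. */
long call_via_asm(long (*fn)(long), long arg)
{
    long ret;
#if defined(__x86_64__) && !defined(_WIN32)
    __asm__ volatile (
        "mov %%rsp, %%rbx\n\t"   /* remember the original stack pointer */
        "sub $128, %%rsp\n\t"    /* red zone: skip the 128 bytes below rsp */
        "and $-16, %%rsp\n\t"    /* alignment: rsp must be 16-byte aligned at a call */
        "call *%[fn]\n\t"
        "mov %%rbx, %%rsp"       /* restore the stack pointer */
        : "=a"(ret),             /* return value comes back in rax */
          "+D"(arg)              /* first argument in rdi; destroyed by the call */
        : [fn] "r"(fn)
        : "rbx",                 /* used above to save rsp */
          /* remaining call-clobbered integer registers */
          "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11",
          /* call-clobbered vector registers */
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory",              /* the callee may read or write any memory */
          "cc"
    );
#else
    ret = fn(arg);               /* other targets: just make a normal call */
#endif
    return ret;
}
```

A call such as call_via_asm(triple, 14) should behave like triple(14), but every line of the clobber list is something the compiler would otherwise have handled for you.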
If the goal here is 'learning,' then feel free to experiment. But I don't know that I would ever feel comfortable doing this in production code. Even when it looks like it works, I'd never feel confident there wasn't some weird case I'd missed. That's aside from my normal concerns about using inline asm at all.
I know, that's a lot of information. Probably more than you were looking for as an introduction to gcc's asm command, but you've picked a challenging place to start.
If you haven't done so already, spend time looking over all the docs in gcc's Assembly Language interface. There's a lot of good information there along with examples to try to explain how it all works.
[...] printf but it pretty much applies to calling any function in 64-bit code from inline assembler: https://mcmap.net/q/16848/-calling-printf-in-extended-inline-asm – Lather
[...] foo's callers are known, it could change how foo works so it doesn't even follow the standard ABI. Calling that function in a way the compiler can't see could lead to broken code after link-time optimization, if it builds at all. (e.g. maybe foo was inlined into all of its C callers, and no stand-alone definition was emitted. This happens even without LTO for static functions.) – Huffman
[...] "call foo" with no indirection, like in that linked question about calling printf. – Huffman