Compile with gcc -masm=intel
and don't try to switch modes inside the asm template string. AFAIK there's no equivalent before clang14 (Note: MacOS installs clang as gcc
/ g++
by default.)
Also, of course you need to use valid GNU C inline asm, using operands to tell the compiler which C objects you want to read and write.
I don't believe Intel syntax uses the percent sign. Perhaps I am missing something?
You're getting mixed up between %operand
substitutions into the Extended-Asm template (which use a single %
), vs. the final asm that the assembler sees.
You need %%
to use a literal %
in the final asm. You wouldn't use "mov %%eax, 1"
in Intel-syntax inline asm, but you do still use "mov %0, 1"
or %[named_operand]
.
See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html. In Basic asm (no operands), there is no substitution and % isn't special in the template, so you'd write mov $1, %eax
in Basic asm vs. mov $1, %%eax
in Extended, if for some reason you weren't using an operand like mov $1, %[tmp]
or mov $1, %0
.
uint32_t rnds_00_15;
is a local with automatic storage. Of course it there's no asm symbol with that name.
Use %[rnds_00_15]
and compile with -masm=intel
(And remove the .att_syntax
at the end; that would break the compiler-generate asm that comes after.)
You also need to remove the DWORD PTR
, because the operand-expansion already includes that, e.g. DWORD PTR [rsp - 4]
, and clang errors on DWORD PTR DWORD PTR [rsp - 4]
. (GAS accepts it just fine, but the 2nd one takes precendence so it's pointless and potentially misleading.)
And you'll want a "=m"
output operand if you want the compiler to reserve you some scratch space on the stack. You must not modify input-only operands, even if it's unused in the C. Maybe the compiler decides it can overlap something else because it's not written and not initialized (i.e. UB). (I'm not sure if your "memory"
clobber makes it safe, but there's no reason not to use an early-clobber output operand here.)
And you'll want to avoid label name conflicts by using %=
to get a unique number.
Working example (GCC and ICC, but not clang unfortunately), on the Godbolt compiler explorer (which uses -masm=intel
depending on options in the dropdown). You can use "binary mode" (the 11010 button) to prove that it actually assembles after compiling to asm without warnings.
int libtest_intel()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overridden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
( // ".intel_syntax noprefix \n\t"
"mov %[rnds_00_15], 1 \n\t"
"cmp %[rnds_00_15], 1 \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
:
: // no clobbers
);
return 0;
}
Compiles (with gcc -O3 -masm=intel
) to this asm. Also works with gcc -m32 -masm=intel
of course:
libtest_intel:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
je .Ldone8
.Ldone8:
xor eax, eax
ret
I couldn't get this to work with clang: It choked on .intel_syntax noprefix
when I left that in explicitly.
Operand-size overrides:
You have to use %b[tmp]
to get the compiler to substitute in BYTE PTR [rsp-4]
to only access the low byte of a dword input operand. I'd recommend AT&T syntax if you want to do much of this.
Using %[rnds_00_15]
results in Error: junk '(%ebp)' after expression.
That's because you switched to Intel syntax without telling the compiler. If you want it to use Intel addressing modes, compile with -masm=intel
so the compiler can substitute into the template with the correct syntax.
This is why I avoid that crappy GCC inline assembly at nearly all costs. Man I despise this crappy tool.
You're just using it wrong. It's a bit cumbersome, but makes sense and mostly works well if you understand how it's designed.
Repeat after me: The compiler doesn't parse the asm string at all, except to do text substitutions of %operand
. This is why it doesn't notice your .intel_syntax noprefex
and keeps substituting AT&T syntax.
It does work better and more easily with AT&T syntax though, e.g. for overriding the operand-size of a memory operand, or adding an offset. (e.g. 4 + %[mem]
works in AT&T syntax).
Dialect alternatives:
If you want to write inline asm that doesn't depend on -masm=intel
or not, use Dialect alternatives (which makes your code super-ugly; not recommended for anything other than wrapping one or two instructions):
Also demonstrates operand-size overrides
#include <stdint.h>
int libtest_override_operand_size()
{
uint32_t rnds_00_15;
// Intel syntax operand-size can only be overriden with operand modifiers
// because the expansion includes an explicit DWORD PTR
__asm__ __volatile__
(
"{movl $1, %[rnds_00_15] | mov %[rnds_00_15], 1} \n\t"
"{cmpl $1, %[rnds_00_15] | cmp %k[rnds_00_15], 1} \n\t"
"{cmpw $1, %[rnds_00_15] | cmp %w[rnds_00_15], 1} \n\t"
"{cmpb $1, %[rnds_00_15] | cmp %b[rnds_00_15], 1} \n\t"
"je .Ldone%= \n\t"
".Ldone%=: \n\t"
: [rnds_00_15] "=&m" (rnds_00_15)
);
return 0;
}
With Intel syntax, gcc compiles it to:
mov DWORD PTR [rsp-4], 1
cmp DWORD PTR [rsp-4], 1
cmp WORD PTR [rsp-4], 1
cmp BYTE PTR [rsp-4], 1
je .Ldone38
.Ldone38:
xor eax, eax
ret
With AT&T syntax, compiles to:
movl $1, -4(%rsp)
cmpl $1, -4(%rsp)
cmpw $1, -4(%rsp)
cmpb $1, -4(%rsp)
je .Ldone38
.Ldone38:
xorl %eax, %eax
ret
uint32_t rnds_00_15;
is a local with automatic storage. Of course it there's no asm symbol with that name. Did you mean%[rnds_00_15]
? And hopefully with a"+m"
read/write operand if you're going to store1
into it. Or simply"=m"
if you want the compiler to reserve you some scratch space on the stack. – Duero%[rnds_00_15]
results inError: junk '(%ebp)' after expression
. This is why I avoid that crappy GCC inline assembly at nearly all costs. Man I despise this crappy tool. – Migraine