What is "=qm" in extended assembler


Asked 2/2, 2014 at 22:2 Answered 2/2, 2014 at 22:27

Solved gcc assembly syntax x86 inline-assembly

I was looking through an Intel provided reference implementation of RDRAND instruction. The page is Intel Digital Random Number Generator (DRNG) Software Implementation Guide, and the code came from Intel Digital Random Number Generator software code examples.

The following is the relevant portion from Intel. It reads a random value and places it in val, and it sets the carry flag on success.

char rc;
unsigned int val;

__asm__ volatile(
    "rdrand %0 ; setc %1"
    : "=r" (val), "=qm" (rc)
);

// 1 = success, 0 = underflow
if(rc) {
    // use val
    ...
}

Sorry to have to ask. I don't think it is covered in the GNU Extended Assembler documentation, and searching for "=qm" produces spurious hits.

What does the "=qm" mean in the extended assembler?

Hydrodynamic answered 2/2, 2014 at 22:2 Comment(8)

GCC6 can avoid the craptastic setc to create an integer that the compiler will then test to put back into flags to branch on, using a flag output constraint. (And memory output is a poor choice for older compilers, don't let the compiler shoot itself in the foot that way.). Or better, use the _rdrand32_step intrinsic (RDRAND and RDSEED intrinsics GCC and Intel C++) and not inline asm at all. – Pollie 19/4, 2019 at 22:28
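The flag output constraint mentioned in this comment can be sketched as follows. This is a minimal illustration, not the code from the thread: it uses an add instead of rdrand so it runs on any x86 machine, the function name add_overflows is hypothetical, and it assumes a compiler that defines __GCC_ASM_FLAG_OUTPUTS__ (GCC 6 or later, and recent Clang). Elsewhere it falls back to plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b (mod 2^32) in *sum and returns 1 if the
   addition carried out of bit 31, 0 otherwise. */
static int add_overflows(uint32_t a, uint32_t b, uint32_t *sum)
{
    assert(sum != NULL);
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t s = a;
    int carry;
    /* "=@ccc" asks the compiler to read the carry flag directly after the
       asm statement, so no setc instruction is needed. */
    __asm__ ("addl %2, %0"
             : "+r" (s), "=@ccc" (carry)
             : "r" (b));
    *sum = s;
    return carry;
#else
    /* Portable fallback: unsigned wraparound means the sum is smaller than
       an operand exactly when a carry occurred. */
    *sum = a + b;
    return *sum < a;
#endif
}
```

With this form the compiler can branch on the flag itself instead of materializing it in a byte register and testing it again.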

Thanks Peter. Intrinsics are avoided because of GCC Issue 80180, Incorrect codegen from rdseed intrinsic use. We don't want to risk running into the bug in the field. It is easy enough to sidestep with the inline asm. – Hydrodynamic 19/4, 2019 at 22:32

Oh well. In that case I'd definitely suggest removing the "=m" option from your intrinsics. Clang likes to pick memory when it has the choice even with no register pressure, and you definitely don't want that. It's only 2 register outputs total, and you're always reading the result right away afterward, so don't give the compiler the option of memory, you can be pretty sure it's not a helpful choice. (gcc is smart enough not to pick memory, but clang isn't.) – Pollie 19/4, 2019 at 22:53

@Peter - Yeah, no big deal. The reason I know about it is because I hit it during testing. It looks like we went with something similar (after several iterations): "=a" (*reinterpret_cast<word64*>(output)) (where output is memory). The byte codes gave us better coverage using older compilers. – Hydrodynamic 19/4, 2019 at 22:56

I think one of the failure modes for RDRAND is "RNG hardware broken, fails forever". If you're going to loop on it at all, in theory you should have a repeat limit, unless you don't mind an infinite loop as the behaviour for that case. I'm not 100% sure that's a real possibility. (At least on the IvB implementation, it can never actually fail from buffer-empty/try-again, but code should be prepared for that, too.) – Pollie 19/4, 2019 at 23:14
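A repeat limit of the kind suggested here might look like the sketch below. Everything named in it is an assumption for illustration: rand_step_fn stands in for whatever wraps the rdrand/setc pair (1 on success, 0 on failure), and the two stub generators exist only so the loop can be exercised without RDRAND hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical step function: returns 1 and writes *out on success,
   returns 0 on failure (same contract as rc in the question). */
typedef int (*rand_step_fn)(uint32_t *out);

/* Calls step up to max_tries times. Returns 1 as soon as one call
   succeeds, or 0 if every attempt failed. */
static int rand_with_retry(rand_step_fn step, unsigned max_tries, uint32_t *out)
{
    assert(step != NULL && out != NULL);
    for (unsigned i = 0; i < max_tries; ++i) {
        if (step(out))
            return 1;
    }
    return 0;
}

/* Stub: models hardware that never produces a value. */
static int always_fails(uint32_t *out)
{
    (void)out;
    return 0;
}

/* Stub: fails on its first two calls, then succeeds with a fixed value. */
static int fails_twice_then_ok(uint32_t *out)
{
    static int calls = 0;
    if (++calls <= 2)
        return 0;
    *out = 0x12345678u;
    return 1;
}
```

The caller decides what a 0 return means (throw, fall back to another source, or abort), which keeps the "fails forever" case from turning into an endless loop.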

@Peter - Yeah, I was talking to DJ Johnston about that a while back. He designed the circuit for Intel. He also advised a safety valve. My feeling is, we need good silicon or all bets are off. If the silicon is broken they have bigger problems than an endless loop. Removing the checks and the throw simplified logic considerably. – Hydrodynamic 19/4, 2019 at 23:25

@Peter - In fact, early iterations of the class did use a safety valve. Then we learned RDSEED will fail to produce a random number on occasion (it is expected, and happens about 1/64 to 1/256 requests, depending on how the die is tuned). And not long after, we learned the safety was set too low for larger blocks of random numbers. Spurious failures were encountered, which led to mailing list messages and bug reports. Now we just deliver all the bytes requested, and don't worry about safeties. No more failures, no more mailing list messages, and no more bug reports. – Hydrodynamic 19/4, 2019 at 23:28

Oh interesting, I guess later silicon could exhaust the buffer. If you haven't had reports of actual infinite loops, then that failure mode must be very rare or not implemented in practice. And BTW, my understanding was that the digital logic can (in theory) safely detect that the analogue part is broken / drifted too far / something and not a quality source of randomness. Not that the whole chip is failing, so there's no implication that it will start running machine code wrong. But yeah, if you aren't getting any bug reports about it, then that's good enough for now. – Pollie 20/4, 2019 at 0:12

What you're looking at is an inline assembler constraint. The GCC documentation is at 6.47.3.1 Simple Constraints and, in the x86 family section, 6.47.3.4 Constraints for Particular Machines. This one (=qm) combines three flags which indicate:

=: The operand is write-only - its previous value is not relevant.
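q: The operand must be in a register whose low byte is addressable. In 32-bit mode that means a, b, c, or d (it cannot be in esi, for instance); in 64-bit mode any integer register qualifies.
m: Alternatively, the operand may be placed in memory. The compiler picks whichever of q or m suits it.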
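To see the three flags working together without needing RDRAND hardware, here is a minimal sketch that captures the carry flag from an ordinary add. The function name add_carry is hypothetical, and the portable branch is only there so the example builds on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b (mod 2^32) in *sum and returns the
   carry-out of the addition (0 or 1). */
static unsigned char add_carry(uint32_t a, uint32_t b, uint32_t *sum)
{
    assert(sum != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t s = a;
    unsigned char carry;
    /* %0: read-write 32-bit register holding the running sum.
       %1: write-only byte, either a byte-addressable register (q)
           or a memory location (m), at the compiler's choice.
       %2: 32-bit register input. */
    __asm__ ("addl %2, %0 ; setc %1"
             : "+r" (s), "=qm" (carry)
             : "r" (b)
             : "cc");
    *sum = s;
    return carry;
#else
    /* Portable fallback: unsigned wraparound reveals the carry. */
    *sum = a + b;
    return (unsigned char)(*sum < a);
#endif
}
```

setc only accepts an 8-bit register or an 8-bit memory operand, which is why the constraint offers exactly those two choices.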

Extortionate answered 2/2, 2014 at 22:27 Comment(2)

Thanks duskwuff. Does the use of "=qm" require a clobber on rc? Or is it OK since the assembler will put it in a register, and track that register's use? – Hydrodynamic 3/2, 2014 at 0:15

No, it does not. Specifying rc as an output operand already implies that it'll get overwritten with the new value of rc. Generally speaking, you only need to list "clobbers" for temporary registers that aren't otherwise listed in the operands. – Extortionate 3/2, 2014 at 0:22
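For contrast, here is a small sketch of a case where a clobber is needed: the asm body uses ecx as scratch space, and ecx is not one of the operands, so it has to be declared. The function name twice_via_scratch is hypothetical, and the example is deliberately contrived (the compiler would do this better on its own).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 2 * x (mod 2^32), computed through a
   hard-coded scratch register to demonstrate the clobber list. */
static uint32_t twice_via_scratch(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t r;
    /* ecx is written inside the asm but is not an operand, so it is listed
       as a clobber. "cc" is listed because addl changes the flags. The
       output r needs no clobber entry: being an output already tells the
       compiler it is overwritten. */
    __asm__ ("movl %1, %%ecx ; addl %1, %%ecx ; movl %%ecx, %0"
             : "=r" (r)
             : "r" (x)
             : "ecx", "cc");
    return r;
#else
    /* Portable fallback with the same wraparound behaviour. */
    return x + x;
#endif
}
```

Without the "ecx" entry the compiler would be free to keep a live value in ecx across the asm statement, and that value would be silently destroyed.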

qm probably means an 8-bit (1-byte) memory operand. =qm is a valid constraint for storing a 1-byte result. See what setc accepts:

http://web.itu.edu.tr/~aydineb/index_files/instr/setc.html

reg8 and mem8

As we know, only the a, b, c, and d registers (eax, ebx, ecx, edx) that q refers to can be used, because their low bytes can be accessed as al, bl, cl, and dl. Combining q with m gives us mem8 as well; m means memory. That's what I meant.

Townshend answered 2/2, 2014 at 22:22 Comment(0)


Wow, that stumped me at first, but I searched around a bit and found out that it is a reference to the model of the processor this piece of code is meant for.

Specifically, I read that it is for the i7 quad-core.

Is that where you got this code from?

It is a simple value indicator for a variable syntax.

Willardwillcox answered 2/2, 2014 at 22:13 Comment(1)

The QM model suffix (for Quad-core Mobile processors) is completely unrelated. – Extortionate 2/2, 2014 at 22:28
