Intel REX Encoding of PUSH

GAS gives the following encodings for the following instructions:

push rbp    # 0x55
push rbx    # 0x53
push r12    # 0x41 0x54
push r13    # 0x41 0x55

From the AMD64 spec (Page 313):

PUSH reg64    50 +rq    Push the contents of a 64-bit register onto the stack.

Since the offsets for rbp and rbx are 5 and 3, respectively, the first two encodings make sense. I don't understand what's going on with the last two encodings, though.

I understand that 0x40-0x4f is a REX prefix and 0x41 has the REX.B bit set (which is either an extension to the MSB of MODRM.rm or SIB.base, according to this external reference). The spec mentions that to access all of the 16 GPRs you need to use REX, but it's unclear where the cutoff is.

From consulting the docs for MODRM and SIB, I don't think SIB is used, because its purpose is indexing using a base+offset register (although to be honest, I can't really tell how you differentiate between MODRM and SIB given just the encoding).

So, I suspect MODRM is being used here. Considering just the push r12 (0x41 0x54) for the moment (and noting that r12 has offset 12), we have:

+----------------+--------------------+
| 0x41           | 0x54               |
+----------------+--------------------+
| REX            | MODRM              |
+--------+-------+-----+--------+-----+
| Prefix | WRXB  | mod | reg    | rm  |
| 0100   | 0001  | 01  | 01   0 | 100 |
+--------+-------+-----+--------+-----+

REX.B + MODRM.rm = 0b1100 = 12, so this would indicate that this is the source register (r12 = offset 12). If you ignore all of the tables in the external (unofficial) reference, REX.R + MODRM.mod + MODRM.reg = 0b00101 = 5, which is the first nibble of the push instruction base 0x50.

So, I think I have worked this backwards, but I don't understand how I would arrive at an encoding like 0x41 0x54. From the AMD reference, Figure 1-10 (Page 54) has a footnote that if MODRM.mod = 01 or 10, then the byte "includes an offset specified by the instruction displacement field." This would perhaps hint at why we have the instruction offset REX.R + MODRM.mod + MODRM.reg = 0b00101 = 5. But why is MODRM.mod part of the instruction offset? If it must be included, then instructions that take this offset form are limited to prefixes 0b01 or 0b10. That can't be right, right?

tl;dr

How does the REX encoding actually work for instructions like push?
What is the instruction offset cutoff for needing a REX prefix? (is it documented that I can't do 0x50 + 12 for push r12 like I could for push rbp or push rbx?)
Why is the MODRM.mod included in the prefix of the instruction base? (Or is this correct at all?)
Is this consistent for similar instructions like pop? (And how do I know which instructions support this? Does it work for all instructions that have opcodes of the form XX +xx?)
Where is this documented in the official manual?
How can I differentiate between whether a REX prefix is followed by a MODRM or SIB byte?
Is there better documentation that perhaps lays these processes out in steps instead of making you jump between several pages from table to table?

Intel's vol.2 manual PDF documents the encoding:

3.1.1.1 Opcode Column in the Instruction Summary Table (Instructions without VEX Prefix)

...
+rb, +rw, +rd, +ro — Indicated the lower 3 bits of the opcode byte is used to encode the register operand without a modR/M byte. The instruction lists the corresponding hexadecimal value of the opcode byte with low 3 bits as 000b. In non-64-bit mode, a register code, from 0 through 7, is added to the hexadecimal value of the opcode byte. In 64-bit mode, indicates the four bit field of REX.b and opcode[2:0] field encodes the register operand of the instruction. “+ro” is applicable only in 64-bit mode. See Table 3-1 for the codes.

Table 3-1 uses the same coding scheme as register numbers in ModRM and SIB, unsurprisingly, but Intel goes all out and has a full table of all integer registers for all operand-sizes. That includes AH/BH/CH/DH, because mov ah, 1 can use the 2-byte short form.

I've excerpted the relevant rows from the "quadword register (64-Bit Mode only)" column:

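From Intel's Table 3-1. Register Codes Associated With +rb, +rw, +rd, +ro

+-----+-------+-----------+
| reg | REX.B | Reg Field |
+-----+-------+-----------+
| RBX | None  | 3         |
| RBP | None  | 5         |
| R12 | Yes   | 4         |
| R13 | Yes   | 5         |
+-----+-------+-----------+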
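To make the rule concrete, here is a minimal Python sketch of how an assembler could arrive at those bytes. The helper name encode_push and the REG64 list are my own, purely for illustration. The only assumption is the usual register numbering (rax=0 ... r15=15), which is what Table 3-1 encodes.

```python
# Register numbers 0..15 in the usual x86-64 order (same numbering as ModRM/SIB).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

PUSH_BASE = 0x50  # PUSH r64 is listed as 50+rd: the low 3 opcode bits are 000b


def encode_push(reg: str) -> bytes:
    """Hypothetical helper: encode `push reg` for a 64-bit register."""
    num = REG64.index(reg)       # 4-bit register number
    low3 = num & 0b111           # goes into opcode[2:0]
    high = num >> 3              # goes into REX.B
    opcode = PUSH_BASE | low3    # still a one-byte opcode, no ModRM byte follows
    if high:
        rex = 0x40 | 0b0001      # 0100WRXB with only the B bit set
        return bytes([rex, opcode])
    return bytes([opcode])       # registers 0..7 need no REX prefix


for r in ("rbp", "rbx", "r12", "r13"):
    print(f"push {r:4} -> {encode_push(r).hex(' ')}")
```

This reproduces the four GAS encodings from the question (55, 53, 41 54, 41 55). The 0x54 in push r12 is the opcode byte itself (0x50 + 4), not a ModRM byte. The cutoff for needing REX is simply register number 8, because only 3 bits fit in the opcode.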

Fun fact: Intel's manual actually uses 50+rd rather than 50+ro for PUSH r64, the same notation as for push r32 in 32-bit mode. See https://www.felixcloutier.com/x86/push

Is this consistent for similar instructions like pop? (And how do I know which instructions support this? Does it work for all instructions that have opcodes of the form XX +xx?)

Yes. push/pop reg, mov reg,imm, and xchg eax, r32 / xchg rax, r64 all use the same encoding with 3 opcode bits to encode a register.
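Going the other direction, a rough decoder sketch shows that push and pop share the scheme. It only knows the 50+rd (push) and 58+rd (pop) forms. The function name decode_short_form and the table names are invented for this example, and it ignores every other prefix and instruction.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

# Base opcodes whose low 3 bits are 000b and hold a register number.
SHORT_FORMS = {0x50: "push", 0x58: "pop"}


def decode_short_form(code: bytes) -> str:
    """Hypothetical helper: decode push/pop r64 in 64-bit mode."""
    i = 0
    rex_b = 0
    if 0x40 <= code[i] <= 0x4F:   # optional REX prefix
        rex_b = code[i] & 1       # only the B bit matters for these forms
        i += 1
    opcode = code[i]
    base = opcode & 0xF8          # clear the 3 register bits
    if base not in SHORT_FORMS:
        raise ValueError(f"not a push/pop short form: {opcode:#04x}")
    reg = (rex_b << 3) | (opcode & 0b111)
    return f"{SHORT_FORMS[base]} {REG64[reg]}"


for hexbytes in ("55", "41 54", "5d", "41 5c"):
    print(hexbytes, "->", decode_short_form(bytes.fromhex(hexbytes)))
```

No ModRM or SIB byte is involved: the byte after the REX prefix is the opcode, and whether a ModRM byte follows is determined by that opcode, not by the prefix.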

It would be nice if we could have those 8 xchg opcodes back for something more useful (like more compact VEX or EVEX prefixes in 64-bit mode), but that ship sailed when AMD played it conservative with AMD64, mostly keeping machine code as similar as possible to 32-bit mode. They did reclaim the 0x4? inc/dec reg opcodes for use as REX prefixes, though.
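As a small illustration of that reclaiming, the sketch below interprets a 0x40..0x4F byte both ways: as a one-byte inc/dec r32 outside 64-bit mode (40+rd is inc, 48+rd is dec), and as a REX prefix with W/R/X/B bits in 64-bit mode. The function name describe_4x and its long_mode flag are made up for this example.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def describe_4x(byte: int, long_mode: bool) -> str:
    """Hypothetical helper: interpret a 0x40..0x4F byte in either mode."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("expected a byte in 0x40..0x4F")
    if long_mode:
        # 64-bit mode: 0100WRXB is a REX prefix, not an instruction.
        w, r, x, b = (byte >> 3) & 1, (byte >> 2) & 1, (byte >> 1) & 1, byte & 1
        return f"REX prefix W={w} R={r} X={x} B={b}"
    # Legacy/compat mode: 40+rd is inc r32, 48+rd is dec r32.
    op = "dec" if byte & 0x08 else "inc"
    return f"{op} {REG32[byte & 0b111]}"


print(describe_4x(0x41, long_mode=False))
print(describe_4x(0x41, long_mode=True))
```

So the same 0x41 that prefixes push r12 in 64-bit code would be inc ecx in 32-bit code.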
