Is there a default operand size in the x86-64 (AMD64) architecture?

This is a question about the operand-size override prefixes in the x86-64 (AMD64) architecture.

Here is a bunch of assembler instructions (nasm) and their encodings; by new I mean the r8, ..., r15 registers:

                                                                   67: address-size override prefix
                                                                   |
                                                                   |  4x: operand-size override prefix
                                                                   |  |
   ;   Assembler                   ; | Dst operand | Src operand | -- --
       mov      eax,ecx            ; | 32-bit      | 32-bit      |       89 C8     |
       mov      r8d,ecx            ; | 32-bit new  | 32-bit      |    41 89 C8     |
       mov      eax,r9d            ; | 32-bit      | 32-bit new  |    44 89 C8     |
       mov      r8d,r9d            ; | 32-bit new  | 32-bit new  |    45 89 C8     |
       mov      rax,rcx            ; | 64-bit      | 64-bit      |    48 89 C8     |
       mov      r8,rcx             ; | 64-bit new  | 64-bit      |    49 89 C8     |
       mov      rax,r9             ; | 64-bit      | 64-bit new  |    4C 89 C8     |
       mov      r8,r9              ; | 64-bit new  | 64-bit new  |    4D 89 C8     |

       lea      eax,[ecx]          ; | 32-bit      | 32-bit      | 67    8D 01     |
       lea      r8d,[ecx]          ; | 32-bit new  | 32-bit      | 67 44 8D 01     |
       lea      eax,[r9d]          ; | 32-bit      | 32-bit new  | 67 41 8D 01     |
       lea      r8d,[r9d]          ; | 32-bit new  | 32-bit new  | 67 45 8D 01     |
       lea      rax,[rcx]          ; | 64-bit      | 64-bit      |    48 8D 01     |
       lea      r8,[rcx]           ; | 64-bit new  | 64-bit      |    4C 8D 01     |
       lea      rax,[r9]           ; | 64-bit      | 64-bit new  |    49 8D 01     |
       lea      r8,[r9]            ; | 64-bit new  | 64-bit new  |    4D 8D 01     |

       push     rax                ; |             | 64-bit      |       50        |
       push     r8                 ; |             | 64-bit new  |    41 50        |

From studying these and the same instructions with other registers, I deduce the following. There is a pairing between ‘old’ and ‘new’ registers. Non-exhaustively:

   AX <--> R8
   CX <--> R9
   DX <--> R10
   BX <--> R11
   BP <--> R13 

Ignoring the size prefix, the instruction bytes do not refer to particular registers, but to pairs of registers. As an example: the bytes 89 C8 indicate a mov instruction from a source which is either ecx, rcx, r9d, or r9, to a destination which is either eax, rax, r8d, or r8. Given that the operands must be both 32- or 64-bits wide, there are eight legal possible combinations. The operand-size override prefix (or absence thereof) indicates which of those combinations is the intended one. For instance if the prefix is present and is 44, then the source operand must be a 32-bit new register (in this example then collapsing to r9d) and the destination must be a 32-bit old register (here then signalling eax).

I may not have got it totally right, but I think I get the gist of it. It would appear then that what the operand-size override prefixes do override is the fact that without them the instruction would use 32-bit ‘old’ operands.

But for sure, there is something that escapes me, otherwise: what sense then does it make to talk about “a version of x86-64 with a default operand-size of 64-bit” (like here)?

Or is there a way, running on a 64-bit machine, to set the default operand size to either 32 or 64, and if so, and if my program set the machine appropriately, I would see different encodings?

Also: when would the 66H operand-size override prefix be used?

Nipa answered 7/7, 2021 at 15:57 Comment(8)
Have a look at the Intel Software Development Manuals. There the encoding and meaning of prefixes is explained. I might write a real answer later.Babul
The 40h to 4Fh prefixes are called REX prefixes. They can indicate 64-bit operand size. They can also indicate to use one of the upper 8 registers for the source or the destination. Any combination of these options is possible I believe.Bibliography
The 66 prefix changes the operand size to 16 bits.Bolten
Refer to eg wiki.osdev.org/X86-64_Instruction_Encoding#Encoding for the REX bit meanings.Bibliography
Yes, in machine code the default is 32-bit for most instructions, and 64-bit for stack and jump/call instructions. In assembly source there's no default; it must be implied by a register or specified explicitly. (Except for some assemblers having a default for push/pop.) Note that 16-bit AX corresponds to 16-bit R8W, while RAX and R8 are the pair distinguished by a REX prefix.Kassiekassity
@Bibliography Specifically, the REX prefix encodes four bits of state (one bit for the operand size, three bits for high registers). The mere presence of a REX prefix additionally encodes that you want sil, dil, spl, and bpl instead of ah, ch, dh, and bh.Babul
@Peter Cordes By “AX <--> R8” I meant to say that the registers that are paired for encoding purposes are eax, rax on one side, and r8d, r8 on the other. But Peter, on the question that I linked, why do you talk about “a version of x86-64 with a default operand-size of 64-bit”?Nipa
The "version of x86-64 with a default operand-size of 64-bit" is purely hypothetical. Peter is talking about how things would work if there were a CPU that had such behavior. But in real life, there isn't.Magenmagena

Yes, in 64-bit machine code the default operand-size is 32-bit for most instructions, and 64-bit for stack and jump/call instructions as well as loop and jrcxz. (The default address-size is 64-bit, so add eax, [rdi] is a 2-byte instruction with no prefixes.) And no, the defaults are not changeable: you can't have a 2-byte add rax, rdx.

Operand-size encoding in 64-bit mode

  • 64-bit operand-size is signalled by REX.W (0x4? with the high bit set in the low nibble, 48..4f). A REX prefix with the W bit cleared can never override the operand-size to 32-bit for opcodes where it defaults to something else. (Like push)
  • 16-bit operand-size is signalled by a 0x66 prefix, like imul ax, [r8], 123
  • 8-bit operand-size uses different opcodes. (8086 had 8 and 16-bit operand-sizes; the opcodes for 8-bit operand size are unchanged since then. The opcodes that were 16-bit on 8086 now have an operand-size that depends on the mode's default and on prefixes.)

(In other modes, there is no REX, and 66 sets it to whatever the non-default is.)
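The three rules above can be sketched as a small Python function. This is only an illustration: the function name and the `default` parameter are made up here, it covers only non-byte opcodes in 64-bit mode, and it assumes any REX byte in the list sits immediately before the opcode (a REX prefix anywhere else is ignored by the CPU).

```python
def operand_size_64bit_mode(prefixes, default=32):
    """Effective operand size in bits for a non-byte opcode in 64-bit mode.

    prefixes: prefix byte values that precede the opcode.
    default:  32 for most opcodes, 64 for push/pop/call-style opcodes.
    Hypothetical helper for illustration, not a real decoder.
    """
    # A REX prefix is 0x40..0x4F; the W bit is bit 3, so REX.W is 0x48..0x4F.
    rex_w = any(0x48 <= p <= 0x4F for p in prefixes)
    has_66 = 0x66 in prefixes
    if rex_w:
        return 64   # REX.W takes precedence over a 66 prefix
    if has_66:
        return 16
    return default  # a REX prefix with W=0 never changes the size


print(operand_size_64bit_mode([]))                  # add eax, [rdi]
print(operand_size_64bit_mode([0x66]))              # add ax, [rdi]
print(operand_size_64bit_mode([0x48]))              # add rax, [rdi]
print(operand_size_64bit_mode([0x41], default=64))  # push r8
```

Passing `default=64` models the stack instructions: a REX prefix with W cleared (such as the 41 in push r8) leaves them at 64-bit.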

Fun fact: loop and jrcxz are overridden to use ECX instead of RCX implicitly by an address-size prefix, not operand-size. IIRC, this makes some sense because the operand-size attribute of a branch affects whether it truncates EIP to IP or not.

For example, GNU .intel_syntax disassembly of those NASM-syntax examples from above.

objdump -drwC -Mintel foo
  401000:       6a 7b                   push   0x7b
  401002:       66 6a 7b                pushw  0x7b
  401005:       03 07                   add    eax,DWORD PTR [rdi]
  401007:       66 03 07                add    ax,WORD PTR [rdi]
  40100a:       48 03 07                add    rax,QWORD PTR [rdi]
  40100d:       66 41 6b 00 7b          imul   ax,WORD PTR [r8],0x7b

Note the imul example used a "high" register, so it needed a REX prefix to signal R8, separate from needing a 66 prefix to signal 16-bit operand-size. The W bit is not set in the REX prefix: it's 0x41, not 0x49.
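To make the 0x41 vs. 0x49 distinction concrete, here is a minimal Python sketch that splits a REX byte into its four flag bits. The function name `decode_rex` is just for illustration; the bit layout (0100WRXB) is the standard one.

```python
def decode_rex(byte):
    """Split a REX prefix byte (0x40..0x4F, layout 0100WRXB) into its flags."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM.reg field
        "X": (byte >> 1) & 1,  # extends the SIB.index field
        "B": byte & 1,         # extends ModRM.rm / SIB.base / opcode register
    }


print(decode_rex(0x41))  # the imul example: only B set, to reach r8
print(decode_rex(0x49))  # what it would be with 64-bit operand size too
```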

It doesn't make sense to have both REX.W and a 0x66 prefix. It seems that the REX.W prefix "wins" in that case. Single-stepping 66 48 05 40 e2 01 00 data16 add rax,0x1e240 in Linux GDB on an i7-6700k (Skylake), the single-step leaves RIP pointing to the end of that whole instruction (and adding the full immediate to RAX), not decoding it as add ax, 0xe240 and leaving RIP pointing into the middle of the 4-byte immediate. (A 66 prefix is length-changing for that opcode, like most that have a 32-bit immediate which becomes 16-bit. See https://agner.org/optimize/ re: LCP stalls.)

I got NASM to emit that from o16 add rax, 123456. REX prefixes in general are normal and fine with a 66 prefix, e.g. to encode add r8w, [r15 + r12*4], needing all 3 other bits to be set in the REX's low nibble.
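The length-changing effect can be sketched in Python for this one opcode (05, add to the accumulator with an immediate). This is a toy with a made-up function name, handling only 66 and REX prefixes; it encodes the observed behaviour that REX.W wins over 66, and the rule that a REX prefix only counts when it is immediately before the opcode.

```python
def add_acc_imm_length(code):
    """Length in bytes of an opcode-05 `add eAX/rAX, imm` in 64-bit mode.

    Hypothetical helper: only 66 and REX prefixes are recognised.
    """
    i = 0
    has_66 = False
    rex_w = False
    while code[i] == 0x66 or 0x40 <= code[i] <= 0x4F:
        if code[i] == 0x66:
            has_66 = True
            rex_w = False  # a REX seen before this prefix is ignored
        else:
            rex_w = bool(code[i] & 0x08)
        i += 1
    if code[i] != 0x05:
        raise ValueError("sketch only handles opcode 05")
    # imm16 only when 66 is in effect; REX.W keeps a (sign-extended) imm32.
    imm_bytes = 2 if (has_66 and not rex_w) else 4
    return i + 1 + imm_bytes


insn = bytes.fromhex("66 48 05 40 e2 01 00")
print(add_acc_imm_length(insn))             # whole instruction, as GDB showed
print(int.from_bytes(insn[3:7], "little"))  # the immediate, 0x1e240
```

Without the REX.W byte, the same function reports a 4-byte instruction for 66 05 40 e2, which is the add ax, 0xe240 decoding that the single-step experiment ruled out.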


  • 32-bit address size is signalled by a 0x67 prefix, like add eax, [edx].

It can of course be combined with operand-size stuff, totally orthogonal.

Normally 32-bit address size is only useful for the Linux x32 ABI (ILP32 in long mode to save cache footprint on pointer-heavy data structures) where you may want to truncate high garbage from a pointer to make sure address math correctly wraps to stay in the low 4GiB, even with 32-bit negative numbers.

  401012:       67 03 04 ba             add    eax,DWORD PTR [edx+edi*4]
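For reference, the ModRM and SIB bytes of that instruction (04 ba) can be picked apart with a short Python sketch. The function name is made up, and it deliberately handles only this simple shape: mod=00 with a SIB byte, 32-bit address size, and no REX prefix.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm_sib(modrm, sib):
    """Return (register operand, memory operand) for a mod=00 + SIB encoding."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:
        raise ValueError("sketch only handles mod=00 with a SIB byte")
    scale = 1 << (sib >> 6)
    index = (sib >> 3) & 7
    base = sib & 7
    if index == 4 or base == 5:
        raise ValueError("special cases (no index, disp32 base) not handled")
    return REG32[reg], f"[{REG32[base]}+{REG32[index]}*{scale}]"


print(decode_modrm_sib(0x04, 0xBA))  # operands of 67 03 04 ba
```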

In other modes, 67 sets address size to the non-default. 16-bit address-size also implies 16-bit interpretation of the ModRM byte, so only [bx|bp + si|di] are allowed, no SIB byte to allow the flexibility of 32 / 64-bit addressing.


Modes and sets of defaults

No, the defaults can't be changed in 64-bit mode. Different bits in the GDT entry selected by CS (or any other method) won't matter. AFAIK, the table in https://en.wikipedia.org/wiki/X86-64#Operating_modes is a complete list of the possible combinations of modes and default operand/address sizes.

There's only one set of settings that allows 64-bit operand-size at all. It's not possible even in any legacy mode to have a default combo like 16-bit operand-size with 32-bit address-size.

This makes some sense from a hardware-complexity perspective. The more different combos of things it needs to support, the more transistors might be involved in an already complex and power-intensive part of the CPU.

(Although the default stack address size used implicitly by push/pop is selected independently by the SS selector, IIRC. So I think you can have normal 32-bit mode where add eax, [edx] is 2 bytes, except with push/pop/call/ret using ss:sp instead of ss:esp. Not something I've ever tried setting up.)


Note that 16-bit AX corresponds to 16-bit R8W, while RAX and R8 are the pair distinguished by a REX prefix.
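The pairing can be sketched in Python: a register number (0 to 15) plus an operand size gives the name, and a REX bit supplies the fourth bit of the number. The helper names are made up for illustration, and only opcode 89 (mov r/m, r) with a register-direct ModRM byte is modelled.

```python
LEGACY = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def reg_name(number, bits):
    """Name of general-purpose register `number` (0..15) at 16/32/64 bits."""
    if not 0 <= number <= 15 or bits not in (16, 32, 64):
        raise ValueError("unsupported register number or size")
    if number < 8:
        return {16: "", 32: "e", 64: "r"}[bits] + LEGACY[number]
    return f"r{number}" + {16: "w", 32: "d", 64: ""}[bits]


def mov_89_operands(rex, modrm):
    """(dst, src) for opcode 89 with mod=11; rex is None when absent."""
    rex = rex or 0
    bits = 64 if rex & 0x08 else 32              # REX.W
    src = ((rex >> 2) & 1) << 3 | (modrm >> 3) & 7  # REX.R extends ModRM.reg
    dst = (rex & 1) << 3 | modrm & 7                # REX.B extends ModRM.rm
    return reg_name(dst, bits), reg_name(src, bits)


for rex in (None, 0x41, 0x44, 0x45, 0x48, 0x49, 0x4C, 0x4D):
    print(rex, mov_89_operands(rex, 0xC8))
```

The loop walks the eight prefix choices for the bytes 89 C8, which correspond to the eight mov encodings listed in the question.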


In assembly source, there's no default, it must be implied by a register or specified explicitly.

Except for some assemblers having a default for push/pop, or a few bad assemblers that have a default for other cases, including the GNU assembler for things like add $1, (%rdi) defaulting to dword, with a warning only in recent versions. GAS does error on ambiguous mov, strangely. clang's built-in assembler is better, erroring on any ambiguous operand-size.

Kassiekassity answered 7/7, 2021 at 19:18 Comment(7)
Note that 66 and 48 can appear at once with SSE instructions where 66 selects a different data organisation. IIRC there are also some very recent special instructions with such an encoding, will have to look that up.Babul
@fuz: oh yes, true, when 66 is just a mandatory prefix that's essentially part of the opcode. At that point it's not really acting as an operand-size prefix, even though it's still a prefix and can appear in different orders if you want (although Intel recommends a certain order).Kassiekassity
It's not really part of the opcode though; it just has a different function (selecting data organisation).Babul
@fuz: You mean ps vs. pd? Yeah true, but it's also used for integer SSE2 / SSSE3 xmm vs. MMX-register versions like paddb xmm vs. paddb mm. And even within ps/pd, SSE2 cvtps2pd xmm is (no prefix) NP 0F 5A while SSE2 cvtpd2ps xmm is 66 0F 5A. For other mandatory prefixes like rep, the instructions overloaded with it include things as diverse as F2 0F 38 F1 crc32 r, r/m32 vs. 0F 38 F0 movbe.Kassiekassity
@fuz: So anyway, 66 can differentiate instructions that run on different ports, if that matters.Kassiekassity
Many thanks for such a detailed answer. Another question: when working with 32-bit values, up to now I've preferred, as an example, mov eax,r8d rather than mov rax,r8, because I wrongly believed that the latter takes a prefix but not the former. But it seems to me now that there should be no difference, performance-wise or otherwise?Nipa
@user1752563: Right, try to keep 32-bit values (and pointers used in addressing modes) in "legacy" registers so you can avoid REX prefixes, e.g. mov eax, esi is 2 bytes, vs. mov eax, r8d and mov rax, r8 both being 3 bytes with equal performance. There are still some cases where 32-bit operand-size is faster for reasons other than code-size, e.g. div r8d is much faster than div r8 on Intel before Ice Lake. And popcnt eax, r8d or imul eax, r8d are faster than their 64-bit forms on some AMD. Also xor-zeroing on Silvermont. See: The advantages of using 32bit registers/instructions in x86-64Kassiekassity
