Why is the default operand size 32 bits in 64-bit mode?
I am reading the Intel manual, vol. 1, chapter 3.6.1, Operand Size and Address Size in 64-Bit Mode. There are three prefixes: REX.W, the operand-size prefix 66, and the address-size prefix 67. It says operands default to 32 bits in size, and that the only way to make them 64 bits is the REX.W instruction prefix (placed after any other prefixes).
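
For concreteness, here is my understanding of the three prefixes (NASM syntax; the byte encodings are my reading of the tables, and the registers are just examples):

    add ax, cx        ; 66 01 c8  (66 prefix: 16-bit operand size)
    add eax, ecx      ; 01 c8     (no prefix: default 32-bit operand size)
    add rax, rcx      ; 48 01 c8  (REX.W prefix: 64-bit operand size)
    mov eax, [ebx]    ; 67 8b 03  (67 prefix: 32-bit address size)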

I do not know why this is so. Why can't I use the full 64 bits, for example for an int operand? Does it have something to do with the sign? Or why is there this restriction? (So does a C unsigned int operation use a REX.W prefix on every instruction? The manual also says a prefix applies only to a particular instruction, not to the whole segment, whose default sizes, both address and operand, come from the segment descriptor.)

Do I understand it correctly?

Isogloss answered 21/1, 2020 at 21:25 Comment(14)
The default size is 32 to avoid breaking existing 32-bit programs, which are forward-compatible and can run on 64-bit processors without any modification.Caswell
So it does not have any meaning, for example, for the sign? And what are the other consequences of this particular design for other CPU structures and addressing, apart from compatibility?Isogloss
It means nothing else. They could of course have changed everything and made the default size 64, but that would have broken compatibility. That's the whole point.Caswell
signedness is an interpretation of data, not a consequence of how it is stored.Stromboli
And you mentioned 32-bit programs can still be executed, but they use 32-bit addressing as well, so wouldn't that mean defaulting to 32-bit ADDRESSING as well (for compatibility)? Or why can 32-bit programs use 64-bit addresses (which are the default in 64-bit mode)?Isogloss
32-bit programs will not use the additional address space available from executing on a 64-bit architecture.Stromboli
so they are zeroed at the beginning address?Isogloss
Not sure what you mean by zeroed at the beginning address, but they are certainly relative to "address zero" in your virtual address space.Stromboli
I mean the additional space going from 32-bit to 64-bit addresses: is the upper half zeroed? All ones? In 32-bit programs.Isogloss
@Isogloss it's somewhat arbitrary and irrelevant what the unavailable addresses are because they are not accessible from operating in 32-bit mode. You will most certainly receive a segmentation fault if you attempt to access them. Also do keep in mind everything is technically in a virtual address space, so no physical data was ever initialized for the additional address space in the first place.Stromboli
No matter whether it's virtual or physical space, the instructions have to somehow handle the size of addresses. So in 32-bit mode, does an instruction access the right-hand part of a 64-bit address (or of a register holding one), or the left-hand part? It may be irrelevant, but since you said a segmentation fault is possible, I am interested in how an instruction in 32-bit mode handles a 64-bit address (or register) when the address is wider. So how does it handle it?Isogloss
@MarcoBonelli: 64-bit mode is a separate mode; 32-bit programs can't decode correctly in that mode. e.g. 0x40 is inc eax in compat mode but a REX prefix in 64-bit mode. See x86-32 / x86-64 polyglot machine-code fragment that detects 64bit mode at run-time? for an example. Also, the default operand-size for push/pop/call is 8 bytes in 64-bit mode. 64-bit kernels run 32-bit binaries in compat mode. That 64-bit mode decoding is mostly similar is a matter of sharing transistors in the decoders, not binary compatibility.Kindig
Whether int is a 32-bit or 64-bit integer is up to the compiler. It's perfectly legal for a compiler to choose that on a 64-bit system, int is a 64-bit integer. Indeed some compilers have this as an option (ILP64). It's not clear whether the question is "Why don't compilers generally use a 64-bit integer for int?" or "Why did Intel design the CPU so 32-bit integers are more convenient?" (These sort of "Why" questions sort of require getting into the mind of the designers, which is speculative in the absence of documents.)Cumbrance
@RaymondChen: Intel had nothing to do with this; they were sailing on the good ship Itanic while AMD was designing AMD64 in ~2000 :P AMD's design decisions seem to have been focused on sharing decoder transistors as much as possible, perhaps in case AMD64 didn't catch on and they were stuck supporting it without people using it. They could have done lots of subtle things that removed annoying CISC quirks of x86 like flags unchanged after zero-count shifts, or for example made setcc a 32-bit operand-size instruction in 64-bit mode. Maybe they thought that could hurt asm source porting?Kindig

TL:DR: you have two separate questions: one about C type sizes, and another about how x86-64 machine code encodes 32 vs. 64-bit operand-size. The encoding choice is fairly arbitrary and could have been made differently. But int is 32-bit because that's what compiler devs chose; it has nothing to do with machine code.


int is 32-bit because that's still a useful size. It uses half the memory bandwidth / cache footprint of int64_t. Most C implementations for 64-bit ISAs use a 32-bit int, including both mainstream ABIs for x86-64 (x86-64 System V and Windows). On Windows, even long is a 32-bit type, presumably for source compatibility with code written for 32-bit that made assumptions about type sizes.
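
As a quick sketch of the footprint difference (NASM syntax; the base/index registers here are arbitrary illustrative choices):

    mov eax, [rdi + rcx*4]   ; load one 4-byte int element (zero-extends into RAX)
    mov rax, [rdi + rcx*8]   ; load one 8-byte int64_t element: twice the bytes per element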

Also, AMD's integer multiplier at the time was somewhat faster for 32-bit than 64-bit, and this was the case until Ryzen. (First-gen AMD64 silicon was AMD's K8 microarchitecture; see https://agner.org/optimize/ for instruction tables.)

The advantages of using 32-bit registers/instructions in x86-64

x86-64 was designed by AMD in ~2000, as AMD64. Intel was committed to Itanium and not involved; all the design decisions for x86-64 were made by AMD architects.

AMD64 is designed with implicit zero-extension when writing a 32-bit register, so 32-bit operand-size can be used efficiently, with none of the partial-register shenanigans you get with 8 and 16-bit operand-size.
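
For example (NASM syntax; the values are purely illustrative):

    mov rax, 0xFFFFFFFFFFFFFFFF  ; RAX = all-ones
    mov eax, 1                   ; 32-bit write zero-extends: RAX = 0x0000000000000001
    mov rax, 0xFFFFFFFFFFFFFFFF  ; RAX = all-ones again
    mov ax, 1                    ; 16-bit write merges: RAX = 0xFFFFFFFFFFFF0001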

TL:DR: There's good reason for CPUs to want to make 32-bit operand-size available somehow, and for C type systems to have an easily accessible 32-bit type. Using int for that is natural.

If you want 64-bit operand-size, use it. (And then describe it to a C compiler as long long or [u]int64_t, if you're writing C declarations for your asm globals or function prototypes). Nothing's stopping you (except for somewhat larger code size from needing REX prefixes where you might not have before).
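
The size cost is typically one REX byte per instruction. For example (a sketch; bytes per the standard encoding, and any r/m32 instruction behaves the same way):

    inc edx    ; ff c2     (2 bytes: default 32-bit operand-size)
    inc rdx    ; 48 ff c2  (3 bytes: one extra REX.W byte for 64-bit)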


All of that is a totally separate question from how x86-64 machine code encodes 32-bit operand-size.

AMD chose to make 32-bit the default and 64-bit operand-size require a REX prefix.

They could have gone the other way and made 64-bit operand-size the default, requiring REX.W=0 to set it to 32, or 0x66 operand-size to set it to 16. That might have led to smaller machine code for code that mostly manipulates things that have to be 64-bit anyway (usually pointers), if it didn't need r8..r15.

A REX prefix is also required to use r8..r15 at all (even as part of an addressing mode), so code that needs lots of registers often finds itself using a REX prefix on most instructions anyway, even when using the default operand-size.
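
A sketch of that cost (NASM syntax; bytes per the standard encoding):

    mov eax, ecx      ; 89 c8     (legacy registers, default size: no REX)
    mov r8d, ecx      ; 41 89 c8  (REX.B needed just to reach r8d, still 32-bit)
    mov r8, rcx       ; 49 89 c8  (REX.W + REX.B: no longer than the r8d version)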

A lot of code does use int for a lot of stuff, so 32-bit operand-size is not rare. And as noted above, it's sometimes faster. So it kind of makes sense to make the fastest instructions the most compact (if you avoid r8d..r15d).

It also maybe lets the decoder hardware be simpler if the same opcode decodes the same way with no prefixes in 32 and 64-bit mode. I think this was AMD's real motivation for this design choice. They certainly could have cleaned up a lot of x86 warts but chose not to, probably also to keep decoding more similar to 32-bit mode.

It might be interesting to see if you'd save overall code size for a version of x86-64 with a default operand-size of 64-bit. e.g. tweak a compiler and compile some existing codebases. You'd want to teach its optimizer to favour the legacy registers RAX..RDI for 64-bit operands instead of 32-bit, though, to try to minimize the number of instructions that need REX prefixes.

(Many instructions like add or imul reg,reg can safely be used at 64-bit operand-size even if you only care about the low 32, although the high garbage will affect the FLAGS result.)
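
A sketch of why that works: carries propagate only from low bits to high, so the low 32 bits of a 64-bit add or multiply match the 32-bit result. Right shifts are the classic counter-example:

    ; assume only the low 32 bits of RAX and RCX are meaningful:
    add  rax, rcx   ; low 32 bits match add eax, ecx; high bits are garbage
    imul rax, rcx   ; low 32 bits match imul eax, ecx
    shr  rax, 5     ; NOT safe: high garbage shifts down into the low 32 bits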


Re: misinformation in comments: compat with 32-bit machine code has nothing to do with this. 64-bit mode is not binary compatible with existing 32-bit machine code; that's why x86-64 introduced a new mode. 64-bit kernels run 32-bit binaries in compat mode, where decoding works exactly like 32-bit protected mode.

https://en.wikipedia.org/wiki/X86-64#OPMODES has a useful table of modes, including long mode (and 64-bit vs. 32 and 16-bit compat modes) vs. legacy mode (if you boot a kernel that's not x86-64 aware).

In 64-bit mode some opcodes are different, and the operand-size defaults to 64-bit for push/pop and other stack-instruction opcodes.

32-bit machine code would decode incorrectly in that mode. e.g. 0x40 is inc eax in compat mode but a REX prefix in 64-bit mode. See x86-32 / x86-64 polyglot machine-code fragment that detects 64bit mode at run-time? for an example.
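
For instance, the same byte pair decodes differently depending on the mode (a sketch; emit the raw bytes to see it):

    db 0x40, 0x90   ; 32-bit / compat mode: decodes as inc eax, then nop
                    ; 64-bit mode: 40 is a REX prefix, so this is one REX-prefixed nop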

Also

That 64-bit mode decoding is mostly similar to 32-bit is a matter of sharing transistors in the decoders, not binary compatibility. Presumably it's easier for the decoders to have only 2 mode-dependent default operand sizes (16 or 32-bit) for opcodes like 03 add r, r/m, not 3, with special-casing only for opcodes like push/pop that warrant it. (Also note that REX.W=0 does not let you encode push r32; the operand-size stays at 64-bit.)
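
A sketch with push (NASM syntax; bytes per the standard encoding):

    push rax          ; 50     (operand-size defaults to 64-bit, no REX needed)
    push r8           ; 41 50  (REX.B only selects r8; the size is still 64-bit)
    ; there is no encoding for "push eax" in 64-bit mode;
    ; a 66 prefix (66 50) gives the 16-bit push ax, not a 32-bit push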

AMD's design decisions seem to have been focused on sharing decoder transistors as much as possible, perhaps in case AMD64 didn't catch on and they were stuck supporting it without people using it.

They could have done lots of subtle things to remove annoying legacy quirks of x86, for example making setcc a 32-bit operand-size instruction in 64-bit mode to avoid needing xor-zeroing first. Or CISC annoyances like flags staying unchanged after zero-count shifts (although AMD CPUs handle that more efficiently than Intel, so maybe they intentionally left that in).
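
e.g. the usual workaround for setcc's 8-bit-only operand-size (a sketch; the register choices are arbitrary):

    xor  eax, eax   ; zero the full register first (writing EAX zero-extends to RAX)
    cmp  edi, esi
    setl al         ; AL = 1 if EDI < ESI, else 0; upper bits are already zero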

Or maybe they thought that subtle tweaks could hurt asm source porting, or in the short term make it harder to get compiler back-ends to support 64-bit code-gen.

Kindig answered 21/1, 2020 at 23:26 Comment(5)
I totally agree about the misinformation in the commentsShuntwound
Aside from that: why, in the code-segment descriptor for IA-32e, is the default for 64-bit mode 32-bit? That doesn't make sense to me. If you look at compatibility mode and legacy mode, they have a 16-bit option and a 32-bit option, which makes sense, but in 64-bit mode there is only 32-bit and no 64-bit default operand size.Shuntwound
@zerocool: That would increase the complexity of the decoders for little benefit. AMD made their choice for what the CPU can do efficiently, instead of ever having to support REX.W=0 overriding the operand size down to 32. Not supporting a larger matrix of default vs. possible operand sizes in various modes probably simplifies things.Kindig
Just go easy on me, as I have no idea what REX.W means; the only thing I know is that every instruction has an operand. Can you give me some articles to read about this REX.W==0 thing, or at least explain what it means? I'm sorry for such stupid questions, but learning on your own is hard.Shuntwound
@zerocool: If you're going to ask technical questions about x86-64 machine code encoding rules / defaults, you should probably understand the basics of the current design: wiki.osdev.org/X86-64_Instruction_Encoding is pretty good with diagrams.Kindig
