In the x86-64 Tour of Intel Manuals, I read
Perhaps the most surprising fact is that an instruction such as
MOV EAX, EBX
automatically zeroes upper 32 bits ofRAX
register.
The Intel documentation (3.4.1.1 General-Purpose Registers in 64-Bit Mode in the manual Basic Architecture) quoted at the same source tells us:
- 64-bit operands generate a 64-bit result in the destination general-purpose register.
- 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.
- 8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not be modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits.
In x86-32 and x86-64 assembly, 16 bit instructions such as
mov ax, bx
don't show this kind of "strange" behaviour that the upper word of eax is zeroed.
Thus: what is the reason why this behaviour was introduced? At a first glance it seems illogical (but the reason might be that I am used to the quirks of x86-32 assembly).
r32
destination operand zero the high 32, rather than merging. For example, some assemblers will replacepmovmskb r64, xmm
withpmovmskb r32, xmm
, saving a REX, because the 64bit destination version behaves identically. Even though the Operation section of the manual lists all 6 combinations of 32/64bit dest and 64/128/256b source separately, the implicit zero-extension of the r32 form duplicates the explicit zero-extension of the r64 form. I'm curious about the HW implementation... – Gillettxor eax,eax
orxor r8d,r8d
is the best way to zero RAX or R8 (saving a REX prefix for RAX, and 64-bit XOR isn't even handled specially on Silvermont). Related: How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent – Gillett