As Harold pointed out: there is no reason.
Maybe the authors found a forward move more appealing than a reverse one, or maybe they just picked the first opcode.
I took a look at the NASM source code and found that the encoding is done essentially with a big lookup table, so it really is a matter of taste.
Using the other opcode (8AC3) would have simplified the code (I guess) if the parsing had not used a lookup table: instructions like addps are asymmetric, and by using 8A /r for mov al, bl the code could be reused to compute the ModR/M byte for addps and similar instructions too. addps xmm0, xmm3 uses the same ModR/M byte (C3) as mov al, bl when the 8A /r form is used.
Note that the registers A (B) and xmm0 (xmm3) are encoded with the same numbers: 0 (3).
It is still fun, however, to figure out why there are two encodings.
As Mark Hopkins (re)discovered, the early x86 instruction encodings make a lot more sense in octal.
A byte in octal has three digits that I'll call G P F (Group, oPeration, Flags).
G is the octal group; instructions in the same group tend to perform similar tasks (e.g. arithmetic vs moves).
However, this is not a strict division.
P is the operation; for example, in the arithmetic group an operation is the subtraction and another is the addition.
F is a bit set used to control the behaviour of the operation. Each group and operation uses the digit F as it pleases; it may not even be a bit set (for example, G=2, P=7 is mov r16, imm16 and F is used to select r16).
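To make the octal view concrete, here is a minimal Python sketch that splits an opcode byte into its three octal digits. The function name split_octal is my own, chosen only for illustration; it is not taken from NASM or any other tool.

```python
def split_octal(byte):
    """Split a byte into its three octal digits (G, P, F)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    g = (byte >> 6) & 0o3  # top two bits: group
    p = (byte >> 3) & 0o7  # middle three bits: operation
    f = byte & 0o7         # low three bits: flags (or whatever the group decides)
    return g, p, f


# 0x88 and 0x8A are the byte moves discussed here; 0xB8 is mov r16, imm16.
for opcode in (0x88, 0x8A, 0xB8):
    g, p, f = split_octal(opcode)
    print(f"{opcode:02X} -> octal {opcode:03o} -> G={g} P={p} F={f}")
```

Note that G only has two bits, so it ranges over 0..3, while P and F range over 0..7.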
For the mov instructions that move from memory/register into a register, or the other way around, G is 2 and P is 1.
F is a 3-bit field with the following semantics:
  2   1   0    bit
+---+---+---+
| s | d | b |
+---+---+---+

s = 1 if moving to/from a segment register
    0 if moving to/from a gp register
d = 1 if moving mem -> reg
    0 if moving mem <- reg
b = 1 if moving a WORD
    0 if moving a BYTE
We can now begin to form opcodes, but we still lack a way to select the operands.
G=2, P=1, F={s=0, d=0, b=0} 210 (88) mov r/m8, r8
G=2, P=1, F={s=0, d=0, b=1} 211 (89) mov r/m16, r16
G=2, P=1, F={s=0, d=1, b=0} 212 (8A) mov r8, r/m8
G=2, P=1, F={s=0, d=1, b=1} 213 (8B) mov r16, r/m16
G=2, P=1, F={s=1, d=0, b=0} 214 (8C) mov r/m16, Sreg
G=2, P=1, F={s=1, d=0, b=1} 215 (8D) Not a move, segment registers are 16-bit
G=2, P=1, F={s=1, d=1, b=0} 216 (8E) mov Sreg, r/m16
G=2, P=1, F={s=1, d=1, b=1} 217 (8F) Not a move, segment registers are 16-bit
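The table above can be regenerated mechanically. The following Python sketch packs the s, d and b bits into the F digit and combines it with G=2 and P=1; the helper name mov_opcode is hypothetical and exists only for this example.

```python
def mov_opcode(s, d, b):
    """Build the opcode byte for G=2, P=1 from the three F bits."""
    for bit in (s, d, b):
        if bit not in (0, 1):
            raise ValueError("s, d and b must each be 0 or 1")
    f = (s << 2) | (d << 1) | b      # F digit: s is bit 2, d is bit 1, b is bit 0
    return (2 << 6) | (1 << 3) | f   # octal digits G, P, F packed into one byte


# Enumerate all eight combinations in the same order as the table.
for s in (0, 1):
    for d in (0, 1):
        for b in (0, 1):
            op = mov_opcode(s, d, b)
            print(f"s={s} d={d} b={b} -> {op:03o} ({op:02X})")
```

The sketch only computes the byte values; as the table notes, not every combination is actually a move.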
The opcode must be followed by the ModR/M byte, which is used to select the addressing mode and the register.
The ModR/M byte can be regarded, in octal, as three fields: X R M.
X and M are combined together to form the addressing mode.
R selects the register (e.g. 0 = A, 3 = B).
One of the addressing modes (X=3, M=any) lets us address the registers (through M) instead of memory.
For example, X=3, R=0, M=3 (C3) sets the register B as the "memory" operand and the register A as the register operand.
Conversely, X=3, R=3, M=0 (D8) sets the register A as the "memory" operand and the register B as the register operand.
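The two ModR/M bytes just mentioned can be computed with a few shifts. This is a small Python sketch; the function name modrm and the REG table are my own illustrative choices, and only the register-direct mode (X=3) is exercised.

```python
# Register numbers for the first four general-purpose registers.
REG = {"A": 0, "C": 1, "D": 2, "B": 3}


def modrm(x, r, m):
    """Pack the octal fields X, R, M into a ModR/M byte."""
    if not (0 <= x <= 3 and 0 <= r <= 7 and 0 <= m <= 7):
        raise ValueError("X must be 0..3, R and M must be 0..7")
    return (x << 6) | (r << 3) | m


# B as the "memory" operand, A as the register operand.
first = modrm(3, REG["A"], REG["B"])
# A as the "memory" operand, B as the register operand.
second = modrm(3, REG["B"], REG["A"])
print(f"{first:02X} {second:02X}")
```

Swapping the R and M arguments is all it takes to go from one byte to the other.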
Here we can see where the ambiguity lies: the ModR/M byte lets us encode two registers, one as the register operand and one as the "memory" operand, while the opcode lets us choose the direction of the move between them. This gives us the freedom to choose which register plays which role.
For example, suppose we want to move B into A.
If we settle on A as the register operand (destination) and B as the memory operand (source), then the ModR/M byte is X=3, R=0, M=3 (C3).
To move from B to A, as in your example, using the lower 8 bits only, we encode the move as
G=2, P=1, F={s=0,d=1,b=0} (8A) because we move mem->reg (B->A).
Thus the final instruction is 8AC3.
If we instead choose A as the memory operand (destination) and B as the register operand (source), the ModR/M byte is X=3, R=3, M=0 (D8).
The move is G=2, P=1, F={s=0,d=0,b=0} (88) because we move reg->mem (B->A).
The final instruction is 88D8.
If we want to move the whole 16-bit register (we ignore operand size prefixes here) we just set the b bit of F:
G=2, P=1, F={s=0,d=1,b=1} for the first case, leading to 8BC3.
G=2, P=1, F={s=0,d=0,b=1} for the second case, leading to 89D8.
You can check this with ndisasm:
00000000 8AC3 mov al,bl
00000002 88D8 mov al,bl
00000004 8BC3 mov ax,bx
00000006 89D8 mov ax,bx
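The four byte sequences above can also be derived programmatically. The sketch below is a hedged illustration in Python, not how NASM does it (NASM uses a lookup table, as noted earlier); encode_mov and its parameters are hypothetical names introduced for this example, and it handles register-to-register moves only.

```python
A, B = 0, 3  # register numbers used in the text


def encode_mov(dst, src, wide, use_load_form):
    """Encode a register-to-register mov as two bytes.

    use_load_form=True  -> opcode with d=1 (8A/8B): dst goes in R, src in M
    use_load_form=False -> opcode with d=0 (88/89): src goes in R, dst in M
    """
    d = 1 if use_load_form else 0
    b = 1 if wide else 0
    opcode = 0o210 | (d << 1) | b            # G=2, P=1, s=0
    reg, rm = (dst, src) if use_load_form else (src, dst)
    modrm = (3 << 6) | (reg << 3) | rm       # X=3: register-direct mode
    return bytes([opcode, modrm])


# Same order as the ndisasm listing: byte moves first, then word moves.
for wide in (False, True):
    for load in (True, False):
        print(encode_mov(A, B, wide, load).hex().upper())
```

Both forms move B into A; the only difference is which register the ModR/M byte treats as the "memory" operand.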