The difference arises because your assembler strangely and dangerously accepts 0FF20h as implying word operand-size. But even for your assembler, leading zeros don't imply operand-size; only the actual value matters. Presumably it checks the position of the most significant set bit.
This is not the case for an assembler with a well-designed and consistent syntax, like NASM. If I try to assemble this in 16-bit mode with nasm -fbin foo.asm:
mov [es: si], 2
mov [es: si], 0ff20H
I get these errors:
foo.asm:1: error: operation size not specified
foo.asm:2: error: operation size not specified
Only a register can imply an operand-size for the whole instruction, not the width of a constant. (mov [si], ax is not ambiguous: there is no form of mov where the destination has a different width than the source, and ax is definitely word sized.)
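For reference, here is a version of foo.asm that NASM should accept. It's a minimal sketch assuming 16-bit mode (the bits 16 line is my addition, not part of the question's code): the size comes either from an explicit byte / word keyword or from a register operand.

```nasm
bits 16                      ; assumption: 16-bit code, as in the question

mov byte [es:si], 2          ; explicit size: store one byte
mov word [es:si], 0ff20h     ; explicit size: store one word
mov [es:si], al              ; size implied by the 8-bit register al
mov [es:si], ax              ; size implied by the 16-bit register ax
```

Either way, the size is visible in the source line itself instead of depending on the value of the constant.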
The same applies to GAS (the GNU assembler), in both AT&T and Intel syntax modes. (Its Intel-syntax mode is very similar to MASM.)
There's no mov r/m16, sign_extended_imm8 encoding, but there is one for add and most ALU operations, so there's no reason for an assembler to assume that xyz [mem], 0 means byte operand size. More likely the programmer forgot to specify a size, so a sensible assembler treats it as an error instead of silently accepting something ambiguous.
mov word [mem], 0 is a totally normal way to zero a word in memory.
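To make the encoding point concrete, here is a small NASM sketch (16-bit mode assumed). The byte values in the comments are what I'd expect from the opcode definitions with NASM's default optimization; check a listing from nasm -l if the exact bytes matter to you.

```nasm
bits 16

add word [si], 1    ; 83 04 01     opcode 83 /0: r/m16 with sign-extended imm8
mov word [si], 1    ; C7 04 01 00  opcode C7 /0: always a full imm16
mov byte [si], 1    ; C6 04 01     opcode C6 /0: r/m8 with imm8
```

So a small constant can shrink the encoding of add, but for mov the immediate always matches the operand size, and in neither case does the constant tell you how much memory gets written.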
Besides all that, x86 supports 32-bit operand size in 16-bit code, using a 66h operand-size prefix. This is independent of the address-size. mov dword ptr es:[si], 0FF20h is also encodable, and completely ambiguous with mov word ptr es:[si], 0FF20h if you leave out the size ptr specifier.
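As a sketch of that ambiguity in NASM syntax (16-bit mode assumed; I'm only showing the bytes that follow the prefixes, since prefix order is up to the assembler):

```nasm
bits 16

mov word  [es:si], 0ff20h   ; C7 04 20 FF        after the ES (26h) prefix
mov dword [es:si], 0ff20h   ; C7 04 20 FF 00 00  after the ES (26h) and 66h prefixes
```

Same opcode, same constant, but the second form also zeroes the two bytes at es:[si+2]. Nothing in 0FF20h itself says which one you meant.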
As Jester commented, if leading zeros counted as part of the width of the constant, 0FF20h could easily be taken as implying dword:
"Note that you had to write 0FF20H with a leading zero too, so if the assembler really relied on the length of the literal, it could have thought that was a dword ... similarly for 0FFH. It would be a dangerous game. Note sensible assemblers don't even allow your second form without explicit size. That's just a bug waiting to happen."
(Sensible assemblers include NASM and GAS, as I showed above.)
If I were you, I'd be unhappy that my assembler accepted mov es:[si], 0FF20h without complaint. I thought emu8086 was even worse than MASM, and usually accepted stuff like mov [si], 2 with some default operand size instead of warning even then.
I'm not a big fan of how MASM magically infers operand-size from symbol db 1, 2, 3 either, but that's not ambiguous, it just means you have to look at how a symbol was declared to know what operand-size it will imply.
[...] 0020H could also be just a byte but 0FF20H doesn't fit in a single byte I guess. Not sure though, just a hunch. – Tobacco
Note that you had to write 0FF20H with a leading zero too, so if the assembler really relied on the length of the literal, it could have thought that was a dword ... similarly for 0FFH. It would be a dangerous game. Note sensible assemblers don't even allow your second form without explicit size. That's just a bug waiting to happen. – Williawilliam
[...] word ptr also on line 44, to show your intent to update 16 bits of memory. The fact that it does compile as expected by accident is irrelevant. Especially in something as fragile as assembly, you should be completely explicit and accurate, for the purpose of review and debugging (for example, when you ask something on SO and post your source, you can bet the majority of readers will be unable to tell what the "default" behaviour of your assembler is, so being explicit in every ambiguous case helps a lot with reviews). – Amulet
[...] mov [si], word 0x20 - but I don't use it), and the reasoning was very solid: by stating mov word ptr [si],20h you are saying that you want to modify 16 bits of memory, but you don't mind the constant being encoded as 8 bits, if such an opcode (mov word ptr [r],sign-extended-imm8) does exist, so you give the assembler more accurate information about what you really want, and leave it relaxed constraints on constant optimization. – Amulet
[...] mov doesn't have encodings with narrow immediates, except for mov r64, sign_extended_imm32. ALU instructions like add word [mem], imm8 exist, though. It would be a nice code-size saving for x86-64 to use one of the opcode bytes it freed up, like SALC or POP ES, as the opcode for a mov r/m64/32/16, sign-extended-imm8, giving you mov eax,1 in 3 bytes. And the very common mov qword [mem], 0 in 4 bytes + extra for the addressing mode. Saving 3 bytes vs. imm32 for memory dst.) – Gobo