Confusing brackets in MASM32

I am trying to get to grips with MASM32 and am confused by the following:

I thought that brackets were used for indirection, so if I have a pre-defined variable

 .data
   item dd 42

then

 mov ebx, item

would put the contents of 'item', i.e. the number 42, into ebx and

 mov ebx, [item]

would put the address of 'item', i.e. where the 42 is stored, into ebx.

But the following code in a console app:

 mov ebx, item
 invoke dwtoa, ebx, ADDR valuestr 
 invoke StdOut, ADDR valuestr
 mov ebx, [item]
 invoke dwtoa, ebx, ADDR valuestr 
 invoke StdOut, ADDR valuestr

prints 42 twice. To get the address of 'item' I seem to need

 mov ebx, [OFFSET item]
 invoke dwtoa, ebx, ADDR valuestr 
 invoke StdOut, ADDR valuestr

Can anybody explain what square brackets are for in MASM, or point me at a good reference?

Yardarm answered 5/8, 2014 at 0:54 Comment(1)
A duplicate of this question also points out that var2 dword var1 assembles to the address of var1. This is the only sane behaviour, because var1 could be extern, making its contents unavailable at assemble time. Fortunately, offset var1 is allowed in that context, so you can always use unambiguous notation. - Shores

MASM is unusual for an assembler in that it has types. Because of how you defined the symbol item, MASM knows that it is a memory location of type DWORD. When you use it as an operand, MASM knows that you (probably) mean the value stored at the address, not the address itself. So it doesn't matter whether you write item or [item]; MASM assumes you mean the latter. If you really want the address of item, you need to use OFFSET item.

On the other hand, if you had defined item as a constant using item = 42, then mov ebx, item would load the immediate value. Because of this ambiguity (you need to know how item was defined to determine whether it's an immediate operand or a memory operand), it's a good idea to avoid using a bare symbol as an operand.

I should add that the square brackets [] mean pretty much nothing to MASM when you're just using symbols or numbers. They only mean something when you use them with registers. Here are some examples:

item    DD  42
const   =   43

    mov eax, item             ; memory operand (the value stored at item)
    mov eax, [item]           ; memory operand
    mov eax, OFFSET item      ; immediate operand (the address of item)
    mov eax, [OFFSET item]    ; immediate operand

    mov eax, const            ; immediate operand (43)
    mov eax, [const]          ; immediate operand
    mov eax, ds:[const]       ; memory operand (the value at address 43)
    mov eax, fs:30h           ; memory operand (the value at address 30h + fs base)
    mov eax, OFFSET const     ; immediate operand
    mov eax, [OFFSET const]   ; immediate operand

    mov eax, 42               ; immediate operand
    mov eax, ebx              ; register operand (the value of EBX)
    mov eax, [ebx]            ; indirect operand (the value pointed to by EBX)

So without registers, square brackets only show your intent. You should put square brackets around symbols if you intend to use them as memory operands, and use OFFSET with symbols you intend to use as immediate values.
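Applied to the code in the question, that convention might look like the sketch below. It assumes a default MASM32 SDK install (the \masm32 include and lib paths), and the crlf string and the size of the valuestr buffer are my own additions, since the question doesn't show them. The first line printed should be the value of item; the second is whatever address the linker and loader gave it.

```masm
; Minimal MASM32 console sketch: value of item vs. address of item.
; Assumes the MASM32 SDK is installed in \masm32 (default location).
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib

.data
    item     dd 42
    valuestr db 16 dup(0)       ; buffer for dwtoa output (size is an assumption)
    crlf     db 13, 10, 0       ; hypothetical helper string for line breaks

.code
start:
    mov ebx, [item]             ; memory operand: loads the value stored at item
    invoke dwtoa, ebx, ADDR valuestr
    invoke StdOut, ADDR valuestr
    invoke StdOut, ADDR crlf

    mov ebx, OFFSET item        ; immediate operand: loads the address of item
    invoke dwtoa, ebx, ADDR valuestr
    invoke StdOut, ADDR valuestr
    invoke StdOut, ADDR crlf

    invoke ExitProcess, 0
end start
```

Writing [item] and OFFSET item explicitly means the reader doesn't have to look up how item was defined to know which kind of operand each mov uses.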

Grote answered 5/8, 2014 at 2:8 Comment(7)
What does MASM do if you use mov eax, [esi + OFFSET item]? Does the presence of a register in the effective address turn it into a load instead of a mov r32, imm32? - Shores
@PeterCordes The combination of the register and the brackets makes it a memory operand. Because there's a register added to a value, the brackets are required. The instruction mov eax, esi + OFFSET item is illegal, as it can't be encoded as an immediate operand or register operand, as I'm sure you know. If you want the operation implied by this instruction, adding the offset of item to ESI and storing it in EAX, you can use lea eax, [esi + OFFSET item]. I'd also add that mov eax, [esi + item] means almost the same thing as your example, except that if item is a label it must be of type DWORD. - Grote
I think I meant to ask whether it was a load or a syntax error, since I was really surprised that [OFFSET item] is an immediate. Of course [esi + OFFSET item] can't be an imm32 :P Anyway, ok, so it's a load and the brackets are required, thanks. - Shores
@PeterCordes I reverted your edit because it's a bit more complicated than what you described. The problem in the question that you recently closed as a duplicate is that the jump table label was defined with table: making it a NEAR symbol (or whatever the x64 version of MASM calls it). If it was defined as table QWORD ... then it would be a QWORD symbol and an indirect jump would be used. Either way the brackets don't make a difference, but the type of the symbol does matter. Using QWORD PTR fixes the problem by changing the type of the address, but normally this wouldn't be necessary. - Grote
Ah thanks, my bad. Want me to reopen so you or Michael can post an answer? Or you can do that yourself, I see you got an asm gold badge at some point :) - Shores
fs:30h is also a memory operand, right? You only show the fs:[30h] form. The question "Difference between two instructions: mov eax, dword ptr fs:[30h] and mov eax, large fs:30h in dereferencing PEB?" has a mov eax, large fs:30h (maybe IDA syntax), and the OP is confused by the lack of []. (So it could be a duplicate of this) - Shores
@PeterCordes Yah, fs:30h and fs:[30h] mean the same thing. The key thing is that there's both a segment and an offset. - Grote
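
The register-plus-symbol forms discussed in the comments above can be sketched as follows. This is only an illustration under a few assumptions: item is redefined here as a three-element DWORD array (the question has a single DWORD) so that the indexed load has something to read, and the include paths assume a default MASM32 SDK install.

```masm
; Sketch: brackets with a register always form a memory operand.
; Assumes the MASM32 SDK is installed in \masm32 (default location).
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
    item dd 10, 20, 30              ; hypothetical array; item names the first DWORD

.code
start:
    mov esi, 4                      ; byte offset of the second element
    mov eax, [esi + OFFSET item]    ; memory operand: loads the DWORD at item + 4
    mov eax, [esi + item]           ; same load; item's DWORD type matches EAX
    lea eax, [esi + OFFSET item]    ; no memory access: EAX = address of item + 4

    invoke ExitProcess, 0
end start
```

The two mov instructions read memory, while lea only computes the address, which is the operation the illegal mov eax, esi + OFFSET item would have implied.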
