What is the difference between MOV and LEA?
U

12

188

I would like to know what the difference between these instructions is:

MOV AX, [TABLE-ADDR]

and

LEA AX, [TABLE-ADDR]
Unbated answered 9/11, 2009 at 8:32 Comment(6)
duplicate: #1658794Depart
Thanks, Nick. First of all, I wouldn't have found an answer to this question by looking into that link. Here I was looking for specific info; the discussion in the link you provided is more general in nature.Unbated
I upvoted @Nick's dup ages ago but vtc'd just now. On reflection, I was too hasty and now agree with naveen that a) the other question does not answer "what's the difference" and b) this is a useful question. Apologies to naveen for my mistake - if only I could undo vtc...Shoreless
LEA vs add: https://mcmap.net/q/22923/-lea-or-add-instruction Cataclysmic
Related: Using LEA on values that aren't addresses / pointers? talks about other uses of LEA, for arbitrary math.Pigeonhearted
Does this answer your question? What's the purpose of the LEA instruction?Wop
S
231
  • LEA means Load Effective Address
  • MOV means Move, which here loads a value

In short, LEA loads a pointer to the item you're addressing whereas MOV loads the actual value at that address.

The purpose of LEA is to allow one to perform a non-trivial address calculation and store the result [for later usage]

LEA ax, [BP+SI+5] ; Compute address of value

MOV ax, [BP+SI+5] ; Load value at that address

Where there are just constants involved, MOV (through the assembler's constant calculations) can sometimes appear to overlap with the simplest uses of LEA. LEA is useful if you have a multi-part calculation with multiple base addresses etc.
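
As a rough illustration (not part of the original answer), the two instructions above can be modelled in C. The flat memory array and the names effective_address, lea_model and mov_model are made up for this sketch, and segment registers are ignored:

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a flat 64 KiB memory. Segment registers are ignored. */
uint8_t memory[65536];

/* The address calculation both instructions share: BP + SI + 5,
   wrapping at 16 bits as 16-bit addressing does. */
uint16_t effective_address(uint16_t bp, uint16_t si) {
    return (uint16_t)(bp + si + 5);
}

/* LEA ax, [BP+SI+5]: the result is the address itself; memory is not read. */
uint16_t lea_model(uint16_t bp, uint16_t si) {
    return effective_address(bp, si);
}

/* MOV ax, [BP+SI+5]: the result is the little-endian word stored there. */
uint16_t mov_model(uint16_t bp, uint16_t si) {
    uint16_t addr = effective_address(bp, si);
    assert(addr < 0xFFFF); /* keep the two-byte read inside the toy memory */
    return (uint16_t)(memory[addr] | (memory[addr + 1] << 8));
}
```

With memory[0x0105] = 0x34 and memory[0x0106] = 0x12, lea_model(0x0100, 0) gives 0x0105, while mov_model(0x0100, 0) gives 0x1234.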

Shoreless answered 9/11, 2009 at 8:40 Comment(5)
It confuses me that lea has "load" in the name and people say it "loads" a computed address into a register, because all of the inputs to compute the memory location are either immediate values or registers. AFAICT lea only performs a computation, it doesn't load anything, where loading means touching memory?Searching
@josephGarvin IIRC the term fetch would be applied to that aspect; load is just how you replace the value in a register with something from scratch, e.g. LAHF is: Load FLAGS into AH register. In the CLR's CIL (which is a higher-level, stack-based abstract machine), the term load refers to putting a value onto the notional stack and is normally l..., and the s... equivalent does the inverse. These notes (cs.umd.edu/class/sum2003/cmsc311/Notes/Mips/load.html) suggest that there are indeed architectures where your distinction does apply.Shoreless
it all reminds me of slideshare.net/pirhilton/… ;)Shoreless
@JosephGarvin: In computer-architecture terminology, the word "load" can be used for anything that writes a register, if you're talking about how it works internally. A notable example in RISC ISAs is load-immediate (which just puts a constant from the machine code into a register), like MIPS lui $t0, 1 (Load Upper Immediate), which sets $t0 to 1<<16, although in that case the value was in memory as part of the machine code. (Or in modern ISAs like AArch64, encoded somehow, not literally there.) And yes, it drives me nuts when code has comments like "store 1 in EDI".Pigeonhearted
As Ruben noted, x86 uses the LAHF mnemonic as well, which writes a GP register. (But I always have to look up whether it's load AH into flags or from FLAGS, since FLAGS is also a register. And there's an SAHF which goes the other direction, reading the general-purpose register AH.)Pigeonhearted
O
59

In NASM syntax:

mov eax, var       == lea eax, [var]   ; i.e. mov r32, imm32
lea eax, [var+16]  == mov eax, var+16
lea eax, [eax*4]   == shl eax, 2        ; but without setting flags

In MASM syntax, use OFFSET var to get a mov-immediate instead of a load.

Outmarch answered 27/6, 2013 at 14:37 Comment(3)
in NASM syntax only. In MASM syntax, mov eax, var is a load, the same as mov eax, [var], and you have to use mov eax, OFFSET var to use a label as an immediate constant.Pigeonhearted
Clear, simple, and demonstrates what I was trying to confirm. Thanks.Carvelbuilt
Note that in all of these examples, lea is the worse choice except in 64-bit mode for RIP-relative addressing. mov r32, imm32 runs on more ports. lea eax, [edx*4] is a copy-and-shift which can't be done in one instruction otherwise, but in the same register LEA just takes more bytes to encode because [eax*4] requires a disp32=0. (It runs on different ports than shifts, though.) See agner.org/optimize and stackoverflow.com/tags/x86/info.Pigeonhearted
H
37

None of the previous answers quite got to the bottom of my own confusion, so I'd like to add my own.

What I was missing is that lea treats the use of parentheses differently from how mov does.

Think of C. Let's say I have an array of long that I call array. Now the expression array[i] performs a dereference, loading the value from memory at the address array + i * sizeof(long) [1].

On the other hand, consider the expression &array[i]. This still contains the sub-expression array[i], but no dereferencing is performed! The meaning of array[i] has changed. It no longer means to perform a dereference but instead acts as a kind of specification, telling & what memory address we're looking for. If you like, you could alternatively think of the & as "cancelling out" the dereference.

Because the two use-cases are similar in many ways, they share the syntax array[i], but the existence or absence of a & changes how that syntax is interpreted. Without &, it's a dereference and actually reads from the array. With &, it's not. The value array + i * sizeof(long) is still calculated, but it is not dereferenced.

The situation is very similar with mov and lea. With mov, a dereference occurs that does not happen with lea. This is despite the use of parentheses that occurs in both. For instance, movq (%r8), %r9 and leaq (%r8), %r9. With mov, these parentheses mean "dereference"; with lea, they don't. This is similar to how array[i] only means "dereference" when there is no &.

An example is in order.

Consider the code

movq (%rdi, %rsi, 8), %rbp

This loads the value at the memory location %rdi + %rsi * 8 into the register %rbp. That is: get the value in the register %rdi and the value in the register %rsi. Multiply the latter by 8, and then add it to the former. Find the value at this location and place it into the register %rbp.

This code corresponds to the C line x = array[i];, where array becomes %rdi and i becomes %rsi and x becomes %rbp. The 8 is the length of the data type contained in the array.

Now consider similar code that uses lea:

leaq (%rdi, %rsi, 8), %rbp

Just as the use of movq corresponded to dereferencing, the use of leaq here corresponds to not dereferencing. This line of assembly corresponds to the C line x = &array[i];. Recall that & changes the meaning of array[i] from dereferencing to simply specifying a location. Likewise, the use of leaq changes the meaning of (%rdi, %rsi, 8) from dereferencing to specifying a location.

The semantics of this line of code are as follows: get the value in the register %rdi and the value in the register %rsi. Multiply the latter by 8, and then add it to the former. Place this value into the register %rbp. No load from memory is involved, just arithmetic operations [2].

Note that the only difference between my descriptions of leaq and movq is that movq does a dereference, and leaq doesn't. In fact, to write the leaq description, I basically copy+pasted the description of movq, and then removed "Find the value at this location".

To summarize: movq vs. leaq is tricky because they treat the use of parentheses, as in (%rsi) and (%rdi, %rsi, 8), differently. In movq (and all other instructions except lea), these parentheses denote a genuine dereference, whereas in leaq they do not and are purely convenient syntax.
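
To make the comparison concrete, here is a small C sketch of the two instructions from the example. It is only an illustration: the names lea_model and mov_model are invented here, and the scale is written as sizeof(long), which is 8 on x86-64 Linux and macOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counterpart of: leaq (%rdi, %rsi, 8), %rbp
   Computes base + index * scale and returns the number; nothing is read. */
uintptr_t lea_model(const long *array, size_t i) {
    return (uintptr_t)array + i * sizeof(long);
}

/* Counterpart of: movq (%rdi, %rsi, 8), %rbp
   Computes the same address, then loads the value stored there. */
long mov_model(const long *array, size_t i) {
    assert(array != NULL);
    return *(const long *)lea_model(array, i);
}
```

For long a[4] = {10, 20, 30, 40}, mov_model(a, 2) returns 30, whereas lea_model(a, 2) returns the numeric address of a[2].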


[1] I've said that when array is an array of long, the expression array[i] loads the value from the address array + i * sizeof(long). This is true, but there's a subtlety that should be addressed. If I write the C code

long x = array[5];

this is not the same as typing

long x = *(array + 5 * sizeof(long));

Based on my previous statements, it seems that it should be, but it's not.

What's going on is that C pointer addition has a trick to it. Say I have a pointer p pointing to values of type T. The expression p + i does not mean "the position at p plus i bytes". Instead, the expression p + i actually means "the position at p plus i * sizeof(T) bytes".

The convenience of this is that to get "the next value" we just have to write p + 1 instead of p + 1 * sizeof(T).

This means that the C code long x = array[5]; is actually equivalent to

long x = *(array + 5);

because C will automatically multiply the 5 by sizeof(long).

So in the context of this StackOverflow question, how is this all relevant? It means that when I say "the address array + i * sizeof(long)", I do not mean for "array + i * sizeof(long)" to be interpreted as a C expression. I am doing the multiplication by sizeof(long) myself in order to make my answer more explicit, but understand that because of that, this expression should not be read as C. Read it as ordinary math that happens to use C syntax.
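
A short C sketch of this footnote, with function names invented for the illustration: the first version lets C scale the index, and the second does the scaling by hand on a byte pointer, which is how the address looks from the asm side.

```c
#include <assert.h>
#include <stddef.h>

/* Address of element i using C pointer arithmetic:
   the compiler scales i by sizeof(long) for us. */
const long *element_address(const long *array, size_t i) {
    assert(array != NULL);
    return array + i;
}

/* The same address with the scaling done by hand. Casting to char * first
   makes the addition count in bytes instead of in elements. */
const long *element_address_bytes(const long *array, size_t i) {
    assert(array != NULL);
    return (const long *)((const char *)array + i * sizeof(long));
}
```

Both functions return &array[i]. Writing array + 5 * sizeof(long) directly on a long pointer would instead step 5 * sizeof(long) elements, which is 40 elements when sizeof(long) is 8.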

[2] Side note: because all lea does is arithmetic operations, its arguments don't actually have to refer to valid addresses. For this reason, it's often used to perform pure arithmetic on values that may not be intended to be dereferenced. For instance, cc with -O2 optimization translates

long f(long x) {
  return x * 5;
}

into the following (irrelevant lines removed):

f:
  leaq (%rdi, %rdi, 4), %rax  # set %rax to %rdi + %rdi * 4
  ret
Humbug answered 18/2, 2020 at 0:37 Comment(10)
Yup, good explanation, in more detail than the other answers, and yes C's & operator is a good analogy. Perhaps worth pointing out that LEA is the special case, while MOV is just like every other instruction that can take a memory or register operand. e.g. add (%rdi), %eax just uses the addressing mode to address memory, same as MOV. Also related: Using LEA on values that aren't addresses / pointers? takes this explanation further: LEA is how you can use the CPU's HW support for address math to do arbitrary calculations.Pigeonhearted
"get the value at %rdi" -- This is strangely worded. You mean that the value in the register rdi should be used. Your use of "at" seems to mean a memory dereference where there is none.Meddle
@PeterCordes Thanks! I've added the point about it being a special case to the answer.Humbug
@Meddle Good point; I didn't notice that. I've changed it now, thank you! :)Humbug
FYI, shorter phrasing that fixes the problem ecm pointed out includes: "the value of %rdi" or "the value in %rdi". Your "value in the register %rdi" is long but fine, and perhaps might help someone struggling to understand registers vs. memory.Pigeonhearted
I tidied up / improved (IMO) some parts of this. Feel free to roll back or remove anything that you think is confusing or distracting for beginners in case I misjudged.Pigeonhearted
Good update; most of it is an improvement over my edit. But note that array + i * sizeof(long) in C isn't &array[i]. C pointer math scales by the type width implicitly: array[i] is literally defined in the C standard as being equivalent to *(array + i). i.e. array + i is the right C expression. To talk about asm you want to show explicitly scaling the index by the type width to get a byte offset (unless you simplify by using char), but to do that you should probably avoid using exactly C syntax. Otherwise you're showing &array[i*sizeof(*array)]Pigeonhearted
@PeterCordes I went back and forth on whether I should write array + i or array + i * sizeof(long). I decided to do the latter since we're in the context of asm and since I never use the expression array + i within a C statement. However, it still is valid C syntax, as you've pointed out. I'll add a footnote.Humbug
The last trick is really awesome. Compilers really do a great job at making the exe efficient.Beano
So, LEA x, [y] and LEA x, y are the same? I'm an assembly newbie. I was wondering why use square brackets (or parentheses) with LEA, because it seems meaningless (maybe not in your example, where multiple arguments exist in the parentheses). Besides, thank you Quelklef and Peter Cordes for this amazing explanation.Trost
D
36

The instruction MOV reg,addr means read a variable stored at address addr into register reg. The instruction LEA reg,addr means read the address (not the variable stored at the address) into register reg.

Another form of the MOV instruction is MOV reg,immdata, which means read the immediate data (i.e. constant) immdata into register reg. Note that if the addr in LEA reg,addr is just a constant (i.e. a fixed offset), then that LEA instruction has the same effect as an equivalent MOV reg,immdata instruction that loads the same constant as immediate data.

Dapper answered 9/11, 2009 at 9:3 Comment(0)
N
10

If you only specify a literal, there is no difference. LEA has more abilities, though, and you can read about them here:

http://www.oopweb.com/Assembly/Documents/ArtOfAssembly/Volume/Chapter_6/CH06-1.html#HEADING1-136

Nellenelli answered 9/11, 2009 at 8:38 Comment(4)
I guess, with the exception that in GNU assembler it's not true when it comes to labels in the .bss segment? AFAIR you can't really leal TextLabel, LabelFromBssSegment when you got smth. like .bss .lcomm LabelFromBssSegment, 4, you would have to movl $TextLabel, LabelFromBssSegment, isn't it?Enamor
@JSmyth: That's only because lea requires a register destination, but mov can have an imm32 source and a memory destination. This limitation is of course not specific to the GNU assembler.Pigeonhearted
Also, this answer is basically wrong because the question is asking about MOV AX, [TABLE-ADDR], which is a load. So there is a major difference. The equivalent instruction is mov ax, OFFSET table_addrPigeonhearted
The link is dead.Grotesque
O
9

As stated in the other answers:

  • MOV will grab the data at the address inside the brackets and place that data into the destination operand.
  • LEA will perform the calculation of the address inside the brackets and place that calculated address into the destination operand. This happens without actually going out to the memory and getting the data. The work done by LEA is in the calculating of the "effective address".

Because memory can be addressed in several different ways (see examples below), LEA is sometimes used to add or multiply registers together without using an explicit ADD or MUL instruction (or equivalent).

Since everyone is showing examples in Intel syntax, here are some in AT&T syntax:

MOVL 16(%ebp), %eax       /* put long  at  ebp+16  into eax */
LEAL 16(%ebp), %eax       /* add 16 to ebp and store in eax */

MOVQ (%rdx,%rcx,8), %rax  /* put qword at  rcx*8 + rdx  into rax */
LEAQ (%rdx,%rcx,8), %rax  /* put value of "rcx*8 + rdx" into rax */

MOVW 5(%bp,%si), %ax      /* put word  at  si + bp + 5  into ax */
LEAW 5(%bp,%si), %ax      /* put value of "si + bp + 5" into ax */

MOVQ 16(%rip), %rax       /* put qword at rip + 16 into rax                 */
LEAQ 16(%rip), %rax       /* add 16 to instruction pointer and store in rax */

MOVL label(,1), %eax      /* put long at label into eax            */
LEAL label(,1), %eax      /* put the address of the label into eax */
Olindaolinde answered 21/11, 2019 at 18:21 Comment(2)
You never want lea label, %eax for an absolute [disp32] addressing mode. Use mov $label, %eax instead. Yes it works, but it's less efficient (larger machine code and runs on fewer execution units). Since you mention AT&T, Using LEA on values that aren't addresses / pointers? uses AT&T, and my answer there has some other AT&T examples.Pigeonhearted
So leaq (%rax,%rax), %rdx means rdx = rax + 1 * rax? My clang disassembler should have written it like that, instead, right? leaq (%rax,%rax,), %rdx Note the extra comma. Since this bracketed part is a ternary expression, not a binary one.Spaniard
G
7

It depends on the assembler used, because

mov ax,table_addr

in MASM works as

mov ax,word ptr[table_addr]

So it loads the first word stored at table_addr and NOT the offset of table_addr. Instead, you should use

mov ax,offset table_addr

or

lea ax,table_addr

which works the same.

The lea version also works fine if table_addr is a local variable, e.g.

some_procedure proc

local table_addr[64]:word

lea ax,table_addr
Genoese answered 9/11, 2009 at 8:44 Comment(5)
The difference between the x86 instructions MOV and LEA most definitely does NOT depend on the assembler.Waken
My heydays in assembly programming were around 1984 on 6502 :) One would think, that for more complicated architectures (like x86_64), the assembly syntax would have improved in those 40 years and be less cryptic and error prone...Spaniard
@BitTickler: There are better x86 assemblers that were developed after MASM, like NASM where things are as simple and unambiguous as possible. e.g. [] always means a memory operand, lack of [] means definitely not a memory operand. (Of course, the LEA instruction needs its source operand to be a memory operand, but takes the address. That's just part of how the ISA works, unless you want to invent whole new syntax for LEA despite the fact it uses the normal addressing-mode encodings. Stuff like lea rax, [rdi + rsi*4] is when it's useful, or for RIP-relative LEA in 64-bit mode.)Pigeonhearted
@Bartosz Wójcik - Why did you roll back the edits, making the formatting / style of your answer worse? Having "coz", a slang spelling of "because", on a line by itself just slows down and distracts readers.Pigeonhearted
@BartoszWójcik The edits were fine. Refrain from further rollbacks. See stackoverflow.com/help/editingGink
J
4

Basically, LEA means "move into REG after computing it", which makes it nice for other purposes as well :)

If you just forget that the value is a pointer, you can use it for code optimization/minimization, or whatever.

MOV EBX , 1
MOV ECX , 2

; with 1 instruction you get the sum of 2 registers plus a constant in a 3rd one
LEA EAX , [EBX+ECX+5]

EAX = 8

Without LEA it would be:

MOV EAX, EBX
ADD EAX, ECX
ADD EAX, 5
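
For readers who prefer C, the arithmetic LEA performs here can be sketched as one function. The name lea32 and its parameter list are invented for this illustration; the assert reflects that x86 only encodes scale factors of 1, 2, 4 or 8.

```c
#include <assert.h>
#include <stdint.h>

/* Models a 32-bit LEA: base + index * scale + disp, wrapping at 32 bits.
   No memory is accessed; it is arithmetic on register values only. */
uint32_t lea32(uint32_t base, uint32_t index, uint32_t scale, int32_t disp) {
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + (uint32_t)disp;
}
```

lea32(1, 2, 1, 5) returns 8, matching EAX = 8 above, and lea32(x, x, 4, 0) computes x * 5 in a single step.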
Josejosee answered 19/10, 2017 at 22:7 Comment(1)
Yup, lea is a shift-and-add instruction that uses memory-operand machine encoding and syntax, because the hardware already knows how to decode ModR/M + SIB + disp0/8/32.Pigeonhearted
H
2

Let's understand this with an example.

mov eax, [ebx]

and

lea eax, [ebx]

Suppose the value in ebx is 0x400000. Then mov will go to address 0x400000 and copy the 4 bytes of data present there into the eax register, whereas lea will copy the address 0x400000 itself into eax. So, after each instruction executes, the value of eax will be as follows (assuming the memory at 0x400000 contains 30):

eax = 30 (in case of mov)
eax = 0x400000 (in case of lea)

By definition, mov copies the data from rm32 to the destination (mov dest, rm32), and lea (load effective address) copies the address to the destination (lea dest, m).

Headstand answered 24/11, 2019 at 22:48 Comment(0)
A
1

MOV can do the same thing as LEA [label], but the MOV instruction contains the effective address inside the instruction itself as an immediate constant (calculated in advance by the assembler). LEA uses PC-relative addressing to calculate the effective address during the execution of the instruction.

Amethyst answered 22/6, 2020 at 5:59 Comment(1)
That's only true for 64-bit mode (where PC-relative addressing was new); in other modes lea [label] is a pointless waste of bytes vs. a more compact mov, so you should specify the conditions you're talking about. Also, for some assemblers [label] isn't the right syntax for a RIP-relative addressing mode. But yes, that's accurate. How to load address of function or label into register in GNU Assembler explains in more detail.Pigeonhearted
R
0

The difference is subtle but important. The MOV instruction is a 'MOVe' effectively a copy of the address that the TABLE-ADDR label stands for. The LEA instruction is a 'Load Effective Address' which is an indirected instruction, which means that TABLE-ADDR points to a memory location at which the address to load is found.

Effectively using LEA is equivalent to using pointers in languages such as C, as such it is a powerful instruction.

Rabies answered 9/11, 2009 at 8:46 Comment(1)
I think this answer is confusing at best. "The LEA instruction is a 'Load Effective Address' which is an indirected instruction, which means that TABLE-ADDR points to a memory location at which the address to load is found." Actually LEA will load the address, not the contents of the address. I think actually the questioner needs to be reassured that MOV and LEA can overlap, and do exactly the same thing, in some circumstancesDapper
L
0

LEA (Load Effective Address) is a shift-and-add instruction. It was added to the 8086 because the hardware was already there to decode and calculate addressing modes.

Labiche answered 21/11, 2019 at 15:39 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.