Les instruction purpose?

Asked 8/9, 2011 at 5:57 Answered 8/9, 2011 at 6:18

What is the purpose of the les instruction in assembly?

Why do we need to load the es segment register and another register? The book gives the following example:

les    bx, p           ; Load p into ES:BX
mov    es:[bx], al     ; Store away AL

Why do we need to load es and bx in this case?

Also why do we use es:[bx]? If p points to 100h in memory, aren't both es and bx 100h, giving 200h (bx+es)?

Baresark answered 8/9, 2011 at 5:57 Comment(0)

It's too bad you are learning assembler for a microprocessor with a messy architecture. You get confusing concepts such as the LES instruction.

Conventional microprocessors have registers large enough to contain a full memory address. You can simply load the address of a memory location into a register, and then access that location (and usually those nearby with indexing) via the register.

Some machines (notably the Intel 286 in real mode, which seems to be what you are programming), had only 16 bit registers but could address 1MB of memory. In this case, a register doesn't have enough bits: you need 20 bits, but the registers are only 16 bits.

The solution is to have a second register that contains the missing bits. A simple scheme would have been to require 2 registers, one of which had the lower 16 bits, one of which had the upper 16 bits, to produce a 32 bit address. Then the instruction that references two registers makes sense: you need both to get a full memory address.

Intel chose a messier segment:offset scheme: the normal register (bx in your case) contains a 16-bit offset, and the special register (called ES) contains 16 bits which are left-shifted 4 bits and added to the offset to get the resulting linear address. ES is called a "segment" register, but this will make no sense unless you go read about the Multics operating system circa 1968.
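To make the shift-and-add concrete, here is a minimal sketch in Python rather than assembly. The function name `linear_address` is made up for illustration, and the wrap to 20 bits assumes an 8086-style address bus (later CPUs in real mode can behave differently at the 1 MB boundary).

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: (segment * 16) + offset, wrapped to 20 bits (8086 assumption)."""
    segment &= 0xFFFF  # segment registers are 16 bits wide
    offset &= 0xFFFF   # so is the offset register
    return ((segment << 4) + offset) & 0xFFFFF

# The question's case: ES = 0100h and BX = 0100h do not add up to 0200h.
print(hex(linear_address(0x0100, 0x0100)))  # 0x1100

# Different segment:offset pairs can name the same byte.
print(linear_address(0x0100, 0x0100) == linear_address(0x0110, 0x0000))  # True
```

So es:[bx] is not bx+es; the segment is scaled by 16 before the offset is added.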

(x86 allows other addressing modes for the "effective address" or "offset" part of an address, like es:[bx + si + 1234], but always exactly one segment register for a memory address.)

[Segments and segment registers really are an interesting idea when fully implemented the Multics way. If you don't know what this is, and you have any interest in computer and/or information architectures, find the Elliott Organick book on Multics and read it cover to cover. You will be dismayed at what we had in the late 60s and seem to have lost in 50 years of "progress". If you want a longer discussion of this, see my discussion on the purpose of FS and GS segment registers.]

What's left of the idea in the x86 is pretty much a joke, at least the way it is used in "modern" operating systems. You don't really care; when some hardware designer presents you with a machine, you have to live with it as it is.

For the Intel 286, you simply have to load a segment register and an index register to get a full address. Each machine instruction has to reference one index register and one segment register in order to form a full address. For the Intel 286, there are 4 such segment registers: DS, SS, ES, and CS. Each instruction type explicitly designates an index register and implicitly chooses one of the 4 segment registers unless you provide an explicit override that says which one to use. JMP instructions use CS unless you say otherwise. MOV instructions use DS unless you say otherwise. PUSH instructions use SS unless you say otherwise (and in this case you'd better not). ES is the "extra" segment; you can only use it by explicitly referencing it in the instruction (except the block move [MOVSB] instruction, which uses both DS and ES implicitly).
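The default-segment rules can be summarized as a small lookup. This is a simplified Python sketch, not a full model of the instruction set: the function `default_segment` and its `kind` labels are hypothetical names chosen for illustration. It also includes the detail that data addresses formed with BP default to SS.

```python
def default_segment(kind: str, base_register: str = "") -> str:
    """Return the segment register used when no override prefix is given.

    kind is one of "code", "stack", "string_dest", "data" (labels invented here).
    """
    if kind == "code":         # instruction fetch and near jump/call targets
        return "CS"
    if kind == "stack":        # PUSH, POP, and return addresses
        return "SS"
    if kind == "string_dest":  # destination of MOVSB/STOSB and friends is ES:DI
        return "ES"
    if kind == "data":
        # Effective addresses that involve BP default to SS; all others to DS.
        return "SS" if "BP" in base_register.upper() else "DS"
    raise ValueError(f"unknown access kind: {kind}")

print(default_segment("data", "BX"))     # DS
print(default_segment("data", "BP+SI"))  # SS
print(default_segment("string_dest"))    # ES
```

In the book's example, mov es:[bx], al carries an explicit ES override, because a plain [bx] would use DS.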

Hope that helps.

Best to work with a more modern microprocessor, where segment register silliness isn't an issue. (For example, 32-bit mode x86, where mainstream OSes use a flat memory model with all segment bases = 0. So you can just ignore segmentation and have single registers as pointers, only caring about the "offset" part of an address.)

Mede answered 8/9, 2011 at 6:14 Comment(6)

Your answer is mostly correct. However, all modern x86 processors use segment registers. Even in 64-bit mode which is mostly flat, you still have GS and FS that aren't flat. If anything, this segment register silliness is much more complicated nowadays than you describe in your post. – Zavala 8/9, 2011 at 7:15

Yes, they do, but OP didn't need to hear this complication. Nor is the current utilization (x64) anything but the barest vestige of real segment registers. Such a shame, see the Multics reference. (Andy Grove got up in the mid 80s at one talk and blew his stack... Intel designed the 386 segment registers to really do Multics, and he got ignored by the Unix weenies. We deserve what we accept). – Mede 8/9, 2011 at 7:33

Thanks so much! "You need 20 bits, but the registers are only 16 bits." I completely forgot I was working with 16 bit CPU! – Baresark 8/9, 2011 at 7:45

Three nitpicks: 1) this is 8086 here 2) implicit segment register depends on the operand, e.g. SI gets DS and DI gets ES 3) 8086 (real mode) segments have nothing to do with Multics (they only serve to address up to 1MB of memory without bank switching), you are thinking of 286 (protected mode) segments. – Abeu 25/11, 2011 at 21:22

Doesn't matter how the segment registers work, they affect the logical address mapping to physical memory, and control whether access is legal or not. The fact the 8086 segment registers are trivial and have no actual protection bits associated with them simply makes them extremely primitive versions of what we got from Multics and eventually the 32 bit Intel CPUs. The fact that Intel figured that out was genius on their part; the fact that the rest of the world was too stupid to understand this is sheer idiocy. So instead of Multics, we got flat address space "Eunuchs". Bah. – Mede 25/11, 2011 at 22:0

"Some machines (notably the Intel 286 which seems to be what you are programming), had only 16 bit registers but could address 1mB of memory." -- Actually the 286 can address up to 16 MiB in Protected Mode. And in that mode a segment register's value is not directly used to compute the base. Only in Real Address Mode (always when on 8086/186) the segment register "contains 16 bits which are left-shifted 4 bits" to form the segment base. – Flycatcher 12/8, 2020 at 16:0

The 8086 segment registers cs, ds, es, and ss are the original mechanism by which 16-bit registers can address more than 64K of memory. In the 8086/8088, there were 20 bit addresses (1024 K) to be generated. Subsequent versions of the x86 processors added new schemes to address even more, but generating 20+ bits of address from a pair of 16-bit values is the basic reason.

In so-called "real mode" (native to 8086/8088/80186), an address is computed by multiplying the contents of the segment register by 16 (or, equivalently, shifting it left by four places) and adding the offset.

In protected mode (available with the 80286 and later), the segment register selects a "descriptor" which contains a base physical address. The operand es:[bx], for example, adds bx to that physical address to generate the operand address.
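A rough Python sketch of that lookup, under stated assumptions: the `Descriptor` type, the `translate` function and the sample table contents are invented for illustration. Only the GDT is modeled, and privilege checks, access rights and expand-down segments are ignored. The selector layout (index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0) is the standard one.

```python
from typing import NamedTuple

class Descriptor(NamedTuple):
    base: int   # start address of the segment
    limit: int  # highest valid offset within the segment

def split_selector(selector: int):
    """Split a 16-bit selector into (table index, table name, requested privilege level)."""
    index = selector >> 3
    table = "LDT" if selector & 0b100 else "GDT"
    rpl = selector & 0b11
    return index, table, rpl

def translate(selector: int, offset: int, gdt) -> int:
    """Return descriptor base + offset for a GDT selector (sketch only)."""
    index, table, _rpl = split_selector(selector)
    if table != "GDT":
        raise NotImplementedError("this sketch models only the GDT")
    descriptor = gdt[index]
    if descriptor is None:
        raise ValueError("null descriptor")
    if offset > descriptor.limit:
        raise ValueError("offset beyond segment limit (a real CPU raises a fault)")
    return descriptor.base + offset

# Hypothetical table: entry 0 is the null descriptor, entry 1 is a 64K segment.
gdt = [None, Descriptor(base=0x120000, limit=0xFFFF)]
print(hex(translate(0x0008, 0x0100, gdt)))  # 0x120100
```

Note that the value in ES (0008h here) has no arithmetic relation to the resulting address; it only picks a table entry.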

Natividadnativism answered 8/9, 2011 at 6:18 Comment(1)

Protected Mode was introduced with the 286 however. – Flycatcher 12/8, 2020 at 16:2

p points to a 32-bit FAR pointer with segment and offset part (in contrast to a NEAR pointer, which is only the offset part). LES will load segment:offset into ES:BX.
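What LES does with that far pointer can be modeled in a few lines of Python. This is a sketch: the function name `les` and the sample memory contents are made up. The far pointer is stored little-endian, with the 16-bit offset word first and the 16-bit segment word right after it.

```python
import struct

def les(memory, p: int):
    """Return (es, reg) as LES would load them from the far pointer stored at address p."""
    # Two little-endian 16-bit words: offset at p, segment at p + 2.
    offset, segment = struct.unpack_from("<HH", memory, p)
    return segment, offset

memory = bytearray(16)
struct.pack_into("<HH", memory, 4, 0x0100, 0x2000)  # offset 0100h, then segment 2000h

es, bx = les(memory, 4)
print(f"{es:04X}:{bx:04X}")  # 2000:0100
```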

Otherwise, you would have to use three instructions. One for loading BX, and two for loading ES (segment registers cannot be loaded directly from memory but have to be loaded into a general-purpose register and then into the segment register).

Oh, yeah, wallyk had a good point with mentioning protected mode (although that is beside the point of your question). Here, ES will be interpreted as a selector, not an actual segment.

A segment (address) in this context is a part of the physical address:
Shift the segment by 4 bits to the left (i.e. multiply it by 2^4 = 16) and add the offset to get the physical address from segment:offset.

In contrast, a selector is a pointer to an entry in a so-called descriptor table (i.e. a selector points to a descriptor) and is used in protected mode. A descriptor table (e.g. the GDT) contains entries describing chunks of memory, including the physical memory address, the chunk size, access rights etc. (descriptors have a few other uses as well).

Ciccia answered 8/9, 2011 at 6:6 Comment(1)

"segment registers cannot be loaded directly from memory" This is incorrect. You can load like in mov es, word [1234h]. The only limitations are that you cannot use segment registers in computations (no inc, add, and, etc) and you cannot load an immediate value embedded in an instruction (no mov es, 0ABCDh). – Flycatcher 12/8, 2020 at 16:5
