Assembler Alias
REP MOVS DWORD PTR ES:[EDI], DWORD PTR [ESI]
is a synonym for
REP MOVSD
and
REP MOVS BYTE PTR ES:[EDI], BYTE PTR [ESI]
is a synonym for
REP MOVSB
You can write it this way in an attempt to "improve" the readability of the code. The idea might have been the following: "if somebody forgot that MOVSB moves from ESI to EDI, this longer syntax will help to make things clearer". This in no way affects the assembled machine code. The difference is only in the textual source code.
As you know, there are the following MOVS commands, based on data sizes:
- MOVSB (byte, 8-bit)
- MOVSW (word, 16-bit)
- MOVSD (dword, 32-bit)
- MOVSQ (qword, 64-bit), only available in 64-bit mode
What Does This Instruction Do
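The MOVS instruction copies one element from DS:SI (ESI/RSI) to ES:DI (EDI/RDI). The width of the index registers depends on the current address size: 16-bit, 32-bit, or 64-bit. After the copy, it advances both SI and DI by the element size: they are incremented if the direction flag (DF) is clear and decremented if it is set. Use CLD to clear DF and STD to set it. With the REP prefix, the instruction repeats CX (ECX/RCX) times.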
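MOVS always uses SI and DI as index registers and ES as the destination segment, so it is unnecessary to specify them; only the source segment can be changed, with a segment-override prefix. In my opinion, spelling the operands out is redundant, and it worsens readability rather than improving it.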
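The behaviour described above can be modelled in a few lines of C. This is only a sketch under simplifying assumptions: memory is a flat byte array (as if DS and ES had the same base), and the `movs_regs` struct and the `movs_step` and `rep_movs` functions are hypothetical names invented for the illustration, not anything the CPU or an assembler defines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the registers that REP MOVS reads and updates. */
typedef struct {
    size_t si; /* source index (SI/ESI/RSI), modelled as an offset into mem */
    size_t di; /* destination index (DI/EDI/RDI), also an offset into mem */
    size_t cx; /* repeat count (CX/ECX/RCX) */
    int    df; /* direction flag: 0 after CLD, 1 after STD */
} movs_regs;

/* One MOVS step: copy one element, then move both indexes by its size. */
static void movs_step(uint8_t *mem, movs_regs *r, size_t elem_size)
{
    memmove(mem + r->di, mem + r->si, elem_size);
    if (r->df) {
        r->si -= elem_size;
        r->di -= elem_size;
    } else {
        r->si += elem_size;
        r->di += elem_size;
    }
}

/* REP prefix: repeat the step until the count register reaches zero. */
static void rep_movs(uint8_t *mem, movs_regs *r, size_t elem_size)
{
    while (r->cx != 0) {
        movs_step(mem, r, elem_size);
        r->cx--;
    }
}
```

For example, with `si = 0`, `di = 16`, `cx = 2`, `df = 0` and an element size of 4 (the MOVSD case), `rep_movs` copies eight bytes and leaves `si = 8`, `di = 24`, `cx = 0`.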
About the Segment Registers ES and DS
DS and ES are "segment" registers. As noted above, MOVS only works with SI/DI as index registers and DS/ES as segment registers. You cannot choose different index registers or a different destination segment; the only thing you can change is the source segment, by using a segment-override prefix.
But you should not worry about the segment registers because they are usually already set up correctly, and you should not modify or be concerned about them if your program runs under a standard OS like Linux, Windows, etc. These segment registers may be needed only in the following cases:
- You are writing a program for 16-bit mode, like MS-DOS 16-bit real mode or MS-DOS 16-bit protected mode (available on 80286).
- You are writing a kernel/supervisor for a new operating system, or your application runs on bare metal hardware without any operating system.
In 16-bit mode, on Intel CPUs from 8086 to 80286, there were the following segment registers: CS DS ES SS.
In real mode, a 16-bit segment register is interpreted as the most significant 16 bits of a 20-bit linear address: the CPU effectively multiplies the segment value by 16 to get the base address of the segment, then adds the offset. For example, if you load 1 into DS and 2 into SI, "byte ptr DS:[SI]" refers to linear address 1*16+2 = 18.
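The real-mode calculation can be written out as a small C function. This is a sketch: the name `real_mode_linear` is made up for the example, and masking the result to 20 bits assumes an 8086-style 20-bit address bus.

```c
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset.
   The mask keeps 20 bits, which assumes an 8086-style address bus. */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Calling `real_mode_linear(1, 2)` gives 18, matching the DS = 1, SI = 2 example above.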
In protected mode (80286 and later), the segment registers no longer hold part of the address directly. They now hold a selector, which contains an index into a table of segment descriptors; each descriptor contains the 24-bit base address of its segment.
The Intel 80386 and later retain the segmentation mechanism of 80286 protected mode in 32-bit protected mode, but add a paging unit as a second layer of address translation between the segmentation unit and the physical bus. The segment base in each segment descriptor is now 32-bit (instead of 24-bit), and two new segment registers were added: FS and GS.
In 64-bit mode, segmentation is effectively disabled. For four of the segment registers (CS, SS, DS, and ES) the base is forced to 0 and the limit to 2^64. FS and GS can still have a nonzero base address, which allows operating systems to use these segments for special purposes.
A notable fact is that the 386 and later Intel x86 CPUs still use 16-bit segment registers, because they merely hold an index into the segment descriptor table.
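To illustrate why 16 bits are enough, here is a sketch that splits a protected-mode selector into its fields. The layout (index in bits 3 to 15, a table-indicator bit, and a two-bit requested privilege level) is the standard x86 one; the `selector_fields` struct and `decode_selector` function are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Fields of a 16-bit protected-mode segment selector. */
typedef struct {
    unsigned index; /* entry number in the descriptor table (bits 3..15) */
    unsigned ti;    /* table indicator: 0 = GDT, 1 = LDT (bit 2) */
    unsigned rpl;   /* requested privilege level (bits 0..1) */
} selector_fields;

/* Split a selector value into its three fields. */
static selector_fields decode_selector(uint16_t sel)
{
    selector_fields f;
    f.index = (unsigned)sel >> 3;
    f.ti    = ((unsigned)sel >> 2) & 1u;
    f.rpl   = (unsigned)sel & 3u;
    return f;
}
```

For instance, the selector value 0x23 decodes to index 4 in the GDT with a requested privilege level of 3.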
Since, as I wrote before, a standard operating system, be it 32-bit or 64-bit, already sets up DS and ES so that they refer to the same memory, you can just ignore them.
You can find more information in Chapter 7.3.9.1 "String Instructions" of the Intel® 64 and IA-32 Architectures Software Developer's Manual (Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4). Quote:
These instructions operate on individual elements in a string, which
can be a byte, word, or doubleword. The string elements to be operated
on are identified with the ESI (source string element) and EDI
(destination string element) registers. Both of these registers
contain absolute addresses (offsets into a segment) that point to a
string element. By default, the ESI register addresses the segment
identified with the DS segment register. A segment-override prefix
allows the ESI register to be associated with the CS, SS, ES, FS, or
GS segment register. The EDI register addresses the segment identified
with the ES segment register; no segment override is allowed for the
EDI register. The use of two different segment registers in the string
instructions permits operations to be performed on strings located in
different segments. Or by associating the ESI register with the ES
segment register, both the source and destination strings can be
located in the same segment. (This latter condition can also be
achieved by loading the DS and ES segment registers with the same
segment selector and allowing the ESI register to default to the DS
register.) The MOVS instruction moves the string element addressed by
the ESI register to the location addressed by the EDI register. The
assembler recognizes three "short forms" of this instruction, which
specify the size of the string to be moved: MOVSB (move byte string),
MOVSW (move word string), and MOVSD (move doubleword string).
About the Performance of the MOVS Instruction
Starting with the first Pentium in 1993, Intel began to make simple instructions faster and complex instructions (like REP MOVS) slower.
As a result, REP MOVS became very slow, and there was no longer a practical reason to use it.
In 2013, Intel decided to revisit REP MOVS. If a CPU produced after 2013 reports the ERMSB (Enhanced REP MOVSB) feature bit in CPUID, REP MOVSB and REP STOSB are executed differently than on older processors and are supposed to be fast. In practice, they are only fast for large blocks, 256 bytes and larger, and only when certain conditions are met:
- both the source and destination addresses have to be aligned to a 16-byte boundary;
- the source region should not overlap with the destination region;
- the length has to be a multiple of 64 to produce higher performance;
- the direction has to be forward (CLD).
See the Intel Manual on Optimization, section 3.7.6 Enhanced REP MOVSB and STOSB operation (ERMSB) http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
REP MOVSB and REP STOSB are very slow on small blocks because of their very high startup cost: about 35 cycles.
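The conditions listed above can be expressed as a simple predicate that a copy routine might consult before choosing REP MOVSB. This is only a sketch: `ermsb_conditions_met` is a hypothetical helper, the thresholds are taken directly from the list above, and the code does not check the CPUID feature bit itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when a copy satisfies the conditions listed above.
   Hypothetical helper: it does not query CPUID for the ERMSB bit. */
static bool ermsb_conditions_met(const void *dst, const void *src,
                                 size_t len, bool direction_forward)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    bool aligned     = (d % 16 == 0) && (s % 16 == 0);   /* 16-byte aligned */
    bool disjoint    = (d + len <= s) || (s + len <= d); /* no overlap */
    bool large       = len >= 256;                       /* large block */
    bool multiple_64 = (len % 64 == 0);                  /* multiple of 64 */

    return aligned && disjoint && large && multiple_64 && direction_forward;
}
```

A caller would fall back to another copy strategy whenever the predicate returns false, for example for a 128-byte block or a destination that is not 16-byte aligned.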
[...] DS:. Why isn't it DS:[ESI]? – Chanticleer
[...] BYTE PTR DS:[ESI], which attempts to explain what the instruction does but is in fact unnecessary. The instruction should be just REP MOVS - both the byte size and the source and target specifications are implicit. – Sheritasherj
[...] movs be movsb or movsd?). – Lindner
PTR is an operator in MASM/TASM syntax. BYTE PTR foo tells the assembler that foo should be treated as being of type BYTE (the value at foo that is, not the address of foo). As for whether the MOVS in this case is a MOVSB or MOVSD; the use of BYTE PTR strongly implies MOVSB. – Horizon