I recently stumbled across the following assembly instruction sequence:
rep stos dword ptr [edi]
For ecx repetitions, stores the contents of eax into where edi points to, incrementing or decrementing edi (depending on the direction flag) by 4 bytes each time. Normally, this is used for a memset-type operation.
Usually, that instruction is simply written rep stosd. Experienced assembly coders know all the details mentioned above just by seeing that. :-)
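For readers who think more easily in C, the behaviour described above can be sketched roughly as follows. This is only a model under stated assumptions: rep_stosd is a hypothetical helper name (not a real API), the registers are passed in as ordinary parameters, and the direction flag is modelled as an int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a real API): models `rep stosd` in portable C.
 *   edi - destination pointer (the EDI register)
 *   eax - 32-bit value stored on every iteration (the EAX register)
 *   ecx - number of dwords to store (the ECX register), not a byte count
 *   df  - direction flag: 0 = EDI moves up, nonzero = EDI moves down
 * Returns the final value of EDI; ECX would be 0 when the loop ends. */
uint32_t *rep_stosd(uint32_t *edi, uint32_t eax, size_t ecx, int df)
{
    while (ecx != 0) {
        *edi = eax;           /* store EAX at [EDI] */
        edi += df ? -1 : 1;   /* one uint32_t step = 4 bytes up or down */
        --ecx;                /* one repetition consumed */
    }
    return edi;
}
```

Calling rep_stosd(buf, 0x11223344u, 4, 0) on a four-element uint32_t array fills all 16 bytes and returns buf + 4. Because a whole dword is stored each time, the four bytes of the pattern do not have to be equal, which is where this differs from a plain memset.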
ETA for completeness (thanks PhiS): Each iteration, ecx is decremented by 1, and the loop stops when it reaches zero. For stos, the only thing you will observe is that ecx is cleared at the end. But, for scas or the like, where the repz/repnz prefixes are used, ecx can be greater than zero if the operation stopped before exhausting ecx bytes/words/whatevers.
Before you ask, scas is used for implementing strchr-type operations. :-P
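To illustrate how the count can be left greater than zero, here is a rough C model of repne scasb. It is a sketch under assumptions: repne_scasb is a hypothetical helper name, the direction flag is taken to be clear (scanning upward), and the count is passed by pointer so the leftover value can be inspected afterwards.

```c
#include <stddef.h>

/* Hypothetical helper (not a real API): models `repne scasb` with DF = 0.
 * Scans at most *ecx bytes starting at edi for the byte al. As with the
 * instruction, the count is decremented once per byte compared, and the
 * scan stops on a match or when the count reaches zero.
 * Returns a pointer to the matching byte, or NULL if none was found. */
const unsigned char *repne_scasb(const unsigned char *edi,
                                 unsigned char al, size_t *ecx)
{
    while (*ecx != 0) {
        unsigned char byte = *edi++;  /* compare AL with [EDI], advance EDI */
        --*ecx;                       /* decremented even for the matching byte */
        if (byte == al)
            return edi - 1;           /* ZF set: the prefix stops repeating */
    }
    return NULL;                      /* count exhausted without a match */
}
```

Searching the 5 bytes of "hello" for 'l' stops at the third byte and leaves the count at 2; searching for a byte that is not present runs the count down to 0.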
cx/ecx/rcx). In your case, since you're using the 32-bit instruction, it will use the 32-bit version of that register, thus, ecx. – Assess
memset(edi, eax, ecx), where edi, eax, and ecx are the registers?...Except the 2nd parameter may be more than a single byte. – Sanalda
memset, where the thing to set to is a 32-bit quantity (unlike memset, where the thing to set is a char). ecx specifies the number of dwords (and not number of bytes) to set. – Assess
memset(edi, eax, ecx * 4) – Convolvulaceous
stosd, the individual bytes in the dword can have different contents. – Assess
Empty array:
char buff[256] = { };
1c5: 48 8d 95 e0 fc ff ff    lea    -0x320(%rbp),%rdx
1cc: b8 00 00 00 00          mov    $0x0,%eax
1d1: b9 20 00 00 00          mov    $0x20,%ecx
1d6: 48 89 d7                mov    %rdx,%rdi
1d9: f3 48 ab                rep stos %rax,%es:(%rdi)
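The count in that listing lines up with the array size: 0x20 stores of 8 bytes each cover exactly 256 bytes. A small C sketch of the same arithmetic, assuming hypothetical helper names rep_stosq and listing_zeroes_whole_array and an upward scan direction:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a real API): models `rep stos %rax,(%rdi)`
 * with DF = 0. memcpy is used for the 8-byte store so the sketch does
 * not depend on the alignment of the destination buffer. */
void rep_stosq(unsigned char *rdi, uint64_t rax, size_t rcx)
{
    while (rcx != 0) {
        memcpy(rdi, &rax, sizeof rax);  /* store RAX at [RDI] */
        rdi += sizeof rax;              /* advance by 8 bytes */
        --rcx;
    }
}

/* Mirrors the listing: RAX = 0, RCX = 0x20, destination is a 256-byte array.
 * Returns 1 if every byte ended up zero, 0 otherwise. */
int listing_zeroes_whole_array(void)
{
    unsigned char buff[256];
    memset(buff, 0xAA, sizeof buff);    /* pre-fill so the zeroing is visible */
    rep_stosq(buff, 0, 0x20);
    for (size_t i = 0; i < sizeof buff; ++i)
        if (buff[i] != 0)
            return 0;
    return 1;
}
```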
rep stosq, but sure close enough. (disassembled with AT&T syntax). That looks like un-optimized gcc output; it will inline rep stos in some cases instead of calling memset even with optimization. Obviously optimized code wouldn't spend 2 separate instructions getting the pointer into RDI, and would zero RAX with xor %eax,%eax. (If it didn't optimize away the array entirely.) – Gianina
mov $0, %eax to zero RAX without the xor-zero peephole optimization (which gcc only looks for at -O2, which enables -fpeephole2). Using an extra REX prefix would be strictly worse with XOR, like it would be with MOV. What is the best way to set a register to zero in x86 assembly: xor, mov or and? – Gianina
rep stos do?"). – Lambency