Why do the x86 instructions INC (increment) and DEC (decrement) not affect the CF (carry flag) in the FLAGS register?
To understand why, you probably need to remember that the current "x86" CPUs with 32- and 64-bit values started life as much more limited 8-bit machines, going back to the Intel 8008. (I coded in this world back in 1973, and I still remember (ugh) it!)
In that world, registers were precious and small. You need INC/DEC for various purposes, the most common being loop control. Many loops involved doing "multi-precision arithmetic" (e.g., 16 bits or more!). By having INC/DEC set the Zero flag (Z), you could use them to control loops pretty nicely; by insisting the loop control instructions not change the Carry flag (CF), the carry is preserved across loop iterations and you can implement multiprecision operations without writing tons of code to remember the carry state.
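The loop described above can be sketched in a small model. This is a hedged illustration in Python rather than real 8008 or 8086 code: the `Flags` class and the `adc8`, `dec8`, and `add_multibyte` helpers are hypothetical names invented here, and they model only the carry and zero flags for 8-bit values.

```python
class Flags:
    """Minimal model of two flag bits (hypothetical helper, not a real API)."""

    def __init__(self):
        self.cf = 0  # carry flag
        self.zf = 0  # zero flag


def adc8(flags, a, b):
    """Model an 8-bit add-with-carry: (a + b + CF) mod 256, updating CF and ZF."""
    total = a + b + flags.cf
    result = total & 0xFF
    flags.cf = 1 if total > 0xFF else 0
    flags.zf = 1 if result == 0 else 0
    return result


def dec8(flags, value, clobber_carry=False):
    """Model an 8-bit decrement that sets ZF but leaves CF alone.

    clobber_carry=True models a hypothetical DEC that behaves like SUB value,1
    and overwrites CF with the borrow.
    """
    result = (value - 1) & 0xFF
    flags.zf = 1 if result == 0 else 0
    if clobber_carry:
        flags.cf = 1 if value == 0 else 0
    return result


def add_multibyte(x, y, clobber_carry=False):
    """Add two non-empty little-endian byte lists of equal length."""
    flags = Flags()      # CF starts clear, as after an explicit CLC
    out = []
    count = len(x)       # loop counter, as if held in a register
    i = 0
    while True:
        out.append(adc8(flags, x[i], y[i]))        # consumes and produces CF
        i += 1
        count = dec8(flags, count, clobber_carry)  # loop control
        if flags.zf:     # JNZ falls through once the counter hits zero
            break
    return out, flags.cf
```

With these definitions, `add_multibyte([0xFF, 0x01], [0x01, 0x00])` returns `([0x00, 0x02], 0)`, i.e. 0x01FF + 0x0001 = 0x0200. Passing `clobber_carry=True` loses the carry between iterations and returns `([0x00, 0x01], 0)`, which is the wrong sum 0x0100.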
This worked pretty well, once you got used to the ugly instruction set.
On more modern machines with larger word sizes, you don't need this as much, so INC and DEC could be semantically equivalent to ADD ...,1 etc. That in fact is what I use when I need the carry set :-}
Mostly, I stay away from INC and DEC now, because they do partial condition code updates, and this can cause funny stalls in the pipeline, and ADD/SUB don't. So where it doesn't matter (most places), I use ADD/SUB to avoid the stalls. I use INC/DEC only when keeping the code small matters, e.g., fitting in a cache line where the size of one or two instructions makes enough difference to matter. This is probably pointless nano[literally!]-optimization, but I'm pretty old-school in my coding habits.
My explanation tells us why INC/DEC set the Zero flag (Z). I don't have a particularly compelling explanation for why INC/DEC set the Sign flag (and the Parity flag).
EDIT April 2016: It seems that the stall problem is handled better on modern x86s. See INC instruction vs ADD 1: Does it matter?
dec / jge to loop from n down to 0, instead of n down to 1 (i.e. fall through when dec produces -1 instead of 0). This is occasionally useful. – Felonious
cmova and cmovbe are still 2 uops because of 4 total inputs, unlike other cmov instructions that only need 3. jcc and setcc are always 1 uop even if they need 2 separate flag inputs). See @Bee's answer on What is a Partial Flag Stall? – Felonious
The question of why the sign flag is set when inc/dec already set the zero flag is best addressed with another question: would you rather do without option a?
a) for (n=7;n>=0;n--) // translates to `dec + jns`
b) for (n=8;n>0;n--) // translates to `dec + jnz`
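The difference between the two loops can be illustrated with a small model of the flags. This is a sketch in Python under stated assumptions, not real compiler output: `dec_flags`, `loop_dec_jns`, and `loop_dec_jnz` are hypothetical helpers, the register is assumed to be 8 bits wide, and each loop is modelled as a body followed by a decrement and a conditional jump.

```python
def dec_flags(value, bits=8):
    """Return (result, ZF, SF) for a decrement of a `bits`-wide register."""
    mask = (1 << bits) - 1
    result = (value - 1) & mask
    zf = 1 if result == 0 else 0
    sf = (result >> (bits - 1)) & 1  # sign flag is the top bit of the result
    return result, zf, sf


def loop_dec_jns(start):
    """Loop body, then dec; jump back while SF is clear (dec + jns)."""
    visited = []
    n = start
    while True:
        visited.append(n)            # loop body sees the current n
        n, zf, sf = dec_flags(n)
        if sf:                       # result went negative: fall through
            break
    return visited


def loop_dec_jnz(start):
    """Loop body, then dec; jump back while ZF is clear (dec + jnz)."""
    visited = []
    n = start
    while True:
        visited.append(n)
        n, zf, sf = dec_flags(n)
        if zf:                       # result reached zero: fall through
            break
    return visited
```

In this model `loop_dec_jns(7)` visits 7 down to 0 and `loop_dec_jnz(8)` visits 8 down to 1. Both run eight iterations, but only the sign-flag version lets the body see index 0.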
As Ira Baxter already clarified, the Carry flag is used in a lot of algorithms: not only multiprecision arithmetic but also, say, bitmap processing in the monochrome/CGA/EGA era. This shifts an 80-pixel-wide row one pixel to the right:
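        mov cx, 10      ; 80 pixels at 1 bit per pixel = 10 bytes
begin:  lodsb           ; AL = next byte of the row
        rcr al, 1       ; rotate through carry: CF enters bit 7, bit 0 leaves into CF
        stosb           ; store the shifted byte; for the algorithm to work, carry must not be destroyed
        loop begin      ; LOOP decrements CX without touching the flags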
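The effect of the rotate-through-carry loop can be modelled in a few lines. This is a sketch in Python, not a translation of the assembly: `rcr8` and `shift_row_right` are hypothetical helpers, and the carry is assumed to start at 0 (the snippet above does not show how CF is initialised).

```python
def rcr8(value, carry_in):
    """Rotate an 8-bit value right by one through carry.

    Returns (result, carry_out): the old carry enters bit 7,
    and the old bit 0 becomes the new carry.
    """
    carry_out = value & 1
    result = (carry_in << 7) | (value >> 1)
    return result, carry_out


def shift_row_right(row, carry=0):
    """Shift a row of bytes right by one bit, carrying between bytes."""
    out = []
    for byte in row:  # stands in for LOOP, which leaves the flags alone
        shifted, carry = rcr8(byte, carry)
        out.append(shifted)
    return out, carry
```

For example, `shift_row_right([0x01, 0x00])` returns `([0x00, 0x80], 0)`: the lowest bit of the first byte travels through the carry into the top bit of the second byte. If the loop-control instruction overwrote the carry between iterations, that bit would be lost.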
But then: why parity?
I believe the answer is: why not? This instruction set dates from the late 1970s, when transistors were scarce. Suppressing the parity calculation for a few particular instructions would not have made sense; it would only have added complexity to the CPU.
The instructions inc and dec are typically used to maintain iteration or loop count. Using 32 bits, the number of iterations can be as high as 4,294,967,295. This number is sufficiently large for most applications. What if we need a count that is greater than this? Do we have to use add instead of inc? This leads to the second, and the main, reason.
The condition detected by the carry flag can also be detected by the zero flag. Why? Because inc and dec change the number only by 1. For example, suppose that the ECX register has reached its maximum value 4,294,967,295 (FFFFFFFFH). If we then execute
inc ECX
we would normally expect the carry flag to be set to 1. However, we can detect this condition by noting that ECX = 0, which sets the zero flag. Thus, setting the carry flag is really redundant for these instructions.
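The redundancy argument can be checked exhaustively on a small model. This is a hedged sketch in Python: `add1_flags` and `sub1_flags` are hypothetical helpers, and an 8-bit register is assumed so the check stays small (the same reasoning applies to 32 bits). It compares the zero flag with the carry that ADD value,1 or SUB value,1 would produce.

```python
def add1_flags(value, bits=8):
    """Return (result, ZF, CF) as ADD value,1 would set them."""
    mask = (1 << bits) - 1
    total = value + 1
    result = total & mask
    return result, int(result == 0), int(total > mask)


def sub1_flags(value, bits=8):
    """Return (result, ZF, CF) as SUB value,1 would set them (CF = borrow)."""
    mask = (1 << bits) - 1
    result = (value - 1) & mask
    return result, int(result == 0), int(value == 0)


# Does ZF carry the same information as CF for every 8-bit input?
inc_agrees = all(zf == cf for _, zf, cf in map(add1_flags, range(256)))
dec_agrees = all(zf == cf for _, zf, cf in map(sub1_flags, range(256)))
```

In this model `inc_agrees` is `True`, which matches the argument for incrementing. `dec_agrees` is `False`: decrementing 1 sets the zero flag with no borrow, while the borrow only occurs on the following decrement of 0, so for decrementing the zero flag fires one step before the wraparound.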
inc, but not for dec. For dec, ZF=1 (a result of 0) detects the iteration before carry (wraparound from 0 to 2^32-1). I made the same comment on another answer that also made this argument. – Felonious
Because there is no need to affect it. It is quite enough to check the Zero flag. So, after inc and dec instructions the carry flag remains the same, and in some cases this is useful.
inc: carry goes from 0xFFFF... to 0, so ZF and CF are equal. But dec goes from 0 to 0xFFFF..., so ZF is set on the decrement before carry. The real reason, I think, is that sub reg,1 is available if you need CF, and normally you don't need the CF result. So preserving CF makes it possible to use it mixed with ADC, shifts, RCL, and other stuff that keeps something in CF. (Of course this design decision was made for 8086, long before partial-flag stalls were on the radar as a future problem. At that point loop was efficient, so flagless looping was easy.) – Felonious