Arithmetic identities and EFLAGS

Asked 5/6, 2020 at 15:29 Answered 5/6, 2020 at 16:24

Since −x = not(x)+1 which then implies a-b = a+not(b)+1, would then

sub rax, rcx

be equivalent to

mov temp, rcx
not temp
add rax, temp
add rax, 1

where temp is some register considered to be volatile?

In other words, does the latter affect EFLAGS in the exact same way? If not, how can it be forced to?

Profluent answered 5/6, 2020 at 15:29 Comment(7)

Although you seem to recognise it, given that it's part of your question, I do want to note that there's no reason to assume that the effect on EFLAGS would be the same. It being mathematically equivalent doesn't suggest that the effect on the state of the processor would be identical (ignoring temp). – Sidewalk 5/6, 2020 at 15:43

Is sub broken in your CPU? - aside: perhaps you could avoid temp with not rcx; add rax, rcx; add rax, 1; not rcx;? – Metternich 5/6, 2020 at 15:49

@500 No, and I suppose you meant to swap the latter two proposed instructions – Profluent 5/6, 2020 at 15:51

@ThomasJager But theoretically speaking, it could be the case, which is the question. – Profluent 5/6, 2020 at 15:52

@super: To preserve the flags, you mean? Good point. – Metternich 5/6, 2020 at 15:54

@500-InternalServerError: not doesn't affect flags, unlike most other ALU instructions. (felixcloutier.com/x86/not#flags-affected) – Thalassa 5/6, 2020 at 15:59

@PeterCordes Oh didn't know – Profluent 5/6, 2020 at 16:2

Yes, that gets the same integer result in RAX.

In other words, does the latter affect EFLAGS in the exact same way?

Of course not. ZF, SF, and PF only depend on the integer result, but CF and OF¹ depend on how you get there. x86's CF carry flag is a borrow output from subtraction. (Unlike some ISAs such as ARM, where subtraction sets the carry flag if there was no borrow.)

Trivial counterexample you could check in your head:
0 - 1 with sub sets CF=1. But your way clears CF.

mov temp, rcx        # no effect on FLAGS
not temp             # no effect on FLAGS, unlike most other x86 ALU instructions
add rax, ~1 = 0xFF..FE     # 0 + anything  clears CF
add rax, 1                 # 0xFE + 1 = 0xFF..FF = -1.  clears CF

(Fun fact: not doesn't affect FLAGS, unlike most other ALU instructions including neg. neg sets flags the same as sub from 0. A strange quirk of x86 history. https://www.felixcloutier.com/x86/not#flags-affected)

Footnote 1: so does AF, the half-carry flag (auxiliary) from the low to high nibble in the low byte. You can't branch on it directly, and x86-64 removed the BCD instructions like aaa that read it, but it's still there in RFLAGS where you can read it with pushf / pop rax for example.

If not, how can it be forced to?

Use different instructions. The easiest and most efficient way to get the desired effect on EFLAGS would be to optimize it back to sub rax, rcx. That's why x86 has sub and sbb instructions. If that's what you want, use it.

If you want an alternative, you definitely need to avoid something like add rax,1 as the last step. That would set CF only if the final result is zero, wrapping from ULONG_MAX = -1.

Doing x -= y as x += -y works for OF in most cases. (But not the most-negative number y=LONG_MIN (1UL<<63), where neg rcx would overflow).

But CF tells you about the 65-bit full result of 64 + 64-bit addition or subtraction. 64-bit negation isn't sufficient: x += -y doesn't always set CF opposite of what x -= y would.

Possibly something involving neg / sbb could be useful? But no, that treats carry-out from negation as -0 / -1, not -(1<<64).

# Broken attempt that fails for CF when rcx=0 at least, probably many more cases.
# Also fails for OF for rcx=0x8000000000000000 = LONG_MIN
mov temp, rcx        # no effect on FLAGS
neg temp             # or NOT + INC  if you insist on avoiding sub-like operations
add rax, temp        # x += -y
cmc                  # complement carry.  CF = !CF

Notice that we combine x and y in a single step. Your add rax, 1 at the end steps on the earlier CF result, making it even less likely / possible for CF to be what you want.

Signed-overflow (OF) has a corner case. It would be the same for most inputs, where the signed arithmetic operation is the same for x -= y or x += -y. But if -y overflows to still be negative (the most-negative 2's complement number has no inverse), it's adding a negative instead of subtracting a negative.

e.g. -LONG_MIN == LONG_MIN because of signed overflow. (C notation; signed overflow is UB in ISO C, but in asm it wraps).

Counterexample for this attempt for CF:

-1 - 0 doesn't borrow, so CF=0. -1 + -0 = -1 + 0 doesn't carry either, and then CMC will flip CF to 1

But -1 (0xff...ff) plus any other number does carry-out, while -1 minus any number doesn't.

So it's not easy, and probably not very interesting to emulate the borrow output of sub accurately.

Note that hardware ALUs often use something like a binary Adder–subtractor that muxes A or ~A as an input to full-adders in a carry/borrow aware way to implement A + B or A - B with a correct borrow output for subtraction.

It should be possible to use stc / adc dst, inverted_src in asm to replicate what hardware like that actually does: addition of the inverse with a carry-in of 1. Not separately adding 1.

(TODO: rewrite more of this answer to show using not / stc / adc instead of multiple operations that potentially need to propagate carry all the way through the number).

When are the carry flags set by x86 negation (NEG) / subtraction (SUB)?

Thalassa answered 5/6, 2020 at 16:24 Comment(14)

Hi Peter, When you do a 0-1 in binary, where are you borrowing from? there i no bit/number on the left side of 0 so where is the "borrowed" 1 coming from? is this where a flags gets set to indicate a non existent borrow? I found this post which explains my question but the answers are not sufficient: #46571441 – Oribel 29/7, 2021 at 3:38

@Dan: From the next higher bit position. Just like when you add 1+1 in binary, there's not enough room to store the 10 result, so the result for that bit is 0 with a carry-out of 1. (en.wikipedia.org/wiki/Adder_(electronics)#Full_adder). 0-1 = 1 with a borrow output of 1. en.wikipedia.org/wiki/Subtractor#Full_subtractor – Thalassa 29/7, 2021 at 3:55

But the next higher bit position is 0 in this case. Assume we have an 8 bit cpu then run 0b00000000 - 0b00000001. first inputs bits are all 0s, nothing to borrow from? – Oribel 29/7, 2021 at 3:57

@Dan: Right, borrow propagates infinitely through zeros, that's why 2's complement works that way. The result of that sub is of course 0b11111111, with a borrow output of 1, like you'd expect from building a binary subtractors out of eight en.wikipedia.org/wiki/Subtractor#Full_subtractor blocks with ripple carry, or any optimized equivalent. On x86 for example, CF (the carry flag) gets set from the borrow output. On ARM, the C (carry flag) = !borrow from sub / sbc instructions. (But that's what sbc is expecting so sub / sbc is still how you do a 2-register subtraction). – Thalassa 29/7, 2021 at 4:2

So is the CPU just giving you a 1 out of thing air and sets a flag? how would this work if I subtract by hand on paper? I go all the way to the left and see no 1s to borrow, then what should be done? I can't even find a youtube video which explains this. – Oribel 29/7, 2021 at 4:10

@Dan: On paper by hand you use sign/magnitude arbitrary-precision decimal - you know the result is negative if there's a borrow coming out of the most significant digit of the first operand. In binary we normally use 2's complement, which works like a chain of full subtractors. (That's not a coincidence because it's the same way that unsigned binary wrapping operations works.) To put it another way, you're asking about a subtraction that produces -1 (or 0xff). The existence of negative numbers are what make it possible to subtract from 0. 0 - 1 is just math (or logic gates, see wiki). – Thalassa 29/7, 2021 at 4:17

Sorry let me clear my question with this drawing: app.sketchtogether.com/s/sketch/M1HCT.2.1 , this is showing 0b10 - 0b01. It first borrows a 1 from the MSB of 0b10, then continues with the calculation. But in 0b00 - 0b01, there is nothing to borrow from, this is what is confusing me. 0b00 is all 0's, there is no 1 to borrow from any of its bits. – Oribel 29/7, 2021 at 4:25

@Dan: Right, 0b10 - 0b01 = 0b01, including a borrow into the 2nd bit from the 0-1 in the low bit. But there's no further borrow out of that bit, so the borrow output of the whole 2-bit subtraction (out of the most-significant full-subtractor) is 0. In 0 - 1, there is a borrow output from the whole thing. i.e. the left hand operand was unsigned-below the right-hand operand. i.e. 0 - 1 = -1 doesn't have anything to borrow from, so the result is negative. i.e. 0b11 (no signed overflow, just unsigned carry). – Thalassa 29/7, 2021 at 4:28

@Dan: Think through the logic of a 1-bit full subtractor (en.wikipedia.org/wiki/Subtractor#Full_subtractor) and how that works on its own. Keep in mind that borrow (like carry) propagates from LSB to MSB; it doesn't matter what's there to borrow from. Once you understand the details of what actually happens, you can start thinking about how to assign mathematical meaning to those bits for unsigned or signed 2's complement interpretations of those bits. (A binary sign/magnitude would work differently). – Thalassa 29/7, 2021 at 5:19

Thanks again for the explanation. – Oribel 29/7, 2021 at 17:37

I had to ask another related question: #68583309 – Oribel 29/7, 2021 at 21:9

I found this answer which explains adder/subtracter are the same hardware? stackoverflow.com/a/56279548 , I fully understand your explanation where the method won't set flags correctly, so how would it work according to the link I pasted? – Oribel 2/8, 2021 at 18:15

maybe above is mips/risc specific and x64 does it differently? – Oribel 2/8, 2021 at 18:32

Also related: Using One's Complement In Place of Directly Subtracting Two Binary Numbers has an example and discussion of the fact that subtraction by adding the inverse needs to be a single add with the +1 as a carry-in, not done separately, if you want the FLAGS results to be meaningful. – Thalassa 20/11, 2022 at 4:42

No, they're not equivalent. For instance if rax = 1 and rcx = 3, then sub rax, rcx will set the carry flag, because you are subtracting a larger number from a smaller one. But in your second sequence of instructions, following add rax, temp, rax will contain -3 (i.e. 0xfffffffffffffffd), and adding 1 to -3 does not cause a carry. So after your second sequence of instructions, the carry flag would be cleared.

I do not know of any simple way to exactly emulate the behavior of sub including its effect on flags (other than by using cmp, but that's cheating because it's really just sub under the hood). In principle, you could write a long sequence of instructions that manually did all the same tests that sub does internally (referring to its precise description in the instruction set manual), and sets the flags at the end using sahf or popf of the like.

This would be a lot of work, especially if you are not going to use cmp, and I am not going to go through it for this answer. Especially because I also can't think of any reason why one would need to do it, except as a fairly pointless exercise.

Virology answered 5/6, 2020 at 16:3 Comment(0)