Assembly comparison flags understanding

Asked 31/12, 2012 at 11:36 Answered 24/5, 2013 at 13:2

I am struggling to understand the following code snippet in assembler:

if ( EAX >= 5 )
  EBX = 1;
else
  EBX = 2;

In assembler this can be written as follows (according to my book), emulating the jge instruction you'd normally use in terms of "simpler" branches that only look at one flag at once:

    cmp eax, 5       ;(assuming eax is signed)
    js signon        ;goto signon if SF = 1
    jo elseblock     ;goto elseblock if OF = 1 and SF = 0
    jmp thenblock    ;goto thenblock if SF = 0 and OF = 0
signon:
    jo thenblock     ;goto thenblock if SF = 1 and OF = 1
elseblock:
    mov ebx, 2
    jmp next
thenblock:
    mov ebx, 1
next:

I can understand that, when EAX >= 5, the resulting flags can be SF = 0 & OF = 0. But I cannot understand how the flags can be SF = 1 & OF = 1. What computation gives this?

To clarify what I mean:

If eax is near the lower negative bound, subtracting 5 could overflow into the positive range. But if it is near the upper positive bound, subtracting 5 could not overflow into the negative range, could it?

Stinking answered 31/12, 2012 at 11:36 Comment(14)

"The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. " – Obannon 31/12, 2012 at 11:38

Yes I understand, but how can eax overflow into a negative number? I can understand how SF = 0 & OF = 1 does not satisfy the condition but not how SF = 1 & OF = 1 does. – Stinking 31/12, 2012 at 11:42

What book/documentation/website are you reading? SF = 1 means the result is negative, OF = 1 means the operation caused an over/underflow with signed operands. If they are both set, it means that the left operand was a large positive number and the right operand was a small negative number, enough to cause the result to roll over from positive to negative. This is not possible when the second operand is 5. (Edit: In my opinion, the best book for learning this is Reversing - Secrets of Reverse Engineering.) – Obannon 31/12, 2012 at 12:11

@DCoder I have updated the question. Please have a look at it. Your help is very much appreciated! – Stinking 31/12, 2012 at 13:59

What book is this? Why is it bothering with multiple conditional branches, when a simple jge will do the same and be a lot more readable? I think that whole js/jo/jmp/jo block was copy-pasted from a generalized solution without bothering to adapt it to the specific context. – Obannon 31/12, 2012 at 14:2

SF also means unsigned overflow (as well as carry out). OF is signed overflow. Normally you use one or the other (both are computed, you normally only care about one) – Misconstruction 31/12, 2012 at 14:56

It means that sign bit is set, but there is no carry. If I recall correctly, OF = SF xor CF; So it means that you had positive number and it became negative, thus data was lost. – Treatment 31/12, 2012 at 15:11

@dwelch SF is not related to unsigned overflow. It is just the highest bit=1. – Treatment 31/12, 2012 at 15:13

OK so from your answers I take that the condition SF=1 & OF = 1 will never happen in the above case? @DCoder The book is called PC Assembly Language. The jge will be introduced in the following chapter. Thank you all for your comments! – Stinking 31/12, 2012 at 15:17

Example(8 bit): 0111 1111 + 0000 0001 -> 1000 0000 => OF = 1, SF = 1, CF = 0. I guess author just wanted to explain logic behind these flags by such an example. – Treatment 31/12, 2012 at 15:20

But how can 0000 0001 be "added" when you subtract 0000 0001 with the cmp instruction? – Stinking 31/12, 2012 at 15:27

I see a question by topicstarter: "What computation gives this?" and it is a general question, not only about cmp. imho. – Treatment 31/12, 2012 at 15:53

right I was thinking carry flag CF not SF signed flag. thanks. – Misconstruction 31/12, 2012 at 16:0

Is this book the Daily WTF? That's horrible code, I hope that's just some sort of weird example and I don't ever run across something like that in production anywhere! – Neutron 24/5, 2013 at 14:44

Much easier to think of these in terms of 3 bit numbers, it all scales. Hmmm, if this is signed (you didn't specify/post in your high level code) then four bits is better because you used a 5. Walk through the numbers near 5 (this shows the output of the ALU):

cmp reg,5
0111 - 0101 = 0111 + 1010 + 1 = 10010
0110 - 0101 = 0110 + 1010 + 1 = 10001
0101 - 0101 = 0101 + 1010 + 1 = 10000
0100 - 0101 = 0100 + 1010 + 1 = 01111
0011 - 0101 = 0011 + 1010 + 1 = 01110

Now you have to understand how the hardware works. Some processor families invert the carry flag coming out of the ALU when you do a subtract, others don't. Either way you can definitely see a state change at the 5 - 5 point. And you don't need the carry flag here anyway, the code doesn't use it.
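For anyone who wants to reproduce the table rather than do it by hand, here is a minimal sketch of the same "add the inverted operand plus one" step. The name alu_sub4 is made up for illustration, and the 4-bit width is just the toy model used above, not anything x86 specific.

```python
def alu_sub4(a, b):
    """4-bit a - b computed the way the table does it: a + NOT(b) + 1.

    Returns the raw 5-bit adder output (bit 4 is the adder's carry out).
    """
    return (a & 0xF) + (~b & 0xF) + 1

# Same rows as the table: the numbers around 5 compared against 5.
for a in (0b0111, 0b0110, 0b0101, 0b0100, 0b0011):
    out = alu_sub4(a, 0b0101)
    print(f"{a:04b} - 0101 = {a:04b} + 1010 + 1 = {out:05b}")
```

The top bit of each printed result is the adder's raw carry out; the low four bits are what would land in the register.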

In case you are doing signed math, then try some negative numbers as well.

0000 - 0101 = 0000 + 1010 + 1 = 01011
1111 - 0101 = 1111 + 1010 + 1 = 11010
1110 - 0101 = 1110 + 1010 + 1 = 11001

And that sheds some light on the problem.

signed overflow is defined as the carry in being not equal to the carry out on the msbit of the adder. That can get messy so we just need to know where that boundary is.
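If the carry-in versus carry-out definition feels abstract, a rough sketch like the following can check it against the plain "does the true result fit in 4 bits" test. The helper names signed_overflow4 and to_signed4 are invented for this example, and it assumes the subtract is done as a + NOT(b) + 1.

```python
def signed_overflow4(a, b):
    """Signed overflow of 4-bit a - b, from the two carries around bit 3."""
    x, y = a & 0xF, ~b & 0xF                      # a - b is done as a + NOT(b) + 1
    carry_in = ((x & 0x7) + (y & 0x7) + 1) >> 3   # carry into the msbit
    carry_out = (x + y + 1) >> 4                  # carry out of the msbit
    return carry_in != carry_out

def to_signed4(v):
    """Interpret a 4-bit pattern as two's complement (-8..7)."""
    return v - 16 if v & 0x8 else v

# Cross-check: overflow happens exactly when the true difference leaves -8..7.
for a in range(16):
    for b in range(16):
        true_diff = to_signed4(a) - to_signed4(b)
        assert signed_overflow4(a, b) == (not -8 <= true_diff <= 7)
```

The loop covers every pair of 4-bit operands, so the two definitions agree across the whole toy system and not only for the compare against 5.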

0111 - 0101 = 7 - 5 = 2
0110 - 0101 = 6 - 5 = 1
0101 - 0101 = 5 - 5 = 0
0100 - 0101 = 4 - 5 = -1
0011 - 0101 = 3 - 5 = -2

and so on. Using this 4 bit model, in a signed interpretation we are limited to +7 (0b0111) down to -8 (0b1000). So after -3 - 5 we will get into trouble:

1110 - 0101 = 1110 + 1010 + 1 = 11001 , -2 - 5 = -7
1101 - 0101 = 1101 + 1010 + 1 = 11000 , -3 - 5 = -8
1100 - 0101 = 1100 + 1010 + 1 = 10111 , -4 - 5 = 7 (-9 if we had more bits)
1011 - 0101 = 1011 + 1010 + 1 = 10110 , -5 - 5 = 6 (-10 if we had more bits)
1010 - 0101 = 1010 + 1010 + 1 = 10101 , -6 - 5 = 5 (-11 if we had more bits)
1001 - 0101 = 1001 + 1010 + 1 = 10100 , -7 - 5 = 4 (-12 if we had more bits)
1000 - 0101 = 1000 + 1010 + 1 = 10011 , -8 - 5 = 3 (-13 if we had more bits)

The latter five are a signed overflow, the signed result cannot be represented in the number of bits available. (remember we are playing with a four bit system for now, that top bit is the carry bit, visually remove it when you look at the result).
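To see which SF/OF combinations actually show up for a compare against 5, the 16 possible inputs can be grouped mechanically. This is only a sketch of the 4-bit model used here; cmp_flags4 and to_signed4 are hypothetical helpers, and OF is computed from the "result does not fit" definition rather than from the carries.

```python
from collections import defaultdict

def to_signed4(v):
    """Interpret a 4-bit pattern as two's complement (-8..7)."""
    return v - 16 if v & 0x8 else v

def cmp_flags4(a, b):
    """Return (SF, OF) for a 4-bit 'cmp a, b', modelled as a - b."""
    result = (a - b) & 0xF
    sf = result >> 3                                   # msbit of the result
    true_diff = to_signed4(a & 0xF) - to_signed4(b & 0xF)
    of = int(not -8 <= true_diff <= 7)                 # result does not fit
    return sf, of

# Group every 4-bit input by the flags that 'cmp a, 5' produces.
seen = defaultdict(list)
for a in range(16):
    seen[cmp_flags4(a, 5)].append(to_signed4(a))

for flags in sorted(seen):
    print(flags, sorted(seen[flags]))
```

Only three groups come out: (0, 0) for 5 to 7, (1, 0) for -3 to 4, and (0, 1) for -8 to -4. SF = 1 with OF = 1 never appears when the second operand is 5; it takes a negative second operand, for example cmp_flags4(0b0111, 0b1111), which is 7 - (-1), to get both flags set.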

The sign flag is simply the msbit of the result, which also changes at the interesting boundaries. The sign flag (msbit of the result) is set for the positive eax values below 5 and for the negative values that do not cause a signed overflow (+4 down to -3). All of those are in the <5 category, so they want a result of 2. The first test looks for cases where the sign flag is set, so why does it bother to then test the signed overflow? For this comparison that makes no sense: we already know every result with the sign flag set is in the less-than-5 category. The extra jump on signed overflow doesn't hurt, though.

So if you fall through js signon, the sign bit is off, which means either numbers greater than or equal to 5 (want a result of 1) or numbers negative enough to cause a signed overflow (want a result of 2). jo elseblock sorts these two cases out by picking off the result-of-2 cases (signed overflow, very negative), and jmp thenblock takes the numbers 5 and above.
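The whole js / jo / jmp / jo sequence can also be walked in a few lines to confirm it behaves like a signed >=. This is a sketch under the same 4-bit assumption; ladder, flags4 and to_signed4 are names made up for the example, and the comments map each branch back to the instruction from the question.

```python
def to_signed4(v):
    """Interpret a 4-bit pattern as two's complement (-8..7)."""
    return v - 16 if v & 0x8 else v

def flags4(a, b):
    """Return (SF, OF) as booleans for a 4-bit 'cmp a, b'."""
    result = (a - b) & 0xF
    sf = bool(result & 0x8)
    of = not -8 <= to_signed4(a) - to_signed4(b) <= 7
    return sf, of

def ladder(sf, of):
    """Follow the js / jo / jmp / jo sequence and return the block reached."""
    if sf:                      # js signon
        if of:                  # signon: jo thenblock
            return "thenblock"
        return "elseblock"      # fall through into elseblock
    if of:                      # jo elseblock
        return "elseblock"
    return "thenblock"          # jmp thenblock

# The ladder should agree with a signed >= for every pair of 4-bit operands.
for a in range(16):
    for b in range(16):
        expected = "thenblock" if to_signed4(a) >= to_signed4(b) else "elseblock"
        assert ladder(*flags4(a, b)) == expected
```

Put differently, the ladder reaches thenblock exactly when SF equals OF. That is why the SF = 1, OF = 1 branch exists at all: it is needed for the general case even though a compare against 5 never takes it.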

It looks to me like you are doing signed math here (somewhat obvious from using the signed overflow flag). Since you are comparing against 5 with signed math, you need 4 or more bits in your system to implement this code, so 8, 32, 64, 123456 bits, it doesn't matter, it all works the same as a 4 bit system (for this comparison). I find it easier to minimize the number of bits to do the analysis. Hardcoded comparisons like this make it that much easier: as above, hand compute the results just above, at, and below the constant, then walk through all zeros (zero) to all ones (minus one) for signed numbers, and very negative into the signed overflow range. For unsigned numbers it is a bit easier but the same process.
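A quick way to convince yourself that the width does not matter is to run the same boundary probes at several widths. The sketch below assumes a plain two's complement model; cmp_sf_of is a hypothetical helper, and the probe list is just the "above, at, below, zero, minus one, very negative" recipe from the paragraph above.

```python
def cmp_sf_of(a, b, bits):
    """(SF, OF) for 'cmp a, b' on signed integers of the given width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    diff = a - b                                   # exact, unbounded result
    wrapped = diff & ((1 << bits) - 1)             # what fits in the register
    sf = wrapped >> (bits - 1)                     # msbit of the stored result
    of = int(not lo <= diff <= hi)                 # exact result does not fit
    return sf, of

for bits in (4, 8, 32):
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    # just above, at, and below 5; zero and minus one; both overflow edges
    probes = [hi, 6, 5, 4, 0, -1, lo + 5, lo + 4, lo]
    for a in probes:
        sf, of = cmp_sf_of(a, 5, bits)
        assert (sf == of) == (a >= 5)
    print(bits, [cmp_sf_of(a, 5, bits) for a in (5, 4, lo + 5, lo + 4)])
```

Each width prints the same four flag pairs; only the position of the overflow boundary (lo + 4 and below) moves.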

Misconstruction answered 31/12, 2012 at 16:0 Comment(3)

Hi! I have some questions after studying your answer. First, should it not be "the latter five are a signed overflow" instead of "the latter two..."? Another question I have is about the sign flag. You say that the sign flag is simply the msbit of the result. In your answer you say that the sign flag is set on the positive eax values of <5 and when there is signed overflow. But to me the sign flag must be set in the whole range of (-4<eax<5). The sign flag is not set when there is a signed overflow since the result is positive? Am I missing something? – Stinking 4/1, 2013 at 9:4

Thanks for catching those mistakes, does it make sense now or do I still have errors in the answer? – Misconstruction 4/1, 2013 at 18:16

Thanks a lot for your update @dwelch. You have really helped me with this one! – Stinking 4/1, 2013 at 23:46

if ( EAX >= 5 ) EBX = 1; else EBX = 2;

cmp eax,5
jge .biggerthan   ; signed >= (jae would be the unsigned test)
mov ebx,2
jmp .out
.biggerthan:
mov ebx,1
.out:
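One thing to watch with the short version: which conditional jump follows cmp matters once eax can be negative. jae looks only at CF (unsigned >=), while jge tests SF = OF (signed >=). Here is a rough model of the difference; cmp32, jae_taken and jge_taken are illustrative names, not real APIs, and the flag rules are the documented x86 conditions for those two jumps.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(v):
    """Interpret a 32-bit pattern as two's complement."""
    return v - (1 << 32) if v & 0x80000000 else v

def cmp32(eax, imm):
    """Flags x86 sets for 'cmp eax, imm' (32-bit), as (CF, SF, OF)."""
    a, b = eax & MASK32, imm & MASK32
    cf = a < b                                   # unsigned borrow
    sf = bool(((a - b) & MASK32) & 0x80000000)   # msbit of the result
    diff = to_signed32(a) - to_signed32(b)
    of = not -(1 << 31) <= diff <= (1 << 31) - 1
    return cf, sf, of

def jae_taken(cf, sf, of):
    return not cf          # jae: CF = 0, unsigned >=

def jge_taken(cf, sf, of):
    return sf == of        # jge: SF = OF, signed >=

for eax in (7, 5, 4, -1):
    flags = cmp32(eax, 5)
    print(eax, "jae:", jae_taken(*flags), "jge:", jge_taken(*flags))
```

The two jumps agree for 7, 5 and 4 but disagree for -1: as an unsigned value 0xFFFFFFFF is far above 5, so jae would branch, while jge would not.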

Feeder answered 24/5, 2013 at 13:2 Comment(0)
