When to use CMP & TEQ instructions in ARM Assembly?

Why does ARM provide two separate instructions instead of one? In practice, in what kinds of situations do we need CMP, and in which do we need TEQ?

I already know how both instructions work.

Curse answered 4/9, 2019 at 11:28 Comment(9)
From the usage notes in ARM DDI 0100E: "TEQ is used to test if two values are equal, without affecting the V flag (as CMP does). The C flag is also unaffected in many cases. TEQ is also useful for testing whether the two values have the same sign. After the comparison, the N flag is the logical Exclusive OR of the sign bits of the two operands."Royroyal
@Michael, I assume the OP is wondering why ARM did implement a TEQ instruction whereas the CMP instruction can be also used to compare two values (even if the flags setting is different).Herlindaherm
@GuillaumePetitjean I was replying to the "Practically in what kind of situations we need to use CMP and TEQ instructions."-part (though it's more "can" than "need to").Royroyal
@Michael. Gotcha. I did a quick test on godbolt with arm-none-eabi-gcc, and the code generated for both if (a != b) and if (a > b) always uses a CMP instruction. Indeed, it's not easy to understand the need for two separate instructions.Herlindaherm
@GuillaumePetitjean Well, they don't do the same thing. CMP sets the flag based on op1 - op2, while TEQ sets the flags based on op1 XOR op2. So CMP can check for the ordering of two values (==, >, <, etc). TEQ on the other hand can check for equality and whether the signs are the same.Royroyal
Sure, I understand this. But you can do what TEQ does with a CMP, right? Perhaps TEQ is faster on some MCUs? Or is there another reason to have both instructions?Herlindaherm
How would you check if the two operands have the same sign with a single CMP?Royroyal
typical ISA designs use flags.Faenza
godbolt.org/z/07WwSI This returns 0 if the signs differ, or the product otherwise. Of course you could do the multiply and then just clamp to zero. However, MUL is actually expensive on some CPUs, so this would be faster. It is also possible to test multiple conditions at once; gcc doesn't seem to even try this, but an assembly programmer can. I.e., you can set 'N' and 'C' from different operands but test for both. It might also be useful in combination with subs, etc., which do destructive testing as opposed to cmp.Anorthite

Short answer: the two serve different purposes. cmp is subs without a destination register, while teq is eors without a destination register.

cmp is very straightforward: you compare two numbers A and B
signed:
gt: A > B
ge: A >= B
eq: A == B
le: A <= B
lt: A < B

unsigned:
hi: A > B
hs: A >= B
eq: A == B
ls: A <= B
lo: A < B
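
As a rough illustration of where these condition codes come from, here is a minimal C sketch that models the flags CMP A, B leaves behind and how each condition reads them. The names flags_t, cmp_flags and cond_* are made up for this sketch; only the flag semantics are meant to follow the ARM architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* NZCV flags as CMP a, b would leave them (hypothetical helper type). */
typedef struct {
    bool n, z, c, v;
} flags_t;

/* Model of CMP a, b: computes a - b and keeps only the flags. */
static inline flags_t cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;                      /* wraps modulo 2^32 */
    flags_t f;
    f.n = (r >> 31) != 0;                    /* result has its sign bit set */
    f.z = (r == 0);                          /* operands are equal */
    f.c = (a >= b);                          /* set when no borrow occurred */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;  /* signed overflow */
    return f;
}

/* Equality */
static inline bool cond_eq(flags_t f) { return f.z; }

/* Signed conditions */
static inline bool cond_gt(flags_t f) { return !f.z && (f.n == f.v); }
static inline bool cond_ge(flags_t f) { return f.n == f.v; }
static inline bool cond_le(flags_t f) { return f.z || (f.n != f.v); }
static inline bool cond_lt(flags_t f) { return f.n != f.v; }

/* Unsigned conditions */
static inline bool cond_hi(flags_t f) { return f.c && !f.z; }
static inline bool cond_hs(flags_t f) { return f.c; }
static inline bool cond_ls(flags_t f) { return !f.c || f.z; }
static inline bool cond_lo(flags_t f) { return !f.c; }
```

The same bit pattern can be "less than" in the signed sense and "higher" in the unsigned sense: comparing 0xFFFFFFFF with 1 satisfies both lt and hi, because lt reads N and V while hi reads C and Z.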

Let's assume the problem below though:

int32_t foo(int32_t A)
{
    if (((A < 0) && ((A & 1) == 1)) || ((A >= 0) && ((A & 1) == 0)))
    {
        A += 1;
    }
    else
    {
        A -= 1;
    }

    return A;
}

In plain language, the if condition is true if A is either an odd negative number or an even non-negative number, and Linaro GCC 7.4.1 at -O3 generates the mess below:

foo
        0x00000000:    CMP      r0,#0
        0x00000004:    AND      r3,r0,#1
        0x00000008:    BLT      {pc}+0x14 ; 0x1c
        0x0000000C:    CMP      r3,#0
        0x00000010:    BEQ      {pc}+0x14 ; 0x24
        0x00000014:    SUB      r0,r0,#1
        0x00000018:    BX       lr
        0x0000001C:    CMP      r3,#0
        0x00000020:    BEQ      {pc}-0xc ; 0x14
        0x00000024:    ADD      r0,r0,#1
        0x00000028:    BX       lr

People familiar with bit hacks would rewrite the if statement as follows:

int32_t bar(int32_t A)
{
    if ((A ^ (A<<31)) >= 0)
    {
        A += 1;
    }
    else
    {
        A -= 1;
    }

    return A;
}

And the results are:

bar
        0x0000002C:    EORS     r3,r0,r0,LSL #31
        0x00000030:    ADDPL    r0,r0,#1
        0x00000034:    SUBMI    r0,r0,#1
        0x00000038:    BX       lr
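
If the rewrite looks suspicious, the two conditions can be checked against each other. The sketch below is only an illustration: cond_foo, cond_bar and conditions_agree are hypothetical names, and cond_bar does the shift on unsigned values because left-shifting a negative signed integer is not well defined in C, while the bit pattern is the same one EORS operates on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition from foo(): odd negative or even non-negative. */
static inline bool cond_foo(int32_t a)
{
    return ((a < 0) && ((a & 1) == 1)) || ((a >= 0) && ((a & 1) == 0));
}

/* Condition from bar(), computed on unsigned values so the shift is defined. */
static inline bool cond_bar(int32_t a)
{
    uint32_t u = (uint32_t)a;
    uint32_t x = u ^ (u << 31);  /* bit 31 = sign bit XOR bit 0 */
    return (x >> 31) == 0;       /* clear bit 31 corresponds to PL */
}

/* Returns true if both conditions agree for every value in [lo, hi]. */
static inline bool conditions_agree(int32_t lo, int32_t hi)
{
    for (int64_t a = lo; a <= hi; ++a) {
        if (cond_foo((int32_t)a) != cond_bar((int32_t)a))
            return false;
    }
    return true;
}
```

The trick works because A << 31 moves bit 0 (the odd/even bit) into the sign position, so the sign of the XOR result says whether the sign bit and bit 0 of A differ.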

And finally, assembly programmers will replace EORS with teq r0, r0, lsl #31.

It won't make the code any faster, but it doesn't need R3 as the scratch register.
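
To make the "eors without a destination" idea concrete, here is a small C model of what TEQ leaves in N and Z. It is a sketch under assumptions: teq_flags_t, teq_flags and the three helpers are invented names, and the C flag (which a shifted operand can change) is deliberately not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* N and Z as TEQ a, b would leave them (hypothetical helper type). */
typedef struct {
    bool n, z;
} teq_flags_t;

/* Model of TEQ a, b: computes a XOR b and keeps only N and Z. */
static inline teq_flags_t teq_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a ^ b;
    teq_flags_t f;
    f.n = (r >> 31) != 0;  /* set when the sign bits differ */
    f.z = (r == 0);        /* set when the operands are equal */
    return f;
}

/* TEQ a, b followed by an EQ test. */
static inline bool teq_equal(int32_t a, int32_t b)
{
    return teq_flags((uint32_t)a, (uint32_t)b).z;
}

/* TEQ a, b followed by a PL test: true when both have the same sign. */
static inline bool teq_same_sign(int32_t a, int32_t b)
{
    return !teq_flags((uint32_t)a, (uint32_t)b).n;
}

/* Model of "teq r0, r0, lsl #31" followed by a PL test. */
static inline bool teq_sign_matches_bit0(int32_t a)
{
    uint32_t u = (uint32_t)a;
    return !teq_flags(u, u << 31).n;
}
```

Nothing is written back to a register in any of these, which is the whole point: the flags are the only output.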

Note that the code above is just a showcase: it is a separate function with plenty of free registers.

In real life, however, registers are by far the scarcest resource, especially inside a loop, and even compilers will make use of the teq instruction in similar situations.

To sum up, there are fields such as error correction and encryption/decryption where tons of xor operations are done, and people dealing with those problems appreciate instructions such as teq and know when to use them.

And always remember: never trust compilers

Stonechat answered 4/9, 2019 at 16:9 Comment(3)
I think the important point is that the surrounding instructions as opposed to isolation show the benefit of TEQ.Anorthite
@artlessnoise You have a very good point, but it's kinda hard to explain by short examples. My focus was on the nature of the xor operations since I had the feeling that the OP knows them only from textbooks.Caledonian
That is a great point, so it is also worth mentioning tst. So you have an equivalent of subs, ands and eors with cmp, tst, and teq. ldr is sort of similar with pld. All are instructions that don't update registers, but do get the side effects of what the other operation would do. But basically if you don't put restriction on register use, they don't make much sense (which goes back to my first point).Anorthite
