Are If Thens faster than multiplication and assignment?
Asked Answered
M

10

7

I have a quick question, suppose I have the following code and it's repeated in a simliar way 10 times for example.

if blah then
    number = number + 2^n
end if

Would it be faster to evaluate:

number = number + blah*2^n?

Which also brings the question, can you multiply a boolean value times a integer (Although I am not sure the type that is returned from 2^n, is it an integer or unsigned..etc)? (I'm working in Ada, but let's try to generalize this maybe?)

EDIT: Sorry I should clarify I am looking at 2 to the power of n, and I put c in there cause I was interested for my own learning in the future if I ever run into this problem in c and I think there are more c programmers out there on these boards then Ada (I'm assuming and you know what that means), however my current problem is in the Ada language, but the question should be fairly language independent (I hope).

Monopteros answered 26/10, 2010 at 13:35 Comment(9)
The caret ^ operator means XOR in C, so just keep that in mind.Haig
Be careful. Since C does not have a built in boolean type, there is no guarantee that blah is equal to either 1 or 0. Some functions which return true or false might choose to return something other than 1 in place of true.Lifesize
@Lifesize Thanks I didn't realize that boolean doesn't always mean 0 and 1, that could have ended up being a hard lesson to learn.Monopteros
There are not a few Ada programmers hanging around StackOverflow, and we've pretty much all got RSS feeds (or something comparable) set up to watch for the 'Ada' tag, so don't worry about an Ada programmer not noticing an Ada question :-)Bubonocele
@Marc C Thanks I guess I made an assumption because I don't see nearly as many ada questions (however that might be my failure cause I don't look around a lot for it), so I just assumed that the more questions of one type there were, brought more of those type of programmers around, my bad, thank's for enlightening me :)Monopteros
@Marc C - That's pretty slick. I'm just checking by hand. He's right that this is really a language-independent question though. The only wrinkle Ada adds is that its compilers have more information to allow for an even better job optimizing. So what is true for C (don't micro-optimize like this) is even more true for Ada.Crosspollination
@Brian: C has a boolean type, _Bool, aliased to bool in stdbool.h. Also, while any nonzero value counts as true for conditionals, the result of a boolean expression (comparison operators, logical and/or/not operators, ...) is always 0 or 1.Ashy
By the way, I hope you realize blah*2^n is blah<<n (bit-shift) and one of the fastest possible operations.Ashy
@R..: You're right, C99 did introduce an explicit boolean type. I'd still be careful, though. I actually have had this issue bite me before, particularly when interacting between code across different programming languages.Lifesize
B
8

if we are talking about C and blah is not within your control, then just do this:

if(blah) number += (1<<n);

There is really not a boolean in C and does not need to be, false is zero and true is not zero, so you cannot assume that not zero is 1 which is what you would need for your solution, nor can you assume that any particular bit in blah is set, for example:

number += (blah&1)<<n;

Would not necessarily work either because 0x2 or 0x4 or anything non-zero with bit zero clear is considered a true. Typically you will find 0xFFF...FFFF (minus one, or all ones) used as true, but you cannot rely on typical.

Now, if you are in complete control over the value in blah, and keep it strictly to a 0 for false and 1 for true then you could do what you were asking about:

number += blah<<n;

And avoid the potential for a branch, extra cache line fill, etc.

Back to the generic case though, taking this generic solution:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    if(blah) number += (1<<n);
    return(number);
}

And compiling for the two most popular/used platforms:

    testl   %edi, %edi
    movl    %edx, %eax
    je  .L2
    movl    $1, %edx
    movl    %esi, %ecx
    sall    %cl, %edx
    addl    %edx, %eax
.L2:

The above uses a conditional branch.

The one below uses conditional execution, no branch, no pipeline flush, is deterministic.

  cmp   r0,#0
  movne r3,#1
  addne r2,r2,r3,asl r1
  mov   r0,r2
  bx    lr

Could have saved the mov r0,r2 instruction by re-arranging the arguments in the function call, but that is academic, you wouldnt burn a function call on this normally.

EDIT:

As suggested:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    number += ((blah!=0)&1)<<n;
    return(number);
}
  subs  r0, r0, #0
  movne r0, #1
  add   r0, r2, r0, asl r1
  bx    lr

Certainly cheaper, and the code looks good, but I wouldnt make assumptions that the result of blah!=0, which is zero or whatever the compiler has defined as true always has the lsbit set. It doesnt have to have that bit set for the compiler to generate working code. Perhaps the standards dictate the specific value for true. by re-arranging the function parameters the if(blah) number +=... will also result in three single clock instructions and not have assumptions.

EDIT2:

Looking at what I understand to be the C99 standard:

The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false.

Which explains why the above edit works and why you get the movne r0,#1 and not some other random number.

The poster was asking the question with regards to C but also noted that ADA was the current language, from a language independent perspective you should not assume "features" like the C feature above and use an if(blah) number = number + (1<<n). But this was asked with a C tag so the generically (processor independent) fastest result for C is, I think, number += (blah!=0)<<n; So Steven Wright's comment had it right and he should get credit for this.

The posters assumption is also basically correct, if you can get blah into a 0 or 1 form then using it in the math is faster in the sense that there is no branch. Getting it into that form without it being more expensive than a branch is the trick.

Bighorn answered 27/10, 2010 at 14:18 Comment(4)
What about number += ((blah!=0)&1)<<n; ?Remora
the result of blah!=0 is either 0 for false or true which is not deterministic.Bighorn
Reading an answer to a similar SO question the standard may dictate that the intermediate comparison does return a 1 or 0, if that is true and the compiler meets that standard everywhere then just do number += (blah!=0)<<n; I am still waiting on a good standard to come out and for a compiler that actually meets any standard so I would rather play it safe. (blah!=0)<<n; should result in three instructions on ARM, so not any faster than if(blah) number += 1<<n; for that architecture. for x86 is should be quite a bit faster.Bighorn
thanks, remember to give simon a +1 for his contribution. and pay it forward (help someone else on stackoverflow out)Bighorn
T
10

There is no general answer to such a question, this depends a lot on your compiler and CPU. Modern CPU have conditional move instructions, so everything is possible.

The only ways to know here are to inspect the assembler that is produced (usually -S as compiler option) and to measure.

Timid answered 26/10, 2010 at 13:45 Comment(0)
B
8

if we are talking about C and blah is not within your control, then just do this:

if(blah) number += (1<<n);

There is really not a boolean in C and does not need to be, false is zero and true is not zero, so you cannot assume that not zero is 1 which is what you would need for your solution, nor can you assume that any particular bit in blah is set, for example:

number += (blah&1)<<n;

Would not necessarily work either because 0x2 or 0x4 or anything non-zero with bit zero clear is considered a true. Typically you will find 0xFFF...FFFF (minus one, or all ones) used as true, but you cannot rely on typical.

Now, if you are in complete control over the value in blah, and keep it strictly to a 0 for false and 1 for true then you could do what you were asking about:

number += blah<<n;

And avoid the potential for a branch, extra cache line fill, etc.

Back to the generic case though, taking this generic solution:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    if(blah) number += (1<<n);
    return(number);
}

And compiling for the two most popular/used platforms:

    testl   %edi, %edi
    movl    %edx, %eax
    je  .L2
    movl    $1, %edx
    movl    %esi, %ecx
    sall    %cl, %edx
    addl    %edx, %eax
.L2:

The above uses a conditional branch.

The one below uses conditional execution, no branch, no pipeline flush, is deterministic.

  cmp   r0,#0
  movne r3,#1
  addne r2,r2,r3,asl r1
  mov   r0,r2
  bx    lr

Could have saved the mov r0,r2 instruction by re-arranging the arguments in the function call, but that is academic, you wouldnt burn a function call on this normally.

EDIT:

As suggested:

unsigned int fun ( int blah, unsigned int n, unsigned int number )
{
    number += ((blah!=0)&1)<<n;
    return(number);
}
  subs  r0, r0, #0
  movne r0, #1
  add   r0, r2, r0, asl r1
  bx    lr

Certainly cheaper, and the code looks good, but I wouldnt make assumptions that the result of blah!=0, which is zero or whatever the compiler has defined as true always has the lsbit set. It doesnt have to have that bit set for the compiler to generate working code. Perhaps the standards dictate the specific value for true. by re-arranging the function parameters the if(blah) number +=... will also result in three single clock instructions and not have assumptions.

EDIT2:

Looking at what I understand to be the C99 standard:

The == (equal to) and != (not equal to) operators are analogous to the relational operators except for their lower precedence. Each of the operators yields 1 if the specified relation is true and 0 if it is false.

Which explains why the above edit works and why you get the movne r0,#1 and not some other random number.

The poster was asking the question with regards to C but also noted that ADA was the current language, from a language independent perspective you should not assume "features" like the C feature above and use an if(blah) number = number + (1<<n). But this was asked with a C tag so the generically (processor independent) fastest result for C is, I think, number += (blah!=0)<<n; So Steven Wright's comment had it right and he should get credit for this.

The posters assumption is also basically correct, if you can get blah into a 0 or 1 form then using it in the math is faster in the sense that there is no branch. Getting it into that form without it being more expensive than a branch is the trick.

Bighorn answered 27/10, 2010 at 14:18 Comment(4)
What about number += ((blah!=0)&1)<<n; ?Remora
the result of blah!=0 is either 0 for false or true which is not deterministic.Bighorn
Reading an answer to a similar SO question the standard may dictate that the intermediate comparison does return a 1 or 0, if that is true and the compiler meets that standard everywhere then just do number += (blah!=0)<<n; I am still waiting on a good standard to come out and for a compiler that actually meets any standard so I would rather play it safe. (blah!=0)<<n; should result in three instructions on ARM, so not any faster than if(blah) number += 1<<n; for that architecture. for x86 is should be quite a bit faster.Bighorn
thanks, remember to give simon a +1 for his contribution. and pay it forward (help someone else on stackoverflow out)Bighorn
B
5

In Ada...

The original formulation:

if Blah then
  Number := Number + (2 ** N);
end if;

The alternative general formulation, assuming Blah is of type Boolean and Number and N are of suitable types:

Number := Number + (Boolean'pos(Blah) * (2 ** N));

(For N and Number of user-defined integer or floating point types, suitable definitions and type conversions may be required, the key point here is the Boolean'pos() construct, which Ada guarantees will give you a 0 or 1 for the predefined Boolean type.)

As for whether this is faster or not, I concur with @Cthutu:

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler.

Bubonocele answered 26/10, 2010 at 14:18 Comment(5)
Nice on the pos part, I didn't think of that. If blah was a predictable value I could understand the compiler piece that both yourself and cthutu say, but since this is a discrete value coming in from a piece of hardware I'm not sure how the compiler can optimize this further, would you (or Cthutu) mind expanding?Monopteros
Does Ada really guarantee 0 and 1 for the Boolean type? The only comment on this in the LRM is that False < True. However, I expect this to be the case with most (all?) compilers. And of course, the paranoid can define their own representation for a derived boolean type which guarantees 0 and 1 as the values :)Lazarolazaruk
Yes, for predefined Boolean this is guaranteed. It's because of the definition of the 'Pos attribute, which returns the position number of the enumeration, i.e. Boolean'Pos(False) is 0, Boolean'Pos(True) is 1. Even if the underlying representations were 0, and something-other-than-0, the 'Pos property would still hold (to get the actual representation you'd have to use an Unchecked_Conversion instantiation to get at it.)Bubonocele
If the compiler generates a boolean value, it'll definitely have the appropriate 'Pos behaviour. However, if you generate a "boolean" value using some form of unchecked conversion (eg, importing from C) it may be invalid and most bets are off. For example, with the GCC Ada compiler, 42 can appear to be false in some contexts (because its LSB is clear), true in others, or result in Constraint_Error. As ever, if you're importing from a foreign context, validate at the interface.Remora
& Simon: Thanks for the input. Now re-reading the LRM, this is clear. I confused 'Pos with the internal representation. Useful new information.Lazarolazaruk
K
4

I would keep it with the conditional. You shouldn't worry about low-level optimisation details at this point. Write the code that describes your algorithm best and trust your compiler. On some CPUs the multiplication is slower (e.g. ARM processors that have conditionals on each instruction). You could also use the ?: expression which optimises better under some compilers. For example:

number += (blah ? 2^n : 0);

If for some reason this little calculation is the bottleneck of your application after profiling then worry about low-level optimisation.

Koger answered 26/10, 2010 at 13:47 Comment(5)
When doing code reviews for embedded systems, I usually look at written code from the perspective of with little tweaks can it be a little quicker, I'm not planning any kind of massive re-write, or weeks of time tweaking the little things, but just hopefully obvious little things, but perhaps the compiler takes care of this. Although I don't think it can optimize it out, as the data in the boolean is a discrete in this case so it's not known until runtime.Monopteros
I would really recommend sticking with a boolean check if your logic is triggered when a condition is true, rather than using an integer/float and multiply it. It will be more obvious to you when you get back to your code in 6 months time.Koger
Be very weary of tweaking for optimisation. You might be making your code harder to read, and worse still make it slower. Intuition is not always your best friend when it comes to optimisation.Koger
based on the comment associated with the function that runs this code, I would be surprise to fail to read the code easily, but I definitely do see your point. I suppose that a quick way to see whether this is quicker or slower (even with the compiler setup) would be to run a similar "function" a bunch of times taking a ton of time measurements, and that should tell me on our particular system whether it's faster or slower.Monopteros
I was trying to explain that you should not worry about optimisation at that level and I was describing a general approach, rather anything specific to the example code. Profile your code often and use that as a guide to where you should be focusing your optimisation efforts, should your app need it.Koger
N
4

In C, regarding blah*2^n: Do you have any reason to believe that blah takes the values 0 and 1? The language only promises that 0 <-> FALSE and (everything else) <-> TRUE. C allows you to multiply a "boolean" temporary with another number, but the result is not defined except insofar as result=0 <=> the bool was false or the number was zero.

In Ada, regarding blah*2^n: The language does not define a multiplication operator on type Boolean. Thus blah cannot be a bool and be multiplied.

Nature answered 26/10, 2010 at 13:50 Comment(3)
I spoke with a co-worker and he mentioned that you can't multiply booleans with numbers. It made sense but I wasn't sure if that was an ada only restriction, or if most languages don't allow.Monopteros
Eric: This answer is misleading. The result of a logical/comparison operator in C is always 0 or 1. This is well-defined by the standard. You can use other nonzero values with conditionals, but that's quite different from your implication that "true" takes on random nonzero values.Ashy
@R..: Hmm... Now you have me trying to remember in which environment I had been surprised to find true (visibly) implemented as -1. I think I recall the "pun" that both !true==false and ~true==false.Nature
H
1

If your language allows multiplication between a boolean and a number, then yes, that is faster than a conditional. Conditionals require branching, which can invalidate the CPU's pipeline. Also if the branch is big enough, it can even cause a cache miss in the instructions, though that's unlikely in your small example.

Haig answered 26/10, 2010 at 13:41 Comment(1)
Interesting, I was thinking about the whole branch thing. I forgot about pipelining (Shame on me, we have been going over it quite a bit in school, I should know better)Monopteros
C
1

Generaly, and particularly when working with Ada, you should not worry about micro-optimization issues like this. Write your code so that it is clear to a reader, and only worry about performance when you have a problem with performance, and have it tracked down to that portion of the code.

Different CPUs have different needs, and they can be insanely complex. For example, in this case which is faster depends a lot on your CPU's pipeline setup, what's in cache at the time, and how its branch prediction unit works. Part of your compiler's job is to be an expert in those things, and it will do a better job than all but the very best assembly programmers. Certianly better than you (or me).

So you just worry about writing good code, and let the compiler worry about making efficient machine code out of it.

Crosspollination answered 26/10, 2010 at 14:38 Comment(0)
B
1

For the problem stated, there is indeed simple expressions in C that may produce efficient code.

The nth power of 2 can be computed with the << operator as 1 << n, provided n is less than the number of value bits in an int.

If blah is a boolean, namely an int with a value of 0 or 1, your code fragment can be written:

number += blah << n;

If blah is any scalar type that can be tested for its truth value as if (blah), the expression is slightly more elaborate:

number += !!blah << n;

which is equivalent to number += (blah != 0) << n;

The test is still present but, for modern architectures, the generated code will not have any jumps, as can be verified online using Godbolt's compiler explorer.

Biphenyl answered 1/8, 2017 at 18:51 Comment(1)
Glad you decided to give this answer. I almost gave the same answer myself a while back, but it was an old question. Somehow it keeps going active though, so there should probably be an optimal answer.Gerbold
H
0

In either case, you can't avoid a branch (internally), so don't try!

In

number = number + blah*2^n

the full expression will always have to be evaluated, unless the compiler is smart enough to stop when blah is 0. If it is, you'll get a branch if blah is 0. If it's not, you always get an expensive multiply. In case blah is false, you'll also get the unnecessary add and assignment.

In the "if then" statement, the statement will only do the add and assignment when blah is true.

In short, the answer to your question in this case is "yes".

Hydrargyrum answered 26/10, 2010 at 13:55 Comment(3)
Why is everyone missing the fact that this multiply is not expensive but virtually free? (Hint: it's a power of 2.)Ashy
Just because it's a power of two does not make it faster than not doing it at all.Hydrargyrum
you can avoid the branch it depends on the architecture. you cannot avoid some sort of conditional execution, that is true, unless blah is known to be not just a generic boolean but is specifically a 1 or 0.Bighorn
I
0

This code shows they perform similarly, but multiplication is usually slightly faster.

@Test
public void manual_time_trial()
{
    Date beforeIfElse = new Date();
    if_else_test();
    Date afterIfElse = new Date();
    long ifElseDifference = afterIfElse.getTime() - beforeIfElse.getTime();
    System.out.println("If-Else Diff: " + ifElseDifference);

    Date beforeMultiplication = new Date();
    multiplication_test();
    Date afterMultiplication = new Date();
    long multiplicationDifference = afterMultiplication.getTime() - beforeMultiplication.getTime();
    System.out.println("Mult Diff   : " + multiplicationDifference);

}

private static long loopFor = 100000000000L;
private static short x = 200;
private static short y = 195;
private static int z;

private static void if_else_test()
{
    short diff = (short) (y - x);
    for(long i = 0; i < loopFor; i++)
    {
        if (diff < 0)
        {
            z = -diff;
        }
        else
        {
            z = diff;
        }
    }
}

private static void multiplication_test()
{
    for(long i = 0; i < loopFor; i++)
    {
        short diff = (short) (y - x);
        z = diff * diff;
    }
}
Iphlgenia answered 1/8, 2017 at 18:17 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.