I'm writing some code in Java where, at some point, the flow of the program is determined by whether two int variables, "a" and "b", are non-zero (note: a and b are never negative, and never within integer overflow range).
I can evaluate it with
if (a != 0 && b != 0) { /* Some code */ }
Or alternatively
if (a*b != 0) { /* Some code */ }
Because I expect that piece of code to run millions of times per run, I was wondering which one would be faster. I compared them on a huge randomly generated array, and I was also curious how the sparsity of the array (the fraction of elements equal to 0) would affect the results:
long time;
final int len = 50000000;
int arbitrary = 0;
int[][] nums = new int[2][len];

for (double fraction = 0; fraction <= 0.9; fraction += 0.0078125) {
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < len; j++) {
            double random = Math.random();
            if (random < fraction) nums[i][j] = 0;
            else nums[i][j] = (int) (random * 15 + 1);
        }
    }

    time = System.currentTimeMillis();
    for (int i = 0; i < len; i++) {
        if ( /* insert nums[0][i]*nums[1][i] != 0 or nums[0][i] != 0 && nums[1][i] != 0 */ ) arbitrary++;
    }
    System.out.println(System.currentTimeMillis() - time);
}
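The harness above can be made self-contained along these lines (a minimal sketch with my own class and method names, using a smaller array and a fixed zero fraction; `System.nanoTime` is still coarse, and a serious measurement would use a harness like JMH):

```java
import java.util.Random;

public class NonZeroBenchmark {
    static final int LEN = 1_000_000; // smaller than the original 50M so it finishes quickly

    // The two logically equivalent predicates under test (values here are 0..15).
    static boolean viaMultiply(int a, int b)     { return a * b != 0; }
    static boolean viaShortCircuit(int a, int b) { return a != 0 && b != 0; }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        double fraction = 0.3; // fraction of zeros in the data
        int[] a = new int[LEN], b = new int[LEN];
        for (int i = 0; i < LEN; i++) {
            a[i] = rnd.nextDouble() < fraction ? 0 : rnd.nextInt(15) + 1;
            b[i] = rnd.nextDouble() < fraction ? 0 : rnd.nextInt(15) + 1;
        }

        int hits1 = 0, hits2 = 0;
        long t = System.nanoTime();
        for (int i = 0; i < LEN; i++) if (viaMultiply(a[i], b[i])) hits1++;
        long mulNanos = System.nanoTime() - t;

        t = System.nanoTime();
        for (int i = 0; i < LEN; i++) if (viaShortCircuit(a[i], b[i])) hits2++;
        long andNanos = System.nanoTime() - t;

        // Sanity check: the two predicates must agree for these small non-negative inputs.
        if (hits1 != hits2) throw new AssertionError("predicates disagree");
        System.out.println("a*b != 0     : " + mulNanos / 1_000_000 + " ms, hits=" + hits1);
        System.out.println("a!=0 && b!=0 : " + andNanos / 1_000_000 + " ms, hits=" + hits2);
    }
}
```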
And the results show that if you expect "a" or "b" to be equal to 0 more than ~3% of the time, a*b != 0 is faster than a != 0 && b != 0.
I'm curious to know why. Could anyone shed some light? Is it the compiler or is it at the hardware level?
Edit: Out of curiosity, now that I've learned about branch prediction, I was wondering what the analogous comparison would show for "a OR b is non-zero":
As expected, we see the same branch-prediction effect; interestingly, the graph is somewhat flipped along the X-axis.
Update
1- I added !(a == 0 || b == 0) to the analysis to see what happens.
2- I also included a != 0 || b != 0, (a+b) != 0 and (a|b) != 0 out of curiosity, after learning about branch prediction. But they are not logically equivalent to the other expressions, because only a OR b needs to be non-zero to return true, so they are not meant to be compared for processing efficiency.
3- I also added the actual benchmark that I used for the analysis, which just increments an arbitrary int variable.
4- Some people suggested including a != 0 & b != 0, as opposed to a != 0 && b != 0, predicting that it would behave more closely to a*b != 0 because it removes the branch-prediction effect. I didn't know that & could be used with boolean variables; I thought it was only for bitwise operations on integers.
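The OR-style expressions in point 2 really do answer a different question than the original condition. A quick sketch (class and method names are mine) showing that, for the non-negative values used here, the three OR-style expressions agree with each other but diverge from the AND condition whenever exactly one operand is zero:

```java
public class OrVariants {
    static boolean orExplicit(int a, int b)  { return a != 0 || b != 0; }
    static boolean orViaSum(int a, int b)    { return (a + b) != 0; }   // safe: non-negative, no overflow
    static boolean orViaBits(int a, int b)   { return (a | b) != 0; }
    static boolean andExplicit(int a, int b) { return a != 0 && b != 0; }

    public static void main(String[] args) {
        // All three OR variants agree over the value range used in the benchmark (0..15).
        for (int a = 0; a <= 15; a++) {
            for (int b = 0; b <= 15; b++) {
                if (orExplicit(a, b) != orViaSum(a, b) || orExplicit(a, b) != orViaBits(a, b))
                    throw new AssertionError("OR variants disagree at " + a + "," + b);
            }
        }
        // The OR and AND conditions differ whenever exactly one operand is zero:
        System.out.println(orExplicit(0, 5));  // true
        System.out.println(andExplicit(0, 5)); // false
    }
}
```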
Note: In the context that I was considering all this, int overflow is not an issue, but that's definitely an important consideration in general contexts.
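To make that caveat concrete, here is a minimal check (names mine) that the product of two non-zero 32-bit ints can wrap to exactly zero, so a*b != 0 is only safe when the operands are known to be small:

```java
public class OverflowCaveat {
    // a*b == 0 does NOT imply a == 0 or b == 0 once the product overflows int.
    static boolean productNonZero(int a, int b) { return a * b != 0; }
    static boolean bothNonZero(int a, int b)    { return a != 0 && b != 0; }

    public static void main(String[] args) {
        int a = 1 << 16, b = 1 << 16;             // 65536 * 65536 wraps to 0 as an int
        System.out.println(productNonZero(a, b)); // false, despite both being non-zero
        System.out.println(bothNonZero(a, b));    // true

        // Another wrap-around: 983040 * 6422528 == 0xF0000 * 0x620000; the true
        // product is 0x5BE00000000, whose low 32 bits are all zero.
        System.out.println(983040 * 6422528 == 0); // true
    }
}
```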
CPU: Intel Core i7-3610QM @ 2.3GHz
Java version: 1.8.0_45
Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
Comments
if (!(a == 0 || b == 0))? Microbenchmarks are notoriously unreliable; this is unlikely to really be measurable (~3% sounds like a margin of error to me). – Gateway
Compare a*b != 0 with !(a == 0 || b == 0). If those two results are much closer together, it's an optimization quirk. – Trigon
Try a != 0 & b != 0. – Transilient
a*b != 0 has one less branch. – Wendalyn
Compare a*b == 0 to (a|b) == 0, where optimizing the short-circuited comparison is more work. Regardless, you should try the a|b option, since it costs no more than one cycle (it could even be "free" if dispatched with another op), whereas multiply is usually 2 to 4 cycles on modern x86s. – Zanazander
(1<<16) * (1<<16) == 0, yet both factors are different from zero. – Beverie
a*b is zero if one of a and b is zero; a|b is zero only if both are. – Bussy
983040 * 6422528 == 0 (as 32-bit integers) but neither factor is zero (as a 32-bit integer). In hex, with arbitrary precision, my example is 0xF0000 * 0x620000 == 0x5BE00000000, but since a 32-bit integer preserves only the 8 least significant hex digits, the product is 0x00000000, or just zero. – Extinction
What about (a&b) != 0? – Zanazander
Counterexample: 1 & 2. – Danczyk
Using && rather than & when the result is the same (no side-effects to consider): that's the premature optimisation. We tend to favour && in C-style languages because it tends to be faster (does less work), but sometimes that's premature when the added branch costs more than the omitted work. (In fairness, it is also more often the case that & is just wrong than that && is wrong, but the favouring of && goes back to a time in C when branch prediction wasn't as big an influence, and the speed impact shaped the habits in later languages.) – Rickyrico
Use the non-short-circuiting & operator: if ((a != 0) & (b != 0)). Not only will this avoid the overflow problem, it should also be considerably faster, given how slow multiplication is. It and division are the only operations that require multiple cycles; bit-twiddling is extremely fast by comparison. – Guide
When both operands of & are booleans, it is not a bitwise but a logical operator; see JLS §15.22. – Mcdougall
You can use (a|b) != 0 as an optimisation for a != 0 || b != 0, and (a*b) != 0 for a != 0 && b != 0 — the former is fine, the latter risks overflow on numbers >= 2¹⁶. – Affirmatory
What matters is the evaluation of b in a&b, which is what will determine whether a branch is used (and whether side-effects occur). But if you are only criticising "bitwise" you are right — unless we accept that a Boolean has one bit! – Affirmatory
What if a and b are bigger numbers? – Erythrism
The crossover is at 0.3, which is 30% not 3%... it might be exactly 1 - sqrt(0.5), which is approximately 29.3%, and then the "OR" case is sqrt(0.5). – Printmaking
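To tie the & suggestions together: a small sketch (names mine) showing that & is indeed legal on booleans, always evaluates both operands where && short-circuits, and still agrees with && on the result, even for values where a*b overflows:

```java
public class EagerAndDemo {
    static int calls = 0;

    static boolean nonZero(int x) {
        calls++;                 // count evaluations to observe short-circuiting
        return x != 0;
    }

    static boolean eagerAnd(int a, int b) { return (a != 0) & (b != 0); }
    static boolean shortAnd(int a, int b) { return a != 0 && b != 0; }

    public static void main(String[] args) {
        calls = 0;
        boolean r1 = nonZero(0) && nonZero(5); // && skips the right operand
        System.out.println(r1 + " after " + calls + " evaluation(s)"); // false after 1

        calls = 0;
        boolean r2 = nonZero(0) & nonZero(5);  // & evaluates both operands, no extra branch
        System.out.println(r2 + " after " + calls + " evaluation(s)"); // false after 2

        // Unlike a*b != 0, the eager form stays correct when the product would overflow:
        int big = 1 << 16;                      // big * big wraps to 0 as a 32-bit int
        System.out.println(eagerAnd(big, big)); // true
        System.out.println(big * big != 0);     // false
        System.out.println(eagerAnd(big, big) == shortAnd(big, big)); // true
    }
}
```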