I was working on a personal project recently when I stumbled across an odd issue.
In a very tight loop I have an integer with a value between 0 and 15. I need to get -1 for values 0, 1, 8, and 9 and 1 for values 4, 5, 12, and 13.
I turned to godbolt to check a few options and was surprised that it seemed the compiler couldn't optimize a switch statement the same way as an if chain.
The link is here: https://godbolt.org/z/WYVBFl
The code is:
const int lookup[16] = {-1, -1, 0, 0, 1, 1, 0, 0, -1, -1, 0, 0, 1, 1, 0, 0};
int a(int num) {
return lookup[num & 0xF];
}
int b(int num) {
num &= 0xF;
if (num == 0 || num == 1 || num == 8 || num == 9)
return -1;
if (num == 4 || num == 5 || num == 12 || num == 13)
return 1;
return 0;
}
int c(int num) {
num &= 0xF;
switch (num) {
case 0: case 1: case 8: case 9:
return -1;
case 4: case 5: case 12: case 13:
return 1;
default:
return 0;
}
}
I would have thought that b and c would yield the same results, and I was hoping that I could read the bit-hacks to come up with an efficient implementation myself since my solution (the switch statement - in another form) was fairly slow.
Oddly, b
compiled to bit-hacks while c
was either pretty much un-optimized or reduced to a different case of a
depending on target hardware.
Can anybody explain why there is this discrepancy? What is the 'correct' way to optimize this query?
EDIT:
Clarification
I want the switch solution to be the fastest, or a similarly "clean" solution. However when compiled with optimizations on my machine the if solution is significantly faster.
I wrote a quick program to demonstrate and TIO has the same results as I find locally: Try it online!
With static inline
the lookup table speeds up a bit: Try it online!
-O3
, and it compiledc
to something likely worse thana
orb
(c
had two conditional jumps plus a few bit manipulations, vs. only one conditional jump and simpler bit manip forb
), but still better than naive item by item tests. I'm not sure what you're really asking for here; the simple fact is that an optimizing compiler can turn any of these into any of the others if it so chooses, and there are no hard and fast rules for what it will or will not do. – Pahlif
still beatsswitch
(oddly lookup becomes even faster) [TIO to follow] – Sixth