Consider the following simple program (adapted from this question):
#include <cstdlib>

int main(int argc, char** argv) {
    int mul1[10] = { 4, 1, 8, 6, 3, 2, 5, 8, 6, 7 }; // sum = 50
    int mul2[10] = { 4, 1, 8, 6, 7, 9, 5, 1, 2, 3 }; // sum = 46
    int x1 = std::atoi(argv[1]);
    int x2 = std::atoi(argv[2]);
    int result = 0;
    // For each element in mul1/mul2, accumulate the product with x1/x2 in result
    for (int i = 0; i < 10; ++i) {
        result += x1 * mul1[i] + x2 * mul2[i];
    }
    return result;
}
I believe it is functionally equivalent to the following one:
#include <cstdlib>

int main(int argc, char** argv) {
    int x1 = std::atoi(argv[1]);
    int x2 = std::atoi(argv[2]);
    return x1 * 50 + x2 * 46;
}
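To sanity-check the claimed equivalence, here is a minimal sketch that puts both forms side by side. The function names `loop_form`, `closed_form` and `check_equivalence` are hypothetical and not part of the programs above; passing x1 and x2 as parameters instead of reading them from argv is my own refactoring, and the `constexpr` loop assumes C++14 or later. The comparison only covers inputs small enough that no signed overflow occurs in either form.

```cpp
#include <cassert>

// Loop form of the first program, with x1/x2 as parameters
// (hypothetical refactoring so the two forms can be compared directly).
constexpr int loop_form(int x1, int x2) {
    const int mul1[10] = { 4, 1, 8, 6, 3, 2, 5, 8, 6, 7 }; // sum = 50
    const int mul2[10] = { 4, 1, 8, 6, 7, 9, 5, 1, 2, 3 }; // sum = 46
    int result = 0;
    for (int i = 0; i < 10; ++i) {
        result += x1 * mul1[i] + x2 * mul2[i];
    }
    return result;
}

// Closed form of the second program: the array sums are folded into constants.
constexpr int closed_form(int x1, int x2) {
    return x1 * 50 + x2 * 46;
}

// Compile-time check for one sample input.
static_assert(loop_form(3, 5) == closed_form(3, 5),
              "loop and closed form should agree");

// Run-time check over a small range, chosen so that nothing overflows.
inline void check_equivalence() {
    for (int x1 = -100; x1 <= 100; ++x1) {
        for (int x2 = -100; x2 <= 100; ++x2) {
            assert(loop_form(x1, x2) == closed_form(x1, x2));
        }
    }
}
```

Calling `check_equivalence()` from a test or from `main` exercises the run-time check. This only shows that the two forms agree on the sampled inputs; it says nothing about what any particular compiler will generate.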
And yet clang 3.7.1, gcc 5.3 and icc 13.0.1 seem unable to perform this optimization, even with -Ofast. (Note, by the way, how vastly the generated assembly differs between compilers!) However, when mul2 and x2 are removed from the equation, clang is able to perform a similar optimization, even with -O2.
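For reference, the reduced variant presumably looks something like the sketch below. This is my reconstruction, not the exact code that was compiled: the names `reduced_loop` and `reduced_closed_form` are hypothetical, x1 is passed as a parameter rather than read from argv, and the `constexpr` loop assumes C++14 or later.

```cpp
// Reduced loop: mul2 and x2 are gone, only x1 * mul1[i] is accumulated.
constexpr int reduced_loop(int x1) {
    const int mul1[10] = { 4, 1, 8, 6, 3, 2, 5, 8, 6, 7 }; // sum = 50
    int result = 0;
    for (int i = 0; i < 10; ++i) {
        result += x1 * mul1[i];
    }
    return result;
}

// What the loop reduces to once the array sum is folded into a constant.
constexpr int reduced_closed_form(int x1) {
    return x1 * 50;
}

// Compile-time check for one sample input.
static_assert(reduced_loop(7) == reduced_closed_form(7),
              "reduced loop and closed form should agree");
```

The only difference from the full loop is the missing second term, which makes it the smallest change that separates the case clang handles from the case it does not.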
What prevents these compilers from optimizing the first program into the second?
x2 * mul2[i] from the equation. Personally I feel that one should not expect anything from compiler optimizer; whatever is received is a bonus! – Casein