First, a trivial mathematical fact: given integers n and m, we have n < m if, and only if, n <= m - 1.
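As a quick sanity check of that equivalence on fixed-width types, here is a minimal C++17 sketch. The helper names `less_than` and `less_equal_pred` are hypothetical and exist only for this illustration. It also shows the one caveat for unsigned operands: the two forms agree only when m is not zero, because m - 1 wraps around otherwise.

```cpp
#include <cstdint>

// Hypothetical helpers, named only for this illustration.
constexpr bool less_than(std::uint32_t n, std::uint32_t m) {
    return n < m;
}

constexpr bool less_equal_pred(std::uint32_t n, std::uint32_t m) {
    // The cast keeps the subtraction result in 32 bits even after integer promotion.
    return n <= static_cast<std::uint32_t>(m - 1);
}

// For m != 0 the two spellings agree.
static_assert(less_than(1000, 1001) == less_equal_pred(1000, 1001), "n just below m");
static_assert(less_than(1001, 1001) == less_equal_pred(1001, 1001), "n equal to m");

// For m == 0 the unsigned subtraction wraps to 2^32 - 1, so they disagree.
static_assert(less_than(5, 0) != less_equal_pred(5, 0), "wrap-around case");
```

The compiler only applies the rewrite when m is a known constant, so it can rule out the wrap-around case itself.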
GCC seems to prefer immediate values of smaller absolute value. Hence, when m is known and other conditions are met, the compiler chooses, among equivalent comparison expressions, the one that minimizes absolute values. For instance, it prefers n <= 1000 over n < 1001, and GCC 9.2 translates this
#include <cstdint>

bool f(uint32_t n) {
    return n < 1001;
}
into this x86 assembly code:
f(unsigned int):
    cmpl $1000, %edi
    setbe %al
    ret
There might be good performance reasons for that, but that's not my question. What I'd like to know is this: is there a way to force GCC to keep the original comparison? More specifically, I'm not worried about portability, so GCC specifics (options, pragmas, attributes, ...) are OK for me. However, I'm looking for a constexpr-friendly solution, which seems to exclude inline asm. Finally, I'm targeting C++17, which excludes things like std::is_constant_evaluated. (Having said that, please feel free to provide answers regardless of my constraints, because they might still be useful for others.)
You might ask why I want to do such a thing. Here we go. To my understanding (please correct me if I'm wrong), this behavior might be a "pessimization" for x86_64 in the following example:
bool g(uint64_t n) {
    n *= 5000000001;
    return n < 5000000001;
}
which is translated by GCC 6.2 into:
g(unsigned long):
    movabsq $5000000001, %rax
    imulq %rax, %rdi
    movabsq $5000000000, %rax
    cmpq %rax, %rdi
    setbe %al
    ret
In x86_64, computations with 64-bit immediate values have some restrictions that might require these values to be loaded into registers. In the above example, this happens twice: the constants 5000000001 and 5000000000 are stored in rax for the multiplication and the comparison, respectively. Had GCC kept the original comparison as it appears in the C++ code (i.e., against 5000000001), there would be no need for the second movabs.
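To look at the two spellings side by side (for instance on Compiler Explorer), both can be written out by hand. This is only a sketch: the names `g_lt` and `g_le` are hypothetical, and since the functions are semantically identical, GCC is free to emit the same code for both. It shows the source-level equivalence, not a way to force either encoding.

```cpp
#include <cstdint>

// Hypothetical name: the comparison as written in the question.
constexpr bool g_lt(std::uint64_t n) {
    n *= UINT64_C(5000000001);       // wraps modulo 2^64, well defined for unsigned
    return n < UINT64_C(5000000001);
}

// Hypothetical name: the form GCC appears to canonicalize to.
constexpr bool g_le(std::uint64_t n) {
    n *= UINT64_C(5000000001);
    return n <= UINT64_C(5000000000);
}

// Both forms agree, including right at the boundary.
static_assert(g_lt(0) && g_le(0), "0 is below the bound");
static_assert(!g_lt(1) && !g_le(1), "5000000001 is not below the bound");
```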
This also implies a code size penalty which, I guess, was considered an issue and more recent versions of GCC (e.g. 9.2) produce this:
g(unsigned long):
    movabsq $5000000001, %rax
    imulq %rax, %rdi
    subq $1, %rax
    cmpq %rax, %rdi
    setbe %al
    ret
Hence, the 10-byte movabs was replaced by a 4-byte subq instruction. In any case, subq also seems unnecessary.
imm8 instead of imm32 whenever possible, or a sign_extended_imm32 instead of movabs. But when it's not on the cusp there, it's not useful. – Exteriorize