I need to compute something like size_t s=(size_t)floorf(f); that is, the argument is a float, but it holds an integer value (assume floorf(f) is small enough to be represented exactly). While optimizing this, I discovered something interesting.
Here are some conversions from float to integers (GCC 5.2.0 -O3). For clarity, the conversion given is the return value of a test function.
Here's int32_t x=(int32_t)f:
cvttss2si eax, xmm0
ret
Here's uint32_t x=(uint32_t)f:
cvttss2si rax, xmm0
ret
Here's int64_t x=(int64_t)f:
cvttss2si rax, xmm0
ret
Last, here's uint64_t x=(uint64_t)f:
ucomiss xmm0, DWORD PTR .LC2[rip]
jnb .L4
cvttss2si rax, xmm0
ret
.L4:
subss xmm0, DWORD PTR .LC2[rip]
movabs rdx, -9223372036854775808
cvttss2si rax, xmm0
xor rax, rdx
ret
.LC2:
.long 1593835520
This last one is much more complex than the others. Moreover, Clang and MSVC behave similarly. For your convenience, I've translated it into pseudo-C:
float lc2 = 9223372036854775808.0f; /* 2^63; .LC2 = 1593835520 = 0x5F000000 */
if (f < lc2) {
    return (uint64_t)f;
} else {
    f -= lc2;
    uint64_t temp = (uint64_t)f;
    temp ^= 0x8000000000000000ull; /* toggle the highest bit (2^63) */
    return temp;
}
This looks like it is trying to compute the first overflow mod 64 correctly. That seems kind of bogus, since the documentation for cvttss2si tells me that if an overflow happens (at 2^32, not 2^64), "the indefinite integer value (80000000H) is returned".
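To check my reading of the assembly, here is a minimal C sketch of the same two-branch sequence, written so that it only ever performs signed 64-bit conversions (which is what cvttss2si rax, xmm0 provides). The function name u64_from_float_emulated is my own, and the assert encodes my assumption that the input lies in [0, 2^64). The constant .LC2 (1593835520, bit pattern 0x5F000000) decodes to the float 2^63.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper name: mirrors the two-branch sequence in the
   uint64_t listing, using only signed 64-bit conversions. */
uint64_t u64_from_float_emulated(float f)
{
    const float two63 = 9223372036854775808.0f; /* 2^63, bits 0x5F000000 */

    /* Assumption: the value is in range for uint64_t (this also rejects NaN). */
    assert(f >= 0.0f && f < 18446744073709551616.0f);

    if (f < two63) {
        /* Fits in int64_t, so one signed conversion is enough. */
        return (uint64_t)(int64_t)f;
    }

    /* Move the value into the signed range, convert, then restore 2^63
       by toggling the top bit. */
    int64_t low = (int64_t)(f - two63);
    return (uint64_t)low ^ UINT64_C(0x8000000000000000);
}
```

For in-range inputs I would expect this to agree with a plain (uint64_t)f cast, including values at or above 2^63 where the second branch is taken.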
My questions:
- What is this really doing, and why?
- Why wasn't something similar done for the other integer types?
- How can I change the conversion so as to produce similar code (only lines 3 and 4 of the uint64_t output above, i.e. the cvttss2si and ret pair), again assuming that the value is exactly representable?
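
For the last question, the kind of variant I have in mind is going through a signed 64-bit intermediate, sketched below. This is only a sketch: the name size_from_float is made up for illustration, it assumes a 64-bit size_t and a non-negative, integer-valued input below 2^63, and whether a given compiler really emits just cvttss2si for it is something to confirm in the generated assembly rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert an integer-valued float to size_t via
   int64_t, so the compiler does not need to handle values >= 2^63. */
size_t size_from_float(float f)
{
    /* Assumption: 0 <= f < 2^63, so the signed conversion is well defined. */
    assert(f >= 0.0f && f < 9223372036854775808.0f);

    return (size_t)(int64_t)f;
}
```

Since the cast truncates toward zero, for non-negative inputs it gives the same result as applying floorf first.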