Why is AsDouble1 much more straightforward than AsDouble0?
#include <cstdint>

// AsDouble0(unsigned long): # @AsDouble0(unsigned long)
// movq xmm1, rdi
// punpckldq xmm1, xmmword ptr [rip + .LCPI0_0] # xmm1 = xmm1[0],mem[0],xmm1[1],mem[1]
// subpd xmm1, xmmword ptr [rip + .LCPI0_1]
// movapd xmm0, xmm1
// unpckhpd xmm0, xmm1 # xmm0 = xmm0[1],xmm1[1]
// addsd xmm0, xmm1
// addsd xmm0, xmm0
// ret
double AsDouble0(uint64_t x) { return x * 2.0; }
// AsDouble1(unsigned long): # @AsDouble1(unsigned long)
// shr rdi
// cvtsi2sd xmm0, rdi
// addsd xmm0, xmm0
// ret
double AsDouble1(uint64_t x) { return (x >> 1) * 2.0; }
Code available on Godbolt if you just want to look at the asm: https://godbolt.org/z/dKc6Pe6M1

Also semi-related: "Are there unsigned equivalents of the x87 FILD and SSE CVTSI2SD instructions?", except that's the reverse direction. "How to efficiently perform double/int64 conversions with SSE/AVX?" covers packed conversions. – Intrastate
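One way to read the two listings, offered as a sketch rather than a definitive account of what the compiler is doing: `cvtsi2sd` converts a signed 64-bit integer, and baseline x86-64 (without AVX-512) has no unsigned counterpart. After `x >> 1` the top bit is known to be zero, so the signed instruction is safe for AsDouble1. For AsDouble0 the full unsigned range has to be handled, and the `punpckldq` / `subpd` / `addsd` sequence looks like the usual magic-exponent trick: each 32-bit half is placed in the mantissa of a double with a fixed exponent, the bias is subtracted, and the halves are added. The function names `BitsToDouble`, `U64ToDoubleMagic`, and `U64ToDoubleSigned` below are hypothetical, and the two bit patterns are assumed values for `.LCPI0_0` / `.LCPI0_1`, whose contents are not shown in the listing above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 64-bit pattern as a double.
static double BitsToDouble(uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Scalar emulation of the movq / punpckldq / subpd / unpckhpd / addsd steps.
// The exponent patterns are an assumption about the constant pool contents.
double U64ToDoubleMagic(uint64_t x) {
    const uint64_t kExp52 = 0x4330000000000000ULL;  // bit pattern of 2^52
    const uint64_t kExp84 = 0x4530000000000000ULL;  // bit pattern of 2^84

    const uint64_t lo = x & 0xFFFFFFFFULL;  // low 32 bits
    const uint64_t hi = x >> 32;            // high 32 bits

    // Dropping each half into the low mantissa bits yields
    // 2^52 + lo and 2^84 + hi * 2^32, both exactly representable.
    const double lo_biased = BitsToDouble(kExp52 | lo);
    const double hi_biased = BitsToDouble(kExp84 | hi);

    // Removing the bias is exact (this corresponds to the subpd step).
    const double lo_d = lo_biased - BitsToDouble(kExp52);
    const double hi_d = hi_biased - BitsToDouble(kExp84);

    // The only rounding happens in this final add (the first addsd).
    return hi_d + lo_d;
}

// Signed path, as used by cvtsi2sd: only valid when the top bit of x is clear.
double U64ToDoubleSigned(uint64_t x) {
    return static_cast<double>(static_cast<int64_t>(x));
}
```

With these, `U64ToDoubleMagic(x) * 2.0` mirrors AsDouble0 and `U64ToDoubleSigned(x >> 1) * 2.0` mirrors AsDouble1. Note that the two source functions do not compute the same value: the shift discards the low bit of `x`, so AsDouble1 is not a drop-in replacement for AsDouble0.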