The classic x86 architecture uses the floating-point unit (FPU) to perform floating-point calculations. The FPU performs all calculations in its internal registers, each of which has 80-bit precision. Every time you attempt to work with a `float` or a `double`, the variable is first loaded from memory into an internal FPU register. This means that there is absolutely no difference in the speed of the actual calculations, since in any case they are carried out with full 80-bit precision. The only thing that might differ is the speed of loading the value from memory and storing the result back to memory. Naturally, on a 32-bit platform it might take longer to load/store a `double` than a `float`; on a 64-bit platform there shouldn't be any difference.
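If you want to check this on your own machine, a rough sketch along these lines (the function name and iteration count are placeholders, not from any particular benchmark) should show essentially the same time for the `float` and the `double` loop:

    #include <chrono>
    #include <cstddef>
    #include <cstdio>

    // Rough micro-benchmark sketch: run the same accumulation loop with
    // float and with double and compare the timings. On typical x86
    // hardware the two come out essentially equal.
    template <typename T>
    double time_sum(std::size_t n) {
        volatile T acc = T(0);  // volatile keeps the loop from being optimized away
        auto start = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)
            acc = acc + T(1.000001);
        auto end = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(end - start).count();
    }

    int main() {
        const std::size_t n = 100000000;
        std::printf("float:  %.3f s\n", time_sum<float>(n));
        std::printf("double: %.3f s\n", time_sum<double>(n));
    }

The `volatile` forces a load and a store on every iteration, so the loop exercises exactly the load/compute/store pattern described above.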
Modern x86 architectures support extended instruction sets (SSE/SSE2) with new instructions that can perform the very same floating-point calculations without involving the "old" FPU instructions. However, once again, I wouldn't expect to see any difference in calculation speed between `float` and `double`. And since these modern platforms are 64-bit, the load/store speed is supposed to be the same as well.
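For illustration (a sketch, not the output of any particular compiler), this is the kind of code a typical x86-64 compiler produces for the two types with optimizations enabled; the instructions noted in the comments are the scalar SSE forms:

    // Both functions compile down to a single scalar SSE instruction on
    // x86-64, where SSE2 is the baseline. The float and double forms have
    // essentially the same cost on modern cores.
    float  add_f(float a, float b)   { return a + b; }  // typically: addss xmm0, xmm1
    double add_d(double a, double b) { return a + b; }  // typically: addsd xmm0, xmm1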
On a different hardware platform the situation could be different. But normally a smaller floating-point type should not provide any performance benefits. The main purpose of smaller floating-point types is to save memory, not to improve performance.
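To put the memory point in concrete terms (a trivial sketch; the sizes shown are the usual ones on x86 but are implementation-defined):

    #include <cstdio>

    int main() {
        // The win from float is footprint, not arithmetic speed:
        // an array of a million values takes half the space.
        std::printf("sizeof(float)  = %zu\n", sizeof(float));   // usually 4
        std::printf("sizeof(double) = %zu\n", sizeof(double));  // usually 8
        std::printf("1M floats  : %zu bytes\n", 1000000 * sizeof(float));
        std::printf("1M doubles : %zu bytes\n", 1000000 * sizeof(double));
    }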
Edit: (to address @MSalters' comment)
What I said above applies to the fundamental arithmetic operations. When it comes to library functions, the answer will depend on several implementation details. If the platform's floating-point instruction set contains an instruction that implements the functionality of a given library function, then what I said above will normally apply to that function as well (that would typically include functions like `sin`, `cos`, `sqrt`). For other functions, whose functionality is not directly supported by the FP instruction set, the situation might prove to be significantly different. It is quite possible that the `float` versions of such functions can be implemented more efficiently than their `double` versions.
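As a concrete illustration using the standard `<cmath>` overloads (the performance characterization below is the typical case, not a guarantee):

    #include <cmath>
    #include <cstdio>

    int main() {
        // sqrt maps onto a hardware instruction on x86 (sqrtss / sqrtsd),
        // so the float and double overloads cost about the same.
        float  sf = std::sqrt(2.0f);
        double sd = std::sqrt(2.0);

        // exp is usually implemented in software; the float overload can use
        // a shorter approximation, so it may be faster than the double one.
        float  ef = std::exp(1.0f);
        double ed = std::exp(1.0);

        std::printf("%f %f %f %f\n", sf, sd, ef, ed);
    }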