Should I use double or float?
P

11

105

What are the advantages and disadvantages of using one instead of the other in C++?

Pinole answered 2/7, 2009 at 13:53 Comment(1)
Has anyone tried making an array of floats and an array of doubles and see if indeed there are 4 bytes between members on floats and 8 bytes between members on doubles? It's possible that a 64bit compiler/computer might still reserve 8 bytes per member for floats even though they don't need that much.Cornell
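The experiment suggested in the comment above is easy to run. The sketch below is only an illustration (the helper name `element_stride` is made up for this example): it measures the byte distance between two adjacent array elements. C++ arrays are contiguous, so the distance always equals `sizeof(T)`; on most current desktop platforms that is 4 for float and 8 for double, although the standard does not fix those sizes.

```cpp
#include <cstddef>

// Hypothetical helper: distance in bytes between two adjacent elements
// of an array of T. Array elements are laid out contiguously, so this
// is always equal to sizeof(T), with no extra padding per element.
template <typename T>
std::size_t element_stride() {
    T values[2] = {};
    const char* first = reinterpret_cast<const char*>(&values[0]);
    const char* second = reinterpret_cast<const char*>(&values[1]);
    return static_cast<std::size_t>(second - first);
}
```

Printing `element_stride<float>()` and `element_stride<double>()` from a small `main` shows what your own compiler does.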
G
108

If you want to know the true answer, you should read What Every Computer Scientist Should Know About Floating-Point Arithmetic.

In short, although double allows for higher precision in its representation, for certain calculations it would produce larger errors. The "right" choice is: use as much precision as you need but not more and choose the right algorithm.
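As one possible illustration of "choose the right algorithm" (this example is not from the answer itself), the sketch below adds the same float value many times, once naively and once with Kahan compensated summation. The function names are invented for the example, and it assumes the compiler is not allowed to reassociate floating-point operations (no `-ffast-math` or similar), since that can remove the compensation step.

```cpp
#include <cstddef>

// Naive accumulation: every addition rounds, and the rounding errors
// pile up once the running sum is much larger than the addend.
float naive_sum(float value, std::size_t count) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        sum += value;
    }
    return sum;
}

// Kahan (compensated) summation: a second variable tracks the low-order
// bits that were lost in the previous addition and feeds them back in.
float kahan_sum(float value, std::size_t count) {
    float sum = 0.0f;
    float compensation = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        float corrected = value - compensation;
        float next = sum + corrected;
        compensation = (next - sum) - corrected;  // what the addition lost
        sum = next;
    }
    return sum;
}
```

Summing `0.1f` ten million times with both functions should show the compensated version staying close to one million while the naive float sum drifts noticeably, without changing the data type at all.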

Many compilers do extended floating-point math in "non-strict" mode anyway (i.e. they use a wider floating-point type available in hardware, e.g. 80-bit or 128-bit), so this should be taken into account as well. In practice, you can hardly see any difference in speed, since both types are native to the hardware anyway.
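One way to see whether your implementation evaluates expressions in a wider type is the standard `FLT_EVAL_METHOD` macro from `<cfloat>`. The sketch below just maps its value to a description; the function name `evaluation_method` is made up for this example.

```cpp
#include <cfloat>
#include <string>

// Describes how this implementation evaluates floating-point
// expressions, based on the standard FLT_EVAL_METHOD macro.
std::string evaluation_method() {
    switch (FLT_EVAL_METHOD) {
        case 0:
            return "operations are evaluated in the precision of their own type";
        case 1:
            return "float and double operations are evaluated as double";
        case 2:
            return "all operations are evaluated as long double";
        default:
            return "indeterminable or implementation-defined";
    }
}
```

The value depends on the target and on compiler options, so it is worth checking on the exact build configuration you care about.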

Gabbey answered 2/7, 2009 at 14:1 Comment(7)
Yes. With modern CPUs prefetching larger and larger chunks of memory, parallel numerical processing units and pipelined architectures, the speed issue is really not an issue. If you're dealing with huge quantities of numbers, then perhaps the size difference between a 4-byte float and an 8-byte double might make a difference in memory footprint.Titicaca
Well, SSE (or any vector floating-point unit) will be able to process twice the number of flops in single precision compared to double precision. If you are doing just x87 (or any scalar) floating point then it probably won't matter.Eddings
@Greg Rogers: compilers are not that smart at the moment. Unless you are writing raw assembly, it doesn't make a large difference. And yes, this may change as compilers evolve.Gabbey
An additional note: if you have absolutely no idea what the data looks like (or just have no clue about the maths in the links), just use double. It is safer in most cases.Gabbey
@jokoon, there is nothing simple in floating point and the whole precision/numerical stability problem area.Parahydrogen
can you see the difference in speed in GPUs?Secondbest
"double would produce larger errors"? Had a look at the (1991) paper, and unless I misread, he's talking about the rounding error can be higher with a greater β ; but β is not the mantissa (precision p in the doc) it's the base... and both double and float have β = 2.Multicolored
R
56

Unless you have some specific reason to do otherwise, use double.

Perhaps surprisingly, it is double and not float that is the "normal" floating-point type in C (and C++). The standard math functions such as sin and log take doubles as arguments, and return doubles. A normal floating-point literal, as when you write 3.14 in your program, has the type double. Not float.
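The literal types mentioned above can be checked at compile time. The following is a small sketch using only standard facilities. One C++-specific detail worth knowing: `<cmath>` also provides float overloads, so `std::sin` called with a float returns a float, whereas the C function `sin` always works on double.

```cpp
#include <cmath>
#include <type_traits>

// An unsuffixed floating-point literal has type double.
static_assert(std::is_same<decltype(3.14), double>::value,
              "plain literal is double");
// The f and L suffixes select float and long double.
static_assert(std::is_same<decltype(3.14f), float>::value,
              "f suffix gives float");
static_assert(std::is_same<decltype(3.14L), long double>::value,
              "L suffix gives long double");
// Mixing float and double converts the float operand to double.
static_assert(std::is_same<decltype(1.0f * 2.0), double>::value,
              "mixed arithmetic is done in double");
// C++ overloads in <cmath>: the result type follows the argument type.
static_assert(std::is_same<decltype(std::sin(1.0)), double>::value,
              "sin(double) returns double");
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "sin(float) returns float");
```

If any of these assumptions did not hold, the file would fail to compile.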

On typical modern computers, doubles can be just as fast as floats, or even faster, so performance is usually not a factor to consider, even for large calculations. (And those would have to be large calculations, or performance shouldn't even enter your mind. My new i7 desktop computer can do six billion multiplications of doubles in one second.)
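If you would rather measure than trust a general claim, a rough timing helper like the one below can be used. It is only a sketch: the name `seconds_to_scale` is invented here, and the numbers it produces depend heavily on optimization flags, vectorization, cache sizes and the machine, so they say little outside your own setup.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Rough timing sketch: multiplies n values of type T by a factor and
// returns the elapsed wall-clock time in seconds.
template <typename T>
double seconds_to_scale(std::size_t n, T factor) {
    std::vector<T> data(n, T(1));
    auto start = std::chrono::steady_clock::now();
    for (T& x : data) {
        x *= factor;
    }
    auto stop = std::chrono::steady_clock::now();
    // Read one result through a volatile so the loop cannot be
    // discarded as dead code by the optimizer.
    volatile T sink = data.empty() ? T(0) : data[n / 2];
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing `seconds_to_scale<float>(n, 1.0001f)` with `seconds_to_scale<double>(n, 1.0001)` for a large `n`, repeated a few times, gives a first impression for one particular machine and build.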

Rupture answered 25/9, 2009 at 8:12 Comment(0)
O
27

This question is impossible to answer since there is no context to the question. Here are some things that can affect the choice:

  1. Compiler implementation of floats, doubles and long doubles. The C++ standard states:

    There are three floating point types: float, double, and long double. The type double provides at least as much precision as float, and the type long double provides at least as much precision as double.

    So, all three can be the same size in memory.

  2. Presence of an FPU. Not all CPUs have FPUs and sometimes the floating point types are emulated and sometimes the floating point types are just not supported.

  3. FPU Architecture. The IA32's FPU is 80bit internally - 32 bit and 64 bit floats are expanded to 80bit on load and reduced on store. There's also SIMD which can do four 32bit floats or two 64bit floats in parallel. Use of SIMD is not defined in the standard so it would require a compiler that does more complex analysis to determine if SIMD can be used, or requires the use of special functions (libraries or intrinsics). The upshot of the 80bit internal format is that you can get slightly different results depending on how often the data is saved to RAM (thus, losing precision). For this reason, compilers don't optimise floating point code particularly well.

  4. Memory bandwidth. If a double requires more storage than a float, then it will take longer to read the data. That's the naive answer. On a modern IA32, it all depends on where the data is coming from. If it's in L1 cache, the load time is negligible provided the data comes from a single cache line. If it spans more than one cache line there's a small overhead. If it's from L2, it takes a while longer; if it's in RAM then it's longer still; and finally, if it's on disk it takes a huge amount of time. So the choice of float or double is less important than the way the data is used. If you want to do a small calculation on lots of sequential data, a small data type is preferable. Doing a lot of computation on a small data set would allow you to use bigger data types without any significant effect. If you're accessing the data very randomly, then the choice of data size is unimportant, because data is loaded in pages / cache lines. So even if you only want a byte from RAM, you could get 32 bytes transferred (this is very dependent on the architecture of the system). On top of all of this, the CPU/FPU could be superscalar and pipelined. So, even though a load may take several cycles, the CPU/FPU could be busy doing something else (a multiply, for instance) that hides the load time to a degree.

  5. The standard does not enforce any particular format for floating point values.
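Since points 1 and 5 leave the sizes and formats up to the implementation, it can help to ask the compiler what it actually provides. The sketch below uses `std::numeric_limits`; the helper name `describe` is made up for this example.

```cpp
#include <limits>
#include <sstream>
#include <string>

// Summarizes what this implementation provides for a floating-point
// type T. The name argument is only used for labeling the output.
template <typename T>
std::string describe(const char* name) {
    typedef std::numeric_limits<T> limits;
    std::ostringstream out;
    out << name << ": " << sizeof(T) << " bytes, "
        << limits::digits << " significand bits, "
        << limits::digits10 << " reliable decimal digits, max "
        << limits::max()
        << (limits::is_iec559 ? ", IEEE 754" : ", not IEEE 754");
    return out.str();
}
```

Calling `describe<float>("float")`, `describe<double>("double")` and `describe<long double>("long double")` and printing the results shows whether the three types really differ on a given platform.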

If you have a specification, then that will guide you to the optimal choice. Otherwise, it's down to experience as to what to use.

Olenta answered 2/7, 2009 at 14:43 Comment(0)
R
16

Double is more precise but is coded on 8 bytes. float is only 4 bytes, so less room and less precision.
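To make "less precision" concrete, here is a small sketch (the helper name `can_add_one` is invented for this example). It assumes the usual IEEE 754 formats, where float has a 24-bit significand and double a 53-bit one. Under that assumption, 16777216 (2^24) is the first point at which a float can no longer represent the next integer.

```cpp
// Returns true if adding 1 to value produces a different value once the
// result is stored back in a variable of type T.
template <typename T>
bool can_add_one(T value) {
    // volatile forces the result to be rounded to T and stored, so any
    // wider intermediate precision does not hide the effect.
    volatile T stored = value + T(1);
    return stored != value;
}
```

With IEEE 754 types, `can_add_one<float>(16777216.0f)` is false while `can_add_one<double>(16777216.0)` is true.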

You should be very careful if you mix double and float in your application. I had a bug due to that in the past. One part of the code was using float while the rest of the code was using double. Copying a double to a float and then the float back to a double can cause precision errors that can have a big impact. In my case, it was a chemical factory... fortunately it didn't have dramatic consequences :)
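The round trip described above can be reproduced in a few lines. This is just a sketch, and the function name `through_float` is made up for the example.

```cpp
// Narrows a double to float and widens it again, as happens when one
// part of a program stores a value in a float and another part reads
// it back as a double.
double through_float(double value) {
    float narrowed = static_cast<float>(value);
    return static_cast<double>(narrowed);
}
```

A value such as 0.5 survives the trip unchanged because it is exactly representable in both types, but 0.1 does not: `through_float(0.1) == 0.1` is false, because the float holds a coarser approximation of 0.1 than the double did.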

I think it was because of this kind of bug that the Ariane 5 rocket exploded a few years ago!

Think carefully about the type to be used for each variable.

Regelate answered 2/7, 2009 at 14:21 Comment(2)
Note that 4/8 bytes for float/double is not even guaranteed; it will depend on the platform. It might even be the same type...Buchner
The Ariane 5 code tried to convert a 64 bit floating point, whose value was greater than 32,767, into a 16 bit signed integer. This generated an overflow exception which caused the rocket to initiate its self-destruct sequence. The code in question, was code that was reused from an older, smaller rocket.Anabal
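The kind of conversion described in that comment can be guarded in C++ with an explicit range check. The sketch below is only an illustration of the idea; it is not the original flight code, and the function name `to_int16_checked` is invented here. Converting an out-of-range floating-point value to an integer type is undefined behavior in C++, so the check has to happen before the cast.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Converts value to a 16-bit signed integer (truncating toward zero)
// only if it lies within the representable range. Returns false and
// leaves out untouched for NaN or out-of-range input.
bool to_int16_checked(double value, std::int16_t& out) {
    const double lowest = std::numeric_limits<std::int16_t>::min();
    const double highest = std::numeric_limits<std::int16_t>::max();
    if (std::isnan(value) || value < lowest || value > highest) {
        return false;
    }
    out = static_cast<std::int16_t>(value);
    return true;
}
```

A caller can then decide what to do with a rejected value, for example clamp it or report an error, instead of relying on whatever the unchecked cast happens to produce.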
D
5

I personally go for double all the time until I see some bottlenecks. Then I consider moving to float or optimizing some other part.

Dumyat answered 2/7, 2009 at 13:57 Comment(0)
M
4

This depends on how the compiler implements double. It's legal for double and float to be the same type (and it is on some systems).

That being said, if they are indeed different, the main issue is precision. A double has a much higher precision due to its larger size. If the numbers you are using will commonly exceed the range or precision of a float, then use a double.

Several other people have mentioned performance issues. That would be exactly last on my list of considerations. Correctness should be your #1 consideration.

Metralgia answered 2/7, 2009 at 13:58 Comment(0)
R
3

I think regardless of the differences (which as everyone points out, floats take up less space and are in general faster)... does anyone ever suffer performance issues using double? I say use double... and if later on you decide "wow, this is really slow"... find your performance bottleneck (which is probably not the fact you used double). THEN, if it's still too slow for you, see where you can sacrifice some precision and use float.

Redheaded answered 2/7, 2009 at 14:1 Comment(0)
W
3

Use whichever precision is required to achieve the appropriate results. If you then find that your code isn't performing as well as you'd like (you did profile it, correct?), take a look at:

Wax answered 2/7, 2009 at 14:4 Comment(0)
B
2

It depends highly on the CPU. The most obvious trade-offs are between precision and memory. With GBs of RAM, memory is not much of an issue, so it's generally better to use doubles.

As for performance, it depends highly on the CPU. floats will usually get better performance than doubles on a 32 bit machine. On 64 bit, doubles are sometimes faster, since it is (usually) the native size. Still, what will matter much more than your choice of data types is whether or not you can take advantage of SIMD instructions on your processor.

Bullfinch answered 2/7, 2009 at 14:14 Comment(0)
F
1

double has higher precision, whereas floats take up less memory and are faster. In general you should use float unless you have a case where it isn't accurate enough.

Fletcher answered 2/7, 2009 at 13:55 Comment(1)
On typical modern computers, double is just as fast as float.Rupture
W
1

The main difference between float and double is precision. Wikipedia has more info about Single precision (float) and Double precision.

Wolffish answered 2/7, 2009 at 13:58 Comment(0)
