Why does C++ promote an int to a float when a float cannot represent all int values?

Say I have the following:

int i = 23;
float f = 3.14;
if (i == f) // do something

i will be promoted to a float and the two float numbers will be compared, but can a float represent all int values? Why not promote both the int and the float to a double?

Salina answered 18/1, 2015 at 14:8 Comment(11)
This is relevant: #759514 – Cori
float can represent the range of int values, it just can't represent those longer than 7 digits exactly (and double can't represent more than 15 digits exactly). – Calathus
And in general (exception for values < 32 bits) the rule (similar to most other languages) is to "promote" values in an expression to the closest common representation. (Besides which, 3.14 is never going to compare equal to an int.) – Calathus
probably because C did it that way – Youngyoungblood
https://mcmap.net/q/281311/-why-the-operands-of-an-operator-needs-to-be-of-the-same-type/560648 is relevant too – Peart
That's not a promotion, it's a conversion (formally, one of the usual arithmetic conversions; see 5 [expr] p10) – Minefield
How is this not a duplicate more than 6 years after Stack Overflow launched? – Warrior
One should of course also consider that == on floats is often a bad idea... – Postulate
"why promote in the first place" -- The Peter Principle. – Calathus
this is what you get for using a weakly-typed language ;) – Barrow
As far as I can see, every single answer offered to this question is pure speculation. No-one has done any historical research to see who made the decision, and whether they documented their reasons. Everyone is answering, in effect, "given the state of the world at the time, someone might have thought it was a good idea because XXXX". – Popelka

When int is promoted to unsigned in the integral promotions, negative values are also lost (which leads to such fun as 0u < -1 being true).
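
For instance (standard behavior, though most compilers will warn about the signed/unsigned comparison):

#include <iostream>

int main()
{
    // The usual arithmetic conversions turn -1 into a huge unsigned value
    // (UINT_MAX), so the comparison is performed on unsigned operands.
    std::cout << std::boolalpha << (0u < -1) << '\n';  // prints true
}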

Like most mechanisms in C (that are inherited in C++), the usual arithmetic conversions should be understood in terms of hardware operations. The makers of C were very familiar with the assembly language of the machines with which they worked, and they wrote C to make immediate sense to themselves and people like themselves when writing things that would until then have been written in assembly (such as the UNIX kernel).

Now, processors, as a rule, do not have mixed-type instructions (add float to double, compare int to float, etc.) because it would be a huge waste of real estate on the wafer -- you'd need a separate opcode for every combination of types you wanted to support. That you only have instructions for "add int to int," "compare float to float," "multiply unsigned with unsigned," etc. makes the usual arithmetic conversions necessary in the first place -- they are a mapping of two types to the instruction family that makes the most sense to use with them.

From the point of view of someone who's used to writing low-level machine code, if you have mixed types, the assembler instructions you're most likely to consider in the general case are those that require the fewest conversions. This is particularly the case with floating point, where conversions are expensive at runtime -- and it was especially true in the early 1970s, when C was developed: computers were slow, and floating-point calculations were done in software. This shows in the usual arithmetic conversions -- only one operand is ever converted (with the single exception of long/unsigned int, where the long may be converted to unsigned long, which requires nothing to be done on most machines -- perhaps on none where the exception applies).

So, the usual arithmetic conversions are written to do what an assembly coder would do most of the time: you have two types that don't fit, convert one to the other so that it does. This is what you'd do in assembler code unless you had a specific reason to do otherwise, and to people who are used to writing assembler code and do have a specific reason to force a different conversion, explicitly requesting that conversion is natural. After all, you can simply write

if ((double) i < (double) f)

It is interesting to note in this context, by the way, that unsigned is higher in the hierarchy than int, so comparing an int with an unsigned ends in an unsigned comparison (hence the 0u < -1 bit from the beginning). I suspect this is an indicator that people in olden times considered unsigned less as a restriction on int than as an extension of its value range: we don't need the sign right now, so let's use the extra bit for a larger value range. You'd use it if you had reason to expect that an int would overflow -- a much bigger worry in a world of 16-bit ints.

Lax answered 18/1, 2015 at 15:15 Comment(10)
"are lost" ... "should be understood in terms of hardware operations" ... No, they are defined independent of hardware operations arithmetically, in a way which makes sense, considering the limitations of allowing for efficient implementation. Take a llok at 1s-complement and sign-magnitude.Devinna
There is no denying, I should think, that many design decisions in the early development of C can be understood in terms of the hardware of the day. Do not confuse them with the standardization procedures of today -- by the time C was standardized, 20 years of uncontrolled compiler growth had already happened, and the committee had to codify what notable compilers were already doing. They wrote it down in a way that is independent of hardware, of course, but the reasons this is the way it is are closely coupled to hardware considerations. Had C begun as a spec, things might be different.Lax
That's quite a wall of text. What about adding a TL;DR section or the like?Harewood
I am new to coding, but is it also because an integer can be expressed as a float, since its only whole numbers, but a float cannot be represented as an int, since it could have decimals, which INT doesn't support? Anybody know?Overalls
@Akham basically yes: The int -> float is sometimes perfect and often subjectively "less wrong" than float -> int. I also assume that in the 1970s int -> float was often loss free for all ints which e.g. on a PDP 11 were only 16 bits. The OP wondered from a today's perspective where ints have 32 bits so that larger numbers cannot be exactly presented in a single percision float, why a conversion which is lossy in most cases is the default.Nomen
The C standard addresses non-2's complement machines, where signed-to-unsigned conversion isn't a no-op (just to give an example where this matters, though this was different before C89 (implementation-defined, iirc), and at the time the standard was made, these machines were already rare).Unto
@Harewood TL;DR C++ does what C does. C was designed by and for assembly coders from the 1970s. It does what your average '70s assembly coder would expect to see or have done by hand.Vacate
@stonemetal: Originally, C used to promote all floating-point types to double, since it meant that it could omit all code for handling floats other than convert-to-double and convert-from-double. In general, converting everything to the highest-precision type is semantically superior to working with lower precision, provided that (1) it's possible for code to define storage locations of the highest-precision type, and (2) conversions to lower-precision types only happen when assigning to storage locations of those types. From a performance perspective, on many platforms...Guss
...*with and without floating-point units*, use of a higher-precision intermediate type could improve performance as well (on a 16-bit or 32-bit processor with no FPU, most operations on float or double require unpacking the mantissa and exponent into separate registers, doing the operation, and then packing the result; keeping the values in separate registers allows higher precision with no performance cost). It's too bad so many C compilers failed to let code specify deterministic extra-precision semantics, since it was useful in many ways.Guss
Note that floating-point conversions are still excruciatingly slow on modern CPUs.Masked

Even double may not be able to represent all int values, depending on how many bits int contains.

Why not promote both the int and the float to a double?

Probably because it's more costly to convert both operands to double than to use one of them, which is already a float, as a float. It would also introduce special rules for comparison operators that are incompatible with the rules for arithmetic operators.

There's also no guarantee of how floating-point types will be represented, so it would be a blind shot to assume that converting int to double (or even long double) for comparison will solve anything.
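
For instance, here is a sketch of the 64-bit case (assuming IEEE 754 double, whose significand holds 53 bits, and a 64-bit long long):

#include <iostream>

int main()
{
    long long big = (1LL << 53) + 1;      // 9007199254740993 needs 54 significant bits
    double d = static_cast<double>(big);  // typically rounds to 9007199254740992.0
    std::cout << std::boolalpha
              << (static_cast<long long>(d) == big) << '\n';  // prints false
}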

Ersatz answered 18/1, 2015 at 14:14 Comment(0)

The type promotion rules are designed to be simple and to work in a predictable manner. The types in C/C++ are naturally "sorted" by the range of values they can represent. See this for details. Although floating-point types cannot represent all the integers that integral types can, because they can't hold the same number of significant digits, they may be able to represent a wider range.

To have predictable behavior when type promotions are required, the numeric types are always converted to the type with the larger range, to avoid overflow in the smaller one. Imagine this:

int i = 23464364; // more digits than float can represent!
float f = 123.4212E36f; // larger range than int can represent!
if (i == f) { /* do something */ }

If the conversion were done towards the integral type, the float f would certainly overflow when converted to int, leading to undefined behavior. On the other hand, converting i to float only causes a loss of precision, and since f is limited to that same precision anyway, the comparison can still meaningfully succeed. It's up to the programmer at that point to interpret the result of the comparison according to the application requirements.

Finally, besides the fact that double-precision floating-point numbers suffer from the same problem representing integers (a limited number of significant digits), promoting both operands would give a higher-precision representation of i while f is doomed to keep its original precision, so the comparison would fail whenever i has more significant digits than f can hold to begin with. That would also be inconsistent: the comparison might succeed for some pairs (i, f) but not for others.
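
A concrete sketch of the precision-loss case (assuming IEEE 754 single precision, whose significand holds 24 bits):

#include <iostream>

int main()
{
    int i = 16777217;       // 2^24 + 1: the first integer a float cannot hold exactly
    float f = 16777216.0f;  // 2^24
    // i is converted to float and rounds to 16777216.0f, so the comparison
    // succeeds even though the exact values differ by 1.
    std::cout << std::boolalpha << (i == f) << '\n';  // typically prints true
}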

Maya answered 18/1, 2015 at 15:0 Comment(2)
The types are sorted by their maximum value, not the size of their range. int and unsigned, for example, have an equally large range. – Dhobi
@Dhobi I think you're right: I believe the compiler will interpret an unsigned int as an int or vice versa depending on the type of the lvalue. – Maya

can a float represent all int values?

For a typical modern system where both int and float are stored in 32 bits, no. Something's gotta give. 32 bits' worth of integers doesn't map 1-to-1 onto a same-sized set that also includes fractions.

The i will be promoted to a float and the two float numbers will be compared…

Not necessarily. You don't really know what precision will apply. C++14 §5/12:

The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.

Although i after promotion has nominal type float, the value may be represented using double hardware. C++ doesn't guarantee floating-point precision loss or overflow. (This is not new in C++14; it's inherited from C since olden days.)
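
One way to see which choice an implementation makes is FLT_EVAL_METHOD from <cfloat> (0: types keep their own precision; 1: float arithmetic is done in double; 2: everything is done in long double, as on classic x87; -1: indeterminable):

#include <cfloat>
#include <iostream>

int main()
{
    std::cout << FLT_EVAL_METHOD << '\n';  // e.g. 0 with SSE math, 2 with x87 math
}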

Why not promote both the int and the float to a double?

If you want optimal precision everywhere, use double instead and you'll never see a float. Or long double, but that might run slower. The rules are designed to be relatively sensible for the majority of use-cases of limited-precision types, considering that one machine may offer several alternative precisions.

Most of the time, fast and loose is good enough, so the machine is free to do whatever is easiest. That might mean a rounded, single-precision comparison, or double precision and no rounding.

But such rules are ultimately compromises, and sometimes they fail. To specify arithmetic in C++ (or C) precisely, it helps to make conversions and promotions explicit. Many style guides for extra-reliable software prohibit implicit conversions altogether, and most compilers offer warnings to help you expunge them.
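
For example, in the question's snippet the conversions can be spelled out (a sketch; the choice of double here is just one option):

if (static_cast<double>(i) == static_cast<double>(f))
{
    // both conversions are now explicit and visible in the code
}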

To learn about how these compromises came about, you can peruse the C rationale document. (The latest edition covers up to C99.) It is not just senseless baggage from the days of the PDP-11 or K&R.

Illuminant answered 19/1, 2015 at 5:32 Comment(7)
@FractalizeR Thanks… I was late to the party, so we'll see what happens. – Illuminant
Historically, computations on long double were often faster than double on machines which defined it as a distinct type. If ANSI C had provided a means by which prototypes for variable-argument functions could specify a type to which all floating-point values would be converted (with the default being double), then it would have been practical to say that all operations on double promote to long double (which may or may not be larger than double) and all operations on float promote to long float (which could be float, long double, or anything in-between). – Guss
Unfortunately, the broken behavior of printf with regard to the type long double has resulted in the type being unsuitable for its original intended use, since it precluded double operands from being cleanly promoted to long double even on processors where using long double would be faster than using double. – Guss
@Guss There are some misconceptions there. 1. Floating-point operands are not promoted like integer ones are: the expression 1.f + 1.f has type float, not double. C varargs is special. 2. The precision of intermediate results is independent of their floating-point type: 1.f + 1.e-10f - 1.f may well be 1.e-10f. 3. If double and long double both conform to IEEE 754, then they're 64 and either 80 or 128 bits, respectively. Most FPUs today are 64 bits wide, and inside a vector datapath. That long double was faster, ever or anywhere, is no more than a historical fluke. – Illuminant
Also, IIRC, long double was a late addition to the C language, circa C89. There's no way it could have been made the default format for printf, upon its introduction, without breaking every ABI or trivializing itself. – Illuminant
@Potatoswatter: In C as originally defined, all floating-point values promoted to a common type when passed to a varargs function. Having varargs arguments promote to long double in the absence of a prototype requesting it would indeed have been a breaking change, but a simple solution would have been to say that in the absence of a prototype requesting that all floating-point arguments promote to long double, they should all (including long double) get converted to double. ABI compatibility would then require that code use prototypes suitable for the library being used, which... – Guss
...is hardly an unusual requirement. Further, many processors without an FPU (still common in the embedded world) can work with 1+15+32 or 1+15+64-bit floating-point types more efficiently than with 1+8+(implied 1)+23 or 1+11+(implied 1)+52. No standard emerged for 1+15+32, but 1+15+64 is a great format for machines without floating-point units. I'd consider the purposes and design objectives of a vector-math unit different from those of a general-purpose floating-point unit; if C hadn't destroyed the usefulness of 80-bit types, having an 80-bit unit plus a vector unit would be a useful combo. – Guss

It is fascinating that a number of answers here argue from the origin of the C language, explicitly naming K&R and historical baggage as the reason that an int is converted to a float when combined with a float.

This is pointing the blame at the wrong parties. In K&R C, there was no such thing as a float calculation. All floating-point operations were done in double precision. For that reason, an integer (or anything else) was never implicitly converted to a float, only to a double. A float also could not be the type of a function argument: you had to pass a pointer to float if you really, really, really wanted to avoid conversion into a double. For that reason, the functions

int x(float a)
{ ... }

and

int y(a)
float a;
{ ... }

have different calling conventions. The first gets a float argument, the second (by now no longer permissible as syntax) gets a double argument.

Single-precision floating point arithmetic and function arguments were only introduced with ANSI C. Kernighan/Ritchie is innocent.

Now with the newly available single float expressions (single float previously was only a storage format), there also had to be new type conversions. Whatever the ANSI C team picked here (and I would be at a loss for a better choice) is not the fault of K&R.

Jocasta answered 18/1, 2015 at 16:27 Comment(4)
And what was that double type called in K&R C? I guess it shouldn't have been double, otherwise single or float should also have been present. – Schonthal
@Ruslan: You understood it wrong. He was talking only about float operations, not float storage. – Oversold
@Ruslan: In C as originally intended, all operations on floating-point types would convert all operands to the highest-precision type (which happened to be double), operate on them, and then convert the final result back to the specified type (which could originally be float or double). On machines without floating-point units, such an approach will generally require less code than using separately-coded operations for different floating-point types, and it also generally yields better results. For some kinds of applications, there may be advantages to doing things other ways... – Guss
...but evaluating an expression like f5=f1*f2+f3*f4 by computing double-precision products and then performing the addition will yield 0.51 ulp of accuracy easily; ensuring even 1 ulp of accuracy without promotion to double is often much harder. – Guss

Q1: Can a float represent all int values?

IEEE 754 can represent all integers exactly as floats up to about 2^24, as mentioned in this answer.

Q2: Why not promote both the int and the float to a double?

The rules in the Standard for these conversions are slight modifications of those in K&R: the modifications accommodate the added types and the value preserving rules. Explicit license was added to perform calculations in a “wider” type than absolutely necessary, since this can sometimes produce smaller and faster code, not to mention the correct answer more often. Calculations can also be performed in a “narrower” type by the as if rule so long as the same end result is obtained. Explicit casting can always be used to obtain a value in a desired type.

Source

Performing calculations in a wider type means that given float f1; and float f2;, f1 + f2 might be calculated in double precision. And it means that given int i; and float f;, i == f might be calculated in double precision. But it isn't required to calculate i == f in double precision, as hvd stated in the comment.

The C standard says so as well. These are known as the usual arithmetic conversions. The following description is taken straight from the ANSI C standard.

...if either operand has type float, the other operand is converted to type float.

Source, and you can see it in the ref too.

A relevant link is this answer. A more analytic source is here.

Here is another way to explain this: the usual arithmetic conversions are implicitly performed to cast the operands to a common type. The compiler first performs the integer promotions; if the operands still have different types, they are converted to the type that appears highest in the following hierarchy:

(Image: the conversion hierarchy, from highest to lowest: long double, double, float, unsigned long, long, unsigned int, int.)

Source.
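
A short way to observe these result types directly (uses std::is_same_v, which is C++17):

#include <type_traits>

int main()
{
    int i = 0;
    unsigned u = 0;
    float f = 0.0f;

    // The result of a mixed expression takes the type highest in the hierarchy.
    static_assert(std::is_same_v<decltype(i + f), float>);    // int + float -> float
    static_assert(std::is_same_v<decltype(i + u), unsigned>); // int + unsigned -> unsigned
    static_assert(std::is_same_v<decltype(f + 1.0), double>); // float + double -> double
}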

Heartbroken answered 18/1, 2015 at 14:13 Comment(5)
"Because the C standard says so." is not even an answer to the question why C works like that, let alone an answer to the question why C++ works like that.Yore
@hvd you caught me on editing, I think it's a bit better now. :)Heartbroken
Yeah, that's definitely an improvement. It's not exactly what the question asks, though. Performing calculations in a wider type means that given float f1; and float f2;, f1 + f2 might be calculated in double precision. And it means that given int i; and float f;, i == f might be calculated in double precision. But it isn't required to calculate i == f in double precision, and the question asks why not.Yore
Oh you are right @hvd. If you think I should delete the answer let me know.Heartbroken
Improving the answer is generally better than deleting it :) You did include relevant information in your answer, and I think it's worth keeping that. (FWIW, I don't have a good answer either, else I'd have posted it already.)Yore

When a programming language is created, some decisions are made intuitively.

For instance, why not convert int+float to int+int instead of float+float or double+double? Why call int->float a promotion if it holds the same amount of bits? Why not call float->int a promotion?

If you rely on implicit type conversions you should know how they work; otherwise, just convert manually.

A language could have been designed without any automatic type conversions at all. And not every decision during a design phase can be made logically, with a good reason.

JavaScript with its duck typing has even more obscure decisions under the hood. Designing an absolutely logical language is impossible; I think it goes back to Gödel's incompleteness theorem. You have to balance logic, intuition, practice, and ideals.

Mentalist answered 18/1, 2015 at 15:9 Comment(3)
"Why call int->float a promotion if it holds the same about of bits?" It's not called a promotion.Minefield
the property that makes javascript comparisons wacky is its weak typing, not duck typing. duck ≈ dynamic.Barrow
@Barrow I would say duck ≈ typeless polymorphism.Ersatz

The question is why. Because it is fast, easy to explain, and easy to compile, and these were all very important considerations at the time the C language was developed.

You could have had a different rule: that for every comparison of arithmetic values, the result is that of comparing the actual numerical values. That would be somewhere between trivial (if one of the compared expressions is a constant), one additional instruction (when comparing signed and unsigned int), and quite difficult (if you compare long long and double and want correct results when the long long cannot be represented as a double). Under that rule, 0u < -1 would be false, because it would compare the numerical values 0 and -1 without considering their types.
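
As a rough sketch of the hard case (my illustration, assuming IEEE 754 double and a 64-bit long long), an exact a < b could look like this:

#include <cmath>
#include <iostream>

// Compare a long long and a double by numerical value, with no silent rounding.
bool less_exact(long long a, double b)
{
    if (std::isnan(b)) return false;            // ordered comparisons with NaN are false
    const double two63 = std::ldexp(1.0, 63);   // 2^63, exactly representable as double
    if (b >= two63) return true;                // b > LLONG_MAX >= a
    if (b < -two63) return false;               // b < LLONG_MIN <= a
    double ip;                                  // b now fits the long long range
    double frac = std::modf(b, &ip);            // split into integral part and fraction
    long long bi = static_cast<long long>(ip);  // exact: ip is integral and in range
    if (a != bi) return a < bi;                 // differing integral parts decide
    return frac > 0.0;                          // equal integral parts: fraction decides
}

int main()
{
    long long a = 9007199254740995LL;  // 2^53 + 3
    double b = 9007199254740996.0;     // 2^53 + 4
    std::cout << std::boolalpha
              << (a < b) << ' '             // typically false: a first rounds up to 2^53 + 4
              << less_exact(a, b) << '\n';  // true: 2^53 + 3 really is less
}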

In Swift, the problem is solved easily by disallowing operations between different types.

Hernadez answered 18/1, 2015 at 15:38 Comment(0)

The rules are written for 16-bit ints (the smallest required size). With 32-bit ints, your compiler surely converts both sides to double. There are no float registers in modern hardware anyway, so it has to convert to double. Now if you have 64-bit ints, I'm not too sure what it does. long double would be appropriate (normally 80 bits, but it's not even standard).
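
To check what the floating-point types look like on a given implementation (sizes vary; "4 8 16" is typical on x86-64 Linux, where long double is the 80-bit x87 format padded to 16 bytes):

#include <iostream>

int main()
{
    std::cout << sizeof(float) << ' '
              << sizeof(double) << ' '
              << sizeof(long double) << '\n';
}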

Whim answered 20/1, 2015 at 0:29 Comment(5)
Don't assume that every machine is a PC or Mac. There is a lot of 16/32-bit hardware in use which has 32-bit FPU registers or no FPU at all. There is even modern 8-bit hardware like the AVR CPUs, which don't even have hardware division of integers, never mind a floating-point unit. These CPUs are embedded in all kinds of "stupid" devices around you. You are surrounded by such stuff (routers, elevators, traffic light controllers, printers...). They are often meant to be programmed in C or C++, so don't underestimate the low end -- it's maybe more important than your shiny Haswell desktop. – Ersatz
I haven't seen one such for which int isn't 16 bit, though. – Whim
int is at least 16 bits by the standard, but... -> gcc.gnu.org/wiki/avr-gcc – Ersatz
@doc: it's 16 bit there too. – Whim
"8-bit int with -mint8: With -mint8, int is only 8 bits wide, which does not comply with the C standard." – Ersatz
