What happens when I mix signed and unsigned types in C++?

3

36

I have some doubts about type conversion. Could you explain what happens in an expression like this:

unsigned int u = 10; 
int a = -42; 
std::cout << u - a << std::endl;

I know the result will be 52 if I apply the conversion rules for an arithmetic operator with two operands.

However, what happens after the compiler converts a to an unsigned value and creates a temporary object of unsigned type? The expression should now be 10 - 4294967254.

Rosiorosita answered 1/9, 2014 at 15:32 Comment(1)
Step 1: You get a copy of the C++ or C Standard (latest drafts are free) and check it. Step 2: You decide that you'll never be able to remember the rules and avoid that kind of thing in the future. Rasheedarasher
47

In simple terms, if you mix types of the same rank (in the sequence of int, long int, long long int), the unsigned type "wins" and the calculations are performed within that unsigned type. The result is of the same unsigned type.

If you mix types of different rank, the higher-ranked type "wins", if it can represent all values of lower-ranked type. The calculations are performed within that type. The result is of that type.

Finally, if the higher-ranked type cannot represent all values of lower-ranked type, then the unsigned version of the higher ranked type is used. The result is of that type.
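These three rules can be checked at compile time. The following is only a sketch: `actual_t` and `predicted_t` are hypothetical aliases introduced for this illustration (they are not standard library names), and `predicted_t` only models the case where the signed type has the higher rank. Which branch applies for a given pair depends on the platform's type widths.

```cpp
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical alias: the type the compiler actually gives to A + B.
template <typename A, typename B>
using actual_t = decltype(std::declval<A>() + std::declval<B>());

// Hypothetical alias: the type the rules above predict when Signed has a
// higher rank than Unsigned. If Signed can represent every Unsigned value,
// Signed wins; otherwise the unsigned version of Signed is used.
template <typename Signed, typename Unsigned>
using predicted_t = std::conditional_t<
    (std::numeric_limits<Signed>::digits >= std::numeric_limits<Unsigned>::digits),
    Signed,
    std::make_unsigned_t<Signed>>;

// Same rank: the unsigned type wins.
static_assert(std::is_same<actual_t<int, unsigned int>, unsigned int>::value,
              "int + unsigned int is unsigned int");

// Different rank: the outcome depends on whether the signed type is wide enough.
static_assert(std::is_same<actual_t<long, unsigned int>,
                           predicted_t<long, unsigned int>>::value,
              "long + unsigned int follows the rank rules");
static_assert(std::is_same<actual_t<long long, unsigned long>,
                           predicted_t<long long, unsigned long>>::value,
              "long long + unsigned long follows the rank rules");
```

On a typical 64-bit Linux target, long long and unsigned long are both 64 bits wide, so long long + unsigned long falls under the last rule and has type unsigned long long. On 64-bit Windows, unsigned long is 32 bits wide, so the same expression has type long long.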

In your case you mixed types of the same rank (int and unsigned int), which means that the whole expression is evaluated within the unsigned int type. The expression, as you correctly stated, is now 10 - 4294967254 (for 32-bit int). Unsigned types obey the rules of modulo arithmetic with 2^32 (4294967296) as the modulus. If you carefully calculate the result (which can be expressed arithmetically as 10 - 4294967254 + 4294967296), it turns out to be the expected 52.
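The two steps can be verified by the compiler itself. This is a minimal sketch that assumes a 32-bit unsigned int (the static_assert at the top enforces that assumption); `mixed_difference` is a hypothetical helper name chosen for this example.

```cpp
#include <limits>

// Assumption: unsigned int is 32 bits wide, as in the answer's "32-bit int" case.
static_assert(std::numeric_limits<unsigned int>::digits == 32,
              "this sketch assumes a 32-bit unsigned int");

// Hypothetical helper mirroring the question's expression u - a.
constexpr unsigned int mixed_difference(unsigned int u, int a) {
    // a is implicitly converted to unsigned int before the subtraction.
    return u - a;
}

// Step 1: -42 converts to 2^32 - 42.
static_assert(static_cast<unsigned int>(-42) == 4294967254u,
              "-42 becomes 4294967254");

// Step 2: 10 - 4294967254 wraps around modulo 2^32.
static_assert(10u - 4294967254u == 52u, "the subtraction wraps to 52");

// Both steps together, exactly as written in the question.
static_assert(mixed_difference(10u, -42) == 52u, "u - a is 52");
```

Calling `mixed_difference(10u, -42)` at run time and printing the result with std::cout gives the same 52.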

Homo answered 1/9, 2014 at 16:15 Comment(6)
Sorry, I got lost. I understand that the expression becomes unsigned int temporary = 10 - 4294967254, but I can't understand why it then becomes 10 - 4294967254 + 4294967296. Why do you add the modulus to the expression? Rosiorosita
@Piero Borrelli: One way to calculate the modulo N equivalent of a negative value V is to add N to it as many times as necessary (V + N, V + 2N, V + 3N and so on) until you hit the first non-negative value. In the case of C++ additive operations, a mathematically negative result needs the modulus added only once to arrive at the proper unsigned result. Homo
@Piero Borrelli: Of course, this is a purely arithmetic rule. The compiler does not have to do anything like that. It does not have to worry about it at all. If negative values are represented in 2's complement, simply reinterpreting that representation as an unsigned one immediately provides the correct result. Homo
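The "add N until the value is non-negative" description can be sketched with a wider signed type standing in for ordinary mathematical integers. `wrap_by_adding` and `kModulus` are names made up for this illustration, the 2^32 modulus assumes a 32-bit unsigned int, and the loop in a constexpr function needs C++14 or later.

```cpp
#include <cstdint>

// Assumption: 32-bit unsigned int, so the modulus is 2^32.
constexpr std::int64_t kModulus = std::int64_t{1} << 32;

// Hypothetical helper: add n until v is no longer negative.
constexpr std::int64_t wrap_by_adding(std::int64_t v, std::int64_t n) {
    while (v < 0) {
        v += n;
    }
    return v;
}

// Converting -42: one addition of 2^32 gives 4294967254.
static_assert(wrap_by_adding(-42, kModulus) == 4294967254LL,
              "-42 is congruent to 4294967254 modulo 2^32");

// The subtraction 10 - 4294967254 is mathematically negative;
// adding 2^32 once gives 52.
static_assert(wrap_by_adding(10 - 4294967254LL, kModulus) == 52,
              "10 - 4294967254 is congruent to 52 modulo 2^32");

// The language-level conversion yields the same value as the arithmetic rule,
// which is also the 2's complement bit pattern 0xFFFFFFD6.
static_assert(static_cast<std::uint32_t>(std::int32_t{-42}) == 0xFFFFFFD6u,
              "-42 converts to 0xFFFFFFD6");
```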
Can you define what you mean by "rank"? C++ doesn't use rank in that way, making this answer ambiguous at best, nonsensical at worst. Raseta
@Adrian: Actually, it does. I'm referring to the concept of integer conversion rank, as it is used in the description of usual arithmetic conversions. The description in my answer is not an exact quote from the standard, since it is intended to be tailored to the specific case of u - a from the original question. Homo
Regarding the last rule "Finally, if the higher-ranked type cannot represent all values of lower-ranked type, then the unsigned version of the higher ranked type is used. The result is of that type.", can anybody give a specific example to illustrate it? Thanks. Synchrocyclotron
10
  1. Due to standard conversion rules, the signed value a is converted to an unsigned type prior to the subtraction. That conversion happens according to [conv.integral] p3:

Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo 2^N, where N is the width of the destination type.

Algebraically, a becomes a very large positive number, certainly larger than u.

  2. u - a is a nameless temporary object of unsigned type. (You can verify this by writing auto t = u - a and inspecting the type of t in your debugger.) Mathematically, this will first be a negative number, but after implicit conversion to the unsigned type, a wraparound rule similar to the one above is invoked.

In short, the two conversion operations have equal and opposite effects and the result will be 52. In practice, the compiler might optimize out all these conversions.
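As an alternative to inspecting t in a debugger, the type and the value can be checked at compile time. This is a rough sketch using the question's values; the explicit static_cast in the first check spells out the conversion that happens implicitly inside u - a.

```cpp
#include <type_traits>

constexpr unsigned int u = 10;
constexpr int a = -42;

// Step 1: converted to unsigned int, a becomes far larger than u.
static_assert(static_cast<unsigned int>(a) > u, "converted a exceeds u");

// Step 2: the subtraction is carried out in unsigned int,
// so the expression itself has type unsigned int.
static_assert(std::is_same<decltype(u - a), unsigned int>::value,
              "u - a has unsigned type");

// The two wraparounds cancel, so the result is 52
// whatever the width of unsigned int is.
constexpr auto t = u - a;
static_assert(t == 52u, "u - a is 52");
```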

Dot answered 1/9, 2014 at 15:47 Comment(0)
-3

Here is the disassembled code. It first stores -42 in its 2's complement form (0xffffffd6) and then performs the sub operation, so the result is 10 + 42.

0x0000000000400835 <+8>:    movl   $0xa,-0xc(%rbp)
0x000000000040083c <+15>:   movl   $0xffffffd6,-0x8(%rbp)
0x0000000000400843 <+22>:   mov    -0x8(%rbp),%eax
0x0000000000400846 <+25>:   mov    -0xc(%rbp),%edx
0x0000000000400849 <+28>:   sub    %eax,%edx
0x000000000040084b <+30>:   mov    %edx,%eax
Castrate answered 1/9, 2014 at 16:6 Comment(1)
In the general case, disassembled code cannot serve as a meaningful source for understanding language-level semantics. Code generation is a one-way function. It is not possible to "trace it back", i.e. to figure out what the compiler was actually trying to do by looking at the generated code. Homo
