Is a negative integer summed with a greater unsigned integer promoted to unsigned int?
Asked Answered
Z

5

20

After getting advised to read "C++ Primer 5 ed by Stanley B. Lipman" I don't understand this:

Page 66. "Expressions Involving Unsigned Types"

unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl; // prints -84
std::cout << u + i << std::endl; // if 32-bit ints, prints 4294967264

He said:

In the second expression, the int value -42 is converted to unsigned before the addition is done. Converting a negative number to unsigned behaves exactly as if we had attempted to assign that negative value to an unsigned object. The value “wraps around” as described above.

But if I do something like this:

unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?

As you can see -10 is not converted to unsigned int. Does this mean a comparison occurs before promoting a signed integer to an unsigned integer?

Ziwot answered 14/1, 2019 at 22:26 Comment(5)
As you can see -10 is not converted to unsigned int. It is.Sclerous
Google about binary numbers and the way they are represented, in particular signedness. Then all shall become clear.Hereupon
What result were you expecting instead of 32?Marvismarwin
Possibly related: c++ safeness of code with implicit conversion between signed and unsignedOutflow
Thank you all guys. You make it clear for me now.Ziwot
Q
26

-10 is being converted to a unsigned integer with a very large value, the reason you get a small number is that the addition wraps you back around. With 32 bit unsigned integers -10 is the same as 4294967286. When you add 42 to that you get 4294967328, but the max value is 4294967296, so we have to take 4294967328 modulo 4294967296 and we get 32.

Quadriceps answered 14/1, 2019 at 22:36 Comment(5)
What I find interesting with modulo arithmetic is that by "convoluted" means, you get the same answer as regular arithmetic. This is one reason to justify wrapping behavior for overflow handling.Quag
@MatthieuM. The "convolution" only happens when you think of unsigned integers as numbers. The fact that adding a number from the equivalence class of 42 to a number from the equivalence class of -10 yields a value from the equivalence class of 32 is no less intuitive than the signed result IMO.Horace
So easy and simple to understand thank you. One thing: Why short and char are not promoted to unsigned?Ziwot
@Ziwot You're welcome. short and char get promoted to int as that is the smallest built in type that the mathematical operators use. The only time where that could be different is when you have a unsigned char or unsigned short and it has the same size as int. Then it would be promoted to an unsigned int.Quadriceps
@NathanOliver: Perfect! thank you again. Really got your point.Ziwot
F
18

Well, I guess this is an exception to "two wrongs don't make a right" :)

What's happening is that there are actually two wrap arounds (unsigned overflows) under the hood and the final result ends up being mathematically correct.

  • First, i is converted to unsigned and as per the wrap around behavior the value is std::numeric_limits<unsigned>::max() - 9.

  • When this value is summed with u the mathematical result would be std::numeric_limits<unsigned>::max() - 9 + 42 == std::numeric_limits<unsigned>::max() + 33 which is an overflow and we get another wrap around. So the final result is 32.


As a general rule in an arithmetic expression if you only have unsigned overflows (no matter how many) and if the final mathematical result is representable in the expression data type, then the value of the expression will be the mathematically correct one. This is a consequence of the fact that unsigned integers in C++ obey the laws of arithmetic modulo 2n (see bellow).


Important notice. According to C++ unsigned arithmetic does not overflow:

§6.9.1 Fundamental types [basic.fundamental]

  1. Unsigned integers shall obey the laws of arithmetic modulo 2n where n is the number of bits in the value representation of that particular size of integer 49

49) This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.

I will however leave "overflow" in my answer to express values that cannot be represented in regular arithmetic.

Also what we colloquially call "wrap around" is in fact just the arithmetic modulo nature of the unsigned integers. I will however use "wrap around" also because it is easier to understand.

Facility answered 14/1, 2019 at 22:36 Comment(18)
Unsigned arithmetic does not overflow.Horace
afaik it does and it is well definedFacility
@BaummitAugen what? of course it overflows. Add two large unsigned ints, you can't represent a number that doesn't fit in the number formatIngratiating
No. It can represent any integer by picking the representative of its equivalence class that lies in [0, 2^NumBit - 1]. Overflow occurs when some result cannot be represented. That cannot happen in unsigned arithmetic.Horace
@GarrGodfrey Yes it can represent the result. You are misunderstanding the arithmetic it implements.Horace
We are talking about C++. unsigned int is a fixed size. It cannot expand infinitely.Ingratiating
@BaummitAugen unsigned numbers to over/underflow. They just do so in a well defined manner.Quadriceps
@BaummitAugen I went to the standard to prove my point. Turns out I was wrong. Thank you.Facility
@GarrGodfrey Every integer is in some equivalence class of Z / n for any natural number n. Finite bits suffice.Horace
"if the final mathematical result" result of WHAT operation?Neutralization
@Neutralization any arithmetic expression. I edited to be more clear. Hope now it is.Facility
it's a bit ridiculous to argue losing the high order bits doesn't count as overflow just because the low order bits are still accurate.Ingratiating
@GarrGodfrey I searched the standard and he is indeed correct. There is no overflow. And this is because unsigned integers do not represent natural numbers, but, as BaummitAugen said, they represent equivalence classes.Facility
@Facility "they represent equivalence classes" then why would anyone use them to represent a number of bytes, a number of objects in a container, etc.?Neutralization
@Neutralization because they are useful like thatFacility
@Facility At least with signed integers the compiler is allowed to stop programs that overflow. Overflow is most of the time unintentional and the sign of a bug. (A bug that could easily be a security vulnerability.) Defining overflow has wrap around is sometimes useful but can hide bugs. If people rely on it being defined, you can't add a debugging mode later that diagnoses it because of the false alerts.Neutralization
@Neutralization I've heard some ppl (experts) saying that having wraparound behavior for unsigned was a bad decision in hindsight. They main reason however was the optimization it inhibits. They argued for regular unsigned to have undefined behavior on overflow and to exist some other data type that had wraparound behavior. But it is what it is.Facility
@Neutralization "then why would anyone use them to represent a number of bytes, a number of objects in a container, etc.?" Several prominent members of the committee consider exactly that a historical mistake. channel9.msdn.com/Events/GoingNative/2013/… 9:50, 42:40, 1:02:50Horace
H
6

i is in fact promoted to unsigned int.

Unsigned integers in C and C++ implement arithmetic in ℤ / 2nℤ, where n is the number of bits in the unsigned integer type. Thus we get

[42] + [-10] ≡ [42] + [2n - 10] ≡ [2n + 32] ≡ [32],

with [x] denoting the equivalence class of x in ℤ / 2nℤ.

Of course, the intermediate step of picking only non-negative representatives of each equivalence class, while it formally occurs, is not necessary to explain the result; the immediate

[42] + [-10] ≡ [32]

would also be correct.

Horace answered 14/1, 2019 at 22:39 Comment(0)
R
3

"In the second expression, the int value -42 is converted to unsigned before the addition is done"

yes this is true

unsigned u = 42;
int i = -10;
std::cout << u + i << std::endl; // Why the result is 32?

Supposing we are in 32 bits (that change nothing in 64b, this is just to explain) this is computed as 42u + ((unsigned) -10) so 42u + 4294967286u and the result is 4294967328u truncated in 32 bits so 32. All was done in unsigned

Racoon answered 14/1, 2019 at 22:39 Comment(0)
I
2

This is part of what is wonderful about 2's complement representation. The processor doesn't know or care if a number is signed or unsigned, the operations are the same. In both cases, the calculation is correct. It's only how the binary number is interpreted after the fact, when printing, that is actually matters (there may be other cases, as with comparison operators)

-10 in 32BIT binary is FFFFFFF6
42 IN 32bit BINARY is  0000002A

Adding them together, it doesn't matter to the processor if they are signed or unsigned, the result is: 100000020. In 32bit, the 1 at the start will be placed in the overflow register, and in c++ is just disappears. You get 0x20 as the result, which is 32.

In the first case, it is basically the same:

-42 in 32BIT binary is FFFFFFD6
10 IN 32bit binary is 0000000A

Add those together and get FFFFFFE0

FFFFFFE0 as a signed int is -32 (decimal). The calculation is correct! But, because it is being PRINTED as an unsigned, it shows up as 4294967264. It's about interpreting the result.

Ingratiating answered 14/1, 2019 at 22:50 Comment(3)
As a side note, on x86, most arithmetic operations set various bits in the flags register, which can then be used to e.g. perform branching. For example, addition would set CF (carry flag) to signify that unsigned "wrapping" has occurred, and/or OF (overflow flag) to signify that signed overflow has occurred, if you were to consider the operands signed. There was even a 1-byte INTO instruction which generated a software interrupt if OF was set, which could help with debugging, but unfortunately was not supported from higher level languages, and is no longer available in 64 bit mode.Electrojet
Technically speaking, nothing mandates 2's-complement representation. It's just that the modular-arithmetic conversion specified by the standard is exactly equivalent to using 2's-complement (as you'll see if you perform the conversion on a sign-magnitude or 1's-complement system, if it's correct).Herewith
implementing an adder in 1's-compliment or sign-magnitude is very complicated vs 2's compliment. Not required, except for sanity.Ingratiating

© 2022 - 2024 — McMap. All rights reserved.