What happens if I assign a negative value to an unsigned variable?
S

7

98

I was curious to know what would happen if I assign a negative value to an unsigned variable.

The code will look somewhat like this.

unsigned int nVal = 0;
nVal = -5;

It didn't give me any compiler error. When I ran the program the nVal was assigned a strange value! Could it be that some 2's complement value gets assigned to nVal?

Susie answered 26/4, 2010 at 6:42 Comment(8)
My hunch (haven't been able to find it in the standard yet) is that the behavior is technically undefined. Furthermore, I suspect that you'll see what you expect on pretty much any compiler you can find. So while you'll usually see that behavior, it's probably not a good idea to count on it.Lenni
It isn't undefined (see §4.7/2), but the representation (e.g. 2s complement) isn't mandated by the standard.Stripe
@gf (et al below), cool. Looks like the behavior is, in fact, explicitly defined to be what you expected, @viswanathan.Lenni
The second line is equivalent to nVal = (unsigned int) -5;. The cast of -5 to unsigned int is defined in 6.3.1.3. The representation in 2s complement is not mandated by the standard but the algorithm to convert to unsigned is: "the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the newtype until the value is in the range of the newtype."Fennell
@Pascal: Where did you find that?Bellda
@Pascal: You seem to be referring to C99, but the question is tagged C++.Stripe
Oops, sorry. I withdraw my comment. I was indeed assuming C99.Fennell
Two's complement is the only signed integer representation allowed by C++20 and C23.Chirpy
B
83

For the official answer, see Section 4.7 [conv.integral], which covers conversion from signed integral types.

"If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]"

This essentially means that if the underlying architecture stores signed integers in a representation other than two's complement (such as sign-magnitude or ones' complement), the conversion to unsigned must still behave as if it were two's complement.

(C++20 and later require that signed integers use 2's complement, but earlier versions of the standard allowed those other representations.)

The "... congruent ..." part means that you add or subtract 2^n until the value is in the value-range of the unsigned type. For 2's complement, this means doing 2's complement sign-extension or truncation. For the same width, the bit-pattern is unchanged because adding 2^n is a no-op: the low n bits of 2^n are all zero. 2's complement addition/subtraction are the same bitwise operation as unsigned, which is what makes it special.


PS. conversion from floating point to unsigned works differently: it's undefined behavior if the value is out of range for the unsigned type (if it's negative after truncation to integer or too large). Modulo reduction only happens for signed integer to unsigned integer.

Bellda answered 26/4, 2010 at 6:53 Comment(6)
What does the least unsigned integer congruent to the source integer mean?Snap
@DavidRodríguez-dribeas As an example, 5 and 3 are "congruent mod 2" since 5%2 and 3%2 are both 1.Oliana
Which versions of C++ standard does it relate to? All?Fromenty
So is it true that (uint)((int)a + (int)b) might be undefined because a and b could overflow but (uint)a + (uint)b is well-defined because overflow is allowed for uint. And both will give identical results for all well-defined sums?Gylys
Your summary of the standard excerpt seems correct although I think odd that it focuses on the edge case where ints are not stored as 2s complement. I think the most interesting thing is the typical case (in the note) where ints are stored as 2s compliment in which case conversion to unsigned is a no-op! The representation (aka bit pattern) of an unsigned int is the same as for the signed int it is converted from. Once that is understood, then what you say makes more sense: that the mathematical result of conversion acts as if ints are stored as 2s complement regardless of actual storage.Hexapartite
@steve: The note part of the excerpt already covers the 2's complement case, which is special because 2's complement add/sub is the same bitwise operation as unsigned add/sub. So adding 2^n to modulo-reduce the represented value into the value-range of the unsigned type is a no-op because that's adding a value with its only set bit to the left of the bits in this n-bit number. (Or if the unsigned type is wider, it ends up being the same as sign-extending first, or truncating if narrower.) Bolding that would maybe be a good idea, or at least italicizing.Jarrett
P
45

It will assign the bit pattern representing -5 (in 2's complement) to the unsigned int, which will be a large unsigned value. For 32-bit ints this will be 2^32 - 5, or 4294967291.

Plum answered 26/4, 2010 at 6:51 Comment(7)
Bit's have nothing to do with it.Swollen
@BenVoigt: Fair enough, I meant it had nothing to do with how bits are interpreted. (That is, the "bits" in the quoted part is just shorthand for ceil(log_2(x)).)Swollen
@Swollen Bit's (as in, belongs to the bit)? 2's Compliment (that's very nice of you)? GAAAAAAAAAAAAAH!Baltoslavic
@NullUserException: Haha, I know. Writing "*'s" in place of just "*s" is a terrible habit I've had for a while. As for compliment instead of complement, that's just pure tomfoolery. :)Swollen
Simplicity is key. This answer has that. (2^32 - 5) explains this behaviour better than quoting the documentation.Emancipate
Considering C++'s ubiquity of undefined behavior, such a strong statement should cite a sourceNeuron
@PostSelf: indeed. The reason this is true is not that 2's complement is special to C++ (it's not until C++20), it's that 2's complement is special in that 2's complement add/subtract are the same as binary add/subtract, and the actual C++ rule is that conversion from a signed integral type to an unsigned integral type works by modulo-reduction of the value into the value-range of the unsigned type. For 2's complement systems, this means taking the signed bit-pattern unchanged, or truncating or sign-extending it. For other systems, it effectively means converting to 2's complement.Jarrett
I
5

You're right, the signed integer is stored in 2's complement form, and the unsigned integer is stored in the unsigned binary representation. C (and C++) doesn't distinguish between the two, so the value you end up with is simply the unsigned binary value of the 2's complement binary representation.

Iconium answered 26/4, 2010 at 6:49 Comment(4)
It may not be stored in 2's compliment.Swollen
What does it mean if something is "stored in 2's?" @SwollenCorelative
@JeremyF: Not "2's", "2's compliment". It's a Google-able term, and a way of representing signed integers.Swollen
Two's complement is the only signed integer representation allowed by C++20 and C23.Chirpy
B
4

It will show up as a positive integer with the value UINT_MAX - 4 (the exact number depends on how wide unsigned int is for your architecture and compiler).

BTW
You can check this by writing a simple C++ "hello world" type program and see for yourself

Banket answered 26/4, 2010 at 6:46 Comment(2)
I wrote and checked it; that's why I asked the question, but I didn't know how the compiler arrived at that positive value. ThanksSusie
Unfortunately with C++, writing programs to test behavior is not always a good idea. For instance, if one tried to test what happens in the case of signed overflow, it will lead to undefined behavior, which is not guaranteed to be the same on every machine/compiler.Springclean
C
3

When you assign a negative value to an unsigned variable, the result matches the 2's complement encoding of that value: take the magnitude, flip all 0s to 1s and all 1s to 0s, and then add 1. In your case you are dealing with an int, which is 4 bytes (32 bits) here, so the operation is applied to a 32-bit number and all the high bits end up set. For example:

┌─[student@pc]─[~]
└──╼ $pcalc 0y00000000000000000000000000000101      # 5 in binary
        5                       0x5                     0y101
┌─[student@pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010      # flip all bits  
      4294967290      0xfffffffa      0y11111111111111111111111111111010
┌─[student@pc]─[~]
└──╼ $pcalc 0y11111111111111111111111111111010 + 1  # add 1 to that flipped binary
      4294967291      0xfffffffb      0y11111111111111111111111111111011
Confucius answered 27/4, 2020 at 10:6 Comment(0)
G
2

Yes, you're correct. The actual value assigned is something like all bits set except the third. -1 is all bits set (hex: 0xFFFFFFFF), -2 is all bits except the first and so on. What you would see is probably the hex value 0xFFFFFFFB which in decimal corresponds to 4294967291.

Gorham answered 26/4, 2010 at 6:48 Comment(5)
Your answer is correct, stringent, to the point and something I would never use in class.Gorham
see my answer for the 2's complement of -5. I don't think you did your math correctly on the binary values here.Squeak
@GManNickG: C and C++ actually do specify some about integer bit-patterns, like that they're binary with bits in some order. And that unsigned char can alias anything and has no padding bits, so you can read object representations of anything as binary numbers. I'm not sure if a Deathstation 9000 could use a different endianness for unsigned vs. int, but within each unsigned char chunk I think things are fairly nailed down. However, endianness is only a factor if accessing object-representations with unsigned char; this answer is not wrong in practice if unsigned is a 32-bit typeJarrett
@PeterCordes Yea, even at the time of the comment I'm pretty sure it was mandated to be either ones-complement, twos-complement, or signed magnitude. These days, didn't C and/or C++ now commit to twos-complement? Either way, there are definitely guarantees here!Swollen
@GManNickG: Yes, C++20 requires all signed integer types to use 2's complement, not just std::atomic<int> (since C++11). Ramifications of C++20 requiring two's complement - but signed overflow is still UB.Jarrett
Z
0

In C and C++, on Windows and Ubuntu Linux alike, assigning -1 to an unsigned integer results in the value UINT_MAX. Other negative numbers do not all map to UINT_MAX: they wrap modulo UINT_MAX + 1, so -5, for example, becomes UINT_MAX - 4.

Compiled example link.

Ziguard answered 12/12, 2022 at 12:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.