(-2147483648 > 0) returns true in C++?
4

252

-2147483648 is the smallest value of a 32-bit integer type, but it seems to overflow in the if (...) statement:

if (-2147483648 > 0)
    std::cout << "true";
else
    std::cout << "false";

This will print true in my testing. However, if we cast -2147483648 to integer, the result will be different:

if (int(-2147483648) > 0)
    std::cout << "true";
else
    std::cout << "false";

This will print false.

I'm confused. Can anyone give an explanation on this?


Update 02-05-2012:

Thanks for your comments. In my compiler, the size of int is 4 bytes. I'm using VC for some simple testing. I've changed the description in my question.

There are a lot of very good replies in this post. AndreyT gave a very detailed explanation of how the compiler behaves on such input and how this minimum integer is implemented. qPCR4vir, on the other hand, gave some related "curiosities" and showed how integers are represented. So impressive!

Animadversion answered 4/2, 2013 at 20:33 Comment(2)
"we all know that -2147483648 is the smallest number of integer" That depends on the size of the integer.Graber
@Inisheer With 4-byte integers you may have an INT_MIN of -9223372036854775808 if CHAR_BIT is 16. And even with CHAR_BIT == 8 and sizeof(int) == 4 you may get -2147483647, because C does not require two's complement numbers.Antiserum
W
403

-2147483648 is not a "number". The C++ language does not support negative literal values.

-2147483648 is actually an expression: a positive literal value 2147483648 with the unary - operator in front of it. The value 2147483648 is apparently too large for the positive side of the int range on your platform. If type long int had a greater range on your platform, the compiler would have to automatically assume that 2147483648 has type long int. (In C++11 the compiler would also have to consider type long long int.) This would make the compiler evaluate -2147483648 in the domain of the larger type, and the result would be negative, as one would expect.

However, apparently in your case the range of long int is the same as range of int, and in general there's no integer type with greater range than int on your platform. This formally means that positive constant 2147483648 overflows all available signed integer types, which in turn means that the behavior of your program is undefined. (It is a bit strange that the language specification opts for undefined behavior in such cases, instead of requiring a diagnostic message, but that's the way it is.)

In practice, taking into account that the behavior is undefined, 2147483648 might get interpreted as some implementation-dependent negative value which happens to turn positive after having unary - applied to it. Alternatively, some implementations might decide to attempt using unsigned types to represent the value (for example, in C89/90 compilers were required to use unsigned long int, but not in C99 or C++). Implementations are allowed to do anything, since the behavior is undefined anyway.
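One way to see which type a compiler actually chose for the literal is to ask it with decltype. This is only a minimal sketch, assuming a conforming C++11 (or later) compiler; the helper names are made up for illustration. On such a compiler the literal should get a signed type wide enough to hold it, so the negated value is negative. A compiler that falls back to an unsigned type, as described above, would report otherwise.

```cpp
#include <cstddef>
#include <type_traits>

// The type the compiler assigned to the unsuffixed decimal literal.
using LiteralType = decltype(2147483648);

// Unsuffixed decimal literals are required to get a signed type in C++11.
constexpr bool literal_is_signed() {
    return std::is_signed<LiteralType>::value;
}

// Size of the chosen type in bytes (platform dependent).
constexpr std::size_t literal_size() {
    return sizeof(LiteralType);
}

// With a signed type that can hold 2147483648, negation gives a negative value.
constexpr bool negated_is_negative() {
    return -2147483648 < 0;
}
```

Calling literal_is_signed() and negated_is_negative() (for example inside a static_assert) shows whether a given compiler follows the signed-type rule.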

As a side note, this is the reason why constants like INT_MIN are typically defined as

#define INT_MIN (-2147483647 - 1)

instead of the seemingly more straightforward

#define INT_MIN -2147483648

The latter would not work as intended.
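The difference between the two spellings can be checked at compile time. The sketch below assumes a 32-bit int and a conforming C++11 compiler; the macro names MY_INT_MIN_SAFE and MY_INT_MIN_NAIVE are hypothetical stand-ins so that the real INT_MIN is not redefined.

```cpp
#include <climits>
#include <type_traits>

// This sketch only makes sense where int is 32 bits wide.
static_assert(INT_MAX == 2147483647, "sketch assumes a 32-bit int");

// Hypothetical names mirroring the two spellings discussed above.
#define MY_INT_MIN_SAFE  (-2147483647 - 1)
#define MY_INT_MIN_NAIVE (-2147483648)

// Both operands fit in int, so the whole expression has type int.
constexpr bool safe_is_int() {
    return std::is_same<decltype(MY_INT_MIN_SAFE), int>::value;
}

// 2147483648 does not fit in int, so the expression gets a wider type.
constexpr bool naive_is_int() {
    return std::is_same<decltype(MY_INT_MIN_NAIVE), int>::value;
}

// The safe spelling has the same value as the library's INT_MIN.
constexpr bool safe_matches_int_min() {
    return MY_INT_MIN_SAFE == INT_MIN;
}
```

Under these assumptions safe_is_int() is true and naive_is_int() is false, which is the sense in which the second spelling "would not work as intended": the value may be right, but the type is not int.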

Wed answered 4/2, 2013 at 20:38 Comment(26)
This is also why this is done: #define INT_MIN (-2147483647 - 1).Graber
Funny, clang for me seems to run this fine (printing false both times), even though my integer size is only 4 bytes. Try again!Vegetal
@RichardJ.RossIII - with clang you are probably getting a 64-bit-typed literal, since it was too big to fit in an int. OP's implementation may not have a 64-bit type.Kravits
@RichardJ.RossIII: I believe this behaviour is implementation-defined/undefined.Tetramethyldiarsine
The funny thing is that I thought of this, but then decided on the negative sign being included in the literal for some reason.Rompish
I never thought that a "negative number" isn't parsed as such. I don't see a reason. I hope that -1.0 is parsed as a negative double value, isn't it?Latreese
@OliCharlesworth: I believe not: 5.4 If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. This doesn't go for unsigned types as the standard defines it's modulo behaviour earlier.Graber
Well, after thinking again, for floats / doubles it doesn't make any difference. They are somewhat "symmetrical", if you know what I mean. The compiler will "optimize" negative numbers anyway. So it's only about the parsing process, not about performance.Latreese
@nightcracker: But it's not unsigned, it's a long int (at least, it would be in C, I can't be bothered to trawl the C++ spec right now... ;) )Tetramethyldiarsine
@OliCharlesworth: huh? The sentence I quoted is straight from the C++ spec, and goes for any expression.Graber
@nightcracker: I know. But like you said, it's well-defined for unsigned, and undefined for signed types, and the OP's code involves a signed value (2147483647 is interpreted as a signed (long) (long) int).Tetramethyldiarsine
@OliCharlesworth: Ah, I read your original comment again. I looked over the undefined behaviour part, and only saw the implementation-defined part. What I meant to say that it is not implementation-defined, but undefined behaviour.Graber
2147483648 is promoted to unsigned int, not long.Exudation
@qPCR4vir: No. As I wrote in my comment to your answer, neither modern C nor C++ allow using unsigned types in this case (with an unsuffixed decimal constant). Only the first standard C (C89/90) permitted unsigned long int in this context, but in C99 this permission was removed. Unsuffixed literals in C and C++ are required to have signed types. If you see unsigned type here when a signed one would work, it means your compiler is broken. If you see unsigned type here when no signed type would work, then this is just a specific manifestation of undefined behavior.Wed
@AndreyT #define INT_MIN (-2147483647 - 1) comes from limits.h (climits) so I think you should add a reference. You can probably also mention how limits.h write a literal constant of unsigned int (e.g. #define UINT_MAX 0xffffffffU) @nightcrackerYonita
I learn yet another thing about C/C++ ! Thanks! But I wonder: is there a place with a list both concise & precise of most of the "caveats" in C/C++? (and in the end I think it makes C/even more and not less reliable than other langages, which may have similar problems but with less ways to grok them...) The C-faq and abridged version of it from Usenet is a good place to start, but any other pointers?Importunate
I always wondered why they didn't #define INT_MIN (~INT_MAX)Orinasal
Recently I learned bit-overflow is undefined behaviour ref:ISO C section 6.5 paragraph 5, But doing i = 2147483648 is undefined ? yes i = -2147483648 is valid.Folliculin
@Orinasal Because two's complement is not guaranteed (until C++20).Barmecidal
@L.F. But the bit pattern for INT_MIN is the same in both one's and two's complement, just its value is different.Orinasal
@Orinasal Well, how about those other than ones' and two's complement? (BTW, it is ones' complement, not one's complement :)Barmecidal
@L.F. Ah right, there's also sign and magnitude. But doesn't (-2147483647-1) only works in twos' (my spellchecker flags that) complement anyway?Orinasal
@Orinasal Well, on other architectures INT_MIN has to be defined another way. (And your spellchecker is right — it's two's complement, and ones' complement.)Barmecidal
@L.F. Sorry, I'm not following; if they're going to define it differently on other architectures anyway, what's wrong with defining it as (~INT_MAX) on two's complement?Orinasal
@Orinasal Oh, (~INT_MAX) is OK on two's complement. That's fine. I thought you wanted to automatically adapt to all architectures, but I was thinking wrong. Sorry for that.Barmecidal
@L.F. I may have thought that at the time... I can't really remember, sorry.Orinasal
E
44

The compiler (VC2012) promotes the literal to the "minimum" integer type that can hold its value. In the first case, signed int (and long int) cannot hold it (before the sign is applied), but unsigned int can: 2147483648 has unsigned int ???? type. In the second case you force a conversion from the unsigned value to int.

const bool i= (-2147483648 > 0) ;  //   --> true

warning C4146: unary minus operator applied to unsigned type, result still unsigned

Here are related "curiosities":

const bool b= (-2147483647      > 0) ; //  false
const bool i= (-2147483648      > 0) ; //  true : result still unsigned
const bool c= ( INT_MIN-1       > 0) ; //  true :'-' int constant overflow
const bool f= ( 2147483647      > 0) ; //  true
const bool g= ( 2147483648      > 0) ; //  true
const bool d= ( INT_MAX+1       > 0) ; //  false:'+' int constant overflow
const bool j= ( int(-2147483648)> 0) ; //  false : 
const bool h= ( int(2147483648) > 0) ; //  false
const bool m= (-2147483648L     > 0) ; //  true 
const bool o= (-2147483648LL    > 0) ; //  false
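The "result still unsigned" lines can be reproduced portably by making the unsigned type explicit with a u suffix, since unsigned negation is well defined and wraps modulo 2^32. This is a sketch of that arithmetic only, not of what VC2012 does internally; the function names are invented for the example.

```cpp
#include <cstdint>

// Negation of an unsigned 32-bit value wraps modulo 2^32 (well defined).
constexpr std::uint32_t negate_u32(std::uint32_t v) {
    return static_cast<std::uint32_t>(0u - v);
}

// Mirrors (-2147483648 > 0) when the literal is treated as unsigned:
// 2^32 - 2^31 is again 2^31, which is greater than zero.
constexpr bool unsigned_case() {
    return negate_u32(2147483648u) > 0u;
}

// Mirrors the LL-suffixed line: a 64-bit signed literal negates normally.
constexpr bool long_long_case() {
    return -2147483648LL > 0;
}
```

Here unsigned_case() is true and long_long_case() is false, matching lines i and o in the list above.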

C++11 standard:

2.14.2 Integer literals [lex.icon]

An integer literal is a sequence of digits that has no period or exponent part. An integer literal may have a prefix that specifies its base and a suffix that specifies its type.

The type of an integer literal is the first of the corresponding list in which its value can be represented.

[Image not preserved: the standard's table of candidate types for integer literals, by suffix and base]

If an integer literal cannot be represented by any type in its list and an extended integer type (3.9.1) can represent its value, it may have that extended integer type. If all of the types in the list for the literal are signed, the extended integer type shall be signed. If all of the types in the list for the literal are unsigned, the extended integer type shall be unsigned. If the list contains both signed and unsigned types, the extended integer type may be signed or unsigned. A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types.

And these are the promotion rules for integers in the standard.

4.5 Integral promotions [conv.prom]

A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.
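As a small illustration of the quoted rule, the sketch below shows that promotion applies only to types ranked below int, so it is not the mechanism that picks a type for a literal such as 2147483648. It assumes int can represent every unsigned char value (true on common platforms); the function names are hypothetical.

```cpp
#include <climits>
#include <type_traits>

// unsigned char ranks below int, so unary minus first promotes it to int.
constexpr bool small_type_promotes_to_int() {
    return std::is_same<decltype(-static_cast<unsigned char>(1)), int>::value;
}

// After promotion the negated value is an ordinary negative int.
constexpr int negated_small() {
    return -static_cast<unsigned char>(1);
}

// unsigned int is not promoted: negation stays unsigned and wraps around.
constexpr bool unsigned_stays_unsigned() {
    return std::is_same<decltype(-1u), unsigned int>::value
        && (0u - 1u) == UINT_MAX;
}
```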

Exudation answered 4/2, 2013 at 20:34 Comment(8)
@qPCR4vir: In C89/90 the compilers were supposed to use types int, long int, unsigned long int to represent unsuffixed decimal constants. That was the only language that allowed using unsigned types for unsuffixed decimal constants. In C++98 it was int or long int. No unsigned types allowed. Neither C (starting from C99) nor C++ permits the compiler to use unsigned types in this context. Your compiler is, of course, free to use unsigned types if none of the signed ones work, but this is still just a specific manifestation of undefined behavior.Wed
@AndreyT Great! Of course, you're right. Is VC2012 broken?Exudation
@qPCR4vir: AFAIK, VC2012 is not a C++11 compiler yet (is it?), which means that it has to use either int or long int to represent 2147483648. Also, AFAIK, in VC2012 both int and long int are 32-bit types. This means that in VC2012 literal 2147483648 should lead to undefined behavior. When the behavior is undefined, the compiler is allowed to do anything. That would mean that VC2012 is not broken. It simply issued a misleading diagnostic message. Instead of telling you that behavior is flat out undefined it decided to use an unsigned type.Wed
@AndreyT: Are you saying that compilers are free to emit nasal demons if source code contains an unsuffixed decimal literal which exceeds the maximum value of a signed long, and are not required to issue a diagnostic? That would seem broken.Spiker
Same "warning C4146" in VS2008 and "this decimal constant is unsigned only in ISO C90" in G++Biggin
@supercat: Yes, nasal demons are allowed, no diagnostic required. As an approximate rule, diagnostics are not required when it was clear that implementations could offer a sensible extension. And by 1998, supporting literals bigger than LONG_MAX was such an obvious extension.Sternutatory
@MSalters: It may be reasonable not to require a diagnostic for a number bigger than a long in cases where there's a longer type that can handle it, but allowing nasal demons without a diagnostic on compilers that don't have such a type seems broken. The fact that a compiler is not required to handle a particular program should not mean that any compiler that accepts the program should be free to emit nasal demons. I would think it would be sensible to say that if a compiler accepts a certain program without complaint, it must have certain behavior.Spiker
@supercat: It's easy to say that informally, but try to put that in Standardese. In this specific case, C++11 does have the necessary Standardese, "extended integer type (3.9.1)". But addressing it in a generic manner is very, very hard.Sternutatory
A
6

Because -2147483648 is actually 2147483648 with negation (-) applied to it, the number isn't what you'd expect. It is actually the equivalent of this pseudocode: operator -(2147483648)

Now, assuming your compiler has sizeof(int) equal to 4 and CHAR_BIT is defined as 8, that would make 2147483648 overflow the maximum signed value of an integer (2147483647). So what is the maximum plus one? Let's work that out with a 4-bit, two's complement integer.

Wait! 8 overflows the integer! What do we do? Use its unsigned representation, 1000, and interpret the bits as a signed integer. That leaves us with -8, to which the two's complement negation is applied, resulting in 8, which, as we all know, is greater than 0.
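The 4-bit arithmetic can be simulated by masking an ordinary unsigned value to its low four bits. This is only a sketch with invented helper names; it shows that negating the pattern 1000 gives 1000 again, and that whether this reads as 8 or as -8 depends on whether the bits are taken as unsigned or signed.

```cpp
// Two's complement negation restricted to 4 bits: invert, add one, keep 4 bits.
constexpr unsigned negate4(unsigned bits) {
    return (~bits + 1u) & 0xFu;
}

// Read a 4-bit pattern as an unsigned value (0..15).
constexpr int as_unsigned4(unsigned bits) {
    return static_cast<int>(bits & 0xFu);
}

// Read a 4-bit pattern as a signed two's complement value (-8..7).
constexpr int as_signed4(unsigned bits) {
    return (bits & 0x8u) ? static_cast<int>(bits & 0xFu) - 16
                         : static_cast<int>(bits & 0xFu);
}
```

For the pattern 0x8 (binary 1000), negate4 returns 0x8 again; as_unsigned4 reads it as 8 and as_signed4 reads it as -8.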

This is why <limits.h> (and <climits>) commonly define INT_MIN as ((-2147483647) - 1) - so that the maximum signed integer (0x7FFFFFFF) is negated (0x80000001), then decremented (0x80000000).
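The three bit patterns named above can be checked with fixed-width types. A minimal sketch, assuming std::int32_t is available; bits_of and the step_ constants are names chosen for the example. Converting a signed value to std::uint32_t is well defined (modulo 2^32) and yields the two's complement pattern.

```cpp
#include <cstdint>

// Bit pattern of a 32-bit signed value, via the well-defined
// modular conversion to an unsigned type of the same width.
constexpr std::uint32_t bits_of(std::int32_t v) {
    return static_cast<std::uint32_t>(v);
}

// Maximum signed value, its negation, and the negation minus one.
constexpr std::uint32_t step_max     = bits_of(2147483647);
constexpr std::uint32_t step_negated = bits_of(-2147483647);
constexpr std::uint32_t step_minus1  = bits_of(-2147483647 - 1);
```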

Atmo answered 5/2, 2013 at 18:21 Comment(4)
For a 4 bit number, the two's complement negation of -8 is still -8.Dinodinoflagellate
Except that -8 is interpreted as 0-8, not negative 8. And 8 overflows a 4 bit signed intAtmo
Consider -(8) which in C++ is the same as -8 -- it is negation applied to a literal, not a negative literal. The literal is 8, which doesn't fit in a signed 4-bit integer, so it must be unsigned. The pattern is 1000. So far your answer is correct. The two's complement negation of 1000 in 4 bits is 1000, it doesn't matter if it is signed or unsigned. Your answer, says "interpret the bits as a signed integer" which makes the value -8 after the two's complement negation, just as it was before negation.Dinodinoflagellate
Of course, in "4-bit C++" there is no "interpret the bits as a signed integer step". The literal becomes the smallest type that can express it, which is unsigned 4-bit integer. The value of the literal is 8. Negation is applied (modulo 16), resulting in a final answer of 8. The encoding is still 1000 but the value is different because an unsigned type was chosen.Dinodinoflagellate
P
5

In short, 2147483648 overflows to -2147483648, and (-(-2147483648) > 0) is true.

This is what 2147483648 looks like in binary: 10000000 00000000 00000000 00000000 (0x80000000).

In addition, in the case of signed binary calculations, the most significant bit ("MSB") is the sign bit. This question may help explain why.

Politic answered 5/2, 2013 at 2:42 Comment(0)
