Why is 0 < -0x80000000?
J

6

255

I have the simple program below:

#include <stdio.h>

#define INT32_MIN        (-0x80000000)

int main(void) 
{
    long long bal = 0;

    if(bal < INT32_MIN )
    {
        printf("Failed!!!");
    }
    else
    {
        printf("Success!!!");
    }
    return 0;
}

The condition if(bal < INT32_MIN ) is always true. How is it possible?

It works fine if I change the macro to:

#define INT32_MIN        (-2147483648L)

Can anyone point out the issue?

Jollity answered 9/12, 2015 at 15:31 Comment(11)
How much is CHAR_BIT * sizeof(int)?Renaldorenard
Have you tried printing out bal?Aphonia
IMHO the more interesting thing is that it is true only for -0x80000000, but false for -0x80000000L, -2147483648 and -2147483648L (gcc 4.1.2), so the question is: why is the int literal -0x80000000 different from the int literal -2147483648?Concave
@Bathsheba I am just running the program on an online compiler: tutorialspoint.com/codingground.htmJollity
If you've ever noticed that (some incarnations of) <limits.h> defines INT_MIN as (-2147483647 - 1), now you know why.Cantor
similar: Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0), Why it is different between -2147483648 and (int)-2147483648, large negative integer literalsCaravel
Modern compilers warn about -0x80000000Perforated
@LưuVĩnhPhúc On a standard 32 bit system that supposed duplicate will print "success" just as expected. As opposed to this question which will print "failed", which was not expected. The difference is that the post you linked uses a signed long long literal, instead of an unsigned int literal.Lib
@LưuVĩnhPhúc I think C++ is actually far behind C here, since C had a long long type which guaranteed at least 64 bits back in 1999, but C++ doesn't seem to have gotten one until 2011? Meaning that C and C++ would have displayed different results until C++11.Lib
@Lib The first edition of the C++ standard (C++1998) was published only one year before the second edition of the C standard (C1999). I guess WG21 saw no particular hurry about putting out a second edition of C++, despite the significant revisions in C99 that it would have been nice to have uplifted. Looking back from 2015, that was probably a mistake.Cantor
Why is 1 not greater than -0x80000000Caravel
L
367

This is quite subtle.

Every integer literal in your program has a type. Which type it has is regulated by a table in 6.4.4.1:

Suffix      Decimal Constant    Octal or Hexadecimal Constant

none        int                 int
            long int            unsigned int
            long long int       long int
                                unsigned long int
                                long long int
                                unsigned long long int

If a literal number can't fit inside the default int type, it will attempt the next larger type as indicated in the above table. So for regular decimal integer literals it goes like:

  • Try int
  • If it can't fit, try long
  • If it can't fit, try long long.

Hex literals behave differently though! If the literal can't fit inside a signed type like int, it will first try unsigned int before moving on to trying larger types. See the difference in the above table.

So on a 32 bit system, your literal 0x80000000 is of type unsigned int.

This means that you can apply the unary - operator on the literal without invoking undefined behavior, as you otherwise would when overflowing a signed integer. Instead, unsigned arithmetic wraps around, and you will get the value 0x80000000, a positive value.

bal < INT32_MIN invokes the usual arithmetic conversions, and the result of the expression -0x80000000 is converted from unsigned int to long long. The value 0x80000000 is preserved and 0 is less than 0x80000000, hence the result.

When you replace the literal with 2147483648L you use decimal notation and therefore the compiler doesn't pick unsigned int, but rather tries to fit it inside a long. Also the L suffix says that you want a long if possible. The L suffix actually has similar rules if you continue to read the mentioned table in 6.4.4.1: if the number doesn't fit inside the requested long, which it doesn't in the 32 bit case, the compiler will give you a long long where it will fit just fine.

Lib answered 9/12, 2015 at 15:52 Comment(16)
"... replace the literal with -2147483648L you explicitly get a long, which is signed." Hmmm, In a 32-bit long system 2147483648L, will not fit in a long, so it becomes long long, then the - is applied - or so I thought.Chassis
why cant 0x80000000 fit into an int on a 32 bit system??Epiglottis
@Epiglottis Because the maximum number an int can have is then 0x7FFFFFFF. Try it yourself: #include <limits.h> printf("%X\n", INT_MAX);Lib
I know, it is the maximum positive number. the question is, when you specify a number in hex, does it need to be positive?Epiglottis
@Epiglottis Don't confuse hexadecimal representation of integer literals in source code with the underlying binary representation of a signed number. The literal 0x7FFFFFFF when written in source code is always a positive number, but your int variable can of course contain raw binary numbers up to value 0xFFFFFFFF.Lib
Sorry I am still confused. int n = 0xFFFFFFFF; cout << n; displays -1. Also int n = 0x80000000; cout << n; displays -2147483648. I question the statement "can't fit inside a signed type like int". It probably needs further digging or be stated differently.Epiglottis
@Epiglottis int n = 0x80000000 forces a conversion from the unsigned literal to a signed type. What will happen is up to your compiler - it is implementation-defined behavior. In this case it chose to shove the whole literal into the int, overwriting the sign bit. On other systems it might not be possible to represent the type and you invoke undefined behavior - the program might crash. You'll get the very same behaviour if you do int n=2147483648; so it is not related to the hex notation at all.Lib
That's also why you'll find code like this in the standard C headers: #define INT_MIN (-INT_MAX - 1)Corgi
The behavior of "wrap around" unsigned numbers is fixed by the C++ standard. It has nothing to do with 2's complement (and only to do with the sizeof(unsigned)). Are you sure it is different in C?Fulcher
"Instead, on a two's complement system," - actually the system of representing negative numbers does not affect unsigned arithmetic, which is defined in terms of modular arithmetic. In the case of 32-bit int, -0x80000000 is always 0x80000000.Perforated
@Lib out-of-range assignment from integer type to signed integer type is always implementation-defined; there are no UB casesPerforated
@Perforated I believe the standard says something about an "implementation-defined signal may be raised". What that signal is or what happens if it isn't handled, is not covered by the standard. But sure, I can edit that part.Lib
The behaviour of signals is part of the Standard; the default handling of each signal is implementation-defined too (7.14/4)Perforated
I am surprised that this overly complicated explanation is so popular. It turns out that the comparison (<) has nothing to do with it really, your last two paragraphs seem to be completely irrelevant. Just try to output the value INT32_MIN to see how it is represented.Gilgai
@Gilgai The paragraph about implicit promotion is relevant: suppose long is 32 bits, and we have a nearly identical example where the other operand is a long with any randomly picked value. Then the usual arithmetic conversions would instead have forced that operand to convert to unsigned, and the expression would have been evaluated in a completely different manner. As for the last paragraph, it answers the question.Lib
The explanation of how unary - is applied to unsigned integers could be expanded a bit. I had always assumed (though fortunately never relied on the assumption) that unsigned values would be "promoted" to signed values, or possibly that the result would be undefined. (Honestly, it should be a compile-error; what does - 3u even mean?)Welcome
T
29

0x80000000 is an unsigned literal with value 2147483648.

Applying the unary minus on this still gives you an unsigned type with a non-zero value. (In fact, for a non-zero value x, the value you end up with is UINT_MAX - x + 1.)

Tussis answered 9/12, 2015 at 15:42 Comment(1)
In this case, -0x80000000 is 0x80000000, unsigned, since UINT_MAX+1 is 0xFFFFFFFF+1 = 1ULL<<32. (Or actually 0 since UINT_MAX+1 wraps to 0 if you evaluated that expression according to C rules after re-arranging to UINT_MAX+1 - x, since addition is associative when signed-overflow UB isn't a factor.) Fun fact: signed -INT_MIN causes signed-overflow UB, unlike any other int value. The most-negative number is its own complement in 2's complement systems.Awfully
Y
23

This integer literal 0x80000000 has type unsigned int.

According to the C Standard (6.4.4.1 Integer constants)

5 The type of an integer constant is the first of the corresponding list in which its value can be represented.

And this integer constant can be represented by the type of unsigned int.

So this expression

-0x80000000 has the same unsigned int type. Moreover, it has the same value 0x80000000, which in two's complement terms is calculated the following way

-0x80000000 = ~0x80000000 + 1 => 0x7FFFFFFF + 1 => 0x80000000

A related effect appears if you write, for example

int x = INT_MIN;
x = abs( x );

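On typical two's complement implementations the result will again be INT_MIN, although formally the behavior is undefined because the result cannot be represented.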

Thus in this condition

bal < INT32_MIN

the value 0 is compared with the unsigned value 0x80000000, which is converted to type long long int according to the rules of the usual arithmetic conversions.

It is evident that 0 is less than 0x80000000.

Yongyoni answered 9/12, 2015 at 15:44 Comment(0)
C
13

A point of confusion occurs in thinking the - is part of the numeric constant.

In the code below, 0x80000000 is the numeric constant. Its type is determined by that alone. The - is applied afterward and does not change the type.

#define INT32_MIN        (-0x80000000)
long long bal = 0;
if (bal < INT32_MIN )

Raw unadorned numeric constants are positive.

If it is decimal, then the type assigned is the first type that will hold it: int, long, long long.

If the constant is octal or hexadecimal, it gets the first type that holds it: int, unsigned, long, unsigned long, long long, unsigned long long.

0x80000000, on OP's system gets the type of unsigned or unsigned long. Either way, it is some unsigned type.

-0x80000000 is also some non-zero value and being some unsigned type, it is greater than 0. When code compares that to a long long, the values are not changed on the 2 sides of the compare, so 0 < INT32_MIN is true.


An alternate definition avoids this curious behavior:

#define INT32_MIN        (-2147483647 - 1)

Let us walk in fantasy land for a while where int and unsigned are 48-bit.

Then 0x80000000 fits in int and so is the type int. -0x80000000 is then a negative number and the result of the print out is different.

[Back to the real world]

Since 0x80000000 fits in some unsigned type before a signed type as it is just larger than some_signed_MAX yet within some_unsigned_MAX, it is some unsigned type.

Chassis answered 9/12, 2015 at 16:32 Comment(0)
B
12

The numeric constant 0x80000000 is of type unsigned int. If we take -0x80000000 and do 2's complement math on it, we get this:

~0x80000000 = 0x7FFFFFFF
0x7FFFFFFF + 1 = 0x80000000

So -0x80000000 == 0x80000000. And comparing (0 < 0x80000000) (since 0x80000000 is unsigned) is true.

Barony answered 9/12, 2015 at 15:42 Comment(4)
This supposes 32-bit ints. Although that's a very common choice, in any given implementation int might be either narrower or wider. It is a correct analysis for that case, however.Stilton
This isn't relevant to OP's code, -0x80000000 is unsigned arithmetic. ~0x800000000 is different code.Perforated
This seems to be the best and correct answer to me simply put. @M.M. he is explaining how to take a twos complement. This answer specifically addresses what the negative sign is doing to the number.Gilgai
@Gilgai the negative sign is not applying 2's complement to the number (!) Although this seems clear, it's not describing what happens in the code -0x80000000 ! In fact 2's complement is irrelevant to this question entirely.Perforated
E
8

C has a rule that an integer literal may be signed or unsigned depending on whether its value fits in a signed or an unsigned type. On a 32-bit machine the literal 0x80000000 will be unsigned. The 2's complement of 0x80000000 is 0x80000000 on a 32-bit machine, so -0x80000000 has that value. Therefore, the comparison bal < INT32_MIN is between a signed and an unsigned operand, and before the comparison the unsigned int is converted to long long, as per the C rule.

C11: 6.3.1.8/1:

[...] Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

Therefore, bal < INT32_MIN is always true.

Evanthe answered 9/12, 2015 at 16:4 Comment(0)
