Casting minimum 32-bit integer (-2147483648) to float gives positive number (2147483648.0)

I was working on an embedded project when I ran into something which I thought was strange behaviour. I managed to reproduce it on codepad (see below) to confirm, but don't have any other C compilers on my machine to try it on them.

Scenario: I have a #define for the most negative value a 32-bit integer can hold, and then I try to use this to compare with a floating point value as shown below:

#include <stdio.h>

#define INT32_MIN (-2147483648L)

void main()
{
    float myNumber = 0.0f;
    if(myNumber > INT32_MIN)
    {
        printf("Everything is OK");
    }
    else
    {
        printf("The universe is broken!!");
    }
}

Codepad link: http://codepad.org/cBneMZL5

To me it looks as though this code should work fine, but to my surprise it prints out The universe is broken!!.

This code implicitly converts INT32_MIN to a float, but it turns out that this results in a floating point value of 2147483648.0 (positive!), even though the floating point type is perfectly capable of representing -2147483648.0.

Does anyone have any insights into the cause of this behaviour?

CODE SOLUTION: As Steve Jessop mentioned in his answer, limits.h and stdint.h already contain correct (working) defines for the integer ranges, so I'm now using these instead of my own #define.

PROBLEM/SOLUTION EXPLANATION SUMMARY: Given the answers and discussions, I think this is a good summary of what's going on (note: still read the answers/comments because they provide a more detailed explanation):

  • I'm using a C89 compiler with 32-bit longs, so any decimal constant with the L suffix whose value is greater than LONG_MAX and less than or equal to ULONG_MAX has type unsigned long.
  • (-2147483648L) is actually a unary - applied to an unsigned long value (see previous point): -(2147483648L). This negation 'wraps' the value around to the unsigned long value 2147483648 (because 32-bit unsigned longs have the range 0 to 4294967295).
  • This unsigned long number looks like the expected negative int value when it is printed as an int or passed to a function, because it is first converted to an int. That conversion is implementation-defined, but here it wraps the out-of-range 2147483648 around to -2147483648 (because 32-bit ints have the range -2147483648 to 2147483647).
  • The conversion to float, however, uses the actual unsigned long value 2147483648, resulting in the floating-point value 2147483648.0.
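
The four points above can be reproduced on a modern compiler with a small sketch. It assumes uint32_t from stdint.h as a stand-in for the 32-bit unsigned long of the original platform, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Stand-in for -(2147483648L) on a C89 compiler with a 32-bit long:
   the operand is unsigned, so the negation wraps modulo 2^32. */
uint32_t negate_as_unsigned32(void)
{
    uint32_t magnitude = 2147483648u;   /* 2^31, too big for a signed 32-bit type */
    return (uint32_t)(0u - magnitude);  /* 2^32 - 2^31 is 2^31 again */
}

/* Converting to float uses the unsigned value, so the result is positive. */
float wrapped_as_float(void)
{
    return (float)negate_as_unsigned32();
}

/* Converting to a signed 32-bit integer is implementation-defined;
   on common two's complement targets it yields -2147483648. */
int32_t wrapped_as_int32(void)
{
    return (int32_t)negate_as_unsigned32();
}
```

With these definitions, 0.0f > wrapped_as_float() is false, which matches the "universe is broken" branch, while 0.0f > wrapped_as_int32() is true on typical targets.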
Irinairis answered 18/7, 2012 at 7:32 Comment(8)
fwiw, the universe isn't (as) broken if you use gcc 4.2.1. (So you're probably right about the compiler.) - Aruspex
are you on a 32 bit system? note that - is an operator. it is not part of the integer literal. - Liggett
Can't reproduce with gcc 4.2.1. What compiler are you using? - Reciprocate
I can reproduce this with gcc 3.1. Yeah, I know that's a bit old :-) With gcc 4.6 on a 32-bit system gcc -std=c90 produces the observed behaviour, gcc -std=c99 works "as expected". - Mahone
See also this question #9941761 which contains some standards discussion. - Mahone
I assume that you added the L suffix to your constant specifically to avoid the problem described in the answers. However, on your platform int and long have the same size, which is why the overflow persists. You can add LL suffix instead and the overflow should go away (assuming long long has larger range) - Menarche
It's int main(void), not void main(). - Pinxit
Slight correction to your explanation: it's not true that any values with L suffix are unsigned long, it's specifically values greater than LONG_MAX and less or equal to ULONG_MAX. - Yarndyed

In C89 with a 32 bit long, 2147483648L has type unsigned long int (see 3.1.3.2 Integer constants). So once modulo arithmetic has been applied to the unary minus operation, INT32_MIN is the positive value 2147483648 with type unsigned long.

In C99, 2147483648L has type long if long is bigger than 32 bits, or long long otherwise (see 6.4.4.1 Integer constants). So there is no problem and INT32_MIN is the negative value -2147483648 with type long or long long.

Similarly in C89 with long larger than 32 bits, 2147483648L has type long and INT32_MIN is negative.

I guess you're using a C89 compiler with a 32 bit long.

One way to look at it is that C99 fixes a "mistake" in C89. In C99 a decimal literal with no U suffix always has signed type, whereas in C89 it may be signed or unsigned depending on its value.
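
A quick way to see which rule a compiler applies is to test the sign of the negated constant. This is only a minimal sketch: the function name is hypothetical, and the results described in the comment assume the rules quoted above.

```c
/* Returns 1 when the constant 2147483648L gets a signed type
   (C99 and later, or any compiler whose long is wider than 32 bits),
   and 0 when it is unsigned long (C89 rules with a 32-bit long),
   because an unsigned value is never less than zero. */
int negated_constant_is_negative(void)
{
    return (-2147483648L) < 0;
}
```

On a target with a 32-bit long, building this once in C89/C90 mode and once in C99 mode should show the two behaviours side by side.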

What you should probably do, btw, is include limits.h and use INT_MIN for the minimum value of an int, and LONG_MIN for the minimum value of a long. They have the correct value and the expected type (INT_MIN is an int, LONG_MIN is a long). If you need an exact 32 bit type then (assuming your implementation is 2's complement):

  • for code that doesn't have to be portable, you could use whichever type you prefer that's the correct size, and assert it to be on the safe side.
  • for code that has to be portable, search for a version of the C99 header stdint.h that works on your C89 compiler, and use int32_t and INT32_MIN from that.
  • if all else fails, write stdint.h yourself, and use the expression in WiSaGaN's answer. It has type int if int is at least 32 bits, otherwise long.
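
As a rough sketch of the first two suggestions, the snippet below adds a compile-time size check that also works before C11, and takes INT32_MIN from stdint.h (assuming that header is available). The names my_int32, my_int32_is_32_bits and above_int32_min are invented for this example.

```c
#include <limits.h>
#include <stdint.h>

/* Non-portable route: pick a type believed to be 32 bits wide and make the
   build fail if it is not. Using int is an assumption that holds on most
   desktop targets; a 16-bit-int target would need long here instead. */
typedef int my_int32;
typedef char my_int32_is_32_bits[(sizeof(my_int32) * CHAR_BIT == 32) ? 1 : -1];

/* Portable route: INT32_MIN from stdint.h is negative and has a signed type,
   so the comparison behaves as expected after conversion to float. */
int above_int32_min(float value)
{
    return value > INT32_MIN;
}
```

The negative array size in the typedef is what turns a wrong size assumption into a compile error instead of a silent runtime surprise.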
Yarndyed answered 18/7, 2012 at 8:16 Comment(4)
I didn't know that. Thanks for providing the standard. - Video
Small nit, it might be better to explicitly state that In C99, 2147483648L has type long long also assumes 32-bit long. - Isogamete
Interestingly and informatively, limits.h (first online source I could find...) uses the #define INT_MIN (-2147483647 - 1) mechanism recommended by WiSaGaN. - Donegan
Great info, thanks for that it was very informative. I am using the TI CGT v3.2.2 for MSP430 (embedded project), but I'm not really sure if this is C89 or something else. I'm now using the limits.h header for my defines so thanks for that tip too. As I said in my comment to the other answer, I completely missed that this could be the case because I was using this define without problem (assigning it to variables, passing it to functions, printing it out, etc) until I tried to convert it to a float. Thanks for your help! - Irinairis

Replace

#define INT32_MIN (-2147483648L)

with

#define INT32_MIN (-2147483647 - 1)

-2147483648 is interpreted by the compiler as the negation of 2147483648, and 2147483648 does not fit in a 32-bit int. So you should write (-2147483647 - 1) instead.
This is all C89 standard though. See Steve Jessop's answer for C99.
Also, long is typically 32 bits on 32-bit machines and often 64 bits on 64-bit machines, though not always (Win64, for example, keeps it at 32 bits). int is enough here.
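
To see the difference this makes, here is a small sketch using the replacement expression. The macro is named MY_INT32_MIN (a made-up name) so that it cannot clash with the INT32_MIN that stdint.h may already define, and greater_than_min is likewise only illustrative.

```c
/* 2147483647 fits in a signed 32-bit type, so both the negation and the
   subtraction stay in range and the result keeps a signed type. */
#define MY_INT32_MIN (-2147483647 - 1)

/* Returns 1 if value is greater than the most negative 32-bit integer. */
int greater_than_min(float value)
{
    return value > MY_INT32_MIN;
}
```

Calling greater_than_min(0.0f) returns 1, because MY_INT32_MIN converts to -2147483648.0 and not to the positive value seen in the question.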

Video answered 18/7, 2012 at 7:38 Comment(12)
More specifically, writing -2147483648 causes the compiler to evaluate it as -(2147483648). The literal in brackets overflows, resulting in -(-2147483648) = 2147483648. The universe then shatters. - Thoughtful
The moral is that there are no negative integral literals, only (unsigned) integral literals and unary minuses. - Contortion
arguing solely on the bit patterns in two's complement, it should end up as negative. probably an optimization pass interferes here. - Liggett
@KerrekSB Exactly. I haven't checked the C standard. I suppose there should be something explicitly saying this. - Video
Be careful about generalizations on the type long - it's not always 64-bits on 64-bit systems (Win64 for example). - Amphisbaena
@MichaelBurr You are right. It's the typical case, not a sure thing. I'll edit it later. - Video
Ah yes this solves the problem. I didn't even consider that this could have been the case, mainly because I did things like assign INT32_MIN to a 32-bit int variable, pass INT32_MIN to a function, and print INT32_MIN all without any issues. What would cause this difference in behaviour? - Irinairis
@Video The code to demonstrate can be found at: http://codepad.org/JkRoyKRe. My compiler (TI CGT v3.2.2 for MSP430) seems to behave the same way as codepad in this situation. - Irinairis
@Irinairis That's interesting. If you define int a = INT32_MIN;, then pass a to where it would previously fail, namely printf("As float (straight): %f (FAIL)\n", (float)a);, you would get the correct result. It seems the compiler is doing something weird. - Video
@WiSaGaN: it's not that weird: if you assign the out-of-range positive value 2147483648 to int then the result is implementation-defined, but it's not uncommon to see it wraparound to -2147483648. - Yarndyed
@SteveJessop Very interesting, the wraparound was the last piece of information I needed for this to 'click' in my head. I think I understand what's going on now, so thanks to everyone! I've added a summary to my original post. I think it's accurate, but feel free to edit it or let me know if it's not. - Irinairis
@WiSaGaN: A lot of code expects unsigned long to wrap at 2^32 and will behave very badly if it does not do so. The fundamental problem is that C has consistently failed to provide unsigned types with predictable wrapping behavior (it still doesn't have them, since a standards-compliant compiler could legitimately have a 64-bit int type but include a uint32_t type; on such a compiler, multiplying together two uint32_t values could turn the CPU into a heap of molten slag). - Metrorrhagia
