Negative power of 2
Asked Answered
T

4

6

I am learning the characteristics of the different data type. For example, this program increasingly prints the power of 2 with four different formats: integer, unsigned integer, hexadecimal, octal

#include<stdio.h>
int main(int argc, char *argv[]){
        int i, val = 1;
        for (i = 1; i < 35; ++i) {
                printf("%15d%15u%15x%15o\n", val, val, val, val);
        val *= 2;
        }
    return 0;
}

It works. unsigned goes up to 2147483648. integer goes up to -2147483648. But why does it become negative?

I have a theory: is it because the maximum signed integer we can represent on a 32 bit machine is 2147483647? If so, why does it return the negative number?

Trope answered 29/6, 2017 at 10:30 Comment(9)
your theory is right. overlow happensKoch
Your program causes a signed integer overflow which is undefined behaviour.Gillies
as often, because undefined behavior is undefined. signed integer overflow is undefined. A somewhat likely outcome is a wrap-around into the negative area, but that's not guaranteed -> your program doesn't have defined behavior.Tasimeter
Thank you guys for confirming my suspect. Do you suggest any reference to help me better comprehend undefined behaviors such as that?Trope
Consult the Standard for better understanding of all aspects of C. Here is a section that summarizes undefined behaviors.Humblebee
"I have a theory: is it because the maximum signed integer we can represent on a 32 bit machine is 2147483647" - That's not even guaranteed for int and definitively not for all integer types.Theona
Thanks @DavidBowling for the reference!Trope
Thanks @Olaf for your note.Trope
Code has 2 sources of UB, the potential int overflow of val *= 2; (identified by many) and printing an int val using %15u%15x%15o when val < 0.Rote
T
6

First of all, you should understand that this program is undefined. It causes signed integer overflow, and this is declared undefined in the C Standard.

The technical reason is that no behavior can be predicted as different representations are allowed for negative numbers and there could even be padding bits in the representation.

The most probable reason you see a negative number in your case is that your machine uses 2's complement (look it up) to represent negative numbers while arithmetics operate on bits without overflow checks. Therefore, the highest bit is the sign bit, and if your value overflows into this bit, it turns negative.

Tasimeter answered 29/6, 2017 at 10:39 Comment(6)
Thanks a lot Felix for your concise explanation!Trope
The " reason you see a negative number" misleads. 1) had a rare non-2's complement machine been used, a negative number may have appeared too. 2) The 2's complement idea does provide a potential explanation why the particular value -2147483648 was seen and not some other value, 3) It's all UB anyways, even if the machine is 2's complement.Rote
@chux the fact that it's undefined is explained in the very first paragraph and I think that last one is worded so it's clear that 2's complement is one possible (and very likely) reason for the actual wraparound? What would you change in the wording to make that clearer?Tasimeter
The last says "reason you see a negative number is that your machine probably uses 2's complement ..." is the weakest part. The reason for some negative number is not 2's complement.Rote
Another, more important reason is some architectures use saturation arithmetics for signed integers, i.e. a result is capped at the max/min value, hence never will under-/overflow. This is typical for e.g. DSPs where the toolchain explicitly defines this behaviour or the program code can enforce it for a part of the code by setting a machine flag.Theona
@chux @ Olaf I didn't intend to make an exhaustive list of possible reasons to observe what OP observed ... but I'll change the wording to emphasize this is just one guess I consider very probable in OP's case.Tasimeter
G
2

What you describe is UB caused by integer overflow. Since the behavior is undefined, anything could happen (“When the compiler encounters [a given undefined construct] it is legal for it to make demons fly out of your nose”), BUT, what actually happens on some machines (I suspect yours included) is this:

You start with int val = 1;. That is represented 0b00...1 in binary form. Each time you val *= 2; the value is multiplied by 2, therefore the representation changes to 0b00...10 and then to 0b00...100 and so on (the 1 bit moves one step each time). The last time you val *= 2; you get 0b100.... Now, using 2's complement (which is what I guess your machine uses, as it very common) the value is actually -1 * 0b1000... which is -2147483648

Note, that even though this might be what's really going on in your machine, it's not to be trusted or thought of as the "right" thing to happen, since, as mentioned before, this is UB

Grappa answered 29/6, 2017 at 10:49 Comment(7)
NMDV yet int overflow is undefined behavior -UB. This answer seems to explain why something happened as if that is the specified and expected behavior - which it is not. It is just a common outcome of UB that may differ tomorrow.Rote
I see, thanks for the input :) Yet, this is a common output and UB doesn't mean we can't predict the outcome (on specific machine), it just means the standard did not specify it...Grappa
True that UB can be predicted and like the weather, it might be wrong. The key being, even if the outcome matches our prediction today, the outcome is not reliably for the future - thus UB. For learners, instilling a sense of consistency concerning UB leads to weak programs that rely on that.Rote
Close. The last line should read: "the value is actually -1 * 0b100000... which is -2147483648" since negation in 2's complement is calculated as -a == ~a + 1.Tidbit
@chux I agree with you that any UB should be clearly stated as such on this site, as none of our readers should ever rely on any behavior that's actually UB. Nevertheless, UB is where the abstractions leak, and explaining why a specific UB construct produces a specific behavior can be quite insightful for those who have not ventured looking behind the abstractions yet. Thus, I find answers like this one quite helpful.Tidbit
@cmaster Agree, yet the original answer was not explaining UB - it was attempting to explain behavior. It did not identifying it as UB. With the UB labelling and caution added, it is improved.Rote
@chux Ah, I see.Tidbit
D
1

This is happening because of basically limits.When the value get exceed the data type limit that cause overflow.

Normally a signed integer data can ranges between -2,147,483,648 to 2,147,483,647.

Let's get into deep, signed integer data uses 4 bytes to store data which is 32-bits of memory.

In signed integer data type 1st-bit in 32-bits is reserved for sign, which represents the stored number is positive or negative.

Remaining 31-bits are used to store the data.

once again

1st-bit is sign-bit
31-bits are data-bits

if the sign-bit is 0 it represents positive number.if the sign-bit is 1 it represents negative number.

Based on the limit of the positive number that can be stored in signed integer data type is 2,147,483,647.


2,147,483,647 ->0 (+ve) 1111111111111111111111111111111(31-bit)

The first bit 0 represents  this number is positive

and the other bits are data bits which represents this number 2,147,483,647 in binary.

One more thing to note,by analyzing the binary value you can notice that all the 31-bits are one which represent it is the maximum value that can be stored in 31-bits.

So let's now get into your question.

It works. unsigned goes up to 2147483648. integer goes up to -2147483648. But why does it become negative?

First of all maximum value that can be stored in unsigned is 4,294,967,295 which is equal to 2^32-1.

Let consider this example code.

#include <stdio.h>

int main() {
    int a=2147483647;
    printf("%d\n",a);
    a++;
    printf("%d\n",a);

    return 0;
}

Output:

2147483647

-2147483648

In this code the value a stores the value 2147483647.and I tried increment it by one I'm getting -2147483648 instead of 2147483648.

This is because, if you convert 2147483648 this number in binary you will get:

2147483648 ->1 (-)ve 0000000000000000000000000000000(32-bit)

The leading one need to be get stored in data bits only,but it get overflowed to sign-bit.the sign-bit used to represent the number sign.

Because of that 1 overflowed into sign-bit the stored number is considered as negative number.Negative numbers are stored in memory in the form of 2's compliment of its corresponding positive value.

If you convert this the binary of this positive value 2147483648 in 2's compliment you will get this value

-2147483648->1 (-ve) 0000000000000000000000000000000(32-bit)

This is the value which get stored in memory,so you got this (-ve) 2147483648 reasonly.

Derick answered 29/6 at 8:28 Comment(0)
L
-1

In this program, the val value will overflow, if it is a 32- bit machine, because the size of integer is 4 bytes. Now, we have 2 type of values in math, positive and negative, so to do calculation involving negative results, we use sign representations i.e int or char in C language.

Lets take the example of char, range -128 to 127, unsigned char range 0-255 . It tells, range is divided into two parts for signed representation. So for any signed variable, if it crosses its range of +ve value, it goes into negative value. Like here in case of char, as the value goes above the 127, it becomes -ve. And suppose if you add 300 to any char or unsigned char variable what happens, it rolls over and starts again from zero.

char a=2;
a+=300;

what is the value?? now you know max value of char is 255(total 256 values, including zero), so 300-256 = 44 + 2 =46.
Hope this helps

Lichen answered 29/6, 2017 at 11:23 Comment(3)
"In this program, the val value will overflow" - no. "if it is a 32- bit machine, because the size of integer is 4 bytes" - integers can have any size, depending on type. And the width of the machine is not directly related to the width of the integer types. 68K for instance is considered a 32 bit architecture, but uses 16 and 32 bit int, depending on ABI. " char, range -128 to 127" - not guaranteed and not true for most machines. etc. Your answer is full of wrong assumptions. And char c = 127; c + 1; does not overflow, even for CHAR_MAX == 127 (which implies signed char).Theona
can you please give any reference, so that i can clarify this.Lichen
There is only one autoritative reference for C. ISO9899:2011.Theona

© 2022 - 2024 — McMap. All rights reserved.