How does C store negative numbers in signed vs unsigned integers?
B

5

5

Here is the example:

#include <stdio.h>

int main()
{
    int x=35;
    int y=-35;
    unsigned int z=35;
    unsigned int p=-35;
    signed int q=-35;
    printf("Int(35d)=%d\n\
Int(-35d)=%d\n\
UInt(35u)=%u\n\
UInt(-35u)=%u\n\
UInt(-35d)=%d\n\
SInt(-35u)=%u\n",x,y,z,p,p,q);

    return 0;
}

Output:

Int(35d)=35
Int(-35d)=-35
UInt(35u)=35
UInt(-35u)=4294967261
UInt(-35d)=-35
SInt(-35u)=4294967261

Does it really matter whether I declare the value as signed or unsigned int? It looks as if C only cares about how I read the value back from memory. Please help me understand this; I hope you can prove me wrong.

Boeschen answered 31/10, 2014 at 10:57 Comment(1)
It is not C, but a particular computer, which stores the numbers. The C99 standard just documents a set of properties of their behavior.Cavour
U
3

Does it really matter if I declare the value as signed or unsigned int?

Yes.

For example, have a look at

#include <stdio.h>

int main()
{
    int a = -4;
    int b = -3;
    unsigned int c = -4;
    unsigned int d = -3;
    printf("%f\n%f\n%f\n%f\n", 1.0 * a/b, 1.0 * c/d, 1.0*a/d, 1.*c/b);
}

and its output

1.333333
1.000000
-0.000000
-1431655764.000000

which clearly shows that it makes a huge difference if I have the same byte representation interpreted as signed or unsigned.

Undermine answered 31/10, 2014 at 11:8 Comment(10)
Is there a way to read the binary representation to see exactly what happens there?Boeschen
@Clean Thanks, do you know the specifier for unsigned char? I usually use c for chars, but I'm not sure how to read as unsigned.Boeschen
@A6Tech: You read the (int?-) variable as a succession of sizeof unsigned chars. But print those constituent unsigned chars as integers. (For best effect, 2-digit hex: %02X)Clean
And the output is: FFFFFFFC FFFFFFFD FFFFFFFC FFFFFFFD So, it doesn't show any difference.Boeschen
@A6Tech: So, looks like you have a 2s-complement-setup with 32bit-ints and no padding-bits. (Which might be Linux x86 / x86_64, Win32, Win64, Mac (intel), most other modern desktop OSes...)Clean
@Clean But, why does glglgl's example give different results?Boeschen
@Clean But if the numbers are stored the same in memory, and read the same way, why are the outputs different then?Boeschen
@A6Tech: because the values are interpreted differently based on whether the expressions are signed or unsigned.Darrickdarrill
@A6Tech The printf()/scanf() format specifiers for unsigned char are "%hhu", "%hhX", "%hhx", "%hho" to print/scan as a number.Threecolor
Exactly the arithmetics was what I was about.Undermine
D
8

Representation of signed integers is up to the underlying platform, not the C language itself. The language definition is mostly agnostic with regard to signed integer representations. Two's complement is probably the most common, but there are other representations such as one's complement and signed magnitude.

In a two's complement system, you negate a value by inverting the bits and adding 1. To get from 5 to -5, you'd do:

5 == 0101 => 1010 + 1 == 1011 == -5

To go from -5 back to 5, you follow the same procedure:

-5 == 1011 => 0100 + 1 == 0101 == 5

Does it really matter if I declare the value as signed or unsigned int?

Yes, for the following reasons:

  1. It affects the values you can represent: unsigned integers can represent values from 0 to 2^N - 1, whereas signed integers can represent values between -2^(N-1) and 2^(N-1) - 1 (two's complement).

  2. Overflow is well-defined for unsigned integers; UINT_MAX + 1 will "wrap" back to 0. Overflow is undefined behavior for signed integers, so INT_MAX + 1 may "wrap" to INT_MIN, or it may not.

  3. Because of 1 and 2, it affects arithmetic results, especially if you mix signed and unsigned variables in the same expression (in which case the result may not be well defined if there's an overflow).

Darrickdarrill answered 31/10, 2014 at 15:43 Comment(0)
S
4

An unsigned int and a signed int take up the same number of bytes in memory. They can store the same byte values. However, the data will be treated differently depending on whether it is declared signed or unsigned.

See http://en.wikipedia.org/wiki/Two%27s_complement for an explanation of the most common way to represent integer values.

Since you can typecast in C you can effectively force the compiler to treat an unsigned int as signed int and vice versa, but beware that it doesn't mean it will do what you think or that the representation will be correct. (Overflowing a signed integer invokes undefined behaviour in C).

(As pointed out in comments, there are other ways to represent integers than two's complement, however two's complement is the most common way on desktop machines.)

Squaw answered 31/10, 2014 at 11:3 Comment(8)
That's how C might do it... some of the time. Misleading and incomplete.Clean
The case is simply that it's not how C does it, but how C sometimes might do it. Other times, it does things differently to a possibly different outcome.Clean
A computer might have one's complement representation of integers and have a C99 standard implementation. But this is very unusual! And a C99 implementation doesn't even require any computer (you could use a bunch of human slaves but that would be unethical, inefficient, brittle!).Cavour
Your reasoning is wrong. Conversion to unsigned is value-preserving, modulo 1+MAX_VALUE if out-of-range. If that should correspond with 2s-complement-behavior in a specific instance, that's coincidence. (C does not have 2s-complement-behaviour on overflow, it has UB.)Clean
@Clean I edited a bit to make it more obvious what I meant. Didn't mean to imply that it would be neither safe nor correct to do conversions, just stating that it's possible to force the compiler to do things, no matter how stupid they are sometimes. Hope that's more clear now to avoid confusions.Squaw
@mafso Indeed, overflow of signed integers is UB in C. However, that's not possible without doing operations. It is possible to treat the byte values of a single unsigned/signed int variable as either signed int or unsigned int without that occurring for two's complement.Squaw
@Basile Starynkevitch Note: Computer is originally a job title, not a device.Threecolor
@ctux: I know that, but I guess that C99 standard does not use computer in that sense (it probably does not speak of computer, only of implementation ... which was my point)Cavour
B
1
#include <stdio.h>

int main(){
    int x = 35, y = -35;
    unsigned int z = 35, p = -35;
    signed int q = -35;

    printf("x=%d\tx=%u\ty=%d\ty=%u\tz=%d\tz=%u\tp=%d\tp=%u\tq=%d\tq=%u\t",x,x,y,y,z,z,p,p,q,q);
}

The result is: x=35 x=35 y=-35 y=4294967261 z=35 z=35 p=-35 p=4294967261 q=-35 q=4294967261

The integers are not stored differently; on this machine they are all stored in two's complement form in memory.

In hexadecimal, 35 is 0x00000023 and -35 is 0xFFFFFFDD, and that is the same whether you declare the variable signed or unsigned. Only the output style differs. For positive values %d and %u print the same thing. For negative values the top bit is the sign bit: printed with %u, 0xFFFFFFDD is 4294967261, while with %d the same 0xFFFFFFDD is read as -0x00000023, which is -35.

Butts answered 31/10, 2014 at 11:48 Comment(2)
That's why I asked what unsigned int is used for then... But glglgl's example shows that reading the same binary numbers as float gives different arithmetic results. But I don't understand how...Boeschen
glglgl's example shows floating-point division, not integer division. The numbers are stored in memory in IEEE 754 format (en.wikipedia.org/wiki/IEEE_754-1985), and floating-point division is a little complex. You can find more information there, including how signed and unsigned values are distinguished.Snakeroot
C
-1

The most fundamental thing a variable's type defines is how it is stored in memory (that is, how it is read and written) and how its bits are interpreted, so your statement can be considered "valid".

You can also look at the problem in terms of conversions. When you store a negative signed value in an unsigned variable, it is converted to unsigned. It so happens that this conversion is reversible on a two's complement platform: signed -35 converts to unsigned 4294967261 (with a 32-bit int), which can be converted back to signed -35 when you ask for it. That's how 2's complement encoding (see link in other answer) works.

Candelabrum answered 31/10, 2014 at 11:5 Comment(4)
So, what is the need for unsigned int, when I can store all numbers in int and read them as signed or unsigned at my will? :DBoeschen
@A6Tech How will you know which to choose if it has 2 meanings?Squaw
You could also use raw memory (void*) with raw buffers and typecast everywhere, but that's C, not assembler (;Candelabrum
2's complement is not contractual.Clean

© 2022 - 2024 — McMap. All rights reserved.