What is going on with bitwise operators and integer promotion?

I have a simple program. Notice that I use an unsigned fixed-width integer 1 byte in size.

#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
    uint8_t x = 12;
    std::cout << (x << 1) << '\n';
    std::cout << ~x;

    std::cin.clear();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::cin.get();

    return 0;
}

My output is the following.

24
-13

I tested larger numbers and operator << always gives me positive numbers, while operator ~ always gives me negative numbers. I then used sizeof() and found...

When I use the left shift bitwise operator (<<), I receive an unsigned 4 byte integer.

When I use the bitwise not operator (~), I receive a signed 4 byte integer.

It seems that the bitwise not operator (~) does a signed integral promotion like the arithmetic operators do. However, the left shift operator (<<) seems to promote to an unsigned integral type.

I feel obligated to know when the compiler is changing something behind my back. If I'm correct in my analysis, do all the bitwise operators promote to a 4 byte integer? And why are some signed and some unsigned? I'm so confused!

Edit: My assumption of always getting positive or always getting negative values was wrong. But from being wrong, I understand what was really happening thanks to the great answers below.

Danelledanete answered 27/5, 2015 at 5:34 Comment(2)
How did you get the stream to output your uint8_t as a number rather than a character? Are you sure your compiler does not alias that type to int?Feola
@AntonSamsonov In the answer below he explains this as a result of the integral promotion that happens after the bitwise operation takes place. In other words, the data-type was promoted from a uint8_t to an int.Danelledanete
M
16

[expr.unary.op]

The operand of ~ shall have integral or unscoped enumeration type; the result is the one’s complement of its operand. Integral promotions are performed.

[expr.shift]

The shift operators << and >> group left-to-right. [...] The operands shall be of integral or unscoped enumeration type and integral promotions are performed.

What's the integral promotion of uint8_t (which is usually going to be unsigned char behind the scenes)?

[conv.prom]

A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int.

So int, because all of the values of a uint8_t can be represented by int.

What is int(12) << 1 ? int(24).

What is ~int(12) ? int(-13).
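
As a minimal sketch of the reasoning above, the promoted types and values can be checked at compile time. The helper names `shifted` and `inverted` are illustrative and not from the question, and the last assertion assumes a two's complement `int`, which holds on mainstream platforms:

```cpp
#include <cstdint>
#include <type_traits>

// Illustrative helpers (hypothetical names): both operators promote the
// std::uint8_t operand to int before doing any work.
constexpr int shifted(std::uint8_t x)  { return x << 1; }
constexpr int inverted(std::uint8_t x) { return ~x; }

// The expressions themselves already have type int, not unsigned int.
static_assert(std::is_same<decltype(std::uint8_t{12} << 1), int>::value,
              "x << 1 has type int");
static_assert(std::is_same<decltype(~std::uint8_t{12}), int>::value,
              "~x has type int");

// Same values as the program in the question prints.
static_assert(shifted(12) == 24, "int(12) << 1");
static_assert(inverted(12) == -13, "~int(12) with a two's complement int");
```

Because the checks are `static_assert`s, simply compiling the file is enough to confirm them.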

Misquotation answered 27/5, 2015 at 5:51 Comment(10)
And the reason I was getting all negative numbers after using the bitwise not operator(~) is because most binary numbers have trailing 0's from the left side and when they are flipped, the left most digit is probably going to be 1 making the number negative. This is especially true if the memory holding the value is 4 bytes in size giving me 2^32 possible values and if the value I choose is much smaller than this range.Danelledanete
It looks like you found this compiler information from an in-depth C++ book or manual. If it is, what's it called? I would like to use it as a resource if I get stuck on compiler operations.Danelledanete
@WanderingIdiot "trailing from the left" is called "leading". The book is called the C++ standard (not a recommended resource). en.cppreference.com/w/cpp/language/…Wolcott
@WanderingIdiot The Holy Document of C++ LawTabanid
But uint8_t byte1, byte2; cout << sizeof(byte1 & byte2); returning 4 is, to me, a failure within the language!Europeanize
@DrumM byte1 & byte2 promotes to an int, what were you expecting?Misquotation
As they are both 1 byte, I expect the result of an AND operation also 1 byte! What about uint64_t? As this is bigger than an int, the result of an AND operation for 2 uint64_ts is uint64_t. This shows there is a flaw in the language, it's not consistent at all.Europeanize
@DrumM It's not a flaw as such but a legacy quirk left over from C; welcome to C++ I guess. Changing the rules now would break an unfathomable amount of legacy code so you'll just have to deal with it like everyone else.Misquotation
Yep correct, it's just that it's hard to find out all of these quirks by yourself, if you have any page with ALL these quirks, let me know ;-) Thanks!Europeanize
@DrumM Any decent C++ intro book should cover most of them, e.g. stroustrup.com/4th.htmlMisquotation
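
The behaviour discussed in these comments can be reproduced with a short sketch. The assertions compare against `sizeof(int)` rather than the literal 4, so they should not depend on the platform; `and_u8` is a hypothetical helper showing one way to get a 1-byte result back by converting explicitly:

```cpp
#include <cstdint>
#include <type_traits>

// Operands narrower than int are promoted, so the result is an int.
static_assert(std::is_same<decltype(std::uint8_t{} & std::uint8_t{}), int>::value,
              "uint8_t & uint8_t has type int");
static_assert(sizeof(std::uint8_t{} & std::uint8_t{}) == sizeof(int),
              "sizeof reports the size of int");

// Operands at least as wide as int keep their type.
static_assert(std::is_same<decltype(std::uint64_t{} & std::uint64_t{}),
                           std::uint64_t>::value,
              "uint64_t & uint64_t stays uint64_t");

// Hypothetical helper: narrow the promoted result back to 8 bits.
constexpr std::uint8_t and_u8(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(a & b);
}

static_assert(sizeof(and_u8(0xF0, 0x3C)) == 1, "explicit conversion gives 1 byte");
static_assert(and_u8(0xF0, 0x3C) == 0x30, "11110000 & 00111100 == 00110000");
```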
G
5

For performance reasons, the C and C++ languages treat int as the "most natural" integer type, while types "smaller" than int are treated as a sort of "storage" type.

When you use a storage type in an expression it gets automatically converted to an int or to an unsigned int implicitly. For example:

// Assume a char is 8 bit
unsigned char x = 255;
unsigned char one = 1;

int y = x + one; // result will be 256 (too large for a byte!)
++x;             // x is now 0

What happened is that x and one in the first expression were implicitly converted to int, the addition was computed on those ints, and the result was stored in an int. In other words, the computation was NOT performed using two unsigned chars.

Likewise if you have a float value in an expression the first thing the compiler will do is promoting it to a double (in other words float is a storage type and double is instead the natural size for floating point numbers). This is why, when you use printf to print floats, you don't need to say %lf in the format string and %f is enough (%lf is needed for scanf, however, because that function stores a result and a float can be smaller than a double).

C++ complicated the matter quite a bit because when passing parameters to functions you can discriminate between ints and smaller types. Thus it's not ALWAYS true that a conversion is performed in every expression... for example you can have:

void foo(unsigned char x);
void foo(int x);

where

unsigned char x = 255, one = 1;
foo(x);       // Calls foo(unsigned char), no promotion
foo(x + one); // Calls foo(int), promotion of both x and one to int
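
A compile-checkable variant of the example above, as a sketch: the overloads are renamed to a hypothetical `which` that returns a tag, so the chosen overload can be verified with `static_assert`. It assumes 8-bit chars, as in the earlier snippet:

```cpp
// Hypothetical stand-ins for foo(): each overload returns a tag identifying itself.
constexpr int which(unsigned char) { return 1; }
constexpr int which(int)           { return 2; }

constexpr unsigned char x = 255, one = 1;

// Passing x directly is an exact match: no promotion takes place.
static_assert(which(x) == 1, "calls which(unsigned char)");

// x + one is computed in int, so the int overload is selected...
static_assert(which(x + one) == 2, "calls which(int)");

// ...and the sum does not wrap around at 8 bits.
static_assert(x + one == 256, "addition is performed on ints");
```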
Gilmagilman answered 27/5, 2015 at 6:8 Comment(6)
I liked your last remark about how implicit conversions are not performed when passing function parameters. That's some useful info to know, thank you.Danelledanete
What is so "natural" about it? Well, the compiler may do whatever it needs / wants to compute the result (in the fastest way possible, or with other considerations in mind), but it should preserve (largest) operand type. Only the programmer is supposed to widen expression type - with an explicit cast; any other behavior is counter-intuitive. Or is it more "natural", from your point of view, to write explicit casts around all expressions to put them back in their original domain?Feola
"Likewise if you have a float value in an expression the first thing the compiler will do is promoting it to a double" - That's not true in C++. It only happens as part of default argument promotions (paragraph [5.2.2p7] in the standard), which are only applied to function arguments that are matched with the ellipsis parameter specification (..., which is why it happens for printf). In a + b, if both a and b are float, neither is promoted and the type of the result is float; if one is float and the other one double, then the conversion happens, but that's something else.Clotho
@bogdan: a function doing a = b + c where a, b and c are float generates the same identical byte-per-byte machine code as one doing instead a = (double)b + (double)c.Gilmagilman
That's an implementation detail, and it's not even true for all implementations. I've just verified that MSVC12 generates very different code for your two cases. It also generates a warning for conversion from double to float for your second case (on /W4), which doesn't happen for the first case. You can verify that b + c has type float by using std::is_same<decltype(a + b), float>::value.Clotho
The reason compilers are allowed to do what you said is paragraph [5p12]: "The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby." (emphasis mine). This is very different from the rules for usual arithmetic conversions and integral promotions, which actually change the types of the operands and results.Clotho
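
The point made in the comments above can be checked with the `std::is_same` test they suggest. This is only a sketch of the type rules; it says nothing about the machine code a particular compiler emits:

```cpp
#include <type_traits>

// float + float is not promoted: the type of the result is float.
static_assert(std::is_same<decltype(1.0f + 2.0f), float>::value,
              "float + float has type float");

// Mixing float and double converts the float operand
// (usual arithmetic conversions), so the result is double.
static_assert(std::is_same<decltype(1.0f + 2.0), double>::value,
              "float + double has type double");

// Contrast with small integer types, which really are promoted.
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) +
                                    static_cast<unsigned char>(1)), int>::value,
              "unsigned char + unsigned char has type int");
```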
N
4

I tested larger numbers and operator << always gives me positive numbers, while operator ~ always gives me negative numbers. I then used sizeof() and found...

Wrong, test it:

uint8_t v = 1;
for (int i = 0; i < 32; i++) std::cout << (v << i) << std::endl; // v is promoted to int before each shift

gives:

1
2
4
8
16
32
64
128
256
512
1024
2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
-2147483648

uint8_t is an 8-bit unsigned integer type, which can represent values in the range [0,255]. Because that range is included in the range of int, it is promoted to int (not unsigned int). Promotion to int takes precedence over promotion to unsigned int.
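
If an unsigned result is what you want, one option is to convert explicitly rather than rely on the promotion. The sketch below uses two hypothetical helpers, `shl_u32` and `not_u8`, and assumes `std::uint32_t` is `unsigned int` (true on mainstream platforms) and that the shift count is less than 32:

```cpp
#include <cstdint>

// Hypothetical helper: shift in unsigned arithmetic, so bit 31 is an
// ordinary value bit instead of the sign bit.
constexpr std::uint32_t shl_u32(std::uint8_t v, unsigned n)
{
    return static_cast<std::uint32_t>(v) << n;
}

// Hypothetical helper: narrow the promoted (int) result of ~ back to 8 bits.
constexpr std::uint8_t not_u8(std::uint8_t v)
{
    return static_cast<std::uint8_t>(~v);
}

static_assert(shl_u32(1, 31) == 2147483648u, "positive, unlike int(1) << 31");
static_assert(not_u8(12) == 243, "~12 reduced modulo 256");
```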

Northing answered 27/5, 2015 at 6:5 Comment(0)
T
3

Look into two's complement and how computers store negative integers.
Try this:

#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
    uint8_t x = 1;
    int shiftby = 8 * sizeof(int) - 1;
    std::cout << (x << shiftby) << '\n'; // or std::cout << (x << 31) << '\n';

    std::cout << ~x;

    std::cin.clear();
    std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::cin.get();
    return 0;
}

The first line of output is -2147483648 (assuming a 32-bit int); the second is -2.

In general, if the first (most significant) bit of a signed number is 1, the number is considered negative. So if you shift a number far enough that its first bit becomes 1, the result will be negative.

** EDIT **
Well, I can think of a reason why shift operators would use unsigned int. Consider the right shift operation >>: if you right shift -12 you will get 122 instead of -6. This is because it adds a zero at the beginning without considering the sign.

Tackling answered 27/5, 2015 at 5:39 Comment(2)
I don't believe this answers my question completely, but let me follow along with what you're saying and maybe I'll figure this out. When you left shift 31 times the value overflows. From this, it looks like the left shift operator does signed integral promotion instead of unsigned integral promotion. So if that's true, then that would mean the compiler checks the leftmost digit of the binary number to determine its sign. That must be it! The compiler must always do integral promotion with bitwise operators when the operand is narrower than int.Danelledanete
Wait, what? I don't know which compiler you used to test this, but GCC on x86_64 emits a SAR instruction for a signed shift, which preserves the sign. Therefore, (-12 >> 1) == -6.Rodin
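
A small sketch to check the comment above. Before C++20 the result of right-shifting a negative int is implementation-defined, but GCC, Clang and MSVC all perform an arithmetic shift, and C++20 requires it. The helper `logical_shr` is a hypothetical name used to contrast this with an unsigned shift:

```cpp
#include <climits>

// Signed right shift keeps the sign (arithmetic shift).
static_assert((-12 >> 1) == -6, "sign bit is preserved");

// Hypothetical helper: a zero-filling (logical) shift only happens
// when the operand is unsigned.
constexpr unsigned logical_shr(int value, unsigned n)
{
    return static_cast<unsigned>(value) >> n;
}

// With a 32-bit unsigned int this is 2147483642.
static_assert(logical_shr(-12, 1) == (UINT_MAX - 11u) / 2u,
              "a zero is shifted in from the left");
```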
