Why is Java's unsigned bit shift of a negative byte so strange?

I have a byte variable:

byte varB = (byte) -1; // binary view: 1111 1111

I want to see the two left-most bits, so I do an unsigned right shift by 6 bits:

varB = (byte) (varB >>> 6);

But I get -1, as if the variable were an int, and I get 3 only if I shift by 30!

How can I work around this and get the result with just a 6-bit shift?

Nicknickel answered 16/4, 2015 at 17:49 Comment(1)
All arithmetic operations on a char, byte or short promote to int first. - Thermopile

The reason is the sign extension that comes with the numeric promotion to int, which happens whenever you bit-shift a byte. The value of varB is promoted to int before shifting. The unsigned right shift does occur, but its effect is lost when casting back to byte, which keeps only the lowest 8 bits:

varB (byte)     : 11111111
promoted to int : 11111111 11111111 11111111 11111111
shift right 6   : 00000011 11111111 11111111 11111111
cast to byte    : 11111111
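
The steps in the diagram can be reproduced one at a time. This is only a minimal sketch: the class name PromotionDemo and the helper bits32 are made up for illustration and are not part of the question.

```java
class PromotionDemo {
    // Hypothetical helper: renders an int as 32 binary digits, zero-padded on the left.
    static String bits32(int value) {
        return String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    public static void main(String[] args) {
        byte varB = (byte) -1;
        int promoted = varB;             // widening with sign extension
        int shifted = promoted >>> 6;    // the unsigned shift acts on all 32 bits
        byte narrowed = (byte) shifted;  // keeps only the lowest 8 bits

        System.out.println(bits32(promoted));
        System.out.println(bits32(shifted));
        System.out.println(narrowed);
    }
}
```

The two zero bits introduced by the shift sit at the top of the 32-bit value, so the cast back to byte never sees them.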

You can use the bitwise AND operator & to mask out the unwanted bits before shifting. ANDing with 0xFF keeps only the 8 least significant bits.

varB = (byte) ((varB & 0xFF) >>> 6);

Here's what happens now:

varB (byte)     : 11111111
promoted to int : 11111111 11111111 11111111 11111111
bit-and mask    : 00000000 00000000 00000000 11111111
shift right 6   : 00000000 00000000 00000000 00000011
cast to byte    : 00000011
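
If this comes up more than once, the mask can be wrapped in a small helper. The sketch below uses assumed names (MaskDemo, unsignedShiftRight). Byte.toUnsignedInt, available since Java 8, performs the same conversion as the & 0xFF mask.

```java
class MaskDemo {
    // Hypothetical helper: logical right shift of a byte treated as 8 unsigned bits.
    // distance is assumed to be in 0..31; Java uses only the low five bits of the
    // shift distance when the left operand is an int.
    static byte unsignedShiftRight(byte value, int distance) {
        return (byte) ((value & 0xFF) >>> distance);
    }

    public static void main(String[] args) {
        byte varB = (byte) -1;
        System.out.println(unsignedShiftRight(varB, 6));     // mask, then shift
        System.out.println(Byte.toUnsignedInt(varB) >>> 6);  // same idea via the standard library
    }
}
```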
Conductor answered 16/4, 2015 at 17:55 Comment(1)
"Promoted to int" means the byte is converted to an int: (int) varB. Although the value stays the same, the binary representation changes a lot. - Argumentum

Because that's how shifting bytes is defined in the Java language: https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.19

The gist is that types smaller than int are silently widened to int, shifted and then narrowed back.

Which makes your single line effectively the equivalent of:

byte b = -1;      // 1111_1111
int temp = b;     // 1111_1111_1111_1111_1111_1111_1111_1111
temp >>>= 6;      // 0000_0011_1111_1111_1111_1111_1111_1111
b = (byte) temp;  // 1111_1111
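
The narrowing step is easiest to overlook with compound assignment, which includes an implicit cast back to the type of the left-hand variable. A minimal sketch, with made-up names (CompoundShiftDemo, compoundShift):

```java
class CompoundShiftDemo {
    // Hypothetical helper showing that >>>= on a byte compiles without a cast
    // and behaves like b = (byte) (b >>> 6).
    static byte compoundShift(byte b) {
        b >>>= 6;
        return b;
    }

    public static void main(String[] args) {
        System.out.println(compoundShift((byte) -1));  // negative input: sign extension leaks back in
        System.out.println(compoundShift((byte) 64));  // non-negative input behaves as expected
    }
}
```

Only negative bytes are affected, because only they gain leading 1 bits during promotion.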

To shift just the byte, you need to make the widening conversion explicitly yourself, with unsigned semantics (and the narrowing conversion needs to be done manually, too):

byte b = -1;          // 1111_1111
int temp = b & 0xFF;  // 0000_0000_0000_0000_0000_0000_1111_1111
temp >>>= 6;          // 0000_0000_0000_0000_0000_0000_0000_0011
b = (byte) temp;      // 0000_0011
Dori answered 16/4, 2015 at 17:57 Comment(0)

One problem with the top answer is that, although it works correctly for the unsigned right shift >>>, it doesn't work for the signed right shift >>. This is because >> depends on the sign bit (the one farthest to the left), and the byte's sign bit is no longer in that position once the value is promoted to int and masked. This means that when you use >> after masking, you'll get 00000011 when you might expect 11111111. If you want a trick that works for both, try shifting left by 24, doing your chosen right shift, then shifting back to the right by 24. That way your byte data's sign bit is in the right place.

varB = (byte) (varB << 24 >> 6 >> 24);

I've [bracketed] the sign bit. Here's what's happening:

varB (byte)          : [1]1111111
promoted to int      : [1]1111111 11111111 11111111 11111111
shift left 24        : [1]1111111 00000000 00000000 00000000
signed shift right 6 : [1]1111111 11111100 00000000 00000000
shift right 24       : [1]1111111 11111111 11111111 11111111
cast to byte         : [1]1111111

Here you can see it also works for >>>:

varB = (byte) (varB << 24 >>> 6 >> 24);

varB (byte)            : [1]1111111
promoted to int        : [1]1111111 11111111 11111111 11111111
shift left 24          : [1]1111111 00000000 00000000 00000000
unsigned shift right 6 : [0]0000011 11111100 00000000 00000000
shift right 24         : [0]0000000 00000000 00000000 00000011
cast to byte           : [0]0000011
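
Both variants can be wrapped as helpers. This is a sketch under assumptions: the names ShiftTrickDemo, signedShiftRight and unsignedShiftRight are invented here, and distance is assumed to be in 0..7.

```java
class ShiftTrickDemo {
    // Hypothetical helper: arithmetic (sign-preserving) right shift of a byte.
    static byte signedShiftRight(byte value, int distance) {
        return (byte) (value << 24 >> distance >> 24);
    }

    // Hypothetical helper: logical (zero-filling) right shift of a byte.
    static byte unsignedShiftRight(byte value, int distance) {
        return (byte) (value << 24 >>> distance >> 24);
    }

    public static void main(String[] args) {
        byte varB = (byte) -1;
        System.out.println(signedShiftRight(varB, 6));    // sign bit is copied in from the left
        System.out.println(unsignedShiftRight(varB, 6));  // zeros are shifted in from the left
    }
}
```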

This costs more operations in exchange for not having to remember which shift you should and shouldn't bitmask. So use whatever solution works for you.

Btw, it's good to know that short is also promoted to int, which means everything in these answers applies to it as well. The only difference is that you shift left/right by 16, and the bitmask is 0xFFFF.
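
For short, the two approaches might look like the sketch below. The names ShortShiftDemo, maskShift and trickShift are made up for illustration, and distance is assumed to be in 0..15.

```java
class ShortShiftDemo {
    // Hypothetical helper: mask to 16 bits, then do the unsigned shift.
    static short maskShift(short value, int distance) {
        return (short) ((value & 0xFFFF) >>> distance);
    }

    // Hypothetical helper: move the short's sign bit to bit 31, shift, move back.
    static short trickShift(short value, int distance) {
        return (short) (value << 16 >>> distance >> 16);
    }

    public static void main(String[] args) {
        short varS = (short) -1;
        System.out.println((short) (varS >>> 14));  // naive version, still affected by sign extension
        System.out.println(maskShift(varS, 14));
        System.out.println(trickShift(varS, 14));
    }
}
```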

Cisco answered 20/12, 2022 at 4:19 Comment(0)
