I have a question (or more likely a bug report) about bit shifting behavior in Delphi (tested in Borland Delphi 7).
Target: perform an "Arithmetic" bitwise shift right with any number.
This means that sign-bit must be extended – binary number will be filled from the left with 1 instead of 0 if the most significant bit of a number was set.
So, number "-1" after an arithmetic shift right must stay the same number (all bits = 1), but with "logical shift" (which always fill the number with zeroes) must give a maximal positive integer (maximal positive signed integer, to be correct)
I tested it only on 32-bit system (Windows); moreover, I need it to work explicitly with 32-bit integers.
Looks like there is an internal bug in Delphi with "shr" when the source number is stored in a variable.
My example code:
program bug;
{$APPTYPE CONSOLE}
var
I:Integer;
C:Cardinal;
begin
I := -1; // we’ll need that later
C := $FFFFFFFF;
(It’s only the beginning). Next, let’s try some "shr"s:
Writeln('0) ', -1 shr 1 );
Writeln('1) ', $FFFFFFFF shr 1 );
"-1" is a signed equivalent to "$FFFFFFFF". Seems that "shr" behavior (arithmetic or logical) is based on the fact whether the source number is signed or is not (integer or cardinal).
Output is:
0) -1
1) 2147483647
Quite correct. Then I need to try to manually cast these numbers to integers or cardinals:
Writeln('2) ', Integer(-1) shr 1 );
Writeln('3) ', Integer($FFFFFFFF) shr 1 );
Writeln('4) ', Cardinal(-1) shr 1 );
Writeln('5) ', Cardinal($FFFFFFFF) shr 1 );
Result:
2) -1
3) -1
4) 2147483647
5) 2147483647
Still correct. So, I think that I can cast anything to "integer" if I need an arithmetic shift; or cast to "cardinal" when I want a logical shift. But wait! Example with variables (declared above) :
Writeln('6) ', I shr 1 );
Writeln('7) ', C shr 1 );
Suddenly:
6) 2147483647
7) 2147483647
INCORRECT. My "I" was a signed integer, and I expected the arithmetic shift! So, maybe casting could help?
Writeln('8) ', Integer(I) shr 1 );
Writeln('9) ', Cardinal(I) shr 1 );
Writeln('A) ', Integer(C) shr 1 );
Writeln('B) ', Cardinal(C) shr 1 );
No, still the same...
8) 2147483647
9) 2147483647
A) 2147483647
B) 2147483647
Things are even worse if I’ll try to make a function "a shr b" and use it instead:
// Simple shift right with signed integers
function shrI(a,b:Integer):Integer;
begin
Result := a shr b;
end;
// Simple shift right with unsigned integers
function shrC(a,b:Cardinal):Cardinal;
begin
Result := a shr b;
end;
Now:
Writeln('C) ', shrI(-1,1) );
Writeln('D) ', shrC($FFFFFFFF,1) );
– It stopped working even with constant expressions: (it makes sense, because numbers are again stored in variables inside a function)
C) 2147483647
D) 2147483647
Since I need to make my correct arithmetic shift anyway, I wrote these formulas to do this (shift "a" right by "b" bits). First is the logical shift:
(a shr b) and ((1 shl (32-b))-1)
I just need to bitwise-and the result with "32 - b" ones (from the right) to clear "b" left bits in case "shr" fail me and did arithmetic shift instead (no example shows this, but just to make sure). Then the arithmetic shift:
(a shr b) or (( 0-((a shr 31) and 1)) shl (32-b))
I need to bitwise-or the result with "b" ones from the left, but only when most significant bit was set; to do this firstly I take sign bit with "(a shr 31) and 1", then negate this number to get "-1" (or $FFFFFFFF – all bits =1) if source was negative, and 0 otherwise (I put "0-x" instead of just "-x" because in my C-port in some cases bcc32 C-compiler report a warning of tying to negate an unsigned integer); and finally I shifted it to "32 - b" bits left, so I got what I wish even when "shr" fails and give zeroes. I made two version of each function to deal with both integers and cardinals (also I could share names and "overload" them for me, but here I won’t do this to keep the example clear):
// Logical shift right with signed integers
function srlI(a,b:Integer):Integer;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;
// Arithmetic shift right with signed integers
function sraI(a,b:Integer):Integer;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;
// Logical shift right with unsigned integers
function srlC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;
// Arithmetic shift right with unsigned integers
function sraC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;
Test it:
Writeln('E) ', sraI(-1,1) );
Writeln('F) ', srlI(-1,1) );
Writeln('G) ', sraC($FFFFFFFF,1) );
Writeln('H) ', srlC($FFFFFFFF,1) );
And got perfect results:
E) -1
F) 2147483647
G) 4294967295
H) 2147483647
(G-case is still correct, because "4294967295" is the unsigned version of "-1")
Final checks with variables:
Writeln('K) ', sraI(I,1) );
Writeln('L) ', srlI(I,1) );
Writeln('M) ', sraC(C,1) );
Writeln('N) ', srlC(C,1) );
Perfect:
K) -1
L) 2147483647
M) 4294967295
N) 2147483647
For this bug I also tried to change the second number (amount of shift) to a variable and/or try different casting it – same bug present, looks like it isn’t related with second argument. And trying to cast the result (to integer or to cardinal) before output didn’t improve anything either.
To make sure that I’m not only one who have the bug, I tried to run my entire example at http://codeforces.com/ (there a registered user can compile and execute a piece of code in different languages and compilers on server-side) to see the output.
"Delphi 7" compiler gave me exactly what I have – bug was present. Alternative option, "Free Pascal 2" shows even more wrong output:
0) 9223372036854775807
1) 2147483647
2) 9223372036854775807
3) 9223372036854775807
4) 2147483647
5) 2147483647
6) 2147483647
7) 2147483647
8) 2147483647
9) 2147483647
A) 2147483647
B) 2147483647
C) 2147483647
D) 2147483647
E) -1
F) 2147483647
G) 4294967295
H) 2147483647
K) -1
L) 2147483647
M) 4294967295
N) 2147483647
Strange "9223372036854775807" in cases 0-2-3 (there was "-1", "Integer(-1)" and "Integer($FFFFFFFF)" who don’t remember).
Here is my entire example in Delphi:
program bug;
{$APPTYPE CONSOLE}
// Simple shift right with signed integers
function shrI(a,b:Integer):Integer;
begin
Result := a shr b;
end;
// Simple shift right with unsigned integers
function shrC(a,b:Cardinal):Cardinal;
begin
Result := a shr b;
end;
// Logical shift right with signed integers
function srlI(a,b:Integer):Integer;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;
// Arithmetic shift right with signed integers
function sraI(a,b:Integer):Integer;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;
// Logical shift right with unsigned integers
function srlC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) and ((1 shl (32-b))-1);
end;
// Arithmetic shift right with unsigned integers
function sraC(a,b:Cardinal):Cardinal;
begin
Result := (a shr b) or (( 0-((a shr 31) and 1)) shl (32-b));
end;
var
I:Integer;
C:Cardinal;
begin
I := -1;
C := $FFFFFFFF;
Writeln('0) ', -1 shr 1 );
Writeln('1) ', $FFFFFFFF shr 1 );
// 0) -1 - correct
// 1) 2147483647 - correct
Writeln('2) ', Integer(-1) shr 1 );
Writeln('3) ', Integer($FFFFFFFF) shr 1 );
// 2) -1 - correct
// 3) -1 - correct
Writeln('4) ', Cardinal(-1) shr 1 );
Writeln('5) ', Cardinal($FFFFFFFF) shr 1 );
// 4) 2147483647 - correct
// 5) 2147483647 - correct
Writeln('6) ', I shr 1 );
Writeln('7) ', C shr 1 );
// 6) 2147483647 - INCORRECT!
// 7) 2147483647 - correct
Writeln('8) ', Integer(I) shr 1 );
Writeln('9) ', Cardinal(I) shr 1 );
// 8) 2147483647 - INCORRECT!
// 9) 2147483647 - correct
Writeln('A) ', Integer(C) shr 1 );
Writeln('B) ', Cardinal(C) shr 1 );
// A) 2147483647 - INCORRECT!
// B) 2147483647 - correct
Writeln('C) ', shrI(-1,1) );
Writeln('D) ', shrC($FFFFFFFF,1) );
// C) 2147483647 - INCORRECT!
// D) 2147483647 - correct
Writeln('E) ', sraI(-1,1) );
Writeln('F) ', srlI(-1,1) );
// E) -1 - correct
// F) 2147483647 - correct
Writeln('G) ', sraC($FFFFFFFF,1) );
Writeln('H) ', srlC($FFFFFFFF,1) );
// G) 4294967295 - correct
// H) 2147483647 - correct
Writeln('K) ', sraI(I,1) );
Writeln('L) ', srlI(I,1) );
// K) -1 - correct
// L) 2147483647 - correct
Writeln('M) ', sraC(C,1) );
Writeln('N) ', srlC(C,1) );
// M) 4294967295 - correct
// N) 2147483647 - correct
end.
Then I was curios, is this bug also present in C++? I wrote a port to C++ and use (Borland!) bcc32.exe to compile it.
Results:
0) -1
1) 2147483647
2) -1
3) -1
4) 2147483647
5) 2147483647
6) -1
7) 2147483647
8) -1
9) 2147483647
A) -1
B) 2147483647
C) -1
D) 2147483647
E) -1
F) 2147483647
G) 4294967295
H) 2147483647
K) -1
L) 2147483647
M) 4294967295
N) 2147483647
Everything is OK. Here is C++ version, in case someone also wants to look:
#include <iostream>
using namespace std;
// Simple shift right with signed integers
int shrI(int a, int b){
return a >> b;
}
// Simple shift right with unsigned integers
unsigned int shrC(unsigned int a, unsigned int b){
return a >> b;
}
// Logical shift right with signed integers
int srlI(int a, int b){
return (a >> b) & ((1 << (32-b))-1);
}
// Arithmetic shift right with signed integers
int sraI(int a, int b){
return (a >> b) | (( 0-((a >> 31) & 1)) << (32-b));
}
// Logical shift right with unsigned integers
unsigned int srlC(unsigned int a, unsigned int b){
return (a >> b) & ((1 << (32-b))-1);
}
// Arithmetic shift right with unsigned integers
unsigned int sraC(unsigned int a, unsigned int b){
return (a >> b) | (( 0-((a >> 31) & 1)) << (32-b));
}
int I;
unsigned int C;
int main(){
I = -1;
C = 0xFFFFFFFF;
cout<<"0) "<<( -1 >> 1 )<<endl;
cout<<"1) "<<( 0xFFFFFFFF >> 1 )<<endl;
// 0) -1 - correct
// 1) 2147483647 - correct
cout<<"2) "<<( ((int)(-1)) >> 1 )<<endl;
cout<<"3) "<<( ((int)(0xFFFFFFFF)) >> 1 )<<endl;
// 2) -1 - correct
// 3) -1 - correct
cout<<"4) "<<( ((unsigned int)(-1)) >> 1 )<<endl;
cout<<"5) "<<( ((unsigned int)(0xFFFFFFFF)) >> 1 )<<endl;
// 4) 2147483647 - correct
// 5) 2147483647 - correct
cout<<"6) "<<( I >> 1 )<<endl;
cout<<"7) "<<( C >> 1 )<<endl;
// 6) -1 - correct
// 7) 2147483647 - correct
cout<<"8) "<<( ((int)(I)) >> 1 )<<endl;
cout<<"9) "<<( ((unsigned int)(I)) >> 1 )<<endl;
// 8) -1 - correct
// 9) 2147483647 - correct
cout<<"A) "<<( ((int)(C)) >> 1 )<<endl;
cout<<"B) "<<( ((unsigned int)(C)) >> 1 )<<endl;
// A) -1 - correct
// B) 2147483647 - correct
cout<<"C) "<<( shrI(-1,1) )<<endl;
cout<<"D) "<<( shrC(0xFFFFFFFF,1) )<<endl;
// C) -1 - correct
// D) 2147483647 - correct
cout<<"E) "<<( sraI(-1,1) )<<endl;
cout<<"F) "<<( srlI(-1,1) )<<endl;
// E) -1 - correct
// F) 2147483647 - correct
cout<<"G) "<<( sraC(0xFFFFFFFF,1) )<<endl;
cout<<"H) "<<( srlC(0xFFFFFFFF,1) )<<endl;
// G) 4294967295 - correct
// H) 2147483647 - correct
cout<<"K) "<<( sraI(I,1) )<<endl;
cout<<"L) "<<( srlI(I,1) )<<endl;
// K) -1 - correct
// L) 2147483647 - correct
cout<<"M) "<<( sraC(C,1) )<<endl;
cout<<"N) "<<( srlC(C,1) )<<endl;
// M) 4294967295 - correct
// N) 2147483647 - correct
}
Before posting here, I tried to search this problem, and didn’t found any mention of this bug. Also I looked here: What is the behaviour of shl and shr for non register sized operands? and here: Arithmetic Shift Right rather than Logical Shift Right – but there other problems were discussed (that compiler internally casts any type to 32-bit number before doing actual shift; or shifting more than 31 bits), but not mine bug.
But wait, here is my problem: http://galfar.vevb.net/wp/2009/shift-right-delphi-vs-c/ !
With one remark: they say –
In Delphi the SHR is always a SHR operation: it never takes into account the sign.
But my example shows that Delphi does take sign into account, but only when source number is a constant expression, not a variable. So "-10 shr 2" equals to "-3", but "x shr 2" equals to "1073741821" when "x:=-10".
So I think this is a bug, and not a "behavior" that "shr" is always logical. You see, not always.
Trying to enable/disable any compiler options such a range checking or optimizations didn’t change anything.
Also, here I posted examples of how to bypass this problem and have correct arithmetic shift right. And my main question is: am I right?
Seems that left shift is always good in Delphi (it never uses the original sign bit, and not "undefined": for signed integers it behave as casting to cardinal before shifting and casting the result back to integer – a number can suddenly became negative of course). But now I wonder, are there any other similar bugs in Delphi? This is the first really significant bug I ever discover in Delphi 7. I love Delphi more than C++ exactly because I was always sure that my code is everytime doing what I want, without debug-testing every new unusual piece of code that I’m about to write (IMHO).
P.S. Here are some useful links that StackOverflow system suggests me when I typed my title before posting this question. Again, interesting information, but not about this bug:
Arithmetic bit-shift on a signed integer
Signed right shift = strange result?
Bitwise shift operators on signed types
Should you always use 'int' for numbers in C, even if they are non-negative?
Are the results of bitwise operations on signed integers defined?
Verifying that C / C++ signed right shift is arithmetic for a particular compiler?
Emulating variable bit-shift using only constant shifts?
P.P.S. Thanks a lot to Stack Exchange Team for assistance in posting this article. Guys, you rock!