For my BigInteger code, output turned out to be slow for very large BigIntegers. So now I use a recursive divide-and-conquer algorithm, which still needs 2'30" to convert the currently largest known prime to a decimal string of more than 22 million digits (but only 135 ms to turn it into a hexadecimal string).
I still want to reduce the time, so I need a routine that can divide a NativeUInt (i.e. UInt32 on 32 bit platforms, UInt64 on 64 bit platforms) by 100 very fast. So I use multiplication by constant. This works fine in 32 bit code, but I am not 100% sure for 64 bit.
So my question: is there a way to check the reliability of the results of multiplication by constant for unsigned 64 bit values? I checked the 32 bit values by simply trying with all values of UInt32 (0..$FFFFFFFF). This took approx. 3 minutes. Checking all UInt64s would take much longer than my lifetime. Is there a way to check if the parameters used (constant, post-shift) are reliable?
I noticed that DivMod100()
always failed for a value like $4000004B
if the chosen parameters were wrong (but close). Are there special values or ranges to check for 64 bit, so I don't have to check all values?
My current code:
const
{$IF DEFINED(WIN32)}
// Checked
Div100Const = UInt32(UInt64($1FFFFFFFFF) div 100 + 1);
Div100PostShift = 5;
{$ELSEIF DEFINED(WIN64)}
// Unchecked!!
Div100Const = $A3D70A3D70A3D71;
// UInt64(UInt128($3 FFFF FFFF FFFF FFFF) div 100 + 1);
// UInt128 is fictive type.
Div100PostShift = 2;
{$IFEND}
// Calculates X div 100 using multiplication by a constant, taking the
// high part of the 64 bit (or 128 bit) result and shifting
// right. The remainder is calculated as X - quotient * 100;
// This was tested to work safely and quickly for all values of UInt32.
function DivMod100(var X: NativeUInt): NativeUInt;
{$IFDEF WIN32}
asm
// EAX = address of X, X is UInt32 here.
PUSH EBX
MOV EDX,Div100Const
MOV ECX,EAX
MOV EAX,[ECX]
MOV EBX,EAX
MUL EDX
SHR EDX,Div100PostShift
MOV [ECX],EDX // Quotient
// Slightly faster than MUL
LEA EDX,[EDX + 4*EDX] // EDX := EDX * 5;
LEA EDX,[EDX + 4*EDX] // EDX := EDX * 5;
SHL EDX,2 // EDX := EDX * 4; 5*5*4 = 100.
MOV EAX,EBX
SUB EAX,EDX // Remainder
POP EBX
end;
{$ELSE WIN64}
asm
.NOFRAME
// RCX is address of X, X is UInt64 here.
MOV RAX,[RCX]
MOV R8,RAX
XOR RDX,RDX
MOV R9,Div100Const
MUL R9
SHR RDX,Div100PostShift
MOV [RCX],RDX // Quotient
// Faster than LEA and SHL
MOV RAX,RDX
MOV R9D,100
MUL R9
SUB R8,RAX
MOV RAX,R8 // Remainder
end;
{$ENDIF WIN32}
$1C0000000000000000 div 100 + 1
with a post-shift of 6, but the result is notn div 100
in the high part. libdivide gives the results I expect for 32 bit, but perhaps I don't understand the way it is used for 64 bit. I'll experiment a little more. – Chowchow