I want to use a DivMod function that operates exclusively on 32 bit operands. The implementation in the RTL returns values in 16 bit variables. Its declaration is:
procedure DivMod(Dividend: Cardinal; Divisor: Word; var Result, Remainder: Word);
So, I cannot use that since my inputs may overflow the return values.
The naive Pascal implementation looks like this:
procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
begin
  Quotient := Dividend div Divisor;
  Remainder := Dividend mod Divisor;
end;
This works splendidly but performs the division twice. Since the function is called from code that is a performance bottleneck, I would like to perform the division only once. To that end I am using Serg's 32 bit DivMod from this question: Is there a DivMod that is *not* Limited to Words (<=65535)?
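procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
asm
  PUSH EBX              // preserve EBX, which must survive the call
  MOV  EBX,EDX          // EBX := Divisor (frees EDX for the dividend's high half)
  XOR  EDX,EDX          // zero the high half of EDX:EAX
  DIV  EBX              // EAX := quotient, EDX := remainder
  MOV  [ECX],EAX        // store Quotient (its address arrives in ECX)
  MOV  EBX,Remainder    // load the address of Remainder (passed on the stack)
  MOV  [EBX],EDX        // store Remainder
  POP  EBX              // restore EBX
end;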
This works perfectly.
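For comparison, a pure Pascal variant can also avoid the second division in the source by deriving the remainder with a multiply and a subtract. This is only a sketch: the name DivModMul is hypothetical (it is not an RTL routine), and I have not measured whether the compiler generates anything faster for it than for the naive version.

```pascal
program DivModMulDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper, not part of the RTL: one div, then multiply and subtract.
procedure DivModMul(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
begin
  Quotient := Dividend div Divisor;
  // Quotient * Divisor <= Dividend, so neither operation can overflow 32 bits.
  Remainder := Dividend - Quotient * Divisor;
end;

var
  Q, R: Cardinal;
begin
  // Both the divisor and the quotient's range exceed what the Word-based RTL DivMod allows.
  DivModMul(4000000000, 70000, Q, R);
  Writeln(Q, ' ', R); // prints "57142 60000"
end.
```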
But now I would like a version of the function for 64 bit code. Note that I still want to operate on 32 bit operands, and return 32 bit values.
Should I re-write the function using 64 bit assembler, or is it sufficient to use the DivMod overload from the RTL that operates on, and returns, 64 bit values?
Specifically I would like to know if there is a performance benefit in writing 64 bit code that does 32 bit operations. Is that even possible? Or would I simply end up re-implementing the DivMod overload with UInt64 parameters? If it is worth implementing a bespoke 64 bit asm version, how would I go about doing it, noting that the operands and operations are 32 bit.
I think that it would look like this, but I am no expert and likely have got something wrong:
procedure DivMod(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
asm
  MOV EAX,ECX   // move Dividend to EAX
  MOV ECX,EDX   // move Divisor to ECX
  XOR EDX,EDX   // zeroise EDX
  DIV ECX       // divide EDX:EAX by ECX
  MOV [R8],EAX  // save quotient
  MOV [R9],EDX  // save remainder
end;
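One way to gain some confidence in the tentative asm above is to compare it against the naive version on a few edge cases. The following is a minimal sketch, assuming Delphi's Win64 compiler (where WIN64 is defined and the first four parameters arrive in RCX, RDX, R8 and R9). The names DivModRef, DivModAsm and CheckOne are hypothetical, and on any other target the asm candidate is simply skipped.

```pascal
program DivModCheck;
{$APPTYPE CONSOLE}

// Reference: the naive two-division version.
procedure DivModRef(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
begin
  Quotient := Dividend div Divisor;
  Remainder := Dividend mod Divisor;
end;

{$IFDEF WIN64}
// Candidate: the tentative Win64 asm version (assumes the Delphi Win64 calling convention).
procedure DivModAsm(Dividend, Divisor: Cardinal; out Quotient, Remainder: Cardinal);
asm
  MOV EAX,ECX   // Dividend
  MOV ECX,EDX   // Divisor
  XOR EDX,EDX   // clear high half of EDX:EAX
  DIV ECX       // 32 bit unsigned divide
  MOV [R8],EAX  // quotient
  MOV [R9],EDX  // remainder
end;
{$ENDIF}

// Runs both versions on one input pair and reports any disagreement.
procedure CheckOne(Dividend, Divisor: Cardinal);
var
  Q1, R1, Q2, R2: Cardinal;
begin
  DivModRef(Dividend, Divisor, Q1, R1);
{$IFDEF WIN64}
  DivModAsm(Dividend, Divisor, Q2, R2);
{$ELSE}
  DivModRef(Dividend, Divisor, Q2, R2); // no asm candidate on this target
{$ENDIF}
  if (Q1 <> Q2) or (R1 <> R2) then
    Writeln('MISMATCH for ', Dividend, ' / ', Divisor);
end;

begin
  CheckOne(0, 1);
  CheckOne(65536, 3);              // quotient exceeds a Word
  CheckOne($FFFFFFFF, 1);          // largest dividend, smallest divisor
  CheckOne($FFFFFFFF, $FFFFFFFF);  // largest dividend and divisor
  CheckOne($FFFFFFFF, 70000);      // divisor exceeds a Word
  Writeln('done');
end.
```

A check like this only shows agreement on the sampled inputs; it says nothing about performance, which would need separate timing.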
DIV ECX in the code above performs a 32 bit operation? Did I understand that right? Does it do unsigned divide of the 64 bit value EDX:EAX by ECX? And is it worth doing the 32 bit operation rather than the full 64 bit operation? – Stillman
64-bit tag with x86-64 tag, as the question is specific to x86-64, not to 64-bit architectures in general. – Korry
div for cheap mul and shr). Check out: libdivide.com Also section 9.2.4 of intel.com/content/dam/www/public/us/en/documents/manuals/… Some compilers do this automatically - Delphi it seems not. – Garwin
q := Trunc(i * 0.1); r := i - 10*q is 33% faster than a single q := i div 10;. This is without any trickery or optimization whatsoever. (64-bit doubles are guaranteed to accurately represent all 32-bit integers) – Garwin
mod and div with a common divisor? – Paronym