I wrote the following functions for the GNU assembler targeting ARM. If your CPU has no hardware support for the udiv/sdiv instructions, just cut out the first few lines up to the "0:" label in either function.
.arm
.cpu cortex-a7
.syntax unified
.type udiv,%function
.globl udiv
udiv: tst r1,r1
bne 0f
udiv r3,r0,r2
mls r1,r2,r3,r0
mov r0,r3
bx lr
0: cmp r1,r2
movhs r1,r2
bxhs lr
mvn r3,0
1: adds r0,r0
adcs r1,r1
cmpcc r1,r2
subcs r1,r2
orrcs r0,1
lsls r3,1
bne 1b
bx lr
.size udiv,.-udiv
.type sdiv,%function
.globl sdiv
sdiv: teq r1,r0,ASR 31
bne 0f
sdiv r3,r0,r2
mls r1,r2,r3,r0
mov r0,r3
bx lr
0: mov r3,2
adds r0,r0
and r3,r3,r1,LSR 30
adcs r1,r1
orr r3,r3,r2,LSR 31
movvs r1,r2
ldrvc pc,[pc,r3,LSL 2]
bx lr
.int 1f
.int 3f
.int 5f
.int 11f
1: cmp r1,r2
movge r1,r2
bxge lr
mvn r3,1
2: adds r0,r0
adcs r1,r1
cmpvc r1,r2
subge r1,r2
orrge r0,1
lsls r3,1
bne 2b
bx lr
3: cmn r1,r2
movge r1,r2
bxge lr
mvn r3,1
4: adds r0,r0
adcs r1,r1
cmnvc r1,r2
addge r1,r2
orrge r0,1
lsls r3,1
bne 4b
rsb r0,0
bx lr
5: cmn r1,r2
blt 6f
tsteq r0,r0
bne 7f
6: mov r1,r2
bx lr
7: mvn r3,1
8: adds r0,r0
adcs r1,r1
cmnvc r1,r2
blt 9f
tsteq r0,r3
bne 10f
9: add r1,r2
orr r0,1
10: lsls r3,1
bne 8b
rsb r0,0
bx lr
11: cmp r1,r2
blt 12f
tsteq r0,r0
bne 13f
12: mov r1,r2
bx lr
13: mvn r3,1
14: adds r0,r0
adcs r1,r1
cmpvc r1,r2
blt 15f
tsteq r0,r3
bne 16f
15: sub r1,r2
orr r0,1
16: lsls r3,1
bne 14b
bx lr
There are two functions, udiv for unsigned integer division and sdiv for signed integer division. Both expect a 64-bit dividend (unsigned or signed, respectively) in r1 (high word) and r0 (low word), and a 32-bit divisor in r2. They return the quotient in r0 and the remainder in r1, so you can declare them in a C header as extern functions returning a 64-bit integer and then extract the quotient (low word) and remainder (high word) from the result. An error (division by 0 or overflow) is indicated by a remainder whose absolute value is greater than or equal to the absolute value of the divisor. The signed division algorithm distinguishes cases by the signs of the dividend and the divisor; it does not convert to positive integers first, since that wouldn't detect all overflow conditions properly.
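To make the calling convention concrete, here is a minimal C sketch of the unsigned case. It assumes a little-endian AAPCS target, where a 64-bit argument or return value occupies r0 (low word) and r1 (high word), so the declaration shown in the comment is my guess at what the header would look like. The names udiv_model, div_quotient, div_remainder and udiv_failed are hypothetical: udiv_model is only a portable model of the shift-and-subtract loop, handy for testing on a host machine, and is not the assembly routine itself. On error it returns the divisor as the remainder, which satisfies the error condition described above; the quotient word is not meaningful in that case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed declaration of the assembly routine (not defined in this sketch):
 *     extern uint64_t udiv(uint64_t dividend, uint32_t divisor);
 */

/* Hypothetical portable model of the udiv contract:
 * low word of the result = quotient (r0), high word = remainder (r1). */
uint64_t udiv_model(uint64_t dividend, uint32_t divisor)
{
    uint32_t lo = (uint32_t)dividend;         /* plays the role of r0 */
    uint32_t hi = (uint32_t)(dividend >> 32); /* plays the role of r1 */

    /* Division by zero or a quotient that needs more than 32 bits:
     * signal the error with remainder == divisor. */
    if (hi >= divisor)
        return ((uint64_t)divisor << 32) | lo;

    for (int i = 0; i < 32; i++) {
        uint32_t carry = hi >> 31;   /* bit shifted out of the 64-bit pair */
        hi = (hi << 1) | (lo >> 31); /* shift hi:lo left by one bit */
        lo <<= 1;
        if (carry || hi >= divisor) {
            hi -= divisor;           /* partial remainder stays below divisor */
            lo |= 1;                 /* record a quotient bit */
        }
    }
    return ((uint64_t)hi << 32) | lo;
}

/* Unpack the two result words. */
static inline uint32_t div_quotient(uint64_t packed)
{
    return (uint32_t)packed;
}

static inline uint32_t div_remainder(uint64_t packed)
{
    return (uint32_t)(packed >> 32);
}

/* Error check for the unsigned case: remainder >= divisor. */
static inline bool udiv_failed(uint64_t packed, uint32_t divisor)
{
    return div_remainder(packed) >= divisor;
}
```

For the signed function the same unpacking applies with int32_t casts, and the error check compares absolute values instead.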
SDIV or UDIV? Cortex-A8 is ARMv7, but from infocenter.arm.com/help/topic/com.arm.doc.qrc0001m/… it looks like only some processors are supported. – Quadrivial