So I had an interview question before regarding bit manipulation. The company is a well known GPU company. I had very little background in assembly language (weird despite being a phd student in computer architecture) and as this narrative would indicate, I botched it. The question was a simple:
"Write a fast code that will count the number of 1's in a 32-bit register."
Now I am in the process of studying arm assembly. So naturally I revisited this problem again and have come up with this code just by studying the ISA.
For you arm experts out there, is this correct? Is there a faster way to do this? Being a beginner, I naturally think this is incomplete. The AND instruction in "xx" feels redundant but there is no other way to shift a register in ARM isa...
R1 will contain the number of bits at the end while R2 is the register with bits we want to count. r6 is just a dummy register. Comments are enclosed in ()
MOV R1, #0 (initialize R1 and R6 to zero)
MOV R6, #0
xx: AND R6, R6, R2, LSR #1 (Right shift by 1, right most bit is in carry flag)
ADDCS R1, #1 (Add #1 to R1 if carry flag is set)
CMP R2, #0 (update the status flags if R2 == 0 or not)
BEQ xx (branch back to xx until R2==0)
4*32+2
instructions for all1
. Dave Seal's algorithm takes six instructions plus a memory access; it will probably be as fast for all types of input unless memory is extremely slow. I doubt few people would get Dave Seal's solution. Thereverse subtract
is just weird to me. – Enforce