But I wonder why is division actually slower than multiplication? Isn't division just a glorified subtraction, and multiplication is a glorified addition?
The big difference is that in a long multiplication you just need to add up a bunch of numbers after shifting and masking. In a long division you have to test for overflow after each subtraction.
Lets consider a long multiplication of two n bit binary numbers.
- shift (no time)
- mask (constant time)
- add (neively looks like time proportional to n²)
But if we look closer it turns out we can optimise the addition by using two tricks (there are further optimisations but these are the most important).
- We can add the numbers in groups rather than sequentially.
- Until the final step we can add three numbers to produce two rather than adding two to produce one. While adding two numbers to produce one takes time proportional to n, adding three numbers to produce two can be done in constant time because we can eliminate the carry chain.
So now our algorithm looks like
- shift (no time)
- mask (constant time)
- add numbers in groups of three to produce two until there are only two left (time proportional to log(n))
- perform the final addition (time proportional to n)
In other words we can build a multiplier for two n bit numbers in time roughly proportional to n (and space roughly proportional to n²). As long as the CPU designer is willing to dedicate the logic multiplication can be almost as fast as addition.
In long division we need to know whether each subtraction overflowed before we can decide what inputs to use for the next one. So we can't apply the same parallising tricks as we can with long multiplication.
There are methods of division that are faster than basic long division but still they are slower than multiplication.
"After all they are supposed to know this stuff."
- You might be surprised what most people don't know. – Jodhpur