Your first prime-sieving part is totally alright. Only second factors-finding part should be corrected.
In second part you should divide by prime factor not just single time, but multiple times till it is still divisible. This will ensure that you take into account powers of all prime factors.
This second part is called trial division by prime numbers. As it is well known it is enough to do trial division only by divisors below Sqrt(N)
. Remaining value of N will be automatically also prime.
One more very important speed improvement to your algorithm is that it is enough to make sieve size Sqrt(N)
, and not N
. Because we need only primes that are less than Sqrt(N)
. This is because after trial division remaining N
is automatically becomes prime.
I also found one more improvement for your sieving part, you should replace for i in range(x + x, ...
with for i in range(x * x, ...
. Because you can start sieving not from x + x
but from x * x
, according to classical algorithm of Sieve of Erathosthenes. This will further improve speed. Starting from x + x
is alright, but x * x
will also give correct results with better performance.
One more obvious improvement is when you want to factorize multiple numbers, then you can reuse and extend sieve multiple times, this will make code faster. When reusing sieve it is good to keep resulting prime numbers in a separate list, so that on second Trial Division stage you don't iterate whole sieve but only prime numbers, this again will give speedup. I didn't do these two improvements in code below, but if you want I can do, just tell.
Full corrected final version is:
Try it online!
def factorize(n):
sieve = [True] * int(n ** 0.5 + 2)
for x in range(2, int(len(sieve) ** 0.5 + 2)):
if not sieve[x]:
continue
for i in range(x * x, len(sieve), x):
sieve[i] = False
factors = []
for i in range(2, len(sieve)):
if i * i > n:
break
if not sieve[i]:
continue
while n % i == 0:
factors.append(i)
n //= i
if n > 1:
factors.append(n)
return factors
print(factorize(99020)) # [2, 2, 5, 4951]
Input:
99020
Output:
[2, 2, 5, 4951]
If you need to factorize just few numbers it might be even faster to do Trial Division Factorization without intermediate sieving stage. This also makes code more simple and shorter.
Full code for doing factorization without sieving stage is below.
In this code I did one obvious speed improvement by trying only odd divisors, this will double total algorithm speed. Of course for doing this you need to factor out all powers of 2 before, which I did too.
Try it online!
def factorize(n):
factors = []
while (n & 1) == 0:
factors.append(2)
n >>= 1
for d in range(3, 1 << 60, 2):
if d * d > n:
break
while n % d == 0:
factors.append(d)
n //= d
if n > 1:
factors.append(n)
return factors
print(factorize(99020)) # [2, 2, 5, 4951]