Project Euler 3 - Why does this method work?
Asked Answered
H

3

5

The prime factors of 13195 are 5, 7, 13 and 29. What is the largest prime factor of the number 600851475143?

I solved this problem on Project Euler my own way, which was slow, and then I found this solution on someone's github account. I can't figure out why it works. Why are a number of factors removed, equal to an index? Any insight?

def Euler3(n=600851475143):
    for i in range(2,100000):
        while n % i == 0:
            n //= i
            if n == 1 or n == i:
                return i
Humdrum answered 26/9, 2012 at 15:32 Comment(3)
The function returns 6857. It is easy to run.Powerful
it won't work for 600851475149.Intercom
Yep looks like a one-off hack to solve a specific problem on Project EulerPowerful
P
8

This function works by finding successive factors of its input. The first factor it finds will necessarily be prime. After a prime factor is found, it is divided out of the original number and the process continues. By the time we've divided them all out (leaving 1, or the current factor (i)) we've got the last (largest) one.

Let's add some tracing code here:

def Euler3(n=600851475143):
    for i in range(2,100000):
        while n % i == 0:
            n //= i
            print("Yay, %d is a factor, now we should test %d" % (i, n))
            if n == 1 or n == i:
                return i

Euler3()

The output of this is:

$ python factor.py
Yay, 71 is a factor, now we should test 8462696833
Yay, 839 is a factor, now we should test 10086647
Yay, 1471 is a factor, now we should test 6857
Yay, 6857 is a factor, now we should test 1

It is true that for a general solution, the top of the range should have been the square root of n, but for python, calling math.sqrt returns a floating point number, so I think the original programmer was taking a lazy shortcut. The solution will not work in general, but it was good enough for the Euler project.

But the rest of the algorithm is sound.

Powerful answered 26/9, 2012 at 15:55 Comment(0)
H
3

Consider how it solves for n=20:

iteration i=2
  while true (20 % 2 == 0)
    n = n//2 = 20//2 = 10
    if (n == 1 or n == 2) false
  while true (10 % 2 == 0)
    n = n//2 = 10//2 = 5
    if (n == 1 or n == 2) false
  while false (5 % 2 == 0)
iteration i = 3
  while false (5 % 3 == 0)
iteration i = 4 
  while false (5 % 4 == 0)
iteration i = 5
  while true (5 % 5 == 0)
    n = n//5 = 5//5 = 1
    if (n == 1 or n == 5) true
      return i, which is 5, which is the largest prime factor of 20

It is just removing factors, and since it already removes multiples of prime factors (the while loop), many values of i are really just wasted effort. The only values of i that have any chance of doing something within the loop are prime values of i. The n==i test covers the case of numbers like 25 that are squares of a prime number. The range seems to limited though. It would not give the correct answer for 2 * (the next largest prime after 100,000.

Hidden answered 26/9, 2012 at 15:59 Comment(0)
M
1

No one has actually answered your question. The for loop tests each number i in turn. The test of the while loop is successful when i is a factor of n; in that case, it reduces n, then checks if it is finished by comparing i to 1 or n. The test is a while (and not just if) in case i divides n more than once.

Though clever, that's not the way integer factorization by trial division is normally written; it also won't work if n has a factor greater than 100000. I have an explanation on my blog. Here's my version of the function, which lists all the factors of n instead of just the largest:

def factors(n):
    fs = []
    while n % 2 == 0:
        fs += [2]
        n /= 2
    if n == 1:
        return fs
    f = 3
    while f * f <= n:
        if n % f == 0:
            fs += [f]
            n /= f
        else:
            f += 2
    return fs + [n]

This function handles 2 separately, then tries only odd factors. It also doesn't place a limit on the factor, instead stopping when the factor is greater than the square root of the remaining n, because at that point n must be prime. The factors are inserted in increasing order, so the last factor in the output list will be the largest, which is the one you want.

Mize answered 26/9, 2012 at 17:41 Comment(2)
It's pretty darn close to a standard integer factorization. Sure, testing the even numbers is wasteful, but technically so is testing every composite number. That solution just needs a proper upper bound to work in general. You've just described some optimizations, not a radically different approach.Castellanos
@Castellanos wrong, and your downvote (I assume it was) was uncalled for. I'll explain. "just needs proper upper bound" is wrong because the number itself changes when its factors are divided out. So the upper bound has to change too. But with range the upper bound is pre-specified. The accepted answer is also wrong in that respect. I've edited the answer to improve formatting, so the downvoter has a chance to undo their downvote.Intercom

© 2022 - 2024 — McMap. All rights reserved.