Writing in a certain pseudocode, your intent seems to be
pe3 = last [x | x <- [2 .. 775146], isPrime x, rem 600851475143 x == 0]
read it: for x
drawn from range 2 to 775146, if isPrime x
holds, if 600851475143 % x
is 0, collect such x
; return the largest. We also write the application (g x)
without the parentheses here, as just g x
.
Calculating a remainder is cheaper than testing for primality, so we'll swap the two operations:
pe3 n = last [x | x <- [2 .. isqrt n], rem n x == 0, isPrime x]
This algorithm might work for the specific numbers involved, but unfortunately it is incorrect in general - say for 9000009, whose integer square root is 3000, it will return 101. But 9901 is the right answer (i.e. 9901 is the biggest prime divisor of 9000009, not 101).
Let's first focus on finding the smallest prime factor, instead:
pe3a n = head ([x | x <- [2 .. isqrt n], rem n x == 0, isPrime x] ++ [n])
Why the ( ... ++ [n])
(++
meaning the concatenation of lists)?? Because if n
itself is prime, no divisor will be found at all, and the first list will come back empty, []
. In which case the answer must be n
itself. But if not, then the answer is the first (i.e. head
) of the divisors list. Of course when we find it, we don't need to continue searching for more. Just one is enough, if the smallest is all we need.
OK, so trying it (at an imaginary lazy pseudocode interpreter), we get 3 as its first factor:
> pe3a 9000009
3
Now we can divide that 3 out of our number:
> div 9000009 3
3000003
and continue with 3000003, instead of 9000009. That means we can stop at its square root, 1732, instead of at 3000 - a sizable gain in efficiency! Also, we don't need to start over from 2 - it was tested already - we can start from the last found factor:
pe3b (start, n) = (d, div n d)
where
d = head ([x | x <- [start .. isqrt n], rem n x == 0, isPrime x] ++ [n])
> pe3b (3, 3000003)
(3,1000001)
so we get a second 3 back, and the number is divided by the found divisor once again.
> pe3b (3, 1000001)
(101,9901)
the next prime divisor found is 101, and the number to factorize now is 1000001 / 101 = 9901. Again we start from the last found divisor, 101, because all the smaller ones were already tried:
> pe3b (101, 9901)
(9901,1)
Interesting. isqrt(9901) == 99
, the list [101 .. 99]
is empty, and so the result was immediately produced. 9901 is the first factor of itself above 100, and in fact above 1, because all the previous numbers were already tried, in succession, and divided out of it. That means 9901 is a prime, no need to test it for primality.
In fact, all factors found by this algorithm are guaranteed to be prime by construction by the same reasoning, and all the calls to isPrime
were superfluous.
Do also take note that the biggest number for which the division (the remainder operation) was performed here, was 101 - not 3000. Not only our new algorithm is correct, it is also much more efficient!
Now you can code in Scheme this algorithm of repeated pe3b
application and dividing by the last found factor. Stop when 1 is reached.
So, in the same pseudocode,
divStep (start, n) = (d, div n d)
where d = head ([x | x <- [start .. isqrt n], rem n x == 0] ++ [n])
pe3 n = fst . until ((== 1) . snd) divStep $ (2,n) -- (1st,2nd) in a tuple
factorizing n = takeWhile ((> 1) . fst) . drop 1 . iterate divStep $ (2,n)
factors n = map fst . factorizing $ n
isPrime n = factors n == [n]
Read .
and $
as "of". until
pred step start is a higher-order pattern of repeated applications of a function, until a predicate is fulfilled ( ((== 1) . snd)
means that the second component of a result equals 1). It is usually coded in Scheme with named let
.
To see the whole history of computation, iterate
step start is another pattern which collects all the interim results (and the starting value, which we don't need, so we drop
it). To see just the factors themselves, we take the first components of each result with map fst
. A number is prime if it's the only divisor, greater than 1, of itself. Testing:
> pe3 9000009
9901
> factorizing 9000009
[(3,3000003),(3,1000001),(101,9901),(9901,1)]
> factors 9000009
[3,3,101,9901]
> pe3 600851475143
6857
> factorizing 600851475143
[(71,8462696833),(839,10086647),(1471,6857),(6857,1)] -- 1471 is the largest
> factors 600851475143 -- factor tried,
[71,839,1471,6857] -- *not* 775146 !!
> factors 600851475143999917 -- isqrt 600851475143999917 == 775146099
[41,37309,392798360393] -- isqrt 392798360393 == 626736
prime?
function is breaking:b
is not being reset between calls toprime?
. But you shouldn't even be using global mutable variables here. Use a temporary variable that is local to the function. e.g.(define (prime? n) (define (iter b) <body-of-your-function>) (iter 3))
. If you're usingset!
, you have a host of issues you must think about. Better to avoidset!
if you can. – Belterprime?
is broken, do the following sequence and observe the result:(prime? 7)
(prime? 9)
. You should see immediately that there's something funky. :) – Belter