(clarification: this repeats more or less closely the last part of my answer to your question on https://cstheory.stackexchange.com/ and elaborates on it a bit; I'm not sure if it's a good fit there or not.)
Your first version is actually very close to Richard Bird's solution (from page 11 of M. O'Neill's paper). You can get there not by trial and error, but by code transformation and stepwise improvement.
After some renaming, pre-calculating the first step gives us this very nice-looking code:
-- map (*i) [i ..] == map (*i) [i, i+1 ..] == [i*i, i*i+i ..]
-- sieve 2 [] where sieve i [] = (i:) $
-- sieve (i + 1) [i*i, i*i+i ..] == 2 : sieve 3 [4, 6 ..]
primesA :: [Integer]
primesA = 2 : sieve 3 [4, 6 ..]
sieve p cs@(c : t)
| p == c = sieve (p + 1) t
| otherwise = p : sieve (p + 1) (unionI cs [p*p, p*p+p ..])
unionI a@(x:xs) b@(y:ys) | x < y = x : xs `unionI` b
| x == y = x : xs `unionI` ys
| otherwise = y : a `unionI` ys
There are two problems with this code. There is the (...(((a+b)+c)+d)+...)
calculation structure, which puts the more frequently-producing streams lower in the structure than the less frequently-producing ones.
But the more immediate concern is the premature handling of each prime's multiples. E.g. when you reach 5, [25, 30 ..]
is added into the composites. But that should only be done when 25 is reached. Doing so would radically reduce the total number of multiples streams being processed at each moment (from Θ(n)
to Θ(sqrt(n/log n))
, for n-th prime). This has very significant impact on efficiency.
We can explicitly synchronize on squares of primes, taken from the primes sequence itself as it is being produced (this just maintains a separate back-pointer ps
into the sequence, which is advanced at much slower pace than the production point):
primesAC = [2,3] ++ sieve 4 (tail primesAC) 9 [4,6..]
sieve k ps@(p:pt) q cs@(c:ct) -- candidate "k", prime "p"
| k == c = sieve (k + 1) ps q ct -- square "q", composite "c"
| k < q = k : sieve (k + 1) ps q cs
| otherwise = sieve k pt (head pt^2) -- k == q == p*p
(unionI cs [q, q+p ..])
This is now just half a step away from Richard Bird's solution, which fixes both problems with one cleverly used foldr
, creating the a+(b+(c+(...)))
calculation structure where each new prime's multiples stream starts from the prime's square anyway and the synchronization is implicit in the data:
primesAD = (2 :) . sieve 3 .
foldr (\p r-> p*p : unionI [p*p+p, p*p+2*p ..] r) [] $ primesAD
sieve p cs@(c : t) -- == diffI [p..] cs, if p<=c
| p == c = sieve (p + 1) t
| otherwise = p : sieve (p + 1) cs
Each prime's square is produced first unconditionally, to prevent the run-away access prematurely forcing further parts of data.
So foldr
, and not foldl
, is a natural fit to this problem.
You 2nd variant is
primesB :: [Integer]
primesB = [2..] `diffI` composites
composites = foldl f [4,6..] [3..]
where
f cs@(h:t) i | i == h = cs
| otherwise = h : unionI t [i*i, i*i+i ..]
diffI (x:xs) (y:ys) | x < y = x : xs `diffI` (y:ys)
| x == y = xs `diffI` ys
| otherwise = (x:xs) `diffI` ys
This evidently intends to compute composites
by similar iterative addition of each number's multiples into the mix; trouble is, foldl
doesn't let us in into its internal computation process which is never quite finished (as other answers already pointed out). foldl
is unrelenting like that. :) We could turn it into a productive iterate
-based iteration, but the same problems of premature addition and the inverted structure will remain. Plus it now adds multiples of all numbers, not just of primes as before:
composites = foldl f [4,6..] [3..]
-- no part of calculation of (f [4,6..] 3) is forced here, but if it were, ...
= foldl f (f [4,6..] 3) [4..]
= foldl f (4:unionI [6,8..] [9,12..]) [4..]
= foldl f (4:unionI [6,8..] [9,12..]) [5..]
= foldl f (4:union (unionI [6,8..] [9,12..]) [25,30..]) [6..]
= foldl f (4:union (union (unionI [6,8..] [9,12..]) [25,30..]) [36,42..]) [7..]
= ....
And ([2..] `diffI`)
is the same as sieve 2
above, so that adds nothing.