Prime Sieve in Haskell

Asked 1/8, 2012 at 23:21 Answered 1/7, 2022 at 12:21

Solved performance haskell time-complexity primes sieve-of-eratosthenes

I'm very new to Haskell and I'm just trying to find the sum of the first 2 million primes. I'm trying to generate the primes using a sieve (I think the sieve of Eratosthenes?), but it's really really slow and I don't know why. Here is my code.

sieve (x:xs) = x:(sieve $ filter (\a -> a `mod` x /= 0) xs)
ans = sum $ takeWhile (<2000000) (sieve [2..])

Thanks in advance.

Barnwell answered 1/8, 2012 at 23:21 Comment(4)

note: if you use mod it is not Sieve of Eratosthenes – Intercross 1/8, 2012 at 23:33

I added the 'project-euler' tag since I suspect that is where you got this question. Good luck! – Horizontal 2/8, 2012 at 0:55

You can see my similar function here along with some good suggestions. (Especially the link about the real Sieve in Haskell.) – Horizontal 2/8, 2012 at 0:58

The paper "The Genuine Sieve of Eratosthenes" discusses why your code (or code much like it) isn't the Sieve of Eratosthenes. (It's the same paper Code-Guru mentions.) – Unaesthetic 2/8, 2012 at 1:7

It is very slow because the algorithm is a trial division that doesn't stop at the square root.

If you look closely what the algorithm does, you see that for each prime p, its multiples that have no smaller prime divisors are removed from the list of candidates (the multiples with smaller prime divisors were removed previously).

So each number is divided by all primes until either it is removed as a multiple of its smallest prime divisor or it appears at the head of the list of remaining candidates if it is a prime.

For the composite numbers, that isn't particularly bad, since most composite numbers have small prime divisors, and in the worst case, the smallest prime divisor of n doesn't exceed √n.

But the primes are divided by all smaller primes, so until the k^th prime is found to be prime, it has been divided by all k-1 smaller primes. If there are m primes below the limit n, the work needed to find all of them prime is

(1-1) + (2-1) + (3-1) + ... + (m-1) = m*(m-1)/2

divisions. By the Prime number theorem, the number of primes below n is asymptotically n / log n (where log denotes the natural logarithm). The work to eliminate the composites can crudely be bounded by n * √n divisions, so for not too small n that is negligible in comparison to the work spent on the primes.

For the primes to two million, the Turner sieve needs roughly 10¹⁰ divisions. Additionally, it needs to deconstruct and reconstruct a lot of list cells.

A trial division that stops at the square root,

isPrime n = go 2
  where
    go d
      | d*d > n        = True
      | n `rem` d == 0 = False
      | otherwise      = go (d+1)

primes = filter isPrime [2 .. ]

would need fewer than 1.9*10⁹ divisions (brutal estimate if every isPrime n check went to √n - actually, it takes only 179492732 because composites are generally cheap)⁽¹⁾ and much fewer list operations. Additionally, this trial division is easily improvable by skipping even numbers (except 2) as candidate divisors, which halves the number of required divisions.

A sieve of Eratosthenes doesn't need any divisions and uses only O(n * log (log n)) operations, that is quite a bit faster:

primeSum.hs:

module Main (main) where

import System.Environment (getArgs)
import Math.NumberTheory.Primes

main :: IO ()
main = do
    args <- getArgs
    let lim = case args of
                (a:_) -> read a
                _     -> 1000000
    print . sum $ takeWhile (<= lim) primes

And running it for a limit of 10 million:

$ ghc -O2 primeSum && time ./primeSum 10000000
[1 of 1] Compiling Main             ( primeSum.hs, primeSum.o )
Linking primeSum ...
3203324994356

real    0m0.085s
user    0m0.084s
sys     0m0.000s

We let the trial division run only to 1 million (fixing the type as Int):

$ ghc -O2 tdprimeSum && time ./tdprimeSum 1000000
[1 of 1] Compiling Main             ( tdprimeSum.hs, tdprimeSum.o )
Linking tdprimeSum ...
37550402023

real    0m0.768s
user    0m0.765s
sys     0m0.002s

And the Turner sieve only to 100000:

$ ghc -O2 tuprimeSum && time ./tuprimeSum 100000
[1 of 1] Compiling Main             ( tuprimeSum.hs, tuprimeSum.o )
Linking tuprimeSum ...
454396537

real    0m2.712s
user    0m2.703s
sys     0m0.005s

⁽¹⁾ The brutal estimate is

2000000
   ∑ √k ≈ 4/3*√2*10^9
 k = 1

evaluated to two significant digits. Since most numbers are composites with a small prime factor - half the numbers are even and take only one division - that vastly overestimates the number of divisions required.

A lower bound for the number of required divisions would be obtained by considering primes only:

   ∑ √p ≈ 2/3*N^1.5/log N
 p < N
p prime

which, for N = 2000000 gives roughly 1.3*10⁸. That is the right order of magnitude, but underestimates by a nontrivial factor (decreasing slowly to 1 for growing N, and never greater than 2 for N > 10).

Besides the primes, also the squares of primes and the products of two close primes require the trial division to go up to (nearly) √k and hence contribute significantly to the overall work if there are sufficiently many.

The number of divisions needed to treat the semiprimes is however bounded by a constant multiple of

N^1.5/(log N)^2

so for very large N it becomes negligible relative to the cost of treating primes. But in the range where trial division is feasible at all, they still contribute significantly.

Beery answered 2/8, 2012 at 1:31 Comment(6)

1.9*10^9 divisions for optimal TD looks suspiciously too big. if m^2 = 10^10, then m^1.5 = 10**7.5= 3.1e7 only. Empirical run-times in your ans also seem to suggest it (Turner's is not only 5 times slower than TD) ... ? :) – Collbaith 19/12, 2012 at 10:13

It's not for optimal trial division, and it's the 0-th upper bound, sum [sqrt k | k <- [1 .. n]]. But thanks for pointing it out, that should have indeed made clearer from the beginning. I added a better estimate too. – Beery 19/12, 2012 at 11:48

I call a trial division optimal if it divides only by primes (and stops at the square root at the latest). That gives the additional log N factor. – Beery 19/12, 2012 at 15:40

sorry, wasn't reading your code close enough. That's what I mean by "OTD" too. Of course, primes' density is ~ log N. – Collbaith 19/12, 2012 at 15:42

why am I getting cannot find module Math.NumberTheory.Primes is did cabal install primes – Emersen 16/10, 2015 at 19:18

@FranklinDeLosSantos the package name is arithmoi. – Collbaith 26/11, 2015 at 8:44

Here's a sieve of Eratosthenes:

P = {3,5, ...} \ ⋃ {{p², p²+2p, ...} | p in P}

(without the 2). :) Or in "functional" i.e. list-based Haskell,

primes = 2 : g (fix g)  where
   g xs = 3 : (gaps 5 $ unionAll [[p*p, p*p+2*p..] | p <- xs])

unionAll ((x:xs):t) = x : union xs (unionAll $ pairs t)  where
  pairs ((x:xs):ys:t) = (x : union xs ys) : pairs t 
fix g = xs where xs = g xs
union (x:xs) (y:ys) = case compare x y of LT -> x : union  xs (y:ys)
                                          EQ -> x : union  xs    ys 
                                          GT -> y : union (x:xs) ys
gaps k s@(x:xs) | k<x  = k:gaps (k+2) s    
                | True =   gaps (k+2) xs

Compared with the trial division code in the answer by augustss, it is 1.9x times faster at generating 200k primes, and 2.1x faster at 400k, with empirical time complexity of O(n^1.12..1.15) vs O(n^1.4), on said range. It is 2.6x times faster at generating 1 mln primes.

Why the Turner sieve is so slow

Because it opens up multiples-filtering streams for each prime too early, and so ends up with too many of them. We don't need to filter by a prime until its square is seen in the input.

Seen under a stream processing paradigm, sieve (x:xs) = x:sieve [y|y<-xs, rem y p/=0] can be seen as creating a pipeline of stream transducers behind itself as it is working:

[2..] ==> sieve --> 2
[3..] ==> nomult 2 ==> sieve --> 3
[4..] ==> nomult 2 ==> nomult 3 ==> sieve 
[5..] ==> nomult 2 ==> nomult 3 ==> sieve --> 5
[6..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve 
[7..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> sieve --> 7
[8..] ==> nomult 2 ==> nomult 3 ==> nomult 5 ==> nomult 7 ==> sieve

where nomult p = filter (\y->rem y p/=0). But 8 doesn't need to be checked for divisibility by 3 yet, as it is smaller than 3^2 == 9, let alone by 5 or 7.

This is the single most serious problem with that code, although it is dismissed as irrelevant right at the start of that article which everybody mention. Fixing it by postponing the creation of filters achieves dramatic speedups.

Collbaith answered 2/8, 2012 at 16:5 Comment(0)

What you did is not the Sieve of Eratosthenes; it's trial division (note the mod operator). Here's my version of the Sieve of Eratosthenes:

import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.Array.ST
import Data.Array.Unboxed

sieve :: Int -> UArray Int Bool
sieve n = runSTUArray $ do
    let m = (n-1) `div` 2
        r = floor . sqrt $ fromIntegral n
    bits <- newArray (0, m-1) True
    forM_ [0 .. r `div` 2 - 1] $ \i -> do
        isPrime <- readArray bits i
        when isPrime $ do
            forM_ [2*i*i+6*i+3, 2*i*i+8*i+6 .. (m-1)] $ \j -> do
                writeArray bits j False
    return bits

primes :: Int -> [Int]
primes n = 2 : [2*i+3 | (i, True) <- assocs $ sieve n]

You can run it at http://ideone.com/mu1RN.

Hackamore answered 1/8, 2012 at 23:57 Comment(2)

1-based indexing makes for easier counting. Initial version written by Steve C 2009-12-31. – Collbaith 2/8, 2012 at 16:13

Also, there is a sum Int overflow in the ideone.com code; you need " print $ sum $ map fromIntegral $ primes 2000000" to avoid this; with the output as Integer or slightly faster as Int64 or Word64 for the correct answer of 142913828922. – Debora 12/4, 2014 at 2:55

Personally, I like this way of generating primes

primes :: [Integer]
primes = 2 : filter (isPrime primes) [3,5..]
  where isPrime (p:ps) n = p*p > n || n `rem` p /= 0 && isPrime ps n

It's also quite fast compared to some of the other methods suggested here. It's still trial division, but it only tests with primes. (The termination proof for this code is slightly tricky, though.)

Commotion answered 2/8, 2012 at 9:0 Comment(4)

-1, as this is also trial division as discussed in other answers and as there are suggestions to use the real Sieve of Eratosthenes, which will always be much faster for any reasonably large range even for two million. – Debora 8/4, 2014 at 1:30

@Debora Yes, it's absolutely trial division, and I never claim otherwise. I like it because it's simple and reasonably fast. E.g., for generating 2,000,000 it's about half the speed of the union based solution in another answer. – Commotion 11/4, 2014 at 10:38

I thought it should be clear this isn't the Sieve of Eratosthenes, about which there is so much confusion including on the part of the questioner. You are right, your solution is an elegant expression of a list-based optimized trial division (only tests primes up to the square root) and with the "[Integer]" output changed to "[Int]" is actually faster than simple list based SoE implementations for small ranges such as this up to two million. It's never faster than an array based true SoE, especially one using a mutable array (STUArray), which can sieve this range to two million in msecs. – Debora 12/4, 2014 at 2:51

edit the answer to state clearly that it is Optimized Trial Division and I'll up vote it again. – Debora 12/4, 2014 at 2:54

The algorithm you're using is not a sieve at all, so in terms of it being slow you should be expecting that using trial division.

Primes are roughly occurring with the frequency of the logarithm function ... i.e there are ballpark n/log(n) primes between 1 and n. So for the first 2 million primes you are going to need to go up to about 32 million. But you are building a 2 million element data structure that those primes are going to have pass through. So you can start to see why this was so slow. In fact it is O(n^2). You can cut it down to O(n*(log n)*log(log n))

Here is a page on various treatments that walk you through how to cut that down a bit. http://en.literateprograms.org/Sieve_of_Eratosthenes_(Haskell) (dead link as of 2022).

Brashear answered 2/8, 2012 at 0:21 Comment(3)

The number of primes <= n is approximately n / log n, far greater than sqrt(n). Also, the target is not 2 million primes, but primes below 2 million. – Beery 2/8, 2012 at 0:40

@DanielFischer: This statement indicates otherwise: "I'm just trying to find the sum of the first 2 million primes" – Talie 2/8, 2012 at 10:24

@Talie True, didn't see that, only that the code says sum $ takeWhile (<2000000) (sieve [2..]). – Beery 2/8, 2012 at 10:34

A simple way to implement the genuine sieve of Eratosthenes in Haskell is to iteratively zipWith over a list of Boolean values signifying whether the corresponding number is prime:

primes :: Int -> [Int]
primes n = [x | (x, prime) <- zip [2..n] $
                  sieve (ceiling . sqrt $ fromIntegral n)
              , prime]
    where 
    sieve 1 = repeat True
    sieve n = zipWith (&&) (filt n) (sieve (n - 1))
    filt n = replicate (n - 1) True
             ++ (tail . cycle $ False :
                  replicate (n - 1) True)

Now, primes n gives the list of the first n primes.

Fellatio answered 1/7, 2022 at 12:21 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Why the Turner sieve is so slow

Recommended topics

Hot tags