Explain this chunk of haskell code that outputs a stream of primes

Asked 19/11, 2009 at 15:38 Answered 15/1, 2012 at 17:51

I have trouble understanding this chunk of code:

let
  sieve (p:xs) = p : sieve (filter (\ x -> x `mod` p /= 0) xs)
in sieve [2 .. ]

Can someone break it down for me? I understand there is recursion in it, but thats the problem I can't understand how the recursion in this example works.

Torino answered 19/11, 2009 at 15:38 Comment(4)

@Everyone: as elegant as this algorithm appears, its actually not very efficient at all (dramatically less performant than trial division), and its definitely not an implementation of the Sieve of Eratosthenes. See the following: cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf – Hilarius 19/11, 2009 at 16:6

@Juliet: Umm... this is trial division. – Welcher 19/11, 2009 at 18:32

@newacct: yes and no. On the one hand, it is trial division, but on the other its a bad implementation (the author in the article above calls it an "unfaithful sieve"). Proper implementations just check a number to see if it divides by any previously computed prime up to sqrt(whatever you're checking) for a complexity around theta(n * sqrt(n) / (log n)^2). The code above actually tests an input against all previously computed primes yielding a complexity around theta(n^2 / (log n)^2). – Hilarius 19/11, 2009 at 21:52

@MihaiMaruseac How can this question be a possible duplicate of another one that didn't exist at that time? – Conrado 26/2, 2015 at 23:3

It's actually pretty elegant.

First, we define a function sieve that takes a list of elements:

sieve (p:xs) =

In the body of sieve, we take the head of the list (because we're passing the infinite list [2..], and 2 is defined to be prime) and append it (lazily!) to the result of applying sieve to the rest of the list:

p : sieve (filter (\ x -> x 'mod' p /= 0) xs)

So let's look at the code that does the work on the rest of the list:

sieve (filter (\ x -> x 'mod' p /= 0) xs)

We're applying sieve to the filtered list. Let's break down what the filter part does:

filter (\ x -> x 'mod' p /= 0) xs

filter takes a function and a list on which we apply that function, and retains elements that meet the criteria given by the function. In this case, filter takes an anonymous function:

\ x -> x 'mod' p /= 0

This anonymous function takes one argument, x. It checks the modulus of x against p (the head of the list, every time sieve is called):

 x 'mod' p

If the modulus is not equal to 0:

 x 'mod' p /= 0

Then the element x is kept in the list. If it is equal to 0, it's filtered out. This makes sense: if x is divisible by p, than x is divisible by more than just 1 and itself, and thus it is not prime.

Quint answered 19/11, 2009 at 15:49 Comment(2)

-1 for "we take the head of the list ... and append it to the result of ...". Novices are easily confused by such imprecise formulations. – Subassembly 27/1, 2012 at 18:29

It's about laziness, and timing. We don't use results in guarded recursion (to use a result of recursive call is recursion). The "result" will be calculated later and put in its place. Plus, the "result" is never present all at once. It's the process of calculating the result's consituents, one-by-one, that is really defined here. Just my personal take. – Subassembly 27/1, 2012 at 18:51

Contrary to what others have stated here, this function does not implement the true sieve of Eratosthenes. It does returns an initial sequence of the prime numbers though, and in a similar manner, so it's okay to think of it as the sieve of Eratosthenes.

I was about done explaining the code when mipadi posted his answer; I could delete it, but since I spent some time on it, and because our answers are not completely identical, I'll leave it here.

Firs of all, note that there are some syntax errors in the code you posted. The correct code is,

let sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs) in sieve [2..]

let x in y defines x and allows its definition to be used in y. The result of this expression is the result of y. So in this case we define a function sieve and return the result of applying [2..] to sieve.
Now let us have a closer look at the let part of this expression:
```
sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs)
```
1. This defines a function sieve which takes a list as its first argument.
2. (p:xs) is a pattern which matches p with the head of said list and xs with the tail (everything but the head).
3. In general, p : xs is a list whose first element is p. xs is a list containing the remaining elements. Thus, sieve returns the first element of the list it receives.
4. Not look at the remainder of the list:
```
sieve (filter (\x -> x `mod` p /= 0) xs)
```
  1. We can see that sieve is being called recursively. Thus, the filter expression will return a list.
  2. filter takes a filter function and a list. It returns only those elements in the list for which the filter function returns true.
  3. In this case xs is the list being filtered and
```
(\x -> x `mod` p /= 0)
```
    is the filter function.
  4. The filter function takes a single argument, x and returns true iff it is not a multiple of p.
Now that sieve is defined, we pass it [2 .. ], the list of all natural numbers starting at 2. Thus,
1. The number 2 will be returned. All other natural number which are a multiple of 2 will be discarded.
2. The second number is thus 3. It will be returned. All other multiples of 3 will be discarded.
3. Thus the next number will be 5. Etc.

Bergmann answered 19/11, 2009 at 15:50 Comment(3)

Thanks, Mipadi posted first so I gave the correct ans to him, but your answer is equally good as well, thank you! – Torino 19/11, 2009 at 17:3

in the first call to sieve, haskell not producing an infinite list (it can't be done...) of numbers and forward them to the next call of sieve that itself need to make a list of infinite list etc etc. so how the mechanism works here? and if haskell not make infinite lists, in the second call to sieve (p = 3) how it knows that it should discard 4 and move to 5? the second sieve doesn't "know" about the first call to sieve in which there 4 of course would be eliminated (but again, if haskell not really making infinite list how it know that number 4 or 60002 should be removed from the list?) – Paper 12/11, 2017 at 9:55

@Paper (just now noticed your question) Indeed Haskell isn't producing infinite lists here, but it does define them. That can be done. So the first invocation of sieve receives (a definition of) infinite list [2..] and produces a definition of infinite list [x | x <- [3..], rem x 2 > 0]. When next invocation of sieve receives [x | x <- [3..], rem x 2 > 0] it calculates it to 3 : [x | x <- [4..], rem x 2 > 0], produces 3 and invokes sieve [y | y <- [x | x <- [4..], rem x 2 > 0], rem y 3 > 0]. It calculates as little of these lists as possible, and the numbers pop out 1 by 1. – Subassembly 29/6, 2019 at 10:14

It's actually pretty elegant.

First, we define a function sieve that takes a list of elements:

sieve (p:xs) =

p : sieve (filter (\ x -> x 'mod' p /= 0) xs)

So let's look at the code that does the work on the rest of the list:

sieve (filter (\ x -> x 'mod' p /= 0) xs)

We're applying sieve to the filtered list. Let's break down what the filter part does:

filter (\ x -> x 'mod' p /= 0) xs

filter takes a function and a list on which we apply that function, and retains elements that meet the criteria given by the function. In this case, filter takes an anonymous function:

\ x -> x 'mod' p /= 0

This anonymous function takes one argument, x. It checks the modulus of x against p (the head of the list, every time sieve is called):

 x 'mod' p

If the modulus is not equal to 0:

 x 'mod' p /= 0

Quint answered 19/11, 2009 at 15:49 Comment(2)

-1 for "we take the head of the list ... and append it to the result of ...". Novices are easily confused by such imprecise formulations. – Subassembly 27/1, 2012 at 18:29

It defines a generator - a stream transformer called "sieve",

Sieve s = 
  while( True ):
      p := s.head
      s := s.tail
      yield p                             -- produce this
      s := Filter (nomultsof p) s         -- go next

primes := Sieve (Nums 2)

which uses a curried form of an anonymous function equivalent to

nomultsof p x = (mod x p) /= 0

Both Sieve and Filter are data-constructing operations with internal state and by-value argument passing semantics.

Here we can see that the most glaring problem of this code is not, repeat not that it uses trial division to filter out the multiples from the working sequence, whereas it could find them out directly, by counting up in increments of p. If we were to replace the former with the latter, the resulting code would still have abysmal run-time complexity.

No, its most glaring problem is that it puts a Filter on top of its working sequence too soon, when it should really do that only after the prime's square is seen in the input. As a result it creates a quadratic number of Filters compared to what's really needed. The chain of Filters it creates is too long, and most of them aren't even needed at all.

The corrected version, with the filter creation postponed until the proper moment, is

Sieve ps s = 
  while( True ):
      x := s.head
      s := s.tail
      yield x                             -- produce this
      p := ps.head
      q := p*p
      while( (s.head) < q ):
          yield (s.head)                  --    and these
          s := s.tail
      ps := ps.tail                       -- go next
      s  := Filter (nomultsof p) s

primes := Sieve primes (Nums 2)

or in Haskell,

primes = sieve primes [2..] 
sieve ps (x:xs) = x : h ++ sieve pt [x | x <- t, rem x p /= 0]
     where (p:pt) = ps
           (h,t)  = span (< p*p) xs

rem is used here instead of mod as it can be much faster in some interpreters, and the numbers are all positive here anyway.

Measuring the local orders of growth of an algorithm by taking its run times t1,t2 at problem-size points n1,n2, as logBase (n2/n1) (t2/t1), we get O(n^2) for the first one, and just above O(n^1.4) for the second (in n primes produced).

Just to clarify it, the missing parts could be defined in this (imaginary) language simply as

Nums x =            -- numbers from x
  while( True ):
      yield x
      x := x+1

Filter pred s =     -- filter a stream by a predicate
  while( True ):
      if pred (s.head) then yield (s.head)
      s := s.tail

see also.

update: Curiously, the first instance of this code in David Turner's 1976 SASL manual according to A.J.T. Davie's 1992 Haskell book,

primes = sieve [2..]

-- [Int] -> [Int]
sieve (p:nos) = p : sieve (remove (multsof p) nos)

actually admits two pairs of implementations for remove and multsof going together -- one pair for the trial division sieve as in this question, and the other for the ordered removal of each prime's multiples directly generated by counting, aka the genuine sieve of Eratosthenes (both would be non-postponed, of course). In Haskell,

-- Int -> (Int -> Bool)                   -- Int -> [Int]
multsof p n = (rem n p)==0                multsof p = [p*p, p*p+p..]

-- (Int -> Bool) -> ([Int] -> [Int])      -- [Int] -> ([Int] -> [Int])
remove m xs = filter (not.m) xs           remove m xs = minus xs m

(If only he would've postponed picking the actual implementation here...)

As for the postponed code, in a pseudocode with "list patterns" it could've been

    primes = [2, ...sieve primes [3..]]

    sieve [p, ...ps] [...h, p*p, ...nos] =
                     [...h, ...sieve ps (remove (multsof p) nos)]

which in modern Haskell can be written with ViewPatterns as

{-# LANGUAGE ViewPatterns #-}

primes = 2 : sieve primes [3..]

sieve (p:ps) (span (< p*p) -> (h, _p2 : nos)) 
                             = h ++ sieve ps (remove (multsof p) nos)

Subassembly answered 15/1, 2012 at 17:51 Comment(9)

+1. Are you sure the complexity for the second one is O(n^1.4)? How did you arrive at that result? – Wenonawenonah 16/1, 2012 at 6:13

log (t2/t1) / log (n2/n1). Empirical local time complexity. It's actually just above that, 1.40..1.42, in the measured low range of n values. I ran the interpreted code in GHCi, taking time statistics for primes!! 1000 and then primes!! 2000, and calculating logBase 2 (1+t2/t1) (since the calculation is accumulative in this case). See the whole saga at haskellwiki page. – Subassembly 16/1, 2012 at 9:5

@Wenonawenonah when GHC -O2 compiled and run standalone, it was n^1.36,1.46 for 50-100-200 thousandth prime. The span call is non-local and causes space leak. With span inlined it runs at n^1.36,1.43,1.43 for 50-100-200-400,000 primes. – Subassembly 16/1, 2012 at 10:50

actually my guess is that it is still O(n^2). As I understand it is still a trial division algorithm and has to check for divisibility with all previous primes each time. The imperative mutable version, which uses STUArrays, calculates the one-millionths prime instantly while this implementation takes a minute. I need to do more analysis to be accurate though. – Wenonawenonah 16/1, 2012 at 18:57

not with all preceding primes, but with all primes preceding the square root. The theoretical complexity of O(n^1.5/(log n)^0.5), in n primes produced (as opposed to expressing it in m the upper limit, of O(m^1.5/(log m)^2) IIRC). Do take a look at that haskellwiki page, for the real version to check. This one is just first step in a long sequence of improvements. – Subassembly 17/1, 2012 at 7:53

yes I mean there is a filter function call corresponding to each prime number. I don't know really how to reason about this lazy chain of filters, it would be helpful if you elaborate more on that part. And I read that Haskellwiki page before, it has some nice ideas. – Wenonawenonah 17/1, 2012 at 12:55

@Wenonawenonah thanks. FWIW the fastest, tree-merging on wheels, list-based Haskell version is steady 3.3x slower than STUArrays, and 20..15 slower than sample baseline C++ implementation. As for chain of filters, by postponing the creation of each filter, the total number of filters is capped, and each candidate number is tested by primes below its square root only. We get far ahead along the candidates stream, than we are on the primes stream, from which we take out primes to filter the candidates by. – Subassembly 17/1, 2012 at 13:4

@Wenonawenonah I thought the generator metaphor should make it clearer. There's nothing lazy about it; we stop on each yield and wait until a value gets yanked. That's exactly what happens in Haskell when the list gets accessed - new value gets requested from the generator, that's all. Filter, just like Sieve, just creates a new data object with its own internal state, responding to next requests. So each new Frame just puts a new frame so to speak above the underlying stream, which is a generator too: [2..] = Nums 2 where Nums x=LOOP:; yield x; x:=x+1; goto LOOP. Simple. Imperative. – Subassembly 17/1, 2012 at 13:10

erhm, "nothing special about it" I should've said. And a typo: each new Filter just puts a new frame above its input stream. – Subassembly 11/3, 2012 at 7:46

It's implementing the Sieve of Eratosthenes

Basically, start with a prime (2), and filter out from the rest of the integers, all multiples of two. The next number in that filtered list must also be a prime, and therefore filter all of its multiples from the remaining, and so on.

Vegetate answered 19/11, 2009 at 15:42 Comment(0)

It says "the sieve of some list is the first element of the list (which we'll call p) and the sieve of the rest of the list, filtered such that only elements not divisible by p are allowed through". It then gets things started by by returning the sieve of all integers from 2 to infinity (which is 2 followed by the sieve of all integers not divisible by 2, etc.).

I recommend The Little Schemer before you attack Haskell.

Butterfly answered 19/11, 2009 at 15:47 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags