Sieve of Atkin explanation
Asked Answered
J

5

22

I am doing a project at the moment and I need an efficient method for calculating prime numbers. I have used the sieve of Eratosthenes but, I have been searching around and have found that the sieve of Atkin is a more efficient method. I have found it difficult to find an explanation (that I have been able to understand!) of this method. How does it work? Example code (preferably in C or python) would be brilliant.

Edit: thanks for your help, the only thing that I still do not understand is what the x and y variables are referring to in the pseudo code. Could someone please shed some light on this for me?

Joost answered 21/6, 2009 at 12:13 Comment(3)
This looks a lot like a university question...Emmen
or a Project Euler questionPhrenic
related: #2068872Concomitant
S
13

The wiki page is always a good place to start, since it explains the algorithm in full and provides commented pseudocode. (N.B. There's a lot of detail, and since the wiki website is reliably up, I won't quote it here.)

For references in the specific languages you mentioned:

Hope that helps.

Sifuentes answered 21/6, 2009 at 12:17 Comment(2)
-1. the copied code is for sieve of Eratosthenes. And providing links is not an explanation.Tensile
def sieveOfErat(end): wasn't an indication that this was the sieve of Eratosthenes and not Atkin?!Dustpan
T
10

Wikipedia article has an explanation:

  • "The algorithm completely ignores any numbers divisible by two, three, or five. All numbers with an even modulo-sixty remainder are divisible by two and not prime. All numbers with modulo-sixty remainder divisible by three are also divisible by three and not prime. All numbers with modulo-sixty remainder divisible by five are divisible by five and not prime. All these remainders are ignored."

Let's start with the famous

primes = sieve [2..] = sieve (2:[3..])   
  -- primes are sieve of list of 2,3,4... , i.e. 2 prepended to 3,4,5...
sieve (x:xs) = x : sieve [y | y <- xs, rem y x /= 0]   -- set notation
  -- sieve of list of (x prepended to xs) is x prepended to the sieve of 
  --                  list of `y`s where y is drawn from xs and y % x /= 0

Let's see how it proceeds for a few first steps:

primes = sieve [2..] = sieve (2:[3..]) 
                     = 2 : sieve p2     -- list starting w/ 2, the rest is (sieve p2)
p2 = [y | y <- [3..], rem y 2 /= 0]     -- for y from 3 step 1: if y%2 /= 0: yield y

p2 is to contain no multiples of 2. IOW it'll only contain numbers coprime with 2 — no number in p2 has 2 as its factor. To find p2 we don't actually need to test divide by 2 each number in [3..] (that's 3 and up, 3,4,5,6,7,...), because we can enumerate all the multiples of 2 in advance:

rem y 2 /= 0  ===  not (ordElem y [2,4..])     -- "y is not one of 2,4,6,8,10,..."

ordElem is like elem (i.e. is-element test), it just assumes that its list argument is an ordered, increasing list of numbers, so it can detect the non-presence safely as well as presence:

ordElem y xs = take 1 (dropWhile (< y) xs) == [y]   -- = elem y (takeWhile (<= y) xs) 

The ordinary elem doesn't assume anything, so must inspect each element of its list argument, so can't handle infinite lists. ordElem can. So, then,

p2 = [y | y <- [3..], not (ordElem y [2,4..])]  -- abstract this as a function, diff a b =
   = diff      [3..]                 [2,4..]    --       = [y | y <- a, not (ordElem y b)]
   -- 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
   -- . 4 . 6 . 8 . 10  . 12  . 14  . 16  . 18  . 20  . 22  .
   = diff [3..] (map (2*)            [2..] )  --  y > 2, so [4,6..] is enough
   = diff [2*k+x | k <- [0..],  x <- [3,4]]   -- "for k from 0 step 1: for x in [3,4]:
          [2*k+x | k <- [0..],  x <- [  4]]   --                            yield (2*k+x)"
   = [     2*k+x | k <- [0..],  x <- [3  ]]   -- 2 = 1*2 = 2*1
   = [3,5..]                                  -- that's 3,5,7,9,11,...

p2 is just a list of all odds above 2. Well, we knew that. What about the next step?

sieve p2 = sieve [3,5..] = sieve (3:[5,7..]) 
                         = 3 : sieve p3
p3 = [y | y <- [5,7..], rem y 3 /= 0]
   = [y | y <- [5,7..], not (ordElem y [3,6..])]           -- 3,6,9,12,...
   = diff [5,7..] [6,9..]         -- but, we've already removed the multiples of 2, (!)
   -- 5 . 7 . 9 . 11 . 13 . 15 . 17 . 19 . 21 . 23 . 25 . 27 .
   -- . 6 . . 9 . . 12  . . 15 . . 18  . . 21 . . 24  . . 27 .
   = diff [5,7..] (map (3*) [3,5..])                       -- so, [9,15..] is enough
   = diff [2*k+x | k <- [0..], x <- [5]] (map (3*)
          [2*k+x | k <- [0..], x <- [    3]] )
   = diff [6*k+x | k <- [0..], x <- [5,7,9]]               -- 6 = 2*3 = 3*2
          [6*k+x | k <- [0..], x <- [    9]] 
   = [     6*k+x | k <- [0..], x <- [5,7  ]]               -- 5,7,11,13,17,19,...

And the next,

sieve p3 = sieve (5 : [6*k+x | k <- [0..], x <- [7,11]])
         = 5 : sieve p5
p5 = [y | y <-        [6*k+x | k <- [0..], x <- [7,11]], rem y 5 /= 0]
   = diff [ 6*k+x | k <- [0..], x <- [7,11]]   (map   (5*)
          [ 6*k+x | k <- [0..], x <- [                  5,       7]]) -- no mults of 2 or 3!
   = diff [30*k+x | k <- [0..], x <- [7,11,13,17,19,23,25,29,31,35]]  -- 30 = 6*5 = 5*6
          [30*k+x | k <- [0..], x <- [                 25,      35]]
   = [     30*k+x | k <- [0..], x <- [7,11,13,17,19,23,   29,31   ]]

This is the sequence with which the sieve of Atkin is working. No multiples of 2, 3 or 5 are present in it. Atkin and Bernstein work modulo 60, i.e. they double the range:

p60 = [ 60*k+x | k <- [0..], x <- [1, 7,11,13,17,19,23,29,31, 37,41,43,47,49,53,59]]

Next, they use some sophisticated theorems to know some properties of each of those numbers and deal with each accordingly. The whole thing is repeated (a-la the "wheel") with a period of 60.

  • "All numbers n with modulo-sixty remainder 1, 13, 17, 29, 37, 41, 49, or 53 (...) are prime if and only if the number of solutions to 4x^2 + y^2 = n is odd and the number is squarefree."

What does that mean? If we know that the number of solutions to that equation is odd for such a number, then it is prime if it is squarefree. That means it has no repeated factors (like 49, 121, etc).

Atkin and Bernstein use this to reduce the number of operations overall: for each prime (from 7 and up), we enumerate (and mark for removal) the multiples of its square, so they are much further apart than in the sieve of Eratosthenes, so there are fewer of them in a given range.

The rest of the rules are:

  • "All numbers n with modulo-sixty remainder 7, 19, 31, or 43 (...) are prime if and only if the number of solutions to 3x^2 + y^2 = n is odd and the number is squarefree."

  • "All numbers n with modulo-sixty remainder 11, 23, 47, or 59 (...) are prime if and only if the number of solutions to 3x^2 − y^2 = n is odd and the number is squarefree."

This takes care of all the 16 core numbers in the definition of p60.

see also: Which is the fastest algorithm to find prime numbers?


The time complexity of the sieve of Eratosthenes in producing primes up to N is O(N log log N), while that of the sieve of Atkin is O(N) (putting aside the additional log log N optimizations that don't scale well). The accepted wisdom though is that the constant factors in the sieve of Atkin are much higher and so it might only pay off high above the 32-bit numbers (N > 232), if at all.

Tensile answered 3/3, 2015 at 9:28 Comment(0)
B
3

Edit: the only thing that I still do not understand is what the x and y variables are referring to in the pseudo code. Could someone please shed some light on this for me?

For a basic explanation of the common use of the 'x' and 'y' variables in the Wikipedia page pseudo-code (or other better implementations of the Sieve of Atkin), you might find my answer to another related question helpful.

Baguio answered 20/8, 2013 at 18:39 Comment(0)
A
2

Here's a c++ implementation of sieve of atkins that might help you...

#include <iostream>
#include <cmath>
#include <fstream>
#include<stdio.h>
#include<conio.h>
using namespace std;

#define  limit  1000000

int root = (int)ceil(sqrt(limit));
bool sieve[limit];
int primes[(limit/2)+1];

int main (int argc, char* argv[])
{
   //Create the various different variables required
   FILE *fp=fopen("primes.txt","w");
   int insert = 2;
   primes[0] = 2;
   primes[1] = 3;
   for (int z = 0; z < limit; z++) sieve[z] = false; //Not all compilers have false as the       default boolean value
   for (int x = 1; x <= root; x++)
   {
        for (int y = 1; y <= root; y++)
        {
             //Main part of Sieve of Atkin
             int n = (4*x*x)+(y*y);
             if (n <= limit && (n % 12 == 1 || n % 12 == 5)) sieve[n] ^= true;
             n = (3*x*x)+(y*y);
             if (n <= limit && n % 12 == 7) sieve[n] ^= true;
             n = (3*x*x)-(y*y);
             if (x > y && n <= limit && n % 12 == 11) sieve[n] ^= true;
        }
   }
        //Mark all multiples of squares as non-prime
   for (int r = 5; r <= root; r++) if (sieve[r]) for (int i = r*r; i < limit; i += r*r) sieve[i] = false;
   //Add into prime array
   for (int a = 5; a < limit; a++)
   {
            if (sieve[a])
            {
                  primes[insert] = a;
                  insert++;
            }
   }
   //The following code just writes the array to a file
   for(int i=0;i<1000;i++){
             fprintf(fp,"%d",primes[i]);
             fprintf(fp,", ");
   }
      
   return 0;
 }
Albata answered 20/10, 2012 at 14:14 Comment(2)
This looks like a translation of WP's pseudocode, but look at the talk page where there's discussion about that pseudocode being no good really. See this subsection en.wikipedia.org/wiki/… esp. towards its end.Tensile
It looks to be worth quoting one phrase by "EJ" = Emil Jeřábek as "the relative speed of the sieve of Atkin vs. sieve of Eratosthenes is extremely sensitive to various optimization tricks used in the implementation" where he explains that while the SoA is very slightly theoretically faster than the SoE, this is very difficult to realize in practice. A short code segment as posted here will not have those maximum optimizations, which are very much more complex than basic optimizations of the SoE, so generally the SoA is slower than the SoE for quick and easy implementations.Baguio
A
1
// Title : Seive of Atkin ( Prime number Generator) 

#include <iostream>
#include <cmath>
#include <vector>

using namespace std;

int main()
{
    ios_base::sync_with_stdio(false);
    long long int n;
    cout<<"Enter the value of n : ";
    cin>>n;
    vector<bool> is_prime(n+1);
    for(long long int i = 5; i <= n; i++)
    {
        is_prime[i] = false;
    }
    long long int lim = ceil(sqrt(n));

    for(long long int x = 1; x <= lim; x++)
    {
        for(long long int y = 1; y <= lim; y++)
        {
            long long int num = (4*x*x+y*y);
            if(num <= n && (num % 12 == 1 || num%12 == 5))
            {
                is_prime[num] = true;
            }

            num = (3*x*x + y*y);
            if(num <= n && (num % 12 == 7))
            {
                is_prime[num] = true;
            }

            if(x > y)
            {
                num = (3*x*x - y*y);
                if(num <= n && (num % 12 == 11))
                {
                    is_prime[num] = true;
                }
            }
        }
    }
    // Eliminating the composite by seiveing
    for(long long int i = 5; i <= lim; i++)
    {
        if(is_prime[i])
            for(long long int j = i*i; j <= n; j += i)
            {
                is_prime[j] = false;
            }
    }
    // Here we will start printing of prime number
   if(n > 2)
   {
       cout<<"2\t"<<"3\t";
   }
   for(long long int i = 5; i <= n; i++)
   {
         if(is_prime[i])
         {
             cout<<i<<"\t";
         }
    }
    cout<<"\n";
    return 0;
}
Allocation answered 12/11, 2013 at 6:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.