Fastest prime test for small-ish numbers

Asked 20/9, 2010 at 21:47 Answered 12/4, 2022 at 15:45

I'm playing through project Euler in my spare time, and it's come to the point where I need to do some refactoring. I've implemented Miller-Rabin, as well as a few sieves. I've heard before that sieves are actually faster for small-ish numbers, as in under a few million. Does anybody have any information on this? Google wasn't very helpful.

Procrastinate answered 20/9, 2010 at 21:47 Comment(3)

Randomly, on question 10, my trial division up to root(n) algorithm kicked the pants off my miller-rabin algorithm. – Procrastinate 20/9, 2010 at 21:48

Why not memoize the previously-seen prime numbers in a trie? That is a super-cheap operation. – Codel 20/9, 2010 at 21:56

Why don't you try it? Take a look at this answer for a simple Java sieve that I used many times in Project Euler: #1043402 – Ingvar 21/9, 2010 at 19:45

Yes, you'll find with most algorithms that you can trade space for time. In other words, by allowing the use of more memory, the speed is greatly increased ^*a.

I don't actually know the Miller-Rabin algorithm but, unless it's simpler than a single shift-left/add and memory extraction, it will be blown out of the water by a pre-calculated sieve.

The important thing here is pre-calculated. It's a good idea, in terms of performance, to pre-calculate things like this since the first million primes will be unlikely to change in the near future :-)

In other words, create your sieve with something like:

unsigned char primeTbl[] = {0,0,1,1,0,1,0,1,0,0,0,1};
#define isPrime(x) ((x < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))

with all the usual caveats about not passing things like a++ into macros. This gives you the best of both worlds, a blindingly fast table lookup for "small-ish" primes, dropping back to a calculation method for those outside the range.

Obviously you would write a program using one of the other methods to generate that lookup table - you don't really want to have to type it all in by hand.

But, as with all optimisation questions, measure, don't guess!

^*a A classic case of this was some trig functions I once had to write for an embedded system. This was a competitive contract bid and the system had a little more storage than CPU grunt.

We actually won the contract since our benchmark figures for the functions blew the competition away.

Why? Because we pre-calculated the values into a lookup table originally calculated on another machine. By judicious use of reduction (bringing the input values down below 90 degrees) and trig properties (the fact that cosine is just a phase shift of sine and that the other three quadrants are related to the first), we got the lookup table down to 180 entries (one per half degree).

The best solutions are those that are elegant and devious :-)

For what it's worth, the following C code will generate such a table for you, all the primes below four million (283,000 of them).

#include <stdio.h>

static unsigned char primeTbl[4000000];

int main (void) {
    int i, j;

    for (i = 0; i < sizeof(primeTbl); i++)
        primeTbl[i] = 1;

    primeTbl[0] = 0;
    primeTbl[1] = 0;
    for (i = 2; i < sizeof(primeTbl); i++)
        if (primeTbl[i])
            for (j = i + i; j < sizeof(primeTbl); j += i)
                primeTbl[j] = 0;

    printf ("static unsigned char primeTbl[] = {");
    for (i = 0; i < sizeof(primeTbl); i++) {
        if ((i % 50) == 0) {
            printf ("\n   ");
        }
        printf ("%d,", primeTbl[i]);
    }
    printf ("\n};\n");
    printf ("#define isPrime(x) "
        "((x < sizeof(primeTbl) ? primeTbl[x] : isPrimeFn(x))\n");

    return 0;
}

If you can bump up the primeTbl table to sixteen million entries (16M), you'll find that's enough to keep the prime count above a million (the first 1,031,130 primes).

Now there are ways to make that take less storage such as only storing odd numbers and adjusting the macro to take care of that, or using a bit mask instead of unsigned characters. I prefer simplicity of algorithms myself if the memory is available.

Blurt answered 20/9, 2010 at 21:59 Comment(7)

+1 for "the first million primes will be unlikely to change in the near future", LOL. I'm not familiar with Project Euler rules, perhaps this isn't allowed? – Bylaw 20/9, 2010 at 22:2

@Mark: There are no formal rules in Project Euler. – Pyonephritis 20/9, 2010 at 22:4

Yes, if you're willing to blow your whole L1 cache (61 kB) on the table, you can check odds under a million for primality with very fast amortized performance. But for Project Euler you're going to need primes in a much wider range, and performance for larger numbers will dominate the runtime. – Phonograph 21/9, 2010 at 16:29

You can also store the differences between consecutive primes starting with 3. This allows 1 byte per prime for a really large range but precludes lookups. So checking for factors is faster, but checking if something is in the list is poor. – Gulick 21/9, 2010 at 19:14

@phkahler: This works up to 436,273,009, that is, a 22 MB list. If you start with 5 instead of 3 and store half the difference, you can go much higher, to 304,599,508,537; this yields an 11 GB list. – Phonograph 15/11, 2010 at 1:31

@Mark: the informal rules of Project Euler are that you can take as long as you like to write the program, but it should take at most a minute to run (or some similar time limit). It's then self-policed - if you can work out the answer on paper that probably counts as a runtime of 0, but if you write a C++ template meta-program that takes an hour to compile, that probably doesn't. As long as the prime table is shared between lots of different problems (which it would be), personally I'd probably allow myself to amortize the cost of generating it. – Heidyheifer 12/3, 2011 at 10:13

There are obvious grey areas that arise, for example if my first attempt takes two minutes, but once I see the answer I realize that there's an optimization I can make to bring it down, then have I "cheated"? Does it depend whether I can write a formal proof of the correctness of the optimization? I tend to think that I should be able to sketch-prove on paper (without using the results of other, slow programs) that my program finds the correct answer. So a large prime table probably ought to be re-generated under the time limit for each problem, and IIRC that's what I've done in the past – Heidyheifer 12/3, 2011 at 10:22

I recommend a tiered approach. First, make sure there are no small prime factors. Trial-dividing by the first 20 or 30 primes works, though if you use a clever approach you can reduce the number of divisions needed by using gcds. This step filters out about 90% of the composites.

Next, test if the number is a strong probable prime (Miller-Rabin test) to base 2. This step removes almost all remaining composites, but some rare composites can pass.

The final proving step depends on how large you want to go. If you are willing to work in a small range, do a binary search on a list of 2-pseudoprimes up the the largest you allow. If that's 2^32, your list will have only 10,403 members, so the lookup should take only 14 queries.

If you want to go up to 2^64, it now suffices (thanks to the work of Jan Feitisma) to check if the number is a BPSW pseudoprime. (You could also download the 3 GB list of all exceptions, remove those which trial division would remove, and write a disk-based binary search.) T. R. Nicely has a nice page explaining how to implement this reasonably efficiently.

If you need to go higher, implement the above method and use it as a subroutine for a Pocklington-style test. This stretches the definition of "small-ish"; if you want more information on these methods, just ask.

Phonograph answered 21/9, 2010 at 18:32 Comment(0)

As a variant on the notion of pre-computation, you can first cheaply check whether the candidate number p is divisible by 2, 3, 5, 7, or 11. If not, then declare p prime if 2^p-1 = 1 (mod p). This will fail at some point, but it works up to 100 million because I tested it (pre-computation).

In other words, all the small-ish Fermat pseudo-primes to the base 2 are divisible by one of 3, 5, 7, or 11.

EDIT:

As correctly noted by @starblue, the above is simply wrong. I had a bug in my program. The best I can do is amend the above to:

If candidate p is divisible by 2, 3, 5, 7, or 11, declare it composite;
Else if p is one of {4181921, 4469471, 5256091, 9006401, 9863461}, declare it composite;
Else if p passes the Miller-Rabin test for bases 2 and 5 then declare it prime;
Else declare it composite.

This I tested for integers less than 10,000,000. Perhaps a different pair of bases would do even better.

Please accept my apologies for my mistakes.

EDIT 2:

Well, it appears that the information I was after is already on the Wikipedia page for the Miller-Rabin algorithm, the section titled "Deterministic variants of the test".

Ocarina answered 21/9, 2010 at 0:20 Comment(11)

@Greg, that final test looks slightly weird to me (1 mod p is always 1 for p > 1). I assume you mean if (2^(p-1) mod p) = 1, yes? – Blurt 21/9, 2010 at 1:16

by "=", I mean congruent. I should have parenthesized the mod p part. Corrected. – Ocarina 21/9, 2010 at 1:39

This seems like very useful information to keep around for a rainy day - what's the first number that it fails on? – Waligore 21/9, 2010 at 2:32

I still haven't found one. Maybe I'll run it overnight sometime and see if I hit one. – Ocarina 21/9, 2010 at 2:36

Isn't this just Miller-Rabin using 2 as a in a^(n-1)=1 (mod n)? It's interesting that it works up to 100 million if it is, as I maybe incorrectly assumed that Miller-Rabin was probabilistic and needed to be run with multiple different numbers as a to be accurate... – Procrastinate 21/9, 2010 at 7:17

@Josh: The Miller-Rabin primality test is based on, and is an improvement of, the Fermat primality test. The Miller-Rabin and Fermat test are probabilistic and are normally run multiple times, but this is a special case. – Ocarina 21/9, 2010 at 11:37

@caf: It first fails for 1387. There are 1458 exceptions up to 100 million. – Phonograph 21/9, 2010 at 16:22

@GregS: It still looks like the base case of Miller-Rabin to me. Why does this work? – Procrastinate 21/9, 2010 at 17:24

@GregS: Oh duh, I didn't read the part about division by primes. – Procrastinate 21/9, 2010 at 17:25

-1 29341 = 13 * 37 * 61 is a Carmichael number, where it must fail. oeis.org/classic/A002997 – Ingvar 21/9, 2010 at 19:59

+1: the new test works up to 14,709,241. Bases 2 and 11 are pretty nice; 5 exceptions make it work until 63,388,033. Bases 2 and 733 are slightly better with 74,927,161. – Phonograph 22/9, 2010 at 5:10

The only way is to benchmark yourself. When you do, write it up, and post it online somewhere.

Lukas answered 20/9, 2010 at 21:54 Comment(3)

Seriously. You've already done the implementations, why not time them yourself? If you're afraid you might have missed the fastest algorithm, post your best as a new question and see if someone can do better. – Bylaw 20/9, 2010 at 21:59

I could do this, but my implementation of a few of these tests are really crappy. I'm pretty sure the way I've written my miller-rabin is pretty bad. Actually I know it's bad. I was wondering for best-case scenarios, so I can just work on the "correct" implementation without refactoring every single one of mine to be "good enough" before testing them. – Procrastinate 21/9, 2010 at 6:18

Java also has Miller-Rabin in the library, for BigInteger. – Ingvar 21/9, 2010 at 19:52

Assuming n < 4669921 it would be very fast :

if ((n == 1) == (n & 1)) return n == 2;
return ((n & 1) & ((n < 6) * 42 + 0x208A2882) >> n % 30 && (n < 49 || (n % 7 && n % 11 && n % 13 && n % 17 && n % 19 && n % 23 && n % 29 && (n < 961 || (n % 31 && n % 37 && n % 41 && n % 43 && n % 47 && n % 53 && n % 59 && n % 61 && n % 67 && (n < 5041 || (n % 71 && n % 73 && n % 79 && n % 83 && n % 89 && n % 97 && n % 101 && n % 103 && n % 107 && (n < 11881 || (n % 109 && n % 113 && n % 127 && n % 131 && n % 137 && n % 139 && n % 149 && n % 151 && n % 157 && (n < 26569 || (n % 163 && n % 167 && n % 173 && n % 179 && n % 181 && n % 191 && n % 193 && n % 197 && n % 199 && (n < 44521 || (n % 211 && n % 223 && n % 227 && n % 229 && n % 233 && n % 239 && n % 241 && n % 251 && n % 257 && (n < 69169 || (n % 263 && n % 269 && n % 271 && n % 277 && n % 281 && n % 283 && n % 293 && n % 307 && n % 311 && (n < 97969 || (n % 313 && n % 317 && n % 331 && n % 337 && n % 347 && n % 349 && n % 353 && n % 359 && n % 367 && (n < 139129 || (n % 373 && n % 379 && n % 383 && n % 389 && n % 397 && n % 401 && n % 409 && n % 419 && n % 421 && (n < 185761 || (n % 431 && n % 433 && n % 439 && n % 443 && n % 449 && n % 457 && n % 461 && n % 463 && n % 467 && (n < 229441 || (n % 479 && n % 487 && n % 491 && n % 499 && n % 503 && n % 509 && n % 521 && n % 523 && n % 541 && (n < 299209 || (n % 547 && n % 557 && n % 563 && n % 569 && n % 571 && n % 577 && n % 587 && n % 593 && n % 599 && (n < 361201 || (n % 601 && n % 607 && n % 613 && n % 617 && n % 619 && n % 631 && n % 641 && n % 643 && n % 647 && (n < 426409 || (n % 653 && n % 659 && n % 661 && n % 673 && n % 677 && n % 683 && n % 691 && n % 701 && n % 709 && (n < 516961 || (n % 719 && n % 727 && n % 733 && n % 739 && n % 743 && n % 751 && n % 757 && n % 761 && n % 769 && (n < 597529 || (n % 773 && n % 787 && n % 797 && n % 809 && n % 811 && n % 821 && n % 823 && n % 827 && n % 829 && (n < 703921 || (n % 839 && n % 853 && n % 857 && n % 859 && n % 863 && n % 877 && n % 881 && n % 883 && n % 887 && (n < 822649 || (n % 907 && n % 911 && n % 919 && n % 929 && n % 937 && n % 941 && n % 947 && n % 953 && n % 967 && (n < 942841 || (n % 971 && n % 977 && n % 983 && n % 991 && n % 997 && n % 1009 && n % 1013 && n % 1019 && n % 1021 && (n < 1062961 || (n % 1031 && n % 1033 && n % 1039 && n % 1049 && n % 1051 && n % 1061 && n % 1063 && n % 1069 && n % 1087 && (n < 1190281 || (n % 1091 && n % 1093 && n % 1097 && n % 1103 && n % 1109 && n % 1117 && n % 1123 && n % 1129 && n % 1151 && (n < 1329409 || (n % 1153 && n % 1163 && n % 1171 && n % 1181 && n % 1187 && n % 1193 && n % 1201 && n % 1213 && n % 1217 && (n < 1495729 || (n % 1223 && n % 1229 && n % 1231 && n % 1237 && n % 1249 && n % 1259 && n % 1277 && n % 1279 && n % 1283 && (n < 1661521 || (n % 1289 && n % 1291 && n % 1297 && n % 1301 && n % 1303 && n % 1307 && n % 1319 && n % 1321 && n % 1327 && (n < 1852321 || (n % 1361 && n % 1367 && n % 1373 && n % 1381 && n % 1399 && n % 1409 && n % 1423 && n % 1427 && n % 1429 && (n < 2053489 || (n % 1433 && n % 1439 && n % 1447 && n % 1451 && n % 1453 && n % 1459 && n % 1471 && n % 1481 && n % 1483 && (n < 2211169 || (n % 1487 && n % 1489 && n % 1493 && n % 1499 && n % 1511 && n % 1523 && n % 1531 && n % 1543 && n % 1549 && (n < 2411809 || (n % 1553 && n % 1559 && n % 1567 && n % 1571 && n % 1579 && n % 1583 && n % 1597 && n % 1601 && n % 1607 && (n < 2588881 || (n % 1609 && n % 1613 && n % 1619 && n % 1621 && n % 1627 && n % 1637 && n % 1657 && n % 1663 && n % 1667 && (n < 2785561 || (n % 1669 && n % 1693 && n % 1697 && n % 1699 && n % 1709 && n % 1721 && n % 1723 && n % 1733 && n % 1741 && (n < 3052009 || (n % 1747 && n % 1753 && n % 1759 && n % 1777 && n % 1783 && n % 1787 && n % 1789 && n % 1801 && n % 1811 && (n < 3323329 || (n % 1823 && n % 1831 && n % 1847 && n % 1861 && n % 1867 && n % 1871 && n % 1873 && n % 1877 && n % 1879 && (n < 3568321 || (n % 1889 && n % 1901 && n % 1907 && n % 1913 && n % 1931 && n % 1933 && n % 1949 && n % 1951 && n % 1973 && (n < 3916441 || (n % 1979 && n % 1987 && n % 1993 && n % 1997 && n % 1999 && n % 2003 && n % 2011 && n % 2017 && n % 2027 && (n < 4116841 || (n % 2029 && n % 2039 && n % 2053 && n % 2063 && n % 2069 && n % 2081 && n % 2083 && n % 2087 && n % 2089 && (n < 4405801 || (n % 2099 && n % 2111 && n % 2113 && n % 2129 && n % 2131 && n % 2137 && n % 2141 && n % 2143 && n % 2153)))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))));

Longfellow answered 12/4, 2022 at 15:45 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags