Explain a code to check primality based on Fermat's little theorem

I found some Python code that claims checking primality based on Fermat's little theorem:

def CheckIfProbablyPrime(x):
    return (2 << x - 2) % x == 1

My questions:

  1. How does it work?
  2. What's its relation to Fermat's little theorem?
  3. How accurate is this method?
  4. If it's not accurate, what's the advantage of using it?

I found it here.

Nichrome answered 12/4, 2015 at 23:25 Comment(6)
This is a pretty direct implementation of the theorem, using a == 2. It uses a bit shift operator to do 2**x; read up on the operator here. – Frisk
I couldn't understand most of it, but does this help?: math.stackexchange.com/questions/659529/… – Prayer
@Frisk OK, so it was done to gain performance. Still, why use 2? Wouldn't any base work? No difference? – Nichrome
@jakekimds Thanks, that was a great and short explanation of how the theorem works. – Nichrome
@bigOTHER I believe 341, 561, 645 and 2 don't work, to name a few. – Prayer
@jakekimds I noticed that about 2, which is why I got confused about its advantage and how it should be used. – Nichrome

1. How does it work?

Fermat's little theorem says that if a number x is prime, then for any integer a:

a**x == a (mod x)

If a is not a multiple of x, we can divide both sides by a and re-write the equation as follows:

a**(x-1) == 1 (mod x)
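
As a quick numeric check of that form of the theorem (my own example, not from the original answer), take a = 2 and x = 7:

print(2 ** (7 - 1) % 7)   # 1 -- the congruence holds for the prime 7
print(2 ** (8 - 1) % 8)   # 0 -- it fails for the composite 8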

I'm going to punt on proving how this works (your first question) because there are many good proofs (better than I can provide) on this wiki page and under some Google searches.

2. Relation between code and theorem

So, the function you posted checks if (2 << x - 2) % x == 1.

First off, (2 << x-2) is the same thing as writing 2**(x-1), or in math-form:

2**(x-1)

That's because << is the logical left-shift operator, which is explained better here. The relation between bit-shifting and multiplying by powers of 2 is specific to the way that numbers are represented on computers (in binary), but it all boils down to

2 << (x-1) == 2**(x)

I can subtract 1 from the exponent on both sides, which gives

2 << (x-2) == 2**(x-1)
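
If you want to convince yourself of that equivalence, here is a tiny check (my own sketch, not part of the original answer):

# Both sides agree for every x >= 2, since shifting left by n multiplies by 2**n.
for x in range(2, 20):
    assert 2 << (x - 2) == 2 ** (x - 1)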


Now, we know from above that for any number a,

a**(x-1) == 1 (mod x)

Let's say then that a = 2. That gives us

2**(x-1) == 1 (mod x)

Well heck, that's the same as 2 << (x-2)! So then we can write:

2 << (x-2) == 2**(x-1) == 1 (mod x)

Which leads to the final relation:

2 << (x-2) == 1 (mod x)


Now, the math version of mod looks kind of odd, but we can write the equivalent code as follows:

(2 << x - 2) % x == 1

And that's the relation.
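
To make that concrete (a quick check of my own, using x = 13):

print((2 << 13 - 2) % 13)   # 1, because 2 << 11 is 4096 and 4096 % 13 == 1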

3. Accuracy of method

So, I think "accuracy" is a bad term here, because Fermat's little theorem is definitely true for all prime numbers. However, that does not mean that it's true or false for all numbers -- which is to say, if I have some number i, and I'm not sure if i is prime, using Fermat's Little Relation will only tell me if it is definitely NOT prime. If Fermat's Little Relation is true, then i could not be prime. These kinds of numbers are called pseudoprime numbers, or more specifically in this case Fermat Pseudoprime numbers.

If this sort of thing sounds interesting, take a look at the Carmichael numbers, AKA the absolute Fermat pseudoprimes: composite numbers that pass the Fermat test for every base coprime to them. In our case we run into numbers which pass in base 2, but Fermat's little theorem might not hold for them in other bases -- the Carmichael numbers are the ones that pass the test for all bases coprime to x.
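
To see concretely what "passes for all coprime bases" means (my own sketch; 561 = 3 * 11 * 17 is the smallest Carmichael number):

from math import gcd

n = 561  # composite, but a Carmichael number
coprime_bases = [a for a in range(2, 20) if gcd(a, n) == 1]
# Every coprime base is fooled: a**(n-1) % n == 1 even though n is not prime.
print(all(pow(a, n - 1, n) == 1 for a in coprime_bases))   # True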

On the wiki page for the Carmichael numbers there is a discussion of their distribution over the natural numbers -- their count grows roughly like a power of the size of the range over which you're looking, with an exponent less than 1 (about 1/3). So, if you're searching for primes over a big range, you're going to run into more and more Carmichael numbers, which are effectively false positives for this method CheckIfProbablyPrime. That might be okay, depending on your input and how much you care about running into false positives.
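
If you want to gauge how often this happens, here is a rough count of the false positives below 10,000 (my own sketch; it uses trial division as the ground truth, which is slow but unambiguous):

def is_prime_slow(n):
    # Plain trial division, used only as a reference check here.
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

false_positives = [n for n in range(3, 10000, 2)
                   if pow(2, n - 1, n) == 1 and not is_prime_slow(n)]
print(false_positives)   # starts 341, 561, 645, 1105, 1387, ... (base-2 Fermat pseudoprimes)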

4. Why is this useful?

In short, it's an optimization.

The main reason to use something like this is to speed up a search for prime numbers, because actually proving that a number is prime is expensive: doable, but much more costly than a quick congruence check. So, if we can avoid the full check for numbers that obviously fail, we can devote more time to the real candidates. Since Fermat's little relation only says "yes" when a number is possibly prime (it never says "no" for an actual prime, apart from the special case x = 2, where the base 2 isn't coprime to x), and it is cheap to evaluate, we can toss it into an is_prime loop to throw away a fair amount of composites quickly, as sketched below. So, we can speed things up.
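
Here is one way that pre-filtering might look (my own sketch, not the original author's code; it uses the Fermat check to reject most composites cheaply and trial division as the expensive confirmation step):

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True
    # Cheap Fermat pre-filter: if this fails, n is definitely composite.
    if pow(2, n - 1, n) != 1:
        return False
    # Expensive confirmation (trial division), reached only by the survivors.
    return all(n % d for d in range(2, int(n ** 0.5) + 1))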

There are many primality checks like this one; you can find some implementations here.


Final Note

One of the confusing things about this optimization is that it uses the bit-shift operator << instead of the exponentiation operator **. That's because bit shifting is one of the cheapest operations your computer can do, while exponentiation is somewhat slower. It isn't always a worthwhile optimization, because most modern languages know how to replace what we write with more optimized operations. But that's my guess as to why the authors of this code used the bit shift instead of 2**(x-1).
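
If you're curious about the raw difference between the two ways of building the power, here is a rough timing sketch (my own; the exact numbers depend on your machine and Python version, and the test value is just an arbitrary large odd number):

import timeit

x = 100003  # arbitrary large odd test value
print(timeit.timeit(lambda: (2 << x - 2) % x, number=100))
print(timeit.timeit(lambda: 2 ** (x - 1) % x, number=100))
# The shift version is usually noticeably faster than **, but as the edit
# below explains, pow(2, x - 1, x) beats both, because it never builds the
# huge intermediate number.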


Edit: As MarkDickinson notes, computing the full power of a number and only then taking it mod x is not the best way to do this. This is called modular exponentiation, and there are algorithms which can do it much faster than the way we've written it. Python's builtin pow actually implements one of these algorithms, and takes an optional third argument to mod by. So we can write a final version of this function:

def CheckIfProbablyPrime(x):
    return pow(2, x-1, x) == 1

It's not only more readable but also faster than the confusing bit-shift crap. You know what they say.
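
For the curious, here is a rough sketch of the square-and-multiply idea behind modular exponentiation (my own illustration of the technique, not Python's actual pow implementation):

def mod_pow(base, exponent, modulus):
    # Process the exponent bit by bit, reducing mod `modulus` at every step
    # so the intermediate values never grow beyond modulus**2.
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # low bit set: multiply it in
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result

print(mod_pow(2, 340, 341) == pow(2, 340, 341))   # True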

Blinding answered 13/4, 2015 at 0:59 Comment(9)
You said the function tells if a number is "definitely NOT a prime". Could you explain why 341, 561, 645 and 2 return False? – Nichrome
Nice answer. It's also worth noting that pow(2, x-1, x) is going to be much more (time and space) efficient than (2 << x - 2) % x for large x. – Originate
@bigOTHER: If the test returns False, then the input is definitely not an odd prime. If it returns True, the test is inconclusive: it may or may not be a prime. Not surprisingly, 2 is a special case here, but the other numbers you ask about (341, 561, 645) are all composite. – Originate
@MarkDickinson Thanks for the clarification, and for pointing out that pow will be better than shifting for large numbers. – Nichrome
Bit shifting is not my specialty, but if x is large, is computing (2 << x-2) % x computationally better? That's a huge number of bits you're trying to find the modulus of; would it be better to go with traditional trial division? – Aerosol
@Dan: Good answer, but your definition of Carmichael numbers is incorrect. A base-2 pseudoprime is a composite which this test claims is prime. A Carmichael number is a composite which a test to any base claims is prime. – Hopi
@Hopi Thanks for the clarification! I've flipped some prose around to reflect that these are not Carmichael numbers, just Fermat pseudoprimes in base 2. Also, as MarkDickinson correctly commented, a modular exponentiation function will almost certainly be faster than either of the approaches I wrote about. – Blinding
Let me mention one more interesting point: the expression pow(2, x-1, x) == 1 also returns True for some even numbers. You can find a list of these numbers here: oeis.org/a006935. Therefore it would be more correct to update the code as follows: return pow(2, x-1, x) == 1 and x % 2 != 0. This is still not enough, because you should also check for pseudoprimes like 341, 561, 645. So the final version of the code should look like this: return pow(2, x-1, x) == 1 and x % 2 != 0 and binary_search_in(x, A001567) == False. A list of pseudoprimes less than 2^64 can be found below. – Phytopathology
Follow these links: oeis.org/A001567 and cecm.sfu.ca/Pseudoprimes/index-2-to-64.html – Phytopathology

I believe the code in your example is incorrect, because the binary left-shift operator is not equivalent to raising a number to a power, which is what Fermat's little theorem uses. With a base of two, a binary left shift would be equal to a power of x + 1, which is NOT the form used in Fermat's little theorem.

Instead, use ** for raising an integer to a power in Python.

def CheckIfProbablyPrime(x):
    return (2 ** x - 2) % x == 0

" p − a is an integer multiple of p " therefore for primes, following theorem, result of 2 in power of x - 2 divided by x will leave a leftover of 0 (modulo '%' checks for number left over after division.

For the x - 1 version:

def CheckIfProbablyPrime(a, x):
   return (a ** (x-1) - 1) % x == 0

Both variations should return True for prime numbers (for the second one, as long as a is not a multiple of x), because they represent Fermat's little theorem in Python.
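
A quick check of both variants (my own example; I've renamed the two-argument version so it doesn't shadow the first function):

def CheckIfProbablyPrime(x):
    return (2 ** x - 2) % x == 0

def CheckIfProbablyPrimeBase(a, x):
    return (a ** (x - 1) - 1) % x == 0

print(CheckIfProbablyPrime(13), CheckIfProbablyPrimeBase(2, 13))   # True True   -- 13 is prime
print(CheckIfProbablyPrime(15), CheckIfProbablyPrimeBase(2, 15))   # False False -- 15 is composite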

Irreclaimable answered 13/4, 2015 at 0:47 Comment(1)
I think you misinterpreted 2 << x - 2. Subtraction takes precedence in the order of operations, so this is 2 << (x-2), not (2 << x) - 2. 2 << (x-2) is the same as 1 << (x-1) or 2**(x-1). – Warfore
