I will explain the regex part outside of primality testing: the following regex, given a String s
which consists of repeating String t
, finds t
.
System.out.println(
"MamamiaMamamiaMamamia".replaceAll("^(.*)\\1+$", "$1")
); // prints "Mamamia"
The way it works is that the regex captures (.*)
into \1
, and then sees if there's \1+
following it. Using the ^
and $
ensures that a match must be of the whole string.
So, in a way, we're given String s
, which is a "multiple" of String t
, and the regex will find such t
(the longest possible, since \1
is greedy).
Once you understand why this regex works, then (ignoring the first alternate in OP's regex for now) explaining how it's used for primality testing is simple.
- To test primality of
n
, first generate a String
of length n
(filled with the same char
)
- The regex captures a
String
of some length (say k
) into \1
, and tries to match \1+
to the rest of the String
- If there is a match, then
n
is a proper multiple of k
, and therefore n
is not prime.
- If there's no match, then no such
k
exists that divides n
, and n
is therefore a prime
How does .?|(..+?)\1+
match prime numbers?
Actually, it doesn't! It matches String
whose length is NOT prime!
.?
: The first part of the alternation matches String
of length 0
or 1
(NOT prime by definition)
(..+?)\1+
: The second part of the alternation, a variation of the regex explained above, matches String
of length n
that is "a multiple" of a String
of length k >= 2
(i.e. n
is a composite, NOT a prime).
- Note that the reluctant modifier
?
is actually not needed for correctness, but it may help speed up the process by trying smaller k
first
Note the !
boolean
complement operator in the return
statement: it negates the matches
. It's when the regex DOESN'T match, n
is prime! It's a double-negative logic, so no wonder it's kind of confusing!!
Simplification
Here's a simple rewriting of the code to make it more readable:
public static boolean isPrime(int n) {
String lengthN = new String(new char[n]);
boolean isNotPrimeN = lengthN.matches(".?|(..+?)\\1+");
return !isNotPrimeN;
}
The above is essentially the same as the original Java code, but broken apart into multiple statements with assignments to local variables to make the logic easier to understand.
We can also simplify the regex, using finite repetition, as follows:
boolean isNotPrimeN = lengthN.matches(".{0,1}|(.{2,})\\1+");
Again, given a String
of length n
, filled with the same char
,
.{0,1}
checks if n = 0,1
, NOT prime
(.{2,})\1+
checks if n
is a proper multiple of k >= 2
, NOT prime
With the exception of the reluctant modifier ?
on \1
(omitted for clarity), the above regex is identical to the original.
More fun regex
The following regex uses similar technique; it should be educational:
System.out.println(
"OhMyGod=MyMyMyOhGodOhGodOhGod"
.replaceAll("^(.+)(.+)(.+)=(\\1|\\2|\\3)+$", "$1! $2! $3!")
); // prints "Oh! My! God!"
See also
return !new
does? – Jacquelynejacquelynn!new String(new char[n]).matches(".?|(..+?)\\1+")
is equivalent to!((new String(new char[n])).matches(".?|(..+?)\\1+"))
. – GuesswarpisNotPrime n = n<2 || any (\i-> rem n i==0) [2..div n 2]
. – Gratitude