This question is an educational demonstration of the usage of lookahead, nested reference, and conditionals in a PCRE pattern to match ALL palindromes, including the ones that can't be matched by the recursive pattern given in the PCRE man page.
Examine this PCRE pattern in PHP snippet:
$palindrome = '/(?x)
^
(?:
(.) (?=
.*
(
\1
(?(2) \2 | )
)
$
)
)*
.?
\2?
$
/';
This pattern seems to detect palindromes, as seen in this test cases (see also on ideone.com):
$tests = array(
# palindromes
'',
'a',
'aa',
'aaa',
'aba',
'aaaa',
'abba',
'aaaaa',
'abcba',
'ababa',
# non-palindromes
'aab',
'abab',
'xyz',
);
foreach ($tests as $test) {
echo sprintf("%s '%s'\n", preg_match($palindrome, $test), $test);
}
So how does this pattern work?
Notes
This pattern uses a nested reference, which is a similar technique used in How does this Java regex detect palindromes?, but unlike that Java pattern, there's no lookbehind (but it does use a conditional).
Also, note that the PCRE man page presents a recursive pattern to match some palindromes:
# the recursive pattern to detect some palindromes from PCRE man page
^(?:((.)(?1)\2|)|((.)(?3)\4|.))$
The man page warns that this recursive pattern can NOT detect all palindromes (see: Why will this recursive regex only match when a character repeats 2n - 1 times? and also on ideone.com), but the nested reference/positive lookahead pattern presented in this question can.
java.util.regex.Pattern
, MSDN/.NET Framework 4/Regular Expression Language Elements, regular-expressions.info. No doubt there are better resources out there, but that's pretty much all I used. – Limpid