I'm not sure when ereg will be removed, but my bet is as of PHP 6.0.
Regarding your second issue (translating ereg to preg), that doesn't seem all that hard. If your application has more than 1 million lines, surely you have the resources to get someone doing this job for a week at most. I would grep all the ereg_ instances in your code and set up some macros in your favorite IDE (simple stuff like adding delimiters, modifiers and so on).
I bet most of the 1768 regexes can be ported using a macro, and the others, well, a good pair of eyes.
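As a rough sketch of the "grep" step, a few lines of PHP can list where the POSIX functions are called. The name find_ereg_calls and the list of function names in the pattern are my own choices for illustration, and this does not attempt to catch dynamic calls (call_user_func and the like) or method calls that happen to share a name:

```php
<?php
// Hypothetical helper: report each line of $source that calls a POSIX regex function.
function find_ereg_calls($source)
{
    $hits = array();
    // \R matches any newline sequence, so Windows and Unix files both work.
    $lines = preg_split('/\R/', $source);
    foreach ($lines as $i => $line) {
        // \b keeps preg_split(), mb_ereg() etc. from matching: '_' is a word character.
        if (preg_match('/\b(eregi?(?:_replace)?|spliti?|sql_regcase)\s*\(/', $line, $m)) {
            $hits[] = array('line' => $i + 1, 'function' => $m[1]);
        }
    }
    return $hits;
}
```

Feed it file_get_contents() of each source file and you have a worklist to go through with the macros. It reports only the first call it finds on each line.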
Another option might be to write wrappers around the ereg functions for when they are not available, implementing the changes as needed:
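    // Define the POSIX functions only when the extension is not loaded.
    // Note: these return preg_match()'s 1/0, not ereg()'s match length/false.
    if (function_exists('ereg') !== true)
    {
        // $regs is optional in the real ereg(), so give it a default here too.
        function ereg($pattern, $string, &$regs = null)
        {
            // '~' is the delimiter, so escape any '~' inside the pattern.
            return preg_match('~' . addcslashes($pattern, '~') . '~', $string, $regs);
        }
    }

    if (function_exists('eregi') !== true)
    {
        function eregi($pattern, $string, &$regs = null)
        {
            // Same as above, with the i modifier for case-insensitive matching.
            return preg_match('~' . addcslashes($pattern, '~') . '~i', $string, $regs);
        }
    }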
You get the idea. The PEAR package PHP_Compat might be a viable solution too.
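If existing code relies on ereg's return value, the wrapper has to mimic that as well: ereg returns the length of the matched string when the third argument is passed, 1 when it is not passed or the match is empty, and false when nothing matches. A minimal sketch of that, using the made-up name ereg_compat so it can be tried next to the real function (it makes no attempt to reproduce how ereg fills $regs for unmatched groups):

```php
<?php
// Hypothetical name; not part of PHP or of the wrappers above.
function ereg_compat($pattern, $string, &$regs = null)
{
    // '~' is the delimiter, so escape any '~' inside the pattern.
    $matched = preg_match('~' . addcslashes($pattern, '~') . '~', $string, $m);
    if ($matched !== 1) {
        return false;   // no match, or the pattern failed to compile
    }
    if (func_num_args() < 3) {
        return 1;       // caller did not ask for the matches
    }
    $regs = $m;
    $length = strlen($m[0]);
    return $length > 0 ? $length : 1;
}
```

For example, ereg_compat('b+', 'aabbbcc', $regs) returns 3 and leaves 'bbb' in $regs[0].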
Differences from POSIX regex
As of PHP 5.3.0, the POSIX Regex extension is deprecated. There are a number of differences between POSIX regex and PCRE regex. This page lists the most notable ones that are necessary to know when converting to PCRE.
- The PCRE functions require that the pattern is enclosed by delimiters.
- Unlike POSIX, the PCRE extension does not have dedicated functions for case-insensitive matching. Instead, this is supported using the /i pattern modifier. Other pattern modifiers are also available for changing the matching strategy.
- The POSIX functions find the longest of the leftmost match, but PCRE stops on the first valid match. If the string doesn't match at all it makes no difference, but if it matches it may have dramatic effects on both the resulting match and the matching speed. To illustrate this difference, consider the following example from "Mastering Regular Expressions" by Jeffrey Friedl. Using the pattern one(self)?(selfsufficient)? on the string oneselfsufficient with PCRE will result in matching oneself, but using POSIX the result will be the full string oneselfsufficient. Both (sub)strings match the original string, but POSIX requires that the longest be the result.
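The quoted differences are easy to see on the PCRE side with preg_match. The last pattern is only one possible way I would reorder things to get the POSIX-style longest match back; it is not something the manual prescribes:

```php
<?php
$subject = 'oneselfsufficient';

// PCRE needs delimiters ('/' here) and stops at the first match it finds.
preg_match('/one(self)?(selfsufficient)?/', $subject, $m);
$first = $m[0];   // 'oneself'

// There is no PCRE twin of eregi(); append the i modifier instead.
$caseInsensitive = preg_match('/ONE(self)?/i', $subject);   // 1

// Possible workaround (an assumption, not from the manual): put the longer alternative first.
preg_match('/one(selfsufficient|self)?/', $subject, $m2);
$longest = $m2[0];   // 'oneselfsufficient'
```

Patterns that depend on the longest-match rule are the ones a macro will not port correctly, so those are worth a pair of eyes.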
ereg has been discouraged since around PHP 4.1, but solely because it is not as optimized as the PCRE functions. It's not overly likely to be removed anytime soon (not with the mythical PHP6 anyway), and even then it would be simple to write a runtime compatibility wrapper (which you should do for testing). You should list some examples of why you think your POSIX extended regular expressions would be incompatible. The differences are seldom significant. – Decompose
ereg to preg; RegexBuddy for example does this and supports COM automation. Depending on the complexity of your regexes, and whether you actually need to migrate, that might be a relevant option. – Centre