Checking if two Python regex patterns are equivalent

I want to write a regex in re.VERBOSE mode, but I'm not confident that I can add the verbose parts (whitespace and comments) without introducing an error.

I remember that, in theory, the equivalence of two regexes (without backreferences, at least) can be decided by building an automaton for each, minimizing both, and checking whether the resulting graphs are isomorphic. But I can't see any method on compiled pattern objects for comparing regexes.

Is there a way to either generate the automaton of a regex or directly compare them, preferably with the standard library?

(I've already decided on a different solution to my problem, but this is still of interest to me.)

Execrative asked 28/1, 2014 at 6:16

You can use the sparsely documented re.DEBUG flag, which prints the parsed form of a pattern when it is compiled:

>>> import re
>>> r1 = re.compile("foo[bar]baz", re.DEBUG)
literal 102
literal 111
literal 111
in
  literal 98
  literal 97
  literal 114
literal 98
literal 97
literal 122
>>> r2 = re.compile("""foo   # foo!
...                    [bar] # b or a or r!
...                    baz   # baz!""", re.VERBOSE|re.DEBUG)
literal 102
literal 111
literal 111
in
  literal 98
  literal 97
  literal 114
literal 98
literal 97
literal 122

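If the output is identical, r1 and r2 were parsed into the same structure and will match the same strings. The converse does not hold: two patterns can print different output and still be equivalent.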
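To do the comparison in code rather than by eye, one option is to capture what re.DEBUG prints. This is only a minimal sketch, not part of the re API: the helper names debug_dump and same_debug_output are made up for illustration, it assumes Python 3.4+ (for contextlib.redirect_stdout), and it assumes the debug output goes to sys.stdout. It calls re.purge() first so that a cached pattern cannot suppress the output.

```python
import contextlib
import io
import re


def debug_dump(pattern, flags=0):
    """Return the text re.compile() prints for `pattern` under re.DEBUG.

    Hypothetical helper, not part of the re module.
    """
    re.purge()  # clear the compile cache so nothing is served from it
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        re.compile(pattern, flags | re.DEBUG)
    return buffer.getvalue()


def same_debug_output(pattern_a, flags_a, pattern_b, flags_b):
    """True if both patterns produce identical re.DEBUG output."""
    return debug_dump(pattern_a, flags_a) == debug_dump(pattern_b, flags_b)


plain = "foo[bar]baz"
verbose = """foo   # foo!
             [bar] # b or a or r!
             baz   # baz!"""

# Compare the one-line pattern with its re.VERBOSE rewrite.
print(same_debug_output(plain, 0, verbose, re.VERBOSE))
```

A False result only says the two patterns were parsed differently; it does not prove that they match different strings.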

Weston answered 28/1, 2014 at 6:23 Comment(5)
More underdocumented than documented. Also, while trying to write a function to check equality of regexes, I found that, because re.compile caches its results, re.DEBUG might not give any output. It also isn't a test of theoretical equivalence of regexes, so this only works for re.VERBOSE changes. Here is my implementation, with examples: pastebin.com/DeCWLmF8 (Feel free to add from this comment to your answer.) - Execrative
I'm disappointed that re doesn't save the debug output, and that I can't force a recompile with re.DEBUG. - Execrative
Raised an issue about re.DEBUG not forcing a recompile: bugs.python.org/issue20426 - Execrative
@leewangzhong: consider raising a bug for adding a method for this like re.compile(ur'yada').equivalent(re.compile(ur'yada')) :) - Alundum
You mean a feature request? - Execrative
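On the point that matching debug output is not the same thing as theoretical equivalence: the standard library does not expose the automaton, but a bounded brute-force comparison can at least hunt for counterexamples. This is a different technique from the automaton approach in the question, and only a rough sketch. The function name agree_up_to, the alphabet, and the length limit are all assumptions chosen for illustration. A result of None means "no disagreement found within the bound", not "proven equivalent".

```python
import itertools
import re


def agree_up_to(pattern_a, pattern_b, alphabet, max_len, flags_a=0, flags_b=0):
    """Return the first string on which the two patterns disagree, or None.

    Hypothetical helper: tries every string over `alphabet` with length
    <= max_len and compares whether each pattern matches the whole string.
    """
    rx_a = re.compile(pattern_a, flags_a)
    rx_b = re.compile(pattern_b, flags_b)
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            matched_a = rx_a.fullmatch(candidate) is not None
            matched_b = rx_b.fullmatch(candidate) is not None
            if matched_a != matched_b:
                return candidate
    return None


# Parsed differently (capturing vs. non-capturing group), same language.
print(agree_up_to(r"(ab)*", r"(?:ab)*", "ab", 6))  # None

# Genuinely different: only the first pattern accepts a bare "b".
print(agree_up_to(r"a*b", r"a+b", "ab", 4))  # b
```

The number of candidates grows as len(alphabet) ** max_len, so this is only practical for small alphabets and short strings.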
