Regular expressions Equivalence
Asked Answered
H

4

16

Is there a way to find out if two arbitrary regular expressions are equivalent? Looks like complex problem to me, but there might be some DFA simplification mechanism or something?

Hognut answered 18/2, 2009 at 8:35 Comment(0)
A
10

To test equivalence you can compute the minimal DFAs for the expressions and compare them.

Accredit answered 18/2, 2009 at 9:35 Comment(3)
what do you mean by comparing two DFAs? Graph Isomorphism?Greyson
Since you have an initial state and the transitions are labeled and deterministic it is easy to check DFAs for equality, much easier than graph isomorphism. One depth-first traversal should suffice.Accredit
@Accredit Great link!Stuppy
K
10

Testability of equality is one of the classical properties of regular expressions. (N.B. This doesn't hold if you're really talking about Perl regular expressions or some other technically nonregular superlanguage.)

Turn your REs to generalised finite automata A and B, then construct a new automaton A-B such that the accepting states of A have null transitions to the start states of B, and that the accepting states of B are inverted. This gives you an automaton that accepts all those strings accepted by A, except for all those accepted by B.

Do the same for B-A, and reduce both to pure FAs. If an FA has no accepting states accessible from a start state then it accepts the empty language. If you can show that both A-B and B-A are empty, you've shown that A = B.

Edit Heh, I can't believe no one noticed the gigantic error there -- an intentional one, of course :-p

The automata A-B as described will accept those strings whose first half is accepted by A and whose second half is not accepted by B. Building the desired A-B is a slightly trickier process. I can't think of it off the top of my head, but I do know it's well-defined (and likely involves creating states to the represent the products of accepting states in A and non-accepting states in B).

Kew answered 18/2, 2009 at 10:3 Comment(1)
For A-B you want to use intersection and complement instead, which unfortunately aren't common in regex libraries, though they are implementable for 'real' regexes (instead of the Perl kind). (Neither is testing if a regex accepts nothing. Libraries suck, don't they?)Thiosinamine
P
2

This really depends on what you mean by regular expressions. As the other posters pointed out, reducing both expressions to their minimal DFA should work, but it only works for the pure regular expressions.

Some of the constructs used in the real world regex libs (backreferences in particular) give them power to express languages that aren't regular, so the DFA algorithm won't work for them. For example the regex : ([a-z]*) \1 matches a double occurence of the same word separated by a space (a a and b b but not b a nor a b). This cannot be recognized by a finite automaton at all.

Pompidou answered 18/2, 2009 at 10:5 Comment(0)
F
1

These two Perlmonks threads discuss this question (specifically, read blokhead's responses):

Frugivorous answered 18/2, 2009 at 8:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.