Minimize a DFA with don't care transitions

Asked 3/7, 2019 at 12:48 Answered 6/7, 2019 at 13:3

I have a DFA (Q, Σ, δ, q₀, F) with some “don't care transitions.” These transitions model symbols which are known not to appear in the input in some situations. If any such transition is taken, it doesn't matter whether the resulting string is accepted or not.

Is there an algorithm to compute an equivalent DFA with a minimal amount of states? Normal DFA minimisation algorithms cannot be used as they don't know about “don't care” transitions and there doesn't seem to be an obvious way to extend the algorithms.

Frisco answered 3/7, 2019 at 12:48 Comment(13)

If you don't care about the "don't care" transitions, then why can't you just remove them? – Subternatural 3/7, 2019 at 13:10

@MattTimmermans Because a DFA has a transition for every combination of state and symbol. You can't have “no transition” and the minimization algorithms expect this to be the case. – Frisco 3/7, 2019 at 13:35

If the symbol never occurs(in a state) you can remove the transition for that {state,symbol}(or:you could replace al the unreachable states by one or more error states) – Erg 3/7, 2019 at 13:43

@Erg And replace it with what? Recall, the transition function is complete and must contain an entry for each combination of state and symbol. If I was to change that, the question how to adapt DFA minimization algorithms stays. – Frisco 3/7, 2019 at 13:45

The transition can use a smaller domain than the Carthesian product {states X symbols} – Erg 3/7, 2019 at 13:47

Doesn't Brzozowski's algorithm still work? – Pyosis 3/7, 2019 at 13:50

@Erg Then the Myhill-Nerode relation is either no longer an equivalence relation (if we take “no transition” as equivalent to any other transition) or it marks states with a missing transition as distinct from all states where this transition is present, negating the whole point of “don't care” transitions. How do you fix this? – Frisco 3/7, 2019 at 13:53

@harold Actually, on second thought, I don't think Brzozowski's algorithm is going to work. How would you model “don't care” transitions for it to be used? – Frisco 3/7, 2019 at 14:13

@Frisco well I thought, for a don't-care edge, don't add its reversal to the NFA. It's still an NFA with that, so it doesn't immediately violate a precondition.. I'm not entirely sure what it does to the result though – Pyosis 3/7, 2019 at 14:18

@harold That would be like turning all “don't care” transitions into “reject” transitions. For the DFA I have, I know that it cannot be minimised any further if I do exactly this. I thought about this, too, when I said that it might be doable, but on second thought I think it doesn't work. – Frisco 3/7, 2019 at 14:21

(Sorry if this comment is naive, I studied DFA 40 years ago) I don't see the difference with what you call "a DFA which doesn't care some transitions" and an incomplete DFA. The usual way to complete a DFA is to add a sink state corresponding to the missing transitions. You can minimize such a completed DFA. Then you can remove the sink state from the resulting minimized DFA. In your previous comment, you mentioned it is not optimal for your DFA. I am surprised. Could you provide a DFA example where such a simple method will fail? – Nonsectarian 5/7, 2019 at 19:58

@Nonsectarian This is equal to rejecting all "don't care" strings. I'd be surprised if this among all possible choices would be best. For an example: consider a DFA matching a single word of length n and noot caring about all others. The "sink state" approach generates a DFA with n + 2 states while accepting all don't care strings results in a DFA with a single state that just accepts everything. – Frisco 5/7, 2019 at 22:8

@Nonsectarian Note that simply accepting all don't care strings doesn't work either: suppose you have a DFA that rejects all strings except for one word of n characters. If the word is matched, a “don't care” transition occurs. If we add an accepting sink to the automaton and minimize it, the result has n + 1 states. If we instead reject all words from this don't care, the automaton has a single state, rejecting all strings. – Frisco 5/7, 2019 at 23:34

I think this problem is NP-hard (more on that in a bit). This is what I'd try.

(Optional) Preprocess the input via the usual minimization algorithm with accept/reject/don't care as separate outcomes. (Since don't care is not equivalent to accept or reject, we get the Myhill–Nerode equivalence relation back, allowing a variant of the usual algorithm.)
Generate a conflict graph as follows. Start with all edges between accepting and rejecting states. Take the closure where we iteratively add edges q₁—q₂ such that there exists a symbol s for which there exists an edge σ(q₁, s)—σ(q₂, s).
Color this graph with as few colors as possible. (Or approximate.) Lots and lots of coloring algorithms out there. PartialCol is a good starting point.
Merge each color class into a single node. This potentially makes the new transition function multi-valued, but we can choose arbitrarily.

With access to an alphabet of arbitrary size, it seems easy enough to make this reduction to coloring run the other way, proving NP-hardness. The open question for me is whether having a fixed-size alphabet constrains the conflict graph in such a way as to make the resulting coloring instance easier somehow. Alas, I don't have time to research this.

Quentinquercetin answered 6/7, 2019 at 13:3 Comment(3)

There are potentially many DC (or sink) states, one for each dont care case. Could you detail how these DC states are handled in step 2? Besides, it would be nice if you could illustrate your answer with a small example – Nonsectarian 7/7, 2019 at 16:11

@Nonsectarian DC states exist initially as isolated vertices, but they end up with an edge to each other vertex whenever there exists a string that can distinguish the two. – Quentinquercetin 7/7, 2019 at 22:34

Thanks for pointing this out. This is very similar to the NP-completeness proof I found in literature. – Frisco 8/7, 2019 at 14:6

I believe a slight variation of the normal Moore algorithm works. Here's a statement of the algorithm.

Let S be the set of states.
Let P be the set of all unordered pairs drawn from S.
Let M be a subset of P of "marked" pairs.
Initially set M to all pairs where one state is accepting and the other isn't. 
Let T(x, c) be the transition function from state x on character c.
Do
  For each pair z = <a, b> in P - M
    For each character c in the alphabet
      If <T(a, c), T(b, c)> is in M
        Add z to M and continue with the next z
Until no new additions to M

The final set P - M is a pairwise description of an equivalence relation on states. From it you can create a minimum DFA by merging states and transitions of the original.

I believe don't care transitions can be handled by never marking (adding to M) pairs based on them. That is, we change one line:

      If T(a, c) != DC and T(b, c) != DC and <T(a, c), T(b, c)> is in M

(Actually in an implementation, no real algorithm change is needed if DC is a reserved value of type state that's not a state in the original FA.)

I don't have time to think about a formal proof right now, but this makes intuitive sense to me. We skip splitting equivalence classes of states based on transitions that we know will never occur.

The thing I still need to prove to myself is whether the set P - M is still a pairwise description of an equivalence relation. I.e., can we end up with <a,b> and <b,c> but not <a,c>? If so, is there a fixup?

Phobe answered 4/7, 2019 at 0:41 Comment(3)

It's no longer an equivalence relation: consider a language matching xab, not caring about yab and rejecting zab. Both transitions from xa to xab and from za to zab relate to ya to yab, but they do ot relate to each other. This is what I meant when I said that the Myhill-Nerode relation is no longer an equivalence relation if you extend it to don't cares the obvious way (i.e. the way you chose, too). – Frisco 4/7, 2019 at 3:21

@Frisco Haha. Thanks. So the "fixup" would seem to be finding a partition into maximal cliques, which is NP hard, isn't it? Yikes. – Phobe 13/7, 2019 at 0:1

Yeah. I found an NP-completeness proof by reduction to graph-colouring in the meanwhile. – Frisco 13/7, 2019 at 1:3

Recommended topics

Hot tags