What is the algorithm for generating a random Deterministic Finite Automata?
Asked Answered
A

7

4

The DFA must have the following four properties:

  • The DFA has N nodes

  • Each node has 2 outgoing transitions.

  • Each node is reachable from every other node.

  • The DFA is chosen with perfectly uniform randomness from all possibilities

This is what I have so far:

  1. Start with a collection of N nodes.
  2. Choose a node that has not already been chosen.
  3. Connect its output to 2 other randomly selected nodes
  4. Label one transition 1 and the other transition 0.
  5. Go to 2, unless all nodes have been chosen.
  6. Determine if there is a node with no incoming connections.
  7. If so, steal an incoming connection from a node with more than 1 incoming connection.
  8. Go to 6, unless there are no nodes with no incoming connections

However, this is algorithm is not correct. Consider the graph where node 1 has its two connections going to node 2 (and vice versa), while node 3 has its two connection going to node 4 (and vice versa). That is something like:

1 <==> 2

3 <==> 4

Where, by <==> I mean two outgoing connections both ways (so a total of 4 connections). This seems to form 2 cliques, which means that not every state is reachable from every other state.

Does anyone know how to complete the algorithm? Or, does anyone know another algorithm? I seem to vaguely recall that a binary tree can be used to construct this, but I am not sure about that.

Amuse answered 2/4, 2011 at 20:24 Comment(5)
Are you sure you want all nodes to be reachable from every node? It's much simpler to make one where all nodes are reachable from the start node.Hugh
Yes, I'm sure I need all nodes to be reachable from every node.Amuse
A randomly constructed deterministic finite automata (DFA). The DFA is deterministic. It is chosen randomly, with perfectly uniform distribution, from all possible DFA with N nodes.Amuse
That is, it is chosen randomly from all DFA with the four properties I mentioned in my question.Amuse
Would you say this is really a graph problem? Since there is a graph solution or two below, I'm just curious of the math.SE would be a better place to ask if these didn't work out.Winnie
M
4

Strong connectivity is a difficult constraint. Let's generate uniform random surjective transition functions and then test them with e.g. Tarjan's linear-time SCC algorithm until we get one that's strongly connected. This process has the right distribution, but it's not clear that it's efficient; my researcher's intuition is that the limiting probability of strong connectivity is less than 1 but greater than 0, which would imply only O(1) iterations are necessary in expectation.

Generating surjective transition functions is itself nontrivial. Unfortunately, without that constraint it is exponentially unlikely that every state has an incoming transition. Use the algorithm described in the answers to this question to sample a uniform random partition of {(1, a), (1, b), (2, a), (2, b), …, (N, a), (N, b)} with N parts. Permute the nodes randomly and assign them to parts.

For example, let N = 3 and suppose that the random partition is

{{(1, a), (2, a), (3, b)}, {(2, b)}, {(1, b), (3, a)}}.

We choose a random permutation 2, 3, 1 and derive a transition function

(1, a) |-> 2
(1, b) |-> 1
(2, a) |-> 2
(2, b) |-> 3
(3, a) |-> 1
(3, b) |-> 2
Mango answered 3/4, 2011 at 19:20 Comment(7)
Note that this is assuming distinguishable states. For indistinguishable states, use Brendan McKay's nauty to compute the number of automorphisms A and then accept the graph with probability 1/A if it is strongly connected. Note that A is almost always 1, so this is (a) still efficient (b) not much different in distribution from distinguishable states.Mango
Note also that although Jeremiah Willcock's algorithm is exponential-time, computers are really fast these days and you should avoid enforcing surjectivity until you know that N is too large.Mango
I am referring to indistinguishable states. So, what graph are you saying I need to accept with 1/A probability? The graph produced by the algorithm you gave above?Amuse
Could Ono's recent discovery with the completion of the theory behind the partition function speed this up? Didn't he come up with a way to determine the number of partitions of any number relatively easily?Amuse
@Jeff B. I doubt it, since the linked algorithm really needs # of partitions with at most k parts, which destroys the symmetries. If you're using nauty, then nauty is likely to be the bottleneck anyway. If you forgo nauty, then there are probably better sampling algorithms (but that's probably best left for a separate question).Mango
@qrqwe: Thanks. If A is almost always 1 it shouldn't make that big of a difference for what I am doing.Amuse
@Jeff B In that case I expect that there's a Boltzmann sampler for the partitions running in time o(n^2). If that's still not fast enough and you're willing to settle for more approximation, you might try MCMC.Mango
P
1

In what follows I'll use the basic terminology of graph theory.

You could:

  1. Start with a directed graph with N vertices and no arcs.
  2. Generate a random permutation of the N vertices to produce a random Hamiltonian cycle, and add it to the graph.
  3. For each vertex add one outgoing arc to a randomly chosen vertex.

The result will satisfy all three requirements.

Periodontal answered 2/4, 2011 at 20:43 Comment(7)
Yea, I thought about that, but is it really random if I add in a Hamiltonian cycle? I need it to be chosen with perfectly uniform randomness from all possibilities.Amuse
That's a very strong requirement. If that's what you really need, you ought to clearly state it in your question.Periodontal
There are DFAs which meet the requirements and have no hamiltonian path. Consider the graph with edges {ab,ba,bc,cb,aa,cc}.Hugh
Yes, but they all have a directed cycle.Amuse
For the one you gave, one possibility would be: (ab,bc,cb,ba)Amuse
However, I think that's a problem, because the directed cycles don't occur in equal numbers like Hamiltonian cycles. Thus, you would have to use the right probability distribution to select a directed cycle length.Amuse
In particular, for a directed cycle of length N, there are N^2 possible graphs (for each node, we randomly choose another node to connect it to). However, for a directed cycle of path length N+1, I think it turns out to be (N-1)*N possible graphs (because 1 node must have 2 outgoing connections already).Amuse
L
1

There is a expected running time O(n^{3/2}) algorithm.

If you generate a uniform random digraph with m vertices such that each vertex has k labelled out-arcs (a k-out digraph), then with high probability the largest SCC (strongly connected component) in this digraph is of size around c_k m, where c_k is a constant depending on k. Actually, there is about 1/\sqrt{m} probability that the size of this SCC is exactly c_k m (rounded to an integer).

So you can generate a uniform random 2-out digraph of size n/c_k, and check the size of the largest SCC. If its size is not exactly n, just try again until success. The expected number of trials needed is \sqrt{n}. And generating each digraph should be done in O(n) time. So in total the algorithm has expected running time O(n^{3/2}). See this paper for more details.

Levitate answered 24/4, 2015 at 14:20 Comment(0)
H
0

Just keep growing a set of nodes which are all reachable. Once they're all reachable, fill in the blanks.

Start with a set of N nodes called A.
Choose a node from A and put it in set B.
While there are nodes left in set A
    Choose a node x from set A
    Choose a node y from set B with less than two outgoing transitions.
    Choose a node z from set B
    Add a transition from y to x.
    Add a transition from x to z
    Move x to set B
For each node n in B
    While n has less than two outgoing transitions
         Choose a node m in B
         Add a transition from n to m
Choose a node to be the start node.
Choose some number of nodes to be accepting nodes.

Every node in set B can reach every node in set B. As long as a node can be reached from a node in set B and that node can reach a node in set B, it can be added to the set.

Hugh answered 2/4, 2011 at 20:48 Comment(11)
@Jeff B: I am pretty sure. I can't see any reason my invariant wouldn't be true.Hugh
Will is always be the case that y exists in the set B?Amuse
@Jeff: The most recently added node always has exactly one outgoing transition, so yes there will always be a candidate for y.Hugh
I want to believe that this works but I'm pretty skeptical about whether this gives you a uniform distribution over DFAs with the required property. Can you show why this is guaranteed to give back the right probability distribution?Carolus
@templatetypedef: We'd need to have a good definition of what exactly a distinct DFA is before I could start trying to prove that.Hugh
Hmm... that's a good point. I think that we can just treat the DFAs as tuples of (States, Transitions, Start State, Accepting States). There will be a lot of permutations that end up being the same DFA since we can permute the states, but I think that this is okay because the non-isomorphic DFAs will all show up with the same multiplicity (I think...)Carolus
@Jeff Symmetries in the graph can make permutations of the nodes be equal though.Hugh
Now that I think about it, I'm not so sure about the original construction. By forcing the node into set B, it seems like the first nodes that enter B will tend to have more connections. Thus, the distribution of connections wouldn't be approximately uniform, as you would expect.Amuse
I think aix's algorithm may be almost correct. However, rather than a Hamilonian cycle, you would choose random directed cycle that includes all N nodes, and does not violate the 2 outgoing connections rule.Amuse
Dang, I just realized there are some DFA's which meet the first three requirements which this algorithm can't construct. Consider the DFA with the edges {ab,bc,ca,aa,bb,cc}. I can't make that because to add a then b, I need the edges {ab,ba}.Hugh
Yea, I think this algorithm is seriously flawed. aix's algorithm seems almost right though, given the changes I mentioned.Amuse
A
0

The simplest way that I can think of is to (uniformly) generate a random DFA with N nodes and two outgoing edges per node, ignoring the other constraints, and then throw away any that are not strongly connected (which is easy to test using a strongly connected components algorithm). Generating uniform DFAs should be straightforward without the reachability constraint. The one thing that could be problematic performance-wise is how many DFAs you would need to skip before you found one with the reachability property. You should try this algorithm first, though, and see how long it ends up taking to generate an acceptable DFA.

Actinon answered 3/4, 2011 at 19:15 Comment(0)
T
0

We can start with a random number of states N1 between N and 2N.

Assume the initial state the as the state number 1. For each state, for each character in the input alphabet we generate a random transition (between 1 and N1).

We take the connex automaton starting from the initial state. We check the number of states, and after few tries we get one with N states.

If we wish a minimal automaton too, remains only the assignment of final states, however there are great chances that a random assignment gets a minimal automaton as well.

Transmigrate answered 30/3, 2012 at 16:3 Comment(0)
R
0

The following references seem to be relevant to your question:

F. Bassino, J. David and C. Nicaud, Enumeration and random generation of possibly incomplete deterministic automata, Pure Mathematics and Applications 19 (2-3) (2009) 1-16.

F. Bassino and C. Nicaud. Enumeration and Random Generation of Accessible Automata. Theor. Comp. Sc.. 381 (2007) 86-104.

Recommit answered 24/8, 2013 at 8:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.