Convert finite state machine to regular expression
Asked Answered
D

5

13

Is there a tool (or an algorithm) to convert a finite state machine into a regular expression?

(not the other way around, that would be easy).

Dougald answered 25/4, 2016 at 23:50 Comment(6)
You'd have to de-compile the regex object...Kat
FSM into regex is actually quite simple (and simpler than the reverse, imo). What's your attempt?Bilbo
@Bilbo Any algorithm that you suggest?Dougald
@enrico.bacis: Try to assign each state a regex that matches the rest of the string. Start at the accepting states with empty regexes, then do a reverse graph traversal. Use regex concatenation for each transition, and regex alternation if there are multiple outgoing edges. If you find a circle, add a repetition. After having assigned a regex to each state, use the regex of the start state. Could that work? It's a naive approach I just made up of thin air.Bilbo
Do you have a particular FSM you want to convert? Or is this a general theory thing?Dorindadorine
There are well-known algorithms. See cs.stackexchange.com/questions/2016/…Sweetmeat
S
18

There are several algorithms to perform this task: the "state-elimination method" from Brzozowski and Mc Cluskey, the resolution of a system of linear equation, the method from McNaughton and Yamada, etc. They are very well described in Automata and rational expressions by Jacques Sakarovitch.

The state-elimination method in particular is simple to understand. The key idea is that you are going to build an automaton labeled by rational (aka regular) expressions rather than letters. First, make sure you have a single initial state and a single final state (you may add fresh states and spontaneous transitions if necessary). Then choose a state s to eliminate, say state 1 in the following picture.

A simple automaton

Then consider all the couples (p, q) where p is a predecessor (states from which a transition reaches s, 0 in the picture) and q a successor (state 2). For each such couple (p, q) add a transition from p to q which is labeled by E(p, q) + E(p, s)E(s, s)*E(s, q) where E(p, s) means "the expression that labels the transition from p to s. Once you treated all the couple (p, q), remove the state s. In the previous example:

An automaton labeled by regexps

Do that until you eliminated all the inner states (i.e., keep the initial state and the final state), and just read the result on the transition from the initial state to the final state (here d+ab*c).

You may toy with this algorithm using Vcsn, a tool for rational expressions and automata. Here is a complete example you may reproduce at Vcsn Sandbox.

A complete run of the state elimination method

Stroh answered 27/4, 2016 at 7:57 Comment(0)
D
4

I believe the best tool I have used is greenery. It is a FSM/regex conversion library for python. You can read more about the library here and the algorithm used is well described here.


Example

FSM

The model that can be found on the website can be converted like this:

from greenery import fsm, lego

A, B, C, D, E = range(5)
a, b = 'a', 'b'

# create the FSM
machine = fsm.fsm(
    alphabet = {a, b},
    states   = {A, B, C, D, E},
    initial  = A,
    finals   = {C, E},
    map      = {
            A : {a: B, b: D},
            B : {a: C, b: E},
            C : {a: C, b: E},
            D : {a: B, b: D},
            E : {a: B, b: D}
    },
)

# convert it to regex
rex = lego.from_fsm(machine)

The output is the following:

>>> print(machine)

  name final? a b
------------------
* 0    False  1 3
  1    False  2 4
  2    True   2 4
  3    False  1 3
  4    True   1 3

>>> print(rex)

b*a((a*(a|b+))?ba)*(a+b?|b)

Note

The version on PYPI has some problem with assertions, and the github version has some problems with the memory, but the combination python 3.x + github version is awesome.

Dougald answered 26/4, 2016 at 14:7 Comment(1)
I wonder how can one understand from the graf in the image that C and E are final (terminal states)?Hertz
D
3

You can use fsm2regex, an online tool that does the job.

Dougald answered 26/4, 2016 at 0:30 Comment(2)
Type in a regex, you'll get a feel for the FSM syntax.Dorindadorine
@AustinHastings: I meant "What algorithm does it use to create the regex?", of course, not "How do you use it?".Bilbo
T
2

I have implemented the state elimination algorithm for the http://www.brics.dk/automaton/ Java package. The implementation is based on the algorithm illustrated in Sipser, Michael. Introduction to the Theory of Computation. Vol. 2. Boston: Thomson Course Technology, 2006.

You can check it out at https://github.com/julianthome/autorex. Would be happy to get some feedback.

Trantrance answered 24/9, 2016 at 19:27 Comment(2)
How can one pass a description of a SM (states, transitions) into some class and get a regex?Hertz
If you can provide your automaton as a dk.brics.Automaton it works out of the box. You can find some example test-cases here: github.com/julianthome/autorex/blob/master/src/test/java/org/…. In case you have a custom data-structure, you could translate it into an GNFA (github.com/julianthome/autorex/blob/master/src/main/java/org/…) which can be passed on to the StateEliminator (github.com/julianthome/autorex/blob/master/src/main/java/org/…).Trantrance
D
1

You don't specify what you're doing, but you might want to know there is a tool called Ragel that specializes in FSMs. It generates code for a slew of languages, and when I looked a few years ago, it wasn't too hard to port the machines to other languages.

Dorindadorine answered 26/4, 2016 at 0:25 Comment(3)
This sounds like the inverse of what the OP wants.Bilbo
He wants to take a FSM and turn it into a regex - presumably because he wants to execute it without having to spend a bunch of time writing framework code.Dorindadorine
Ah, sure, but you should add that to your answer if the output is not a regex but an executable program.Bilbo

© 2022 - 2024 — McMap. All rights reserved.