How to convert NFA to Regular Expression
Asked Answered
C

2

12

I knew that converting a regular expression to a NFA, there is a algorithm.

But I was wondering if there is a algorithm to convert a NFA to regular expression. If there is, what is it?

And if there isn't, I am also wondering if all NFA can convert to a regular expression. Is there a NFA that a regular expression that cannot represent?

Thank you! :D

Cancroid answered 9/2, 2012 at 5:16 Comment(3)
A regular expression can express any regular language, so there should exist at least one regular expression for each possible NFA. However, I don't know an algorithm for going from an NFA to a regular expression off the top of my head.Executive
Also, your timing is actually eerie--my friend asked me this exact same question in class today. I didn't remember the answer then either :(Executive
See a variety of answers to your question here: cs.stackexchange.com/questions/2016/…Waitabit
C
8

Here is an algorithm where each transition is incrementally replaced with a regex, until there is only an initial and final state: https://courses.engr.illinois.edu/cs373/sp2009/lectures/lect_08.pdf [PDF]

Chibouk answered 9/2, 2012 at 5:43 Comment(2)
it is correct but it is not work if you had a lots of node, it's good for a simple and a few nodes,is there any other suggestion?Canzone
Link is dead for me. I found this link thought : courses.engr.illinois.edu/cs373/sp2009/lectures/lect_08.pdfGermanize
M
2

An NFA can be written as a system of inequalities (over a Kleene algebra), with one variable for each state, one inequality q ≥ 1 for each final state q and one inequality q ≥ x r for each transition on x from state q to state r. This is a right-affine fixed point system, over a Kleene algebra, whose least fixed point solution gives you, for any q, the regular expression recognized by the NFA that has q as the start state. The system can be collated, so that all the inequalities q ≥ A, q ≥ B, ..., q ≥ Z, for each given q, are combined into q ≥ A + B + ... Z. The result is a matrix system 𝐪 ≥ 𝐚 + H 𝐪, where 𝐪 is the vector of all the variables, 𝐚 the vector of the affine coefficients - 0's for non-final states, 1's for final states, but those details are not important for what follows; and H is the matrix of all the linear coefficients.

To solve a right-affine system, do so one variable at a time. In Kleene algebra, the least fixed point solution to x ≥ a + bx is x = b* a. This applies both singly and matrix-wise, so that the least fixed point solutuion to 𝐪 ≥ 𝐚 + H 𝐪, in matrix form is 𝐪 = H* 𝐚.

Matrices over Kleene algebras form a Kleene algebras, with matrix addition and matrix multiplication, respectively, for the sum and product operators and the matrix star for the Kleene star. Finding the matrix star is one and the same process as solving the corresponding system 𝐪 ≥ 𝐚 + H 𝐪.

A Generic Example:
Consider the NFA with states q, r, s, with q the start state and s the only final state, and with transitions:

    a: q → q, b: q → r, c: q → s,
    d: r → q, e: r → r, f: r → s,
    g: s → q, h: s → r, i: s → s.

Let (x, y, z) = (0, 0, 1) denote the corresponding affine coefficients. Then, the corresponding right-affine system is:

    q ≥ x + a q + b r + c s,
    r ≥ y + d q + e r + f s,
    s ≥ z + g q + h r + i s.

Solve for s, first, to obtain

    s = i* (z + g q + h r) = i* z + i* g q + i* h r.

Substitute in the other inequalities to get:

    q ≥ x + c i* z + (a + c i* g) q + (b + c i* h) r,
    r ≥ y + f i* z + (d + f i* g) q + (e + f i* h) r.

Rewrite this as

    q ≥ x' + a' q + b' r,
    r ≥ y' + d' q + e' r,

where

    x' = x + c i* z, a' = a + c i* g, b' = b + c i* h,
    y' = y + f i* z, d' = d + f i* g, e' = e + f i* h.

Solve for r to get

    r = e'* (y' + d' q) = e'* y' + e'* d' q.

Substitute into the inequality for q to get

    q ≥ (x' + b' e'* y') + (a' + b e'* d') q

and rewrite this as

    q ≥ x" + a" q

where

    x" = x' + b' e'* y', a" = a' + b e'* d'.

Finally, solve this for q to get

    q = a"* x".

This is also the general form for that embodies the generic fail-safe solution for NFA's with 3 states.

Since q is the start state, then a"* x" is the regular expression sought for, with a", x", a', b', d', e', x', y', x, y and z as indicated above. If you try to in-line substitute them all, the expression will blow up to a size that is exponential in the number of states and will be large in size even for three states.

An Optimized Example:
Consider the system for the NFA whose states are q, r, s, with q the start state, s the final state, and the transitions

    a: q → r, a: q → q, b: q → q, b: q → s, a: s → s, b: s → s

The corresponding right-affine system is

    q ≥ a r + a q + b q
    r ≥ b s
    s ≥ 1 + a s + b s

Solve for s first:

    s ≥ 1 + a s + b s = 1 + (a + b) s ⇒ s = (a + b)*.

Substitute into the inequality for r and solve to find the least fixed point:

    r ≥ b (a + b)* ⇒ r = b (a + b)*.

Finally, substitute into the inequality for q and solve to find the least fixed point:

    q ≥ a b (a + b)* + (a + b) q ⇒ q = (a + b)* a b (a + b)*.

The resulting regular expression is (a + b)* a b (a + b)*. So, with chess-playing strategizing, simpler and optimal forms for the solution can be found.

Murther answered 28/10, 2022 at 18:45 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.