Convert regular expression to CFG

How can I convert a regular language to an equivalent context-free grammar (CFG)? Is it necessary to construct the DFA corresponding to the regular expression, or is there a rule for such a conversion?

For example, consider the following regular expression

01+10(11)*

How can I describe the grammar corresponding to the above RE?

Lorrielorrimer answered 14/4, 2010 at 17:11 Comment(1)
Wondering whether there are any open-source library implementations helpful for this task by now. (Ruddle)
  • Change A+B to grammar

    G -> A
    G -> B
    
  • Change A* to

    G -> (empty)
    G -> A G
    
  • Change AB to

    G -> AB
    

and proceed recursively on A and B. The base cases are the empty language (no productions) and a single symbol a (one production, G -> a).
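The three rules and the base cases can be sketched as a small recursive function. This is only an illustration under assumptions: the tuple encoding of the regular expression, the function name `regex_to_cfg`, and the generated nonterminal names `G0`, `G1`, ... are all made up for this example, and `+` is read as union.

```python
from itertools import count

# Hypothetical regex encoding used only for this sketch:
#   ("sym", "0")      a single terminal symbol
#   ("empty",)        the empty language (no productions)
#   ("union", a, b)   A+B
#   ("concat", a, b)  AB
#   ("star", a)       A*

def regex_to_cfg(node):
    """Return (start_symbol, productions); a production is (head, body_list)."""
    fresh = count()          # source of fresh nonterminal numbers
    productions = []

    def build(n):
        head = f"G{next(fresh)}"
        kind = n[0]
        if kind == "sym":
            productions.append((head, [n[1]]))        # G -> a
        elif kind == "empty":
            pass                                      # no productions at all
        elif kind == "union":
            a, b = build(n[1]), build(n[2])
            productions.append((head, [a]))           # G -> A
            productions.append((head, [b]))           # G -> B
        elif kind == "concat":
            a, b = build(n[1]), build(n[2])
            productions.append((head, [a, b]))        # G -> A B
        elif kind == "star":
            a = build(n[1])
            productions.append((head, []))            # G -> (empty)
            productions.append((head, [a, head]))     # G -> A G
        else:
            raise ValueError(f"unknown node kind: {kind!r}")
        return head

    return build(node), productions

# 01+10(11)*, with + read as union
sym = lambda c: ("sym", c)
example = ("union",
           ("concat", sym("0"), sym("1")),
           ("concat", ("concat", sym("1"), sym("0")),
                      ("star", ("concat", sym("1"), sym("1")))))

start, productions = regex_to_cfg(example)
for head, body in productions:
    print(head, "->", " ".join(body) or "(empty)")
```

The output is more verbose than a hand-written grammar because every subexpression gets its own nonterminal; inlining the single-production nonterminals gives something close to the grammar below.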

In your case (reading + as union):

 A -> 01
 A -> 10B
 B -> (empty)
 B -> 11B

If the language is described by a finite automaton:

  • use the states as nonterminal symbols
  • use the alphabet as the set of terminal symbols
  • add a production p -> aq for every transition p -> q on letter a in the original automaton
  • add a production p -> (empty) for every accepting state p
  • use the initial state as the start symbol of the grammar
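The automaton construction can be sketched in a few lines. The function names, the dictionary encoding of the transition table, and the example DFA (one possible automaton for 01+10(11)* with + read as union) are assumptions made for illustration. The sketch also emits p -> (empty) for each accepting state p, which is needed so that a derivation can terminate.

```python
def automaton_to_grammar(transitions, start, accepting):
    """Build a right-linear grammar from a finite automaton.

    transitions: dict mapping (state, letter) -> next state
    Returns (start_symbol, productions); a production is (head, body_list).
    """
    productions = []
    for (p, a), q in sorted(transitions.items()):
        productions.append((p, [a, q]))      # p -> a q
    for p in sorted(accepting):
        productions.append((p, []))          # p -> (empty)
    return start, productions


def derives(start, productions, word):
    """Check whether the right-linear grammar derives `word`."""
    current = {start}
    for letter in word:
        # Follow every production head -> letter next_state.
        current = {body[1] for head, body in productions
                   if head in current and body and body[0] == letter}
    # The derivation may stop only where an empty production exists.
    return any((state, []) in productions for state in current)


# Hypothetical DFA for 01 + 10(11)*
dfa = {
    ("S", "0"): "A", ("A", "1"): "F",        # accepts 01
    ("S", "1"): "B", ("B", "0"): "C",        # accepts 10 ...
    ("C", "1"): "D", ("D", "1"): "C",        # ... followed by (11)*
}
start, productions = automaton_to_grammar(dfa, "S", {"F", "C"})
for head, body in productions:
    print(head, "->", " ".join(body) or "(empty)")
```

Because every production has the form p -> aq or p -> (empty), the result is a right-linear grammar, which is a special case of a CFG.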
Phonemics answered 14/4, 2010 at 17:31 Comment(3)
Why is B -> 11 instead of B -> B11? (Labonte)
Why are you changing A+B into G -> A and G -> B? Doesn't + mean "one or more of the previous expression" in regex? (Drogue)
@LeonOverweel In formal language theory, A+B denotes the sum (union) of A and B; in programming-language notation, A|B is used instead. (Phonemics)

I guess you mean convert it to a formal grammar with rules of the form V->w, where V is a nonterminal and w is a string of terminals/nonterminals. To start, you can simply say (mixing CFG and regex syntax):

S -> 01+10(11)*

Where S is the start symbol. Now let's break it up a bit (and add whitespace for clarity):

S -> 0 A 1 0 B
A -> 1+
B -> (11)*

The key is to convert *es and +es to recursion. First, we'll convert the Kleene star to a plus by inserting an intermediate rule that accepts the empty string:

S -> 0 A 1 0 B
A -> 1+
B -> (empty)
B -> C
C -> (11)+

Finally, we'll convert + notation to recursion:

S -> 0 A 1 0 B
A -> 1
A -> A 1
B -> (empty)
B -> C
C -> 11
C -> C 11

To handle x?, simply split it into a rule producing empty and a rule producing x.
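One way to sanity-check the final grammar is to enumerate every string it derives up to some length and compare against Python's `re` module, where + also means "one or more". The dictionary encoding of the grammar and the helper names `language` and `regex_language` are assumptions of this sketch. The brute-force enumeration terminates here because every recursive rule adds terminals; it is not a general-purpose CFG tool.

```python
import re
from itertools import product

# The final grammar from above; [] stands for the (empty) body.
GRAMMAR = {
    "S": [["0", "A", "1", "0", "B"]],
    "A": [["1"], ["A", "1"]],
    "B": [[], ["C"]],
    "C": [["1", "1"], ["C", "1", "1"]],
}

def language(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`."""
    results, seen, todo = set(), set(), [(start,)]
    while todo:
        form = todo.pop()
        if form in seen:
            continue
        seen.add(form)
        # Terminals are never erased, so an over-long form can be dropped.
        if sum(1 for s in form if s not in grammar) > max_len:
            continue
        # Expand the leftmost nonterminal, if there is one.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            results.add("".join(form))
            continue
        for body in grammar[form[idx]]:
            todo.append(form[:idx] + tuple(body) + form[idx + 1:])
    return results

def regex_language(pattern, alphabet, max_len):
    """All strings over `alphabet` of length <= max_len matching `pattern`."""
    compiled = re.compile(pattern)
    return {"".join(w)
            for n in range(max_len + 1)
            for w in product(alphabet, repeat=n)
            if compiled.fullmatch("".join(w))}

MAX_LEN = 8
assert language(GRAMMAR, "S", MAX_LEN) == regex_language(r"01+10(11)*", "01", MAX_LEN)
```

Agreement up to a fixed length is evidence, not a proof, that the grammar and the expression describe the same language.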

Nausea answered 14/4, 2010 at 17:27 Comment(0)

Different context-free grammars can generate the same language, so the mapping from a regular expression (regular language) to a CFG is not unique.

You can always construct a CFG that generates the language of a given regular expression. The answers above show some ways to do this.

Hope this gives you a high-level idea.

Enunciate answered 15/5, 2012 at 20:3 Comment(0)

For the example, the following grammar is equivalent:

S -> 01|10A
A -> 11A|empty
Dogma answered 17/8, 2022 at 15:33 Comment(0)
