How can I prove that derivations in Chomsky Normal Form require 2n - 1 steps?
Asked Answered
O

1

13

I'm trying to prove the following:

If G is a Context Free Grammar in the Chomsky Normal Form, then for any string w belongs L(G) of length n ≥ 1, it requires exactly 2n-1 steps to make any derivation of w.

How would I go about proving this?

Orebro answered 12/7, 2012 at 16:7 Comment(0)
E
16

As a hint - since every production in Chomsky Normal Form either has the form

S → AB, for nonterminals A and B, or the form

S → x, for terminal x,

Then deriving a string would work in the following way:

  • Create a string of exactly n nonterminals, then
  • Expand each nonterminal out to a single terminal.

Applying productions of the first form will increase the number of nonterminals from k to k + 1, since you replace one nonterminal (-1) with two nonterminals (+2) for a net gain of +1 nonterminal. Since your start with one nonterminal, this means you need to do n - 1 productions of the first form. You then need n more of the second form to convert the nonterminals to terminals, giving a total of n + (n - 1) = 2n - 1 productions.

To show that you need exactly this many, I would suggest doing a proof by contradiction and showing that you can't do it with any more or any fewer. As a hint, try counting the number of productions of each type that are made and showing that if it isn't 2n - 1, either the string is too short, or you will still have nonterminals remaining.

Hope this helps!

Exaggeration answered 3/3, 2013 at 19:13 Comment(5)
:Could you tell why do we need to do n-1 productions of the first form.Chlorobenzene
Sure! Each terminal in the resulting string is eventually formed by taking a nonterminal and expanding it to a terminal via some production of the form A -> a. This means that there have to, at some point, be a total of n nonterminals produced. One of them comes for free from the start symbol. Each time you do a production of the form A -> BC, you get one more nonterminal. Therefore, you need to do n-1 productions of the form A -> BC so that you can create the n-1 additionally-needed nonterminals.Exaggeration
:Oh do you mean to say that the '-1' of the expression 'n-1' comes since one of them comes for free from the start symbol?Chlorobenzene
@Chlorobenzene Yep, exactly.Exaggeration
:Oh sorry but I couldn't get that.Do you mean to say that we're not adding the start symbol production to the stack.Let me show you an example.Let's say the productions are S -> AB ,A -> Amount,B -> Unit,Amount -> [0..9] ,Unit -> $.I could see that we could reach 50$ with 5 steps which is true for the expression '2n-1' where n=3.Aren't we adding the start symbol production to the stack here?Chlorobenzene

© 2022 - 2024 — McMap. All rights reserved.