
Simplifications of Context-Free Grammars

Prof. Busch - LSU 1


A Substitution Rule

Grammar:
S → aB
A → aaA
A → abBc
B → aA
B → b

Substitute B → b

Equivalent grammar:
S → aB | ab
A → aaA
A → abBc | abbc
B → aA

Grammar (from the previous step):
S → aB | ab
A → aaA
A → abBc | abbc
B → aA

Substitute B → aA

Equivalent grammar:
S → aB | ab | aaA
A → aaA
A → abBc | abbc | abaAc

In general:
A → xBz
B → y_1

Substitute B → y_1

Equivalent grammar:
A → xBz | x y_1 z

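The substitution rule can be sketched in Python. This is a minimal illustration, not a definitive implementation: the dictionary representation, the function name `substitute`, and the convention that uppercase letters are variables and lowercase letters are terminals are all assumptions made for the example. The sketch also assumes that the substituted body does not contain the variable itself.

```python
from itertools import product


def substitute(grammar, var, body):
    """Substitute the production var -> body into every right-hand side.

    grammar maps each variable to a list of right-hand sides (strings).
    Assumption: body does not contain var.
    """
    if var in body:
        raise ValueError("this sketch assumes body does not contain var")
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            if head == var and rhs == body:
                continue  # drop the production that is being substituted
            # Each occurrence of var is either kept or replaced by body.
            choices = [(sym, body) if sym == var else (sym,) for sym in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


g = {"S": ["aB"], "A": ["aaA", "abBc"], "B": ["aA", "b"]}
step1 = substitute(g, "B", "b")
step2 = substitute(step1, "B", "aA")
```

Here `step1` corresponds to the grammar after substituting B → b, and `step2` to the grammar after also substituting B → aA.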
Nullable Variables

λ-production: X → λ
Nullable variable: Y ⇒ ... ⇒ λ

Example:
S → aMb
M → aMb
M → λ

Here M → λ is a λ-production and M is a nullable variable.
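One way to find the nullable variables is a fixed-point iteration: keep marking variables that have a right-hand side made only of already-nullable variables until nothing changes. The sketch below is illustrative; the function name `nullable_variables` and the representation (a dictionary of strings, with the empty string standing for λ) are assumptions, not part of the slides.

```python
def nullable_variables(grammar):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body derives the empty string when every symbol in it is
            # nullable; the empty body "" satisfies this trivially.
            if any(all(sym in nullable for sym in rhs) for rhs in bodies):
                nullable.add(head)
                changed = True
    return nullable


example = {"S": ["aMb"], "M": ["aMb", ""]}
print(nullable_variables(example))
```

For the example grammar only M is nullable, because every right-hand side of S contains a terminal.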


Removing λ-productions

Grammar:
S → aMb
M → aMb
M → λ

Substitute M → λ

Resulting grammar:
S → aMb | ab
M → aMb | ab

After we remove all the λ-productions, all the nullable variables disappear (except for the start variable).
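The removal step can be sketched as follows, assuming the same hypothetical representation (a dictionary of strings, with "" standing for λ). The function name `remove_lambda_productions` is made up for this example. The sketch does not treat the start variable specially, so it assumes that λ is not in the language.

```python
from itertools import product


def remove_lambda_productions(grammar):
    """Return a grammar without λ-productions (bodies equal to "")."""
    # Step 1: fixed-point computation of the nullable variables.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(
                all(sym in nullable for sym in rhs) for rhs in bodies
            ):
                nullable.add(head)
                changed = True
    # Step 2: each nullable occurrence is either kept or dropped.
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            choices = [(sym, "") if sym in nullable else (sym,) for sym in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                # Skip the empty body so no λ-production survives.
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


example = {"S": ["aMb"], "M": ["aMb", ""]}
print(remove_lambda_productions(example))
```

On the example grammar this produces S → aMb | ab and M → aMb | ab.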



Unit-Productions
Unit production: X → Y
(a single variable on both sides)

Example:
S → aA
A → a
A → B
B → A
B → bb

Unit productions: A → B and B → A

Removal of unit productions:

Grammar:
S → aA
A → a
A → B
B → A
B → bb

Substitute A → B

Resulting grammar:
S → aA | aB
A → a
B → A | B
B → bb



Unit productions of form X → X can be removed immediately:

Grammar:
S → aA | aB
A → a
B → A | B
B → bb

Remove B → B

Resulting grammar:
S → aA | aB
A → a
B → A
B → bb



Grammar:
S → aA | aB
A → a
B → A
B → bb

Substitute B → A

Resulting grammar:
S → aA | aB | aA
A → a
B → bb



Remove repeated productions:

Grammar:
S → aA | aB | aA
A → a
B → bb

Final grammar:
S → aA | aB
A → a
B → bb
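Unit productions can also be removed programmatically. The sketch below uses the common closure-based method, not the step-by-step substitution shown in the slides: for each variable it collects the variables reachable through unit productions alone, then copies their non-unit right-hand sides. The result is an equivalent grammar, but not necessarily the same one as the final grammar above. The function name and the string-based representation are assumptions for illustration.

```python
def remove_unit_productions(grammar):
    """Return an equivalent grammar with no unit productions."""

    def is_unit(rhs):
        # A unit body is exactly one variable (one uppercase letter).
        return len(rhs) == 1 and rhs.isupper()

    # reach[X] = variables reachable from X using unit productions only.
    reach = {}
    for start in grammar:
        seen = {start}
        stack = [start]
        while stack:
            current = stack.pop()
            for rhs in grammar.get(current, []):
                if is_unit(rhs) and rhs not in seen:
                    seen.add(rhs)
                    stack.append(rhs)
        reach[start] = seen

    # Each variable inherits the non-unit bodies of everything it reaches.
    result = {}
    for head in grammar:
        new_bodies = []
        for target in sorted(reach[head]):
            for rhs in grammar.get(target, []):
                if not is_unit(rhs) and rhs not in new_bodies:
                    new_bodies.append(rhs)
        result[head] = new_bodies
    return result


example = {"S": ["aA"], "A": ["a", "B"], "B": ["A", "bb"]}
print(remove_unit_productions(example))
```

For the example grammar, A and B each end up with the bodies a and bb, while S keeps aA. The variable B is then unreachable from S.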



Useless Productions

S → aSb
S → λ
S → A
A → aA    (useless production)

Some derivations never terminate:

S ⇒ A ⇒ aA ⇒ aaA ⇒ ... ⇒ aa...aA ⇒ ...

Another grammar:

S → A
A → aA
A → λ
B → bA    (useless production: not reachable from S)



In general:

If there is a derivation
S ⇒ ... ⇒ xAy ⇒ ... ⇒ w ∈ L(G)
where w consists of terminals, then variable A is useful.

Otherwise, variable A is useless.
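The definition suggests a two-pass check, sketched below under assumed conventions (uppercase letters are variables, "" stands for λ, and the function name `useful_variables` is hypothetical). The first pass finds the variables that can derive a string of terminals; the second finds which of those are reachable from the start variable through productions that mention only such variables.

```python
def useful_variables(grammar, start="S"):
    """Return the set of useful variables of the grammar."""
    # Pass 1: variables that can derive a string of terminals.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            for rhs in bodies:
                if all(sym in generating or not sym.isupper() for sym in rhs):
                    generating.add(head)
                    changed = True
                    break
    if start not in generating:
        return set()
    # Pass 2: reachability from start, ignoring productions that mention
    # a variable which can never terminate.
    reachable = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for rhs in grammar.get(current, []):
            if any(sym.isupper() and sym not in generating for sym in rhs):
                continue
            for sym in rhs:
                if sym.isupper() and sym not in reachable:
                    reachable.add(sym)
                    stack.append(sym)
    return reachable


print(useful_variables({"S": ["A"], "A": ["aA", ""], "B": ["bA"]}))
```

In the printed example S and A are useful, while B is useless because it cannot be reached from S.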



A production A → x is useless if any of its variables is useless.

Production    Head variable    Production status
S → aSb
S → λ
S → A                          useless
A → aA        useless          useless
B → C         useless          useless
C → D         useless          useless

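Once the set of useful variables is known, dropping the useless productions is a simple filter. The sketch below assumes that set is supplied by the caller; the function name `remove_useless_productions` and the string-based representation ("" for λ, uppercase letters for variables) are illustrative assumptions.

```python
def remove_useless_productions(grammar, useful):
    """Keep only productions whose head and body variables are all useful."""
    result = {}
    for head, bodies in grammar.items():
        if head not in useful:
            continue  # every production of a useless variable is useless
        result[head] = [
            rhs
            for rhs in bodies
            if all(sym in useful for sym in rhs if sym.isupper())
        ]
    return result


table = {"S": ["aSb", "", "A"], "A": ["aA"], "B": ["C"], "C": ["D"]}
print(remove_useless_productions(table, useful={"S"}))
```

With S as the only useful variable, the grammar from the table reduces to S → aSb and S → λ.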
Removing All
Step 1: Remove Nullable Variables

Step 2: Remove Unit-Productions

Step 3: Remove Useless Variables

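This sequence guarantees that unwanted variables and productions are removed: removing λ-productions can create new unit productions, and removing unit productions can leave variables useless, so each step comes before the one that cleans up after it.
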
Normal Forms for Context-Free Grammars



Chomsky Normal Form

Each production has form:

A → BC    or    A → a

where B and C are variables and a is a terminal.



Examples:

Chomsky Normal Form:
S → AS
S → a
A → SA
A → b

Not Chomsky Normal Form:
S → AS
S → AAS
A → SA
A → aa
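Checking whether a grammar is in Chomsky Normal Form amounts to testing every production against the two allowed shapes. The sketch below is illustrative: it assumes right-hand sides are tuples of symbol names, that names starting with an uppercase letter are variables, and the helper names `is_variable` and `is_cnf` are hypothetical.

```python
def is_variable(sym):
    # Assumed convention: variable names start with an uppercase letter.
    return sym[:1].isupper()


def is_cnf(grammar):
    """Return True if every production is A -> BC or A -> a."""
    for bodies in grammar.values():
        for rhs in bodies:
            two_variables = len(rhs) == 2 and all(is_variable(s) for s in rhs)
            one_terminal = len(rhs) == 1 and not is_variable(rhs[0])
            if not (two_variables or one_terminal):
                return False
    return True


in_cnf = {"S": [("A", "S"), ("a",)], "A": [("S", "A"), ("b",)]}
not_in_cnf = {"S": [("A", "S"), ("A", "A", "S")], "A": [("S", "A"), ("a", "a")]}
print(is_cnf(in_cnf), is_cnf(not_in_cnf))
```

The second grammar fails the check because S → AAS has three symbols and A → aa has two terminals.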



Conversion to Chomsky Normal Form

Example (not in Chomsky Normal Form):
S → ABa
A → aab
B → Ac

We will convert it to Chomsky Normal Form.



Introduce new variables for the terminals: Ta, Tb, Tc

Grammar:
S → ABa
A → aab
B → Ac

Resulting grammar:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c

Introduce new intermediate variable V1 to break the first production:

Grammar:
S → ABTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c

Resulting grammar:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c

Introduce intermediate variable V2:

Grammar:
S → AV1
V1 → BTa
A → TaTaTb
B → ATc
Ta → a
Tb → b
Tc → c

Resulting grammar:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c

Final grammar in Chomsky Normal Form:
S → AV1
V1 → BTa
A → TaV2
V2 → TaTb
B → ATc
Ta → a
Tb → b
Tc → c

Initial grammar:
S → ABa
A → aab
B → Ac

In general:

From any context-free grammar (which doesn't produce λ) not in Chomsky Normal Form, we can obtain an equivalent grammar in Chomsky Normal Form.



The Procedure

First remove:
Nullable variables
Unit productions
(Useless variables optional)



Then, for every terminal symbol a:

New variable: Ta
Add production Ta → a

In productions with length at least 2, replace a with Ta.

Productions of form A → a do not need to change!

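The terminal-replacement step can be sketched as below. This is an illustration under stated assumptions: right-hand sides are tuples of symbol names, names starting with an uppercase letter are variables, and the new names "T" + terminal are assumed not to clash with existing variables. The function names are hypothetical. Unlike the description above, the sketch only creates Ta for terminals that actually occur in a production of length at least 2.

```python
def is_variable(sym):
    # Assumed convention: variable names start with an uppercase letter.
    return sym[:1].isupper()


def lift_terminals(grammar):
    """Replace terminals in bodies of length >= 2 by new variables."""
    result = {}
    new_rules = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for rhs in bodies:
            if len(rhs) < 2:
                new_bodies.append(rhs)  # A -> a does not need to change
                continue
            replaced = []
            for sym in rhs:
                if is_variable(sym):
                    replaced.append(sym)
                else:
                    name = "T" + sym  # assumed not to clash with other names
                    new_rules[name] = [(sym,)]
                    replaced.append(name)
            new_bodies.append(tuple(replaced))
        result[head] = new_bodies
    result.update(new_rules)
    return result


initial = {"S": [("A", "B", "a")], "A": [("a", "a", "b")], "B": [("A", "c")]}
print(lift_terminals(initial))
```

Applied to the initial grammar of the worked example, this gives S → ABTa, A → TaTaTb, B → ATc together with Ta → a, Tb → b, Tc → c.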
Replace any production A → C_1 C_2 ... C_n

with
A → C_1 V_1
V_1 → C_2 V_2
...
V_{n-2} → C_{n-1} C_n

New intermediate variables: V_1, V_2, ..., V_{n-2}
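The splitting of long productions can be sketched as follows. The representation (tuples of symbol names) and the function name `binarize` are assumptions for illustration, and the generated names V1, V2, ... are assumed not to clash with variables already in the grammar. The counter is shared across the whole grammar so that every intermediate variable is distinct.

```python
def binarize(grammar):
    """Break every body longer than 2 symbols into a chain of pairs."""
    result = {}
    counter = 0
    for head, bodies in grammar.items():
        result.setdefault(head, [])
        for rhs in bodies:
            current = head
            rest = list(rhs)
            while len(rest) > 2:
                counter += 1
                fresh = f"V{counter}"  # assumed unused elsewhere
                # current -> C_i V_i, then continue with the remainder.
                result.setdefault(current, []).append((rest[0], fresh))
                current = fresh
                rest = rest[1:]
            result.setdefault(current, []).append(tuple(rest))
    return result


lifted = {
    "S": [("A", "B", "Ta")],
    "A": [("Ta", "Ta", "Tb")],
    "B": [("A", "Tc")],
    "Ta": [("a",)],
    "Tb": [("b",)],
    "Tc": [("c",)],
}
print(binarize(lifted))
```

On the grammar from the worked example this introduces V1 → BTa and V2 → TaTb, matching the final grammar in Chomsky Normal Form.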


Observations

• Chomsky normal forms are good for parsing and proving theorems

• It is easy to find the Chomsky normal form for any context-free grammar (which doesn't generate λ)

