
Closure Properties of Context Free Languages

Union of CFLs
Let L1 be the language generated by G1 = (V1, T1, S1, P1) and L2 the language generated
by G2 = (V2, T2, S2, P2).
Is L1 ∪ L2 a context-free language? Yes.
Just add the rule S → S1 | S2, where S is a new start symbol.
But first make sure that V1 ∩ V2 = ∅ (by renaming variables if necessary). Formally, G = (V, T, S, P) has
• V = V1 ∪ V2 ∪ {S}
• T = T1 ∪ T2
• P = P1 ∪ P2 ∪ {S → S1 | S2}
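The union construction can be sketched in code. This is only an illustration: the `Grammar` class, the `rename` helper, and the `_1`/`_2` suffixes are assumptions made for the example, and terminals are assumed never to end in those suffixes.

```python
from dataclasses import dataclass

@dataclass
class Grammar:
    variables: set
    terminals: set
    start: str
    productions: dict  # variable -> list of right-hand sides (tuples of symbols)

def rename(g, suffix):
    """Copy g with a suffix on every variable, so two grammars become disjoint."""
    m = {v: v + suffix for v in g.variables}
    prods = {m[v]: [tuple(m.get(s, s) for s in rhs) for rhs in rhss]
             for v, rhss in g.productions.items()}
    return Grammar(set(m.values()), set(g.terminals), m[g.start], prods)

def union(g1, g2, new_start="S"):
    """Build G with P = P1 ∪ P2 ∪ {S → S1 | S2} after renaming variables apart."""
    a, b = rename(g1, "_1"), rename(g2, "_2")
    prods = {**a.productions, **b.productions}
    prods[new_start] = [(a.start,), (b.start,)]
    return Grammar(a.variables | b.variables | {new_start},
                   a.terminals | b.terminals, new_start, prods)

# Both input grammars use the variable name S; renaming keeps them apart.
g1 = Grammar({"S"}, {"a"}, "S", {"S": [("a", "S"), ()]})   # a*
g2 = Grammar({"S"}, {"b"}, "S", {"S": [("b", "S"), ()]})   # b*
g = union(g1, g2)
```

Here the empty tuple `()` stands for an ε right-hand side.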

Concatenation

Let L1 be the language generated by G1 = (V1, T1, S1, P1) and L2 the language generated
by G2 = (V2, T2, S2, P2).
Concatenation: L1L2 is generated by adding the rule S → S1S2.
I.e., G = (V, T, S, P) has
• V = V1 ∪ V2 ∪ {S}
• T = T1 ∪ T2
• P = P1 ∪ P2 ∪ {S → S1S2}
As before, ensure that V1 ∩ V2 = ∅ and that S is a new start symbol.

Kleene Closure

Let L1 be the language generated by G1 = (V1, T1, S1, P1).
Kleene Closure: L1∗ is generated by adding the rule S → S1S | ε.
I.e., G = (V, T, S, P) has
• V = V1 ∪ {S}
• T = T1
• P = P1 ∪ {S → S1S | ε}
As before, S is a new start symbol, so S ∉ V1. Only one grammar is involved here.
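The concatenation and Kleene closure constructions can be checked on small examples. The sketch below stores productions as a dictionary and assumes the variable names are already disjoint; `concat`, `star`, and `words` are hypothetical helper names. `words` is a bounded brute-force enumerator, and its cap on the length of sentential forms is a crude guard for termination that may miss words in grammars with many ε-productions.

```python
def concat(p1, s1, p2, s2, new_start="S"):
    """P = P1 ∪ P2 ∪ {S → S1 S2}; assumes disjoint variable names."""
    assert not (set(p1) & set(p2)) and new_start not in p1 and new_start not in p2
    return {**p1, **p2, new_start: [(s1, s2)]}

def star(p1, s1, new_start="S"):
    """P = P1 ∪ {S → S1 S | ε}; the empty tuple stands for ε."""
    assert new_start not in p1
    return {**p1, new_start: [(s1, new_start), ()]}

def words(prods, start, max_len):
    """Terminal words of length <= max_len found by leftmost derivations.
    Keys of prods are variables; every other symbol is a terminal."""
    seen, found, stack = {(start,)}, set(), [(start,)]
    while stack:
        form = stack.pop()
        n_terminals = sum(1 for s in form if s not in prods)
        # Prune forms that are already too long (second test is a crude cap).
        if n_terminals > max_len or len(form) > 2 * max_len + 2:
            continue
        idx = next((i for i, s in enumerate(form) if s in prods), None)
        if idx is None:                      # no variables left: a word
            found.add("".join(form))
            continue
        for rhs in prods[form[idx]]:         # expand the leftmost variable
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            if new not in seen:
                seen.add(new)
                stack.append(new)
    return found

A = {"S1": [("a", "S1"), ("a",)]}            # a+
B = {"S2": [("b", "S2"), ("b",)]}            # b+
C = {"S1": [("a", "b")]}                     # {ab}
concat_words = words(concat(A, "S1", B, "S2"), "S", 3)
star_words = words(star(C, "S1"), "S", 4)
```

With these inputs, `concat_words` contains the words of a+b+ up to length 3 and `star_words` the words of (ab)∗ up to length 4.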

Homomorphism
Proposition: Context free languages are closed under homomorphisms.
Let G = (V, T, S, P) be the grammar generating L, and let h : T∗ → Σ∗ be a
homomorphism.
h(G) is a new grammar over terminals Σ, where the productions are obtained by taking
the productions of G and replacing each terminal a ∈ T by the string h(a).
h(G) generates h(L).
Example: S → 0S0 | 1S1 | ε, with h(0) = aba and h(1) = bb. Then h(G) has the following rules:
S → abaSaba | bbSbb | ε
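The example above can be reproduced mechanically. In this sketch, `apply_homomorphism` is a hypothetical helper, productions are stored as a dictionary, and every symbol that is not a key of that dictionary is treated as a terminal whose image under h is a string of one-character terminals.

```python
def apply_homomorphism(productions, h):
    """Replace each terminal a in every right-hand side by the string h(a)."""
    new = {}
    for var, rhss in productions.items():
        new[var] = []
        for rhs in rhss:
            out = []
            for sym in rhs:
                if sym in productions:       # variable: keep as is
                    out.append(sym)
                else:                        # terminal: splice in h(sym)
                    out.extend(h[sym])
            new[var].append(tuple(out))
    return new

# S → 0S0 | 1S1 | ε, with h(0) = aba and h(1) = bb
G = {"S": [("0", "S", "0"), ("1", "S", "1"), ()]}
h = {"0": "aba", "1": "bb"}
hG = apply_homomorphism(G, h)
```

The result `hG` corresponds to S → abaSaba | bbSbb | ε, with the empty tuple standing for ε.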

Substitution
Recall: a substitution associates each symbol a with a language La, and the image
of L under the substitution is defined as {w1w2…wn | ∃ a1a2…an ∈ L such that wi ∈ Lai for each i}.
• Regular substitution: Every La is a regular language
• CFL substitution: Every La is a CFL
• Homomorphism is a special case of substitution (each La is the single word h(a))
Proposition: CFLs are closed under CFL substitution.
Proof: Let L be generated by G, and let La be generated by
Ga = (Va, Ta, Sa, Pa). Then in each production of G, replace each terminal a by Sa, and add
all the productions Pa.
• This allows any word in La to take the place of the symbol a
• The sets Va (for all a) and V (the variables of G) must be pairwise disjoint; rename variables if necessary.
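A minimal sketch of the substitution construction follows. The helper name `substitute` and the encoding (each terminal a mapped to a pair of a start symbol Sa and a production dictionary Pa) are assumptions made for illustration only.

```python
def substitute(productions, sub):
    """sub maps a terminal a to (S_a, P_a): start symbol and productions of G_a.
    Each occurrence of a in G is replaced by S_a, and all P_a are added."""
    new = {}
    for var, rhss in productions.items():
        new[var] = [tuple(sub[s][0] if s in sub else s for s in rhs)
                    for rhs in rhss]
    for start_a, prods_a in sub.values():
        # The variable sets must be pairwise disjoint.
        assert not (set(prods_a) & set(new)), "variable sets must be disjoint"
        new.update(prods_a)
    return new

# G: S → aSb | ε, with L_a = 0+ (A → 0A | 0) and L_b = {1} (B → 1)
G = {"S": [("a", "S", "b"), ()]}
sub = {"a": ("A", {"A": [("0", "A"), ("0",)]}),
       "b": ("B", {"B": [("1",)]})}
result = substitute(G, sub)
```

In `result`, the start rule becomes S → ASB | ε, so each a can be replaced independently by any word of 0+.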

Intersection

Let L1 and L2 be context-free languages. L1 ∩ L2 is not necessarily context free!

L1 = {a^i b^i c^j | i, j ≥ 0} is a CFL:
S → XY
X → aXb | ε
Y → cY | ε

L2 = {a^i b^j c^j | i, j ≥ 0} is a CFL:
S → XY
X → aX | ε
Y → bYc | ε
But L1 ∩ L2 = {a^n b^n c^n | n ≥ 0} is not context free.
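The identity for L1 ∩ L2 can be sanity-checked by brute force on short strings. The membership tests below are written directly from the set definitions rather than from the grammars, and the function names are chosen only for this example.

```python
import re
from itertools import product

def blocks(w):
    """Split w as a^i b^j c^k, or return None if w is not in a*b*c*."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    return tuple(len(g) for g in m.groups()) if m else None

def in_L1(w):                       # a^i b^i c^j
    b = blocks(w)
    return b is not None and b[0] == b[1]

def in_L2(w):                       # a^i b^j c^j
    b = blocks(w)
    return b is not None and b[1] == b[2]

def all_words(alphabet, max_len):
    """All strings over the alphabet, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

both = [w for w in all_words("abc", 6) if in_L1(w) and in_L2(w)]
```

Up to length 6, the only strings in both languages are the empty string, abc, and aabbcc, which matches a^n b^n c^n for n = 0, 1, 2.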

Intersection with Regular Languages

Proposition: If L is a CFL and R is a regular language then L ∩ R is a CFL.

Proof: Let P be the PDA that accepts L by final state, and let M be the DFA that accepts
R. The PDA recognizing L ∩ R simulates P and M simultaneously, and accepts if both P
and M accept the input.
• As in the “cross-product” construction, states of the new machine are pairs of
states, where one element of the pair corresponds to the state of P and the other
corresponds to the state of M.
• Whenever an input symbol is read, both P and M are simulated; if P needs to
make an ε (null) move then M remains stationary.
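The product construction can be sketched as follows. The encoding is an assumption for this example: PDA transitions are a dictionary from (state, input symbol or "" for ε, stack top) to a set of (new state, string to push), the stack is a string with its top at index 0, and `accepts` is a simple breadth-first simulator with a crude cap on stack height so that the search terminates.

```python
from collections import deque

def accepts(delta, start, finals, w, bottom="Z"):
    """Search PDA configurations (state, position, stack); accept by final state."""
    init = (start, 0, bottom)
    seen, queue = {init}, deque([init])
    while queue:
        q, i, stack = queue.popleft()
        if i == len(w) and q in finals:
            return True
        if not stack or len(stack) > len(w) + 2:   # crude cap for termination
            continue
        top, rest = stack[0], stack[1:]
        options = [("", i)] + ([(w[i], i + 1)] if i < len(w) else [])
        for a, j in options:
            for q2, push in delta.get((q, a, top), ()):
                cfg = (q2, j, push + rest)
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

def product_pda(delta, dfa):
    """Pair each PDA state with a DFA state; the DFA stays put on ε-moves."""
    dfa_states = {p for (p, _) in dfa}
    new = {}
    for (q, a, top), moves in delta.items():
        for p in dfa_states:
            p2 = p if a == "" else dfa[(p, a)]
            new[((q, p), a, top)] = {((q2, p2), push) for q2, push in moves}
    return new

# P accepts {a^n b^n | n >= 0} by final state qf.
P = {("q0", "a", "Z"): {("q0", "AZ")}, ("q0", "a", "A"): {("q0", "AA")},
     ("q0", "b", "A"): {("q1", "")},   ("q1", "b", "A"): {("q1", "")},
     ("q0", "", "Z"): {("qf", "Z")},   ("q1", "", "Z"): {("qf", "Z")}}
# M accepts strings with an even number of a's (state = parity).
M = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

PM = product_pda(P, M)
start, finals = ("q0", 0), {("qf", 0)}
```

The product machine accepts exactly when P is in a final state and M is in a final state, so here it recognizes the strings a^n b^n with n even; for instance `accepts(PM, start, finals, "aabb")` succeeds while the same call on "ab" fails.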

Complementation

Let L be a context-free language. Is its complement L̄ context free? Not necessarily!

Reason 1: If CFLs were closed under complementation, then by De Morgan’s law
(L1 ∩ L2 is the complement of L̄1 ∪ L̄2) and by the fact that CFLs are closed under union,
CFLs would be closed under intersection.

Reason 2: L = {x ∈ {a, b}∗ | x is not of the form ww} is a CFL. But its complement
L̄ = {ww | w ∈ {a, b}∗} is not a CFL.
If L̄ were a CFL, then L′ = L̄ ∩ a∗b∗a∗b∗ = {a^i b^j a^i b^j | i, j ≥ 0} would be a CFL,
since intersecting with a regular language preserves context-freeness. But L′ is
not a CFL!
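The set identity used in Reason 2 can be checked by brute force on short strings. This is only a sanity check on strings up to length 8, not a proof, and the function names are chosen for this example.

```python
import re
from itertools import product

def is_square(w):
    """True if w = xx for some x."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w[:half] == w[half:]

def in_regular_filter(w):
    """Membership in a*b*a*b*."""
    return re.fullmatch(r"a*b*a*b*", w) is not None

def in_target(w):
    """Membership in {a^i b^j a^i b^j | i, j >= 0}."""
    half, odd = divmod(len(w), 2)
    if odd:
        return False
    return any(w == ("a" * i + "b" * (half - i)) * 2 for i in range(half + 1))

mismatches = [
    "".join(t)
    for n in range(9)
    for t in product("ab", repeat=n)
    if (is_square("".join(t)) and in_regular_filter("".join(t)))
       != in_target("".join(t))
]
```

An empty `mismatches` list means the two descriptions agree on every string over {a, b} of length at most 8.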

Set Difference

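If L1 and L2 are CFLs, then L1 \ L2 is not necessarily a CFL.
• CFLs are not closed under complementation, and complementation is a special case: L̄ = Σ∗ \ L.
If L is a CFL and R is a regular language, then L \ R is a CFL.
• L \ R = L ∩ R̄, and R̄ is regular because regular languages are closed under complementation.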

Inverse Homomorphisms

Recall: for a language L and a homomorphism h, h⁻¹(L) = {w | h(w) ∈ L}.

Proposition: If L is a CFL then h⁻¹(L) is a CFL.
Proof: For regular languages, the DFA for h⁻¹(L), on reading a symbol a, simulates the
DFA for L on h(a). Can we do the same with PDAs?
The problem is that we need to ensure that the pushes and pops on the stack are done
correctly when we simulate the PDA on h(a), which may be several symbols long.
The key idea is to store h(a) in a “buffer” and process symbols from h(a) one at a
time; the next input symbol is read only after the “buffer” has been emptied.
Where do we store this “buffer”? In the state of the new PDA! This is possible because the
buffer always holds a suffix of some h(a), and there are only finitely many of those.
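The buffer idea can be sketched in code under some assumptions. The PDA encoding (a dictionary from (state, input symbol or "" for ε, stack top) to a set of (new state, string to push), with the stack top at index 0) and the names `accepts` and `inverse_hom_pda` are hypothetical, and the stack is assumed never to be empty when a symbol is read. States of the new PDA are pairs (state of the original PDA, buffer contents).

```python
from collections import deque

def accepts(delta, start, finals, w, max_stack, bottom="Z"):
    """Search PDA configurations (state, position, stack); accept by final state."""
    init = (start, 0, bottom)
    seen, queue = {init}, deque([init])
    while queue:
        q, i, stack = queue.popleft()
        if i == len(w) and q in finals:
            return True
        if not stack or len(stack) > max_stack:    # crude cap for termination
            continue
        top, rest = stack[0], stack[1:]
        options = [("", i)] + ([(w[i], i + 1)] if i < len(w) else [])
        for a, j in options:
            for q2, push in delta.get((q, a, top), ()):
                cfg = (q2, j, push + rest)
                if cfg not in seen:
                    seen.add(cfg)
                    queue.append(cfg)
    return False

def inverse_hom_pda(delta, h, stack_alphabet):
    """PDA for h^-1(L): the second state component buffers a suffix of h(a)."""
    states = ({q for (q, _, _) in delta}
              | {q2 for moves in delta.values() for q2, _ in moves})
    buffers = {img[k:] for img in h.values() for k in range(len(img) + 1)}
    new = {}
    for q in states:
        for X in stack_alphabet:
            # Read a only when the buffer is empty; load h(a), stack unchanged.
            for a, img in h.items():
                new.setdefault(((q, ""), a, X), set()).add(((q, img), X))
            for buf in buffers:
                # ε-moves of the original PDA leave the buffer alone.
                for q2, push in delta.get((q, "", X), ()):
                    new.setdefault(((q, buf), "", X), set()).add(((q2, buf), push))
                # Feed the first buffered symbol to the original PDA.
                if buf:
                    for q2, push in delta.get((q, buf[0], X), ()):
                        new.setdefault(((q, buf), "", X), set()).add(((q2, buf[1:]), push))
    return new

# Original PDA for L = {a^n b^n | n >= 0}, accepting by final state qf.
P = {("q0", "a", "Z"): {("q0", "AZ")}, ("q0", "a", "A"): {("q0", "AA")},
     ("q0", "b", "A"): {("q1", "")},   ("q1", "b", "A"): {("q1", "")},
     ("q0", "", "Z"): {("qf", "Z")},   ("q1", "", "Z"): {("qf", "Z")}}
h = {"0": "aa", "1": "b"}
Q = inverse_hom_pda(P, h, "AZ")

def in_inverse(w):
    return accepts(Q, ("q0", ""), {("qf", "")}, w, max_stack=2 * len(w) + 2)
```

With h(0) = aa and h(1) = b, a word 0^k 1^m maps to a^(2k) b^m, so `in_inverse` should accept exactly the words 0^k 1^(2k); the final states require an empty buffer so that all of h(w) has been processed.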
