
UNIT – III

PUSH DOWN AUTOMATA


DEFINITIONS
Ø Every regular language or regular grammar has an equivalent finite state
automaton.
Ø The automaton corresponding to a CFL is known as a pushdown
automaton (PDA).
Ø The deterministic version of the pushdown automaton accepts only a
subset of the CFLs.
Ø Hence the correspondence between the automaton and the class of
languages is not as satisfactory as in the regular case.

Ø The pushdown automaton consists of an input tape, a finite control and a
stack (first in, last out).
Ø The specialty of a stack is that addition (and deletion) of symbols to (or
from) the stack can be done only at the top of the stack.
Ø Such a stack along with the finite control can be used to recognize non-
regular languages.
Ø Consider the CFL

L = { wcwR | w ∈ (a | b)* }

Ø We can show that the language L is not regular with the help of the pumping
lemma. Furthermore, the language L is a CFL generated by the CFG

S -> a S a
S -> b S b
S -> c

Let us construct a PDA with two states q1 and q2. The device will operate by
the following rules.

1) The machine starts with a Z on the stack and with the finite control in
state q1.

2) If the device is in state q1,

i) If the input to the device is a, then A is pushed onto the stack.
ii) If the input to the device is b, then B is pushed onto the stack.

3) If the input to the device is c and the device is in state q1, then the state is
changed to q2.

4) If the device is in state q2,

i) if the input is a and the top of the stack is an A, then the A is removed and
control remains in q2.

ii) if the input is b and the top of the stack is a B, then the B is removed and
control remains in q2.

5) If the device is in q2 and top of the stack is Z, then the Z is removed from
the stack.

6) For all cases other than those described above, the device can make no
move.

Ø This device accepts an input string if on processing the last symbol of the
input, the stack becomes empty.

Ø The device operates in the following manner. In state q1, the device
transfers the input symbols onto the stack.

Ø When it reads the c, it changes to state q2. In state q2 the device
compares the remaining input with the symbols in the stack; if they are
identical then the stack is emptied and the input is accepted. In all other
cases the input is rejected.

There are two different ways, in which PDA accepts a CFL,

i) Acceptance by empty stack, i.e., the input is accepted if the stack is
empty after processing the last input symbol.

ii) Acceptance by final state, i.e., the input is accepted if the finite control
is in a final state on processing the last input symbol.

Now we shall formally define a PDA. M is a system (Q, Σ, Γ, δ, q0, Z0, F) where

1) Q is a finite set of states,
2) Σ is an alphabet called the input alphabet,
3) Γ is an alphabet called the stack alphabet,
4) q0 in Q is the initial state,
5) Z0 in Γ is a particular stack symbol called the start symbol,
6) F ⊆ Q is the set of final states and
7) δ is a mapping from Q x (Σ ∪ {ε}) x Γ to finite subsets of Q x Γ*.
MOVES
Here the function δ represents the moves of PDA.

Consider δ(q, a, Z) = {(p1, γ1), (p2, γ2), … , (pm, γm)}

where q and the pi, 1 ≤ i ≤ m, are states of the finite control, a is an input
symbol and Z is the symbol at the top of the stack.

Ø Then by the function δ the PDA moves to the state pi for some i, 1 ≤ i ≤ m,
and replaces Z with the string γi.

Ø After this the input head is advanced. This is known as a move.

Now let us design a formal PDA that accepts the language {wcwR| w in (a+b)*}

M = ({q1,q2}, {a,b,c}, {Z,A,B}, δ, q1, Z, φ)

δ (q1, a, Z) = {(q1, AZ)}
δ (q1, a, A) = {(q1, AA)}
δ (q1, a, B) = {(q1, AB)}
δ (q1, b, Z) = {(q1, BZ)}
δ (q1, b, A) = {(q1, BA)}
δ (q1, b, B) = {(q1, BB)}
δ (q1, c, Z) = {(q2, Z)}
δ (q1, c, A) = {(q2, A)}
δ (q1, c, B) = {(q2, B)}
δ (q2, a, A) = {(q2, ε)}
δ (q2, b, B) = {(q2, ε)}
δ (q2, ε, Z) = {(q2, ε)}
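The transition table above can be run mechanically. Below is a minimal Python sketch (not part of the original notes) that stores δ as a dictionary and simulates the machine on an input string, accepting by empty stack. The function name `accepts` and the encoding of the stack as a string with the top at index 0 are our own choices.

```python
# δ for the PDA accepting {w c wR | w in {a,b}*} by empty stack.
# Each entry maps (state, input symbol, stack top) to (state, push string).
delta = {
    ('q1', 'a', 'Z'): ('q1', 'AZ'),
    ('q1', 'a', 'A'): ('q1', 'AA'),
    ('q1', 'a', 'B'): ('q1', 'AB'),
    ('q1', 'b', 'Z'): ('q1', 'BZ'),
    ('q1', 'b', 'A'): ('q1', 'BA'),
    ('q1', 'b', 'B'): ('q1', 'BB'),
    ('q1', 'c', 'Z'): ('q2', 'Z'),
    ('q1', 'c', 'A'): ('q2', 'A'),
    ('q1', 'c', 'B'): ('q2', 'B'),
    ('q2', 'a', 'A'): ('q2', ''),
    ('q2', 'b', 'B'): ('q2', ''),
}

def accepts(w):
    state, stack = 'q1', 'Z'        # start in q1 with Z on the stack
    for ch in w:
        if not stack or (state, ch, stack[0]) not in delta:
            return False            # rule (6): no move possible
        state, push = delta[(state, ch, stack[0])]
        stack = push + stack[1:]    # replace the top with the push string
    # final ε-move: δ(q2, ε, Z) = {(q2, ε)} removes the start symbol
    if state == 'q2' and stack == 'Z':
        stack = ''
    return stack == ''              # accept iff the stack is empty
```

This machine happens to be deterministic, so a single pass over the input suffices; no backtracking is needed.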
INSTANTANEOUS DESCRIPTION (ID)
Ø Instantaneous description of PDA is used to formally describe the
configuration of that PDA at a particular instant.

Ø The ID contains the state q in which the PDA is at that moment, the
string of unread input symbols w and the string γ held in the stack.
Ø Thus we define the ID to be a triple (q, w, γ). If M = (Q, Σ, Γ, δ, q0, Z0, F)
is a PDA then we say
(q, aw, Zα) |-- (p, w, βα) if δ(q, a, Z) contains (p, β), i.e. the ID (p, w, βα) can be
reached from the ID (q, aw, Zα) in a single move of the PDA.

Ø We use |--* for the reflexive and transitive closure of |--, i.e., for every ID I,
I |--* I, and for IDs I, J, K, I |--* J and J |--* K implies I |--* K. Thus I |--* J
means ID J can be reached from ID I by zero or more moves of the PDA.
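To make the |-- relation concrete, here is a small Python sketch (our own illustration) that produces the chain of IDs for the wcwR machine of the previous section on an input. Only the moves reachable on this input are included in the table; the lookup order (input move first, then ε-move) is an assumption that suffices for this deterministic machine.

```python
# The moves of the wcwR machine needed for the trace below;
# '' stands for ε in the input-symbol position.
delta = {
    ('q1', 'a', 'Z'): ('q1', 'AZ'),
    ('q1', 'b', 'A'): ('q1', 'BA'),
    ('q1', 'c', 'B'): ('q2', 'B'),
    ('q2', 'b', 'B'): ('q2', ''),
    ('q2', 'a', 'A'): ('q2', ''),
    ('q2', '',  'Z'): ('q2', ''),
}

def ids(w):
    """Return the sequence of IDs (q, unread input, stack) linked by |--."""
    state, stack = 'q1', 'Z'
    trace = [(state, w, stack)]
    while stack:
        ch = w[0] if w else ''
        if (state, ch, stack[0]) in delta:          # move consuming input
            state, push = delta[(state, ch, stack[0])]
            w, stack = w[1:], push + stack[1:]
        elif (state, '', stack[0]) in delta:        # ε-move
            state, push = delta[(state, '', stack[0])]
            stack = push + stack[1:]
        else:
            break                                   # no move possible
        trace.append((state, w, stack))
    return trace
```

For "abcba" the trace runs (q1, abcba, Z) |-- (q1, bcba, AZ) |-- (q1, cba, BAZ) |-- (q2, ba, BAZ) |-- (q2, a, AZ) |-- (q2, ε, Z) |-- (q2, ε, ε), so (q1, abcba, Z) |--* (q2, ε, ε).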

DETERMINISTIC PDAs
A PDA is said to be deterministic if at most one move is possible from any
ID. Formally, we say that a PDA is deterministic if

i) For each q in Q and Z in Γ, whenever δ(q, ε, Z) is non-empty,
then δ(q, a, Z) is empty for all a in Σ.

ii) For each q in Q, Z in Γ and a in Σ ∪ {ε}, δ(q, a, Z) contains at
most one element.

i.e. We say a PDA is deterministic if it does not have a choice of two different
moves for the same ID.
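The two conditions can be checked directly on a transition table. The helper below is our own sketch, assuming δ is given as a dict mapping (q, a, Z) to a set of (state, push-string) pairs, with '' standing for ε.

```python
def is_deterministic(delta, states, input_syms, stack_syms):
    """Check conditions (i) and (ii) of the determinism definition."""
    for q in states:
        for Z in stack_syms:
            # condition (ii): at most one choice for every (q, a, Z)
            for a in list(input_syms) + ['']:
                if len(delta.get((q, a, Z), set())) > 1:
                    return False
            # condition (i): an ε-move on (q, Z) rules out all input moves
            if delta.get((q, '', Z)) and \
               any(delta.get((q, a, Z)) for a in input_syms):
                return False
    return True
```

For example, a table with both δ(q, ε, Z) and δ(q, a, Z) non-empty, or with two pairs in one δ(q, a, Z), is rejected.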

Ø Here the first condition ensures that there is no choice between a move
independent of the input (an ε-move) and a move consuming an input symbol a.

The second condition prevents a choice of moves for any (q, a, Z).

Ø We know that for an FSA the deterministic and non-deterministic versions
are equivalent.

Ø No extra power is added to FSAs by adding non-determinism: every
language accepted by an NFA can be accepted by a DFA. This is not true
for PDAs.
Ø A language that is accepted by a non-deterministic PDA need not be
accepted by any deterministic PDA.

Ø In fact the language consisting of all palindromes over an alphabet Σ is
accepted by a non-deterministic PDA but not by any deterministic PDA.

Accepted languages:
We have seen that a PDA may accept a string by ending up in one of the
final states or by ending up with an empty stack. Correspondingly, for a

PDA M = (Q, Σ, Γ, δ, q0, Z0, F) we define L(M), the language accepted by
final state, to be

{ w | (q0, w, Z0) |--* (p, ε, γ) for some p in F and γ in Γ* }

and we define N(M), the language accepted by empty stack, to be

{ w | (q0, w, Z0) |--* (p, ε, ε) for some p in Q }.

PUSHDOWN AUTOMATA AND CONTEXT-FREE LANGUAGES


Ø We shall now prove the fundamental result that the class of languages
accepted by PDA's is precisely the class of context free languages.

Ø We first show that the languages accepted by PDA's by final state are
exactly the languages accepted by PDA's by empty stack.

Ø We then show that the languages accepted by empty stack are exactly the
context-free languages.

Equivalence of acceptance by final state and empty stack

Theorem If L is L (M2) for some PDA M2 then L is N (M1) for some PDA, M1.

Proof
We would like M1 to simulate M2 and empty its stack whenever M2 enters a
final state.

We use a state qe of M1 to erase the stack and we use a bottom-of-stack marker
X0 so that M1 does not accidentally accept if M2 empties its stack without entering
a final state.

Let M2 = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA such that L = L(M2).

Let M1 = (Q ∪ {qe, q0'}, Σ, Γ ∪ {X0}, δ', q0', X0, φ),

where δ' is defined as follows:

1) δ'(q0', ε, X0) = {(q0, Z0X0)}
2) δ'(q, a, Z) includes the elements of δ(q, a, Z) for all q in Q,
a in Σ or a = ε, and Z in Γ.
3) For all q in F and Z in Γ ∪ {X0}, δ'(q, ε, Z) contains (qe, ε).
4) For all Z in Γ ∪ {X0}, δ'(qe, ε, Z) contains (qe, ε).

Ø Rule (1) causes M1 to enter the initial ID of M2, except that M1 will have its
own bottom-of-stack marker X0, which is below the symbols of M2's
stack.

Ø Rule (2) allows M1 to simulate M2. Should M2 ever enter a final state, rules
(3) and (4) allow M1 the choice of entering state qe and erasing its stack,
thereby accepting the input, or continuing to simulate M2.

Ø One should note that M2 may possibly erase its entire stack for some input x
not in L(M2).

Ø This is the reason that M1 has its own special bottom-of-stack marker.

Ø Otherwise M1, in simulating M2, would also erase its entire stack, thereby
accepting x when it should not.

Ø Conversely, if M1 accepts x by empty stack, it is easy to show that the
sequence of moves must be one move by rule (1), then a sequence of moves
by rule (2) in which M1 simulates acceptance of x by M2, followed by the
erasure of M1's stack using rules (3) and (4). Thus x must be in L(M2).
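Rules (1)-(4) can be read as a purely mechanical transformation of the transition table. The sketch below (our own illustration, not from the notes) builds δ' from δ; the fresh names "q0'", "qe" and the marker "X0" are assumptions and must not clash with M2's own states and symbols.

```python
def to_empty_stack(delta, finals, q0, Z0, stack_syms):
    """Build δ' of M1 from δ of M2 per rules (1)-(4); '' stands for ε."""
    d = {k: set(v) for k, v in delta.items()}        # rule (2): keep M2's moves
    d[("q0'", '', 'X0')] = {(q0, Z0 + 'X0')}         # rule (1): enter M2's initial ID
    for q in finals:                                  # rule (3): from a final state
        for Z in list(stack_syms) + ['X0']:           # M1 may jump to qe
            d.setdefault((q, '', Z), set()).add(('qe', ''))
    for Z in list(stack_syms) + ['X0']:               # rule (4): qe erases the stack
        d.setdefault(('qe', '', Z), set()).add(('qe', ''))
    return d
```

The result is only the table; it shows the construction's shape rather than providing a full simulator.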

Theorem
If L is N (M1) for some PDA M1, then L is L (M2) for some PDA M2.
Proof
Our plan now is to have M2 simulate M1 and detect when M1 empties its
stack; M2 enters a final state when and only when this occurs.

Let M1 = (Q, Σ, Γ, δ, q0, Z0, φ) and

M2 = (Q ∪ {q0', qf}, Σ, Γ ∪ {X0}, δ', q0', X0, {qf}),

where δ' is defined as follows:

1) δ'(q0', ε, X0) = {(q0, Z0X0)}
2) For all q in Q, a in Σ ∪ {ε}, and Z in Γ,
δ'(q, a, Z) = δ(q, a, Z)
3) For all q in Q, δ'(q, ε, X0) contains (qf, ε).

Ø Rule (1) causes M2 to enter the initial ID of M1, except that M2 will have
its own bottom-of-stack marker X0, which is below the symbols of M1's
stack.

Ø Rule (2) allows M2 to simulate M1. Should M1 ever erase its entire stack,
then M2, when simulating M1, will erase its entire stack except the
symbol X0 at the bottom.

Ø Rule (3) causes M2, when the X0 appears, to enter a final state,
thereby accepting the input x.
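The converse construction is symmetric, and again amounts to a table transformation. The sketch below mirrors rules (1)-(3); as before, "q0'", "qf" and "X0" are fresh names we assume do not clash with M1's own.

```python
def to_final_state(delta, q0, Z0):
    """Build δ' of M2 from δ of M1 per rules (1)-(3); '' stands for ε."""
    d = {k: set(v) for k, v in delta.items()}       # rule (2): simulate M1
    d[("q0'", '', 'X0')] = {(q0, Z0 + 'X0')}        # rule (1)
    # rule (3): when only X0 remains, M1 has emptied its stack, so accept.
    # Collect M1's states from both sides of its transition table.
    states = {q for (q, _, _) in delta} | \
             {p for moves in delta.values() for (p, _) in moves}
    for q in states:
        d.setdefault((q, '', 'X0'), set()).add(('qf', ''))
    return d
```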

EQUIVALENCE OF PDA'S AND CFL'S

Theorem:
If L is a context-free language, then there exists a PDA M such that L =
N(M)

Proof:
We assume that ε is not in L(G). The reader may modify the construction
for the case where ε is in L(G).

Let G = (V, T, P, S) be a context-free grammar in Greibach normal
form generating L.

Let M = ({q}, T, V, δ, q, S, φ),

where
δ(q, a, A) contains (q, γ) whenever A → aγ is in P.

Ø The PDA M simulates leftmost derivations of G. Since G is in Greibach
normal form,

Ø each sentential form in a leftmost derivation consists of a string of terminals x
followed by a string of variables α.

Ø M stores the suffix α of the left sentential form on its stack after processing
the prefix x. Formally, we show that

S ⇒* xα by a leftmost derivation if and only if (q, x, S) |--* (q, ε, α).     (*)

First suppose (q, x, S) |--* (q, ε, α) in i moves, and proceed by induction on i.
The basis, i = 0, is trivial, since x = ε and α = S. For the induction, suppose
i ≥ 1, and let x = ya.

Ø If we remove a from the end of the input string in the first i − 1 IDs of the
sequence, we discover that (q, y, S) |--* (q, ε, β) in i − 1 moves, since a can
have no effect on the moves of M until it is actually consumed from the
input. By the inductive hypothesis, S ⇒* yβ. The last move
(q, a, β) |-- (q, ε, α) implies that β = Aγ for some A in V, that A → aη is a
production of G, and that α = ηγ.

Hence
S ⇒* yAγ ⇒ yaηγ = xα.

Now suppose that S ⇒* xα in i steps by a leftmost derivation. We show by
induction on i that (q, x, S) |--* (q, ε, α). The basis, i = 0, is again trivial.
Let i ≥ 1 and suppose the derivation takes the form

S ⇒* yAγ ⇒ yaηγ,

where x = ya and α = ηγ, and the first i − 1 steps produce yAγ. By the
inductive hypothesis, (q, y, S) |--* (q, ε, Aγ) and thus (q, ya, S) |--* (q, a, Aγ).
Since A → aη is a production, it follows that δ(q, a, A) contains (q, η). Thus

(q, x, S) |--* (q, a, Aγ) |-- (q, ε, α),

and the "only if" portion of (*) follows.
Ø To conclude the proof, we have only to note that (*) with α = ε says S ⇒* x if
and only if (q, x, S) |--* (q, ε, ε).

That is, x is in L(G) if and only if x is in N(M).
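The construction δ(q, a, A) ∋ (q, γ) for each production A → aγ is simple enough to program. The sketch below (our own illustration) builds the single-state PDA from a GNF grammar and runs a nondeterministic search for acceptance by empty stack; the sample grammar S → aSB | aB, B → b for {a^n b^n : n ≥ 1} is our own choice, not from the notes.

```python
def gnf_to_pda(productions):
    """δ(q, a, A) contains (q, γ) whenever A -> aγ is in P (single state q).
    A production is given as a pair (A, rhs) with rhs = terminal + variables."""
    delta = {}
    for A, rhs in productions:
        a, gamma = rhs[0], rhs[1:]
        delta.setdefault((a, A), set()).add(gamma)
    return delta

def accepts_gnf(delta, start, w):
    """Nondeterministic search over configurations (input position, stack).
    Terminates because every GNF move consumes one input symbol."""
    frontier, seen = [(0, start)], set()
    while frontier:
        i, stack = frontier.pop()
        if (i, stack) in seen:
            continue
        seen.add((i, stack))
        if i == len(w) and stack == '':
            return True                  # empty stack, all input read
        if i < len(w) and stack:
            for gamma in delta.get((w[i], stack[0]), ()):
                frontier.append((i + 1, gamma + stack[1:]))
    return False
```

Each move pops the variable on top of the stack and pushes the variables of the chosen production, which is exactly one step of a leftmost derivation.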

Theorem:
If L is N (M) for some PDA M, then L is a context - free language.

Proof:
Let M be the PDA (Q, Σ, Γ, δ, q0, Z0, φ). Let G = (V, Σ, P, S) be a context-free
grammar where V is the set of objects of the form [q, A, p], q and p in Q, and A in
Γ, plus the new symbol S. P is the set of productions:

1) S → [q0, Z0, q] for each q in Q;

2) [q, A, qm+1] → a[q1, B1, q2][q2, B2, q3] … [qm, Bm, qm+1]

for each q, q1, q2, …, qm+1 in Q, each a in Σ ∪ {ε}, and A, B1, B2, …, Bm in Γ,

such that δ(q, a, A) contains (q1, B1B2…Bm).

(If m = 0, then the production is [q, A, q1] → a.)

Ø To understand the proof it helps to know that the variables and productions of
G have been defined in such a way that a leftmost derivation in G of a
sentence x is a simulation of the PDA M when fed the input x.

Ø In particular, the variables that appear in any step of a leftmost derivation in G


correspond to the symbols on the stack of M at a time when M has seen as
much of the input as the grammar has already generated.

Ø Put another way, the intention is that [q, A, p] derives x if and only if x causes
M to erase an A from its stack by some sequence of moves beginning in state q
and ending in state p.

Ø We show by induction that [q, A, p] ⇒* x if and only if (q, x, A) |--* (p, ε, ε).

Ø Suppose first that (q, x, A) |--* (p, ε, ε), and let the first move use
(q1, B1B2…Bn) in δ(q, a, A), so that x = ay. The string y can be written
y = y1y2…yn, where yj has the effect of popping Bj from the stack, possibly
after a long sequence of moves.

Ø That is, let y1 be the prefix of y at the end of which the stack first becomes as
short as n − 1 symbols.

Ø Let y2 be the symbols of y following y1 such that at the end of y2 the stack
first becomes as short as n − 2 symbols, and so on.
Ø Note that B1 need not be the nth stack symbol from the bottom during the
entire time y1 is being read by M, since B1 may be changed if it is at the top
of the stack and is replaced by one or more symbols.

Ø However, none of B2, B3, …, Bn are ever at the top while y1 is being read, and
so they cannot be changed or influence the computation.

In general, Bj remains on the stack unchanged while y1y2…yj−1 is read.

Ø For the converse we show by induction on j that if [q, A, p] ⇒* x in j steps,
then (q, x, A) |--* (p, ε, ε). The basis, j = 1, is immediate, since [q, A, p] → x
must be a production of G and therefore δ(q, x, A) must contain (p, ε). Note
that x is ε or in Σ here.

For the induction, suppose

[q, A, p] ⇒ a[q1, B1, q2][q2, B2, q3] … [qn, Bn, qn+1] ⇒* x in j steps,

where qn+1 = p. Then we may write x = ax1x2…xn, where [qi, Bi, qi+1] ⇒* xi
for 1 ≤ i ≤ n, with each derivation taking fewer than j steps.

Ø By the inductive hypothesis, (qi, xi, Bi) |--* (qi+1, ε, ε). If we insert
Bi+1…Bn at the bottom of each stack in the above sequence of IDs, we see that

(qi, xixi+1…xn, BiBi+1…Bn) |--* (qi+1, xi+1…xn, Bi+1…Bn).

From the first step in the derivation of x from [q, A, p] we know that

(q, x, A) |-- (q1, x1x2…xn, B1B2…Bn)

is a legal move of M, so from this move and the above for i = 1, 2, …, n,
(q, x, A) |--* (p, ε, ε) follows.

Ø The proof concludes with the observation that the claim with q = q0 and
A = Z0 says

[q0, Z0, p] ⇒* x if and only if (q0, x, Z0) |--* (p, ε, ε).

This observation, together with rule (1) of the construction of G, says that
S ⇒* x if and only if (q0, x, Z0) |--* (p, ε, ε) for some state p.

That is x is in L(G) if and only if x is in N(M).
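The triple construction itself is mechanical: for each move, guess the intermediate states in every possible way. The sketch below (our own illustration) builds the production list; variables are encoded as Python tuples (q, A, p), and 'S' is the new start symbol.

```python
from itertools import product

def pda_to_cfg(delta, states, q0, Z0):
    """Triple construction: delta maps (q, a, A) to a set of (q1, gamma)
    pairs, with '' standing for ε. Returns a list of (lhs, rhs) productions."""
    prods = [('S', [(q0, Z0, p)]) for p in states]        # rule (1)
    for (q, a, A), moves in delta.items():
        for (q1, gamma) in moves:
            m = len(gamma)
            # guess the intermediate states q2, ..., q_{m+1} in every way
            for qs in product(states, repeat=m):
                chain = (q1,) + qs
                rhs = ([a] if a else []) + \
                      [(chain[i], gamma[i], chain[i + 1]) for i in range(m)]
                prods.append(((q, A, chain[-1]), rhs))    # rule (2)
    return prods
```

When m = 0 the inner loop runs once with no intermediate states, producing [q, A, q1] → a, exactly as the construction specifies.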

3.7 PUMPING LEMMA

Ø We have seen the pumping lemma for regular languages. It states
that every sufficiently long string in a regular language contains a
substring that can be pumped,

(i.e.) the substring can be repeated as many times as desired such that the
resulting string still lies in the regular language.

Ø The pumping lemma for a CFL states that every sufficiently long string of a
CFL contains two short substrings, close together, that can be repeated the
same number of times such that the resulting string still lies in the CFL.

The formal statement of the pumping lemma is as follows:

Let L be any CFL. There is a constant n, depending only on L, such that if z
is in L and |z| ≥ n, then we may write z = uvwxy such that
i) |vx| ≥ 1,
ii) |vwx| ≤ n, and
iii) u v^i w x^i y is in L for all i ≥ 0.
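Conditions (i)-(iii) can be exercised on a concrete language. The sketch below uses L = {a^m b^m : m ≥ 0} (our own example, not from the notes), picks n = 4 and a valid decomposition of z = a^4 b^4, and verifies all three conditions by brute force.

```python
def in_L(s):
    """Membership test for the sample CFL {a^m b^m : m >= 0}."""
    m = len(s) // 2
    return len(s) % 2 == 0 and s == 'a' * m + 'b' * m

n = 4
z = 'a' * n + 'b' * n                      # z in L with |z| >= n
# decomposition z = uvwxy straddling the centre of z
u, v, w, x, y = 'a' * (n - 1), 'a', '', 'b', 'b' * (n - 1)
assert u + v + w + x + y == z
assert len(v + x) >= 1                     # condition (i)
assert len(v + w + x) <= n                 # condition (ii)
for i in range(6):                         # condition (iii), spot-checked
    assert in_L(u + v * i + w + x * i + y)
```

Pumping v and x together adds the same number of a's and b's, which is why every pumped string stays in L.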

Proof:
Let G be a CFG in CNF.

First we show that if the parse tree for a word generated by a CNF grammar
has no path of length greater than i, then the word is of length no greater than
2^(i−1). We use induction on i to prove this.

The case i = 1 is trivial. We know that every production in CNF is of the
form
A -> a or A -> BC

So the tree for i = 1 must be of the form

S
|
a

(i.e.) for i = 1 the length of the word generated is at most 1. So the claim holds
for i = 1. Assume that it is true for smaller trees: a tree with no path longer
than i − 1 generates no word longer than 2^(i−2).

For the induction step consider a tree whose root uses a production of the
form S -> AB:

S
/ \
A   B
T1  T2

If the whole tree has no path of length greater than i, then the subtrees T1 and
T2 have no path longer than i − 1, so by the assumption the words generated
by T1 and T2 are each no longer than 2^(i−2).

Hence a word generated by a parse tree with no path longer than i is no longer
than 2^(i−2) + 2^(i−2) = 2^(i−1).

Let the grammar have k non-terminal symbols and let n = 2^k. If z ∈ T* is in L(G)
and |z| ≥ n, then since |z| > 2^(k−1), any parse tree for z must
contain at least one path of length at least k + 1. Such a path has at least k + 2
nodes, all but the leaf of which are labeled by non-terminals.
But the grammar has only k non-terminals, so at least one non-terminal must
appear twice on the path. Let P be a path that is as long as or longer than any
other path in the tree. Then there must be two nodes n1 and n2 along P such
that
i) The nodes n1 and n2 both have the same label, say A.
ii) The node n1 is closer to the root than n2.
iii) The portion of the path from n1 to the leaf is of length at most k + 1.

The subtree T1 with root n1 represents the derivation of a subword of length
at most 2^k. This is because no path in T1 can be of length greater than
k + 1.

Let z1 be the yield of the subtree T1, and let T2 be the subtree with root n2
and z2 the yield of T2.

Now we can write z1 as z3 z2 z4. We can see that z3 and z4 cannot both be ε,
since the first production used in T1 must have been of the
form A -> BC,

and the subtree T2 is derived entirely from B or from C.

Now we know that

A ⇒* z3Az4 and A ⇒* z2, and

|z3z2z4| ≤ 2^k = n.

We can see that

A ⇒* z3Az4 ⇒* z3z3Az4z4 ⇒* z3z3z3Az4z4z4 ⇒* …

Now z3z2z4 is a substring of z, so

z = u z3z2z4 y, and
if
z2 = w,
z3 = v and
z4 = x, then
we have
uvwxy ∈ L(G) and
u v^i w x^i y ∈ L(G) for all i ≥ 0,

where
|vwx| ≤ n and
|vx| ≥ 1.
PART A

1. Define PDA.
2. Write the Components of PDA.
3. Write the formal representation of PDA
4. Give the diagrammatic representation of PDA
5. What are the three ways to recognize PDA?
6. Give an informal description of a PDA for the language L = {0^n1^n | n >= 0}
7. Give the mathematical model of a PDA for the language L = {0^n1^m |
n >= 0, m >= 0, m != n}
8. Write the instantaneous description (ID) of a PDA.
9. What is the relation between PDA's and CFLs?
10. What is the relation between an NPDA and a DPDA?
11. Write the closure properties of CFLs.
12. Define the pumping lemma for regular languages.
13. Show that the language L = {a^nb^nc^n : n >= 0} is not context free.
14. Construct a PDA that accepts the language generated by the grammar
S → aSbb | aab.

PART-B

1. Write the mathematical model of the language L = {0^n1^n | n >= 0}
2. Write the mathematical model of the language L = {wcwR}
3. Construct a PDA for L = {a^nb^2n : n >= 1}
4. Construct a PDA for L = {a^nb^3n : n >= 1}
5. Construct a PDA equivalent to the grammar S → aAA, A → aS | bS | a
6. Construct a PDA equivalent to the grammar S → aAA, A → SA | b
7. Construct a PDA accepting the language L = {a^nb^2n : n >= 1}
8. Construct a PDA accepting {a^nb^ma^n : n >= 1, m >= 1, m != n} by empty store and
by reaching a final state.
9. Construct a PDA accepting the language generated by the grammar
G = ({S,A}, {a,b}, S, P) with the productions S → AA | a, A → SA | b
