Pec 31 Acd Material

PARSING:-
Even though a compiler may not actually construct a parse tree, a parser must be capable of constructing one.
The top-down construction of a parse tree starts with the root, labeled with the starting non-terminal, and repeatedly performs the following two steps:
1. At node n, labeled with the non-terminal A, select one of the productions for A and construct children at n for the symbols on the right side of the production.
2. Find the next node at which a subtree is to be constructed.
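[Figure: steps (a) to (e) in the top-down construction of the parse tree for array [ num dotdot num ] of integer. (a) the root <type>; (b) <type> expanded to array [ <simple> ] of <type>; (c) the first <simple> expanded to num dotdot num; (d) the inner <type> expanded to <simple>; (e) that <simple> expanded to integer.]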
For some grammars, the above steps can be implemented during a single left-to-right scan of the input string.
The current token being scanned in the input is frequently referred to as the lookahead symbol.
Initially, the lookahead symbol is the first, i.e., leftmost, token of the input string.
In general, the selection of a production for a non-terminal may involve trial and error, i.e., we may have to try a production and backtrack to try another production if the first is found to be unsuitable.
A production is unsuitable if, after using it, we cannot complete the tree to match the input string.
There is, however, an important special case called predictive parsing, in which backtracking does not occur.
PREDICTIVE PARSING:-
Recursive-descent parsing is a top-down method of syntax analysis in which we execute a set of recursive procedures to process the input.
A procedure is associated with each non-terminal of the grammar.
The predictive parser consists of procedures for the non-terminals <type> and <simple> of the grammar and an additional procedure, match.
MATCH:-
Match is used to simplify the code for <type> and <simple>: it advances to the next input token if its argument t matches the lookahead symbol, and reports an error otherwise.
procedure match(t: token);
begin
    { advance only when the lookahead symbol is the expected token }
    if lookahead = t then
        lookahead := nexttoken
    else
        error
end;

procedure type;
begin
    { the lookahead symbol alone selects the production for <type> }
    if lookahead is in {integer, char, num} then
        simple
    else if lookahead = '^' then begin
        match('^'); match(id)
    end
    else if lookahead = array then begin
        match(array); match('[');
        simple;
        match(']'); match(of);
        type
    end
    else
        error
end;

procedure simple;
begin
    if lookahead = integer then
        match(integer)
    else if lookahead = char then
        match(char)
    else if lookahead = num then begin
        match(num); match(dotdot); match(num)
    end
    else
        error
end;
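The procedures above can be sketched in Python as follows. This is only an illustration: tokens are assumed to be plain strings, and the names Parser, ParseError and parses are hypothetical helpers that do not appear in the notes.

```python
# Recursive-descent predictive parser sketch for
#   <type>   -> <simple> | ^ id | array [ <simple> ] of <type>
#   <simple> -> integer | char | num dotdot num
# Token names are plain strings chosen for illustration.

class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Advance only if the lookahead symbol equals the expected token.
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")

    def type_(self):
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def parses(tokens):
    """Return True if the token list is a complete <type>."""
    parser = Parser(tokens)
    try:
        parser.type_()
        parser.match("$")
        return True
    except ParseError:
        return False
```

For instance, parses(["array", "[", "num", "dotdot", "num", "]", "of", "integer"]) follows the same sequence of expansions as the parse tree construction described earlier, with no backtracking.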
TOKENS, PATTERNS, LEXEMES:-
-> In general, there is a set of strings in the input for which the same token is produced as output.
-> This set of strings is described by a rule called a pattern (often a regular expression) associated with the token.
-> A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.
-> For example:
const pi = 3.1416
The substring pi is a lexeme for the token identifier.
-> Tokens can be treated as terminal symbols in the grammar for the source language.
-> The lexemes matched by the pattern for a token represent strings of characters in the source program that can be treated together as a lexical unit. In most programming languages the following constructs are treated as tokens:
keywords, operators, identifiers, constants, literal strings, and punctuation symbols such as parentheses, commas and semicolons.
-> In the above example, when the character sequence pi appears in the source program, a token representing an identifier is returned to the parser.
-> The returning of a token is often implemented by passing an integer corresponding to the token.
-> A pattern is a rule describing the set of lexemes that can represent a particular token in the source program.
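The relationship between tokens, patterns and lexemes can be sketched with a small scanner. The token names and regular expressions below are assumptions chosen for illustration; a real lexical analyzer would have a larger set and would usually return an integer code per token.

```python
import re

# Each token name is paired with a pattern (a regular expression).
# Names and patterns are illustrative assumptions, not a full language.
TOKEN_SPEC = [
    ("const", r"const\b"),
    ("num",   r"\d+(?:\.\d+)?"),
    ("id",    r"[A-Za-z_]\w*"),
    ("relop", r"<=|>=|<>|=|<|>"),
    ("skip",  r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (token, lexeme) pairs; whitespace is discarded."""
    pairs = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"no pattern matches at position {pos}")
        if m.lastgroup != "skip":
            # m.lastgroup is the token name, m.group() is the lexeme.
            pairs.append((m.lastgroup, m.group()))
        pos = m.end()
    return pairs
```

Calling tokenize("const pi = 3.1416") pairs the lexeme pi with the token id, as in the example above.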
Parsing:
-> Parsing is the process of determining if a string of tokens can be generated by a grammar.
-> A parser can be constructed for any grammar.
-> For any CFG there is a parser that takes at most O(n^3) time to parse a string of n tokens.
-> But cubic time is too expensive.
-> Given a programming language, we can generally construct a grammar that can be parsed quickly.
-> Linear algorithms suffice to parse essentially all languages that arise in practice.
-> Programming language parsers almost always make a single left-to-right scan over the input, looking ahead one token at a time.
-> Most parsing methods fall into one of two classes, called the top-down and bottom-up methods.
-> Top-down methods start at the root and proceed towards the leaves, while bottom-up methods start at the leaves and proceed towards the root.
Top-down parsing:-
-> For example, the following grammar generates a subset of the types of Pascal. We assume that dotdot is a single token, to emphasize that the character sequence ".." is treated as a unit.
<type> → <simple> | ^id | array [ <simple> ] of <type>
<simple> → integer | char | num dotdot num
NON-RECURSIVE PREDICTIVE PARSING:
A non-recursive predictive parser can be built by maintaining a stack explicitly, rather than implicitly via recursive calls.
The key problem during predictive parsing is that of determining the production to be applied for a non-terminal.
The non-recursive parser looks up the production to be applied in a parsing table.
[Figure: model of a non-recursive predictive parser. An input buffer (a + b $) and a stack (X Y Z $, with X on top) feed the predictive parsing program, which consults the parsing table M and produces the output.]
A table-driven predictive parser has an input buffer, a stack, a parsing table, and an output stream.
The input buffer contains the string to be parsed, followed by $, a symbol used as a right end marker to indicate the end of the input string.
The stack contains a sequence of grammar symbols with $ on the bottom, indicating the bottom of the stack.
Initially, the stack contains the start symbol of the grammar on top of $.
The parsing table is a two-dimensional array M[A, a], where A is a non-terminal and a is a terminal or the symbol $.
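For example, the left-recursive expression grammar
E → E+T | E-T | T
T → T*F | T/F | F
F → (E) | id
becomes, after eliminating left recursion,
E → TE'
E' → +TE' | -TE' | ε
T → FT'
T' → *FT' | /FT' | ε
F → (E) | id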
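The second form of the expression grammar can be obtained from the first by removing immediate left recursion: A → Aα | β is rewritten as A → βA' and A' → αA' | ε. A minimal sketch of that rewrite is given below. The function name and the list-of-symbols representation are assumptions made for illustration, and indirect left recursion is not handled.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A', A' -> a A' | ε.

    Each alternative is a list of symbols; [] stands for ε.
    Returns a dict mapping non-terminal names to lists of alternatives.
    """
    # Tails of the left-recursive alternatives (the alpha parts).
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives that do not start with the non-terminal (the beta parts).
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}
    new = nonterminal + "'"
    return {
        nonterminal: [alt + [new] for alt in others],
        new: [alt + [new] for alt in recursive] + [[]],
    }
```

Applied to E with the alternatives E+T, E-T and T, it returns E → TE' and E' → +TE' | -TE' | ε.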
The parser is controlled by a program that behaves as follows:
The program determines the action based on X, the symbol on top of the stack, and a, the current input symbol.
The actions are as follows:
1. If X = a = $, the parser halts and announces successful completion of parsing.
2. If X = a ≠ $, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a non-terminal, the program consults entry M[X, a] of the parsing table M.
This entry is either an X-production of the grammar or an error entry. If the entry M[X, a] contains X → UVW, the parser replaces X on the top of the stack by WVU (with U on top). If M[X, a] = error, the parser calls an error recovery routine.
Algorithm (non-recursive predictive parser):
Input: A string w and a parsing table M for grammar G.
Output: If w is in L(G), a leftmost derivation of w; otherwise, an error indication.
Method: Initially the parser has $S on the stack, where S is the start symbol, and w$ in the input buffer.
1. Set ip to point to the first symbol of w$.
2. repeat
       let X be the top-of-stack symbol and a the symbol pointed to by ip
       if X is a terminal or $ then
           if X = a then
               pop X from the stack and advance ip
           else
               error()
       else
           if M[X, a] = X → Y1 Y2 ... Yk then
           {
               pop X from the stack
               push Yk, Yk-1, ..., Y1 onto the stack, with Y1 on top
               output the production X → Y1 Y2 ... Yk
           }
           else
               error()
   until X = $
The behavior of the parser can be described in terms of its configurations.
For example, the parse table is as follows:
NON-TERMINAL   INPUT SYMBOL
               id        +           *           (         )         $
E              E→TE'                             E→TE'
E'                       E'→+TE'                           E'→ε      E'→ε
T              T→FT'                             T→FT'
T'                       T'→ε        T'→*FT'               T'→ε      T'→ε
F              F→id                              F→(E)
The configurations of the parser with input string id+id*id$ and stack $E are as follows:
Stack        Input        Output
$E           id+id*id$
$E'T         id+id*id$    E→TE'
$E'T'F       id+id*id$    T→FT'
$E'T'id      id+id*id$    F→id
$E'T'        +id*id$
$E'          +id*id$      T'→ε
$E'T+        +id*id$      E'→+TE'
$E'T         id*id$
$E'T'F       id*id$       T→FT'
$E'T'id      id*id$       F→id
$E'T'        *id$
$E'T'F*      *id$         T'→*FT'
$E'T'F       id$
$E'T'id      id$          F→id
$E'T'        $
$E'          $            T'→ε
$            $            E'→ε
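The table-driven algorithm can be sketched in Python using the parse table for the expression grammar. The dictionary layout of M and the function name predictive_parse are assumptions for illustration; a missing entry plays the role of an error entry, and error recovery is reduced to raising an exception.

```python
EPS = "ε"
# Parsing table M for the expression grammar; missing entries mean "error".
# An empty right side stands for an ε-production.
M = {
    ("E", "id"): ["T", "E'"],   ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],            ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],   ("T", "("): ["F", "T'"],
    ("T'", "+"): [],            ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],            ("T'", "$"): [],
    ("F", "id"): ["id"],        ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def predictive_parse(tokens, start="E"):
    """Return the productions of a leftmost derivation, or raise SyntaxError."""
    buf = list(tokens) + ["$"]
    stack = ["$", start]            # top of the stack is the end of the list
    ip = 0
    output = []
    while True:
        X, a = stack[-1], buf[ip]
        if X == "$" and a == "$":
            return output           # successful completion
        if X not in NONTERMINALS:   # X is a terminal or $
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()
            ip += 1
        else:
            rhs = M.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))   # push Yk ... Y1, leaving Y1 on top
            output.append(f"{X} -> {' '.join(rhs) or EPS}")
```

On the token list id + id * id the returned productions appear in the same order as the Output column of the trace above.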
Construction of parse table:-

The construction of a predictive parser is aided by two functions associated with a grammar G, namely FIRST and FOLLOW.
FIRST(α) is the set of terminals that begin the strings derived from α, where α ∈ (N ∪ T)*.
If α =>* ε, then ε is also in FIRST(α).
FOLLOW(A), for a non-terminal A, is the set of terminals a that can appear immediately to the right of A in some sentential form, i.e., the set of terminals a such that there exists a derivation of the form S =>* αAaβ for some α and β.
If A can be the rightmost symbol in some sentential form, then $ is in FOLLOW(A).
The rules for computing FIRST(X) for all grammar symbols X are as follows:
1. If X is a terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is a non-terminal and X → Y1 Y2 ... Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi) and ε is in all of FIRST(Y1), ..., FIRST(Yi-1), i.e., Y1 Y2 ... Yi-1 =>* ε.
If ε is in FIRST(Yj) for all j = 1, 2, ..., k, then add ε to FIRST(X).
If Y1 does not derive ε, then only the terminals of FIRST(Y1) are added to FIRST(X).
If Y1 =>* ε, then FIRST(Y2) is also added to FIRST(X), and so on.
The rules for computing FOLLOW(A) for all non-terminals A are as follows:
1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right end marker.
2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in FOLLOW(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε (i.e., β =>* ε), then everything in FOLLOW(A) is in FOLLOW(B).
For example:
Consider the CFG
E → TE'
E' → +TE' | ε
T → FT'
T' → *FT' | ε
F → (E) | id
Then FIRST(+) = {+}, FIRST(*) = {*}, FIRST(id) = {id},
FIRST('(') = {'('}, FIRST(')') = {')'},
FIRST(E) = {'(', id}
FIRST(T) = {'(', id}
FIRST(F) = {'(', id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FOLLOW(E) = {')', $}
FOLLOW(E') = {')', $}
FOLLOW(T) = {+, ')', $}  (+ comes from FIRST(E'); ')' and $ come from FOLLOW(E), since E' =>* ε)
FOLLOW(T') = {+, ')', $}
FOLLOW(F) = {*, +, ')', $}  (* comes from FIRST(T'); the rest comes from FOLLOW(T), since T' =>* ε)
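The FIRST and FOLLOW rules can be applied mechanically by repeating them until no set changes. The sketch below does this for the example grammar; the dictionary representation and the function names are assumptions made for illustration.

```python
EPS = "ε"
GRAMMAR = {                      # [] stands for an ε-production
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}


def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given FIRST of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # terminal: {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)              # every symbol derives ε, or the string is empty
    return result


def compute_first(grammar):
    first = {A: set() for A in grammar}
    changed = True
    while changed:               # repeat until no set grows
        changed = False
        for A, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_string(alt, first)
                if not new <= first[A]:
                    first[A] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {A: set() for A in grammar}
    follow[start].add("$")                           # rule 1
    changed = True
    while changed:
        changed = False
        for A, alternatives in grammar.items():
            for alt in alternatives:
                for i, B in enumerate(alt):
                    if B not in grammar:             # only non-terminals
                        continue
                    beta_first = first_of_string(alt[i + 1:], first)
                    new = beta_first - {EPS}         # rule 2
                    if EPS in beta_first:
                        new |= follow[A]             # rule 3
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow


FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "E")
```

The resulting FIRST and FOLLOW dictionaries hold the same sets as the worked example above.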
Algorithm for construction of a predictive parsing table:
Input: Grammar G
Output: Parsing table M
Method:
1. For each production A → α of the grammar, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A, a].
3. If ε is in FIRST(α), add A → α to M[A, b] for each terminal b in FOLLOW(A).
   If ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A, $].
4. Make each undefined entry of M an error entry.
→ This algorithm can be applied to the above grammar.
→ First, E → TE':
Since FIRST(TE') = FIRST(T) = {(, id},
the production E → TE' is added to M[E, (] and M[E, id].
Similarly,
E' → +TE' is added to M[E', +].
E' → ε is added to M[E', )] and M[E', $], since FOLLOW(E') = {), $}.
T → FT':
Since FIRST(FT') = FIRST(F) = {(, id},
the production T → FT' is added to M[T, (] and M[T, id].
T' → *FT' is added to M[T', *].
T' → ε is added to M[T', )], M[T', +] and M[T', $],
since FOLLOW(T') = {+, ), $}.
F → (E) is added to M[F, (], and
F → id is added to M[F, id].
The table is as follows:
NON-TERMINAL   INPUT SYMBOL
               id        +           *           (         )         $
E              E→TE'                             E→TE'
E'                       E'→+TE'                           E'→ε      E'→ε
T              T→FT'                             T→FT'
T'                       T'→ε        T'→*FT'               T'→ε      T'→ε
F              F→id                              F→(E)
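The table construction steps can be sketched as follows. The FIRST and FOLLOW sets are copied from the worked example rather than recomputed, and the names PRODUCTIONS, build_table and TABLE are assumptions for illustration. Each table cell holds a list so that multiply defined entries would be visible.

```python
EPS = "ε"
PRODUCTIONS = [                  # [] stands for an ε right side
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
# FIRST and FOLLOW sets copied from the example in the text.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}


def first_of_string(symbols):
    """FIRST of a right side; a terminal's FIRST set is the terminal itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    """Map (A, a) to the list of right sides placed in M[A, a]."""
    table = {}
    for A, alpha in productions:
        fs = first_of_string(alpha)
        targets = fs - {EPS}             # step 2: terminals in FIRST(alpha)
        if EPS in fs:
            targets |= FOLLOW[A]         # step 3: FOLLOW(A), which may hold $
        for a in targets:
            table.setdefault((A, a), []).append(alpha)
    return table                         # absent keys are the error entries


TABLE = build_table(PRODUCTIONS)
```

Every cell of TABLE ends up with exactly one production, which is what makes the expression grammar suitable for predictive parsing.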
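As a second example, consider the grammar for the if-then-else statement:
S → iEtSS' | a        FIRST(S) = {i, a}
S' → eS | ε           FIRST(S') = {e, ε}
E → b                 FIRST(E) = {b}
FIRST(i) = {i}
FIRST(a) = {a}
FIRST(b) = {b}
FIRST(t) = {t}
FOLLOW(S) = {e, $}
FOLLOW(S') = {e, $}
FOLLOW(E) = {t}
The parsing table is as follows:
NON-TERMINAL   INPUT SYMBOL
               a         b         e         i           t         $
S              S→a                           S→iEtSS'
S'                                 S'→eS                           S'→ε
                                   S'→ε
E                        E→b
* If G is left recursive or ambiguous, then M will have at least one multiply defined entry.
* In the above table, M[S', e] contains both S' → eS and S' → ε, since FOLLOW(S') = {e, $}.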
LL(1) GRAMMARS
A grammar whose parsing table has no multiply defined entries is said to be LL(1).
→ The first 'L' stands for scanning the input from left to right.
→ The second 'L' stands for producing a leftmost derivation.
→ The '1' stands for using one input symbol of lookahead at each step to make parsing action decisions.
→ LL(1) grammars have several distinctive properties: no ambiguous or left-recursive grammar can be LL(1).
→ A grammar G is LL(1) iff whenever A → α | β are two distinct productions of G, the following conditions hold:

1. For no terminal a do both α and β derive strings beginning with a.
2. At most one of α and β can derive the empty string.
3. If β =>* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
NOTE: The grammar for arithmetic expressions above is LL(1).
The grammar for if-then-else statements is not LL(1).
→ When a parsing table has multiply defined entries, one recourse is to transform the grammar by eliminating all left recursion and then left factoring whenever possible.
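The three LL(1) conditions can be checked pair by pair. The sketch below does so for the if-then-else grammar, using FIRST and FOLLOW sets written out by hand. The function name ll1_violations and the reason strings are assumptions for illustration.

```python
from itertools import combinations

EPS = "ε"
# If-then-else grammar: S -> iEtSS' | a, S' -> eS | ε, E -> b
GRAMMAR = {
    "S":  [["i", "E", "t", "S", "S'"], ["a"]],
    "S'": [["e", "S"], []],
    "E":  [["b"]],
}
FIRST = {"S": {"i", "a"}, "S'": {"e", EPS}, "E": {"b"}}
FOLLOW = {"S": {"e", "$"}, "S'": {"e", "$"}, "E": {"t"}}


def first_of_string(symbols):
    """FIRST of a right side; a terminal's FIRST set is the terminal itself."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_violations(grammar):
    """Return (A, alpha, beta, reason) for every violated LL(1) condition."""
    problems = []
    for A, alternatives in grammar.items():
        for alpha, beta in combinations(alternatives, 2):
            fa, fb = first_of_string(alpha), first_of_string(beta)
            if (fa & fb) - {EPS}:                         # condition 1
                problems.append((A, alpha, beta, "common first terminal"))
            if EPS in fa and EPS in fb:                   # condition 2
                problems.append((A, alpha, beta, "both derive ε"))
            if EPS in fb and (fa - {EPS}) & FOLLOW[A]:    # condition 3
                problems.append((A, alpha, beta, "FIRST/FOLLOW overlap"))
            if EPS in fa and (fb - {EPS}) & FOLLOW[A]:    # condition 3, swapped
                problems.append((A, beta, alpha, "FIRST/FOLLOW overlap"))
    return problems


VIOLATIONS = ll1_violations(GRAMMAR)
```

The only violation reported is for S' → eS | ε: the terminal e is in both FIRST(eS) and FOLLOW(S'), which is the same clash that produces the multiply defined entry M[S', e].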
BOTTOM-UP PARSING:
→ It is also known as shift-reduce parsing.
→ An easy-to-implement form of shift-reduce parsing is operator-precedence parsing.
→ A much more general method of shift-reduce parsing is called LR parsing.
→ It attempts to construct a parse tree for an input string beginning at the leaves (bottom) and working up towards the root (top).
→ At each reduction step, a particular substring matching the right side of a production is replaced by the symbol on the left of that production.
→ If the substring is chosen correctly at each step, a rightmost derivation is traced out in reverse.
For example:
Consider the grammar
S → aABe
A → Abc | b
B → d
The sentence abbcde can be reduced to S by the following steps:
abbcde
aAbcde
aAde
aABe
S
These reductions trace out the following rightmost derivation in reverse:
S => aABe => aAde => aAbcde => abbcde
Handles:
A handle of a string is a substring that matches the right side of a production, and whose reduction to the non-terminal on the left side of the production represents one step along the reverse of a rightmost derivation.
-> Formally, a handle of a right-sentential form γ is a production A -> β and a position of γ where the string β may be found and replaced by A to produce the previous right-sentential form in a rightmost derivation of γ.
I.e., if S =>* αAw => αβw, then A -> β in the position following α is a handle of αβw. The string w to the right of the handle contains only terminals.
-> Note that there may be more than one handle if the grammar is ambiguous.
-> If a grammar is unambiguous, then every right-sentential form of the grammar has exactly one handle.
For example, with the grammar
S -> aABe
A -> Abc | b
B -> d
the sentence abbcde can be reduced step by step, reversing the rightmost derivation
S => aABe => aAde => aAbcde => abbcde

So abbcde is a right-sentential form whose handle is A -> b at position 2.
Likewise, aAbcde is a right-sentential form whose handle is A -> Abc at position 2.
Note that we also say that the substring β is a handle of αβw.
Handle pruning

Handle pruning is nothing but reducing the handle to the non-terminal on the left side of the production.
A rightmost derivation in reverse can be obtained by handle pruning.
Two problems must be solved when parsing by handle pruning. They are:
- to locate the substring to be reduced in a right-sentential form, and
- to determine what production to choose if there is more than one production with that substring on the right side.
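Handle pruning can be illustrated on the sentence abbcde of the grammar S -> aABe, A -> Abc | b, B -> d. In the sketch below the handles are supplied by hand, so it sidesteps the two problems just listed, which a real shift-reduce parser has to solve. The names HANDLES and prune are assumptions, and every grammar symbol is assumed to be a single character.

```python
# Each step names the handle: (left side, right side, 1-based position).
# The handle sequence is written out by hand for this example.
HANDLES = [
    ("A", "b", 2),
    ("A", "Abc", 2),
    ("B", "d", 3),
    ("S", "aABe", 1),
]


def prune(sentence, handles):
    """Apply each reduction in turn and return all right-sentential forms."""
    forms = [sentence]
    current = sentence
    for left, right, position in handles:
        start = position - 1
        if current[start:start + len(right)] != right:
            raise ValueError(
                f"{right!r} is not at position {position} of {current!r}"
            )
        # Replace the handle by the non-terminal on the left side.
        current = current[:start] + left + current[start + len(right):]
        forms.append(current)
    return forms


FORMS = prune("abbcde", HANDLES)
```

Reading FORMS from last to first gives the rightmost derivation S => aABe => aAde => aAbcde => abbcde.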
Shift Reduce Parsing:

-> A shift-reduce parser is a bottom-up parser, which attempts to construct a parse tree for an input string beginning at the leaves and working up towards the root.
-> At each step of reduction, a particular substring matching the right side of a production is replaced by the symbol on the left of that production.
-> The operations or actions of a shift-reduce parser are:
Shift: The next input symbol is shifted onto the top of the stack.

Reduce: The handle on the top of the stack is replaced by the non-terminal on the left side of the production selected for reducing the handle.
Accept: The parser announces successful completion of parsing.
Error: The parser discovers that a syntax error has occurred and calls an error recovery routine.
IMPLEMENTATION:

• It uses a stack, which holds grammar symbols, and an input buffer, which holds the string to be parsed.

• The $ symbol is used to mark the bottom of the stack and also the right end of the input.
• Initially, the stack is empty and the string w is on the input, as shown below:
STACK        INPUT
$            w$
• The parser operates by shifting zero or more input symbols onto the stack until a handle β is on top of the stack.
• Once the handle appears on top of the stack, the parser reduces β to the left side of the appropriate production.

• The parser repeats the above steps until it has detected an error or until the stack contains the start symbol and the input is empty.
• After the parser enters this state, it halts and announces successful completion of parsing.
• The stack implementation for the input string id1+id2*id3 is as shown below:
STACK       INPUT           ACTION
$           id1+id2*id3$    shift
$id1        +id2*id3$       reduce by E -> id
$E          +id2*id3$       shift
$E+         id2*id3$        shift
$E+id2      *id3$           reduce by E -> id
$E+E        *id3$           shift
$E+E*       id3$            shift
$E+E*id3    $               reduce by E -> id
$E+E*E      $               reduce by E -> E*E
$E+E        $               reduce by E -> E+E
$E          $               accept
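* At one step the stack contains E+E while the next input symbol is *. Even though E+E could be reduced to E by the production E -> E+E, the parser shifts * instead, so that E*E is reduced first and * takes precedence over +. The sequence of reductions it performs is the reverse of the rightmost derivation E => E+E => E+E*E => E+E*id3 => E+id2*id3 => id1+id2*id3. Reducing E+E at that point would trace out a different derivation, one that groups id1+id2 first.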
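The stack implementation can be sketched for the grammar E -> E+E | E*E | (E) | id. Because this grammar is ambiguous, the sketch needs a rule for choosing between shift and reduce. It uses an operator-precedence table, which is an assumption added here for illustration and is not the only possible rule. Error detection is deliberately minimal.

```python
PRECEDENCE = {"+": 1, "*": 2}    # assumption: * binds tighter than +


def shift_reduce(tokens):
    """Parse with E -> E+E | E*E | (E) | id and return the list of actions."""
    stack = ["$"]                        # top of the stack is the end of the list
    buf = list(tokens) + ["$"]
    actions = []
    while True:
        a = buf[0]                       # next input symbol
        if stack[-1] == "id":
            stack[-1] = "E"
            actions.append("reduce by E -> id")
        elif stack[-3:] == ["(", "E", ")"]:
            stack[-3:] = ["E"]
            actions.append("reduce by E -> (E)")
        elif (len(stack) >= 4 and stack[-1] == "E" and stack[-3] == "E"
              and stack[-2] in PRECEDENCE
              and PRECEDENCE.get(a, 0) <= PRECEDENCE[stack[-2]]):
            # E op E is on top; reduce unless the lookahead binds tighter.
            op = stack[-2]
            stack[-3:] = ["E"]
            actions.append(f"reduce by E -> E{op}E")
        elif stack == ["$", "E"] and a == "$":
            actions.append("accept")
            return actions
        elif a == "$":
            raise SyntaxError("input exhausted before a complete parse")
        else:
            stack.append(buf.pop(0))
            actions.append("shift")
```

With the tokens id + id * id, the parser shifts * when E+E is on the stack, because * has higher precedence than +, and reduces E*E before E+E.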
CONFLICTS DURING SHIFT-REDUCE PARSING :
When a shift-reduce parser is applied to some CFGs it runs into conflicts, because shift-reduce parsing cannot be used for every CFG. The conflicts are:
SHIFT/REDUCE CONFLICT :
The parser, even knowing the entire stack contents and the next input symbol, cannot decide whether to shift or to reduce. This is a shift/reduce conflict.
REDUCE/REDUCE CONFLICT :
The parser, knowing the entire stack contents and the next input symbol, cannot decide which of several reductions to make.
FOR EX: dangling-else grammar
<stmt> -> if <expr> then <stmt> | if <expr> then <stmt> else <stmt>
STACK                            INPUT
... if <expr> then <stmt>        else ... $
We cannot tell whether if <expr> then <stmt> is the handle or not.
The parser is left in contention whether to shift else or to reduce the top of the stack,
so this is a shift/reduce conflict.
Simple LR Parsing (SLR):
A configuration of an LR parser is a pair (s0 X1 s1 X2 s2 ... Xm sm, ai ai+1 ... an $) whose first component is the stack contents and whose second component is the remaining input. It represents the right-sentential form X1 X2 ... Xm ai ai+1 ... an.
The next move of the parser is determined by the current input symbol ai and the state sm on top of the stack:
1. If action[sm, ai] = shift s, the parser executes a shift move, entering the configuration (s0 X1 s1 X2 s2 ... Xm sm ai s, ai+1 ... an $).
2. If action[sm, ai] = reduce A -> β, the parser executes a reduce move, entering the configuration
(s0 X1 s1 X2 s2 ... Xm-r sm-r A s, ai ai+1 ... an $)
where s = goto[sm-r, A] and r is the length of β, the right side of the production.
The parser first pops 2r symbols off the stack (r state symbols and r grammar symbols), exposing state sm-r. The parser then pushes both A, the left side of the production, and s, the entry for goto[sm-r, A], onto the stack.
The current input symbol is not changed in a reduce move.
3. If action[sm, ai] = accept, parsing is completed.
4. If action[sm, ai] = error, the parser has discovered an error and calls an error recovery routine.
ALGORITHM:

Input: An input string w and an LR parsing table with functions action and goto for a grammar G.

Output: If w is in L(G), a bottom-up parse for w; otherwise, an error indication.
Method: Initially the parser has s0 on its stack, where s0 is the initial state, and w$ in the input buffer.
Set ip to point to the first symbol of w$.
Repeat
{
    Let s be the state on top of the stack and a the symbol pointed to by ip;
    If action[s, a] = shift s' then
    {
        Push a then s' on top of the stack. Advance ip to the next input symbol.
    }
    Else if action[s, a] = reduce A -> β then
    {
        Pop 2*|β| symbols off the stack;
        Let s' be the state now on top of the stack;
        Push A then goto[s', A] on top of the stack;
        Output the production A -> β
    }
    Else if action[s, a] = accept then
        Return
    Else
        Error()
}
End.
For example:-
1. E->E+T
2. E->T
3. T->T*F
4. T->F
5. F->(E)
6. F->id
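In the table, si means shift and go to state i, rj means reduce by production number j, acc means accept, and a blank means error.
State   ACTION                                    GOTO
        id     +      *      (      )      $      E      T      F
0       s5                   s4                   1      2      3
1              s6                          acc
2              r2     s7            r2     r2
3              r4     r4            r4     r4
4       s5                   s4                   8      2      3
5              r6     r6            r6     r6
6       s5                   s4                          9      3
7       s5                   s4                                 10
8              s6                   s11
9              r1     s7            r1     r1
10             r3     r3            r3     r3
11             r5     r5            r5     r5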
The moves of the parser on the input id*id+id are:
Stack               Input       Action
0                   id*id+id$   shift
0 id 5              *id+id$     reduce by F->id
0 F 3               *id+id$     reduce by T->F
0 T 2               *id+id$     shift
0 T 2 * 7           id+id$      shift
0 T 2 * 7 id 5      +id$        reduce by F->id
0 T 2 * 7 F 10      +id$        reduce by T->T*F
0 T 2               +id$        reduce by E->T
0 E 1               +id$        shift
0 E 1 + 6           id$         shift
0 E 1 + 6 id 5      $           reduce by F->id
0 E 1 + 6 F 3       $           reduce by T->F
0 E 1 + 6 T 9       $           reduce by E->E+T
0 E 1               $           accept
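The LR parsing algorithm can be sketched in Python with the action and goto tables above. The dictionary encoding of the tables and the function name lr_parse are assumptions for illustration. As in the algorithm, the stack holds grammar symbols and states alternately, so a reduction pops 2*|β| entries.

```python
PRODUCTIONS = {     # number: (left side, length of right side, text)
    1: ("E", 3, "E->E+T"), 2: ("E", 1, "E->T"),
    3: ("T", 3, "T->T*F"), 4: ("T", 1, "T->F"),
    5: ("F", 3, "F->(E)"), 6: ("F", 1, "F->id"),
}
ACTION = {          # missing entries mean "error"
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Return the productions used, in order of reduction, or raise SyntaxError."""
    buf = list(tokens) + ["$"]
    stack = [0]                      # s0 X1 s1 ... Xm sm, top at the end
    ip = 0
    output = []
    while True:
        s, a = stack[-1], buf[ip]
        act = ACTION[s].get(a)
        if act is None:
            raise SyntaxError(f"no action for state {s} on {a!r}")
        if act == "acc":
            return output
        if act[0] == "s":            # shift: push a, then the new state
            stack.append(a)
            stack.append(int(act[1:]))
            ip += 1
        else:                        # reduce by production number act[1:]
            left, length, text = PRODUCTIONS[int(act[1:])]
            del stack[-2 * length:]  # pop 2*|beta| symbols
            exposed = stack[-1]      # state now on top of the stack
            stack.append(left)
            stack.append(GOTO[exposed][left])
            output.append(text)
```

For the tokens id * id + id the returned productions match the reduce entries in the Action column of the trace above, in the same order.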
