
Oxford University Press 2013. All rights reserved.

K. Muneeswaran
Professor and Head
Department of Computer Science and
Engineering
Mepco Schlenk Engineering College,
Sivakasi
Email: kmuni@mepcoeng.ac.in


COMPILER DESIGN
CHAPTER 4

Syntax Analysis
In these slides, we will cover the following topics:
Introduction
Context-free grammar and structure of a language
Parser and its types
Top-down parser
Bottom-up parser
Implementation
Parser generator tool
Error handling
INTRODUCTION
Syntax vs. semantic analysis

The choice of grammar is a trade-off among expressive power, flexibility and ease of implementation
Relation among Grammar, Language
and Recognizer
Context free grammar (CFG) for syntax
analysis
1. CFG is powerful enough to represent the structure
of the programming language

2. It does not look into the context or semantics
associated with the language representation

3. It is relatively easy to implement compared to
context sensitive grammar or unrestricted grammar
CFG G is defined as a 4-tuple:
G = (V_T, V_N, R, S)
1. V_N is a finite set of non-terminal symbols or variables
2. V_T is a finite set of terminal symbols, disjoint from V_N, which make up the actual content of the sentence
3. R is a relation from V_N to (V_N ∪ V_T)*. The members of R are called production rules or rewriting rules
4. S is the start variable, used to represent the whole sentence (or program)
Denoting CFG
A → α, where A is a non-terminal and α is a string of grammar symbols (from V_N or V_T)
CFG: An Example
S → aSbS
S → bSaS
S → ε

Or
S → aSbS | bSaS | ε

where S is the non-terminal and start symbol,
a and b are terminals, and
ε is the empty string

In many cases, the grammar is defined recursively to represent the language
Representations of Grammar and
Examples
The language generated by a CFG can be recognized by an abstract machine called a pushdown automaton.

Derivation (Production):
is the process of deriving the given sentence of the
language from the start symbol
Example Grammar:
exp → exp + exp | exp * exp | id

Sentence: id * id + id

Derivations:
exp ⇒ exp * exp
    ⇒ exp * exp + exp
    ⇒ exp * exp + id
    ⇒ id * exp + id
    ⇒ id * id + id

Derivations can be leftmost or rightmost
Sentential form: consists of terminals and non-terminals in the derivation process
Sentence: consists only of terminals
In general:
StartSymbol ⇒ One or more Sentential Forms ⇒ Sentence
Parse Trees:
are the diagrammatic representation of the
derivation steps
Interior Node: Non-terminal
Leaf Node: Terminal
Limitations of Context Free Grammar
The semantics of the language cannot be handled by a CFG

Example:
int a,b,c;
a = 10;
b = 20;
c = a + b;

Semantics, such as whether the variables are declared before they are used, cannot be incorporated as part of the syntax specification.

This type of language is denoted by:
L = {wcw | w is a string over the alphabet}
Ambiguities in Grammar and resolving
them
An ambiguous grammar is a grammar that gives more than one parse tree for at least one sentence w.

For example, the grammar:
exp → exp + exp | exp * exp | id
is ambiguous
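The ambiguity can be checked by counting parse trees. The following is a minimal sketch, not part of the original slides: the function name `count_parse_trees` and the token-list representation are assumptions made for illustration. It counts the parse trees of a token sequence under exp → exp + exp | exp * exp | id by trying every operator as the root of the tree.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees of `tokens` under exp -> exp + exp | exp * exp | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct parse trees deriving tokens[i:j] from exp.
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


# The sentence from the slides has two parse trees, so the grammar is ambiguous.
print(count_parse_trees(["id", "*", "id", "+", "id"]))  # 2
```

A count greater than one for any sentence is enough to show that the grammar is ambiguous.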
Eliminating ambiguity from grammar
1. Eliminating ambiguity due to lack of precedence and
associativity

2. Eliminating dangling else ambiguity
Steps for eliminating ambiguity due to
lack of precedence and associativity
1. Add n new non-terminals, where n is the number of precedence levels.

2. Define the lowest-precedence operators for the start symbol.

S → S operator11 S | S operator12 S | ...

where S is the start symbol and operator11, operator12, etc. are the operators having the lowest precedence.

3. For each new non-terminal, rules are defined using the operators at the next level of precedence.
A → A operator21 A | A operator22 A | ...
where A is a new non-terminal and operator21, operator22, etc. are operators at the same level of precedence

4. At the end, the definitions in the given grammar that do not involve precedence are added to the remaining new non-terminal

5. To include the associativity, in each definition

a. If the associativity is left to right, then write the definition in left recursive form:
A → A operator1 B | A operator2 B | ... | B

b. If the associativity is right to left, then write the definition in right recursive form:
A → B operator1 A | B operator2 A | ... | B

where A is the non-terminal which defines operators at a particular level of precedence and B is the non-terminal which defines operators at the next higher level of precedence.
Eliminating ambiguity due to lack of precedence and associativity: An Example
Grammar: exp → exp + exp | exp * exp | (exp) | id | const
Two levels of operator precedence (+, *) lead to the addition of two new non-terminals, term and fact.
After adding the precedence levels:
expr → expr + expr | term
term → term * term | fact
fact → (expr) | id | const

After adding left associativity:
expr → expr + term | term
term → term * fact | fact
fact → (expr) | id | const
Eliminating dangling else ambiguity
stmt → if con then stmt | if con then stmt else stmt | other
Ambiguity-resolved grammar:
stmt → matched | unmatched
matched → if con then matched else matched | other
unmatched → if con then stmt | if con then matched else unmatched
Role of the Parser
The parser groups the sequence of tokens in the source program to check the syntax (structure of the language) by constructing a parse tree

It also performs error handling

Two methods:
Top-down parsing (leftmost derivation)
Bottom-up parsing (reverse of rightmost derivation)
Issues in the design of Parser
Choice of how to develop the parser, such as:
High-level languages like C
Low-level languages
Parser generator tools

What intermediate code to produce

How to handle errors
General Structure of a Parser
1. Recursive descent parsing with backtracking
2. Recursive descent parsing without backtracking

Recursive descent parsing
Recursion:
Left recursive grammar (A → Aα | β)

Right recursive grammar (A → αA | β)

Left recursion has to be eliminated, since it leads to an infinite loop in a top-down parser
Elimination of left recursion
A → Aα1 | Aα2 | ... | Aαn | β1 | β2 | ... | βm
Rewritten as
A → β1 A1 | β2 A1 | ... | βm A1
A1 → α1 A1 | α2 A1 | ... | αn A1 | ε
Elimination of left recursion - An Example
exp → exp + exp | exp * exp | id

Rewritten as
exp → id exp1
exp1 → + exp exp1 | * exp exp1 | ε
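The rewriting rule for immediate left recursion can be sketched as a small function. This is an illustrative sketch only: the function name `eliminate_left_recursion`, the list-of-symbols representation of rule bodies, and the convention of naming the new non-terminal by appending "1" are assumptions, and an empty list stands for ε.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A1 and A1 -> a A1 | ε.

    `bodies` is a list of alternatives, each a list of grammar symbols.
    The new non-terminal is named head + "1"; an empty list stands for ε.
    """
    new_head = head + "1"
    # Alternatives that start with the head itself (the alpha parts).
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # Alternatives that do not start with the head (the beta parts).
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # nothing to eliminate
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }


rules = eliminate_left_recursion(
    "exp", [["exp", "+", "exp"], ["exp", "*", "exp"], ["id"]]
)
for lhs, alternatives in rules.items():
    print(lhs, "->", " | ".join(" ".join(alt) or "ε" for alt in alternatives))
```

For the example grammar this reproduces the two rewritten rules shown above. Indirect left recursion (through another non-terminal) is not handled by this sketch.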
Recursive-Descent Parser with backtracking: An Example
Grammar:
S → AB
A → c | cB
B → d
Given String: cdd
Recursive-Descent Parser with backtracking: Pseudo Procedure
returnStatus procedure S()
{
    parseStatus = A();
    if (parseStatus == ERR)
    {
        report(Error);
        return parseStatus;
    }
    parseStatus = B();
    if (parseStatus == ERR)
    {
        report(Error);
        return parseStatus;
    }
    parseStatus = SUCCESS;
    return parseStatus;
}
Recursive-Descent Parser with backtracking: Pseudo Procedure contd
returnStatus procedure A()
{
    // first alternative: A → c
    if (getInput() == c)
        return SUCCESS;
    // second alternative: A → cB, tried when the parser backtracks;
    // undo any actions associated with the parsing of the
    // previous rule alternative before trying it
    else if (getInput() == c)
        return B();
    return ERR;
}
returnStatus procedure B()
{
    if (getInput() == d)
        return SUCCESS;
    else
        return ERR;
}
Common Prefix and its elimination (left factoring)
A → αβ1 | αβ2 | ... | αβn | γ1 | γ2 | ... | γm

Rewritten as
A → αB | γ1 | γ2 | ... | γm
B → β1 | β2 | ... | βn
Common Prefix and its elimination (left factoring) contd
exp → exp + exp | exp * exp | id

Rewritten as
exp → exp B | id
B → + exp | * exp
Predictive Parser
Predictive Parser has 5 major components:

1. An input buffer to hold the sentence to be parsed
2. An output buffer to display the actions taken by the
parser
3. Stack data structure used to hold the grammar
symbols during the parsing process
4. Predictive parsing table
5. Parser program
Algorithm - Predictive_parser(w#)
// w#: input sentence appended with #
// ip: input pointer
// a: next input symbol pointed to by the input pointer, i.e., buffer[ip]
// X: grammar symbol on top of the stack
// M(A,a): predictive parsing table entry holding a production rule,
//         where the row is indexed by the non-terminal A
//         and the column is indexed by the terminal a
// returns 1 if the sentence is parsed successfully, -1 otherwise
ip = 1
do
{
    a = buffer[ip]
    X = top(stack)
    if (X is a terminal)
        if (X == a)
        {
            ip = ip + 1
            pop(stack)
        }
        else
            return -1
    else if (X is a non-terminal)
        if (M(X,a) == X → Y1 Y2 ... Yn)
        {
            pop(stack)
            // push the body in reverse order so that Y1 ends up on top;
            // nothing is pushed for an ε-production
            for each grammar symbol Yj in Yn, Yn-1, ..., Y1 do
                push(stack, Yj)
        }
        else
            return -1
}
while (not (top(stack) == # and buffer[ip] == #))
return 1
Error conditions in predictive parsing
1. The terminal on top of the stack does not match the next input symbol.

2. The referenced predictive parsing table entry is blank
Predictive Parser: An Example
expr → term expr1
expr1 → + term expr1 | ε
term → fact term1
term1 → * fact term1 | ε
fact → (expr) | id | const
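The table-driven algorithm can be tried on this grammar with a short script. This is a minimal sketch under stated assumptions: the table entries below were derived by hand from the grammar above (they are not copied from the slides' table figure), and the names `TABLE`, `NONTERMINALS` and `predictive_parse` are chosen for illustration. An empty list stands for an ε-production.

```python
NONTERMINALS = {"expr", "expr1", "term", "term1", "fact"}

# Predictive parsing table M(A, a): maps (non-terminal, terminal) to a rule body.
TABLE = {}
for a in ("id", "const", "("):
    TABLE[("expr", a)] = ["term", "expr1"]
    TABLE[("term", a)] = ["fact", "term1"]
TABLE[("expr1", "+")] = ["+", "term", "expr1"]
TABLE[("term1", "*")] = ["*", "fact", "term1"]
for a in (")", "#"):
    TABLE[("expr1", a)] = []  # expr1 -> ε
    TABLE[("term1", a)] = []  # term1 -> ε
TABLE[("term1", "+")] = []    # term1 -> ε
TABLE[("fact", "(")] = ["(", "expr", ")"]
TABLE[("fact", "id")] = ["id"]
TABLE[("fact", "const")] = ["const"]


def predictive_parse(tokens):
    """Return 1 if `tokens` is a sentence of the grammar, -1 otherwise."""
    buffer = list(tokens) + ["#"]
    stack = ["#", "expr"]  # end marker at the bottom, start symbol on top
    ip = 0
    while True:
        top, a = stack[-1], buffer[ip]
        if top == "#" and a == "#":
            return 1
        if top not in NONTERMINALS:
            # Terminal on top of the stack must match the next input symbol.
            if top != a:
                return -1
            stack.pop()
            ip += 1
        else:
            body = TABLE.get((top, a))
            if body is None:  # blank table entry
                return -1
            stack.pop()
            stack.extend(reversed(body))  # first body symbol ends up on top


print(predictive_parse(["id", "*", "id", "+", "id"]))  # 1
print(predictive_parse(["id", "*", "+", "id"]))        # -1
```

The second call fails at the blank entry for (fact, +), which is the second error condition listed earlier.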
Predictive Parsing table for Arithmetic
expression
Parsing action using the predictive parsing table for the string: id * id + id
Parsing action using the predictive
parsing table for the string: id * + id
Algorithm constructPredictiveTable(G)
// G: the context-free grammar given as input
// first(X): returns a set of terminals for a grammar symbol X
// first(α): returns a set of terminals for a string of grammar symbols α
// follow(A): returns a set of terminals for a non-terminal A
// returns M(A,a), the predictive parsing table with m x n elements,
//   where m is the number of non-terminals in G
//   and n is the number of terminals plus one
for each production A → α in G do
{
    for each terminal a in first(α) do
        M(A,a) = A → α
    if (ε is in first(α))
    {
        for each terminal b in follow(A) do
            M(A,b) = A → α
        if (# is in follow(A))
            M(A,#) = A → α
    }
}
Algorithm first(X)
// returns the set of terminals that begin strings derived from X
{
    if (X is a terminal)
        first(X) = {X}
    if (X is a non-terminal and X → Y1 Y2 ... Yn)
    {
        // k is the number of leading symbols Y1 ... Yk whose first sets
        // all contain ε (k may be 0)
        if (k < n)
            first(X) = first(X) ∪ ((first(Y1) ∪ ... ∪ first(Yk+1)) − {ε})
        if (k == n)
            first(X) = first(X) ∪ first(Y1) ∪ ... ∪ first(Yn) ∪ {ε}
    }
    return first(X)
}
Algorithm follow(A)
// S: start symbol
// returns the set of terminals that can follow A
{
    if (A == S)
        follow(A) = follow(A) ∪ {#}
    if (B → αAβ is a production in G)
    {
        follow(A) = follow(A) ∪ (first(β) − {ε})
        if (ε is in first(β))
            follow(A) = follow(A) ∪ follow(B)
    }
    if (B → αA is a production in G)
        follow(A) = follow(A) ∪ follow(B)
    return follow(A)
}
Algorithm constructPredictiveTable(G): Example 1
1. expr → term expr1
2. expr1 → + term expr1 | ε
3. term → fact term1
4. term1 → * fact term1 | ε
5. fact → (expr) | id | const
Computations of First and Follow
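The first and follow sets for Example 1 can be computed with a fixed-point iteration. The sketch below is one possible way to do it and is not taken from the slides: the dictionary layout `GRAMMAR`, the helper `first_of_string`, and the function names are assumptions. An empty body stands for ε and # is the end marker.

```python
EPS = "ε"
GRAMMAR = {
    "expr": [["term", "expr1"]],
    "expr1": [["+", "term", "expr1"], []],
    "term": [["fact", "term1"]],
    "term1": [["*", "fact", "term1"], []],
    "fact": [["(", "expr", ")"], ["id"], ["const"]],
}


def first_of_string(symbols, first):
    """first() of a string of grammar symbols, given first sets of non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can derive ε (or the string is empty)
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:  # repeat until no first set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(first, start="expr"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("#")
    changed = True
    while changed:  # repeat until no follow set grows
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue  # follow is defined for non-terminals only
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
for nt in GRAMMAR:
    print(nt, sorted(FIRST[nt]), sorted(FOLLOW[nt]))
```

The iteration replaces the recursive formulation of the slides; both reach the same sets because the sets only ever grow.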
Algorithm constructPredictiveTable(G): Example 2
S → Aa | bAc | Bc | bBa
A → d
B → d
Limitations of top-down parser
Requires preprocessing steps such as:
Elimination of left recursion
Left factoring

The effect of backtracking has to be considered
Bottom up Parser
It is the process of finding the handle (the part of a right sentential form that exactly matches the right hand side of a production) and reducing it to obtain the previous right sentential form, which traces the rightmost derivation in reverse

The main actions are:
Shift
Reduce
Accept
Error
Simple stack based parser
Shift: The next input symbol is pushed onto the stack

Reduce: If there is a handle on top of the stack, the handle (the body of a production) is popped from the stack and the non-terminal in the head of that production is pushed onto the stack

Accept: Parsing terminates successfully when the stack holds only # and the start symbol, and the input pointer points to #.

Reject: Parsing terminates because there are errors in the given sentence.
Simple stack based parser - Example
expr → expr + term | term
term → term * fact | fact
fact → (expr) | id | cons
Simple stack based parser - Example contd
Conflicts in shift reduce parsing
Shift-reduce conflict:
This conflict occurs when the parser could either shift or reduce in the same configuration
Reduce-reduce conflict:
This conflict occurs when the top of the stack holds a handle for which more than one reduction is possible
These conflicts are resolved in the operator precedence parser and the LR parser
Operator Grammar and Parser
An operator grammar is a type of context-free grammar in which no production has adjacent non-terminals in its body

Example:
exp → exp + exp | exp * exp | id

However, an equivalent grammar is:
exp → exp op exp | id
op → + | *

It is not an operator grammar
Operator Precedence Parsing
The precedence relation between two adjacent terminals a and b can be:

1. a has higher precedence than b, i.e., a takes precedence over b, denoted as a .> b

2. a has lower precedence than b, i.e., a gives precedence to b, denoted as a <. b

3. a has equal precedence to b, i.e., a and b have the same precedence, denoted as a =. b
Operator Precedence Parsing contd
The parsing steps are as follows:

1. The parser scans from left to right until it encounters the first .> relation and sets a pointer there
2. It scans backward until it encounters a <. relation and sets another pointer
3. The grammar symbols between these pointers are considered to be the handle and are replaced by the equivalent non-terminal in the grammar
4. Steps 1 to 3 are repeated until the sentence is reduced to the start symbol of the grammar
Algorithm OperatorParsing(w#)
// w: the sentence to be parsed
// ip: input pointer
// buffer[ip]: next input symbol in the buffer, pointed to by the input pointer
// M(a,b): operator precedence entry with the terminal a as row and b as column
// A: non-terminal in the head of the production for the handle on top of the stack
// handle: the string popped from the stack
// returns 1 if parsing is completed successfully, -1 otherwise
ip = 1
do
{
    a = topmost terminal on the stack
    b = buffer[ip]
    handle = empty
    if (M(a,b) == <. or M(a,b) == =.)
    {
        ip = ip + 1
        push(stack, b)          // shift action is performed
    }
    else if (M(a,b) == .>)      // handle is found on top of the stack
    {
        do
        {
            do
            {
                t = pop(stack)
                handle = handle + t     // + stands for concatenation
            }
            while (t is not a terminal)
            c = topmost terminal on the stack
            // t is the most recently popped terminal
        }
        while (M(c,t) ≠ <.)
        // reduce action is performed
        if (top(stack) is a non-terminal)
            handle = handle + pop(stack)
        // handle is available in reverse order
        handle = reverse(handle)
        push(stack, A)
        // A is the head of the rule whose definition matches the handle
    }
    else
        return -1
}
while (not (topmost terminal on the stack == # and buffer[ip] == #))
return 1
Operator Precedence Parsing - Example
Grammar:
exp → exp + exp | exp * exp | (exp) | id | cons
Precedence relation for an expression grammar
Parsing action using the operator precedence table
Operator Precedence Table
Construction
The precedence and associativity rules are used to construct the table

For example:
* .> +
Algorithm PrecedenceTableConstruct(G,P,A)
// G: the grammar used for precedence table construction
// P[a]: precedence value of operator a, given as an integer; if a .> b then P[a] > P[b]
// A[a]: associativity of operator a; it is l for left associative and r for right associative
// opd: operand such as id, cons, etc.
// returns M[a,b], the precedence table of size n x n, where n is the number of terminals + 1
for each operator a in G do
{
    for each operator b in G do
    {
        if (P[a] > P[b])
        {
            M[a,b] = .>
            M[b,a] = <.
        }
        else if (P[a] == P[b])
        {
            if (A[a] == l)
            {
                M[a,b] = .>
                M[b,a] = .>
            }
            else if (A[a] == r)
            {
                M[a,b] = <.
                M[b,a] = <.
            }
        }
    }
    M[a,opd] = <.
    M[opd,a] = .>
    M[a,#] = .>
    M[#,a] = <.
    if ( ( is in G and ) is in G )
    {
        M[a,( ] = <.
        M[(,a] = <.
        M[a,) ] = .>
        M[ ),a] = .>
    }
}
M[opd,#] = .>
M[#,opd] = <.
if ( ( is in G and ) is in G )
{
    M[opd, ) ] = .>
    M[(, opd ] = <.
    M[ #, ( ] = <.
    M[ ), # ] = .>
    M[ (, ) ] = =.
    M[ (, ( ] = <.
    M[ ), ) ] = .>
}
Construction of Precedence Table - Example
Grammar:
exp → exp + exp | exp * exp | (exp) | id | cons
Precedence and associativity of the operators are shown as:
Operator    Precedence    Associativity
+           1             L
*           2             L
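The construction algorithm can be run on this example. The following is a rough sketch of PrecedenceTableConstruct, assuming a dictionary keyed by (row, column) pairs for M; the function name `build_precedence_table`, the `parens` flag, and the string encodings ".>", "<." and "=." are illustrative choices, not the slides' notation for an implementation.

```python
GT, LT, EQ = ".>", "<.", "=."


def build_precedence_table(prec, assoc, operand="opd", parens=True):
    """Build M[a, b] from precedence values and associativity ('l' or 'r')."""
    M = {}
    for a in prec:
        for b in prec:
            if prec[a] > prec[b]:
                M[a, b], M[b, a] = GT, LT
            elif prec[a] == prec[b]:
                # Left associative: reduce first; right associative: shift first.
                M[a, b] = M[b, a] = GT if assoc[a] == "l" else LT
        # Relations between an operator and operands / end marker.
        M[a, operand], M[operand, a] = LT, GT
        M[a, "#"], M["#", a] = GT, LT
        if parens:
            M[a, "("], M["(", a] = LT, LT
            M[a, ")"], M[")", a] = GT, GT
    # Relations that do not involve an operator.
    M[operand, "#"], M["#", operand] = GT, LT
    if parens:
        M[operand, ")"] = GT
        M["(", operand] = LT
        M["#", "("] = LT
        M[")", "#"] = GT
        M["(", ")"] = EQ
        M["(", "("] = LT
        M[")", ")"] = GT
    return M


table = build_precedence_table({"+": 1, "*": 2}, {"+": "l", "*": "l"})
print(table["*", "+"], table["+", "*"], table["+", "+"])  # .> <. .>
```

Pairs that are absent from the dictionary, such as (opd, opd), correspond to blank (error) entries in the table.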
The constructed operator precedence table
Limitations of operator precedence
parser
1. This parser can be constructed only for an operator grammar

2. If an operator in the grammar has multiple precedence values, then an operator precedence parser cannot be constructed.

For example, the operator - can be either unary minus or binary minus, which differ in their precedence values.
Components of LR Parser
1. Input buffer which holds the sentence appended with #

2. Output buffer which tells the type of action selected

3. Stack used to hold the grammar symbols and their
states during parsing

4. Parsing table which helps to select the parsing action

5. Parser routine which is the code for the parser
Types of LR Parser
1. Simple LR parser known as SLR parser

2. Canonical LR parser known as CLR parser

3. Look Ahead LR parser known as LALR parser
Algorithm LR_Parsing(w#)
// w: sentence to be parsed
// ip: input pointer
// buffer[ip]: next input symbol
// s: start state, which alone is initially present in the stack
// p, q, t: states used in the parsing process
// A(p,a): action part of the parsing table for the state p and the terminal a
// G(p,A): goto part of the parsing table for the state p and the non-terminal A
// returns 1 on accept, -1 on reject
ip = 1                  // input pointer is initialized to point to the leftmost token
push(stack, s)          // store the start state alone in the stack
while (1)
{
    p = top(stack)
    a = buffer[ip]
    if (A(p,a) == (s,q))            // if the action to be selected is shift
    {
        push(stack, a)
        push(stack, q)
        ip = ip + 1
    }
    else if (A(p,a) == (r, A → α))  // if the action is to reduce
    {
        len = length(α)
        for (j = 1; j <= len*2; j = j+1)
            pop(stack) and form a handle
        t = G(top(stack), A)        // look up goto on the state exposed by the pops
        push(stack, A)
        push(stack, t)
    }
    else if (A(p,a) == acc)         // if the action is accept
        return 1
    else                            // the action to be selected is reject
        return -1
}
LR Parsing - Example
Grammar:
expr → expr + term
expr → term
term → term * fact
term → fact
fact → (expr)
fact → id

String for Parsing:
id * id + id
LR Parsing Table
Parsing action for the arithmetic expression
id * id + id using SLR parsing table
Parsing action for the arithmetic expression
id (id + id) using SLR parsing table
Error Entries
1. If in state 0 an operator is seen, the possible error message is: "Expected operand but operator found" or "missing operand"
2. If in state 0 the close parenthesis is seen, it is an unmatched parenthesis. Hence the possible error message is "Unbalanced parenthesis"
3. If in state 1 id is seen, the valid entries are only + and #; hence the error message could be "operator expected but operand found" or "missing operator"
4. In a similar manner, if in state 0, 4, 6, 7 or 8 the end marker # is seen, the error message could be "unexpected end of expression"
SLR Parser
Viable Prefixes
Viable prefixes are the prefixes of right sentential
forms. These prefixes are used for identifying the action
to be performed during the parsing process. The
information of viable prefixes is obtained using LR(0)
items.

LR(0) item
An LR(0) item is a production of the form A → α with a dot (.) at some position in the right side of the production rule.
Example:
A → .α and A → α.
Algorithm for constructing LR Parser
1. ClosureLR0(I)

2. GotoLR0(I, X)

3. ConstructLR0ItemsSet(G)

4. ConstructSLRTable(G)
Algorithm ClosureLR0(I)
// I: set of LR(0) items
// a: a single LR(0) item
// returns J, the set of LR(0) items that can be reached without changing the parser's state
J = empty
for each LR(0) item a in I do
{
    J = J ∪ {a}
    push(stack, a)
}
while (stack is not empty)
{
    a = pop(stack)
    if (a == A → α.Bβ)
        for each production B → γ do
            if (B → .γ is not in J)     // avoids processing the same item again
            {
                J = J ∪ {B → .γ}
                push(stack, B → .γ)
            }
}
return J
ClosureLR0(I) - Example
Given Grammar
expr → expr + term
expr → term
term → term * fact
term → fact
fact → (expr)
fact → id
closure({expr → . expr + term}) =
{
expr → . expr + term,
expr → . term,
term → . term * fact,
term → . fact,
fact → . (expr),
fact → . id
}
Algorithm GotoLR0(I, X)
// I: set of LR(0) items
// X: grammar symbol in the given grammar
// a: a single LR(0) item
// returns K, the set of LR(0) items that can be reached after processing X
{
    J = empty
    for each LR(0) item A → α.Xβ in I do
        J = J ∪ {A → αX.β}
    K = ClosureLR0(J)
    return K
}
GotoLR0(I, X) - Example
goto({fact → . (expr)}, ( ) = closure({fact → ( . expr)})
=
{
fact → ( . expr),
expr → . expr + term,
expr → . term,
term → . term * fact,
term → . fact,
fact → . (expr),
fact → . id
}
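Both examples can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: an LR(0) item is encoded as a tuple (head, body, dot position), and the names `GRAMMAR`, `closure` and `goto` are chosen here rather than taken from the slides.

```python
GRAMMAR = [
    ("expr", ("expr", "+", "term")),
    ("expr", ("term",)),
    ("term", ("term", "*", "fact")),
    ("term", ("fact",)),
    ("fact", ("(", "expr", ")")),
    ("fact", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """LR(0) closure; an item is (head, body, dot) with dot an index into body."""
    result = set(items)
    stack = list(items)
    while stack:
        head, body, dot = stack.pop()
        # If the dot is in front of a non-terminal B, add every B -> .γ
        if dot < len(body) and body[dot] in NONTERMINALS:
            for h, b in GRAMMAR:
                item = (h, b, 0)
                if h == body[dot] and item not in result:
                    result.add(item)
                    stack.append(item)
    return frozenset(result)


def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {
        (h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol
    }
    return closure(moved)


c = closure({("expr", ("expr", "+", "term"), 0)})
g = goto({("fact", ("(", "expr", ")"), 0)}, "(")
print(len(c), len(g))  # 6 7
```

The membership check before pushing an item is what keeps the loop from running forever on left-recursive rules such as expr → expr + term.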
Algorithm ConstructLR0ItemsSet(G)
// G: the grammar G including the augmented production S' → S
// I, J, K: sets of LR(0) items
// a, b: single LR(0) items
// X: grammar symbol of grammar G
// returns C, the canonical collection of sets of LR(0) items
{
    C = {ClosureLR0({S' → .S})}
    for each new set of LR(0) items I added to C do
        for each grammar symbol X in G do
            if (GotoLR0(I,X) is not in C and GotoLR0(I,X) ≠ ∅)
                C = C ∪ {GotoLR0(I,X)}
    return C
}

Algorithm ConstructSLRTable(G)
// G: the grammar G including the augmented production S' → S
// C: canonical collection of sets of LR(0) items
// A(p,a): action part of the parsing table for state p and terminal a
// G(p,A): goto part of the parsing table for state p and non-terminal A
// returns both A and G
C = ConstructLR0ItemsSet(G)
for each set of items I in C do
{
    p = state number representing I
    for each item in I do
    {
        // in each case below, q = state number representing J
        if (the item is A → α.aβ and GotoLR0(I,a) == J)
            A(p,a) = (s, q)
        else if (the item is S' → S.)
            A(p,#) = acc
        else if (the item is A → α.)
            for each b in follow(A) do
                A(p,b) = (r, A → α)
        if (the item is A → α.Bβ and GotoLR0(I,B) == J)
            G(p,B) = q
    }
}
return A and G
ConstructLR0ItemsSet(G) - Example
Grammar:
expr → expr + term
expr → term
term → term * fact
term → fact
fact → (expr)
fact → id

I0 = closure({expr' → .expr}) =
{
expr' → .expr
expr → .expr + term
expr → .term
term → .term * fact
term → .fact
fact → .(expr)
fact → .id
}

I1 = goto(I0, expr) =
{
expr' → expr.
expr → expr. + term
}

I2 = goto(I0, term) =
{
expr → term.
term → term. * fact
}

I3 = goto(I0, fact) =
{
term → fact.
}

I4 = goto(I0, ( ) =
{
fact → ( .expr)
expr → .expr + term
expr → .term
term → .term * fact
term → .fact
fact → .(expr)
fact → .id
}

I5 = goto(I0, id) =
{
fact → id.
}

I6 = goto(I1, +) =
{
expr → expr + .term
term → .term * fact
term → .fact
fact → .(expr)
fact → .id
}

I7 = goto(I2, *) =
{
term → term * .fact
fact → .(expr)
fact → .id
}

I8 = goto(I4, expr) =
{
fact → ( expr .)
expr → expr. + term
}

I9 = goto(I6, term) =
{
expr → expr + term.
term → term. * fact
}

I10 = goto(I7, fact) =
{
term → term * fact.
}

I11 = goto(I8, ) ) =
{
fact → ( expr ).
}
Action and Goto fields in the LR
parsing table
Construct SLR Parsing Table: Example (if-else statement)
Grammar:
stat → if cond then stat
stat → if cond then stat else stat
stat → other
cond → expr

I0 = closure({stat' → .stat}) =
{
stat' → .stat
stat → .if cond then stat
stat → .if cond then stat else stat
stat → .other
}

I1 = goto(I0, stat) =
{
stat' → stat.
}

I2 = goto(I0, if) =
{
stat → if .cond then stat
stat → if .cond then stat else stat
cond → .expr
}

I3 = goto(I0, other) =
{
stat → other.
}

I4 = goto(I2, expr) =
{
cond → expr.
}

I5 = goto(I2, cond) =
{
stat → if cond .then stat
stat → if cond .then stat else stat
}

I6 = goto(I5, then) =
{
stat → if cond then .stat
stat → if cond then .stat else stat
stat → .if cond then stat
stat → .if cond then stat else stat
stat → .other
}

I7 = goto(I6, stat) =
{
stat → if cond then stat .
stat → if cond then stat .else stat
}

I8 = goto(I7, else) =
{
stat → if cond then stat else .stat
stat → .if cond then stat
stat → .if cond then stat else stat
stat → .other
}

I9 = goto(I8, stat) =
{
stat → if cond then stat else stat.
}
Construction of Action and Goto fields for
SLR Parsing Table Example (if-else
statement)
SLR parsing table with shift reduce
conflicts
SLR parsing table construction for a function call grammar
Grammar
S → id ( P )
S → E
P → id
E → id ( E )
E → id

I0 =
{
S' → .S
S → .id ( P )
S → .E
E → .id ( E )
E → .id
}

Goto(I0, S) = I1 =
{
S' → S.
}

Goto(I0, E) = I2 =
{
S → E.
}

Goto(I0, id) = I3 =
{
S → id .( P )
E → id .( E )
E → id.
}

Goto(I3, ( ) = I4 =
{
S → id ( .P )
P → .id
E → id ( .E )
E → .id ( E )
E → .id
}

Goto(I4, P) = I5 =
{
S → id ( P. )
}

Goto(I4, E) = I6 =
{
E → id ( E. )
}

Goto(I4, id) = I7 =
{
P → id.
E → id .( E )
E → id.
}

Goto(I5, ) ) = I8 =
{
S → id ( P ).
}

Goto(I6, ) ) = I9 =
{
E → id ( E ).
}

Goto(I7, ( ) = I10 =
{
E → id ( .E )
E → .id ( E )
E → .id
}

Goto(I10, E) = I11 =
{
E → id ( E. )
}

Goto(I10, id) = I12 =
{
E → id .( E )
E → id.
}

Goto(I11, ) ) = I13 =
{
E → id ( E ).
}

Goto(I12, ( ) = I10
SLR parsing table construction for a function call grammar: Action and Gotos
SLR parsing table for a function call
CLR Parser: LR(1) Parser
Algorithms:
ClosureLR1(I)
GotoLR1(I, X)
ConstructLR1ItemsSet(G)
ConstructCLRTable(G)
Algorithm ClosureLR1(I)
// I: set of LR(1) items
// a: a single LR(1) item
// b, c: single terminals
// returns J, the set of LR(1) items that can be reached without changing the parser's state
J = empty
for each LR(1) item a in I do
{
    J = J ∪ {a}
    push(stack, a)
}
while (stack is not empty)
{
    a = pop(stack)
    if (a == [A → α.Bβ, b])
        for each production B → γ do
            for each terminal c in first(βb) do
                if ([B → .γ, c] is not in J)    // avoids processing the same item again
                {
                    J = J ∪ {[B → .γ, c]}
                    push(stack, [B → .γ, c])
                }
}
return J
Algorithm GotoLR1(I, X)
// I: set of LR(1) items
// X: grammar symbol in the given grammar
// returns K, the set of LR(1) items that can be reached after processing X
{
    J = empty
    for each LR(1) item [A → α.Xβ, b] in I do
        J = J ∪ {[A → αX.β, b]}
    K = ClosureLR1(J)
    return K
}
Algorithm ConstructLR1ItemsSet(G)
// G: the grammar G including the augmented production S' → S
// I, J, K: sets of LR(1) items
// a, b: single LR(1) items
// c, d: single terminals
// returns C, the canonical collection of sets of LR(1) items
{
    C = {ClosureLR1({[S' → .S, #]})}
    for each new set of LR(1) items I added to C do
        for each grammar symbol X in G do
            if (GotoLR1(I,X) is not in C and GotoLR1(I,X) ≠ ∅)
                C = C ∪ {GotoLR1(I,X)}
    return C
}
Algorithm ConstructCLRTable(G)
// G: the grammar G including the augmented production S' → S
// C: canonical collection of sets of LR(1) items
// A(p,a): action part of the parsing table for state p and terminal a
// G(p,A): goto part of the parsing table for state p and non-terminal A
// returns both A and G
C = ConstructLR1ItemsSet(G)
for each set of items I in C do
{
    p = state number representing I
    for each item in I do
    {
        // in each case below, q = state number representing J
        if (the item is [A → α.aβ, b] and GotoLR1(I,a) == J)
            A(p,a) = (s, q)
        else if (the item is [S' → S., #])
            A(p,#) = acc
        else if (the item is [A → α., b])
            A(p,b) = (r, A → α)
        if (the item is [A → α.Bβ, b] and GotoLR1(I,B) == J)
            G(p,B) = q
    }
}
return A and G
CLR Parser Construction - Example
Grammar:
Stat → LHS = RHS
Stat → RHS
LHS → *RHS
LHS → id
RHS → LHS

I0 = closure({[Stat' → . Stat, #]}) =
{
[Stat' → . Stat, #]
[Stat → . LHS = RHS, #]
[Stat → . RHS, #]
[LHS → . *RHS, =]
[LHS → . id, =]
[RHS → . LHS, #]
[LHS → . *RHS, #]
[LHS → . id, #]
}
I1 = goto(I0, Stat) =
{
[Stat' → Stat . , #]
}
I2 = goto(I0, LHS) =
{
[Stat → LHS . = RHS, #]
[RHS → LHS . , #]
}
I3 = goto(I0, RHS) =
{
[Stat → RHS . , #]
}
I4 = goto(I0, *) =
{
[LHS → * . RHS, =]
[LHS → * . RHS, #]
[RHS → . LHS, =]
[RHS → . LHS, #]
[LHS → . *RHS, =]
[LHS → . id, =]
[LHS → . *RHS, #]
[LHS → . id, #]
}
I5 = goto(I0, id) =
{
[LHS → id . , =]
[LHS → id . , #]
}
I6 = goto(I2, =) =
{
[Stat → LHS = . RHS, #]
[RHS → . LHS, #]
[LHS → . *RHS, #]
[LHS → . id, #]
}
I7 = goto(I4, LHS) =
{
[RHS → LHS . , =]
[RHS → LHS . , #]
}
I8 = goto(I4, RHS) =
{
[LHS → *RHS . , =]
[LHS → *RHS . , #]
}
goto(I4, *) = I4
goto(I4, id) = I5
I9 = goto(I6, LHS) =
{
[RHS → LHS . , #]
}
I10 = goto(I6, RHS) =
{
[Stat → LHS = RHS . , #]
}
I11 = goto(I6, *) =
{
[LHS → * . RHS, #]
[RHS → . LHS, #]
[LHS → . *RHS, #]
[LHS → . id, #]
}
I12 = goto(I6, id) =
{
[LHS → id . , #]
}
goto(I11, LHS) = I9
I13 = goto(I11, RHS) =
{
[LHS → *RHS . , #]
}
goto(I11, *) = I11
goto(I11, id) = I12
Kernel Items
I0 = {[Stat' → . Stat, #]}
I1 = {[Stat' → Stat . , #]}
I2 = {[Stat → LHS . = RHS, #], [RHS → LHS . , #]}
I3 = {[Stat → RHS . , #]}
I4 = {[LHS → * . RHS, =], [LHS → * . RHS, #]}
I5 = {[LHS → id . , =], [LHS → id . , #]}
I6 = {[Stat → LHS = . RHS, #]}
I7 = {[RHS → LHS . , =], [RHS → LHS . , #]}
I8 = {[LHS → *RHS . , =], [LHS → *RHS . , #]}
I9 = {[RHS → LHS . , #]}
I10 = {[Stat → LHS = RHS . , #]}
I11 = {[LHS → * . RHS, #]}
I12 = {[LHS → id . , #]}
I13 = {[LHS → *RHS . , #]}
CLR Parsing Table
LALR Parser
I4 =
{
[LHS → * . RHS, =]
[LHS → * . RHS, #]
[RHS → . LHS, =]
[RHS → . LHS, #]
[LHS → . *RHS, =]
[LHS → . id, =]
[LHS → . *RHS, #]
[LHS → . id, #]
}
I11 =
{
[LHS → * . RHS, #]
[RHS → . LHS, #]
[LHS → . *RHS, #]
[LHS → . id, #]
}

I4 and I11 have the same core: their items agree in everything except the lookahead symbols.
The pairs (I4, I11), (I5, I12), (I7, I9) and (I8, I13) each share a core, so each pair is merged into a single LALR state.
LALR parsing table
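Design of Data Structure - Predictive Parser
typedef char* FIRST;    // one symbol of a FIRST set
typedef char* FOLLOW;   // one symbol of a FOLLOW set

// FIRST set of a grammar symbol
typedef struct FIRST
{
    char* symName;      // grammar symbol
    FIRST* values;      // symbols in FIRST(symName)
}first;

// FOLLOW set of a non-terminal
typedef struct FOLLOW
{
    char* ntName;       // non-terminal
    FOLLOW* values;     // symbols in FOLLOW(ntName)
}follow;

// One grammar symbol of a production; a rule is a linked list of these
typedef struct Production
{
    char* symName;
    struct Production * next;
}Production;

// Predictive parsing table: one row per non-terminal, one column per
// terminal (the extra column is presumably for the end marker)
Production * PredictTable[NTSIZE][TSIZE+1];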

Representation of the rule S → L = R
Design of Data Structure - SLR Parser
// One symbol in the body of an item; the body is a linked list
typedef struct ProductionWithDotBody
{
    char* symName;
    struct ProductionWithDotBody * next;
}ProductionWithDotBody;

// Head of an item; nextHead links the items of one set
typedef struct ProductionWithDotHead
{
    char* symName;
    struct ProductionWithDotHead * nextHead;
    struct ProductionWithDotBody * next;
}ProductionWithDotHead;

typedef ProductionWithDotHead LR0Items;
typedef LR0Items * LR0ItemsCollection;
Data Structure for LR(0) items:
{A → .Bc, B → .eD}
Data Structure for Action Field
typedef struct ActionNode
{
    char action;
    // s - shift, r - reduce,
    // a - accept, e - error
    short num;
    // represents state number or rule number
    struct ActionNode * next;
    // provision for multiple entries
}ActionNode;
Data Structure for Goto Field
typedef struct GotoNode
{
    short state;
    struct GotoNode * next;
}GotoNode;

ActionNode SLRTableAction[STATES][TSIZE+1];

GotoNode SLRTableGoto[STATES][NTSIZE];

Usage of YACC for Parser Generator
Syntactic specifications → YACC → Parser as C routine
Structure of YACC specification
Definition Section
%%
Rules Section
%%   (to be included only if the user subroutine section is present)
User Subroutine Section

$ yacc a.y
YACC and Lex put together
cc y.tab.c lex.yy.c -ll -ly
Lexical Specifications for Calculator Program
Grammar:
S → E
E → E + E | E * E | (E) | const

%{
#include<stdio.h>
#include<stdlib.h>
#include"y.tab.h"
int flag = 0;   /* not declared on the slides; assumed to be a global */
%}
%%

[0-9]+ {
    yylval.i = atoi(yytext);
    flag = 1;
    return INTEGER;
}
[0-9]+[.][0-9]+ {
    yylval.dval = atof(yytext);
    flag = 1;
    return REAL;
}

"+" { flag = 0;
    return PLUS;
}
"*" { flag = 0;
    return MUL;
}
"/" { flag = 0;
    return DIV;
}
"(" { flag = 0;
    return OPENB;
}

")" { flag = 1;
    return CLOSEB;
}
[\n] {
    return NEWLINE;
}
%%
Syntactic Specifications for Calculator Program
%{
#include<stdio.h>
#include<stdlib.h>
/* prototypes for functions supplied elsewhere (the lexer and -ly) */
int yylex(void);
int yyerror(char *msg);
%}

%union
{
    int i;
    float dval;
}

%token OPENB CLOSEB NEWLINE
%token <i> INTEGER
%token <dval> REAL
%left PLUS
%left MUL DIV
%type <dval> exp
%%
beg : exp NEWLINE
    {
        printf("The value of the expression is %f\n", $1);
        exit(0);
    } ;
exp : exp PLUS exp
    {
        $$ = $1 + $3;
    }
    | exp MUL exp
    {
        $$ = $1 * $3;
    }
    | exp DIV exp
    {
        if ($3 != 0)
            $$ = $1 / $3;
        else
        {
            yyerror("Divide by zero");
            exit(0);
        }
    }
    | OPENB exp CLOSEB { $$ = $2; }
    | INTEGER { $$ = $1; }
    | REAL { $$ = $1; } ;
%%
int main()
{ yyparse(); return 0; }
Lexical Specifications for Calculator Program with grammar rewritten
Grammar:
S → E
E → E + T | T
T → T * F | T / F | F
F → (E) | const

%{
#include<stdio.h>
#include<stdlib.h>
extern int yylval;
%}
%%
"+" { return plus;}
"*" { return mul;}
"\n" { return newline;}
"(" { return openp;}
")" { return closep;}

[0-9]+ {
    yylval = atoi(yytext);
    return number;
}
%%
int yywrap()
{
    printf("eof reached\n");
    return 1;
}
Syntax Specifications for Calculator Program with grammar rewritten
%{
#include<stdio.h>
#include<stdlib.h>
int yylex(void);
int yyerror(char *msg);
%}
%token plus mul newline
%token openp closep
%token number
%%
lines : lines line | line
    ;
line : E newline {printf("%d\n",$1);}
    ;
E : E plus T {$$ = $1 + $3;}
    | T {$$ = $1;}
    ;
T : T mul F {$$ = $1 * $3;}
    | F {$$ = $1;}
    ;
F : openp E closep {$$ = $2;}
    | number {$$ = $1;}
    ;
%%
int yyerror(char *msg)
{
    printf("error occurred\n");
    exit(-1);
}
/* the scanner is compiled as part of y.tab.c, so only lex.yy.c is included */
#include"lex.yy.c"
int main()
{
    yyparse();
    return 0;
}
Infix to Postfix Expression - Syntax Specification
%{
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int yylex(void);
int yyerror(char *msg);
%}
%union {
    char varname[10];   /* fixed 10-byte buffer: longer results overflow it */
}
%type <varname> assig
%type <varname> exp
%token PLUS MUL EQUAL OPENB
%token CLOSEB NEWLINE
%token <varname> ID
%left PLUS BMINUS
%left MUL
%nonassoc UMINUS
%%
assig : ID EQUAL exp NEWLINE
    {
        strcpy($$,"");
        strcat($$,$1);
        strcat($$,$3);
        strcat($$,"=");
        printf("The postfix expression is %s\n",$$);
        exit(0);
    }
    ;
exp : exp PLUS exp
    {
        strcpy($$,""); strcat($$,$1);
        strcat($$,$3); strcat($$,"+");
    }
    | exp MUL exp
    {
        strcpy($$,""); strcat($$,$1);
        strcat($$,$3); strcat($$,"*");
    }
    | exp BMINUS exp
    {
        strcpy($$,""); strcat($$,$1);
        strcat($$,$3); strcat($$,"-");
    }
    | UMINUS exp
    {
        strcpy($$,""); strcat($$,$2);
        strcat($$,"-");
    }
    | OPENB exp CLOSEB
    {
        strcpy($$,""); strcat($$,$2);
    }
    | ID
    {
        strcpy($$,""); strcat($$,$1);
    }
    ;
%%
/* Calling and error handling */
int main()
{
    yyparse();
    return 0;
}

int yyerror(char *msg)
{
    fprintf(stderr,"%s\n",msg);
    return 0;
}
Error Handling
Categories of Error

Error Location

Error Recovery

Error Reporting

KEY TERMS
Role of Parser
Grammar and languages
Ambiguous Grammar and its elimination
Parser and its Types
Top-down parser: recursive descent and predictive parser
Bottom-up parser: stack-based parser, operator precedence parser, LR parser (SLR, CLR, LALR)
Parsing tools (YACC)
Implementation techniques
Error Handling
