Topics to be covered

Bootstrapping and Porting of Compiler


Converting a RE Directly to a DFA
Removing Ambiguity of Grammar
Computing First and Follow
Syntax Directed Translation

P K Singh

MMMUT, Gorakhpur

Bootstrapping and
Porting of Compiler

Third Language for Compiler Construction


Machine language
    The compiler can be executed immediately
Another language with an existing compiler on the same target machine (first scenario)
    Compile the new compiler with the existing compiler
Another language with an existing compiler on a different machine (second scenario)
    Compilation produces a cross compiler

T-Diagram Describing Complex Situation


A compiler written in language H that
translates language S into language T.
    [ S -> T, written in H ]

T-diagrams can be combined in two basic ways.

The First T-diagram Combination


    [ A -> B, on H ]  +  [ B -> C, on H ]  =  [ A -> C, on H ]

Two compilers run on the same machine H
    First from A to B
    Second from B to C
    Result: from A to C on H

The Second T-diagram Combination


    [ A -> B, written in H ]  compiled by  [ H -> K ]  =  [ A -> B, written in K ]

Translate the implementation language of a compiler from H to K
    Use another compiler that translates from H to K

The First Scenario


    [ A -> H, written in B ]  compiled by  [ B -> H, on H ]  =  [ A -> H, on H ]

Translate a compiler from A to H that is written in B
    Use an existing compiler for language B on machine H

The Second Scenario


    [ A -> H, written in B ]  compiled by  [ B -> K, on K ]  =  [ A -> H, on K ]

Use an existing compiler for language B on a different machine K
    The result is a cross compiler
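
The two T-diagram combinations and the two scenarios can be sketched in a few lines of Python. This is only an illustrative model: the class `TDiagram` and the functions `chain` and `compile_with` are hypothetical names, and a diagram is reduced to the triple (source, target, implementation language).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TDiagram:
    source: str  # language the compiler accepts
    target: str  # language the compiler produces
    impl: str    # language (or machine) the compiler is written in


def chain(first: TDiagram, second: TDiagram) -> TDiagram:
    """First combination: two compilers on the same machine, run one after the other."""
    if first.impl != second.impl or first.target != second.source:
        raise ValueError("diagrams do not fit together")
    return TDiagram(first.source, second.target, first.impl)


def compile_with(program: TDiagram, compiler: TDiagram) -> TDiagram:
    """Second combination: translate the implementation language of `program`."""
    if program.impl != compiler.source:
        raise ValueError("compiler cannot read the implementation language")
    return TDiagram(program.source, program.target, compiler.target)


new_compiler = TDiagram("A", "H", "B")  # translates A to H, written in B

# First scenario: a compiler for B already exists on machine H.
native = compile_with(new_compiler, TDiagram("B", "H", "H"))

# Second scenario: the compiler for B exists only on machine K.
cross = compile_with(new_compiler, TDiagram("B", "K", "K"))

print(native)  # a compiler from A to H that runs on H
print(cross)   # a compiler from A to H that runs on K (a cross compiler)
```

In this model a cross compiler is simply a diagram whose `target` differs from its `impl`.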

Process of Bootstrapping
Write a compiler in the same language that it compiles

    [ S -> T, written in S ]

No compiler for the source language exists yet
Porting to a new host machine

The First step in bootstrap


    [ A -> H, written in A ]  compiled by  [ A -> H, written in H ]  =  [ A -> H, on H ]

A "quick and dirty" compiler written in machine language H
A compiler written in its own language A
    The result is a running but inefficient compiler

The Second step in bootstrap


    [ A -> H, written in A ]  compiled by  [ A -> H, on H ]  =  [ A -> H, on H ]

The running but inefficient compiler
The compiler written in its own language A
    The result is the final version of the compiler

Step 1 in porting

    [ A -> K, written in A ]  compiled by  [ A -> H, on H ]  =  [ A -> K, on H ]

The original compiler
The compiler source code retargeted to K
    The result is a cross compiler

Step 2 in porting

    [ A -> K, written in A ]  compiled by  [ A -> K, on H ]  =  [ A -> K, on K ]

The cross compiler
The compiler source code retargeted to K
    The result is the retargeted compiler
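
The bootstrap and porting steps can be traced with a small model. This is a sketch under assumptions: `Compiler` and `build` are hypothetical names, and the triple (source, target, written_in) cannot express code quality, so the "inefficient" and "final" compilers look identical here.

```python
from typing import NamedTuple


class Compiler(NamedTuple):
    source: str      # language accepted
    target: str      # language produced
    written_in: str  # language or machine the compiler itself is expressed in


def build(source_code: Compiler, existing: Compiler) -> Compiler:
    """Compile the compiler `source_code` using the compiler `existing`."""
    if source_code.written_in != existing.source:
        raise ValueError("existing compiler cannot read this source code")
    return Compiler(source_code.source, source_code.target, existing.target)


# Bootstrapping language A on machine H
good_source = Compiler("A", "H", "A")      # compiler written in its own language A
quick_and_dirty = Compiler("A", "H", "H")  # written directly in machine language H
step1 = build(good_source, quick_and_dirty)  # running but inefficient compiler
final = build(good_source, step1)            # final version of the compiler

# Porting to machine K
retargeted_source = Compiler("A", "K", "A")  # compiler source code retargeted to K
cross = build(retargeted_source, final)      # step 1: runs on H, produces code for K
ported = build(retargeted_source, cross)     # step 2: runs on K, produces code for K

print(cross, ported)
```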

Converting a RE Directly
to a DFA


Converting a RE Directly to a DFA


Construct a syntax tree for (r)#
Traverse the tree to construct functions nullable, firstpos,
lastpos, and followpos
Construct DFA D by algorithm 3.62


Functions Computed From the Syntax Tree

nullable(n)
    True if the subtree at node n generates a language that includes
    the empty string

firstpos(n)
    The set of positions that can match the first symbol of a string
    generated by the subtree at node n

lastpos(n)
    The set of positions that can match the last symbol of a string
    generated by the subtree at node n

followpos(i)
    The set of positions that can follow position i in the tree


Rules for Computing the Functions

Node n: a leaf labeled ε
    nullable(n) = true
    firstpos(n) = ∅
    lastpos(n)  = ∅

Node n: a leaf with position i
    nullable(n) = false
    firstpos(n) = {i}
    lastpos(n)  = {i}

Node n = c1 | c2
    nullable(n) = nullable(c1) or nullable(c2)
    firstpos(n) = firstpos(c1) ∪ firstpos(c2)
    lastpos(n)  = lastpos(c1) ∪ lastpos(c2)

Node n = c1 c2
    nullable(n) = nullable(c1) and nullable(c2)
    firstpos(n) = if (nullable(c1)) firstpos(c1) ∪ firstpos(c2) else firstpos(c1)
    lastpos(n)  = if (nullable(c2)) lastpos(c1) ∪ lastpos(c2) else lastpos(c2)

Node n = c1*
    nullable(n) = true
    firstpos(n) = firstpos(c1)
    lastpos(n)  = lastpos(c1)


Computing followpos

for (each node n in the tree) {
    if (n is a cat-node with left child c1 and right child c2)
        for (each i in lastpos(c1))
            followpos(i) = followpos(i) ∪ firstpos(c2);
    else if (n is a star-node)
        for (each i in lastpos(n))
            followpos(i) = followpos(i) ∪ firstpos(n);
}
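
The rules for nullable, firstpos and lastpos and the followpos loop above can be combined into one recursive traversal. The sketch below assumes a hypothetical tree encoding with tuples, `("leaf", symbol, position)`, `("eps",)`, `("or", l, r)`, `("cat", l, r)` and `("star", child)`; the function name `analyse` is also an assumption.

```python
def analyse(node, followpos):
    """Return (nullable, firstpos, lastpos) for node and fill followpos as a side effect."""
    kind = node[0]
    if kind == "eps":
        return True, set(), set()
    if kind == "leaf":
        _, _symbol, pos = node
        followpos.setdefault(pos, set())
        return False, {pos}, {pos}
    if kind == "star":
        _, first, last = analyse(node[1], followpos)
        for i in last:                      # star-node rule for followpos
            followpos[i] |= first
        return True, first, last
    n1, f1, l1 = analyse(node[1], followpos)
    n2, f2, l2 = analyse(node[2], followpos)
    if kind == "or":
        return n1 or n2, f1 | f2, l1 | l2
    # cat-node
    for i in l1:                            # cat-node rule for followpos
        followpos[i] |= f2
    first = f1 | f2 if n1 else f1
    last = l1 | l2 if n2 else l2
    return n1 and n2, first, last


def leaf(symbol, pos):
    return ("leaf", symbol, pos)


def cat(left, right):
    return ("cat", left, right)


# Syntax tree for ( a | b )* a b b #
star = ("star", ("or", leaf("a", 1), leaf("b", 2)))
tree = cat(cat(cat(cat(star, leaf("a", 3)), leaf("b", 4)), leaf("b", 5)), leaf("#", 6))

followpos = {}
nullable, firstpos, lastpos = analyse(tree, followpos)
print(nullable, firstpos, lastpos)
print(followpos)
```

For this tree the root is not nullable, its firstpos is {1, 2, 3} and its lastpos is {6}.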


Converting a RE Directly to a DFA


Initialize Dstates to contain only the unmarked state firstpos(n0),
where n0 is the root of the syntax tree T for (r)#;
while (there is an unmarked state S in Dstates) {
    mark S;
    for (each input symbol a) {
        let U be the union of followpos(p)
            for all p in S that correspond to a;
        if (U is not in Dstates)
            add U as an unmarked state to Dstates;
        Dtran[S, a] = U;
    }
}
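
A minimal Python sketch of the loop above, applied to ( a | b )* a b b #. The followpos table and the symbol at each position are entered by hand here, and the names `build_dfa`, `FOLLOWPOS` and `SYMBOL_AT` are assumptions made for illustration. As in the pseudocode, an empty U would be added as a state; it would act as a dead state, but it does not arise in this example.

```python
FOLLOWPOS = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
SYMBOL_AT = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}


def build_dfa(firstpos_root, followpos, symbol_at, alphabet):
    start = frozenset(firstpos_root)
    dstates = {start}
    unmarked = [start]          # states that still have to be processed
    dtran = {}
    while unmarked:
        state = unmarked.pop()  # removing it from the work list marks the state
        for a in alphabet:
            # union of followpos(p) for all p in the state that correspond to a
            u = frozenset(q for p in state if symbol_at[p] == a for q in followpos[p])
            if u not in dstates:
                dstates.add(u)
                unmarked.append(u)
            dtran[state, a] = u
    return start, dstates, dtran


start, dstates, dtran = build_dfa({1, 2, 3}, FOLLOWPOS, SYMBOL_AT, "ab")

# A state is accepting if it contains the position of the end marker #.
accepting = {s for s in dstates if 6 in s}

for (state, a), nxt in sorted(dtran.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(state), a, "->", sorted(nxt))
```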

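Example

( a | b )* a b b #

[Figure: syntax tree. Leaves and their positions: a = 1, b = 2, a = 3, b = 4, b = 5, # = 6. The or-node over positions 1 and 2 sits under the star-node; the remaining leaves are attached by cat-nodes.]

n = ( a | b )* a
nullable(n) = false
firstpos(n) = { 1, 2, 3 }
lastpos(n) = { 3 }
followpos(1) = { 1, 2, 3 }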

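Example

( a | b )* a b b #

firstpos and lastpos for each node of the syntax tree
(only the star-node is nullable)

Node                    firstpos    lastpos
a (position 1)          {1}         {1}
b (position 2)          {2}         {2}
a | b                   {1,2}       {1,2}
( a | b )*              {1,2}       {1,2}
a (position 3)          {3}         {3}
( a | b )* a            {1,2,3}     {3}
b (position 4)          {4}         {4}
( a | b )* a b          {1,2,3}     {4}
b (position 5)          {5}         {5}
( a | b )* a b b        {1,2,3}     {5}
# (position 6)          {6}         {6}
( a | b )* a b b #      {1,2,3}     {6}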

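Example

( a | b )* a b b #

Position    followpos
1           {1, 2, 3}
2           {1, 2, 3}
3           {4}
4           {5}
5           {6}
6           ∅

Resulting DFA (start state {1,2,3}; accepting state {1,2,3,6})

State         a            b
{1,2,3}       {1,2,3,4}    {1,2,3}
{1,2,3,4}     {1,2,3,4}    {1,2,3,5}
{1,2,3,5}     {1,2,3,4}    {1,2,3,6}
{1,2,3,6}     {1,2,3,4}    {1,2,3}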

Elimination of Left Recursion


Productions of the form
    A -> A α | β
are left recursive
An equivalent non-left-recursive form is
    A  -> β A'
    A' -> α A' | ε

When one of the productions in a grammar is left recursive,
a predictive parser loops forever on certain inputs

Immediate Left-Recursion Elimination


Group the productions as
    A -> A α1 | A α2 | ... | A αm | β1 | β2 | ... | βn
where no βi begins with an A

Replace the A-productions by
    A  -> β1 A' | β2 A' | ... | βn A'
    A' -> α1 A' | α2 A' | ... | αm A' | ε

Example
Left-recursive grammar
    A -> A α
       | β
       | γ
       | A δ

Into a right-recursive production
    A  -> β AR
        | γ AR
    AR -> α AR
        | δ AR
        | ε

Non-Immediate Left-Recursion
The Grammar
    S -> A a | b
    A -> A c | S d | ε

The nonterminal S is left recursive, because
    S => A a => S d a
But S is not immediately left recursive.

Elimination of Left Recursion


Eliminating left recursion algorithm
Arrange the nonterminals in some order A1, A2, ..., An
for (each i from 1 to n) {
    for (each j from 1 to i-1) {
        replace each production
            Ai -> Aj γ
        with
            Ai -> δ1 γ | δ2 γ | ... | δk γ
        where
            Aj -> δ1 | δ2 | ... | δk
    }
    eliminate the immediate left recursion in Ai
}
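
The algorithm above can be sketched in Python as follows. The grammar encoding is an assumption made for illustration: a dict from nonterminal to a list of productions, each production a tuple of symbols, with the empty tuple standing for ε. New nonterminals are named by appending a prime (A'), and the grammar is assumed to have no cycles such as A -> A.

```python
def eliminate_immediate(a, prods):
    """Turn A -> A α | β into A -> β A' and A' -> α A' | ε."""
    new = a + "'"
    alphas = [p[1:] for p in prods if p and p[0] == a]
    betas = [p for p in prods if not p or p[0] != a]
    if not alphas:                      # no immediate left recursion
        return {a: list(prods)}
    return {
        a: [beta + (new,) for beta in betas],
        new: [alpha + (new,) for alpha in alphas] + [()],
    }


def eliminate_left_recursion(grammar, order):
    g = {a: list(prods) for a, prods in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for p in g[ai]:
                if p and p[0] == aj:    # Ai -> Aj γ: substitute every Aj-production
                    expanded += [delta + p[1:] for delta in g[aj]]
                else:
                    expanded.append(p)
            g[ai] = expanded
        g.update(eliminate_immediate(ai, g[ai]))
    return g


# S -> A a | b,  A -> A c | S d | ε
grammar = {
    "S": [("A", "a"), ("b",)],
    "A": [("A", "c"), ("S", "d"), ()],
}
result = eliminate_left_recursion(grammar, ["S", "A"])
for head, prods in result.items():
    print(head, "->", " | ".join(" ".join(p) or "ε" for p in prods))
```

With the order S, A the sketch yields A -> b d A' | A' and A' -> c A' | a d A' | ε, while the S-productions stay unchanged.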

Example

    A -> B C | a
    B -> C A | A b
    C -> A B | C C | a

i = 1:          nothing to do
i = 2, j = 1:   B -> C A | A b
                becomes  B -> C A | B C b | a b
    (imm)       B  -> C A BR | a b BR
                BR -> C b BR | ε
i = 3, j = 1:   C -> A B | C C | a
                becomes  C -> B C B | a B | C C | a
i = 3, j = 2:   C -> B C B | a B | C C | a
                becomes  C -> C A BR C B | a b BR C B | a B | C C | a
    (imm)       C  -> a b BR C B CR | a B CR | a CR
                CR -> A BR C B CR | C CR | ε

Exercise
The grammar
    S -> A a | b
    A -> A c | S d | ε

Answer (after substituting the S-productions into A -> S d)
    A -> A c | A a d | b d | ε

Left Factoring
Left factoring is a grammar transformation that is useful for producing
a grammar suitable for predictive (top-down) parsing.

Replace productions
    A -> α β1 | α β2 | ... | α βn | γ
with
    A  -> α AR | γ
    AR -> β1 | β2 | ... | βn
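
One round of this transformation can be sketched in Python. The encoding is an assumption: productions are tuples of symbols, the empty tuple stands for ε, and new nonterminals are named with primes instead of the AR notation. The function `left_factor` handles a single nonterminal and would have to be reapplied to the new nonterminals if they still share prefixes.

```python
def common_prefix(productions):
    """Longest sequence of symbols shared by the start of all productions."""
    prefix = []
    for symbols in zip(*productions):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)


def left_factor(a, prods):
    """One round of left factoring for the productions of nonterminal a."""
    groups = {}
    for p in prods:
        groups.setdefault(p[:1], []).append(p)   # group by first symbol
    result = {a: []}
    count = 0
    for first, group in groups.items():
        if first == () or len(group) == 1:       # nothing to factor
            result[a].extend(group)
            continue
        count += 1
        new = a + "'" * count
        alpha = common_prefix(group)
        result[a].append(alpha + (new,))
        result[new] = [p[len(alpha):] for p in group]
    return result


# S -> i E t S | i E t S e S | a
factored = left_factor("S", [("i", "E", "t", "S"), ("i", "E", "t", "S", "e", "S"), ("a",)])
for head, prods in factored.items():
    print(head, "->", " | ".join(" ".join(p) or "ε" for p in prods))
```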

Example
The Grammar
    stmt -> if expr then stmt
          | if expr then stmt else stmt
Replace with
    stmt  -> if expr then stmt stmts
    stmts -> else stmt | ε

Exercise
The following grammar
    S -> i E t S | i E t S e S | a
    E -> b

Answer
    S  -> i E t S S' | a
    S' -> e S | ε
    E  -> b

First and Follow


First(α) is the set of terminals that begin strings derived from α
    If α =>* ε then ε is also in First(α)
    In predictive parsing, when we have A -> α | β, if First(α)
    and First(β) are disjoint sets then we can select the
    appropriate A-production by looking at the next input symbol

Follow(A), for any nonterminal A, is the set of terminals a that
can appear immediately after A in some sentential form
    If we have S =>* αAaβ for some α and β then a is in Follow(A)
    If A can be the rightmost symbol in some sentential form,
    then $ is in Follow(A)

Computing First
To compute First(X) for all grammar symbols X,
apply the following rules until no more terminals or ε
can be added to any First set:

1. If X is a terminal then First(X) = {X}.
2. If X is a nonterminal and X -> Y1 Y2 ... Yk is a
   production for some k >= 1, then place a in
   First(X) if for some i, a is in First(Yi) and ε is in
   all of First(Y1), ..., First(Yi-1), that is, Y1 ... Yi-1 =>* ε.
   If ε is in First(Yj) for all j = 1, ..., k, then add ε to
   First(X).
3. If X -> ε is a production then add ε to First(X).

Computing follow
To compute Follow(A) for all nonterminals A, apply the
following rules until nothing can be added to any
Follow set:
1. Place $ in Follow(S) where S is the start
   symbol.
2. If there is a production A -> αBβ then
   everything in First(β) except ε is in Follow(B).
3. If there is a production A -> αB, or a production
   A -> αBβ where First(β) contains ε, then
   everything in Follow(A) is in Follow(B).
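
The First and Follow rules can be run as a fixed-point iteration. The sketch below assumes a grammar encoded as a dict from nonterminal to a list of productions (tuples of symbols, the empty tuple for ε); the function names are hypothetical. It is applied to the expression grammar E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
TERMINALS = {"+", "*", "(", ")", "id"}


def first_of_string(symbols, first):
    """First set of a sequence of grammar symbols (rule 2, applied left to right)."""
    out = set()
    for x in symbols:
        out |= first[x] - {EPS}
        if EPS not in first[x]:
            return out
    out.add(EPS)                        # every symbol can derive ε (or the string is empty)
    return out


def compute_first(grammar, terminals):
    first = {t: {t} for t in terminals}             # rule 1
    first.update({a: set() for a in grammar})
    changed = True
    while changed:                                  # repeat until nothing can be added
        changed = False
        for a, prods in grammar.items():
            for p in prods:
                new = first_of_string(p, first)     # rules 2 and 3
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first


def compute_follow(grammar, start, first):
    follow = {a: set() for a in grammar}
    follow[start].add(END)                          # rule 1
    changed = True
    while changed:
        changed = False
        for a, prods in grammar.items():
            for p in prods:
                for i, b in enumerate(p):
                    if b not in grammar:            # only nonterminals have Follow sets
                        continue
                    rest = first_of_string(p[i + 1:], first)
                    new = rest - {EPS}              # rule 2
                    if EPS in rest:
                        new |= follow[a]            # rule 3
                    if not new <= follow[b]:
                        follow[b] |= new
                        changed = True
    return follow


first = compute_first(GRAMMAR, TERMINALS)
follow = compute_follow(GRAMMAR, "E", first)
for a in GRAMMAR:
    print(a, sorted(first[a]), sorted(follow[a]))
```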

Top-down Parsing
Given a grammar G

    E -> E + T | T
    T -> T * F | F
    F -> ( E ) | id

eliminating left recursion gives

    E  -> T E'
    E' -> + T E' | ε
    T  -> F T'
    T' -> * F T' | ε
    F  -> ( E ) | id

Example
Given a grammar G
    E  -> T E'
    E' -> + T E' | ε
    T  -> F T'
    T' -> * F T' | ε
    F  -> ( E ) | id

        FIRST       FOLLOW
E       ( id        ) $
E'      + ε         ) $
T       ( id        + ) $
T'      * ε         + ) $
F       ( id        * + ) $

Non-Recursive Predictive Parsing

Table-Driven Parsing
Given an LL(1) grammar G = <N, T, P, S>, construct a table
M[A,a] for A ∈ N, a ∈ T and use a driver program with a stack

[Figure: the predictive parsing program (driver) reads the input, uses a stack (X Y Z $) and the parsing table M, and produces the output]

Construction of predictive parsing table

For each production A -> α in the grammar do the following:
    1. For each terminal a in First(α), add A -> α to M[A,a]
    2. If ε is in First(α), then for each terminal b in
       Follow(A), add A -> α to M[A,b]. If ε is in First(α)
       and $ is in Follow(A), add A -> α to M[A,$] as well

If after performing the above, there is no production in M[A,a],
then set M[A,a] to error
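
A sketch of the table construction for the expression grammar, with the FIRST and FOLLOW sets entered by hand. The names `GRAMMAR`, `first_of` and `build_table` are assumptions made for illustration. Error entries are represented by missing keys, and $ is stored in FOLLOW so the second rule covers M[A,$] as well.

```python
EPS, END = "ε", "$"

GRAMMAR = [
    ("E", ("T", "E'")),
    ("E'", ("+", "T", "E'")),
    ("E'", ()),
    ("T", ("F", "T'")),
    ("T'", ("*", "F", "T'")),
    ("T'", ()),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", END}, "E'": {")", END},
    "T": {"+", ")", END}, "T'": {"+", ")", END},
    "F": {"*", "+", ")", END},
}


def first_of(alpha):
    """First set of the right-hand side alpha."""
    out = set()
    for x in alpha:
        fx = FIRST.get(x, {x})          # a terminal is its own First set
        out |= fx - {EPS}
        if EPS not in fx:
            return out
    return out | {EPS}


def build_table():
    table = {}
    for head, body in GRAMMAR:
        f = first_of(body)
        lookahead = f - {EPS}           # rule 1
        if EPS in f:
            lookahead |= FOLLOW[head]   # rule 2, including $
        for a in lookahead:
            if (head, a) in table:
                raise ValueError(f"conflict at M[{head},{a}]: grammar is not LL(1)")
            table[head, a] = body
    return table


table = build_table()
for (head, a), body in sorted(table.items()):
    print(f"M[{head},{a}] = {head} -> {' '.join(body) or EPS}")
```

A conflict raised by `build_table` means two productions compete for the same entry, which is exactly the situation in which the grammar is not LL(1).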

Predictive parsing algorithm


set ip to point to the first symbol of w;
set X to the top stack symbol;
while (X <> $) { /* stack is not empty */
    let a be the symbol that ip points to;
    if (X is a) pop the stack and advance ip;
    else if (X is a terminal) error();
    else if (M[X,a] is an error entry) error();
    else if (M[X,a] = X -> Y1 Y2 ... Yk) {
        output the production X -> Y1 Y2 ... Yk;
        pop the stack;
        push Yk, ..., Y2, Y1 onto the stack, with Y1 on top;
    }
    set X to the top stack symbol;
}
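
The driver loop above translates almost line by line into Python. In this sketch the parsing table for the expression grammar is written out by hand, missing keys play the role of error entries, and the names `TABLE`, `NONTERMINALS` and `parse` are assumptions. The final check that the input is exhausted is an addition that the pseudocode leaves implicit.

```python
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the productions of a leftmost derivation, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    ip = 0
    stack = ["$", start]
    output = []
    x = stack[-1]
    while x != "$":                       # stack is not empty
        a = tokens[ip]
        if x == a:                        # terminal on the stack matches the input
            stack.pop()
            ip += 1
        elif x not in NONTERMINALS:
            raise SyntaxError(f"expected {x!r}, found {a!r}")
        elif (x, a) not in TABLE:         # error entry
            raise SyntaxError(f"unexpected {a!r} while expanding {x}")
        else:
            body = TABLE[x, a]
            output.append((x, tuple(body)))
            stack.pop()
            stack.extend(reversed(body))  # Y1 ends up on top
        x = stack[-1]
    if tokens[ip] != "$":
        raise SyntaxError("input remains after the stack was emptied")
    return output


for head, body in parse(["id", "+", "id", "*", "id"]):
    print(head, "->", " ".join(body) or "ε")
```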

LL(1)
A grammar G is LL(1) if it is not left recursive and
for each collection of productions
    A -> α1 | α2 | ... | αn
for nonterminal A the following holds:
    1. FIRST(αi) ∩ FIRST(αj) = ∅ for all i ≠ j
    2. if αi =>* ε then
       a. αj does not derive ε, for all j ≠ i
       b. FIRST(αj) ∩ FOLLOW(A) = ∅ for all j ≠ i

Example

Grammar                 Not LL(1) because:
S -> S a | a            Left recursive
S -> a S | a            FIRST(a S) ∩ FIRST(a) = {a} ≠ ∅
S -> a R | ε            For R: S =>* ε and ε =>* ε
R -> S | ε
S -> a R a              For R: FIRST(S) ∩ FOLLOW(R) ≠ ∅
R -> S | ε

Syntax Directed Definitions


We can associate information with a language construct
by attaching attributes to the grammar symbols.
A syntax directed definition specifies the values of
attributes by associating semantic rules with the
grammar productions.

Production
    E -> E1 + T

Semantic Rule
    E.code = E1.code || T.code || '+'

We may alternatively insert the semantic actions inside the grammar
    E -> E1 + T { print '+' }
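
The semantic rule for E -> E1 + T can be evaluated bottom-up over a parse tree. The sketch below assumes a hypothetical tree encoding, a tuple `("+", E1, T)` for the production and a plain string for an operand, and it assumes a base rule in which the code of an operand is its own name; neither is given in the slides.

```python
def code(node):
    """Synthesized attribute `code`: E.code = E1.code || T.code || '+'."""
    if isinstance(node, str):   # assumed base case: an operand is its own code
        return node
    op, e1, t = node
    return code(e1) + " " + code(t) + " " + op


# Parse tree of a + b + c (left associative)
tree = ("+", ("+", "a", "b"), "c")
print(code(tree))  # prints: a b + c +
```

The attribute of a node depends only on the attributes of its children, which is why a single bottom-up pass is enough here.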

Syntax Directed Definitions


An SDD is a context-free grammar with attributes and rules
Attributes are associated with grammar symbols
and rules with productions
Attributes may be of many kinds: numbers, types,
table references, strings, etc.
Synthesized attributes
    A synthesized attribute at node N is defined only in terms of
    attribute values of the children of N and of N itself

Inherited attributes
    An inherited attribute at node N is defined only in terms of
    attribute values at N's parent, N itself and N's siblings
