Compiler Design Lab

UE168110
INDEX
S.No  Name of the Experiment                  Date        Remarks
1.    INTRODUCTION TO COMPILERS               14-01-2019
2.    IMPLEMENTATION OF A LEXICAL ANALYSER    21-01-2019
3.    REGULAR EXPRESSION TO FINITE AUTOMATA   28-01-2019
4.    CONVERSION OF NFA TO DFA                11-02-2019
5.    MINIMIZATION OF DFA                     18-02-2019
6.    ELIMINATION OF LEFT RECURSION           25-02-2019
7.    FIRST AND FOLLOW OF A GRAMMAR           11-03-2019
8.    FIRST AND FOLLOW OF A GRAMMAR           18-03-2019
9.    RECURSIVE DESCENT PARSER                25-03-2019
Practical 1
Aim - Introduction to compiler and its phases.
Programming languages are implemented in two ways: interpretation and compilation.
What is a Compiler?
A compiler is a program that translates a source program written in a high-level
programming language (such as Java) into machine code for a particular computer architecture
(such as the Intel Pentium architecture). The generated machine code can later be executed many
times, against different data each time.
Compilers are a type of translator that supports digital devices, primarily computers. The name
compiler is used primarily for programs that translate source code from a high-level
programming language into a lower-level language to create an executable program.
What is an Interpreter?
An interpreter is a computer program that directly executes program instructions written
in a high-level programming language. The interpreter either transforms the high-level
program into an intermediate language that it then executes, or parses the high-level
source code and performs its commands directly, line by line or statement by statement.
Interpreter vs Compiler: Difference Between Interpreter and Compiler
INTERPRETER | COMPILER
Translates the program one statement at a time. | Scans the entire program and translates it as a whole into machine code.
Takes less time to analyze the source code, but the overall execution time is slower. | Takes more time to analyze the source code, but the overall execution time is comparatively faster.
Generates no intermediate object code, hence is memory efficient. | Generates intermediate object code which further requires linking, hence requires more memory.
Continues translating the program until the first error is met, in which case it stops. Hence debugging is easy. | Generates error messages only after scanning the whole program. Hence debugging is comparatively hard.
Programming languages like Python and Ruby use interpreters. | Programming languages like C and C++ use compilers.
Examples:
Compilers:
❏ GCC
❏ Clang
❏ javac
❏ Go
Interpreters:
❏ Ruby
❏ Python
❏ PHP
❏ LISP
Phases of Compiler
The compilation process is a sequence of phases. Each phase takes its input from the
previous phase, has its own representation of the source program, and feeds its output to the next
phase of the compiler. The phases are described below.
❏ Lexical Analysis
The first phase of the compiler works as a text scanner. It scans the source code as a
stream of characters and groups them into meaningful lexemes. The lexical analyzer represents
these lexemes in the form of tokens:
<token-name, attribute-value>
Input: Stream of characters
Output: Tokens
Example
Input:
c=a+b*5;
Output:
LEXEME   TOKEN
c        identifier
=        assignment symbol
a        identifier
+        addition symbol
b        identifier
*        multiplication symbol
5        number
❏ Syntax Analysis
The next phase is called syntax analysis or parsing. It takes the tokens produced by
lexical analysis as input and generates a parse tree (or syntax tree). In this phase, token
arrangements are checked against the source code grammar, i.e. the parser checks whether the
expression made by the tokens is syntactically correct.
Input: Tokens
Output: Syntax tree
❏ Semantic Analysis
Semantic analysis checks whether the parse tree follows the rules of the
language. For example, it checks that values are assigned between compatible data types and
reports errors such as adding a string to an integer. The semantic analyzer also keeps track of
identifiers, their types and expressions, and checks whether identifiers are declared before use.
It produces an annotated syntax tree as output.
❏ Intermediate Code Generation
After semantic analysis, the compiler generates intermediate code from the source code
for the target machine. It represents a program for some abstract machine and lies in between
the high-level language and the machine language. This intermediate code should be
generated in such a way that it is easy to translate into the target machine code.
Example:
t1 = inttofloat(5)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
❏ Code Optimization
The next phase optimizes the intermediate code. Optimization removes unnecessary
lines of code and rearranges the sequence of statements in order to speed up program
execution without wasting resources (CPU, memory).
Example:
t1 = id3 * 5.0
id1 = id2 + t1
❏ Code Generation
In this phase, the code generator takes the optimized representation of the intermediate
code and maps it to the target machine language. The code generator translates the
intermediate code into a sequence of (generally) relocatable machine code. This sequence
of machine instructions performs the same task as the intermediate code.
Example:
LDF R2, id3
MULF R2, #5.0
LDF R1, id2
ADDF R1, R2
STF id1, R1
Practical 2
Aim - Write a program to implement Lexical Analyzer/Tokenization.
#include<iostream>
#include<stdio.h>
#include<string.h>
using namespace std;
int count_keys = 0;
int count_ident = 0;
int count_const = 0;
int count_opt = 0;
int count_del=0;
int isOperator(char* str1)

{
char ch;
cout<<str1;
for(int i=0;i<strlen(str1);i++)
{
ch = str1[i];
if(ch == '+' || ch == '-' || ch == '*' || ch == '/' || ch == '>' || ch == '<' || ch == '=')
{
cout<<ch<<" is an operator"<<endl;
count_opt++;
}
else if(ch == '(' || ch == ')' || ch == ',' || ch == ';' || ch == ':' || ch == '{' || ch == '}' || ch == '[' || ch == ']' || ch == '"')
{
cout<<ch<<" is a delimiter"<<endl;
count_del++;
}
}
return 0;
}
void find_count(char* t)
{
if(!strcmp(t,"int") || !strcmp(t,"main") || !strcmp(t,"for") || !strcmp(t,"endl") || !strcmp(t,"cout") ||
!strcmp(t,"cin") || !strcmp(t,"while") || !strcmp(t,"include") || !strcmp(t,"using") ||
!strcmp(t,"namespace") || !strcmp(t,"std") || !strcmp(t,"case") || !strcmp(t,"switch") ||
!strcmp(t,"if") || !strcmp(t,"else") || !strcmp(t,"break")|| !strcmp(t,"continue") || !strcmp(t,"return")
|| !strcmp(t,"double") || !strcmp(t,"sizeof") || !strcmp(t,"void"))
{
cout<<t<<" is keyword"<<endl;
count_keys++;
}
// a token made up only of digits is treated as a numeric constant
else if(strspn(t,"0123456789") == strlen(t))
{
cout<<t<<" is constant"<<endl;
count_const++;
}
else
{
cout<<t<<" is identifier"<<endl;
count_ident++;
}
}
void parse(char* str,char* str1)

{
char* token = strtok(str, ".,;:()/-+={}><' ");
while(token!= NULL)
{
find_count(token);
token = strtok(NULL, ".,;:()/-+={}><' ");
}
isOperator(str1);
cout<<"count of delimiters is : "<<count_del<<endl;
cout<<"count of operators is : "<<count_opt<<endl;
cout<<"count of keywords is : "<<count_keys<<endl;
cout<<"count of identifiers is : "<<count_ident<<endl;
cout<<"count of constants is : "<<count_const<<endl;
}
int main()
{
char str[]= "int main() {int a = b+c; for(int j=0;j<10;j++){ printf(\"My name is Tanvi\"<<endl;}";
char str1[]="int main() {int a = b+c; for(int j=0;j<10;j++){ printf(\"My name is Tanvi\"<<endl;}";
parse(str,str1);
return 0;
}
Fig. 2.1
Practical 3
Aim - Write a program to convert an input regular expression to a finite automaton
(an NFA with ε-transitions).
#include <bits/stdc++.h>
#include <stdio.h>
int ret[100];
static int pos=0;
static int sc=0;
void nfa(int st,int p,char *s)

{ int i,sp,fs[15],fsc=0;
sp=st;pos=p;sc=st;
while(*s)
{if(isalpha(*s))
{ret[pos++]=sp;
ret[pos++]=*s;
ret[pos++]=++sc;}
if(*s=='.')
{sp=sc;
ret[pos++]=sc;
ret[pos++]=238;
ret[pos++]=++sc;
sp=sc;}
if(*s=='|')
{sp=st;
fs[fsc++]=sc;}
if(*s=='*')
{ret[pos++]=sc;
ret[pos++]=238;
ret[pos++]=sp;
ret[pos++]=sp;
ret[pos++]=238;
ret[pos++]=sc;
}
if (*s=='(')
{char ps[50];
int i=0,flag=1;
s++;
while(flag!=0)
{ps[i++]=*s;
if (*s=='(')
flag++;
if (*s==')')
flag--;
s++;}
ps[--i]='\0';
nfa(sc,pos,ps);
s--;
}
s++;
}
sc++;
for(i=0;i<fsc;i++)
{ret[pos++]=fs[i];
ret[pos++]=238;
ret[pos++]=sc;
}
ret[pos++]=sc-1;
ret[pos++]=238;
ret[pos++]=sc;
}
int main()
{ int i;
char inp[100];
printf("Tanvi");
printf("\nUE168110\n");
printf("enter the regular expression :");
scanf("%99s",inp); /* gets() is unsafe and was removed from the standard library */
puts(inp);
nfa(1,0,inp);
printf("\nstate input state\n");
for(i=0;i<pos;i=i+3)
printf("%d --%c--> %d\n",ret[i],ret[i+1],ret[i+2]);
printf("\n");
return 0;
}
Practical 4
Aim: Write a Program which converts Non-Deterministic Finite automata to Deterministic finite
Automata.
#include <stdio.h>
#include <string.h>
#define STATES 50
struct Dstate
{
char name;
char StateString[STATES+1];
char trans[10];
int is_final;
}Dstates[50];
struct tran
{
char sym;
int tostates[50];
int notran;
};
struct state
{
int no;
struct tran tranlist[50];
};
int stackA[100],stackB[100],C[100],Cptr=-1,Aptr=-1,Bptr=-1;
struct state States[STATES];
char temp[STATES+1],inp[10];
int nos,noi,nof,j,k,nods=-1;
void pushA(int z)
{
stackA[++Aptr]=z;
}
void pushB(int z)
{
stackB[++Bptr]=z;
}
int popA()
{
return stackA[Aptr--];
}
void copy(int i)
{
char temp[STATES+1]=" ";
int k=0;
Bptr=-1;
strcpy(temp,Dstates[i].StateString);
while(temp[k]!='\0')
{
pushB(temp[k]-'0');
k++;
}
}
int popB()
{
return stackB[Bptr--];
}
int peekB()
{
return stackB[Bptr];
}
int peekA()
{
return stackA[Aptr];
}
int seek(int arr[],int ptr,int s)
{
int i;
for(i=0;i<=ptr;i++)
{
if(s==arr[i])
return 1;
}
return 0;
}
void sort()
{
int i,j,temp;
for(i=0;i<Bptr;i++)
{
for(j=0;j<(Bptr-i);j++)
{
if(stackB[j]>stackB[j+1])
{
temp=stackB[j];
stackB[j]=stackB[j+1];
stackB[j+1]=temp;
}
}
}
}
void tostring()
{
int i=0;
sort();
for(i=0;i<=Bptr;i++)
{
temp[i]=stackB[i]+'0';
}
temp[i]='\0';
}
void display_DTran()
{
int i,j;
printf("\n\t\t DFA Transition Table ");
printf("\n\t\t -------------------- ");

printf("\nStates\tString\tInputs\n ");
for(i=0;i<noi;i++)
{
printf("\t%c",inp[i]);
}
printf("\n \t----------");
for(i=0;i<nods;i++)
{
if(Dstates[i].is_final==0)
printf("\n%c",Dstates[i].name);
else
printf("\n*%c",Dstates[i].name);
printf("\t%s",Dstates[i].StateString);
for(j=0;j<noi;j++)
{
printf("\t%c",Dstates[i].trans[j]);
}
}
printf("\n");
}
void move(int st,int j)
{
int ctr=0;
while(ctr<States[st].tranlist[j].notran)
{
pushA(States[st].tranlist[j].tostates[ctr++]);
}
}
void lambda_closure(int st)
{
int ctr=0,in_state=st,curst=st,chk;
while(Aptr!=-1)
{
curst=popA();
ctr=0;
in_state=curst;
while(ctr<=States[curst].tranlist[noi].notran)
{
chk=seek(stackB,Bptr,in_state);
if(chk==0)
pushB(in_state);
in_state=States[curst].tranlist[noi].tostates[ctr++];
chk=seek(stackA,Aptr,in_state);
if(chk==0 && ctr<=States[curst].tranlist[noi].notran)

pushA(in_state);
}
}
}
int main()
{
printf("\nTanvi\nUE168110\n");
int final[20],start,fin=0,i;
char c,ans,st[20];
printf("\nEnter no. of states in NFA : ");
scanf("%d",&nos);
for(i=0;i<nos;i++)
{
States[i].no=i;
}
printf("\nEnter the start state : ");
scanf("%d",&start);
printf("Enter the no. of final states : ");
scanf("%d",&nof);
printf("\nEnter the final states : \n");
for(i=0;i<nof;i++)
scanf("%d",&final[i]);
printf("\nEnter the no. of input symbols : ");
scanf("%d",&noi);
c=getchar();
printf("\nEnter the input symbols : \n ");
for(i=0;i<noi;i++)
{
scanf("%c",&inp[i]);
c=getchar();
}
inp[i]='e';
printf("\nEnter the transitions : (-1 to stop)\n");
for(i=0;i<nos;i++)
{
for(j=0;j<=noi;j++)
{
States[i].tranlist[j].sym=inp[j];
k=0;
ans='y';
while(ans=='y')
{
printf("move(%d,%c) : ",i,inp[j]);
scanf("%d",&States[i].tranlist[j].tostates[k++]);
if(States[i].tranlist[j].tostates[k-1]==-1)
{
k--;ans='n';
break;
}
}
States[i].tranlist[j].notran=k;
}
}
//Conversion
i=0;nods=0;fin=0;
pushA(start);
lambda_closure(peekA());
tostring();
Dstates[nods].name='A';
nods++;
strcpy(Dstates[0].StateString,temp);
while(i<nods)
{
for(j=0;j<noi;j++)
{
fin=0;
copy(i);
while(Bptr!=-1)
{
move(popB(),j);
}
while(Aptr!=-1)
lambda_closure(peekA());
tostring();
for(k=0;k<nods;k++)
{
if((strcmp(temp,Dstates[k].StateString)==0))
{
Dstates[i].trans[j]=Dstates[k].name;
break;
}
}
if(k==nods)
{
nods++;
for(k=0;k<nof;k++)
{
fin=seek(stackB,Bptr,final[k]);
if(fin==1)
{
Dstates[nods-1].is_final=1;
break;
}
}
strcpy(Dstates[nods-1].StateString,temp);
Dstates[nods-1].name='A'+nods-1;
Dstates[i].trans[j]=Dstates[nods-1].name;
}
}
i++;
}
display_DTran();
}
Fig 4.1
Practical 5
Aim: Write a program for the minimization of a DFA.
#include <stdio.h>
#include <string.h>
#define STATES 99
#define SYMBOLS 20
int N_symbols; /* number of input symbols */

int N_DFA_states; /* number of DFA states */
char *DFA_finals; /* final-state string */
int DFAtab[STATES][SYMBOLS];
char StateName[STATES][STATES+1]; /* state-name table */
int N_optDFA_states; /* number of optimized DFA states */

int OptDFA[STATES][SYMBOLS];
char NEW_finals[STATES+1];
/*
Print state-transition table.
State names: 'A', 'B', 'C', ...
*/
void print_dfa_table(
int tab[][SYMBOLS], /* DFA table */
int nstates, /* number of states */
int nsymbols, /* number of input symbols */
char *finals)
{
int i, j;
puts("\nDFA: STATE TRANSITION TABLE");
/* input symbols: '0', '1', ... */

printf(" | ");
for (i = 0; i < nsymbols; i++) printf(" %c ", '0'+i);
printf("\n-----+--");
for (i = 0; i < nsymbols; i++) printf("-----");
printf("\n");
for (i = 0; i < nstates; i++) {

printf(" %c | ", 'A'+i); /* state */
for (j = 0; j < nsymbols; j++)
printf(" %c ", tab[i][j]); /* next state */
printf("\n");
}
printf("Final states = %s\n", finals);
}
/*
Initialize DFA table.
*/
void load_DFA_table()
{
DFAtab[0][0] = 'B'; DFAtab[0][1] = 'C';

DFAtab[1][0] = 'E'; DFAtab[1][1] = 'F';
DFAtab[2][0] = 'A'; DFAtab[2][1] = 'A';
DFAtab[3][0] = 'F'; DFAtab[3][1] = 'E';
DFAtab[4][0] = 'D'; DFAtab[4][1] = 'F';
DFAtab[5][0] = 'D'; DFAtab[5][1] = 'E';
DFA_finals = "EF";
N_DFA_states = 6;
N_symbols = 2;
}
/*
Get next-state string for current-state string.
*/
void get_next_state(char *nextstates, char *cur_states,
int dfa[STATES][SYMBOLS], int symbol)
{
int i, ch;
for (i = 0; i < strlen(cur_states); i++)

*nextstates++ = dfa[cur_states[i]-'A'][symbol];
*nextstates = '\0';
}
/*
Get index of the equivalence states for state 'ch'.
Equiv. class id's are '0', '1', '2', ...
*/
char equiv_class_ndx(char ch, char stnt[][STATES+1], int n)
{
int i;
for (i = 0; i < n; i++)

if (strchr(stnt[i], ch)) return i+'0';
return -1; /* next state is NOT defined */

}
/*
Check if all the next states belongs to same equivalence class.
Return value:
If next state is NOT unique, return 0.
If next state is unique, return next state --> 'A/B/C/...'
's' is a '0/1' string: state-id's
*/
char is_one_nextstate(char *s)
{
char equiv_class; /* first equiv. class */
while (*s == '@') s++;

equiv_class = *s++; /* index of equiv. class */
while (*s) {
if (*s != '@' && *s != equiv_class) return 0;
s++;
}
return equiv_class; /* next state: char type */

}
int state_index(char *state, char stnt[][STATES+1], int n, int *pn,

int cur) /* 'cur' is added only for 'printf()' */
{
int i;
char state_flags[STATES+1]; /* next state info. */
if (!*state) return -1; /* no next state */
for (i = 0; i < strlen(state); i++)

state_flags[i] = equiv_class_ndx(state[i], stnt, n);
state_flags[i] = '\0';
printf(" %d:[%s]\t--> [%s] (%s)\n",

cur, stnt[cur], state, state_flags);
if (i=is_one_nextstate(state_flags))
return i-'0'; /* deterministic next states */
else {
strcpy(stnt[*pn], state_flags); /* state-division info */
return (*pn)++;
}
}
/*
Divide DFA states into finals and non-finals.
*/
int init_equiv_class(char statename[][STATES+1], int n, char *finals)
{
int i, j;
if (strlen(finals) == n) { /* all states are final states */

strcpy(statename[0], finals);
return 1;
}
strcpy(statename[1], finals); /* final state group */
for (i=j=0; i < n; i++) {

if (i == *finals-'A') {
finals++;
} else statename[0][j++] = i+'A';
}
statename[0][j] = '\0';
return 2;
}
/*
Get optimized DFA 'newdfa' for equiv. class 'stnt'.
*/
int get_optimized_DFA(char stnt[][STATES+1], int n,
int dfa[][SYMBOLS], int n_sym, int newdfa[][SYMBOLS])
{
int n2=n; /* 'n' + <num. of state-division info>
*/
int i, j;
char nextstate[STATES+1];
for (i = 0; i < n; i++) { /* for each pseudo-DFA state */

for (j = 0; j < n_sym; j++) { /* for each input symbol */
get_next_state(nextstate, stnt[i], dfa, j);
newdfa[i][j] = state_index(nextstate, stnt, n, &n2, i)+'A';
}
}
return n2;
}
/*
char 'ch' is appended at the end of 's'.
*/
void chr_append(char *s, char ch)
{
int n=strlen(s);
*(s+n) = ch;
*(s+n+1) = '\0';
}
void sort(char stnt[][STATES+1], int n)

{
int i, j;
char temp[STATES+1];
for (i = 0; i < n-1; i++)

for (j = i+1; j < n; j++)
if (stnt[i][0] > stnt[j][0]) {
strcpy(temp, stnt[i]);
strcpy(stnt[i], stnt[j]);
strcpy(stnt[j], temp);
}
}
/*
Divide first equivalent class into subclasses.
stnt[i1] : equiv. class to be segmented
stnt[i2] : equiv. vector for next state of stnt[i1]
Algorithm:
- stnt[i1] is splitted into 2 or more classes 's1/s2/...'
- old equiv. classes are NOT changed, except stnt[i1]
- stnt[i1]=s1, stnt[n]=s2, stnt[n+1]=s3, ...
Return value: number of NEW equiv. classses in 'stnt'.
*/
int split_equiv_class(char stnt[][STATES+1],
int i1, /* index of 'i1'-th equiv. class */
int i2, /* index of equiv. vector for 'i1'-th class */
int n, /* number of entries in 'stnt' */
int n_dfa) /* number of source DFA entries */
{
char *old=stnt[i1], *vec=stnt[i2];
int i, n2, flag=0;
char newstates[STATES][STATES+1]; /* max. 'n' subclasses */
for (i=0; i < STATES; i++) newstates[i][0] = '\0';
for (i=0; vec[i]; i++)

chr_append(newstates[vec[i]-'0'], old[i]);
for (i=0, n2=n; i < n_dfa; i++) {

if (newstates[i][0]) {
if (!flag) { /* stnt[i1] = s1 */
strcpy(stnt[i1], newstates[i]);
flag = 1; /* overwrite parent class */
} else /* newstate is appended in 'stnt' */
strcpy(stnt[n2++], newstates[i]);
}
}
sort(stnt, n2); /* sort equiv. classes */
return n2; /* number of NEW states(equiv. classes) */

}
/*
Equiv. classes are segmented and get NEW equiv. classes.
*/
int set_new_equiv_class(char stnt[][STATES+1], int n,
int newdfa[][SYMBOLS], int n_sym, int n_dfa)
{
int i, j, k;
for (i = 0; i < n; i++) {

for (j = 0; j < n_sym; j++) {
k = newdfa[i][j]-'A'; /* index of equiv. vector */
if (k >= n) /* equiv. class 'i' should be segmented */
return split_equiv_class(stnt, i, k, n, n_dfa);
}
}
return n;
}
void print_equiv_classes(char stnt[][STATES+1], int n)

{
int i;
printf("\nEQUIV. CLASS CANDIDATE ==>");

for (i = 0; i < n; i++)
printf(" %d:[%s]", i, stnt[i]);
printf("\n");
}
/*
State-minimization of DFA: 'dfa' --> 'newdfa'
Return value: number of DFA states.
*/
int optimize_DFA(
int dfa[][SYMBOLS], /* DFA state-transition table */
int n_dfa, /* number of DFA states */
int n_sym, /* number of input symbols */
char *finals, /* final states of DFA */
char stnt[][STATES+1], /* state name table */
int newdfa[][SYMBOLS]) /* reduced DFA table */
{
char nextstate[STATES+1];
int n; /* number of new DFA states */
int n2; /* 'n' + <num. of state-dividing info> */
n = init_equiv_class(stnt, n_dfa, finals);
while (1) {
print_equiv_classes(stnt, n);
n2 = get_optimized_DFA(stnt, n, dfa, n_sym, newdfa);
if (n != n2)
n = set_new_equiv_class(stnt, n, newdfa, n_sym, n_dfa);
else break; /* equiv. class segmentation ended!!! */
}
return n; /* number of DFA states */

}
/*
Check if 't' is a subset of 's'.
*/
int is_subset(char *s, char *t)
{
int i;
for (i = 0; *t; i++)

if (!strchr(s, *t++)) return 0;
return 1;
}
/*
New finals states of reduced DFA.
*/
void get_NEW_finals(
char *newfinals, /* new DFA finals */
char *oldfinals, /* source DFA finals */
char stnt[][STATES+1], /* state name table */
int n) /* number of states in 'stnt' */
{
int i;
for (i = 0; i < n; i++)

if (is_subset(oldfinals, stnt[i])) *newfinals++ = i+'A';
*newfinals++ = '\0';
}
int main()
{
load_DFA_table();
print_dfa_table(DFAtab, N_DFA_states, N_symbols, DFA_finals);
N_optDFA_states = optimize_DFA(DFAtab, N_DFA_states,

N_symbols, DFA_finals, StateName, OptDFA);
get_NEW_finals(NEW_finals, DFA_finals, StateName, N_optDFA_states);
print_dfa_table(OptDFA, N_optDFA_states, N_symbols, NEW_finals);

}
Fig. 5.1
Practical 6
Aim : To find First and Follow of Context Free Grammar.
// C++ program to calculate the First and
// Follow sets of a given grammar
#include<stdio.h>
#include<ctype.h>
#include<string.h>
#include<iostream>
using std::cin;
using std::cout;
using std::endl;
// Functions to calculate Follow

void followfirst(char, int, int);
void follow(char c);
// Function to calculate First

void findfirst(char, int, int);
int count, n = 0;
// Stores the final result

// of the First Sets
char calc_first[10][100];
// Stores the final result

// of the Follow Sets
char calc_follow[10][100];
int m = 0;
// Stores the production rules

char production[10][10];
char f[10], first[10];
int k;
char ck;
int e;
int main(int argc, char **argv)

{
int jm = 0;
int km = 0;
int i, choice;
char c, ch;
// The Input grammar

cout<<"How many productions do you want to enter?"<<endl;
cin>>count;
cout<<"Enter Production rules:"<<endl;
for(int i=0;i<count;i++)
{
cin>>production[i];
}
int kay;
char done[count];
int ptr = -1;
// Initializing the calc_first array

for(k = 0; k < count; k++) {
for(kay = 0; kay < 100; kay++) {
calc_first[k][kay] = '!';
}
}
int point1 = 0, point2, xxx;
for(k = 0; k < count; k++)

{
c = production[k][0];
point2 = 0;
xxx = 0;
// Checking if First of c has

// already been calculated
for(kay = 0; kay <= ptr; kay++)
if(c == done[kay])
xxx = 1;
if (xxx == 1)
continue;
// Function call
findfirst(c, 0, 0);
ptr += 1;
// Adding c to the calculated list

done[ptr] = c;
printf("\n First(%c) = { ", c);
calc_first[point1][point2++] = c;
// Printing the First Sets of the grammar

for(i = 0 + jm; i < n; i++) {
int lark = 0, chk = 0;
for(lark = 0; lark < point2; lark++) {
if (first[i] == calc_first[point1][lark])
{
chk = 1;
break;
}
}
if(chk == 0)
{
printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i];
}
}
printf("}\n");
jm = n;
point1++;
}
printf("\n");
printf("-----------------------------------------------\n\n");
char donee[count];
ptr = -1;
// Initializing the calc_follow array

for(k = 0; k < count; k++) {
for(kay = 0; kay < 100; kay++) {
calc_follow[k][kay] = '!';
}
}
point1 = 0;
int land = 0;
for(e = 0; e < count; e++)
{
ck = production[e][0];
point2 = 0;
xxx = 0;
// Checking if Follow of ck
// has already been calculated
for(kay = 0; kay <= ptr; kay++)
if(ck == donee[kay])
xxx = 1;
if (xxx == 1)
continue;
land += 1;
// Function call
follow(ck);
ptr += 1;
// Adding ck to the calculated list

donee[ptr] = ck;
printf(" Follow(%c) = { ", ck);
calc_follow[point1][point2++] = ck;
// Printing the Follow Sets of the grammar

for(i = 0 + km; i < m; i++) {
int lark = 0, chk = 0;
for(lark = 0; lark < point2; lark++)
{
if (f[i] == calc_follow[point1][lark])
{
chk = 1;
break;
}
}
if(chk == 0)
{
printf("%c, ", f[i]);
calc_follow[point1][point2++] = f[i];
}
}
printf(" }\n\n");
km = m;
point1++;
}
}
void follow(char c)
{
int i, j;
// Adding "$" to the follow

// set of the start symbol
if(production[0][0] == c) {
f[m++] = '$';
}
for(i = 0; i < 10; i++)
{
for(j = 2;j < 10; j++)
{
if(production[i][j] == c)
{
if(production[i][j+1] != '\0')
{
// Calculate the first of the next
// Non-Terminal in the production
followfirst(production[i][j+1], i, (j+2));
}
if(production[i][j+1]=='\0' && c!=production[i][0])

{
// Calculate the follow of the Non-Terminal
// in the L.H.S. of the production
follow(production[i][0]);
}
}
}
}
}
void findfirst(char c, int q1, int q2)

{
int j;
// The case where we

// encounter a Terminal
if(!(isupper(c))) {
first[n++] = c;
}
for(j = 0; j < count; j++)
{
if(production[j][0] == c)
{
if(production[j][2] == '#')
{
if(production[q1][q2] == '\0')
first[n++] = '#';
else if(production[q1][q2] != '\0'
&& (q1 != 0 || q2 != 0))
{
// Recursion to calculate First of New

// Non-Terminal we encounter after epsilon
findfirst(production[q1][q2], q1, (q2+1));
}
else
first[n++] = '#';
}
else if(!isupper(production[j][2]))
{
first[n++] = production[j][2];
}
else
{
// Recursion to calculate First of
// New Non-Terminal we encounter
// at the beginning
findfirst(production[j][2], j, 3);
}
}
}
}
void followfirst(char c, int c1, int c2)

{
int k;
// The case where we encounter

// a Terminal
if(!(isupper(c)))
f[m++] = c;
else
{
int i = 0, j = 1;
for(i = 0; i < count; i++)
{
if(calc_first[i][0] == c)
break;
}
//Including the First set of the

// Non-Terminal in the Follow of
// the original query
while(calc_first[i][j] != '!')
{
if(calc_first[i][j] != '#')
{
f[m++] = calc_first[i][j];
}
else
{
if(production[c1][c2] == '\0')
{
// Case where we reach the
// end of a production
follow(production[c1][0]);
}
else
{
// Recursion to the next symbol
// in case we encounter a "#"
followfirst(production[c1][c2], c1, c2+1);
}
}
j++;
}
}
}
Fig. 6.1
Practical 7
Parsers
1. Top Down Parsers (TDP)
1.1 TDP with Backtracking - Brute Force
1.2 TDP without backtracking - Non Recursive descent LL(1)
2. Bottom-up Parser (BUP) or shift reduce
2.1 Operator Precedence
2.2 LR parsers: LR(0), SLR(1), LALR(1), CLR(1)
#include<iostream>
#include<stdio.h>
#include<string.h>
using std::cin;
using std::cout;
using std::endl;
char input[100];
char prod[100][100];
int pos=-1,len,st=-1;
char id,num;
void E();
void E_dash();
void T();
void T_dash();
void F();
void advance();
void advance()
{
pos++;
cout<<"Input is updated to :- "<<input[pos]<<endl;
if((input[pos]>='a' && input[pos]<='z') || (input[pos]>='A' && input[pos]<='Z'))
{
id = input[pos];
}
}
void E()
{
strcpy(prod[++st],"E->TE'");
T();
E_dash();
}
void E_dash()
{
cout<<"In E'()--------------"<<endl;
int p=1;
if(input[pos]=='+')
{
p=0;
strcpy(prod[++st],"E'->+TE'");
advance();
cout<<"Calling T()....."<<endl;
T();
E_dash();
}
if(p==1)
{
strcpy(prod[++st],"E'->null");
}
}
void T()
{
cout<<"In T()--------------"<<endl;
strcpy(prod[++st],"T->FT'");
cout<<"Calling F()....."<<endl;
F();
T_dash();
}
void T_dash()
{
int p=1;
if(input[pos]=='*')
{
p=0;
strcpy(prod[++st],"T'->*FT'");
advance();
F();
T_dash();
}
if(p==1)
{
strcpy(prod[++st],"T'->null");
}
}
void F()
{
cout<<"In F()--------------"<<endl;
if(input[pos]==id){
cout<<"INPUT MATCHED : - id"<<endl;
strcpy(prod[++st],"F->id");
advance();
}
else if(input[pos]=='(')
{
cout<<"INPUT MATCHED : - '(' "<<endl;
strcpy(prod[++st],"F->(E)");
advance();
cout<<"Calling E()....."<<endl;
E();
if(input[pos]==')')
{
advance();
}
}
}
int main()
{
cout<<"Enter Input"<<endl;
cin>>input;
len=strlen(input);
input[len]='$';
advance();
cout<<"Calling E()....."<<endl;
E();
if(pos==len)
{
cout<<"String accepted"<<endl;
for(int i=0;i<=st;i++)
{
cout<<prod[i]<<endl;
}
}
else
{
cout<<"String not accepted"<<endl;
}
return 0;
}
Fig 7.1
Fig 7.2