Practical 1
UE168110
What is a Compiler?
A compiler is a program that translates a source program written in a high-level
programming language (such as C) into machine code for a particular computer architecture (such
as the Intel Pentium architecture). The generated machine code can later be executed many times,
against different data each time.
A compiler is one kind of translator. The name compiler is primarily used for programs that
translate source code from a high-level programming language to a lower-level language
(such as assembly or machine code) to create an executable program.
What is an Interpreter?
An interpreter is a computer program that directly executes instructions written in a
high-level programming language. It either transforms the high-level program into an
intermediate language that it then executes, or it parses the high-level source code and
performs the commands directly, line by line or statement by statement.
INTERPRETER | COMPILER
Translates the program one statement at a time. | Scans the entire program and translates it as a whole into machine code.
Takes less time to analyze the source code, but the overall execution time is slower. | Takes more time to analyze the source code, but the overall execution time is comparatively faster.
Generates no intermediate object code, and is therefore memory efficient. | Generates intermediate object code that further requires linking, and therefore requires more memory.
Continues translating the program until the first error is met, at which point it stops. Hence debugging is easy. | Generates error messages only after scanning the whole program. Hence debugging is comparatively hard.
Programming languages like Python and Ruby use interpreters. | Programming languages like C and C++ use compilers.
Examples:
Compilers:
❏ GCC
❏ Clang
❏ javac
❏ Go
Interpreters:
❏ Ruby
❏ Python
❏ PHP
❏ LISP
Phases of Compiler
The compilation process is a sequence of phases. Each phase takes its input from the
previous phase, has its own representation of the source program, and feeds its output to the
next phase of the compiler. The phases are described below.
❏ Lexical Analysis
The first phase of the compiler works as a text scanner. This phase scans the source code as a
stream of characters and groups them into meaningful lexemes. The lexical analyzer represents
these lexemes in the form of tokens as:
<token-name, attribute-value>
Input: stream of characters
Output: Tokens
Example
Input:
c=a+b*5;
Output:
LEXEMES | TOKENS
c | identifier
= | assignment symbol
a | identifier
+ | addition symbol
b | identifier
* | multiplication symbol
5 | number
; | statement terminator
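The split of the example input c=a+b*5; into lexemes can be sketched in C++ as below. This is only a minimal sketch and not the program from Practical 2: the Token struct and the tokenize function are hypothetical names, and the sketch assumes that identifiers, unsigned integers and single-character symbols are the only token kinds.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token record: a <token-name, attribute-value> pair.
struct Token {
    std::string name;    // "identifier", "number" or "symbol"
    std::string lexeme;  // the matched source text
};

// Split a source string into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char ch = static_cast<unsigned char>(src[i]);
        if (std::isspace(ch)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(ch) || ch == '_') {
            // Identifier: a letter or underscore followed by letters, digits, underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            out.push_back({"identifier", src.substr(start, i - start)});
        } else if (std::isdigit(ch)) {
            // Number: a run of decimal digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            out.push_back({"number", src.substr(start, i - start)});
        } else {
            // Anything else is treated as a one-character symbol.
            ++i;
            out.push_back({"symbol", src.substr(start, 1)});
        }
    }
    return out;
}
```

Calling tokenize("c=a+b*5;") returns one Token per lexeme, in source order.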
❏ Syntax Analysis
The next phase is called syntax analysis or parsing. It takes the tokens produced by
lexical analysis as input and generates a parse tree (or syntax tree). In this phase, token
arrangements are checked against the grammar of the source language, i.e. the parser checks
whether the expression made by the tokens is syntactically correct.
Input: Tokens
Output: Syntax tree
❏ Semantic Analysis
Semantic analysis checks whether the parse tree follows the rules of the language. For
example, it checks that values are assigned between compatible data types and reports
errors such as adding a string to an integer. The semantic analyzer also keeps track of
identifiers, their types and expressions, and checks whether identifiers are declared
before use. It produces an annotated syntax tree as output.
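One of the checks mentioned above, whether identifiers are declared before use, can be sketched as follows. This is an illustrative sketch only: the Stmt record and the undeclared_uses function are hypothetical, and a real semantic analyzer walks the syntax tree and handles scopes and types, which this sketch ignores.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical statement record: it either declares a name or uses one.
struct Stmt {
    bool declares;     // true for a declaration, false for a use
    std::string name;  // the identifier involved
};

// Return every name that is used before it has been declared, in order of use.
std::vector<std::string> undeclared_uses(const std::vector<Stmt>& program) {
    std::set<std::string> declared;   // a one-level symbol table
    std::vector<std::string> errors;
    for (const Stmt& s : program) {
        if (s.declares) {
            declared.insert(s.name);
        } else if (declared.count(s.name) == 0) {
            errors.push_back(s.name);
        }
    }
    return errors;
}
```

For a program that declares a, uses a, uses b and only then declares b, the function reports b once.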
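❏ Intermediate Code Generation
After semantic analysis, the compiler generates an intermediate code of the source program.
This intermediate code is generated in such a way that it is easy to translate into the
target machine code.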
Example:
t1 = inttofloat(5)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
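A sequence of three-address statements like the one above can be produced by a post-order walk over an expression tree. The following is a minimal sketch under stated assumptions: Node, leaf, unary, binary and three_address are hypothetical names, and the tree for id2 + id3 * inttofloat(5) is assumed to have been built already by the earlier phases.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical expression-tree node. A leaf has no children; a unary call has
// only a left child; a binary operator has both.
struct Node {
    std::string text;             // operand name, function name or operator
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr leaf(const std::string& s) { return NodePtr(new Node{s, nullptr, nullptr}); }
NodePtr unary(const std::string& fn, NodePtr arg) {
    return NodePtr(new Node{fn, std::move(arg), nullptr});
}
NodePtr binary(const std::string& op, NodePtr l, NodePtr r) {
    return NodePtr(new Node{op, std::move(l), std::move(r)});
}

struct Generator {
    std::vector<std::string> code;  // emitted statements, in order
    int temps = 0;                  // number of temporaries created so far

    // Emit code for the subtree and return the name that holds its value.
    std::string gen(const Node& n) {
        if (!n.left) return n.text;  // a leaf needs no code
        std::string a = gen(*n.left);
        std::string rhs;
        if (!n.right) {
            rhs = n.text + "(" + a + ")";       // unary call
        } else {
            std::string b = gen(*n.right);
            rhs = a + " " + n.text + " " + b;   // binary operation
        }
        std::string t = "t" + std::to_string(++temps);
        code.push_back(t + " = " + rhs);
        return t;
    }
};

// Generate the statements for "target = expr".
std::vector<std::string> three_address(const std::string& target, const Node& expr) {
    Generator g;
    std::string result = g.gen(expr);
    g.code.push_back(target + " = " + result);
    return g.code;
}
```

Each inner node of the tree gets exactly one temporary, so the number of statements is the number of inner nodes plus the final assignment.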
❏ Code Optimization
The next phase optimizes the intermediate code. Optimization removes unnecessary code
lines and rearranges the sequence of statements in order to speed up program execution
without wasting resources (CPU, memory).
Example:
t1 = id3 * 5.0
id1 = id2 + t1
❏ Code Generation
In this phase, the code generator takes the optimized intermediate code and maps it to the
target machine language. It translates the intermediate code into a sequence of (generally)
relocatable machine instructions that perform the same task as the intermediate code.
Example:
Practical 2
Aim - Write a program to implement Lexical Analyzer/Tokenization.
#include<iostream>
#include<stdio.h>
#include<string.h>
using namespace std;
int count_keys = 0;
int count_ident = 0;
int count_const = 0;
int count_opt = 0;
int count_del=0;
void find_count(char* t)
{
if(!strcmp(t,"int") || !strcmp(t,"main") || !strcmp(t,"for") || !strcmp(t,"endl") || !strcmp(t,"cout") ||
!strcmp(t,"cin") || !strcmp(t,"while") || !strcmp(t,"include") || !strcmp(t,"using") ||
!strcmp(t,"namespace") || !strcmp(t,"std") || !strcmp(t,"case") || !strcmp(t,"switch") ||
!strcmp(t,"if") || !strcmp(t,"else") || !strcmp(t,"break")|| !strcmp(t,"continue") || !strcmp(t,"return")
|| !strcmp(t,"double") || !strcmp(t,"sizeof") || !strcmp(t,"void"))
{
cout<<t<<" is keyword"<<endl;
count_keys++;
}
else if(!strcmp(t,"0") || !strcmp(t,"1") || !strcmp(t,"2") || !strcmp(t,"3") || !strcmp(t,"4") ||
!strcmp(t,"5") || !strcmp(t,"6") || !strcmp(t,"7") || !strcmp(t,"8") || !strcmp(t,"9") || !strcmp(t,"10"))
{
cout<<t<<" is constant"<<endl;
count_const++;
}
else
{
cout<<t<<" is identifier"<<endl;
count_ident++;
}
}
isOperator(str1);
cout<<"count of delimiters is : "<<count_del<<endl;
cout<<"count of operators is : "<<count_opt<<endl;
cout<<"count of keywords is : "<<count_keys<<endl;
cout<<"count of identifiers is : "<<count_ident<<endl;
cout<<"count of constants is : "<<count_const<<endl;
}
int main()
{
char str[]= "int main() {int a = b+c; for(int j=0;j<10;j++){ printf(\"My name is Tanvi\"<<endl;}";
char str1[]="int main() {int a = b+c; for(int j=0;j<10;j++){ printf(\"My name is Tanvi\"<<endl;}";
parse(str,str1);
return 0;
}
Fig. 2.1
Practical 3
Aim - Write a program to convert input Regular expression to Deterministic Finite
automata.
#include <bits/stdc++.h>
#include <stdio.h>
using namespace std;
int ret[100];
static int pos=0;
static int sc=0;
ret[pos++]=sc;
}
ret[pos++]=sc-1;
ret[pos++]=238;
ret[pos++]=sc;
}
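int main()
{ int i;
char inp[100];
printf("Tanvi");
printf("\nUE168110\n");
printf("enter the regular expression :");
/* gets() has been removed from the language; fgets() bounds the read. */
if(fgets(inp,sizeof(inp),stdin)==NULL)
return 1;
inp[strcspn(inp,"\n")]='\0'; /* drop the trailing newline */
puts(inp);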
nfa(1,0,inp);
printf("\nstate input state\n");
for(i=0;i<pos;i=i+3)
printf("%d --%c--> %d\n",ret[i],ret[i+1],ret[i+2]);
printf("\n");
return 0;
}
Practical 4
Aim: Write a Program which converts Non-Deterministic Finite automata to Deterministic finite
Automata.
#include <stdio.h>
#include <string.h>
#define STATES 50
struct Dstate
{
char name;
char StateString[STATES+1];
char trans[10];
int is_final;
}Dstates[50];
struct tran
{
char sym;
int tostates[50];
int notran;
};
struct state
{
int no;
struct tran tranlist[50];
};
int stackA[100],stackB[100],C[100],Cptr=-1,Aptr=-1,Bptr=-1;
struct state States[STATES];
char temp[STATES+1],inp[10];
int nos,noi,nof,j,k,nods=-1;
void pushA(int z)
{
stackA[++Aptr]=z;
}
void pushB(int z)
{
stackB[++Bptr]=z;
}
int popA()
{
return stackA[Aptr--];
}
void copy(int i)
{
char temp[STATES+1]=" ";
int k=0;
Bptr=-1;
strcpy(temp,Dstates[i].StateString);
while(temp[k]!='\0')
{
pushB(temp[k]-'0');
k++;
}
}
int popB()
{
return stackB[Bptr--];
}
int peekB()
{
return stackB[Bptr]; /* top of stack B (was reading stackA) */
}
int peekA()
{
return stackA[Aptr];
}
int seek(int arr[],int ptr,int s)
{
int i;
for(i=0;i<=ptr;i++)
{
if(s==arr[i])
return 1;
}
return 0;
}
void sort()
{
int i,j,temp;
for(i=0;i<Bptr;i++)
{
for(j=0;j<(Bptr-i);j++)
{
if(stackB[j]>stackB[j+1])
{
temp=stackB[j];
stackB[j]=stackB[j+1];
stackB[j+1]=temp;
}
}
}
}
void tostring()
{
int i=0;
sort();
for(i=0;i<=Bptr;i++)
{
temp[i]=stackB[i]+'0';
}
temp[i]='\0';
}
void display_DTran()
{
int i,j;
printf("\n\t\t DFA Transition Table ");
/* Loop header reconstructed; the original lines were lost at a page break. */
for(i=0;i<nods;i++)
{
if(Dstates[i].is_final==0)
printf("\n%c",Dstates[i].name);
else
printf("\n*%c",Dstates[i].name);
printf("\t%s",Dstates[i].StateString);
for(j=0;j<noi;j++)
{
printf("\t%c",Dstates[i].trans[j]);
}
}
printf("\n");
}
void move(int st,int j)
{
int ctr=0;
while(ctr<States[st].tranlist[j].notran)
{
pushA(States[st].tranlist[j].tostates[ctr++]);
}
}
void lambda_closure(int st)
{
int ctr=0,in_state=st,curst=st,chk;
while(Aptr!=-1)
{
curst=popA();
ctr=0;
in_state=curst;
while(ctr<=States[curst].tranlist[noi].notran)
{
chk=seek(stackB,Bptr,in_state);
if(chk==0)
pushB(in_state);
in_state=States[curst].tranlist[noi].tostates[ctr++];
chk=seek(stackA,Aptr,in_state);
{
k--;ans='n';
break;
}
}
States[i].tranlist[j].notran=k;
}
}
//Conversion
i=0;nods=0;fin=0;
pushA(start);
lambda_closure(peekA());
tostring();
Dstates[nods].name='A';
nods++;
strcpy(Dstates[0].StateString,temp);
while(i<nods)
{
for(j=0;j<noi;j++)
{
fin=0;
copy(i);
while(Bptr!=-1)
{
move(popB(),j);
}
while(Aptr!=-1)
lambda_closure(peekA());
tostring();
for(k=0;k<nods;k++)
{
if((strcmp(temp,Dstates[k].StateString)==0))
{
Dstates[i].trans[j]=Dstates[k].name;
break;
}
}
if(k==nods)
{
nods++;
for(k=0;k<nof;k++)
{
fin=seek(stackB,Bptr,final[k]);
if(fin==1)
{
Dstates[nods-1].is_final=1;
break;
}
}
strcpy(Dstates[nods-1].StateString,temp);
Dstates[nods-1].name='A'+nods-1;
Dstates[i].trans[j]=Dstates[nods-1].name;
}
}
i++;
}
display_DTran();
}
Fig 4.1
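The NFA to DFA conversion of this practical is the subset construction. Because parts of the listing above did not survive, a compact sketch of the same technique is given here. It is not the listing above rewritten: NFA, DFA, EPS, closure, move_on and subset_construction are hypothetical names, and std::set and std::map are used in place of the explicit stacks.

```cpp
#include <map>
#include <set>
#include <utility>
#include <vector>

using StateSet = std::set<int>;

const char EPS = '\0';  // assumed marker for a lambda (epsilon) move

struct NFA {
    std::map<std::pair<int, char>, StateSet> delta;  // (state, symbol) -> next states
    int start;
    StateSet finals;
};

struct DFA {
    std::vector<StateSet> states;                // states[i] = NFA subset behind DFA state i
    std::map<std::pair<int, char>, int> delta;   // (DFA state, symbol) -> DFA state
    std::set<int> finals;
};

// Lambda closure: every state reachable from s through EPS moves only.
StateSet closure(const NFA& nfa, StateSet s) {
    std::vector<int> work(s.begin(), s.end());
    while (!work.empty()) {
        int q = work.back();
        work.pop_back();
        auto it = nfa.delta.find({q, EPS});
        if (it == nfa.delta.end()) continue;
        for (int r : it->second)
            if (s.insert(r).second) work.push_back(r);
    }
    return s;
}

// All states reachable from some state of s on input symbol a.
StateSet move_on(const NFA& nfa, const StateSet& s, char a) {
    StateSet out;
    for (int q : s) {
        auto it = nfa.delta.find({q, a});
        if (it != nfa.delta.end()) out.insert(it->second.begin(), it->second.end());
    }
    return out;
}

DFA subset_construction(const NFA& nfa, const std::vector<char>& alphabet) {
    DFA dfa;
    std::map<StateSet, int> index;  // subset -> DFA state number
    dfa.states.push_back(closure(nfa, {nfa.start}));
    index[dfa.states[0]] = 0;
    // dfa.states grows while new subsets are discovered.
    for (std::size_t i = 0; i < dfa.states.size(); ++i) {
        for (char a : alphabet) {
            StateSet target = closure(nfa, move_on(nfa, dfa.states[i], a));
            if (target.empty()) continue;  // no transition: left undefined
            auto found = index.find(target);
            if (found == index.end()) {
                found = index.emplace(target, static_cast<int>(dfa.states.size())).first;
                dfa.states.push_back(target);
            }
            dfa.delta[{static_cast<int>(i), a}] = found->second;
        }
    }
    // A DFA state is final if its subset contains any final NFA state.
    for (std::size_t i = 0; i < dfa.states.size(); ++i)
        for (int q : dfa.states[i])
            if (nfa.finals.count(q)) { dfa.finals.insert(static_cast<int>(i)); break; }
    return dfa;
}
```

DFA states are numbered in order of discovery, so state 0 is always the lambda closure of the NFA start state.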
Practical 5
#include <stdio.h>
#include <string.h>
#define STATES 99
#define SYMBOLS 20
/*
Print state-transition table.
State names: 'A', 'B', 'C', ...
*/
void print_dfa_table(
int tab[][SYMBOLS], /* DFA table */
int nstates, /* number of states */
int nsymbols, /* number of input symbols */
char *finals)
{
int i, j;
printf("\n-----+--");
for (i = 0; i < nsymbols; i++) printf("-----");
printf("\n");
printf("\n");
}
printf("Final states = %s\n", finals);
}
/*
Initialize DFA table.
*/
void load_DFA_table()
{
DFA_finals = "EF";
N_DFA_states = 6;
N_symbols = 2;
}
/*
Get next-state string for current-state string.
*/
void get_next_state(char *nextstates, char *cur_states,
int dfa[STATES][SYMBOLS], int symbol)
{
int i, ch;
/*
Get index of the equivalence states for state 'ch'.
Equiv. class id's are '0', '1', '2', ...
*/
char equiv_class_ndx(char ch, char stnt[][STATES+1], int n)
{
int i;
/*
Check if all the next states belongs to same equivalence class.
Return value:
If next state is NOT unique, return 0.
If next state is unique, return next state --> 'A/B/C/...'
's' is a '0/1' string: state-id's
*/
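char is_one_nextstate(char *s)
{
char equiv_class; /* first equiv. class */
/* The next two lines and the final return are reconstructed;
the original lines were lost at a page break. */
while (*s == '@') s++; /* skip leading '@' entries */
equiv_class = *s++; /* first equiv. class seen */
while (*s) {
if (*s != '@' && *s != equiv_class) return 0;
s++;
}
return equiv_class; /* unique next state */
}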
if (i=is_one_nextstate(state_flags))
return i-'0'; /* deterministic next states */
else {
strcpy(stnt[*pn], state_flags); /* state-division info */
return (*pn)++;
}
/*
Divide DFA states into finals and non-finals.
*/
int init_equiv_class(char statename[][STATES+1], int n, char *finals)
{
int i, j;
return 2;
}
/*
Get optimized DFA 'newdfa' for equiv. class 'stnt'.
*/
int get_optimized_DFA(char stnt[][STATES+1], int n,
int dfa[][SYMBOLS], int n_sym, int newdfa[][SYMBOLS])
{
int n2=n; /* 'n' + <num. of state-division info> */
int i, j;
char nextstate[STATES+1];
return n2;
}
/*
char 'ch' is appended at the end of 's'.
*/
void chr_append(char *s, char ch)
{
int n=strlen(s);
*(s+n) = ch;
*(s+n+1) = '\0';
}
/*
Divide first equivalent class into subclasses.
stnt[i1] : equiv. class to be segmented
stnt[i2] : equiv. vector for next state of stnt[i1]
Algorithm:
- stnt[i1] is split into 2 or more classes 's1/s2/...'
- old equiv. classes are NOT changed, except stnt[i1]
- stnt[i1]=s1, stnt[n]=s2, stnt[n+1]=s3, ...
Return value: number of NEW equiv. classes in 'stnt'.
*/
int split_equiv_class(char stnt[][STATES+1],
int i1, /* index of 'i1'-th equiv. class */
int i2, /* index of equiv. vector for 'i1'-th class */
int n, /* number of entries in 'stnt' */
int n_dfa) /* number of source DFA entries */
{
char *old=stnt[i1], *vec=stnt[i2];
int i, n2, flag=0;
char newstates[STATES][STATES+1]; /* max. 'n' subclasses */
/*
Equiv. classes are segmented and get NEW equiv. classes.
*/
int set_new_equiv_class(char stnt[][STATES+1], int n,
int newdfa[][SYMBOLS], int n_sym, int n_dfa)
{
int i, j, k;
return n;
}
/*
State-minimization of DFA: 'dfa' --> 'newdfa'
Return value: number of DFA states.
*/
int optimize_DFA(
int dfa[][SYMBOLS], /* DFA state-transition table */
int n_dfa, /* number of DFA states */
int n_sym, /* number of input symbols */
char *finals, /* final states of DFA */
char stnt[][STATES+1], /* state name table */
int newdfa[][SYMBOLS]) /* reduced DFA table */
{
char nextstate[STATES+1];
int n; /* number of new DFA states */
int n2; /* 'n' + <num. of state-dividing info> */
while (1) {
print_equiv_classes(stnt, n);
n2 = get_optimized_DFA(stnt, n, dfa, n_sym, newdfa);
if (n != n2)
n = set_new_equiv_class(stnt, n, newdfa, n_sym, n_dfa);
else break; /* equiv. class segmentation ended!!! */
}
/*
Check if 't' is a subset of 's'.
*/
int is_subset(char *s, char *t)
{
int i;
/*
New finals states of reduced DFA.
*/
void get_NEW_finals(
char *newfinals, /* new DFA finals */
char *oldfinals, /* source DFA finals */
char stnt[][STATES+1], /* state name table */
int n) /* number of states in 'stnt' */
{
int i;
int main()
{
load_DFA_table();
print_dfa_table(DFAtab, N_DFA_states, N_symbols, DFA_finals);
Fig. 5.1
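The state minimization in this practical works by splitting equivalence classes until no class can be split further. A compact sketch of that idea is given below. It is not a reconstruction of the listing above: minimize_classes is a hypothetical name, states are numbered 0, 1, 2, ... instead of 'A', 'B', 'C', ..., and the DFA is assumed to be complete (every state has a next state on every symbol).

```cpp
#include <map>
#include <vector>

// Returns cls, where cls[s] is the equivalence-class number of DFA state s.
// delta[s][a] is the next state of s on symbol number a.
std::vector<int> minimize_classes(const std::vector<std::vector<int>>& delta,
                                  const std::vector<bool>& is_final) {
    const std::size_t n = delta.size();
    std::vector<int> cls(n);
    // Initial division into finals and non-finals.
    for (std::size_t s = 0; s < n; ++s) cls[s] = is_final[s] ? 1 : 0;

    std::size_t num_classes = 0;  // 0 forces at least one refinement round
    while (true) {
        // Signature of a state: its own class followed by the classes of its
        // next states. States with equal signatures stay together.
        std::map<std::vector<int>, int> ids;
        std::vector<int> next(n);
        for (std::size_t s = 0; s < n; ++s) {
            std::vector<int> sig{cls[s]};
            for (int t : delta[s]) sig.push_back(cls[t]);
            auto it = ids.emplace(sig, static_cast<int>(ids.size())).first;
            next[s] = it->second;
        }
        cls = next;
        // Classes are only ever split, so an unchanged count means no class
        // was split in this round and the division is final.
        if (ids.size() == num_classes) break;
        num_classes = ids.size();
    }
    return cls;
}
```

States that end up with the same class number can be merged into a single state of the reduced DFA.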
Practical 6
int count, n = 0;
int kay;
char done[count];
int ptr = -1;
if (xxx == 1)
continue;
// Function call
findfirst(c, 0, 0);
ptr += 1;
calc_first[point1][point2++] = c;
if (first[i] == calc_first[point1][lark])
{
chk = 1;
break;
}
}
if(chk == 0)
{
printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i];
}
}
printf("}\n");
jm = n;
point1++;
}
printf("\n");
printf("-----------------------------------------------\n\n");
char donee[count];
ptr = -1;
// Checking if Follow of ck
// has already been calculated
for(kay = 0; kay <= ptr; kay++)
if(ck == donee[kay])
xxx = 1;
if (xxx == 1)
continue;
land += 1;
// Function call
follow(ck);
ptr += 1;
void follow(char c)
{
int i, j;
f[m++] = '$';
}
for(i = 0; i < 10; i++)
{
for(j = 2;j < 10; j++)
{
if(production[i][j] == c)
{
if(production[i][j+1] != '\0')
{
// Calculate the first of the next
// Non-Terminal in the production
followfirst(production[i][j+1], i, (j+2));
}
f[m++] = calc_first[i][j];
}
else
{
if(production[c1][c2] == '\0')
{
// Case where we reach the
// end of a production
follow(production[c1][0]);
}
else
{
// Recursion to the next symbol
// in case we encounter a "#"
followfirst(production[c1][c2], c1, c2+1);
}
}
j++;
}
}
}
Fig. 6.1
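The surviving fragments above call findfirst and follow recursively. As an alternative that is easier to read in isolation, FIRST sets can be computed by repeating one pass over the productions until nothing changes. The sketch below uses that alternative, so it is not the method of the listing; first_sets is a hypothetical name. It assumes the same conventions the fragments suggest: productions are written "A=xyz", uppercase letters are non-terminals, and '#' stands for the empty string.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Compute FIRST for every non-terminal that appears in the productions.
std::map<char, std::set<char>> first_sets(const std::vector<std::string>& prods) {
    std::map<char, std::set<char>> first;
    bool changed = true;
    while (changed) {  // repeat until a full pass adds nothing
        changed = false;
        for (const std::string& p : prods) {
            std::set<char>& head = first[p[0]];
            const std::size_t before = head.size();
            bool all_nullable = true;  // true while every symbol so far can vanish
            for (std::size_t i = 2; i < p.size() && all_nullable; ++i) {
                const char x = p[i];
                if (x == '#') continue;  // empty string: nothing to add here
                if (!std::isupper(static_cast<unsigned char>(x))) {
                    head.insert(x);      // a terminal starts the rest of the body
                    all_nullable = false;
                } else {
                    const std::set<char> fx = first[x];  // copy, x may equal the head
                    for (char c : fx)
                        if (c != '#') head.insert(c);
                    if (fx.count('#') == 0) all_nullable = false;
                }
            }
            if (all_nullable) head.insert('#');  // the whole body can derive empty
            if (head.size() != before) changed = true;
        }
    }
    return first;
}
```

With the productions E=TR, R=+TR, R=#, T=FY, Y=*FY, Y=#, F=(E) and F=i, the loop needs a few passes because FIRST(E) depends on FIRST(T), which in turn depends on FIRST(F).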
Practical 7
Parsers
1. Top Down Parsers (TDP)
1.1 TDP with Backtracking - Brute Force
1.2 TDP without backtracking - Non Recursive descent LL(1)
2. Bottom-up Parser (BUP) or shift reduce
2.1 Operator Precedence
2.2 LR parsers: LR(0), SLR(1), LALR(1), CLR(1)
#include<iostream>
#include<stdio.h>
#include<string.h>
using namespace std; /* needed for the unqualified cout and endl below */
char input[100];
char prod[100][100];
int pos=-1,len,st=-1;
char id,num;
void E();
void E_dash();
void T();
void T_dash();
void F();
void advance();
void advance()
{
pos++;
cout<<"Input is updated to :- "<<input[pos]<<endl;
if((input[pos]>='a' && input[pos]<='z') || (input[pos]>='A' && input[pos]<='Z'))
{
id = input[pos];
}
}
void E()
{
strcpy(prod[++st],"E->TE'");
T();
E_dash();
}
void E_dash()
{
cout<<"In E'()--------------"<<endl;
int p=1;
if(input[pos]=='+')
{
p=0;
strcpy(prod[++st],"E'->+TE'");
advance();
cout<<"Calling T()....."<<endl;
T();
E_dash();
}
if(p==1)
{
strcpy(prod[++st],"E'->null");
}
}
void T()
{
cout<<"In T()--------------"<<endl;
strcpy(prod[++st],"T->FT'");
cout<<"Calling F()....."<<endl;
F();
T_dash();
}
void T_dash()
{
int p=1;
if(input[pos]=='*')
{
p=0;
strcpy(prod[++st],"T'->*FT'");
advance();
F();
T_dash();
}
if(p==1)
{
strcpy(prod[++st],"T'->null");
}
}
void F()
{
cout<<"In F()--------------"<<endl;
if(input[pos]==id){
cout<<"INPUT MATCHED : - id"<<endl;
strcpy(prod[++st],"F->id");
advance();
}
else if(input[pos]=='(') /* only one alternative of F may apply */
{
cout<<"INPUT MATCHED : - '(' "<<endl;
strcpy(prod[++st],"F->(E)");
advance();
cout<<"Calling E()....."<<endl;
E();
if(input[pos]==')')
{
advance();
}
}
}
int main()
{
cout<<"Enter Input"<<endl;
cin>>input;
len=strlen(input);
input[len]='$';
advance();
cout<<"Calling E()....."<<endl;
E();
if(pos==len)
{
cout<<"String accepted"<<endl;
for(int i=0;i<=st;i++) /* st is the index of the last stored production */
{
cout<<prod[i]<<endl;
}
}
else
{
cout<<"String not accepted"<<endl;
}
return 0;
}
Fig 7.1
Fig 7.2
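The grammar parsed above is E -> TE', E' -> +TE' | null, T -> FT', T' -> *FT' | null, F -> id | (E). A recursive descent recognizer for it, without the tracing output and without global variables, can be sketched as below. The Parser class and the accepts_expression function are hypothetical names, and id is assumed to be a single letter, as in the program above.

```cpp
#include <cctype>
#include <string>

// One method per non-terminal; each returns true if it matched its part of the input.
class Parser {
public:
    // '$' is appended as the end marker, as in the program above.
    explicit Parser(const std::string& text) : in_(text + "$") {}

    // Accept only if E matches and the whole input has been consumed.
    bool accepts() { return E() && in_[pos_] == '$' && pos_ + 1 == in_.size(); }

private:
    std::string in_;
    std::size_t pos_ = 0;

    char peek() const { return in_[pos_]; }

    bool E() { return T() && E_dash(); }          // E  -> T E'

    bool E_dash() {                               // E' -> + T E' | null
        if (peek() != '+') return true;
        ++pos_;
        return T() && E_dash();
    }

    bool T() { return F() && T_dash(); }          // T  -> F T'

    bool T_dash() {                               // T' -> * F T' | null
        if (peek() != '*') return true;
        ++pos_;
        return F() && T_dash();
    }

    bool F() {                                    // F  -> id | ( E )
        if (std::isalpha(static_cast<unsigned char>(peek()))) {
            ++pos_;
            return true;
        }
        if (peek() == '(') {
            ++pos_;
            if (!E()) return false;
            if (peek() != ')') return false;      // a missing ')' is an error
            ++pos_;
            return true;
        }
        return false;
    }
};

bool accepts_expression(const std::string& text) { return Parser(text).accepts(); }
```

Because the end marker is never consumed by any method, the position can never run past the end of the input, and a missing closing parenthesis is rejected instead of being silently skipped.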