
Introduction to Compilers

Compilers and Interpreters
Compilation
A compiler is a program that reads a
program written in one language
(the source language) and
translates it into an equivalent
program in another language (the
target language).
Oversimplified view:
Source Program -> Compiler -> Target Program
The compiler also reports error messages.
Input -> Target Program -> Output
Interpreter
Instead of producing a target
program as a translation, an
interpreter performs the operations
implied by the source program.
An interpreter might build a tree
and carry out the operations at the
nodes as it walks the tree.
At the root it would discover it had
an assignment to perform.
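
A minimal sketch of such a tree-walking interpreter, assuming a hypothetical tuple-based tree for the statement position = initial + rate * 60. The node shapes, the names evaluate and execute, and the sample values are illustrative assumptions, not part of the slides.

```python
# Tree nodes are plain tuples (an assumption made for this sketch):
#   ("num", value), ("id", name), ("+", left, right), ("*", left, right),
#   ("assign", name, expression)

def evaluate(node, env):
    """Walk an expression subtree and return its value."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "id":
        return env[node[1]]
    if kind == "+":
        return evaluate(node[1], env) + evaluate(node[2], env)
    if kind == "*":
        return evaluate(node[1], env) * evaluate(node[2], env)
    raise ValueError(f"unknown expression node: {kind}")

def execute(node, env):
    """Carry out the statement found at the root of the tree."""
    if node[0] == "assign":
        # The root is an assignment: evaluate the right side, then store it.
        env[node[1]] = evaluate(node[2], env)
        return env
    raise ValueError(f"unknown statement node: {node[0]}")

tree = ("assign", "position",
        ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", 60))))
env = execute(tree, {"initial": 10, "rate": 2})
print(env["position"])  # 10 + 2 * 60 = 130
```

No target program is produced here: the operations are performed directly while the tree is walked.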

Compilers and Interpreters
(contd)
Interpretation
Performing the operations implied by
the source program
Oversimplified view:
Source Program + Input -> Interpreter -> Output
The interpreter also reports error messages.
Compilers and Interpreters
(contd)
Compiler: a program that
translates an executable program
in one language into an executable
program in another language

Interpreter: a program that reads
an executable program and
produces the results of running
that program

The Analysis-Synthesis
Model of Compilation
There are two parts to
compilation:
Analysis
Breaks up source program into pieces
and imposes a grammatical structure
Creates intermediate representation
of source program
Determines the operations and records
them in a tree structure called a syntax tree
Known as the front end of the compiler

The Analysis-Synthesis
Model of Compilation
(contd)
Synthesis
Constructs target program from
intermediate representation
Takes the tree structure and translates the
operations into the target program
Known as the back end of the compiler

A language-processing system
Skeletal Source Program
-> Preprocessor -> Source Program
-> Compiler -> Target Assembly Program
-> Assembler -> Relocatable Object Code
-> Linker (with Libraries and Relocatable Object Files)
-> Absolute Machine Code
The context of a Compiler
A source program is divided into
modules stored in separate files. The
task of collecting the source program
is entrusted to a distinct program,
called a preprocessor. It also expands
shorthands called macros.
The compiler creates assembly code
that is translated by an assembler
into machine code and then linked
together with some library routines
into the code that actually runs on
the machine.
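
As a rough illustration of macro expansion, here is a sketch of a toy preprocessor. The "#define NAME replacement" directive syntax and the function name preprocess are assumptions made for this example; real preprocessors handle far more (parameters, includes, conditionals).

```python
import re

def preprocess(text):
    """Expand object-like macros declared as '#define NAME replacement'."""
    macros = {}
    output = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if parts and parts[0] == "#define":
            if len(parts) < 2:
                raise SyntaxError("#define needs a macro name")
            # Record the shorthand; the directive itself is not emitted.
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            continue
        # Replace whole-word occurrences of every macro defined so far.
        for name, replacement in macros.items():
            pattern = rf"\b{re.escape(name)}\b"
            line = re.sub(pattern, lambda m, r=replacement: r, line)
        output.append(line)
    return "\n".join(output)

skeletal = "#define LIMIT 60\nposition = initial + rate * LIMIT"
source = preprocess(skeletal)
print(source)  # position = initial + rate * 60
```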
Analysis
In compiling, analysis has three
phases:
Linear analysis: stream of characters
read from left-to-right and grouped into
tokens; known as lexical analysis or
scanning
Hierarchical analysis: tokens grouped
hierarchically with collective meaning;
known as parsing or syntax analysis
Semantic analysis: check if the program
components fit together meaningfully

Phases of a compiler

Lexical Analysis
The first phase of a compiler is called lexical
analysis or scanning. It reads the stream of
characters and groups them into
meaningful sequences called lexemes.
For each lexeme, the lexical analyzer
produces as output a token of the form
<token name, attribute value>
This phase thus performs linear analysis
on the source program.
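
A minimal scanner sketch that produces such pairs. The token names, the regular expressions, and the function name tokenize are assumptions for illustration, and the attribute value here is simply the lexeme text rather than a symbol-table reference.

```python
import re

# Token names and patterns are assumptions chosen for this sketch.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id",     r"[A-Za-z_]\w*"),
    ("assign", r"="),
    ("op",     r"[+\-*/]"),
    ("skip",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan left to right and return (token name, attribute value) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "skip":      # whitespace separates lexemes only
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

tokens = tokenize("position = initial + rate * 60")
for token in tokens:
    print(token)
```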

Lexical analysis (contd)
Characters grouped into tokens.

Syntax analysis (Parsing)
It performs hierarchical analysis
on the source program.
The result is represented by a syntax tree,
in which each interior node is an
operation and each leaf node is an
operand.

Syntax analysis (contd)
Grouping tokens into grammatical phrases
Character groups recorded in symbol table
Represented by a parse tree

Syntax analysis (contd)
Hierarchical structure usually
expressed by recursive rules
Rules for definition of expression:
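
As a sketch of how recursive rules turn into code, assume the following hypothetical grammar (an assumption for illustration, not necessarily the rules shown on this slide): an expression is a term or a sum of terms, a term is a factor or a product of factors, and a factor is a number, an identifier, or a parenthesized expression. Each rule becomes one function, and the rules' recursion becomes function recursion. The name parse and the nested-tuple tree shape are also assumptions.

```python
import re

def parse(source):
    """Parse an arithmetic expression into a nested-tuple syntax tree."""
    # Simplified scanning: characters outside these patterns are ignored.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():                      # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                      # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                    # factor -> number | id | '(' expr ')'
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            advance()
            node = expr()            # recursion mirrors the recursive rule
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return advance()             # leaf: a number or identifier lexeme

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse("initial + rate * 60")
print(tree)  # ('+', 'initial', ('*', 'rate', '60'))
```

Because term is called from expr, multiplication ends up deeper in the tree than addition, which is how the grammar encodes precedence.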

Semantic analysis
Checks source program for
semantic errors
Gathers type information for
subsequent code generation
(type checking)
Identifies operator and operands
of expressions and statements
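
A small type-checking sketch over an expression tree. The type names ("int", "float", "string"), the widening rule, and the function name check are assumptions for illustration; actual rules depend on the source language.

```python
def check(node, symbols):
    """Return the type of an expression tree, or raise TypeError."""
    kind = node[0]
    if kind == "num":
        return "float" if isinstance(node[1], float) else "int"
    if kind == "id":
        if node[1] not in symbols:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return symbols[node[1]]
    if kind in ("+", "*"):
        # Identify the operator's operands and check each one first.
        left = check(node[1], symbols)
        right = check(node[2], symbols)
        if "string" in (left, right):
            raise TypeError(f"operator {kind!r} applied to a string operand")
        # Assumed rule: mixed int/float arithmetic is widened to float.
        return "float" if "float" in (left, right) else "int"
    raise TypeError(f"unknown node kind {kind!r}")

symbols = {"initial": "float", "rate": "float", "name": "string"}
tree = ("+", ("id", "initial"), ("*", ("id", "rate"), ("num", 60)))
result_type = check(tree, symbols)
print(result_type)  # float
```

The returned type is the kind of information a later code generation phase could use, for example to decide where an int-to-float conversion is needed.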

Intermediate code
generation
Program representation for an
abstract machine
Should have two properties
Easy to produce
Easy to translate into target program
Three-address code is a
commonly used form similar to
assembly language
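
A sketch of three-address code generation from a syntax tree, assuming a hypothetical nested-tuple tree whose leaves are strings. The temporary names t1, t2, ... and the function name generate are illustrative choices.

```python
def generate(tree):
    """Translate a nested-tuple expression tree into three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        if isinstance(node, str):          # leaf: identifier or constant
            return node
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        # Each instruction has at most one operator and three addresses.
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    return code, result

code, result = generate(("+", "initial", ("*", "rate", "60")))
for line in code:
    print(line)
# t1 = rate * 60
# t2 = initial + t1
```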

Code optimization and
generation
Code Optimization
Improve intermediate code by
producing code that runs faster
Code Generation
Generate target code, which is
machine code or assembly code
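
A sketch of one simple optimization (constant folding) followed by code generation. The instruction format (dest, left, op, right), the one-register target machine with LOAD/ADD/MUL/STORE mnemonics, and all function names are assumptions made for this example. The folding step also assumes each temporary is assigned once and that a folded temporary is only used by later instructions.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}
MNEMONICS = {"+": "ADD", "*": "MUL"}

def fold_constants(code):
    """Evaluate instructions whose operands are both constants at compile time."""
    known = {}       # temporaries whose value is a known constant
    optimized = []
    for dest, left, op, right in code:
        left = known.get(left, left)
        right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[dest] = OPS[op](left, right)   # no instruction emitted
        else:
            optimized.append((dest, left, op, right))
    return optimized

def emit(code):
    """Emit assembly for a hypothetical one-register machine."""
    lines = []
    for dest, left, op, right in code:
        lines.append(f"LOAD {left}")
        lines.append(f"{MNEMONICS[op]} {right}")
        lines.append(f"STORE {dest}")
    return lines

code = [
    ("t1", 60, "*", 60),
    ("t2", "rate", "*", "t1"),
    ("t3", "initial", "+", "t2"),
]
optimized = fold_constants(code)
assembly = emit(optimized)
print("\n".join(assembly))
```

Folding removes the multiplication of the two constants, so the generated program executes two arithmetic instructions instead of three.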

Symbol-Table
Management
Symbol table data structure
with a record for each identifier
and its attributes
Attributes include storage
allocation, type, scope, etc
All the compiler phases insert entries
into and modify the symbol table
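
A minimal symbol-table sketch with nested scopes. The class name SymbolTable, its methods, and the attribute names (type, offset, register) are assumptions for illustration; real compilers differ in what they record and how.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its attribute record."""

    def __init__(self):
        self.scopes = [{}]          # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def insert(self, name, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} already declared in this scope")
        scope[name] = dict(attributes, scope_level=len(self.scopes) - 1)

    def lookup(self, name):
        """Search from the innermost scope outward; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.insert("rate", type="float", offset=0)
table.enter_scope()
table.insert("rate", type="int", offset=8)   # shadows the outer declaration
inner = table.lookup("rate")
table.exit_scope()
outer = table.lookup("rate")
# A later phase can extend the record, e.g. a code generator noting a register.
outer["register"] = "R1"
print(inner["type"], outer["type"])  # int float
```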

The Structure of a Compiler
Figure: representations passed from phase to phase
Tokens
Parse tree
Abstract Syntax Tree w/ Attributes
Non-optimized Intermediate Code
Optimized Intermediate Code
Target machine code

The Grouping of Phases
Compiler front and back ends:
Front end:
Analysis steps + Intermediate code generation
Depends primarily on the source language
Machine independent
Back end:
Code optimization and generation
Independent of source language
Machine dependent

The Grouping of Phases
(contd)
Compiler passes:
A collection of phases is done only once
(single pass) or multiple times (multi pass)
Single pass: reading input, processing, and
producing output by one large compiler program;
usually runs faster
Multi pass: compiler split into smaller programs,
each making a pass over the source; performs
better code optimization
