
Language Processors

Assignment 1

Submitted by: Amrith Krishna, Roll No: 5, S5 CSE-A

Q. What is the purpose of a Language Processor?


Language processors are programs that read a program in one language and
convert it to an equivalent program in another language.

[Figure: Source Program -> Language Processor -> Target code]

Language processors have three main purposes:

• Bridge the gap between the application domain and the execution domain
• Translate from one language to another
• Detect errors in the source during translation

[Figure: the three purposes of a language processor: bridging the semantic gap, translation of language, error detection and reporting]

1. Bridge Gap between Application domain and execution domain

Computers understand only machine language, and it is not practical for humans
to work directly in machine code. A software designer therefore describes the ideas
concerning software in terms related to the application domain, usually in a
high-level language.

This description must be translated into machine-level language, i.e. into terms
related to the execution domain. The rules (semantics) of the two domains differ,
and this difference is called the semantic gap.

A Language processor is used for:

1. Specification, design and coding steps.


2. PL Implementation steps.

The gap between the two domains is bridged by introducing a new domain, the PL
(programming language) domain. With this domain in place, the software designer
needs to bridge only the specification gap, i.e. the gap between the application
domain and the PL domain, while the translator takes care of the execution gap,
i.e. the gap between the PL domain and the execution domain. [1]

2. Translation

Structure of a language processor


A Language Processor is divided into two phases:

Analysis phase:

The analysis part breaks up the source code and imposes a grammatical
structure on it. This structure is then used to create an intermediate
representation (IR).

It also collects information such as labels from the source code and stores it in a
symbol table.

Synthesis phase:

This phase constructs object code from the IR and the symbol table. Its objectives include:

o Obtain the machine code from the mnemonics table.

o Look up the address of an operand in the symbol table.

o Synthesize a machine instruction.
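
The three synthesis steps above can be sketched in Python as follows. This is only a minimal illustration, not a real assembler: the mnemonics, opcode values, symbol addresses and the (opcode, address) instruction format are all invented for the example.

```python
# Hypothetical mnemonics table: mnemonic -> machine opcode (values are invented).
MNEMONICS = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03}

# Hypothetical symbol table, as the analysis phase might have filled it: label -> address.
SYMTAB = {"A": 100, "B": 101, "RESULT": 102}


def synthesize(mnemonic, operand):
    """Build one machine instruction as an (opcode, address) pair."""
    opcode = MNEMONICS[mnemonic]   # step 1: obtain machine code from the mnemonics table
    address = SYMTAB[operand]      # step 2: look up the operand's address in the symbol table
    return (opcode, address)       # step 3: synthesize the machine instruction
```

For example, `synthesize("ADD", "B")` returns `(2, 101)` with the tables assumed above.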

Phases of an LP [2]

Lexical Analyser:

The lexical analyser reads the stream of characters from the source and groups
them into meaningful sequences called lexemes. For each lexeme it produces a
token of the form

<token-name, attrib-value>
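
A small lexical analyser along these lines might look like the sketch below. The token classes (`number`, `id`, `op`) and the tiny expression language are assumptions made for the example. As a simplification, the attribute value here is the lexeme text itself, whereas a real compiler would usually store a pointer to a symbol table entry for identifiers.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[+\-*/=()]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token-name, attribute-value) pairs for the source text."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":      # whitespace separates lexemes but is not a token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

With these assumptions, `tokenize("x = y + 42")` gives `[("id", "x"), ("op", "="), ("id", "y"), ("op", "+"), ("number", "42")]`.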

Syntax Analysis

The syntax analyser creates a tree-like intermediate representation which depicts
the grammatical structure of the token stream produced by the lexical analyser.
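
As an illustration, the sketch below builds such a tree with a small recursive-descent parser. The grammar (only `+` and `*` over identifiers and numbers), the (name, value) token format and the nested-tuple tree are assumptions chosen for the example, not part of any particular compiler.

```python
def parse_expression(tokens):
    """Parse tokens for  expr -> term ('+' term)*  and return a nested-tuple tree."""
    tree, rest = _expr(tokens)
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]}")
    return tree


def _expr(tokens):
    left, tokens = _term(tokens)
    while tokens and tokens[0] == ("op", "+"):
        right, tokens = _term(tokens[1:])
        left = ("+", left, right)          # interior node: operator with two children
    return left, tokens


def _term(tokens):
    left, tokens = _factor(tokens)
    while tokens and tokens[0] == ("op", "*"):
        right, tokens = _factor(tokens[1:])
        left = ("*", left, right)
    return left, tokens


def _factor(tokens):
    if not tokens:
        raise SyntaxError("unexpected end of input")
    kind, value = tokens[0]
    if kind in ("id", "number"):
        return (kind, value), tokens[1:]   # leaf node
    raise SyntaxError(f"unexpected token {tokens[0]}")
```

For the token stream of `a + b * 2`, the result is `("+", ("id", "a"), ("*", ("id", "b"), ("number", "2")))`, so the tree records that `*` binds tighter than `+`.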

Semantic Analysis

The semantic analyser performs type checking and checks the source program for
semantic consistency with the language definition.
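
A very small type checker could be sketched as below. The rule that both operands of an operator must have the same type, the `int`/`float` type names, the `SemanticError` class and the nested-tuple tree format are all assumptions of this example. Real languages usually have richer rules, such as implicit conversions.

```python
class SemanticError(Exception):
    """Raised when the expression violates the (assumed) language rules."""


def check_types(node, declared):
    """Return the type of an expression tree; `declared` maps identifiers to types."""
    kind = node[0]
    if kind == "number":
        return "float" if "." in node[1] else "int"
    if kind == "id":
        if node[1] not in declared:
            raise SemanticError(f"undeclared identifier {node[1]!r}")
        return declared[node[1]]
    # Otherwise the node is (operator, left, right).
    left = check_types(node[1], declared)
    right = check_types(node[2], declared)
    if left != right:
        raise SemanticError(f"operands of {kind!r} have different types: {left} and {right}")
    return left
```

Here `check_types(("+", ("id", "a"), ("number", "2")), {"a": "int"})` returns `"int"`, while the same tree with `{"a": "float"}` raises `SemanticError`.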

Intermediate code:

Many compilers generate a machine-code-like intermediate code so that it can be
passed between the different passes of the translation process.
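
One common machine-code-like form is three-address code, where each instruction has at most one operator. The sketch below flattens an expression tree into such instructions. The nested-tuple tree format and the temporary names `t1`, `t2`, ... are assumptions of the example.

```python
def to_three_address(tree):
    """Flatten an expression tree into (result, left, operator, right) instructions."""
    code = []

    def walk(node):
        if node[0] in ("id", "number"):
            return node[1]                  # leaves are used directly as operands
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = f"t{len(code) + 1}"          # fresh temporary name for this result
        code.append((temp, left_name, op, right_name))
        return temp

    walk(tree)
    return code
```

For the tree of `a + b * 2` this produces `[("t1", "b", "*", "2"), ("t2", "a", "+", "t1")]`, i.e. `t1 = b * 2` followed by `t2 = a + t1`.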

Code Optimization and Code Generation:

The intermediate code is taken as input and converted to target code. Before this
conversion, the intermediate code is optimized by machine-independent routines to
make it more efficient.

If the target is executable code, registers and memory locations are allocated
efficiently for the variables.

Ultimately, this stage aims at faster execution of the generated code.
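
Constant folding is one simple example of a machine-independent optimization: sub-expressions whose operands are all constants are evaluated at translation time. The sketch below assumes a nested-tuple expression tree with integer literals and only the `+` and `*` operators.

```python
import operator

# Operators supported by this sketch.
OPS = {"+": operator.add, "*": operator.mul}


def fold_constants(node):
    """Replace operator nodes whose operands are both numbers with their value."""
    if node[0] in ("id", "number"):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if left[0] == "number" and right[0] == "number":
        return ("number", str(OPS[op](int(left[1]), int(right[1]))))
    return (op, left, right)
```

For the tree of `a + 3 * 4` the result is `("+", ("id", "a"), ("number", "12"))`, so the multiplication no longer has to be done at run time.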

3. Error detection and reporting


Detection of errors has always been an important task for LPs. It is impossible to
correct errors in the source by simply looking at the target code, even if it is
reported that there is an error.

One possible solution is to report errors after pass 1 rather than pass 2, but errors
such as references to undefined variables can be found only in the second pass.

These factors make error reporting a complicated procedure.

Some errors can therefore be reported only after pass 2. To overcome the earlier
problem, the language processor prints both the target code and the error
descriptions against the source statements themselves. This makes debugging
easier. [3]
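
A listing of this kind can be sketched as follows. The layout of the listing and the wording of the error message are assumptions of the example, and for brevity it shows only the source lines and error descriptions, not the target code.

```python
def make_listing(source_lines, errors):
    """Return listing lines with each error message placed under its source line.

    `errors` maps a 1-based source line number to a list of messages.
    """
    listing = []
    for number, text in enumerate(source_lines, start=1):
        listing.append(f"{number:4d}  {text}")
        for message in errors.get(number, []):
            listing.append(f"      ** error: {message}")
    return listing
```

Calling `make_listing(["LOAD A", "ADD X"], {2: ["undefined symbol X"]})` places the message directly below line 2, so the programmer sees the error next to the statement that caused it.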

Q. Differentiate between one pass and two pass assemblers.

Introduction
Assemblers are language processors that convert assembly language programs into
low-level machine language.

Functions of assemblers:

• Translate assembly language to object code
• Assign machine addresses to symbolic labels

Structure of a language processor

A Language Processor is divided into two phases:

Analysis phase:

The analysis part breaks up the source code and imposes a grammatical
structure on it. This structure is then used to create an intermediate
representation (IR).

It also collects information such as labels from the source code and stores it in a
symbol table.

Synthesis phase:

This phase constructs object code from the IR and the symbol table. Its objectives include:

o Obtain the machine code from the mnemonics table.
o Look up the address of an operand in the symbol table.
o Synthesize a machine instruction.

Design of Assembler:
An assembler can be designed using either:

o Multi-pass translation
   • Two-pass translation
o Single-pass translation

Two-pass translation:

Pass 1:

• Separate the symbol, mnemonic op-code, and operand fields
• Build the symbol table
• Perform location counter (LC) processing
• Construct the intermediate representation (IR)

Pass 2:

• Synthesize the target program
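
The two passes can be sketched as below for a toy assembly language. Everything about that language is an assumption made for the example: labels are written as a leading `NAME:` field, every statement has exactly one mnemonic and one operand, every instruction occupies one word, and the opcode values are invented.

```python
# Hypothetical mnemonics table; every instruction is assumed to occupy one word.
OPTAB = {"LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4}


def pass1(lines, start=0):
    """Separate the fields, build the symbol table and produce the IR."""
    symtab, ir = {}, []
    lc = start                               # location counter
    for line in lines:
        fields = line.split()
        if fields[0].endswith(":"):          # a leading 'NAME:' field is a label
            symtab[fields[0][:-1]] = lc
            fields = fields[1:]
        mnemonic, operand = fields
        ir.append((lc, mnemonic, operand))
        lc += 1
    return symtab, ir


def pass2(symtab, ir):
    """Synthesize (address, opcode, operand address) triples from the IR."""
    return [(lc, OPTAB[mnemonic], symtab[operand]) for lc, mnemonic, operand in ir]


def assemble(lines):
    symtab, ir = pass1(lines)
    return pass2(symtab, ir)
```

For the program `["LOAD X", "JUMP DONE", "X: ADD X", "DONE: STORE X"]`, `assemble` returns `[(0, 1, 2), (1, 4, 3), (2, 2, 2), (3, 3, 2)]`. The forward references to `X` and `DONE` cause no trouble, because the symbol table is complete before pass 2 starts.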

Single Pass translation

Performs all the operations in one go. Such assemblers are commonly used as
load-and-go assemblers, and are chosen when faster translation is needed or when
an intermediate file is inconvenient.

Difference between Single pass and two pass assemblers


Single Pass | Two Pass
Performs translation in a single pass; commonly used as a load-and-go assembler. | Performs translation in two passes, each organized around a different task (analysis in pass 1, synthesis in pass 2).
The speed of external/secondary storage never affects its working. | The process can be slow if the external storage used for the IR is slow.
Forward references can be troublesome, though a process called back-patching can be used. | Forward references are handled easily, because synthesis is done in pass 2 and every symbol table entry is complete before pass 2 begins.
Back-patching: the address of an undefined operand is omitted, the symbol is entered in the symbol table (symtab) and the instruction is noted in the TII (Table of Incomplete Instructions). When the definition is encountered, the TII is checked and the missing values are filled in. | No need for back-patching.
The default address of an undefined symbol is zero; it is later updated with the correct address. | No such scenario arises, so no such convention is needed.
No intermediate code needs to be generated. | An intermediate code is generated in pass 1 and passed to pass 2.
All the tables must be available in memory during the whole translation process, so it is more memory intensive. | Optab (the table containing mnemonics and op-codes) is needed only during the first pass, so it can be dropped in the second pass.
Generally all data areas are defined before they are used, and back-patching is left only for forward jumps. Back-patching is possible for data items too, though inconvenient. | Though forward references are widely used, forward references in symbol definitions are not allowed. E.g. jack EQU jill is not allowed if jill is defined later in the program.
The object code is generated in memory and is not written out, so no loader is needed. | The object code is written out and a loader is needed.
Puts more emphasis on processing speed and requires more memory, so it is used where memory consumption is of no serious concern. | Used where the available memory is limited.
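
The back-patching scheme described in the table can be sketched as follows for a toy assembly language. As before, the statement format (an optional leading `NAME:` label, one mnemonic, one operand), the one-word instructions and the opcode values are assumptions of the example. The TII is kept here as a simple list of (instruction index, symbol) pairs.

```python
# Hypothetical mnemonics table; every instruction is assumed to occupy one word.
OPTAB = {"LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4}


def assemble_one_pass(lines, start=0):
    """Assemble in a single pass, back-patching forward references via a TII."""
    symtab = {}
    tii = []                                 # (instruction index, symbol) pairs
    code = []                                # [address, opcode, operand address]
    lc = start                               # location counter
    for line in lines:
        fields = line.split()
        if fields[0].endswith(":"):
            symbol = fields[0][:-1]
            symtab[symbol] = lc
            # The definition has arrived: patch every instruction waiting for it.
            for index, wanted in tii:
                if wanted == symbol:
                    code[index][2] = lc
            tii = [entry for entry in tii if entry[1] != symbol]
            fields = fields[1:]
        mnemonic, operand = fields
        if operand in symtab:
            address = symtab[operand]
        else:
            address = 0                      # default address until the symbol is defined
            tii.append((len(code), operand))
        code.append([lc, OPTAB[mnemonic], address])
        lc += 1
    if tii:
        raise ValueError(f"undefined symbols: {sorted({s for _, s in tii})}")
    return [tuple(instruction) for instruction in code]
```

For the program `["LOAD X", "JUMP DONE", "X: ADD X", "DONE: STORE X"]` this returns `[(0, 1, 2), (1, 4, 3), (2, 2, 2), (3, 3, 2)]`. The first two instructions are first emitted with address 0 and are patched when `X` and `DONE` are defined. Any symbol still in the TII at the end is reported as undefined.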

Bibliography
[1], [3] D. M. Dhamdhere, Systems Programming and Operating Systems.

[2] A. V. Aho, Principles of Compiler Design.

Submitted by:

Amrith Krishna,
Roll No:5,
CSE-A,
FISAT
