R.V. College of Engineering

Chapter 1
Introduction
Department of Information Science and Engineering
An assembler is a language processor: it translates the statements of a source
program into their machine language counterparts. A typical SIC source program given to
the assembler, for example, reads records from an input device (code F1) and copies them
to an output device (code 05); at the end of the file it writes EOF on the output device
and then executes RSUB to return to the operating system.
Types of one-pass assemblers:
1. Load-and-go one-pass assembler: produces object code directly in memory for
immediate execution.
2. Store-and-go one-pass assembler: produces the usual kind of object program for later
execution.
1.1 About the project

The single pass assembler uses the following data structures.
Operation Code Table (OPTAB)
OPTAB is used to look up and validate mnemonic operation codes and to translate them
to their machine language equivalents. Each entry also holds the instruction format and
length. In SIC/XE, the assembler searches OPTAB during its single pass to find the
instruction length by which to increment LOCCTR. OPTAB is organized as a static hash table.
Symbol Table (SYMTAB)
SYMTAB stores the values assigned to labels. Each entry holds the name and value
(address) of a label, flags that indicate error conditions, and the type and length of the
item the label refers to.
Labels are entered into SYMTAB along with their assigned addresses (taken from
LOCCTR). SYMTAB is also organized as a hash table; entries are rarely deleted from it.
Location Counter (LOCCTR)
LOCCTR helps in assigning addresses. It is initialized to the address specified in the
START statement. When a label is reached, the current value of LOCCTR gives the address
to be associated with that label.
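The tables described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names `OpEntry`, `optab_lookup`, `SymEntry`, `symtab_insert` and `MAX_SYMBOLS` are hypothetical, a linear search stands in for the hash table, and the opcode values are the standard SIC encodings for a handful of instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One OPTAB row: mnemonic, opcode byte, instruction length in bytes. */
typedef struct {
    const char *mnemonic;
    unsigned char opcode;
    int length;
} OpEntry;

/* A few standard SIC instructions; every one is 3 bytes long. */
static const OpEntry OPTAB[] = {
    {"LDA", 0x00, 3},  {"STA", 0x0C, 3},  {"STL", 0x14, 3},
    {"ADD", 0x18, 3},  {"SUB", 0x1C, 3},  {"COMP", 0x28, 3},
    {"JEQ", 0x30, 3},  {"J", 0x3C, 3},    {"JSUB", 0x48, 3},
    {"RSUB", 0x4C, 3}, {"STCH", 0x54, 3},
};

/* Returns the matching OPTAB entry, or NULL if the mnemonic is invalid. */
const OpEntry *optab_lookup(const char *mnemonic)
{
    assert(mnemonic != NULL);
    for (size_t i = 0; i < sizeof OPTAB / sizeof OPTAB[0]; i++)
        if (strcmp(OPTAB[i].mnemonic, mnemonic) == 0)
            return &OPTAB[i];
    return NULL;
}

#define MAX_SYMBOLS 64   /* hypothetical capacity */

/* One SYMTAB row: label name and the address taken from LOCCTR. */
typedef struct {
    char name[10];
    int address;
} SymEntry;

static SymEntry SYMTAB[MAX_SYMBOLS];
static int symtab_count = 0;

/* Enters label with the current LOCCTR value.
   Returns 0 on success, -1 if the label already exists or the table is full. */
int symtab_insert(const char *label, int locctr)
{
    assert(label != NULL);
    for (int i = 0; i < symtab_count; i++)
        if (strcmp(SYMTAB[i].name, label) == 0)
            return -1;                       /* duplicate label */
    if (symtab_count == MAX_SYMBOLS)
        return -1;                           /* table full */
    strncpy(SYMTAB[symtab_count].name, label, sizeof SYMTAB[0].name - 1);
    SYMTAB[symtab_count].name[sizeof SYMTAB[0].name - 1] = '\0';
    SYMTAB[symtab_count].address = locctr;
    symtab_count++;
    return 0;
}
```

A caller would look up each mnemonic with `optab_lookup`, report an error when it returns NULL, and call `symtab_insert` with the current LOCCTR whenever a statement carries a label.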
1.2 Aim of the project:

The aim of the project is to develop a single pass assembler that generates an
object program in a single pass over the source program.
1.3 Purpose of the project:

The job of the single pass assembler is to translate mnemonic operation codes to
their machine language equivalents and to assign machine addresses to the symbolic labels
used by the programmer. It also resolves forward and backward references to symbols, and
it writes the object code and the assembly listing.
1.4 Motivation of the Work

It is often useful when developing knowledge in a new field to start by
considering restricted cases and then gradually expand to the general case. Finite-memory
programs are important. They are applicable in the design and analysis of some common
types of computer programs.
Most commercially available assemblers are either simple two-pass or multi-pass assemblers, but
there are still many applications in which a single pass assembler is used. Multiple passes, and
aggressive optimization of the target code, generally require a lot of time.
In dynamic web applications, where applets are downloaded and expected to update in real time,
multiple passes are not an option. Similarly, in handheld web-enabled devices both space and time
limitations have to be considered.
In such cases a single pass over the source is the only option: there is no intermediate file and
there are no multiple passes, and the assembler either loads the code directly into memory or
creates an object module. This is
1.5 Overview
An assembler is a language processor that translates the statements of a source
program into their machine language counterparts. The assembler itself is written as a
C source file (.c); the assembly language program to be translated is supplied to it as
an input file.
This document describes the characteristics and usage conventions of a one-pass
assembler that processes an assembly language program (ALP) written for the SIC
machine architecture and writes the result out as an object program, as a store-and-go
assembler does. This differs from a load-and-go assembler, which assembles code
directly into memory.
1.6 Objective
The intention of our project is to implement a simple single-pass Assembler for SIC
Machine Architecture. It is intended to perform basic functions mentioned below. They
are as follows:
1) Translating mnemonic operation codes to their machine language equivalents.
2) Assigning machine addresses to symbolic labels used by the programmer.
3) Handling forward and backward references in one pass over the source
program.
1.7 Methodology:
After careful analysis, the system was organized around the following data
structures:
1) Operation Code Table (OPTAB): Used to look up mnemonic operation codes and
translate them to their machine language.
2) Symbol Table (SYMTAB): Used to store values assigned to labels.
3) Location Counter (LOCCTR): Used to help in the assignment of address.
1.8 Organization of the report:

Chapter 1 gives a brief introduction to the single pass assembler and discusses the
motivation, objectives, methodology and organization of the report.
Chapter 2 discusses the design specification and the software and hardware
components required for simulation and implementation. It also describes the user
interface of the project.
Chapter 3 presents the system architecture and data flow diagrams, including the
components of a DFD and its different levels.
Chapter 4 describes the implementation of the single pass assembler, including the
code converter and the methodology.
Chapter 5 covers the testing process and the different test cases.
Chapter 6 contains the results of the test cases and the inference drawn from them.
Chapter 7 concludes the project and outlines future work that can be carried out.
Chapter 2
Basic Requirements Specifications
2.1 Software requirements

A Software Requirements Specification (SRS) is a complete description of the
behavior of the software to be developed. It includes a set of use cases that
describe all the interactions the users will have with the software. Use cases are also
known as functional requirements. In addition to use cases, the SRS contains
non-functional (supplementary) requirements.
Non-functional requirements impose constraints on the design or implementation (such
as performance engineering requirements, quality standards, or design constraints).
Purpose
The purpose of this software requirements specification (SRS) is to establish the
ten major requirements necessary to develop this system software.
Platform (technology/tools)
In computing, C is a general-purpose computer programming language originally
developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories to implement the
Unix operating system.
Although C was designed for writing architecturally independent system software, it is
also widely used for developing application software.
Worldwide, C is among the most popular languages in terms of the number of developer
positions and the amount of publicly available code. It is widely used on many different
software platforms, and there are few computer architectures for which a C compiler does
not exist. C has greatly influenced many other popular programming languages, most notably
C++, which began as an extension to C, as well as Java and C#.
2.2 User interface

The single pass assembler is implemented as a C program that takes the user's
assembly source program as input.
The input program, the object code and the symbol table are each stored in a
separate file. If errors are present, each error and its line number are written to a
further file.
Only the mnemonics listed in the assembler's opcode table, together with their
opcodes, are valid. To support other instructions, their mnemonics and opcodes must be
added to the C source file.
The single pass assembler thus translates the source code into an object program
in one single pass.
2.3 System requirements

Software requirements:
OPERATING SYSTEM : Red Hat Linux 9.0 version 2.4.20-8 or Fedora Core 2.4.221.2215.nptl
COMPILER USED : GCC version 3.2.2
EDITOR : VI Editor version 6.1
PROGRAMMING LANGUAGE : GNU C, Lex version 2.5.4
Hardware requirements:
MAIN PROCESSOR : Pentium IV (500MHz)
RAM SIZE : 128 MB
CACHE MEMORY : 256 KB
DISKETTE DRIVE : 1.44 MB, 3.5 inches
OPTICAL DRIVE : 4X CD-ROM DRIVE
Chapter 3
System Architecture and Data Flow Diagram
Data flow diagram
The DFD (bubble chart) is a hierarchical graphical model of a system that shows
the different processing activities or functions that the system performs and the data
interchange among these functions. Each function is considered as a processing station
(process) that consumes some input data and produces some output. The system is
represented in terms of the input to the system, various processing carried out on it and
the output generated by the system. A DFD model uses a very limited number of
primitive symbols to represent the functions performed by the system and the data flow.
3.1 Components of DFD
Fig. 3.1: DFD components (External Entity, Process, Output, Data Flow, Data Store)
The DFD technique is popular mainly because it is a very simple formalism that is
easy to understand and use. Starting with a set of high-level functions that a system
performs, a DFD model represents the various sub-functions hierarchically. Any
hierarchical model is simple to understand because the details of the system are
introduced gradually, level by level. The technique also follows a very simple set of
intuitive concepts and rules. The DFD is an elegant modeling technique that is useful not
only for representing the results of the structured analysis of a software problem, but
also for several other applications, such as showing the flow of items or control in an
organization.
3.2 Different levels of DFD

The different levels of DFD for Single Pass Assembler are constructed as follows:
LEVEL-0 DFD
Fig. 3.2: Level-0 DFD (labels: Source Code, Single Pass Assembler, Object Code)
LEVEL-1 DFD
Fig. 3.3: Level-1 DFD (labels: Source Code, Creating SYMTAB, Symbol Table, OP Table, Error Messages)
LEVEL-2 DFD
Fig. 3.4: Level-2 DFD (labels: Source Code, Main Program, Parser Routine, Lexical Analyser, Creating SYMTAB, Assembler Listing, Error Message)
Chapter 4
Implementation
4.1 Implementation of Single Pass Assembler:
Implementation Details
When a one-pass assembler must generate an object program, forward references are
handled as follows:
1) If the operand contains an undefined symbol, use 0 as the address and write the Text
record to the object program.
2) Forward references are entered into lists, as in the load-and-go assembler.
3) When the definition of a symbol is encountered, the assembler generates another Text
record containing the correct operand address for each entry in the reference list.
4) When the program is loaded, the incorrect address 0 is overwritten by the later Text
record that carries the symbol's address.
In this way the loader completes the forward references that the assembler could not
resolve during its single pass.
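Step 3 above can be illustrated with a small C sketch that formats the extra Text record for one forward reference. It is an illustration only: the function names `operand_field_address` and `format_fixup_record` are hypothetical, the caret-separated field layout is assumed from the object file written by the appendix listing, and the operand is assumed to use direct addressing without indexing.

```c
#include <assert.h>
#include <stdio.h>

/* In a 3-byte SIC instruction the opcode occupies the first byte, so the
   2-byte operand field starts one byte after the instruction address. */
unsigned operand_field_address(unsigned instruction_addr)
{
    return instruction_addr + 1;
}

/* Formats a Text record that overwrites the 2-byte operand field at
   field_addr with the now-known symbol address:
       T^<6-digit start address>^<2-digit length>^<4-digit address>
   Returns the number of characters written, or -1 if buf is too small. */
int format_fixup_record(char *buf, size_t size,
                        unsigned field_addr, unsigned symbol_addr)
{
    assert(buf != NULL);
    /* Keep only the 15 address bits; the index bit is assumed to be 0. */
    int n = snprintf(buf, size, "T^%06X^02^%04X",
                     field_addr, symbol_addr & 0x7FFFu);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For an instruction assembled at address 201B whose operand is later defined at address 2024, the sketch produces the record `T^00201C^02^2024`, which the loader applies after the original record that carried the placeholder address 0.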
4.2 Algorithm: Single Pass Assembler

//Input: source file containing the assembly level program for SIC
//Output: object file containing the assembler listing.
Main routine ()
{
    Declare required data and data structures;
    Open source file;
    Display error messages if file cannot be accessed;
    Open a new file for writing assembler listing;
    AA: Scan source file to find tokens;
        Write the token to a temporary file;
        Call parser routine;
        Check for error flags; if any are set, display error messages;
        Increment location pointer and line count;
        Write into object file the assembler listing;
    Repeat AA for all instructions in source file;
}
Forward Reference in One-pass Assembler

For any symbol that has not yet been defined:
1. Omit the address translation.
2. Insert the symbol into SYMTAB and mark it as undefined.
3. Add the address of the operand field that refers to the undefined symbol to a list of
forward references associated with the symbol table entry.
4. When the definition of the symbol is encountered, insert the proper address for the
symbol into every previously generated instruction on the forward reference list.
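The four steps above can be sketched in C with a linked list hanging off each symbol table entry. This is a minimal sketch assuming a load-and-go arrangement in which generated code sits in a byte array; the names `FwdRef`, `Symbol`, `symbol_init`, `add_forward_ref` and `define_symbol` are hypothetical and do not appear in the project's source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One pending reference: where an operand field is waiting for an address. */
typedef struct FwdRef {
    int field_addr;
    struct FwdRef *next;
} FwdRef;

typedef struct {
    char name[10];
    int defined;     /* 0 until the label's definition is seen */
    int address;     /* meaningful only when defined != 0 */
    FwdRef *refs;    /* forward reference list, newest first */
} Symbol;

/* Step 2: create the entry and mark it undefined. */
void symbol_init(Symbol *sym, const char *name)
{
    assert(sym != NULL && name != NULL);
    strncpy(sym->name, name, sizeof sym->name - 1);
    sym->name[sizeof sym->name - 1] = '\0';
    sym->defined = 0;
    sym->address = 0;
    sym->refs = NULL;
}

/* Step 3: remember that the operand field at field_addr needs this symbol.
   Returns 0 on success, -1 if memory is exhausted. */
int add_forward_ref(Symbol *sym, int field_addr)
{
    FwdRef *r = malloc(sizeof *r);
    if (r == NULL)
        return -1;
    r->field_addr = field_addr;
    r->next = sym->refs;
    sym->refs = r;
    return 0;
}

/* Step 4: record the definition and patch every pending 2-byte operand
   field in the memory image. Returns the number of fields patched. */
int define_symbol(Symbol *sym, int address,
                  unsigned char *memory, size_t mem_size)
{
    int patched = 0;
    sym->defined = 1;
    sym->address = address;
    while (sym->refs != NULL) {
        FwdRef *r = sym->refs;
        sym->refs = r->next;
        if (r->field_addr >= 0 && (size_t)r->field_addr + 1 < mem_size) {
            memory[r->field_addr] = (unsigned char)((address >> 8) & 0x7F);
            memory[r->field_addr + 1] = (unsigned char)(address & 0xFF);
            patched++;
        }
        free(r);
    }
    return patched;
}
```

The list is emptied as it is applied, so a symbol that is still undefined at the end of the program can be detected by checking `defined == 0`.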
Flow chart of single pass assembler
4.3 Methodology:
After careful analysis, the system was organized around the following data
structures:
1) Operation Code Table (OPTAB): Used to look up and validate mnemonic operation codes
and to translate them to their machine language equivalents. Each entry also holds the
instruction format and length. In SIC/XE, the assembler searches OPTAB during its single
pass to find the instruction length by which to increment LOCCTR. OPTAB is organized as a
static hash table.
2) Symbol Table (SYMTAB): Used to store the values assigned to labels. Each entry holds
the name and value (address) of a label, flags that indicate error conditions, and the
type and length of the item the label refers to. Labels are entered into SYMTAB along
with their assigned addresses (taken from LOCCTR). SYMTAB is also organized as a hash
table; entries are rarely deleted from it.
3) Location Counter (LOCCTR): Used to help in the assignment of addresses. LOCCTR is
initialized to the address specified in the START statement. When a label is reached, the
current value of LOCCTR gives the address to be associated with that label.
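How LOCCTR advances from one statement to the next can be sketched as below. This is a hedged illustration rather than the project's code: `statement_length` is a hypothetical helper, and its handling of hexadecimal constants (X'..') is an addition that the appendix listing does not implement.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the number of bytes a statement occupies, i.e. the amount by
   which LOCCTR must be advanced after the statement is processed. */
int statement_length(const char *opcode, const char *operand)
{
    assert(opcode != NULL && operand != NULL);
    if (strcmp(opcode, "RESW") == 0)
        return 3 * atoi(operand);            /* reserved words, 3 bytes each */
    if (strcmp(opcode, "RESB") == 0)
        return atoi(operand);                /* reserved bytes */
    if (strcmp(opcode, "BYTE") == 0) {
        size_t n = strlen(operand);
        if (n < 3)
            return 0;                        /* malformed constant */
        if (operand[0] == 'C')
            return (int)(n - 3);             /* C'EOF': one byte per character */
        if (operand[0] == 'X')
            return (int)((n - 3 + 1) / 2);   /* X'F1': two hex digits per byte */
        return 0;
    }
    return 3;                                /* WORD and every SIC instruction */
}
```

Starting from the address given in START, the assembler would add the returned length to LOCCTR after each statement, so that the value seen when the next label is reached is that label's address.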
Chapter 5
Testing
5.1 Testing Process
The single pass assembler performs basic functions such as translating mnemonic operation
codes to their machine language equivalents and assigning machine addresses to the
symbolic labels used by the programmer.
It also handles forward and backward references in one pass over the source program, and
it reports the line number of each error in an error text file.
5.2 Different test cases

Test case No: 1
Name of the test case: Test_case_1
Description: Test case to check source code with no error
Input: Input file with no error
Expected output: Object program with no error
Actual output: Object program with no error

Test case No: 2
Name of the test case: Test_case_2
Description: Test case to check source code with error
Input: Input file with error
Expected output: Error message to be generated in error file
Actual output: Error message to be generated in error file

Test case No: 3
Name of the test case: Test_case_3
Description: Test case to check source code with invalid opcode
Input: Input file with error
Expected output: Error message to be generated in error file with no object program
Actual output: Error message to be generated in error file with no object program
Chapter 6
Results and Inference
6.1 Results:
6.2 Inference
The single pass assembler generates the symbol table as well as the object code, and it
reports any errors that are present.
All of this happens in a single pass, during which forward references are also resolved.
Chapter 7
Conclusion and Future Scope
7.1 Conclusion
Working on this project exposed us to an important area of computer science. It gave us
a valuable learning experience and a good understanding of how a single pass assembler
works.
With the help of this project, we better understand how software components written
separately are integrated into a single working unit, and we better understand the
different aspects of an assembler.
An assembler is a language processor that generates an object program equivalent to the
source code. An assembler that makes more than one pass over the source program creates
an intermediate file, which it modifies during the later passes, resolving references
and other complex constructs in each pass until the target code is ready to be loaded
into physical memory for execution, with or without the help of a linker/loader.
7.2 Future scope:

1. Implement some optimization of the output code.
2. Extend the opcode table with more instructions.
3. Generate more complete and correct object code.
APPENDIX
Source Code
#include<stdio.h>
#include<stdlib.h> /* atoi */
#include<string.h>
#define q 11 //no. of mnemonics in the arrays a and b
int main(void)
{
int lc,ad,address=0,err=0;
int s,num,l,i=0,j,n=0,line=1,f=0,f1=0,t=0,ni=0,m=0,t1;
FILE *fp1,*fp2,*fp3,*fp4;
char lab[10],op[10],val[10],code[10];
/* a[i] is a mnemonic and b[i] is its SIC/XE opcode in hex */
char a[20][15]={"STA","STL","LDA","LDB","J","JEQ","JSUB","SUB","COMP","STCH","ADD"};
char b[20][15]={"0C","14","00","68","3C","30","48","1C","28","54","18"};
char sym[15][10];
int symadd[15];
fp1=fopen("INPUT.DAT","r");
fp2=fopen("OBJFILE.DAT","w");
fp3=fopen("ERROR.DAT","w");
fp4=fopen("SYMTAB.DAT","w");
while((!feof(fp1)))
{
fscanf(fp1,"%s\t%s\t%s",lab,op,val);
t++;
m++;
if(strcmp(op,".")==0)
m=0;
else if(strcmp(op,"END")==0)
break;
}
t=t-1;
m--;
fclose(fp1);
fp1=fopen("INPUT.DAT","r");
fscanf(fp1,"%s\t%s\t%x",lab,op,&lc);
fprintf(fp3,"-------------------------------------\n");
fprintf(fp3,"LINE NO.\t|ERROR FOUND\n");
fprintf(fp3,"-------------------------------------");
fprintf(fp4,"SYMBOL\tADDRESS");
s=lc;
fprintf(fp2,"H^%s^00%x^%x\n",lab,lc,t*3);
fprintf(fp2,"T^00%x^",lc);
if(m>10)
fprintf(fp2,"1E");
else
fprintf(fp2,"%x",m*3);
while(strcmp(op,".")!=0&&(!feof(fp1)))
{
fscanf(fp1,"%s\t%s\t%s",lab,op,val);
line++;
if(strcmp(lab,"$")!=0)
{
for(i=0;i<n;i++)
{
if(strcmp(lab,sym[i])==0)
{
f=1;
break;
}
f=0;
}
if(f==0)
{
strcpy(sym[n],lab);
symadd[n]=lc;
fprintf(fp4,"\n%s\t%x",lab,lc);
n++;
}
if(f==1)
{
fprintf(fp3,"\n%d\t\t|SYMBOL ALREADY DEFINED",line);
err++;
}
}
num = atoi(val);
if(strcmp(op,"RESW")==0)
lc = lc + (num*3);
else if(strcmp(op,"RESB") == 0)
lc = lc + num;
else if(strcmp(op,"BYTE") == 0)
{
num = strlen(val) - 3;
lc = lc + num;
for(i = 2, j = 0;i <strlen(val)-1;i++)
{
code[j] = val[i];
j++;
}
code[j] = '\0';
fprintf(fp2,"^%s",code);
ni++;
}
else
lc = lc + 3;
if(strcmp(op,".") == 0)
break;
}
while(strcmp(op,"END")!= 0 && (!feof(fp1)))
{
fscanf(fp1,"%s\t%s\t%s\t",lab,op,val);
line++;
if(strcmp(op,"END")==0)
break;
/* a label is entered only when the statement is not a data declaration */
if((strcmp(lab,"$") != 0)&&(strcmp(op,"RESW")!=0&&strcmp(op,"RESB")!=0&&strcmp(op,"WORD")!=0&&strcmp(op,"BYTE")!=0))
{
for(i = 0; i < n; i++)
{
if(strcmp(lab,sym[i])==0)
{
f = 1;
break;
}
f = 0;
}
if(f==0)
{
strcpy(sym[n],lab);
symadd[n]=lc;
fprintf(fp4,"\n%s\t%x",lab,lc);
n++;
}
else
{
fprintf(fp3,"\n%d\t\t|SYMBOL ALREADY DEFINED",line);
err++;
}
}
else if(strcmp(op,"RESW")==0||strcmp(op,"RESB")==0||strcmp(op,"WORD")==0||strcmp(op,"BYTE")==0)
fprintf(fp3,"\n%d\t\t|Declaration not allowed here",line);
if(strcmp(op,"RESW")!=0&&strcmp(op,"RESB")!=0&&strcmp(op,"WORD")!=0&&strcmp(op,"BYTE")!=0)
{
for(i=0;i<q;i++)
{
if(strcmp(op,a[i])==0)
{
strcpy(code,b[i]);
f1=0;
break;
}
f1=1;
}
if(f1==1)
{
fprintf(fp3,"\n%d\t\t|WRONG OPCODE",line);
err++;
}
for(i=0;i<n;i++)
{
if(strcmp(val,sym[i])==0)
{
address=symadd[i];
f=0;
break;
}
f=1;
}
if(f)
{
fprintf(fp3,"\n%d\t\t|UNDEFINED SYMBOL",line);
err++;
}
}
if(ni<10)
{
fprintf(fp2,"^%s%x",code,address);
ni++;
}
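else
{
/* the current Text record is full: start a new one on its own line */
fprintf(fp2,"\nT^00%x^",lc);
m=m-10; /* instructions already written to the previous record */
if(m>10)
fprintf(fp2,"1E");
else
fprintf(fp2,"%x",m*3);
fprintf(fp2,"^%s%x",code,address);
ni=1; /* the new record already holds this instruction */
}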
lc=lc+3;
}
fprintf(fp2,"\nE^00%x",s);
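fprintf(fp3,"\nNo of errors=%d\n--------------------------------------",err);
printf("Output file:OBJFILE.DAT\nErrors are described in ERROR.DAT\nSymbol table is in the file:SYMTAB.DAT\n");
/* fcloseall() is a GNU extension; close each file explicitly instead */
fclose(fp1);
fclose(fp2);
fclose(fp3);
fclose(fp4);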
}