
Amoeba Distributed Operating System

What is Amoeba

Amoeba is a distributed operating system


Developed by Andrew Tanenbaum
Uses timesharing
User logs into the system as a whole, not just his local machine.
When the user runs a program, the system decides which machine (or machines) in the
system should execute it.
This decision is invisible to the user.

Amoeba is a distributed operating system. It combines a large variety of individual machines,
connected over a (fast) network, into one large computer. It was originally developed at the Vrije
Universiteit in Amsterdam by Andrew Tanenbaum and many others. Amoeba was always
designed to be used, so it was deemed essential to achieve extremely high
performance. Its developers reported it to be among the fastest distributed operating systems of its time.
The original Amoeba sources are distributed under a public license similar to the BSD license.

Amoeba builds upon a traditional microkernel. It supports true (kernel-controlled) multithreading
and segment-based memory management. All Amoeba components communicate with
each other over a standardized RPC (Remote Procedure Call) interface that is simple but very
powerful. Whether a client or server thread is running in kernel or user mode, it uses the
same RPC interface. This leads to a very clean and simple OS design that is
well suited for beginners.
Because Amoeba was designed from scratch around new concepts, it
suffered from a lack of application programs. Therefore a POSIX-compliant UNIX emulation was
added, which makes porting UNIX programs much easier. Now, with additional changes, a large
variety of application programs from the UNIX world work under Amoeba:

X11 with applications, various compilers (gcc, ocaml, tcl/tk), the bash shell, editors, and many
more. Amoeba is ready to use, though not without (small) problems.

It is still in a slightly experimental state, but ready to use. FSD-Amoeba is intended
for dedicated programmers only, not for end users!

Highlights:

 A huge number of kernel extensions and changes


 New hardware drivers:
o several new network drivers
o full PCI support
o new parallel and serial port support
o virtual console support
o enhanced page protection
 100 Mbps Fast Ethernet support
 New interrupt handling, national keyboard support, enhanced IOP server for X11
 X11 Release 6.4
 Bash-2.02, tcl-7.6, tk-4.0, ocaml-3.01, xv-3.10a, ghostscript 4.03 & 6.01, all GNU
textutils, gmake, gzip...
 New Setup and installation scripts, various new menu driven system administration tools
 Crosscompiling environment under Linux
 Fireball Documentation Project (FDP)
 Andrew Toolkit for the Fireball Documentation Project
History

 The history of modern computing can be divided into the following eras:
 1970s: Timesharing (1 computer with many users)
 1980s: Personal computing (1 computer per user)
 1990s: Parallel computing (many computers per user)

In the 1980s, computers could be networked together, and files could be shared between users
through RPCs.
Parallel computing in the 1990s and today is used to share CPU resources among a
network of computer systems.
This concept is referred to as distributed computing or parallel computing.
How can we exploit a configuration in which one user has many computers?
Amoeba is one answer to this problem.
Developed at the Vrije Universiteit Amsterdam, Netherlands. Chief designer: Andrew S.
Tanenbaum; other developers were Frans Kaashoek, Sape J. Mullender, Robbert van
Renesse, Leendert van Doorn, Kees Verstoep, and many more.
First prototype release in 1983 (V1.0); last official release in 1996 (V5.3).
Supports multiple architectures: 68k, i80386, SPARC
Virtual Amoeba Machine and Features in Amoeba

The next step: a virtual machine supplying Amoeba concepts such as RPC, running either
natively under Amoeba or under UNIX with the AMUNIX library together with the FLIP protocol
module. The VM is derived from and built with OCaml. The great advantage: Amoeba programs
written in OCaml and compiled to bytecode can run independently of the underlying OS!

1. Design goals 

The basic design goals of Amoeba are:


Distribution—Connecting together many machines
Parallelism—Allowing individual jobs to use multiple CPUs easily
Transparency—Having the collection of computers act like a single system
Performance—Achieving all of the above in an efficient manner

 One of the main goals of Amoeba's development was to design a transparent
distributed system that allows users to log into the system as a whole: Transparency. That
means hiding the complexities of a distributed system from the users. Amoeba users
need not be concerned about the number of processors in the system, nor must they
know the location of the other machines or servers (such as the file server).
 Several machines connected over a network operate as a single system: Distribution.
Amoeba gives its users the illusion of interacting with a single, powerful system.
 Parallelism: On an Amoeba system, a single program or command can use multiple
processors to increase performance. The user simply requests an operation, and
Amoeba decides the best way to execute the request. Amoeba decides which
processor (or processors) is appropriate for the request, based on the current state of
the system. Additionally, special development tools have been made for the Amoeba
environment that take advantage of the inherent parallelism. For example, Amoeba
supports a parallel 'make' program.
 Performance: Much effort went into meeting this goal. It is accomplished with a
newly developed high-performance network protocol called FLIP (Fast Local Internet
Protocol). When FLIP was developed, none of the existing protocols provided adequate
support for distributed systems. FLIP provides clean, simple, and efficient
communication between distributed nodes.
 Developed from scratch: Amoeba is not based on any existing operating
system
 Amoeba interacts with the user as a UNIX-like timesharing system

2. System architecture

Amoeba implements a universal distributed client-server model. In fact, the whole
system needs only three functions to do all the work: the transaction call from the client, and
the GetRequest and PutReply functions on the server side.
An Amoeba system consists of four principal components:

1. Workstations
2. Pool Processors
3. Specialized Servers (File server...)
4. WAN Gateways
Objects
 Abstract data types with data and behaviors.
 Amoeba primarily supports software objects, but hardware objects also exist.
 Each object is managed by a server process to which RPCs can be sent. Each
RPC specifies the object to be used, the operation to be performed, and any
parameters to be passed

Capabilities
 A 128-bit value that describes the object; it is created and returned to the caller when the
object is created.
 Subsequent operations on the object require the user to send its capability to the
server, both to specify the object and to prove that the user has permission to
manipulate the object.

Capabilities are protected cryptographically to prevent tampering.

In more detail:  
 

 Amoeba is designed as a collection of microkernels. The Amoeba system thus
consists of many CPUs connected over a network. Each CPU has its own
local memory, ranging from 2 MB to several hundred MB. A large number of
processors form the so-called processor pool. This group of CPUs can be
dynamically allocated as needed by the system and the users. Specialized
servers, called run servers, distribute processes fairly across these
machines.
 Several processor architectures are supported: i80386 (Pentium), 68k, and
SPARC. Today, only the i80386 architecture is significant for building an Amoeba
system, because the hardware is cheap.
 Workstations give users access to the Amoeba system. There is
typically one workstation per user, and the workstations are mostly diskless; only
a workstation kernel must be booted (from floppy, via tftp, or from Flash
EEPROM). Amoeba supports X Windows and UNIX emulation.
 At the heart of the Amoeba system are several specialized servers that carry out and
synchronize the fundamental operations of the kernel. Amoeba has a directory
server (called SOAP) that is the naming service for all objects used in the
system. SOAP provides a way to assign ASCII names to objects so that they are easier
for humans to manipulate. Amoeba also has a file server (called the Bullet server) that
implements a stable, high-speed file service. Its files are immutable. The
underlying idea behind immutable files is to prevent the replication mechanism
from undergoing race conditions, so the directory server can replicate files without
fear of them changing. High speed is achieved by using a
large buffer cache. Since files are first created in the cache and are written
to disk only when they are closed, every file can be stored contiguously.
File server crashes normally do not result in
an inconsistent file system. The Bullet server uses the virtual disk server to
perform disk I/O, so the file server can run as a normal user
program. The boot server controls all global system servers (outside the kernel):
it starts them, checks and polls them, and restarts them if they crash.
 All Amoeba objects (files, programs, memory segments, servers) are protected
and described by so-called capabilities (see below).

An example of a processor pool: about 60-80 Sun motherboards were built into a rack
system at the Vrije Universiteit. Of course, cheap ordinary IBM PCs can be used as CPU
servers, too!

3. The Amoeba Micro-kernel 

Microkernel and Server Architecture


 Amoeba is built upon a microkernel architecture.
 The microkernel supports the basic process, communications, and object
primitives. It also handles device I/O and memory management.
 Each machine in the Amoeba system runs a small identical software program -
called the microkernel.
 The function of the kernel is to allow efficient communication between client
processes, which run application programs, and server processes, such as the
Bullet File server or the directory server.  

A small piece of code, called the microkernel, is present on all Amoeba machines. All machines run
nearly the same microkernel, which handles
 Low level I/O management
 Communication between processes or threads
 Low level Memory management
 Process and  thread (kernel/user space) management

Server processes (see above) supply other operating system services and generally run in user
mode. This job specialization allows the microkernel to be small and efficient, increases
reliability, and allows as much of the operating system as possible to run as user processes.
It also provides flexibility, and individual CPUs are not burdened with facilities they do
not need.

 Threads

Each process has its own address space and contains multiple threads.

These threads have their own stack and program counter, but share the global data and code of
the process.

 Remote Procedure Calls

RPC is the basic communication mechanism in Amoeba. Communication consists of a client
thread sending a message to a server thread, then blocking until the server thread sends back a
return message, at which time the client is unblocked.

Amoeba uses stubs to access remote services which hide the details of the remote services
from the user. A special language in Amoeba called the Amoeba Interface Language (AIL)
generates these stubs automatically. The stubs will then marshal parameters and hide the
communication details from the user.

Process concept:

 Amoeba supports the traditional process concept
 A process consists of one or more threads
 Each thread has its own registers, instruction pointer, and stack, but all threads of a
process share the same memory region
 Example: file server. Each request is handled by one thread, but all threads use
the same cache; synchronization is through mutexes and semaphores

Memory management:

 Threads can allocate and deallocate blocks of memory, called Segments . 


 These segments can be read and written, and can be mapped into and out of the
address space of the process.
 A process owns at least one segment, but may have many more.
 Segments can be used for text, data, stack, or any other purpose the process
desires. The operating system does not enforce any particular pattern on
segment usage.

I/O management:

 For each I/O device attached to a machine, there is a device driver in the kernel.
The driver manages all I/O for the device.
 All drivers are statically linked into the kernel; there is no dynamic module support
 Communication with device drivers is mostly performed through the
standard message protocol (as in the rest of the system in user space)

Communication:
Two forms of communication are provided:  
 

o Point-to-Point communication
o Group communication

 Group Communication
o Amoeba provides a mechanism that allows all receivers in a one-to-many
configuration to receive a transmitted message in the same order. This
simplifies parallel processing and distributed programming problems.

4. Amoeba's  Object  concept 

Amoeba
 Programs can execute wherever OS decides.
 No concept of host machine.
 Objects and Capabilities are used to manage file systems.

Network OS
 Programs run locally unless specified.
 User aware he is using a local host machine.
 Files are maintained and accessed from local machine unless using a remote file
system.

The central point of the software concept for a server implementation is the object concept.
Each object consists of

 Data and
 Operations on this data

Amoeba is organized as a collection of objects (essentially abstract data types), each with some
number of operations that processes can perform on it. Operations on an object are performed
by sending a message to the object's server. Objects are created by processes and managed by
the corresponding server. There are many different object classes:

 Files
 Directories
 Memory segments
 Processes
 I/O-Devices (Hard drive...)
 Terminals
 ...

Operations on objects are performed with stub procedures. When an object is created, the
server returns a capability. The capability is used to address and protect the object. A typical
capability consists of four fields: Port, Object, Rights, and Check.

The Port field identifies the server. The Object field tells which object is being referred to,
since a server normally manages several objects. The Rights field specifies which operations
are allowed (e.g., a capability for a file may be read-only). Since capabilities are managed in user
space, the Check field is needed to protect them cryptographically and prevent users from
tampering with them.
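
The four fields can be sketched as a C structure. This is only an illustration under stated assumptions: the field widths (48 + 24 + 8 + 48 = 128 bits) follow published Amoeba descriptions rather than this text, the names capability, restrict_cap, cap_valid, and cap_allows are hypothetical, and toy_one_way is a stand-in mixing function that is NOT cryptographically secure. The validation scheme shown (a per-object secret kept by the server, combined with the Rights field and passed through a one-way function) is one scheme described in the Amoeba literature.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths: 48 + 24 + 8 + 48 = 128 bits. */
typedef struct {
    uint64_t port;    /* 48 bits used: identifies the server         */
    uint32_t object;  /* 24 bits used: object number at that server  */
    uint8_t  rights;  /* one bit per permitted operation             */
    uint64_t check;   /* 48 bits used: protects against tampering    */
} capability;

#define MASK48      0xFFFFFFFFFFFFULL
#define RIGHT_READ  0x01u
#define RIGHT_WRITE 0x02u
#define ALL_RIGHTS  0xFFu

/* Hypothetical stand-in for a one-way function (illustration only). */
uint64_t toy_one_way(uint64_t x) {
    x &= MASK48;
    x ^= x >> 23;
    x = (x * 0x2127599BF4325ULL) & MASK48;
    x ^= x >> 25;
    return x;
}

/* Server side: derive a capability with fewer rights from the
   per-object secret that only the server knows. */
capability restrict_cap(capability owner, uint64_t secret, uint8_t rights) {
    capability c = owner;
    c.rights = rights;
    c.check = toy_one_way(secret ^ rights);
    return c;
}

/* Server side: recompute the Check field and compare. */
int cap_valid(capability c, uint64_t secret) {
    if (c.rights == ALL_RIGHTS)
        return c.check == (secret & MASK48);
    return c.check == toy_one_way(secret ^ c.rights);
}

/* An operation is allowed only if the capability is genuine and
   carries every rights bit the operation needs. */
int cap_allows(capability c, uint64_t secret, uint8_t needed) {
    return cap_valid(c, secret) && (c.rights & needed) == needed;
}
```

A user who flips a bit in the Rights field of a restricted capability cannot produce the matching Check value without knowing the server's secret, so the server rejects the altered capability.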

5. Process management 
A process is an object in Amoeba. Information about the processes in Amoeba is contained in
capabilities and in a data structure called a process descriptor, which is used for process
creation, stunned processes, and process migration. The process descriptor consists of
four components:

 The host descriptor gives the requirements for the system on which the process
must run, by describing which machines it can run on
 The capabilities include the capability of the process, which every client needs,
and the capability of a handler, which deals with signals and process exit
 The segment component describes the layout of the address space (see below)
 The thread component describes the state of each thread in
the process (see below), including its state information (IP, stack, ...)

Amoeba supports a simple thread model. When a process starts up, it has at least one thread.
The number of threads is dynamic: during execution, the process can create additional threads,
and existing threads can terminate. All threads are managed by the kernel. The advantage of
this design is that when a thread does an RPC, the kernel can block that thread and schedule
another one in the same process if one is ready.

Three methods are provided for thread synchronization:


 Mutexes
 Semaphores
 Signals

A mutex is like a binary semaphore. It can be in one of two states, locked or unlocked. Trying to
lock an unlocked mutex causes it to become locked, and the calling thread continues. Trying to lock
a mutex that is already locked causes the calling thread to block until another thread unlocks the
mutex. The second way threads can synchronize is by counting semaphores. These are slower
than mutexes, but there are times when they are needed. A semaphore cannot be negative. Trying
a down operation on a zero semaphore causes the calling thread to block until another thread does an up
operation on the semaphore.
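
The state rules above can be modeled in a few lines of C. This is a single-threaded model of the semantics only, not the Amoeba thread library: the type and function names are hypothetical, and instead of actually blocking, each "try" function returns 0 where the calling thread would block and 1 where it would continue.

```c
#include <assert.h>

/* Hypothetical models: a mutex has two states, a counting
   semaphore holds a non-negative count. */
typedef struct { int locked; } mutex_model;
typedef struct { int count;  } sema_model;

/* Returns 1 if the caller would continue (the mutex becomes locked),
   0 if the caller would block because the mutex is already locked. */
int mu_try_lock_model(mutex_model *m) {
    if (m->locked)
        return 0;
    m->locked = 1;
    return 1;
}

/* Unlocking lets a blocked thread proceed on its next attempt. */
void mu_unlock_model(mutex_model *m) {
    m->locked = 0;
}

/* Returns 1 if the down operation succeeds, 0 if the caller would
   block because the count is zero (it can never go negative). */
int sema_try_down_model(sema_model *s) {
    if (s->count == 0)
        return 0;
    s->count--;
    return 1;
}

/* An up operation increments the count, releasing one waiter. */
void sema_up_model(sema_model *s) {
    s->count++;
}
```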

Signals are asynchronous interrupts sent from one thread to another in the same process.
Signals can be raised, caught, or ignored. Asynchronous interrupts between processes use
the stun mechanism.

6. Memory management 
 

Amoeba supplies simple memory management based on segments. Each process owns at
least three segments:

1. Text/code segment
2. Stack segment for the main thread/process
3. Data segment

Each further thread gets its own stack segment, and the process can allocate arbitrary
additional data segments.

All segments, including the kernel segments, are page protected by the underlying MMU.

7. Communication 
The definitions of the Amoeba communication calls are given in ANSI C.
All three calls use a Msg data structure, which is a 32-byte header with several fields to
hold capabilities and other items. Note that each request or reply message can consist
of just a header or of a header and an additional component.
All processes, including the kernel, communicate through a standardized RPC (Remote Procedure
Call) interface. Only three functions are needed:

trans(Msg *requestHeader, char *requestBuffer, int requestSize, Msg *replyHeader,
char *replyBuffer, int replySize)
The client sends a request message and receives a reply; the
header contains a capability for the object upon which an operation is being requested.
trans(req_header, req_buf, req_size, rep_header, rep_buf, rep_size)
-> do a transaction with a server

get_request(Msg *requestHeader, char *requestBuffer, int requestSize)
The server gets a request from the port specified in the message header.
getreq(req_header, req_buf, req_size)
-> get a client request

put_reply(Msg *replyHeader, char *replyBuffer, int replySize)
The server replies.
putrep(rep_header, rep_buf, rep_size)
-> send a reply to the client

The first function is used by a client to send a message to a server and get the server's
reply to that request. The request and reply buffers are generic memory buffers (char). The
request and reply headers are simple data structures that describe the request and carry the capability
for the server. On the other side, the server calls the getreq function within an infinite loop. Each
time a client sends this server (identified by a server port; see capabilities for details) a
message, getreq returns with the client data, if any, filled into the request buffer. The
request header contains information about the client request.

Because the client expects a reply, the server must send one (with or without reply
data) using the putrep function.
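
The request/reply pattern can be sketched as an in-process simulation. Everything here is an assumption made for illustration: the Msg structure is a simplified stand-in for the real 32-byte header, the sim_ functions only mimic the roles of trans, getreq, and putrep, the single shared slot stands in for the network, and the "upper-case echo" service is invented. A real server would loop forever on getreq; here the client calls one loop iteration directly, which stands in for blocking until the reply arrives.

```c
#include <assert.h>
#include <string.h>

/* Simplified, hypothetical header (not the real Amoeba layout). */
typedef struct {
    unsigned long port;  /* which server the request is for   */
    int command;         /* requested operation               */
    int status;          /* filled in by the server's reply   */
} Msg;

enum { CMD_ECHO_UPPER = 1, STATUS_OK = 0, STATUS_BAD_CMD = -1 };

/* One-slot "network": holds one pending request or its reply. */
static Msg  wire_hdr;
static char wire_buf[128];
static int  wire_len;

/* Server side: fetch the pending request header and data. */
int sim_getreq(Msg *hdr, char *buf, int size) {
    int n = wire_len < size ? wire_len : size;
    *hdr = wire_hdr;
    memcpy(buf, wire_buf, (size_t)n);
    return n;
}

/* Server side: store the reply header and optional reply data. */
void sim_putrep(const Msg *hdr, const char *buf, int size) {
    wire_hdr = *hdr;
    wire_len = size < (int)sizeof wire_buf ? size : (int)sizeof wire_buf;
    memcpy(wire_buf, buf, (size_t)wire_len);
}

/* One iteration of the server loop: get request, serve it, reply. */
void server_step(void) {
    Msg hdr;
    char buf[128];
    int n = sim_getreq(&hdr, buf, (int)sizeof buf);
    if (hdr.command == CMD_ECHO_UPPER) {
        for (int i = 0; i < n; i++)
            if (buf[i] >= 'a' && buf[i] <= 'z')
                buf[i] = (char)(buf[i] - 'a' + 'A');
        hdr.status = STATUS_OK;
    } else {
        hdr.status = STATUS_BAD_CMD;
        n = 0;                      /* header-only reply */
    }
    sim_putrep(&hdr, buf, n);
}

/* Client side: send the request, wait for the reply, copy it out. */
int sim_trans(const Msg *req_hdr, const char *req_buf, int req_size,
              Msg *rep_hdr, char *rep_buf, int rep_size) {
    wire_hdr = *req_hdr;
    wire_len = req_size < (int)sizeof wire_buf ? req_size
                                               : (int)sizeof wire_buf;
    memcpy(wire_buf, req_buf, (size_t)wire_len);
    server_step();                  /* stands in for blocking */
    int n = wire_len < rep_size ? wire_len : rep_size;
    *rep_hdr = wire_hdr;
    memcpy(rep_buf, wire_buf, (size_t)n);
    return n;
}
```

Note that an unknown command still produces a reply, consisting of a header only, because the client is blocked until it receives one.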

How to use Amoeba


 Amoeba is freeware
 It can be loaded on a LAN or University computer system
 Editors such as elvis, jove, ed come with the installation package
 Compilers such as C, Pascal, Fortran 77, Basic, and Modula 2
 Orca - used for Parallel Programming

Applications

 UNIX emulation
 Parallel make
 Traveling salesman
 Alpha-beta search
 Parallel make

o An Amoeba system contains a processor pool with several processors.
o One application for these processors in a UNIX environment is a parallel
version of make.
o When make discovers that multiple compilations are needed, they are run in
parallel on different processors.
o pmake was developed based on the UNIX make but with additional code to
handle parallelism.
o many medium-sized files = considerable speedup
o one large source file and many small ones = total time can never be smaller
than the compilation time of the large one.
o A speedup of about a factor of 4 over sequential make has been
observed in practice on typical makefiles.
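
The two cases above follow from a standard lower bound on parallel completion time: with p processors, the total time can be no smaller than the longest single compilation, and no smaller than the total work divided by p. The sketch below computes that bound; the function names and the compile times in the usage note are made up for illustration, and dependencies between files and communication overhead are ignored.

```c
#include <assert.h>

/* Lower bound on wall-clock time for n independent compilations
   with durations t[0..n-1] spread over p processors. */
double parallel_lower_bound(const double *t, int n, int p) {
    double total = 0.0, longest = 0.0;
    for (int i = 0; i < n; i++) {
        total += t[i];
        if (t[i] > longest)
            longest = t[i];
    }
    double even_split = total / p;   /* perfect load balance */
    return even_split > longest ? even_split : longest;
}

/* Time taken when the compilations run one after another. */
double sequential_time(const double *t, int n) {
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += t[i];
    return total;
}

/* Upper bound on the speedup over sequential make. */
double best_speedup(const double *t, int n, int p) {
    return sequential_time(t, n) / parallel_lower_bound(t, n, p);
}
```

For example, eight files of 10 seconds each on 4 processors allow a speedup of up to 4, while one 60-second file plus four 5-second files (the same 80 seconds of total work) can never finish in under 60 seconds, limiting the speedup to about 1.33.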

 The Traveling Salesman


o The computer is given a starting location and a list of cities to be visited. The
idea is to find the shortest path that visits each city exactly once, and then
returns to the starting place.
o Amoeba was programmed to run this application in parallel by having one
pool processor act as coordinator, and the rest as slaves.
o Example: the starting place is London, and the cities to be visited include
New York, Sydney, Nairobi, and Tokyo
o The coordinator might tell the first slave to investigate all paths starting with
London-New York, the second slave to investigate all paths starting with
London-Sydney, the third slave to investigate all paths starting with London-
Nairobi...
o All searches go on in parallel.
o When a slave is finished, it reports back to the coordinator and gets a new
assignment.
o Also, the algorithm can be applied recursively.
o The first slave could allocate a processor to investigate paths starting with
London-New York-Sydney, another processor to investigate London-New
York-Nairobi, and so forth.
o Results show that about 75 percent of the theoretical maximum speedup can
be achieved using this algorithm, the remaining 1/4 being lost to
communication and other overhead
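
The division of work between coordinator and workers can be sketched as follows. This is a sequential illustration, not the Amoeba program: the distance table is made up, the function names are hypothetical, and the loop in coordinator_best stands in for handing each first hop to a separate pool processor. Each work unit is the exhaustive search of all tours that begin with one particular first city.

```c
#include <assert.h>
#include <limits.h>

#define N 5   /* city 0 is the starting place */

/* Made-up symmetric distances, for illustration only. */
static const int dist[N][N] = {
    {0, 3, 9, 4, 7},
    {3, 0, 6, 5, 8},
    {9, 6, 0, 2, 4},
    {4, 5, 2, 0, 6},
    {7, 8, 4, 6, 0},
};

/* Exhaustively extend a partial tour; `visited` is a bit set of
   cities already on the tour. Returns the best complete length. */
static int search(int current, unsigned visited, int length_so_far) {
    if (visited == (1u << N) - 1)
        return length_so_far + dist[current][0];  /* return home */
    int best = INT_MAX;
    for (int next = 1; next < N; next++) {
        if (visited & (1u << next))
            continue;
        int len = search(next, visited | (1u << next),
                         length_so_far + dist[current][next]);
        if (len < best)
            best = len;
    }
    return best;
}

/* One work unit: all tours that start 0 -> first. */
int worker_best(int first) {
    return search(first, 1u | (1u << first), dist[0][first]);
}

/* Coordinator: hand out one first hop per worker, keep the minimum. */
int coordinator_best(void) {
    int best = INT_MAX;
    for (int first = 1; first < N; first++) {
        int len = worker_best(first);
        if (len < best)
            best = len;
    }
    return best;
}
```

The recursive variant described above corresponds to a worker splitting its own unit further, by fixing the second city as well and handing each of those searches to another processor.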

Significance of points
Amoeba is a distributed operating system which successfully allows users to execute
jobs transparently over multiple CPUs.

It was primarily developed by Andrew Tanenbaum and others at the Vrije Universiteit
Amsterdam, Netherlands.

Its basic design goals are –


 Distribution—Connecting together many machines
 Parallelism—Allowing individual jobs to use multiple CPUs easily
 Transparency—Having the collection of computers act like a single system

Performance—Achieving all of the above in an efficient manner

It is based on a microkernel architecture.

It uses objects to encapsulate data and processes and capabilities to describe the
objects.

The kernel provides just three major system calls

trans(Msg *requestHeader, char *requestBuffer, int requestSize, Msg *replyHeader,
char *replyBuffer, int replySize)

get_request(Msg *requestHeader, char *requestBuffer, int requestSize)

put_reply(Msg *replyHeader, char *replyBuffer, int replySize)

It has been used successfully for UNIX emulation and has achieved speedups on common
computer science problems, including parallel make, traveling salesman, and
alpha-beta search.

Summary
The Amoeba distributed operating system succeeds in overcoming many of the
hurdles faced in distributed computing.

It abstracts away the use of RPCs using stubs and is scalable based on available
CPUs.

Although system updates seem to have stopped, the current version appears to
have reached a stable point in its architectural development.

The programming languages included with the distribution are common to most
programmers and should make code creation easy for Amoeba applications.

Results of application speedup and the fact that the system is freely available
make it worth evaluating at the university level.

References
 [1] Based on the article "Amoeba: An Overview of a Distributed Operating
System" by Eric W. Lund, March 29, 1998, Rochester Institute of Technology, and
on information about Amoeba taken from a web site of the University of Halle.
Additional parts are taken from the Amoeba tribute site by Stephen Wagner.
Also the classic: "The Amoeba kernel", Andrew S. Tanenbaum, M.F.
Kaashoek.

 [2] Tanenbaum, A.S, Sharp, G.J. “The Amoeba Distributed Operating System”
Online: 2006 http://www.cs.vu.nl/pub/amoeba/Intro.pdf
 [3] Ramsay, M., Keigel, T., Memmer, H. “Ameoba Distributed Operating
System” Online http://csserver.evansville.edu/~mr56/CS470/Final_Draft.pdf
 [4] Coulouris, G. Dollimore, J., Kindberg, T. Distributed Systems – Concepts
and Design, 1994, Online: http://www.cdk3.net/oss/Ed2/Amoeba.pdf
 [5] Sharp, G.J.: ‘‘The Design of a Window System for Amoeba,’’ Report IR-
142, Dept. of Math. & Computer Science, Vrije Universiteit, Dec. 1987.
 [6] The Amoeba Reference Manual Users Guide Vrije University of
Amsterdam, 1996 Online 2006:
http://www.cs.vu.nl/pub/amoeba/manuals/usr.pdf
 [7] Bal, H.E., Renesse R. van, and Tanenbaum, A.S.: ‘‘Implementing
Distributed Algorithms Using Remote Procedure Calls,’’ Proc. 1987 National
Computer Conference, pp. 499-506, June 1987.
 [8] Baalbergen, E.H.: ‘‘Parallel and Distributed Compilations in Loosely
Coupled systems,’’ Proc. Workshop on Large Grain Parallelism , Providence,
RI, Oct 1986.
 [9] Straven H. van, Renesse R. van, and Tanenbaum, A.S, “The Performance
Of The Amoeba Distributed Operating System” Online: 2006
https://dare.ubvu.vu.nl/bitstream/1871/2589/1/11008.pdf
 [10] Tanenbaum, A.S, et al, “Experiences with the Amoeba Distributed
Operating System” Online 2006:
http://citeseer.ist.psu.edu/cache/papers/cs/6593/ftp:zSzzSzftp.sys.toronto.eduzS
zpubzSzamoebazSz03.pdf/tanenbaum90experiences.pdf
 [11] Wikipedia – www.wikipedia.com
 [12] Slide finder – www.slidefinder.com
 [13] Slide share – www.slideshare.com
 [14] document share – www.docshare.com
