DISTRIBUTED OPERATING SYSTEM
What is a Distributed System? Give some examples of Distributed Systems.
Distributed System:
A distributed system is a system in which hardware and software components located at
different computers communicate and coordinate their actions by passing messages, which
allows them to share resources with one another.
………
In everyday life users mostly share hardware resources such as printers, and data resources
such as files.
But the importance of resource sharing increases when users are able to share the
higher-level resources that play a part in their everyday work and social activities.
For example, users are concerned with sharing data in the form of a shared database or
a set of web pages, not the disks and processors on which those are implemented.
………
A distributed system has the following important characteristics associated with it:
Concurrency:
In a distributed system, programs can be executed concurrently on two or more
computers, and the users can share resources such as web pages or files
whenever necessary.
No global clock:
When programs cooperate by sharing resources, they coordinate their actions by
exchanging messages; there is no single global clock on whose time all the computers in
the network agree, so each computer has its own notion of time… … ….
Independent failures:
All computer systems can fail, and a distributed system can fail in many ways.
Faults in the network result in the disconnection of the computers that are connected to
it, but the computers themselves do not stop running. Even though the
network fails, a user can continue to use his own computer without any problem.
………
What are the different Hardware Concepts involved with Distributed Operating
Systems? (or)
Write about Multi-processors and Multi-Computers?
HARDWARE CONCEPTS:
Even though all distributed systems consist of multiple CPUs, there are several
different ways the hardware can be interconnected so that they can communicate with
each other.
Single Instruction Stream, Single Data Stream (SISD):
A computer with a single instruction stream and a single data stream is called SISD.
All traditional uni-processor computers, i.e., computers which have only one CPU, such
as personal computers, are considered to be SISD.
Bus-Based Multicomputer:
On the other hand, building a multicomputer (i.e., with no shared memory) is easy. Each
CPU has a direct connection to its own local memory. The interconnect looks
topologically similar to the bus-based multiprocessor, but since there will be much less
traffic over it, it need not be a high-speed bus.
Draw Diagram – HC - 4
Switched Multicomputer:
Our last category consists of switched multicomputers. In this category, each
CPU has its own memory and can directly access only that memory.
The two most popular topologies are the grid and the hypercube.
Draw Diagram HC – 5
A hypercube is an n-dimensional cube where each vertex represents a CPU and each
edge represents the connection between two CPUs.
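A minimal sketch (not from the original notes) of the hypercube wiring rule: CPUs are numbered 0 .. 2^n - 1, and two CPUs are connected exactly when their binary labels differ in one bit, so every CPU has n neighbours.

# Sketch of hypercube connectivity; the function and variable names are illustrative.
def hypercube_neighbours(cpu: int, n: int) -> list[int]:
    """Return the CPUs directly connected to `cpu` in an n-dimensional hypercube."""
    return [cpu ^ (1 << bit) for bit in range(n)]

if __name__ == "__main__":
    # For a 3-dimensional hypercube (8 CPUs), CPU 0 is wired to CPUs 1, 2 and 4.
    print(hypercube_neighbours(0, 3))   # [1, 2, 4]
    print(hypercube_neighbours(5, 3))   # [4, 7, 1]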
SOFTWARE CONCEPTS:
Although the hardware is important, the software is even more important. The
operating system software is responsible for the interaction between the user and the
hardware.
There are two kinds of operating systems for multiple-CPU systems: loosely coupled
systems and tightly coupled systems.
Loosely Coupled System: This software allows the machines and users of a distributed
system to be fundamentally independent of one another, while still interacting
whenever necessary. For example, consider a group of personal computers, where each
computer has its own CPU, its own memory, its own hard disk and its own operating system, but
………
Tightly Coupled System: This software allows machines to be run by a central
machine, which controls all the operations of the other machines.
Network Operating Systems:
What are the different design issues involved with Distributed Operating
Systems?
Transparency:
A distributed system is made up of independent components, yet the user does not see
the individual components; the system appears to him as a single coherent system. This
kind of hiding is referred to as transparency.
Access Transparency: Users can access resources on their own system or on any other
system using the same operations. Such transparency is referred to as access
transparency.
Location transparency: The user need not know the location of the resources that he
is accessing.
Concurrency transparency: The users can share the same resource without any
interference.
Failure transparency: It hides the failures that have occurred from the user.
Flexibility:
The different systems involved in the distributed operating system must be flexible,
so that we can add any number of systems without having to change much of the
hardware or software configuration.
The structure of the Distributed Operating System can be arranged in two ways
Monolithic Kernel: Each machine runs a kernel that itself provides essentially all the
services to the other systems involved in the network. For example, a centralized
operating system with networking facilities added can be considered a monolithic kernel.
Micro Kernel: With a microkernel, the kernel provides only minimal services to the other
systems; the bulk of the operating system services are provided by user-level
servers.
………
Reliability:
One of the original goals of building distributed systems was to make them more
reliable than single-processor systems. The idea is that if a machine goes down, some
other machine can take over its work.
Scalability:
Distributed systems must be able to share resources among a group of users. The main
challenge is that, even as the number of users increases, the distributed system must still
be able to provide the same quality of service. Such a service is said to be a scalable
service of the distributed system.
Controlling the cost of physical resources:
As the demand for a resource grows, it should be possible to include the resource at a
reasonable cost.
Controlling the performance loss:
Sometimes the amount of data that must be managed grows in proportion to the number of
users; for example, the tables that relate domain names to IP addresses grow as the number
of users grows. In such cases we should see that performance is not lost.
Preventing software resources running out:
Software resources must not run out. For example, IPv4 can provide only about
4 billion IP addresses, whereas there are many more people and devices that
need IP addresses. … … ….
Transmission mode: The physical layer also defines the direction of transmission
between two devices, i.e. whether they use simplex, half-duplex or full-duplex
transmission modes.
2. Data Link Layer:
The data link layer uses the services of the physical layer and provides services to the
network layer: it makes the physical layer appear error-free to the upper layer.
Framing: The data link layer divides the stream of bits received from the network layer
into manageable data units called frames.
Physical addressing: If frames are to be distributed to different systems on the
network, the data link layer adds a header address to the frame to define the sender
and/or receiver of the frame.
Flow Control: If the rate at which the sender sends data exceeds the rate at which the
receiver can absorb it, the receiver is overwhelmed with more data than it can handle at a
time; the data link layer imposes flow control to prevent this. … … …
Error Control: The data link layer sees that data is not lost by adding mechanisms to
detect and retransmit damaged or lost frames.
Access control: When two or more devices are connected to the same link, data link
layer protocols are necessary to determine which device has control over the link at any
given time.
3. Network layer:
The network layer is responsible for the source-to-destination delivery of a packet,
possibly across multiple networks (links). Whereas the data link layer oversees the
delivery of the packet between two systems on the same network (links), the network
layer ensures that each packet gets from its point of origin to its final destination.
Logical Addressing: If a packet has to travel beyond its own network, the network layer
adds a header to it that contains the network-wide addresses of the sender and receiver.
These addresses are referred to as logical addresses.
Routing: In order to send the packet from one link to another, we will use connecting
devices called routers or switches which route the packets to their final destination.
4. Transport Layer:
The transport layer is responsible for process-to-process delivery of the entire message.
A process is an application program running on a computer… … …packet is related to
what.
Segmentation and reassembly: A message is divided into smaller parts called
segments, where each segment has a sequence number. These segments are also
referred to as packets.
Flow Control: Like the data link layer, the transport layer is responsible for flow
control.
Error Control: Like the data link layer, the transport layer is responsible for error
control. However, error control at this layer is performed process-to-process rather than
across a single link.
5. Session Layer:
Dialog Control: The session layer allows two systems to enter into a dialog. It allows
the communication between two processes to take place in either half duplex (one way
at a time) or full-duplex (two ways at a time) mode.
Synchronization: The session layer allows a process to add checkpoints, or
synchronization points, to a stream of data. For example, if a file of 2,000 pages is being
sent, checkpoints may be added after every 100 pages, so that if a crash occurs while page
530 is being sent, only pages 501 to 530 have to be retransmitted instead of all 530 pages.
6. Presentation layer:
The presentation layer is concerned with the syntax and semantics of the information
exchanged between two systems.
Translation: Since different computers use different data formats, if one computer sends
data in its own format, the data is translated into a common format that the destination
system understands before being delivered to the receiving system.
………
Encryption: To carry important information, a system must be able to ensure privacy,
so that other users cannot understand the data. Encryption means that the sender
transforms the original information into another form and sends the resulting message out
over the network. … … …
7. Application Layer:
File transfer, access and management: This application allows a user to access files,
as well as retrieve files and also manage, and control files present in a remote
computer.
Mail Service: This application provides the basis for e-mail forwarding and storage.
Directory service: This application provides distributed database sources and access for
global information about various objects and services.
Draw Diagram LP – 1
by traversing one or more switches. Each of these switches has four ports, each used
for both input and output.
………
A different switch design copies the cell into a queue associated with the output buffer
and lets it wait there, instead of keeping it in the input buffer. This approach eliminates
head-of-line blocking and gives better performance. It is also possible for a switch to
have a pool of buffers that can be used for both input and output buffering.
Draw Diagram ATM – 3 (b)
by the system and then invoke the method which will perform operation on the
arguments.
sendReply: Once the operation is performed, the sendReply method sends the
result back in the form of a response.
Message identifiers:
Every message carries a message identifier so that the message can be uniquely
identified and the order of messages can be determined. A message identifier consists of
two parts:
Timeouts:
Whenever a doOperation times out, there are a number of options for what to do next.
One option is to return immediately from the doOperation and indicate to the client that
the doOperation has failed. Another option is to retransmit the request repeatedly and
wait for the reply.
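A hedged sketch of the client side of such a request-reply exchange: doOperation retransmits the request after a timeout and gives up after a fixed number of attempts. The message-identifier layout (a request id prefixed to the payload) and the constants are illustrative assumptions, not part of the original notes.

# Sketch only: a UDP-based doOperation with timeout and retransmission.
import socket

MAX_RETRIES = 3
TIMEOUT_SECS = 2.0

_next_request_id = 0

def do_operation(server_addr, payload: bytes):
    """Send `payload` to the server and wait for the matching reply, or return None."""
    global _next_request_id
    _next_request_id += 1
    request = _next_request_id.to_bytes(4, "big") + payload   # message id + arguments

    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(TIMEOUT_SECS)
        for _ in range(MAX_RETRIES):
            sock.sendto(request, server_addr)          # (re)transmit the request
            try:
                reply, _ = sock.recvfrom(64 * 1024)
            except socket.timeout:
                continue                               # timeout: try again
            if reply[:4] == request[:4]:               # match reply to request id
                return reply[4:]
    return None                                        # report failure to the caller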
Discarding duplicate request messages:
In cases when the request message is retransmitted, the server may receive it more than
once. For example the server may receive the first request message but take longer than
the client’s timeout to execute the command and return the reply. Once the timeout
occurs, the sender retransmits the message. This causes duplicate request messages to
arrive at the server.
In the request (R) protocol the client will send a request, without any reply from the
server.
In the request-reply (RR) protocol the client sends a request, and the reply is
sent back by the server.
In the request-reply-acknowledge-reply (RRA) protocol the client sends a request, the
server sends a reply, and the client then sends an acknowledgement of that reply.
HTTP: an example of request and reply protocols:
HTTP:
The Hyper Text Transfer Protocol defines the ways in which browsers and other types
of clients interact with web servers.
Request Reply interactions:
HTTP is a request-reply protocol. The client sends a request message to the server
containing the URL of the required resource.
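An illustrative sketch (not from the notes) of a single HTTP request-reply exchange using Python's standard library: the client names the resource with a URL path, and the server answers with a status line, headers and the resource itself. The host "example.com" is only a placeholder.

# Sketch of one HTTP GET request and its reply.
from http.client import HTTPConnection

conn = HTTPConnection("example.com", 80, timeout=5)
conn.request("GET", "/index.html")       # request message naming the resource
response = conn.getresponse()            # reply message from the web server
print(response.status, response.reason)  # e.g. "200 OK"
body = response.read()                   # the requested resource (HTML text here)
conn.close()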
Content types:
Here, content type refers to the type of content that the browser sends to or receives
from the web server. These contents may include text, images, audio,
video, etc.
………
One resource per request:
Clients specify one resource per HTTP request. If a web page contains nine images, say
then the browser will issue a total of ten separate requests to obtain the entire contents
of the page.
Simple access control:
By default any user with network connectivity to a web server can access any of its
published resources. If users wish to restrict access to a resource, then they can set
access controls on that resource. For example, a user can set read-only access on a file so
that others can read from the file but cannot write
to it.
Authentication: Sun RPC request and reply messages provide additional fields for
authentication. Only authenticated clients have the right to call a procedure at the server.
The request message carries the user id of the user, which is passed to the server. The
server authenticates the user id and, if it is valid, the call to the server procedure is
allowed to proceed.
Write about Clock Synchronization?
Clocks:
Computers each contain their own physical clock. These clocks are electronic devices
that count the oscillations occurring in a crystal and store the result in a counter register.
Clock skew and clock drift:
Computer clocks in different computers do not generally have the same reading, i.e. the
times shown by their clocks always seem to differ.
The difference between the readings of any two clocks is called their skew. Also, the
crystal-based clocks used in computers are, like any other clocks, subject to clock drift,
which means that they count time at slightly different rates, and so differences arise
between them… … ….
Clock Synchronization:
Since such drift between the clocks of different systems is undesirable in a distributed
system, we use clock synchronization. The main aim of this mechanism is to
synchronize the clocks of all the systems that are present in the distributed system.
There are two ways in which clock synchronization can be implemented:
1. Logical Clocks
2. Physical Clocks
Logical Clocks:
From the point of view of any single process, events are ordered uniquely by times
shown on the local clock. However, as Lamport pointed out, since we cannot
synchronize clocks perfectly across a distributed system, we cannot in general use
physical time to find out the order of any pair of events occurring within it.
………
• If two events occurred at the same process Pi (i = 1, 2, …, N), then they occurred
in the order in which Pi observes them.
• Whenever a message is sent between processes, the event of sending the
message occurred before the event of receiving the message.
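A minimal sketch of Lamport's logical clock rules stated above: the local counter is incremented for each local event, and a received timestamp pushes the local clock forward so that a send is always ordered before the matching receive. The class and method names are illustrative.

# Sketch of a Lamport logical clock.
class LamportClock:
    def __init__(self) -> None:
        self.time = 0

    def local_event(self) -> int:
        """Rule 1: advance the clock for an event at this process."""
        self.time += 1
        return self.time

    def send_event(self) -> int:
        """Timestamp attached to an outgoing message."""
        return self.local_event()

    def receive_event(self, msg_timestamp: int) -> int:
        """Rule 2: the receive event is later than both the local clock and the message."""
        self.time = max(self.time, msg_timestamp) + 1
        return self.time

# Example: P1 sends to P2, so the receive at P2 gets a larger timestamp than the send.
p1, p2 = LamportClock(), LamportClock()
t_send = p1.send_event()           # 1
t_recv = p2.receive_event(t_send)  # 2
assert t_send < t_recv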
………
It then sends the message to all other processes, conceptually including itself. The
following three cases can occur.
• If the receiver is not in the critical region and does not want to enter it, it sends
back an OK message to the sender.
• If the receiver is already in the critical region, it does not reply. Instead, it queues
the request.
• If the receiver wants to enter the critical region but has not yet done so, it
compares the timestamp in the incoming message with the one contained in the
message that it has sent everyone. The lowest one wins. If the incoming message
is lower, the receiver sends back an OK message. If its own message has a lower
timestamp, the receiver queues the incoming request and sends nothing.
After sending out requests asking permission to enter a critical region, a process sits
back and waits until everyone else has given permission. As soon as all the permissions
are in, it may enter the critical region. When it exits the critical region, it sends OK
message to all processes on its queue and deletes them all from the queue.
Draw Diagram ME - 2
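A hedged sketch of the receiver-side rule of the mutual-exclusion algorithm described above (Ricart and Agrawala). The state names and function signatures are illustrative; a real implementation would also need actual message passing between the processes.

# Sketch of the decision a process makes when a request message arrives.
RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"

def on_request(state: str, own_timestamp: int, own_pid: int,
               req_timestamp: int, req_pid: int,
               deferred: list) -> bool:
    """Return True if an OK reply should be sent now, otherwise queue the request."""
    if state == HELD:
        deferred.append(req_pid)            # in the critical region: queue the request
        return False
    if state == WANTED:
        # Both want the region: the lower (timestamp, pid) pair wins.
        if (req_timestamp, req_pid) < (own_timestamp, own_pid):
            return True                     # requester wins: reply OK
        deferred.append(req_pid)            # we win: queue the request
        return False
    return True                             # not interested: reply OK immediately

def on_exit(deferred: list) -> list:
    """On leaving the critical region, reply OK to everything queued."""
    replies, deferred[:] = list(deferred), []
    return replies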
………
A Token Ring Algorithm:
In software, a logical ring is constructed in which each process is assigned a position in
the ring. The ring positions may be allocated in numerical order of network addresses
or some other means.
Draw Diagram ME - 3
When the ring is initialized, process 0 is given a token. The token circulates around the
ring. It is passed from process k to process k + 1 (modulo the ring size) in point-to-
point messages.
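A small sketch of the token-passing rule described above: the token moves from process k to process (k + 1) mod N, and a process may enter the critical region only while it holds the token. The ring size and the set of interested processes are made up for the example.

# Sketch of token circulation on a logical ring.
def next_holder(k: int, ring_size: int) -> int:
    """The token passes from process k to its successor on the logical ring."""
    return (k + 1) % ring_size

def simulate(ring_size: int, wants_region: set, rounds: int) -> list:
    """Return the order in which processes enter the critical region."""
    entries, holder = [], 0          # process 0 holds the token initially
    for _ in range(rounds):
        if holder in wants_region:
            entries.append(holder)   # the holder may enter the region, then passes the token on
        holder = next_holder(holder, ring_size)
    return entries

print(simulate(ring_size=5, wants_region={1, 3}, rounds=10))  # [1, 3, 1, 3]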
………
At any moment, a process can get an ELECTION message from one of its lower-numbered
colleagues. When such a message arrives, the receiver sends an OK message back to
the sender to indicate that it is alive and will take over.
………
If a process that was previously down comes back up, it holds an election. If it happens
to be the highest-numbered process currently running, it will win the election and take
over the coordinator's job. Thus the biggest guy in town always wins, hence the name
"bully algorithm."
Draw Diagram ME - 4
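A hedged sketch of the bully election described above: a process sends ELECTION to all higher-numbered processes; if none of them answers, it wins and announces itself as coordinator. The `alive` set is a stand-in for real failure detection.

# Sketch of the bully election algorithm.
def hold_election(initiator: int, processes: list, alive: set) -> int:
    """Return the process number of the new coordinator."""
    higher = [p for p in processes if p > initiator]
    responders = [p for p in higher if p in alive]    # higher processes that answer OK
    if not responders:
        return initiator                              # nobody bigger is up: initiator wins
    # Each responder takes over and runs its own election; the biggest one wins.
    return hold_election(max(responders), processes, alive)

# Example: processes 0..7, the old coordinator (7) has crashed,
# process 4 notices and starts an election; process 6 becomes coordinator.
print(hold_election(4, list(range(8)), alive={0, 1, 2, 3, 4, 5, 6}))  # 6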
A Ring Algorithm:
Another election algorithm is based on the use of a ring, but without a token. When any
process notices that the coordinator is not functioning, it builds an ELECTION message
containing its own process number and sends the message to its successor.
If the successor is down, the sender skips over the successor and goes to the next
member along the ring, or the one after that, until a running process is located. At each
step, the sender adds its own process number to the list in the message.
………
When this message has circulated once, it is removed and everyone goes back to
work.
Draw Diagram ME - 5
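A minimal sketch of the ring election described above: the ELECTION message collects the numbers of all live processes as it circulates once around the ring, and the highest one becomes the coordinator. Process liveness is modelled with a simple `alive` set.

# Sketch of the ring (token-less) election algorithm.
def ring_election(start: int, ring: list, alive: set) -> int:
    """Circulate an election message once around the ring and pick the coordinator."""
    members, n = [], len(ring)
    pos = ring.index(start)
    for step in range(n):
        candidate = ring[(pos + step) % n]
        if candidate in alive:            # crashed successors are skipped
            members.append(candidate)     # each live process appends its own number
    return max(members)                   # highest-numbered live process wins

# Process 5 notices the coordinator (7) is down and starts the election.
print(ring_election(5, ring=[0, 1, 2, 3, 4, 5, 6, 7],
                    alive={0, 1, 2, 3, 4, 5, 6}))  # 6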
• They are free from interference by operations being performed on behalf of other
concurrent clients; and
• Either all of the operations must be completed successfully or they must have no
effect at all in the presence of server crashes.
A client that performs a sequence of operations on a particular bank account will first
look up the account by name and then apply the deposit, withdraw and getBalance
operations directly to the relevant account.
………
Transaction Primitives:
Whenever a transaction takes place, the following transaction primitives are applied to
it:
BEGIN_TRANSACTION: This state indicates the beginning of the transaction.
READ OR WRITE: These specify read or write operations on the database items that
are executed as a part of a transaction.
END_TRANSACTION: This specifies that READ and WRITE transaction operations
have ended and marks the end of transaction execution. However, at this point it may be
necessary to check whether the changes introduced by the transaction can be permanently
applied to the database or whether the transaction has to be aborted because it violates
serializability.
serializability.
COMMIT_TRANSACTION: This signals a successful end of the transaction, so that
any changes executed by the transaction can be safely committed to the database and
will not be undone.
ROLLBACK (Abort): This signals that the transaction has ended unsuccessfully, so
that any changes that are applied on the transaction must be undone.
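An illustrative sketch of how a client would bracket an operation with these primitives so that it is all-or-nothing. The primitive names follow the notes, but the banking API and the in-memory "database" are assumptions made up for the example.

# Sketch of BEGIN / WRITE / COMMIT / ROLLBACK around a funds transfer.
accounts = {"A": 100, "B": 50}

def transfer(src: str, dst: str, amount: int) -> bool:
    snapshot = dict(accounts)             # BEGIN_TRANSACTION: remember the old state
    try:
        accounts[src] -= amount           # WRITE
        accounts[dst] += amount           # WRITE
        if accounts[src] < 0:             # END_TRANSACTION check: change is not acceptable
            raise ValueError("insufficient funds")
        return True                       # COMMIT_TRANSACTION: changes are kept
    except Exception:
        accounts.clear()
        accounts.update(snapshot)         # ROLLBACK: undo every change
        return False

print(transfer("A", "B", 30), accounts)   # True  {'A': 70, 'B': 80}
print(transfer("A", "B", 999), accounts)  # False {'A': 70, 'B': 80}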
Properties of a Transaction (ACID): Transactions are associated with the following
properties, which are often referred to as the ACID properties.
Atomicity: A transaction is an atomic unit of processing, i.e. either all of the operations
of a transaction are performed or none of them are. If a transaction fails partway through,
any changes it has already made must be undone, so that it appears as though the
transaction was never started.
Consistency Preservation: A transaction is consistent if its complete execution takes the
database from one consistent state to another consistent state, i.e. if we make
changes and, even after making those changes, the data is still correct, then we say that
consistency is preserved.
Isolation: A transaction should appear as though it is being executed in isolation from
other transactions, i.e. the execution of a transaction must not affect the execution and
output of other transactions.
Durability or permanency: Any changes made to the database by a committed transaction
must be stored permanently; these changes must not be lost because of any failure.
Nested transactions:
Nested transactions extend the above transaction model by allowing transactions to be
composed of other transactions. Thus several transactions may be started from within a
transaction, allowing transactions to be regarded as modules that can be composed as
required.
………
A subtransaction appears atomic to its parent with respect to transaction failures and to
concurrent access. Subtransactions at the same level, such as T1 and T2, can run
concurrently, but their access to common objects is serialized, for example by a locking
scheme.
• Subtransactions at one level (and their descendants) may run concurrently with
other subtransactions at the same level in the hierarchy. This can allow additional
concurrency in a transaction.
• Subtransactions can commit or abort independently. In comparison with a single
transaction, a set of nested subtransactions is potentially more robust.
Implementation of Transactions:
The two phase commit protocol:
A client's request to commit (or abort) a transaction is directed to the coordinator. If the
client requests abortTransaction, or if the transaction is aborted by one of the
participants, the coordinator informs the participants immediately.
………
If a participant can commit its part of the transaction, it will agree as soon as it has
recorded the changes and its status in permanent storage and is prepared to commit.
The coordinator in a distributed transaction communicates with the participants to carry
out the two-phase commit protocol.
The two-phase commit protocol consists of a voting phase and a completion phase. By
the end of step (2) the coordinator and all the participants that voted Yes are prepared to
commit. By the end of step (3) the transaction is effectively completed. At step (3a) the
coordinator and the participants are committed, so the coordinator can report a decision
to commit to the client. At step (3b) the coordinator reports a decision to abort to the client.
………
The two-phase commit protocol:
Phase 1 (voting phase):
• The coordinator sends a canCommit? request to each of the participants in the
transaction.
• When a participant receives a canCommit? request, it replies with its vote (Yes
or No) to the coordinator. Before voting Yes, it prepares to commit by saving
objects in permanent storage. If the vote is No, the participant aborts
immediately.
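A hedged sketch of the two-phase commit exchange just described: the coordinator collects canCommit? votes in the voting phase and then tells every participant to commit or abort in the completion phase. The participants here are simple local objects with made-up method names standing in for the real request-reply messages.

# Sketch of a coordinator running two-phase commit over local participant objects.
class Participant:
    def __init__(self, name: str, will_commit: bool) -> None:
        self.name, self.will_commit = name, will_commit

    def can_commit(self) -> bool:
        # Before voting Yes, the participant saves its objects in permanent storage.
        return self.will_commit

    def do_commit(self) -> None:
        print(f"{self.name}: committed")

    def do_abort(self) -> None:
        print(f"{self.name}: aborted")

def two_phase_commit(participants: list) -> bool:
    # Phase 1 (voting): ask every participant whether it can commit.
    votes = [p.can_commit() for p in participants]
    # Phase 2 (completion): commit only if every vote was Yes, otherwise abort.
    if all(votes):
        for p in participants:
            p.do_commit()
        return True
    for p in participants:
        p.do_abort()
    return False

two_phase_commit([Participant("server-1", True), Participant("server-2", True)])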
Concurrency control:
When multiple transactions are executing simultaneously in different processes, some
mechanism is needed to keep them out of each other’s way. That mechanism is called a
concurrency control.
Locking:
In a distributed transaction, the locks on an object are held locally (in the same server).
The local lock manager can decide whether to grant a lock or make the requesting
transaction wait.
………
When locking is used for concurrency control, the objects remain locked and are
unavailable to other transactions during the atomic commit protocol, although an … … …
A wait-for graph is a directed graph in which nodes represent transactions waiting for an
object. There is a deadlock if and only if there is a cycle in the wait-for graph.
Definition of deadlock:
Deadlock is a state in which each member of a group of transactions is waiting for
some other member to release a lock. A wait for graph can be used to represent the
waiting relationships between current transactions. In a wait for graph the nodes
represent transactions and the edges represent wait-for relationships between
transactions – there is an edge from node T to node U when transaction T is waiting for
transaction U to release a lock.
Deadlock prevention:
One solution is to prevent deadlock. An apparently simple but not very good way to
prevent deadlock is to lock all of the objects used by a transaction when it
starts. This would need to be done as a single atomic step so as to avoid deadlock at this
stage. Such a transaction cannot run into deadlock with other transactions, but it
unnecessarily restricts access to shared resources. In addition, it is sometimes
impossible to predict at the start of a transaction which objects will be used. Deadlock
can also be prevented by requesting locks on objects in a predefined order, but this can
result in premature locking and a reduction in concurrency.
Deadlock detection:
Deadlocks may be detected by finding cycles in the wait-for graph. Having detected a
deadlock, a transaction must be selected for abortion to break the cycle.
The software responsible for deadlock detection can be part of the lock manager. It
must hold a representation of the wait-for graph so that it can check it for cycles from
time to time. Edges are added to the graph and removed from the graph by the lock
manager's setLock and unLock operations. At the point illustrated by Figure 13.22 on the
left it will have the following information: an edge T → U is added whenever the lock
………
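A small sketch of deadlock detection on the wait-for graph described above: nodes are transactions, an edge T → U means "T waits for U", and a deadlock exists if and only if the graph contains a cycle. The example graph is invented for illustration.

# Sketch of cycle detection over a wait-for graph (depth-first search).
def has_cycle(wait_for: dict) -> bool:
    """Return True if the wait-for graph contains a cycle (i.e. a deadlock)."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in wait_for}

    def visit(t) -> bool:
        colour[t] = GREY
        for u in wait_for.get(t, set()):
            if colour.get(u, WHITE) == GREY:      # back edge: T is transitively waiting on itself
                return True
            if colour.get(u, WHITE) == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in wait_for)

# T waits for U, U waits for V, V waits for T: a deadlock.
print(has_cycle({"T": {"U"}, "U": {"V"}, "V": {"T"}}))  # True
print(has_cycle({"T": {"U"}, "U": {"V"}, "V": set()}))  # False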
Timeouts:
Lock timeouts are a method for the resolution of deadlocks that is commonly used. Each
lock is given a limited period in which it is invulnerable. After this time, a lock
becomes vulnerable. Provided that no other transaction is competing for the object that
is locked, an object with a vulnerable lock remains locked. However, if any other
transaction is waiting to access the object protected by a vulnerable lock, the lock is
broken (that is, the object is unlocked) and the waiting transaction resumes. … … …
Threads were invented to allow parallelism to be combined with sequential execution and
blocking system calls. Here one thread, the dispatcher, reads incoming requests for
work from the system mailbox.
After examining the request, it chooses an idle (i.e., blocked) worker thread and hands it
the request, possibly by writing a pointer to the message into a special word associated
with each thread. The dispatcher then wakes up the sleeping worker (e.g., by doing an UP
on the semaphore on which it is sleeping).
………
The scheduler will now be invoked and another thread will be started, possibly the
dispatcher, in order to acquire more work, or possibly another worker that is now ready
to run.
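A hedged sketch of the dispatcher/worker structure described above, using Python queues in place of the system mailbox and semaphores: the dispatcher thread reads incoming requests and hands each one to an idle worker, and workers block while they have nothing to do. The queue names and counts are illustrative.

# Sketch of the dispatcher/worker thread model.
import queue
import threading
import time

incoming = queue.Queue()          # stands in for the system mailbox
work_for_worker = queue.Queue()   # requests handed to idle workers

def dispatcher() -> None:
    while True:
        request = incoming.get()          # read the next incoming request
        work_for_worker.put(request)      # hand it to an idle worker and wake it up

def worker(worker_id: int) -> None:
    while True:
        request = work_for_worker.get()   # blocked (idle) until work arrives
        print(f"worker {worker_id} handling {request!r}")

threading.Thread(target=dispatcher, daemon=True).start()
for i in range(3):
    threading.Thread(target=worker, args=(i,), daemon=True).start()

for n in range(5):
    incoming.put(f"request-{n}")          # clients deposit requests in the mailbox
time.sleep(0.5)                           # let the daemon threads drain the queues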
Design Issues for Threads Package:
A set of primitives (e.g., library calls) available to the user relating to threads is called a
threads package.
The first issue we will consider is thread management. Two alternatives are
possible here: static threads and dynamic threads.
With a static design, the choice of how many threads there will be is made when the
program is written or when it is compiled. Each thread is allocated a fixed stack. This
approach is simple, but inflexible.
Implementing a Threads Package:
Implementing Threads in User Space:
The first method is to put the threads package entirely in user space. The kernel knows
nothing about them. As far as the kernel is concerned, it is managing ordinary, single-
threaded processes.
The first, and most obvious, advantage is that a user-level threads package can be
implemented on an operating system that does not support threads. The threads run on
top of a runtime system, which is a collection of procedures that manage threads. When a
thread executes a system call, goes to sleep, performs an operation on a semaphore or
mutex, or otherwise does something that may cause it to be suspended, it calls a … … …
other may be in public areas and have several different users during the course of a
day.
… … …Draw Diagram WM - 1
Diskless Workstations:
Workstations that do not have a local hard disk are referred to as diskless workstations.
If the workstations are diskless, the files are stored in a remote file server. Requests to
read and write files are sent to a file server, which performs the work and sends back
the replies.
Diskless workstations are popular at universities and companies. Having a large number
of workstations equipped with small, slow disks is typically much more expensive than
having one or two file servers equipped with huge, fast disks that are accessed over the
LAN.
Diskful Workstations:
Workstations that have a local hard disk are referred to as diskful (or disky)
workstations. There are a number of different models of diskful workstations.
In the first model all the user files are kept on a central file server, and the local disks
are used for paging, and for storing temporary files.
………
A third approach to using local disks is to use them as explicit caches (in addition to using
them for paging, temporaries, and binaries). In this mode of operation, users can download
files from the file servers to their own disks, read and write them locally, and then
upload the modified ones at the end of the login session.
In a fourth model, each machine can have its own self-contained file system, with the
possibility of mounting or otherwise accessing other machines' file systems. The idea here
is that each machine is basically self-contained and that contact with the outside world is
limited.
Using Idle Workstations:
If the user is not using the workstation, or has not used it for some time, that
workstation is said to be an idle workstation. Much of the time workstations sit idle,
especially at universities, which is a waste of resources; therefore there must be some
means of putting the idle workstations to use.
………
Server Driven: When a workstation goes idle, it announces its availability by entering
its name, network address, and properties in a registry file (or database).
An alternative way for the newly idle workstation to announce the fact that it has
become unemployed is to put a broadcast message onto the network. All other
workstations then record this fact.
Client Driven: The other way to locate idle workstations is to use a client-driven
approach. When remote is invoked, it broadcasts a request saying what program it
wants to run, how much memory it needs, whether or not floating point is needed, and
so on. These details are not needed if all the workstations are identical, but if the
system is heterogeneous and not every program can run on every workstation, they are
essential.
When the replies come back, the idle workstation picks one and sets it up.
2. The Processor Pool Model:
Although using idle workstations adds a little computing power to the system, it does
not address a more fundamental issue: what happens when it is feasible to provide 10
or 100 times as many CPUs as there are active users?
An alternative approach is to construct a processor pool, a rack full of CPUs in the
machine room, which can be dynamically allocated to users on demand.
The motivation for the processor pool idea comes from taking the diskless workstation
idea a step further.
………
In effect, we are converting all the computing power into "idle workstations" that can
be accessed dynamically. Users can be assigned as many CPUs as they need for short
periods, after which they are returned to the pool so that other users can have them.
There is no concept of ownership here: all the processors belong equally to everyone.
Draw Diagram WM - 2
A Hybrid Model:
A possible compromise is to provide each user with a personal workstation and to have
a processor pool in addition. Although this solution is more expensive than either a
pure workstation model or a pure processor pool model, it combines the advantages of
both of the others.
to share its load, then it is sender-initiated. But if a workstation is idle, it
sends a request to other workstations announcing that it is idle and that they can send
any additional work to it; this is receiver-initiated.
Example Processor Allocation Algorithms:
1. A Graph-Theoretic Deterministic Algorithm:
A widely studied class of algorithms applies to systems consisting of processes with
known CPU and memory requirements, and a known matrix giving the average
amount of traffic between each pair of processes. … … ….
Draw Diagram WM – 2 (a)
In the above diagram the graph is partitioned with processes A, E, and G on one
processor, processes B, F, and H on a second, and processes C, D, and I on the third. The
total network traffic is the sum of the arcs intersected by the dotted cut lines, or
30 units.
Draw Diagram WM – 2 (b)
In the above diagram we have a different partitioning that gives only 28 units of
network traffic. Assuming that it meets all the memory and CPU constraints, this
is a better choice because it requires less communication.
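An illustrative sketch of the graph-theoretic idea above: given the traffic matrix between processes and an assignment of processes to processors, the network traffic of a partitioning is the total weight of the arcs cut by the processor boundaries. The tiny traffic table below is invented, not the one from the diagram.

# Sketch: cost (inter-processor traffic) of a process-to-processor assignment.
def network_traffic(traffic: dict, assignment: dict) -> int:
    """Sum the traffic on arcs whose endpoints sit on different processors."""
    return sum(weight for (a, b), weight in traffic.items()
               if assignment[a] != assignment[b])

traffic = {("A", "B"): 3, ("B", "C"): 2, ("A", "C"): 4, ("C", "D"): 1}
print(network_traffic(traffic, {"A": 0, "B": 0, "C": 1, "D": 1}))  # 2 + 4 = 6
print(network_traffic(traffic, {"A": 0, "B": 1, "C": 1, "D": 1}))  # 3 + 4 = 7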
2. A Centralized Algorithm:
The unusual thing about this algorithm, and the reason that it is centralized, is
that instead of trying to maximize CPU Utilization, it is concerned with giving
each workstation owner a fair share of the computing power.
Whereas other algorithms will happily let one user take over all the machines if
he promises to keep them all busy (i.e., achieve a high CPU utilization), this
algorithm is designed to prevent precisely that.
………
Usage table entries can be positive, zero, or negative. A positive score indicates
that the workstation is a net user of system resources, whereas a negative score
means that it needs resources. A zero score is neutral.
3. A Hierarchical Algorithm:
One approach that has been proposed for keeping tabs on a collection of
processors is to organize them in a logical hierarchy independent of the physical
structure of the network.
Draw Diagram WM - 3
For each group of k workers, one manager machine (the "department head") is
assigned the task of keeping track of who is busy and who is idle. If the system
is large, there will be an unwieldy number of department heads, so some
machines will function as "deans," each riding herd on some number of
department heads. If there are many deans, they too can be organized
hierarchically, with a "big cheese" keeping tabs on a collection of deans.
This hierarchy can be extended ad infinitum, with the number of levels needed
growing logarithmically with the number of workers.
4. A Receiver-Initiated Distributed Heuristic Algorithm:
With this algorithm, whenever a process finishes the system checks to see if it
has enough work. If not, it picks some machine at random and asks it for work.
If that machine has nothing to offer, a second, and then a third machine is asked.
If no work is found with N probes, the receiver temporarily stops asking, does
any work it has queued up, and tries again when the next process finishes. If no
work is available, the machine goes idle. After some fixed time interval, it
begins probing again.
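A minimal sketch of the receiver-initiated heuristic above: an underloaded machine probes a few randomly chosen machines asking for work, and if none has any, it goes idle and retries later. The load table, threshold and probe count are illustrative assumptions.

# Sketch of receiver-initiated probing for work.
import random

N_PROBES = 3

def ask_for_work(load: dict, threshold: int = 2):
    """Return the name of a machine willing to transfer work, or None."""
    machines = list(load)
    for _ in range(N_PROBES):
        candidate = random.choice(machines)
        if load[candidate] > threshold:   # candidate is overloaded: it hands over work
            load[candidate] -= 1
            return candidate
    return None                           # no work found: go idle, probe again later

print(ask_for_work({"m1": 0, "m2": 5, "m3": 1}))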
5. A Bidding Algorithm:
Each processor advertises its approximate price by putting it in a publicly
readable file. This price is not guaranteed, but gives an indication of what the
service is worth (actually, it is the price that the last customer paid).
Different processors may have different prices, depending on their speed,
memory size, presence of floating-point hardware, and other features. An
indication of the service provided, such as expected response time, can also be
published.
When a process wants to start up a child process, it goes around and checks out who
is currently offering the service that it needs. It then determines the set of
processors whose services it can afford.
………
Write about Real Time Distributed Systems?
Real Time Distributed Systems:
Real-time programs (and systems) interact with the external world in a way
that involves time. When a stimulus appears, the system must respond to it in a
certain way and before a certain deadline. If the correct answer is delivered,
but after the deadline, the system is regarded as having failed. When the answer
is produced is as important as which answer is produced.
Real-time systems are generally split into two types depending on how serious
their deadlines are and the consequences of missing one. These are:
Soft Real Time System: Soft real-time means that missing an occasional
deadline is all right. For example, a telephone switch might be permitted to lose
or misroute one call in 10^5 under overload conditions and still be within
specification.
Hard Real Time System: In contrast, even a single missed deadline in a hard
real-time system is unacceptable, as this might lead to loss of life or an
environmental catastrophe. In practice, there are also intermediate systems
where missing a deadline means you have to kill off the current activity, but the
consequence is not fatal. For example, if a soda bottle on a conveyor belt has
passed by the nozzle, there is no point in continuing to squirt soda at it, but the
results are not fatal. Also, in some real-time systems, some subsystems are hard
real time whereas others are soft real time.
Design Issues:
During a sudden electric power failure, everyone grabs the telephone, possibly
causing the phone system to overload, even though it has its own independent
power generation system.
Language support:
The language should be designed so that the maximum execution time of every
task can be computed at compile time. This requirement means that the
language cannot support general while loops. Iteration must be done using for
loops with constant parameters.
Real-time languages need a way to deal with time itself. To start with, a special
variable, clock, should be available, containing the current time in ticks.
However, one has to be careful about the unit that time is expressed in. The finer
the resolution, the faster clock will overflow.
Real-Time Communication:
Communication in real-time distributed systems is different from
communication in other distributed systems.
Consider a token ring LAN. Whenever a processor has a packet to send, it waits
for the circulating token to pass by, then it captures the token, sends its
packet, and puts the token back on the ring so that the next machine downstream
gets the opportunity to seize it.
… … …Draw Diagram WM - 4
The other part of the file service is the directory service, which provides
operations for creating and deleting directories, naming and renaming files, and
moving them from one directory to another. The nature of the directory service
does not depend on whether individual files are transferred in their entirety or
accessed remotely.
The directory service defines some alphabet and syntax for composing file (and
directory) names. File names can typically be from 1 to some maximum number
of letters, numbers, and certain special characters. Some systems divide file
names into two parts, usually separated by a period, such as prog.c for a C
program or man.txt for a text file. The second part of the name, called the file
extension, identifies the file type.
All distributed systems allow directories to contain subdirectories, to make it
possible for users to group related files together.
Accordingly, operations are provided for creating and deleting directories as well
as entering, removing and looking up files in them. Normally, each subdirectory
contains all the files for one project, such as a large program or document (e.g., a
book).
When the (sub)directory is listed, only the relevant files are shown; unrelated
files are in other (sub)directories and do not clutter the listing. Subdirectories
can contain their own subdirectories and so on, leading to a tree of directories,
often called a hierarchical file system.
Naming Transparency:
The principal problem with this form of naming is that it is not fully transparent.
Two forms of transparency are relevant in this context and are worth
distinguishing. The first, location transparency, means that the path name
gives no hint as to where the file (or other object) is located.
On a desktop computer, the bus is typically etched onto the main board
(the parentboard), which holds the CPU and some of the memory, and into
which I/O cards are plugged. On minicomputers the bus is sometimes a flat cable
that wends its way among the processors, memories, and I/O controllers.
………
The disadvantage of having a single bus is that with as few as three or four CPUs
the bus is likely to become overloaded. The usual approach taken to reduce the
bus load is to equip each CPU with a snooping cache (sometimes called a
snoopy cache), so called because it “snoops” on the bus.
One particularly simple and common protocol is called write through. When a
CPU first reads a word from memory, that word is fetched over the bus and is
stored in the cache of the CPU making the request. If that word is needed again
later, the CPU can take it from the cache without making a memory request,
thus reducing bus traffic.
Each CPU does its caching independent of the others. Consequently, it is
possible for a particular word to be cached at two or more CPUs at the same
time.
Our protocol manages cache blocks, each of which can be in one of the
following three states:
Ring-Based Multiprocessors:
The next step along the path toward distributed shared memory systems is ring-
based multiprocessors. In Memnet, a single address space is divided into a
private part and a shared part.
… … …Draw Diagram SM - 3
The ring consists of 20 parallel wires, which together allow 16 data bits and 4
control bits to be sent every 100 nsec, for a data rate of 160 Mbps.
Unlike the bus-based multiprocessor, in Memnet there is no centralized global
memory. Instead, each 32-byte block in the shared address space has a home
machine on which physical memory is always reserved for it, in the Home
memory field.
A block may be cached on a machine other than its home machine; a read-only
block may be present on multiple machines; a read-write block may be present on
only one machine. In both cases, a block need not be present on its home
machine.
All the home machine does is provide a guaranteed place to store the block if no
other machine wants to cache it. This feature is needed because there is no global
memory. In effect, the global memory has been spread out over all the machines.
The Memnet device on each machine contains a table, which contains an entry
for each block in the shared address space, indexed by block number.
Each entry contains a Valid bit telling whether the block is present in the cache
and up to date; an Exclusive bit, specifying whether the local copy, if any, is the
only one; a Home bit, which is set only if this is the block's home machine; an
Interrupt bit, used for forcing interrupts; and a Location field that tells where the
block is located in the cache if it is present and valid.
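A sketch of the per-block table entry just described, written as a small record type. The field names follow the text; their widths, types and the table size are assumptions made for illustration.

# Sketch of the Memnet per-block table entry.
from dataclasses import dataclass

@dataclass
class MemnetTableEntry:
    valid: bool        # block is present in the cache and up to date
    exclusive: bool    # the local copy, if any, is the only one
    home: bool         # set only on the block's home machine
    interrupt: bool    # used for forcing interrupts
    location: int      # where the block sits in the cache, if present and valid

# One entry per block in the shared address space, indexed by block number.
table = {block_number: MemnetTableEntry(False, False, False, False, 0)
         for block_number in range(8)}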
As the packet passes around the ring, each Memnet device along the way checks
to see if it has the block needed. If so, it puts the block in the dummy field and
modifies the packet header to inhibit subsequent machines from doing so.
A problem arises if the requesting machine has no free space in its cache to hold
the incoming block. To make space, it picks a cached block at random and sends
it home, thus freeing up a cache slot. Blocks whose Home bit is set are never
chosen, since they are already home.
Switched Multiprocessors:
Although bus-based multiprocessors and ring-based multiprocessors work fine
for small systems, they do not scale well to systems with hundreds or thousands
of CPUs.
A different method is to build the system as a hierarchy. Continue to put some
number of CPUs on a single bus, but now regard this entire unit (CPUs plus bus) as
a cluster. Build the system as multiple clusters and connect the clusters using an
inter-cluster bus. As long as most CPUs communicate primarily within their own
cluster, there will be relatively little inter-cluster traffic.
If one inter-cluster bus proves to be inadequate, add a second inter-cluster bus, or
arrange the clusters in a tree or grid. If still more bandwidth is needed, collect a
bus, tree, or grid of clusters together into a supercluster, and break the system
into multiple superclusters.
Draw Diagram SM - 4