Chapter 1
A process is a program currently running. A file is a collection of data that can be referred to by
name. A process is associated with a file because the file stores the instructions that
are executed for that process to run.
The three major jobs of a computer are input, output, and processing. In computing,
input/output is the communication between an information processing system, such as a
computer, and the outside world, possibly a human or another information processing system.
Many kinds of input and output devices exist, including keyboards, monitors, printers, and so on.
It is up to the OS to manage those devices.
Computers contain large amounts of information that users often want to protect and keep
confidential. This information may include email, business plans, tax returns and much more.
Files in UNIX are protected by assigning each one a 9-bit binary protection code. The protection
code consists of three 3-bit fields: one for the owner, one for the owner's group, and one for
everyone else. Each 3-bit field holds the R, W, and X bits, so a field of rwx means that the
corresponding class can read, write, and execute the file.
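The 9-bit code is easiest to read in octal (three digits, one per field). A minimal Python sketch of decoding such a code; the helper name decode_mode is made up for illustration:

```python
def decode_mode(bits):
    """Decode a 9-bit UNIX protection code (e.g. 0o750) into rwx strings."""
    names = "rwx"
    fields = []
    for shift in (6, 3, 0):          # owner, group, everyone else
        field = (bits >> shift) & 0b111
        fields.append("".join(n if field & (1 << (2 - i)) else "-"
                              for i, n in enumerate(names)))
    return "/".join(fields)

# 0o750: owner may read/write/execute, group may read/execute, others nothing
print(decode_mode(0o750))  # rwx/r-x/---
```

The same encoding is what chmod's octal argument manipulates.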
28. Define the Shell
The OS is the code that carries out the system calls. In computing, a shell is a user interface for
accessing the operating system's services. In general, operating system shells use either a
command-line interface (CLI) or a graphical user interface (GUI), depending on a computer's role
and particular operation. Editors, compilers, assemblers, linkers, and command interpreters
are definitely not part of the OS, but they are important and useful.
An OS has two main functions: providing abstractions to user programs and managing the
computer's resources. A system call is a way for programs to interact with the operating
system. A computer program makes a system call when it makes a request to the operating
system's kernel. System calls provide the services of the operating system to user programs
via an Application Program Interface (API).
The interface between user programs and the OS deals primarily with abstractions. Modern
operating systems have system calls that perform the same functions even if the details differ.
System calls are usually made when a process in user mode requires access to a resource. It
then requests the kernel to provide the resource via a system call.
Typical situations that require system calls include:
- File systems: the creation or deletion of files, and reading from and writing to files.
- Creation and management of new processes.
- Network connections, including sending and receiving packets.
- Access to hardware devices such as a printer, scanner, etc.
Fork is the only way to create a process in POSIX. It creates an exact duplicate of the original
process, including all the file descriptors, registers, and everything else. After the fork, the
original process and the copy go their separate ways. All variables have identical values at the
time of the fork. The fork call returns a value, which is zero in the child and equal to the child's
process identifier (PID) in the parent.
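Python's os.fork is a thin wrapper over the POSIX call, so these semantics can be demonstrated directly (a Unix-only sketch):

```python
import os

# fork() creates an exact duplicate of the calling process; it returns 0
# in the child and the child's PID in the parent. Both copies resume
# right after the call with identical variable values.
x = 42
pid = os.fork()
if pid == 0:
    # Child: sees its own copy of x, identical at the moment of the fork
    os._exit(0 if x == 42 else 1)
# Parent: wait for the child and inspect its exit status
_, status = os.waitpid(pid, 0)
print("child pid:", pid, "exit code:", os.WEXITSTATUS(status))
```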
To read or write a file, the file must first be opened using open. This call specifies the name of
the file to be opened and a code: O_RDONLY, O_WRONLY, or O_RDWR, meaning open for
reading, writing, or both. To create a new file, the O_CREAT parameter is used.
The most heavily used calls are read and write.
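A small sketch of this open/write/read sequence, using Python's os module, whose functions are thin wrappers around the raw system calls; the scratch-file path is arbitrary:

```python
import os

# O_CREAT creates the file if it does not exist; O_WRONLY/O_RDONLY
# select the access mode; the final 0o644 is the protection code.
path = "/tmp/syscall_demo.txt"          # arbitrary scratch file
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
os.write(fd, b"hello, kernel\n")
os.close(fd)

fd = os.open(path, os.O_RDONLY)
data = os.read(fd, 100)                 # read up to 100 bytes
os.close(fd)
print(data)  # b'hello, kernel\n'
```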
30. Explain Monolithic Systems
A monolithic operating system is a very basic operating system in which file management,
memory management, device management, and process management are all directly controlled
within the kernel. The OS is written as a collection of procedures, linked together into a
single large executable binary program. When this technique is used, each procedure in the
system is free to call any other one.
The services (system calls) provided by the operating system are requested by putting the
parameters in a well-defined place and then executing a trap instruction.
The system fetches the parameters and determines which system call is to be carried out.
In this model, for each system call there is one service procedure that takes care of it and
executes it.
The operating system is split into various layers in the layered operating system, and each of
the layers has different functionalities.
Layering provides a distinct advantage in an operating system. All the layers can be defined
separately and interact with each other as required. Also, it is easier to create, maintain and
update the system if it is done in the form of layers. A change in one layer's specification does
not affect the rest of the layers.
Each of the layers in the operating system can only interact with the layers that are above and
below it. The lowest layer handles the hardware and the uppermost layer deals with the user
applications.
The first ever layered OS was THE; it had six layers, and today's operating systems take the
logic of those layers with few changes.
Details about the six layers are:
1. Hardware:
This layer interacts with the system hardware and coordinates with all the peripheral devices
used such as printer, mouse, keyboard, scanner etc. The hardware layer is the lowest layer in
the layered operating system architecture.
2. CPU Scheduling and Multiprogramming:
This layer deals with scheduling the processes for the CPU. There are many scheduling queues
that are used to handle processes. When the processes enter the system, they are put into the
job queue. The processes that are ready to execute in the main memory are kept in the ready
queue. In other words, this layer provides the basic multiprogramming of the CPU.
3. Memory Management:
Memory management deals with memory and the moving of processes from disk to primary
memory for execution and back again. This is handled by the third layer of the operating
system.
4. Process Management:
This layer is responsible for managing the processes i.e. assigning the processor to a process at
a time. This is known as process scheduling. The different algorithms used for process
scheduling are FCFS (first come first served), SJF (shortest job first), priority scheduling, round-
robin scheduling etc.
5. I/O Buffer:
I/O devices are very important in the computer systems. They provide users with the means of
interacting with the system. This layer handles the buffers for the I/O devices and makes sure
that they work correctly.
6. User Programs:
This is the highest layer in the layered operating system. This layer deals with the many user
programs and applications that run in an operating system such as word processors, games,
browsers etc.
32.2 Explain Microkernels!
Traditionally, all the layers went in the kernel, but that is not necessary. Putting as little as
possible in kernel mode is a good approach because bugs in the kernel can bring down the
entire system. In contrast, a bug in a user process may not be fatal.
The basic idea behind the microkernel design is to achieve high reliability by
splitting the operating system up into small modules, only one of
which—the microkernel—runs in kernel mode and the rest run as relatively powerless
ordinary user processes. By running each device driver and file
system as a separate user process, a bug in one of these can crash that component,
but cannot crash the entire system. Common desktop operating systems do not use
microkernels; microkernels are dominant in real-time, industrial, avionics, and military
applications that are mission critical.
A model of the microkernel idea is to distinguish the servers, each of which provides some
service, and the clients, which use these services. This model is known as the client-server
model. Communication between clients and servers is often by message passing. To obtain a
service, a client constructs a message saying what it wants and sends it to the server. The
server then does the work and sends back the answer.
Unlike all other operating systems, virtual machines are not extended machines, with
files and other nice features. Instead, they are exact copies of the bare hardware, including
kernel/user mode, I/O, interrupts, and everything else the real machine has.
Because each virtual machine is identical to the true hardware, each one can run any operating
system that will run directly on the hardware. Different virtual machines can, and frequently
do, run different operating systems.
CHAPTER 2
36.1 Explain process creation!
Four principal events cause processes to be created:
1. System initialization: When an operating system is booted, many processes are created.
Some of these are processes that interact with users and perform work for them. Others run in
the background and are not associated with users, but instead have some specific function.
2. Execution of a process-creation system call by a running process: Often a running process will
issue system calls to create one or more new processes to help it do its job. Creating new
processes is useful when the work to be done can easily be formulated in terms of several
related processes.
3. A user request to create a new process: In interactive systems, users can start a program by
typing a command or clicking on an icon. Taking either of these actions starts a new process
and runs the selected program in it.
4. Initiation of a batch job: This situation, in which processes are created, applies only to the
batch systems found on large mainframes.
In some systems, when a process creates another process, the parent process and child process
continue to be associated in certain ways. The child process can itself create more processes,
forming a process hierarchy.
In UNIX, a process and all its children and further descendants together form a process group.
When a user sends a signal from the keyboard, the signal is delivered to all members of the
process group. Each process can catch the signal, ignore the signal, or take the default
action, which is to be killed by the signal.
Windows has no concept of a process hierarchy.
Transition 1 occurs when the operating system discovers that a process cannot continue right
now. In some systems the process can execute a system call, such as pause, to get
into blocked state.
Transitions 2 and 3 are caused by the process scheduler, a part of the operating
system, without the process even knowing about them. Transition 2 occurs when
the scheduler decides that the running process has run long enough, and it is time
to let another process have some CPU time. Transition 3 occurs when all the other
processes have had their fair share and it is time for the first process to get the CPU
to run again.
Transition 4 occurs when the external event for which a process was waiting
(such as the arrival of some input) happens. If no other process is running at that
instant, transition 3 will be triggered and the process will start running. Otherwise
it may have to wait in ready state for a little while until the CPU is available and its
turn comes.
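The four transitions among the running, ready, and blocked states can be sketched as a small table; the state names and the move helper are illustrative, not from any real kernel:

```python
# The three process states and the four legal transitions described above.
TRANSITIONS = {
    ("running", "blocked"): 1,  # process blocks waiting for input
    ("running", "ready"):   2,  # scheduler picks another process
    ("ready",   "running"): 3,  # scheduler picks this process
    ("blocked", "ready"):   4,  # the awaited external event happens
}

def move(state, new_state):
    """Return the new state if the transition is legal, else raise."""
    if (state, new_state) not in TRANSITIONS:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

s = "running"
s = move(s, "blocked")   # transition 1
s = move(s, "ready")     # transition 4
s = move(s, "running")   # transition 3
print(s)  # running
```

Note that blocked-to-running directly is not in the table: a blocked process must first become ready.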
40.5 Explain the rationale behind Thread Usage
A thread can be simply explained as a process inside a process. Instead of thinking about
interrupts, timers, and context switches, we can think about parallel processes. Only now with
threads we add a new element: the ability for the parallel entities to share an address space
and all of its data among themselves. This ability is essential for certain applications, which is
why having multiple processes (with their separate address spaces) will not work.
The main reason for having threads is that in many applications, multiple activities are going on
at once. By dividing such an application into multiple sequential threads that run in quasi-
parallel, the programming model becomes simpler.
A second argument for having threads is that since they are lighter weight than
processes, they are easier to create and destroy than processes. In
many systems, creating a thread goes 10–100 times faster than creating a process.
A third reason for having threads is also a performance argument. Threads
produce no performance gain when all of them are CPU bound, but when there is considerable
computing and considerable I/O, having threads allows these activities to overlap, thus
speeding up the application.
Finally, threads are useful on systems with multiple CPUs, where real parallelism
is possible. For example, consider a word processor using three threads: the programming
model is much simpler. The first thread just interacts with the user. The second thread
reformats the document when told to. The third thread writes the contents of RAM to disk
periodically.
It should be clear that having three separate processes would not work here because
all three threads need to operate on the document. By having three threads
instead of three processes, they share a common memory and therefore all have access
to the document being edited. With three processes this would be impossible.
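A minimal Python sketch of this point: three threads, standing in for the edit/reformat/save threads, all mutate one shared document because they live in the same address space (the thread names and counts are arbitrary):

```python
import threading

# Three threads sharing one "document" (a plain list): all of them see
# the same object because threads share their process's address space.
document = []
lock = threading.Lock()

def append_lines(prefix, n):
    for i in range(n):
        with lock:                      # protect the shared structure
            document.append(f"{prefix}-{i}")

threads = [threading.Thread(target=append_lines, args=(name, 100))
           for name in ("edit", "reformat", "save")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(document))  # 300 -- every thread wrote into the same list
```

With three separate processes, each would have appended to its own private copy of the list instead.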
41. Why are Threads implemented in User Space (explain advantages and disadvantages)?
There are two main places to implement threads: user space and the kernel.
The first method is to put the threads package entirely in user space. The kernel
knows nothing about them. The first, and most obvious, advantage is that
a user-level threads package can be implemented on an operating system that does
not support threads. All operating systems used to fall into this category, and even
now some still do. With this approach, threads are implemented by a library.
When threads are managed in user space, each process needs its own private
thread table to keep track of the threads in that process. This table is analogous to
the kernel’s process table, except that it keeps track only of the per-thread properties,
such as each thread’s program counter, stack pointer, registers, state, and so
forth. The thread table is managed by the run-time system. When a thread is
moved to ready state or blocked state, the information needed to restart it is stored
in the thread table, exactly the same way as the kernel stores information about
processes in the process table.
When a thread does something that may cause it to become blocked locally, for
example, waiting for another thread in its process to complete some work, it calls a
run-time system procedure. This procedure checks to see if the thread must be put
into blocked state. If so, it stores the thread’s registers (i.e., its own) in the thread
table, looks in the table for a ready thread to run, and reloads the machine registers
with the new thread's saved values. User-level threads also have other advantages. They allow
each process to have its own customized scheduling algorithm.
Despite their better performance, user-level threads packages have some major
problems. First among these is the problem of how blocking system calls are implemented.
Suppose that a thread reads from the keyboard before any keys have
been hit. Letting the thread actually make the system call is unacceptable, since
this will stop all the threads. One of the main goals of having threads in the first
place was to allow each one to use blocking calls, but to prevent one blocked
thread from affecting the others. With blocking system calls, it is hard to see how
this goal can be achieved readily. Another problem with user-level thread packages is that if a
thread starts running, no other thread in that process will ever run unless the first thread
voluntarily gives up the CPU.
42.7 Why are Threads implemented in the Kernel (explain advantages and disadvantages)
There is no thread table in each process. Instead, the kernel has a thread table that keeps track
of all the threads in the system. The kernel’s thread table holds each thread’s registers, state,
and other information. The information is the same as with user-level threads, but now kept in
the kernel instead of in user space (inside the run-time system).
All calls that might block a thread are implemented as system calls, at considerably greater cost
than a call to a run-time system procedure. When a thread blocks, the kernel, at its option, can
run either another thread from the same process (if one is ready) or a thread from a different
process. With user-level threads, the run-time system keeps running threads from its own
process until the kernel takes the CPU away from it (or there are no ready threads left to run).
Kernel threads do not require any new, nonblocking system calls. In addition, if one thread in a
process causes a page fault, the kernel can easily check to see if the process has any other
runnable threads, and if so, run one of them while waiting for the required page to be brought
in from the disk. Their main disadvantage is that the cost of a system call is large.
Another problem is signals. Signals are sent to processes, not threads. Which thread should
handle an incoming signal? And what happens if two or more threads register for the same
signals?
An important example is how incoming messages, for example requests for service, are
handled. The traditional approach is to have a process or thread that is blocked on a receive
system call waiting for an incoming message. When a message arrives, it accepts the message,
unpacks it, examines the contents, and processes it. However, a completely different approach
is also possible, in which the arrival of a message causes the system to create a new thread to
handle the message. Such a thread is called a pop-up thread. A key advantage of pop-up threads is that
since they are brand new, they do not have any history— registers, stack, whatever—that must
be restored. Each one starts out fresh and each one is identical to all the others. This makes it
possible to create such a thread quickly. The new thread is given the incoming message to
process.
First-Come First-Served – probably the simplest of all scheduling algorithms. With this
algorithm, processes are assigned the CPU in the order they request it. Basically, there is a
single queue of ready processes. When the first job enters the system from the outside in the
morning, it is started immediately and allowed to run as long as it wants to. It is not
interrupted because it has run too long. As other jobs come in, they are put onto the end of
the queue. When the running process blocks, the first process on the queue is run next. When
a blocked process becomes ready, like a newly arrived job, it is put on the end of the queue.
Shortest Job First – non-preemptive batch algorithm that assumes the run times are known
in advance. It selects for execution the waiting process with the smallest execution time.
Shortest Remaining Time Next – a preemptive version of shortest job first is shortest
remaining time next. With this algorithm, the scheduler always chooses the process whose
remaining run time is the shortest. Again here the run time must be known in advance.
Round-Robin Scheduling – Each process is assigned a time interval, called its quantum, during
which it is allowed to run.
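A round-robin scheduler can be sketched in a few lines; the burst times and the quantum below are arbitrary:

```python
from collections import deque

def round_robin(bursts, quantum):
    """Simulate round-robin scheduling. bursts maps process name to its
    remaining CPU time; returns the order in which processes finish."""
    queue = deque(bursts.items())
    finished = []
    while queue:
        name, remaining = queue.popleft()
        if remaining <= quantum:
            finished.append(name)                      # process completes
        else:
            queue.append((name, remaining - quantum))  # back of the queue
    return finished

print(round_robin({"A": 5, "B": 2, "C": 9}, 3))  # ['B', 'A', 'C']
```

B finishes within its first quantum; A needs two turns; C needs three.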
Priority Scheduling – Each process is assigned a priority, and the runnable process with the
highest priority is allowed to run.
Multiple Queues – scheduling algorithm is used in scenarios where the processes can be
classified into groups based on property like process type, CPU time, IO access, memory size,
etc.
Shortest Process Next – figuring out which of the currently runnable processes is the shortest
one. One approach is to make estimates based on past behavior and run the process with the
shortest estimated running time.
Guaranteed Scheduling – make real promises to the users about performance. Guarantees
fairness by monitoring the amount of CPU time spent by each user and allocating resources
accordingly.
Lottery Scheduling - the basic idea is to give processes lottery tickets for various system
resources, such as CPU time. Whenever a scheduling decision has to be made, a lottery ticket
is chosen at random, and the process holding that ticket gets the resource.
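A sketch of a single lottery draw, repeated many times to show the proportional outcome (the ticket counts and the fixed seed are arbitrary):

```python
import random

def lottery_pick(tickets, rng):
    """Pick the winning process. tickets maps process name to ticket
    count; a process holding 3x the tickets wins about 3x as often."""
    total = sum(tickets.values())
    winner = rng.randrange(total)       # draw one ticket at random
    for name, count in tickets.items():
        if winner < count:
            return name
        winner -= count

rng = random.Random(0)                  # fixed seed for reproducibility
wins = {"A": 0, "B": 0}
for _ in range(10000):
    wins[lottery_pick({"A": 75, "B": 25}, rng)] += 1
print(wins)   # A wins roughly three times as often as B
```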
Fair-Share Scheduling – each user is allocated some fraction of the CPU and the scheduler
picks the processes in such a way as to enforce it. Thus if two users have each been promised
50% of the CPU, they will each get that, no matter how many processes they have in
existence.
54.19. Explain Scheduling in Real-Time Systems
Mechanism versus policy
Thread Scheduling – when several processes each have multiple threads, we have
two levels of parallelism present: processes and threads. Scheduling in such
systems differs substantially depending on whether user-level threads or kernel-
level threads are supported.
55.20. Explain in detail The Dining Philosophers Problem and offered solutions
The problem can be stated quite simply as follows. Five philosophers are seated around a circular
table. Each philosopher has a plate of spaghetti. The spaghetti is so slippery that a philosopher
needs two forks to eat it. Between each pair of plates is one fork. The layout of the table is
illustrated below:
The life of a philosopher consists of alternate periods of eating and thinking. When a
philosopher gets hungry, she tries to acquire her left and right forks, one at a time in either
order. If successful in acquiring two forks, she eats for a while, then puts down the forks, and
continues to think. The key question is: can you write a program for each philosopher that does
what it is supposed to do and never gets stuck? Suppose that all five philosophers take their
left forks simultaneously. None of them will be able to take their right forks, and there will be a
deadlock.
A solution that has no deadlock and no starvation is to protect the five statements following
the call to think with a binary semaphore. Before starting to acquire forks, a philosopher would
do a down on mutex. After replacing the forks, she would do an up on mutex. From a theoretical
viewpoint, this solution is adequate. From a practical one, it has a performance bug: only one
philosopher can be eating at any instant. With five forks available, we should be able to allow
two philosophers to eat at the same time.
56.21. Explain in detail The Readers and Writers Problem and offered solutions
The Readers and Writers Problem models access to a database. Imagine, for example, an airline
reservation system, with many competing processes wishing to read and write it. It is
acceptable to have multiple processes reading the database at the same time, but if one process
is updating (writing) the database, no other processes may have access to the database, not
even readers. The question is: how do you program the readers and the writers?
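One classic solution gives the first reader the job of locking out writers and the last reader the job of letting them back in. A Python sketch; the database contents and thread counts are arbitrary:

```python
import threading

db = {"seats": 100}              # toy stand-in for the reservation database
mutex = threading.Lock()         # protects read_count
db_lock = threading.Lock()       # grants exclusive access to the database
read_count = 0

def reader():
    global read_count
    with mutex:
        read_count += 1
        if read_count == 1:      # first reader in locks out writers
            db_lock.acquire()
    _ = db["seats"]              # many readers may be in here at once
    with mutex:
        read_count -= 1
        if read_count == 0:      # last reader out lets writers in
            db_lock.release()

def writer():
    with db_lock:                # writers get fully exclusive access
        db["seats"] -= 1

threads = ([threading.Thread(target=reader) for _ in range(20)] +
           [threading.Thread(target=writer) for _ in range(5)])
for t in threads:
    t.start()
for t in threads:
    t.join()
print(db["seats"])  # 95
```

Note this variant favors readers: a steady stream of readers can starve writers.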
CHAPTER 3
57.1. When dealing with memory, one way to manage it is to have no abstraction at all. Explain
this approach in detail and its key disadvantages.
In the early days, it was not possible to have two running programs in memory at the same
time. If the first program wrote a new value to, say, location 2000, this would erase whatever
value the second program was storing there. Both programs would crash almost immediately.
Generally, only one process at a time can be running. As soon as the user types a command, the
operating system copies the requested program from disk to memory and executes it. When
the process finishes, the operating system displays a prompt character and waits for a new
command. When it receives one, it loads a new program into memory, overwriting the first
one. One way to get some
parallelism in a system with no memory abstraction is to program with multiple threads. Since all
threads in a process are supposed to see the same memory image, the fact that they are forced
to is not a problem. A key disadvantage of no memory abstraction is that a bug in the user
program can wipe out the operating system, possibly with disastrous results (such as garbling the
disk).
58.2. Is it possible to run multiple programs without memory abstraction? Explain which ways
it can be done (hint: there are at least 2 approaches).
However, even with no memory abstraction, it is possible to run multiple programs at the same
time. What the operating system has to do is save the entire contents of memory to a disk file,
then bring in and run the next program. As long as there is only one program at a time in memory,
there are no conflicts. It can be done by use of swapping and virtual memory. Swapping strategy
consists of bringing in each process in its entirety, running it for a while, then putting it back on
the disk. The other strategy, called virtual memory, allows programs to run even when they are
only partially in main memory.
59.3 Explain the implementation of base and limit registers in memory management. How does
this improve the performance and what is its main drawback?
When base and limit registers are used, programs are loaded into consecutive memory locations
wherever there is room and without relocation during loading. When a process is run, the base
register is loaded with the physical address where its program begins in memory and the limit
register is loaded with the length of the program. Using base and limit registers is an easy way to
give each process its own private address space because every memory address generated
automatically has the base register contents added to it before being sent to memory. A
drawback is the need to perform an addition and a comparison on every memory reference.
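The translation just described can be sketched in a few lines; the base and limit values are arbitrary:

```python
def translate(virtual_addr, base, limit):
    """Add the base register and check against the limit register, as the
    hardware would do on every memory reference."""
    if virtual_addr >= limit:
        raise MemoryError("limit violation: address outside the process")
    return base + virtual_addr

# Process loaded at physical address 16384 with a 4096-byte program:
print(translate(100, base=16384, limit=4096))   # 16484
```

An address of 5000 would trip the limit check instead of silently reaching another process's memory.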
60.4. Explain swapping as a technique and how it is implemented.
The swapping technique deals with memory overload. It is also known as a technique for memory
compaction. Swapping strategy consists of bringing in each process in its entirety, running it for
a while, then putting it back on the disk. Idle processes are mostly stored on disk, so they do not
take up any memory when they are not running. The operation of a swapping system is illustrated
below. Initially, only process A is in memory. Then processes B and C are created or swapped in
from disk. In Fig. 3-4(d) A is swapped out to disk. Then D comes in and B goes out. Finally A comes
in again. Since A is now at a different location, addresses contained in it must be relocated, either
by software when it is swapped in or (more likely) by hardware during program execution. For
example, base and limit registers would work fine here.
61.5 How can free memory be managed (kept track of)? Explain how Bitmaps and Linked Lists
are implemented.
Memory is an important resource that must be carefully managed. The part of the operating
system that manages the memory hierarchy is called the memory manager. Its job is to keep
track of which parts of memory are in use and which parts are not in use, to allocate memory to
processes when they need it and deallocate it when they are done, and to manage swapping
between main memory and disk when main memory is too small to hold processes.
When memory is assigned dynamically, the operating system must manage it. In general
terms, there are two ways to keep track of memory usage: bitmaps and free lists.
Bitmaps - With a bitmap, memory is divided into allocation units as small as a few words and as
large as several kilobytes. Corresponding to each allocation unit is a bit in the bitmap, which is 0
if the unit is free and 1 if it is occupied. Figure 4 shows part of memory and the corresponding
bitmap. A bitmap provides a simple way to keep track of memory words in a fixed amount of
memory because the size of the bitmap depends only on the size of memory and the size of the
allocation unit.
Linked List - Another way of keeping track of memory is to maintain a linked list of allocated
and free memory segments, where a segment either contains a process or is an empty hole
between two processes. Each entry in the list specifies a hole (H) or process (P), the address at
which it starts, the length, and a pointer to the next entry.
Fig 4
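The core bitmap operation, searching for a run of k consecutive free allocation units, can be sketched as follows; the example bitmap is made up:

```python
def find_run(bitmap, k):
    """Scan the bitmap for k consecutive free (0) allocation units;
    return the start index, or -1 if no such run exists. This linear
    search is the bitmap's main cost."""
    run = 0
    for i, bit in enumerate(bitmap):
        run = run + 1 if bit == 0 else 0
        if run == k:
            return i - k + 1
    return -1

bitmap = [1, 1, 0, 0, 0, 1, 0, 0, 1]   # 1 = occupied, 0 = free
start = find_run(bitmap, 3)
print(start)  # 2 -- units 2..4 are free
for i in range(start, start + 3):
    bitmap[i] = 1                       # mark the units as allocated
```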
62.6 What are the main advantages and disadvantages of Bitmaps and Linked Lists?
A bitmap provides a simple way to keep track of memory words in a fixed amount of memory
because the size of the bitmap depends only on the size of memory and the size of the allocation
unit. The main problem with it is that when it has been decided to bring a k unit process into
memory, the memory manager must search the bitmap to find a run of k consecutive 0 bits in
the map. Searching a bitmap for a run of a given length is a slow operation.
Linked lists have the advantage that when a process terminates or is swapped out, updating the
list is straightforward. A terminating process normally has two neighbors (except when it is at
the very top or bottom of memory). These may be either processes or holes, leading to several
combinations. Updating the list requires replacing a process by a hole. When the processes and
holes are kept on a list sorted by address, several algorithms can be used to allocate memory for
a newly created process (or an existing process being swapped in from disk). We assume that
the memory manager knows how much memory to allocate. A linked list consumes more space
because every node requires an additional pointer storing the address of the next node.
Searching for a particular element is also time consuming, and reverse traversal of a singly
linked list is difficult.
63.7 Explain the FITS approaches to managing free memory (how they are
implemented, which is best, worst, etc.)
The simplest algorithm is first fit. The memory manager scans along the list of segments until it
finds a hole that is big enough. The hole is then broken up into two pieces, one for the process
and one for the unused memory, except in the statistically unlikely case of an exact fit. First fit
is a fast algorithm because it searches as little as possible.
A minor variation of first fit is next fit. It works the same way as first fit, except that it keeps
track of where it is whenever it finds a suitable hole. The next time it is called to find a hole, it
starts searching the list from the place where it left off last time, instead of always at the
beginning, as first fit does.
Another well-known and widely used algorithm is best fit. Best fit searches the entire list, from
beginning to end, and takes the smallest hole that is adequate. Rather than breaking up a big
hole that might be needed later, best fit tries to find a hole that is close to the actual size
needed, to best match the request and the available holes.
As an example of first fit and best fit, consider Fig. 4. If a block of size 2 is needed, first fit will
allocate the hole at 5, but best fit will allocate the hole at 18.
Best fit is slower than first fit because it must search the entire list every time it is called.
Somewhat surprisingly, it also results in more wasted memory than first fit or next fit because it
tends to fill up memory with tiny, useless holes. First fit generates larger holes on the average.
To get around the problem of breaking up nearly exact matches into a process and a tiny hole,
one could think about worst fit, that is, always take the largest available hole, so that the new
hole will be big enough to be useful. Simulation has shown that worst fit is not a very good idea
either.
Yet another allocation algorithm is quick fit, which maintains separate lists for some of the
more common sizes requested. For example, it might have a table with n entries, in which the
first entry is a pointer to the head of a list of 4-KB holes, the second entry is a pointer to a list of
8-KB holes, the third entry a pointer to 12-KB holes, and so on. Holes of, say, 21 KB, could be
put on either the 20-KB list or on a special list of odd-sized holes. With quick fit, finding a hole
of the required size is extremely fast, but it has the same disadvantage as all schemes that sort
by hole size, namely, when a process terminates or is swapped out, finding its neighbors to see
if a merge is possible is expensive. If merging is not done, memory will quickly fragment into a
large number of small holes into which no processes fit.
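First fit and best fit can be sketched over a hole list like the one in Fig. 4; the (start, length) pairs below are assumed, chosen so that a request of size 2 reproduces the example above (first fit takes the hole at 5, best fit the hole at 18):

```python
def first_fit(holes, size):
    """holes: list of (start, length) sorted by address.
    Return the start of the first hole big enough, or None."""
    for start, length in holes:
        if length >= size:
            return start
    return None

def best_fit(holes, size):
    """Scan the entire list and take the smallest adequate hole."""
    fitting = [(length, start) for start, length in holes if length >= size]
    return min(fitting)[1] if fitting else None

# Assumed hole list: hole of length 3 at 5, length 2 at 18, length 6 at 26
holes = [(5, 3), (18, 2), (26, 6)]
print(first_fit(holes, 2))  # 5  -- first hole that is big enough
print(best_fit(holes, 2))   # 18 -- smallest adequate hole (exact fit)
```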
65.9 Explain the Paging approach to managing memory (virtual memory). Give one
example of its implementation (with actual sizes of memory, size of pages, number of pages,
etc.). Choose the numbers at will, but make sure that your calculations are correct.
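One possible worked example, with freely chosen (hypothetical) sizes: a 64-KB virtual address space with 4-KB pages gives 16 virtual pages, and 32 KB of physical memory gives 8 page frames. The page-table contents below are made up for illustration:

```python
# Hypothetical configuration: 64-KB virtual address space, 4-KB pages,
# 32-KB physical memory -> 16 virtual pages, 8 page frames.
PAGE_SIZE = 4096
NUM_PAGES = 65536 // PAGE_SIZE          # 16 virtual pages
NUM_FRAMES = 32768 // PAGE_SIZE         # 8 page frames
page_table = {0: 2, 1: 1, 2: 6}         # made-up page -> frame mapping

def translate(vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[page]            # a missing entry would be a page fault
    return frame * PAGE_SIZE + offset

# Virtual address 8196 = page 2, offset 4 -> frame 6 -> physical 24580
print(translate(8196))  # 24580
```

The key calculation is splitting the virtual address into a page number (high bits) and an offset within the page (low bits), then replacing the page number with the frame number from the page table.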
66.10 Why and how are the Modified and Referenced bits used in managing memory pages?
Give an example.
The Modified and Referenced bits keep track of page usage. When a page is written to, the
hardware automatically sets the Modified bit. This bit is of value when the operating system
decides to reclaim a page frame. If the page in it has been modified (i.e., is "dirty"), it must be
written back to the disk. If it has not been modified (i.e., is "clean"), it can just be abandoned,
since the disk copy is still valid. The bit is sometimes called the dirty bit, since it reflects the
page's state.
The Referenced bit is set whenever a page is referenced, either for reading or writing. Its value
is to help the operating system choose a page to evict when a page fault occurs. Pages that are
not being used are better candidates than pages that are, and this bit plays an important role in
several of the page replacement algorithms that we will study later in this chapter.
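The role of the Modified (dirty) bit at eviction time can be sketched as follows; the `Page` class and names are illustrative, not an actual OS interface:

```python
# Sketch: using the Modified (dirty) bit when reclaiming a page frame.
# The Page class and the writes log are illustrative only.

class Page:
    def __init__(self, modified=False, referenced=False):
        self.modified = modified
        self.referenced = referenced

writes = []

def evict(page):
    """Write the page back only if it is dirty; clean pages are just dropped."""
    if page.modified:
        writes.append("write-back")   # disk copy is stale and must be updated
    return "freed"                    # clean pages need no disk I/O

evict(Page(modified=True))   # dirty page: causes a disk write
evict(Page(modified=False))  # clean page: no disk I/O needed
print(writes)                # ['write-back']
```

Only the dirty page triggers a disk write, which is exactly the saving the Modified bit makes possible.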
67.11 Why and how are Translation Lookaside Buffers used? Give an example.
Since page tables are stored in main memory, each memory access a program makes requires
at least one extra memory access to translate the virtual address into a physical one before the
reference can be satisfied from the cache. On a cache miss, there will be two memory accesses.
The key to improving access performance is to exploit locality of reference in the page table:
when a translation for a virtual page is used, it will probably be needed again in the near
future, because references to the words on that page have both temporal and spatial locality.
The solution that has been devised is to equip computers with a small hardware device that
maps virtual addresses to physical addresses without going through the page table. The
device, called a TLB (Translation Lookaside Buffer) or sometimes an associative memory, is
usually inside the MMU and consists of a small number of entries, often eight and rarely more
than 64. Each entry contains information about one page, including the virtual page
number, a bit that is set when the page is modified, the protection code (read/write/execute
permissions), and the physical page frame in which the page is located. These fields have a one-
to-one correspondence with the fields in the page table, except for the virtual page number,
which is not needed in the page table. Another bit indicates whether the entry is valid (i.e., in
use) or not.
Example: https://www.youtube.com/watch?v=95QpHJX55bM
Steps in TLB hit:
1. CPU generates virtual address.
2. It is checked in TLB (present).
3. The corresponding frame number is retrieved, which tells where the page lies in main
memory.
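The steps above can be sketched as a lookup that consults a small TLB cache before falling back to the page table; all sizes and mappings here are illustrative:

```python
# Sketch of a TLB lookup in front of the page table (illustrative mappings).
tlb = {7: 30, 20: 12}               # virtual page -> frame, small fast cache
page_table = {7: 30, 20: 12, 3: 5}  # authoritative mapping in main memory
PAGE_SIZE = 4096

def translate(vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page in tlb:                         # TLB hit: no page-table access
        frame = tlb[page]
    else:                                   # TLB miss: walk the page table
        frame = page_table[page]
        tlb[page] = frame                   # cache the translation for next time
    return frame * PAGE_SIZE + offset

print(translate(3 * PAGE_SIZE + 100))  # miss: page table consulted
print(3 in tlb)                        # True -- translation now cached
```

A real TLB also checks the protection bits and handles eviction when it is full; this sketch shows only the hit/miss path.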
68.12 Give an example when Multilevel Page Tables are used. Why is this approach practical?
The secret to the multilevel page table method is to avoid keeping all the page tables in
memory all the time. In particular, those that are not needed should not be kept around.
1. With a single-level page table, the entire table must be resident to access even a small
amount of data. For example, with 2^20 pages and each page table entry (PTE) occupying
4 bytes, the space required is 2^20 * 4 bytes = 4 MB.
2. Multilevel paging is essentially paging the page tables themselves. The multilevel
organization lets the system locate the specific second-level table that covers the data, so
only that table (plus the top-level table) needs to be in memory while the process runs.
In the two-level case, you need the top-level page table and one of the 2^10 second-level page
tables. The top-level table occupies 2^10 * 4 bytes = 4 KB, and the one needed second-level
table also occupies 2^10 * 4 bytes = 4 KB.
Practical: The interesting thing to note is that although the address space may contain over a
million pages, only four page tables are actually needed: the top-level table, and the second-
level tables for 0 to 4M (for the program text), 4M to 8M (for the data), and the top 4M (for the
stack).
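The two-level address split used in the 2^10 / 2^10 / 12-bit example above can be sketched as simple bit arithmetic on a 32-bit virtual address:

```python
# Sketch: splitting a 32-bit virtual address for a two-level page table
# (10-bit top-level index, 10-bit second-level index, 12-bit offset,
# matching the 4-KB-page example in the text).

def split(vaddr):
    offset = vaddr & 0xFFF          # low 12 bits: offset within the page
    pt2 = (vaddr >> 12) & 0x3FF     # next 10 bits: second-level index
    pt1 = (vaddr >> 22) & 0x3FF     # top 10 bits: top-level index
    return pt1, pt2, offset

# 0x00403004 -> top-level entry 1, second-level entry 3, offset 4
print(split(0x00403004))  # (1, 3, 4)
```

The top-level index selects one of 1024 second-level tables, and only the tables actually indexed need to be in memory, which is what makes the approach practical.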
69.13 Why do Page Replacement Algorithms exist? What kind of problem do they solve?
When a page fault occurs, the operating system has to choose a page to evict (remove from
memory) to make room for the incoming page. If the page to be removed has been modified
while in memory, it must be rewritten to the disk to bring the disk copy up to date. If, however,
the page has not been changed (e.g., it contains program text), the disk copy is already up to
date, so no rewrite is needed. The page to be read in just overwrites the page being evicted.
While it would be possible to pick a random page to evict at each page fault, system
performance is much better if a page that is not heavily used is chosen. If a heavily used page is
removed, it will probably have to be brought back in quickly, resulting in extra overhead. Much
work has been done on the subject of page replacement algorithms, both theoretical and
experimental. Below we will describe some of the most important algorithms.
It is worth noting that the problem of "page replacement" occurs in other areas of computer
design as well. For example, most computers have one or more memory caches consisting of
recently used 32-byte or 64-byte memory blocks. When the cache is full, some block has to be
chosen for removal. This problem is precisely the same as page replacement except on a
shorter time scale (it has to be done in a few nanoseconds, not milliseconds as with page
replacement). The reason for the shorter time scale is that cache block misses are satisfied from
main memory, which has no seek time and no rotational latency.
A second example is in a Web server. The server can keep a certain number of heavily used
Web pages in its memory cache. However, when the memory cache is full, and a new page is
referenced, a decision has to be made which Web page to evict. The considerations are similar
to pages of virtual memory, except for the fact that the Web pages are never modified in the
cache, so there is always a fresh copy "on disk." In a virtual memory system, pages in main
memory may be either clean or dirty. In all the page replacement algorithms to be studied
below, a certain issue arises: when a page is to be evicted from memory, does it have to be one
of the faulting process' own pages, or can it be a page belonging to another process? In the
former case, we are effectively limiting each process to a fixed number of pages; in the latter
case we are not. Both are possibilities.
70.14 List and explain all Page Replacement Algorithms you know.
The Optimal Page Replacement - At the moment of a page fault, imagine labeling each page in
memory with the number of instructions that will be executed before that page is next
referenced. The optimal page replacement algorithm says that the page with the highest label
should be removed. If one page will not be used for 8 million instructions and another page
will not be used for 6 million instructions, removing the former pushes the page fault that will
fetch it back as far into the future as possible. Computers, like people, try to put off unpleasant
events for as long as they can.
The only problem with this algorithm is that it is unrealizable. At the time of the page fault, the
operating system has no way of knowing when each of the pages will be referenced next. (We
saw a similar situation earlier with the shortest job first scheduling algorithm, how can the
system tell which job is shortest?) Still, by running a program on a simulator and keeping track
of all page references, it is possible to implement optimal page replacement on the second run
by using the page reference information collected during the first run.
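Running the algorithm over a recorded reference string, as described above, can be sketched like this; the reference string and frame count are illustrative:

```python
# Sketch of optimal replacement on a recorded reference string: evict the
# resident page whose next use lies farthest in the future.

def optimal_faults(refs, nframes):
    frames, faults = [], 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < nframes:
            frames.append(page)
        else:
            # Distance to next use; pages never used again are evicted first.
            def next_use(p):
                rest = refs[i + 1:]
                return rest.index(p) if p in rest else float("inf")
            frames.remove(max(frames, key=next_use))
            frames.append(page)
    return faults

print(optimal_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # 7
```

On this reference string with 3 frames, the optimal algorithm incurs 7 faults, a lower bound that real algorithms like FIFO and LRU cannot beat.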
The Not Recently Used Page Replacement Algorithm - The NRU (Not Recently Used) algorithm
uses the R and M bits to sort pages into four classes: class 0 (not referenced, not modified),
class 1 (not referenced, modified), class 2 (referenced, not modified), and class 3 (referenced,
modified). NRU removes a page at random from the lowest-numbered nonempty class. Implicit
in this algorithm is the idea that it is better to remove a modified page that has not been
referenced in at least one clock tick (typically about 20 msec) than a clean page that is in
heavy use. The main
attraction of NRU is that it is easy to understand, moderately efficient to implement, and gives
a performance that, while certainly not optimal, may be adequate.
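NRU's classification by R and M bits can be sketched as follows; the class numbering is the usual one (class = 2R + M), and the page names are made up:

```python
# Sketch of NRU: classify pages by (R, M) and evict a random page from the
# lowest-numbered nonempty class.
import random

def nru_class(referenced, modified):
    """Class 0: R=0,M=0; class 1: R=0,M=1; class 2: R=1,M=0; class 3: R=1,M=1."""
    return 2 * int(referenced) + int(modified)

def pick_victim(pages):
    """pages: list of (name, R, M) tuples; pick randomly within the lowest class."""
    lowest = min(nru_class(r, m) for _, r, m in pages)
    candidates = [name for name, r, m in pages if nru_class(r, m) == lowest]
    return random.choice(candidates)

pages = [("A", 1, 1), ("B", 0, 1), ("C", 1, 0)]
print(pick_victim(pages))  # B: class 1 (dirty but not referenced) beats 2 and 3
```

Here "B" is dirty but has not been referenced recently, so it is preferred over the clean-but-active page "C", exactly the trade-off the text describes.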
The First-In, First-Out (FIFO) Page Replacement Algorithm - Another low-overhead paging
algorithm is the FIFO (First-In, First-Out) algorithm. To illustrate how this works, consider a
supermarket that has enough shelves to display exactly k different products. One day, some
company introduces a new convenience food, freeze-dried, organic yogurt that can be
reconstituted in a microwave oven. It is an immediate success, so our finite supermarket has to
get rid of one old product in order to stock it.
One possibility is to find the product that the supermarket has been stocking the longest (i.e.,
something it began selling 120 years ago) and get rid of it on the grounds that no one is
interested any more. In effect, the supermarket maintains a linked list of all the products it
currently sells in the order they were introduced. The new one goes on the back of the list; the
one at the front of the list is dropped. As a page replacement algorithm, the same idea is
applicable. The operating system maintains a list of all pages currently in memory, with the
most recent arrival at the tail and the least recent arrival at the head. On a page fault, the page
at the head is removed and the new page added to the tail of the list. When applied to stores,
FIFO might remove mustache wax, but it might also remove flour, salt, or butter. When applied
to computers the same problem arises. For this reason, FIFO in its pure form is rarely used.
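The FIFO list described above, with arrivals at the tail and evictions from the head, can be sketched with a queue; the reference string is illustrative:

```python
# Sketch of FIFO replacement: evict the page that has been resident longest,
# regardless of how heavily it is used.
from collections import deque

def fifo_faults(refs, nframes):
    frames, faults = deque(), 0
    for page in refs:
        if page in frames:
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popleft()          # oldest arrival is dropped from the head
        frames.append(page)           # new page joins the tail
    return faults

print(fifo_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # 9
```

On the same reference string and 3 frames used for the optimal example, FIFO incurs 9 faults versus the optimal 7, illustrating the cost of ignoring usage.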
The Second Chance Page Replacement Algorithm - A simple modification to FIFO that avoids
the problem of throwing out a heavily used page is to inspect the R bit of the oldest page. If it is
0, the page is both old and unused, so it is replaced immediately. If the R bit is 1 (it’s
referenced), the bit is cleared, the page is put onto the end of the list of pages, and its load time
is updated as though it had just arrived in memory. Then the search continues. What second
chance is looking for is an old page that has not been referenced in the most recent clock
interval. If all the pages have been referenced, second chance degenerates into pure FIFO.
The Clock Page Replacement Algorithm - Although second chance is a reasonable algorithm, it
is unnecessarily inefficient because it is constantly moving pages around on its list. A better
approach is to keep all the page frames on a circular list in the form of a clock. The hand points
to the oldest page. When a page fault occurs, the page being pointed to by the hand is
inspected. If its R bit is 0, the page is evicted, the new page is inserted into the clock in its place,
and the hand is advanced one position. If R is 1, it is cleared, and the hand is advanced to the
next page. This process is repeated until a page is found with R = 0. Not surprisingly, this
algorithm is called clock.
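The clock scan described above, which gives each referenced page a second chance by clearing its R bit, can be sketched as follows; the frame contents are illustrative:

```python
# Sketch of the clock algorithm: frames sit on a circular list with R bits;
# the hand advances, clearing R bits, until it finds a page with R == 0.

def clock_evict(frames, hand):
    """frames: list of [page, R]; returns (victim page, new hand position)."""
    while True:
        page, r = frames[hand]
        if r == 0:
            return page, (hand + 1) % len(frames)  # evict, advance the hand
        frames[hand][1] = 0                        # second chance: clear R
        hand = (hand + 1) % len(frames)

frames = [["A", 1], ["B", 1], ["C", 0]]
victim, hand = clock_evict(frames, 0)
print(victim, hand)   # C 0 -- A and B lose their R bits, C is evicted
```

Unlike second chance, no pages are moved around a linked list; only the hand position changes, which is the efficiency gain the text describes.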
The Least Recently Used Page Replacement Algorithm - A good approximation to the optimal
algorithm is based on the observation that pages that have been heavily used in the last few
instructions will probably be heavily used again in the next few. Conversely, pages that have not
been used for ages will probably remain unused for a long time. This idea suggests a realizable
algorithm: when a page fault occurs, throw out the page that has been unused for the longest
time. This strategy is called LRU (Least Recently Used) paging.
Although LRU is theoretically realizable, it is not cheap. To fully implement LRU, it is necessary
to maintain a linked list of all pages in memory, with the most recently used page at the front
and the least recently used page at the rear. The difficulty is that the list must be updated on
every memory reference. Finding a page in the list, deleting it, and then moving it to the front is
a very time-consuming operation, even in hardware (assuming that such hardware could be
built).
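A software simulation of the LRU list is easy even though the hardware version is not; this sketch keeps the most recently used page at the end of an ordered dict:

```python
# Sketch of LRU using an ordered dict as the recency list (most recently
# used at the end); real hardware LRU would need the per-reference updates
# the text describes.
from collections import OrderedDict

def lru_faults(refs, nframes):
    frames, faults = OrderedDict(), 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)       # touched: now most recently used
            continue
        faults += 1
        if len(frames) == nframes:
            frames.popitem(last=False)     # evict the least recently used
        frames[page] = True
    return faults

print(lru_faults([1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5], 3))  # 10
```

On the running example (3 frames), LRU incurs 10 faults: the list must be reordered on every reference, which is exactly the per-reference cost that makes a full hardware LRU expensive.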
The Working Set Page Replacement Algorithm - The set of pages that a process is currently
using is known as its working set. If the entire working set is in memory, the process will run
without causing many faults until it moves into another execution phase (e.g., the next pass of
the compiler). If the available memory is too small to hold the entire working set, the process
will cause many page faults and run slowly, since executing an instruction takes a few
nanoseconds and reading in a page from the disk typically takes 10 milliseconds. At a rate of
one or two instructions per 10 milliseconds, it will take ages to finish. A program causing page
faults every few instructions is said to be thrashing.
In a multiprogramming system, processes are frequently moved to disk (i.e., all their pages are
removed from memory) to let other processes have a turn at the CPU. The question arises of
what to do when a process is brought back in again. Technically, nothing need be done. The
process will just cause page faults until its working set has been loaded. The problem is that
having 20, 100, or even 1000 page faults every time a process is loaded is slow, and it also
wastes considerable CPU time, since it takes the operating system a few milliseconds of CPU
time to process a page fault. Therefore, many paging systems try to keep track of each process'
working set and make sure that it is in memory before letting the process run. This approach is
called the working set model (Denning, 1970). It is designed to greatly reduce the page fault
rate. Loading the pages before letting processes run is also called prepaging.
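One common way to approximate the working set is as the set of distinct pages touched in the last k references, for some window size k; this sketch uses a made-up reference string and window:

```python
# Sketch: the working set as the set of distinct pages referenced in the
# last k references (k is a made-up window size).

def working_set(refs, k):
    return set(refs[-k:]) if k else set()

refs = [1, 2, 1, 3, 2, 2, 4]
print(working_set(refs, 4))  # {2, 3, 4}: pages touched in the last 4 refs
```

If memory can hold all the pages in this set, the process runs with few faults; if not, it thrashes, which is why paging systems try to keep the working set resident.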
WSClock Page Replacement Algorithm – The basic working set algorithm is awkward, since the
entire page table has to be scanned at each page fault until a suitable candidate is located. An
improved algorithm, that is based on the clock algorithm but also uses the working set
information, is called WSClock (Carr and Hennessy, 1981). Due to its simplicity of
implementation and good performance, it is widely used in practice. The data structure needed
is a circular list of page frames, as in the clock algorithm. Initially, this list is empty. When the
first page is loaded, it is added to the list. As more pages are added, they go into the list to form
a ring. Each entry contains the Time of last use field from the basic working set algorithm, as
well as the R bit and the M bit. As with the clock algorithm, at each page fault the page pointed
to by the hand is examined first. If the R bit is set to 1, the page has been used during the
current tick, so it is not an ideal candidate to remove. The R bit is then set to 0, the hand
advanced to the next page, and the algorithm repeated for that page.
A new virtual memory management algorithm WSCLOCK has been synthesized from the local
working set (WS) algorithm, the global CLOCK algorithm, and a new load control mechanism for
auxiliary memory access. The new algorithm combines the most useful feature of WS (a
natural and effective load control that prevents thrashing) with the simplicity and efficiency of
CLOCK.
Studies are presented to show that the performance of WS and WSCLOCK are equivalent, even
if the savings in overhead are ignored.
71.15 What are the differences between Clock Page Replacement algorithm and WSClock
Page Replacement Algorithm?
The clock algorithm looks only at the R bit: it evicts the first page it finds with R = 0, clearing R
bits as the hand advances. WSClock keeps, in addition, the Time of last use field and the M bit
in each entry, so it combines the clock's circular scan with working set information: a page
with R = 0 is a candidate for eviction only if it also lies outside the working set, and a dirty
candidate is scheduled for write-back rather than being evicted immediately.
72.16 What is the difference between a physical address and a virtual address?
Physical addresses refer to hardware addresses of physical memory.
Physical addressing means that your program actually knows the real layout of RAM. When you
access a variable at address 0x8746b3, that's where it's really stored in the physical RAM chips.
Virtual addresses refer to the virtual store viewed by the process.
With virtual addressing, all application memory accesses go through a page table, which maps
virtual addresses to physical addresses. Every application therefore has its own private
address space, and no program can read or write another program's memory.
Virtual addresses might be the same as physical addresses or might be different, in which case
virtual addresses must be mapped into physical addresses. Mapping is done by Memory
Management Unit (MMU).
Virtual space is limited by size of virtual addresses (not physical addresses)
Virtual space and physical memory space are independent
The only downsides of virtual memory are added complexity in the hardware implementation
and slower performance.
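The independence of the two address spaces can be sketched by translating the same virtual address through two different (made-up) per-process page tables:

```python
# Sketch: the same virtual address in two processes maps to different
# physical addresses because each process has its own page table.
# Page-table contents are illustrative.
PAGE_SIZE = 4096
page_table_a = {0: 8}    # process A: virtual page 0 -> frame 8
page_table_b = {0: 3}    # process B: virtual page 0 -> frame 3

def translate(page_table, vaddr):
    page, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset

print(translate(page_table_a, 0x10))  # 32784 (frame 8, offset 16)
print(translate(page_table_b, 0x10))  # 12304 (frame 3, offset 16)
```

The MMU performs this mapping in hardware on every reference, which is how virtual addresses stay independent of the physical layout of RAM.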