
Computer Architecture: Memory Management

Memory · Paging · Segmentation · Virtual Memory · Caches


Computer Architecture WS 06/07 Dr.-Ing. Stefan Freinatis

Segmentation
User views of logical memory:

A linear array of bytes
Reflected by the paging memory scheme.

A collection of variable-sized entities
The user thinks in terms of subroutines, stack, symbol table, main program, which are somehow located somewhere in memory.

Segmentation supports this user view. The logical address space is a collection of segments.
Figure from [Sil00 p.285]

Segmentation
[Figure: segments 1–4 of the user space mapped onto physical memory]

Segmentation model: The user space (logical address space) consists of a collection of segments which are mapped through the segmentation architecture onto the physical memory.

Segmentation
Physical address space of a process can be non-contiguous
as with paging.

Logical address consists of a tuple
<segment number, offset>

Segment table maps logical addresses onto physical addresses
base: physical address of the segment
limit: length of the segment

Segment table can hold additional segment attributes
like the frame attributes (see paging).

Shared Segments
Shared segments are mapped to the same segment in physical memory.


Segmentation
s selects the entry from the segment table. The offset d is checked against the maximum size of the segment (limit). Final physical address = base + d.

Figure from [Sil00 p.286]

Segmentation
Segments are variable-sized
Dynamic memory allocation required (first fit, best fit, worst fit).

External fragmentation
In the worst case the largest hole may not be large enough to fit a new segment. Note that paging has no external fragmentation problem.

Each process has its own segment table


like with paging where each process has its own page table. The size of the segment table is determined by the number of segments, whereas the size of the page table depends on the total amount of memory occupied.

Segment table located in main memory


as is the page table with paging

Segment table base register (STBR)


points to current segment table in memory

Segment table length register (STLR)


indicates number of segments

Segmentation
Example:

A program is being compiled. The compiler determines the sizes of the individual components (segments) as follows:

Segment           Size
main program      400 byte
symbol table      1000 byte
function sqrt()   400 byte
subroutine        1000 byte
stack             1100 byte

Segmentation
Example (continued):
Figure from [Sil00 p.287]

The process is assigned 5 segments in memory as well as a segment table.



Shared Segments
Process P1 and P2 share the editor code. Segment 0 of each process is mapped onto the same physical segment at address 43062.

The data segments are private to each process, so segment 1 of each process is mapped to its own segment in physical memory.

Figure from [Sil00 p.288]

Paging versus Segmentation


With paging, physical memory is divided into fixed-size frames. When memory space is needed, as many free frames are occupied as necessary. These frames can be located anywhere in memory; the user process always sees a logically contiguous address space.

With segmentation, the memory is not systematically divided. When a program needs k segments (usually of different sizes), the OS tries to place these segments in the available memory holes. The segments can be scattered around memory. The user process does not see a contiguous address space, but a collection of segments (of course each individual segment is contiguous, as is each page or frame).


Paging versus Segmentation


[Figure: frames 13–20 of physical memory. With paging, part of a frame may stay unused (internal fragmentation); with segmentation, seg1–seg4 leave holes of free memory that can still be allocated.]

Paging is based on fixed-size units of memory (frames)

Segmentation is based on variable-size units of memory (segments)



Paging versus Segmentation


Paging:
Each process is assigned its page table.
Page table size proportional to allocated memory.
Often large page tables and/or multi-level paging.
Internal fragmentation.
Free memory is quickly allocated to a process.
Motorola 68000 line is based on a flat address space.

Segmentation:
Each process is assigned a segment table.
Segment table size proportional to number of segments.
Usually small segment tables.
External fragmentation.
Lengthy search times when allocating memory to a process.
Intel 80X86 family is based on segmentation.

Paged Segments
Combining segmentation with paging yields paged segments
With segmentation, each segment is a contiguous space in physical memory. With paged segments, each segment is sliced into pages; the pages can be scattered in memory.

[Figure: frames 13–20 holding seg1–seg4, once as contiguous segments (segmentation) and once sliced into scattered pages (paged segments)]

Paged Segments
Each segment has its own page table
[Figure: logical process space with seg1–seg4; each segment has its own page table mapping its pages to frame numbers in physical memory (frames 13–20). The last frame of a segment may be partly unused (internal fragmentation).]

Paged Segments
The MULTICS operating system (a predecessor of UNIX) solved the problems of external fragmentation and lengthy search times by paging the segments. This solution differs from pure segmentation in that each segment table entry does not contain the base address of the segment, but rather the base address of a page table for this segment. In contrast to pure paging, where each process is assigned a page table, here each segment is assigned a page table. The processes still see just segments, not knowing that the segments themselves are paged. With paged segments no time is spent on optimal segment placement; however, some internal fragmentation is introduced.

Paged Segments
Explanation of next slide (principle of paged segments)

The logical address is a tuple <segment number s, offset d>. The segment number is added to the STBR (segment table base register) and thereby points to a segment table entry. The segment table is located in main memory. From the entry the page table base is derived, which points to the beginning of the corresponding page table in memory. The first part p of the offset d selects the entry in the page table; the remaining part d' is the page offset. The output of the page table is the frame address f (or alternatively a frame number). Finally f + d' is the physical memory address.

Steps in resolving the final physical address:
PageTable = SegmentTable[s].base;
f = PageTable[p];
final address = f + d'

Computer Architecture

WS 06/07

Dr.-Ing. Stefan Freinatis

Paged Segments

Principle of paged segments


Paged Segments
Combination of segmentation and paging
User view is segmentation, memory allocation scheme is paging

Used by modern processors / architectures

Example: Intel 80386


CPU has 6 segment registers
which act as a quick 6-entry segment table

Up to 16384 segments per process possible


in which case the segment table resides in main memory.

Maximum segment size is 4 GB


Within each segment we have a flat address scheme of 2^32 byte addresses

Page size is 4 kB
A two-level paging scheme is used


Virtual Memory
What if the physical memory is smaller than required by a process?

Dynamic loading, overlays

Require special precautions and extra work by the programmer.

It would be much easier if we did not have to worry about the memory size and could leave the problem of fitting a larger program into smaller memory to the operating system.

Virtual Memory
Memory is abstracted into an extremely large uniform array of storage, decoupled from the amount of physical memory actually available.

Virtual Memory
Based on locality assumption
No process can access all its code and data at the same time, therefore the entire process space does not need to be in memory at all time instants.

Only parts of the process space are in memory


The remaining ones are on disk and are loaded when demanded

Logical address space can be much larger than physical address space
A program larger than physical memory can be executed.
More programs can (partially) reside in memory, which increases the degree of multiprogramming!


Virtual Memory
Virtual memory concept (one program)

[Figure: a program larger than physical memory; the OS and parts of the program are in physical memory, the rest resides on the backing store (usually a disk), leaving some free memory]

Virtual Memory
Virtual memory concept (three programs)

[Figure: programs A, B and C partially resident in physical memory next to the OS; the remaining pages reside on the backing store]

Virtual Memory
Virtual memory can be implemented by means of

Demand Segmentation
Used in early Burroughs computer systems and in IBM OS/2. Complex segment-replacement algorithms.

Demand Paging
Commonly used today. Physical memory is divided into frames (paging principle). Demand paging applies to both paging systems and paged segment systems.

Figure next slide: Virtual memory usually is much larger than physical memory (e.g. modern 64-bit processors). The pages currently needed by a process are in memory, the other pages reside on disk. The page table records whether a page is in memory or on disk.


Virtual Memory

[Figure: virtual memory pages mapped through the page table onto frames, with the non-resident pages on disk]

Figure from [Sil00 p.299]

Virtual memory consists of more pages than there are frames in physical memory

Demand Paging
A page is brought from disk into memory when it is needed (when it is demanded by the process)

Less I/O
than loading the entire program (at least for the moment)

Less memory needed


since a (hopefully) great part of the program remains on disk

Faster response
The process can start earlier since loading is quicker

More processes in memory


The memory saved can be given to other processes

Loading a page on demand is done by the pager (a part of the operating system usually a daemon process).

Demand Paging
Q: How does the OS know that a page is demanded by a process? A: When the process tries to access a page that is not in memory!
A process does not know whether or not a page is in memory, only the OS knows.

Each page table entry has a validity bit (v)


If v = 1, the page is in memory. If v = 0, the page is on disk. The validity bit is also termed valid-invalid bit.
During address translation, when the validity bit is found 0, the hardware causes a page fault trap to the operating system.


Page Fault
A page fault occurs when a process tries to access a page that is not memory-resident. Steps in demand paging:

1. A reference to some page is made.
2. The page is not listed in the page table (or is marked invalid), which causes a page fault trap (a hardware interrupt) to the operating system.
3. An internal table is checked (usually kept with the process control block) to determine whether the reference was a valid or an invalid memory access. If the reference was valid, a free frame is to be found.

Page Fault
4. A disk operation is scheduled to read the desired page into the free frame.
5. When the disk read is complete, the internal tables are updated to reflect that the page now is in memory.
6. The process is restarted at the instruction that caused the page fault trap. The process can now access the page.

These steps are symbolized in the next figure


Virtual Memory

Page table indicating that pages 0, 2 and 5 are currently in memory, while pages 1, 3, 4, 6, 7 are not.
Figure from [Sil00 p.301]

Page Fault

Steps in handling a page fault


Figure from [Sil00 p.302]

Performance of Demand Paging


Page fault rate p (0 ≤ p ≤ 1)
Average probability that a memory reference will cause a page fault.

If p = 0: no page faults at all. If p = 1: every reference causes a page fault.

Memory access time tma


Time to access physical memory (usually in the range of 10 ns ...150 ns)

Effective access time teff


Average effective memory access time. This time finally counts for system performance

teff = (1 − p) · tma + p · (page fault time)



Performance of Demand Paging


Page fault time
The time from the failed memory reference until the machine instruction continues

Trap to the OS
Context switch
Check validity
Find a free frame
Schedule disk read
Context switch to another process (optional)
Place page in frame
Adjust tables
Context switch and restart process

Assuming a disk system with an average latency of 8 ms, average seek time of 15 ms and a transfer time of 1 ms (and neglecting that the disk queue may hold other processes waiting for disk I/O), and assuming the execution time of the page fault handling instructions to be 1 ms, the page fault time is 25 ms.

Performance of Demand Paging


Effective access time
teff = (1 − p) · 100 ns + p · 25 ms = 100 ns + p · 24 999 900 ns


When each memory reference causes a page fault (p = 1), the system is slowed down by a factor of 250000. When one out of 1000 references causes a page fault (p = 0.001), the system is slowed down by a factor of about 250. For less than a 10 % degradation, the page fault rate p must be less than 0.0000004 (1 page fault in 2.5 million references).

Performance of Demand Paging


Some possibilities for lowering the page fault rate

Increase page size


With larger pages the likelihood of crossing page boundaries is lower.

Use good page replacement scheme


Preferably one that minimizes page faults.

Assign sufficient frames


The system constantly monitors memory accesses, creates page-usage statistics and on-the-fly adjusts the number of allocated frames. Costly, but used in some systems (so-called working set model).

Enforce program locality


Programs can contribute to locality by minimizing cross-page accesses. This applies to the implemented algorithms as well as to the addressing modes of the individual machine instructions.


Page Size
What should be the page (= frame) size?

Small Pages
little internal fragmentation, large page tables, slower disk I/O, more page faults

Large Pages
more internal fragmentation, smaller page tables, faster disk I/O, fewer page faults

The trend goes toward larger pages. Page faults are more costly today because the gap between CPU speed and disk speed has increased.

Intel 80386: 4 kB
Intel Pentium II: 4 kB or 4 MB
Sun UltraSparc: 8 kB, 64 kB, 512 kB, 4 MB

Page Attributes
In addition to the validity bit v, each page table entry may be equipped with the following attribute bits:

Reference bit r
Upon any reference to the page (read / write) the bit is set. Once the bit is set it remains set until cleared by the OS.

Modify bit m
Each time the page is modified (write access), the bit is set. The bit remains set until cleared by the OS. A page that is modified is also called dirty. The modify bit is also termed dirty bit. When the page is not modified it is clean.


Finding Free Frames


What options does the OS have when needing free frames?

Terminate another process


Not acceptable. The process may already have done some work (e.g. changed a database) which may mistakenly be repeated when the process is started again.

Swap out a process


An option only in case of rare circumstances (e.g. thrashing).

Hold some frames in spare


Sooner or later the spare frames are used up. Memory utilization is lower since the spare memory is not used productively.

Borrow frames
Yes! Take an allocated frame, use it, and give it (or another one) back to the owner later: page replacement.

Page Replacement
Page replacement scheme:
If there is a free frame, use it. Otherwise use a page-replacement algorithm to select a victim frame. Save the victim page to disk and adjust the tables of the owner process. Read in the desired page and adjust the tables. Note that this requires two page transfers (one out, one in).

Improvement: preferably use a victim page that is clean (not modified, m = 0). Clean pages do not need to be saved to disk.

Figure from [Sil00 p.309]

Need for page replacement: user process 1 wants to access module M (page 3). All memory however is occupied, so a victim frame must be determined.

Page Replacement
Figure from [Sil00 p.310]

Page replacement: the victim is saved to disk (1) and the page table is adjusted (2). The desired page is read in (3) and the table is adjusted again (4). In this figure the victim is a page of the same process (or of the same segment in the case of paged segments).

Page Replacement
Global Page Replacement
The victim frame can be taken from the set of all frames, that is, one process can take a frame from another. Processes can affect each other's page fault rates, though.

Local Page Replacement


The victim frame may only be from the own set of frames, that is, the number of allocated frames per process does not change. No impact onto other processes.
The figure on the previous slide shows a local page replacement strategy.


Page Replacement
Page replacement algorithms

First-in first-out (FIFO)


and its variations second-chance and clock.

Optimal page replacement (OPT) Least Recently Used (LRU) LRU Approximations
Desired: lowest page-fault rate! The algorithms are evaluated by applying them to memory reference strings.

Memory Reference Strings


Assume the following address sequence:
(e.g. recorded by tracing the memory accesses of a process)

0100, 0432, 0101, 0612, 0102, 0103, 0104, 0101, 0611, 0102, 0103, 0104, 0101, 0610, 0102, 0103, 0104, 0101, 0609, 0102, 0105

Assuming a page size of 100 bytes, the sequence can be reduced to

1, 4, 1, 6, 1, 6, 1, 6, 1, 6, 1
This memory reference string lists the pages accessed over time (at the time steps at which page access changes).
If there is only 1 frame available, the sequence would cause 11 page faults. If there are 3 frames available, the sequence would cause 3 page faults.

Memory Reference Strings


In general, the more frames available, the lower the expected number of page faults.

Page faults versus number of frames

Figure from [Sil00 p.312]

FIFO Page Replacement


Principle: Replace the oldest page (old = swap-in time).

Example VM.1
Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 Number of frames: 3


frame contents over time

Figure from [Sil00 p.313]

Total: 15 page faults.



FIFO Page Replacement


Example VM.2
Memory reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
Number of frames: 3

ref:      1  2  3  4  1  2  5  1  2  3  4  5
frame 0:  1  1  1  4  4  4  5  5  5  5  5  5
frame 1:     2  2  2  1  1  1  1  1  3  3  3
frame 2:        3  3  3  2  2  2  2  2  4  4
fault:    *  *  *  *  *  *  *        *  *

9 page faults


FIFO Page Replacement


Example VM.3
Memory reference string: 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 (as in VM.2)
Number of frames: 4

ref:      1  2  3  4  1  2  5  1  2  3  4  5
frame 0:  1  1  1  1  1  1  5  5  5  5  4  4
frame 1:     2  2  2  2  2  2  1  1  1  1  5
frame 2:        3  3  3  3  3  3  2  2  2  2
frame 3:           4  4  4  4  4  4  3  3  3
fault:    *  *  *  *        *  *  *  *  *  *

10 page faults
Computer Architecture

Although more frames are available than before, the number of page faults increased!

FIFO Page Replacement


From examples VM.2 and VM.3 it can be seen that the number of page faults with 4 frames is greater than with 3 frames. This unexpected result is known as Belady's Anomaly¹:

For some page-replacement algorithms the page-fault rate may increase as the number of allocated frames increases.

¹ Laszlo Belady, R. Nelson, G. Shedler: An anomaly in space-time characteristics of certain programs running in a paging machine, Communications of the ACM, Volume 12, Issue 6, June 1969, pages 349–353, ISSN 0001-0782; also available online as PDF from the ACM.


Belady's Anomaly
Page faults versus number of frames for the string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.

Figure from [Sil00 p.314]

Second-Chance Algorithm
This algorithm is a derivative of the FIFO algorithm:

1. Start with the oldest page.
2. Inspect the page.
3. If r = 0: replace the page. Done.
4. If r = 1: give the page a second chance by clearing r and moving the page to the top of the FIFO.
5. Proceed to the next oldest page.

When a page is used often enough to keep its r bit set, it will never be replaced. This avoids throwing out a heavily used page (as may happen with strict FIFO). If all pages have r = 1, however, the algorithm degenerates to FIFO.

Second-Chance Algorithm
Example: page A is the oldest in the FIFO (see a). With pure FIFO it would have been replaced. However, as r = 1 it is given a second chance and is moved to the top of the FIFO (see b). The algorithm continues with page B.

Figure from [Ta01 p.218]

Clock Algorithm
Second chance constantly moves pages within the FIFO (overhead)! When the FIFO is arranged as a circular list, the overhead is lower.

Initially the hand (a pointer) points to the oldest page. The algorithm then applied is second chance.

Figure from [Ta01 p.219]


Optimal Page Replacement


Principle: Replace the page that will not be used for the longest time.

Example VM.4
Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 Number of frames: 3

frame contents over time

Figure from [Sil00 p.315]

Total: 9 page faults.



LRU Page Replacement


Principle: Replace the page that has not been used for the longest time.

Example VM.5
Memory reference string: 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1 Number of frames: 3

frame contents over time


Figure from [Sil00 p.315]

LRU Page Replacement


Possible LRU implementations:

Counter Implementation
Every page table entry has a counter field, and the hardware must provide a logical clock or counter. With each page access the current counter value is copied to the entry.

Update on each page access required. Searching the table to find the LRU page. Clock overflow must be accounted for.

Stack Implementation
Keep a stack containing all page numbers. Each time a page is referenced, its number is searched and moved to the top. The top holds the MRU pages, the bottom holds the LRU pages.

Update on each page access required. Searching the stack for the current page number.

LRU Page Replacement


Example for the stack implementation principle

Figure from [Sil00 p.317]

LRU Approximation
Not many systems provide sufficient hardware support for true LRU page replacement. Approximate LRU!

Use reference bit


When looking for the LRU page, take a page with r = 0.
No ordering among the pages (only used and unused).

History Field
Each page table entry has a history field h (e.g. a byte).
When the page is accessed, the most significant bit (e.g. bit 7) is set.
Periodically (e.g. every 100 ms) the bits are shifted right.
When looking for the LRU page, take the page with the smallest unsigned int(h).
Better ordering among the pages (256 history values).

LRU Approximation
History field examples:

00000000 = not used during the last 8 time periods
11111111 = used in each of the past 8 periods
01001000 = used in the last period and in the fifth-last period

history field   value (unsigned int)
0101            5 (smallest value: this page will be chosen as victim)
0111            7
0110            6
1011            11


Page Replacement
Exemplary page fault rates
[Plot: page faults per 1000 references (0–40) versus number of frames allocated (6–14) for FIFO, Clock, LRU and Opt]

Figure from lecture slides WS 05/06

Differences noticeable only for smaller number of frames



Page Replacement
Algorithms Summary

First-in first-out (FIFO)


Simplest algorithm, easy to implement, but has the worst performance. The clock version is somewhat better as it does not replace busy pages.

Optimal page replacement (OPT)


Not of practical use, as one must know the future! Used for comparisons only. Lowest page fault rate of all algorithms.

Least Recently Used (LRU)


The best algorithm usable, but requires much execution time or highly sophisticated hardware.

LRU Approximations
Slightly worse than LRU, but faster. Applicable in practice.


Thrashing
When the number of allocated frames falls below a certain number of pages actively used by a process, the process will cause page fault after page fault. This high paging activity is called thrashing.
Figure from [Sil00 p.326]

A too high degree of multiprogramming results in thrashing because each process does not have enough frames.


Thrashing
Countermeasures

Switching to local page replacement


A thrashing process cannot steal frames from others. However, the paging device queue (used by all processes) is still full of requests, lowering overall system performance.

Swap out
The thrashing process or some other process can be swapped out for a while. Choice depends on process priorities.

Assign sufficient frames


How many frames are sufficient?

Working-set model: All page references are monitored (online memory reference string creation). The pages recently accessed form the working-set. Its size is used as the number of sufficient frames.


Working-Set

Figure from [Sil00 p.328]

The working-set model uses a parameter Δ to define the working-set window. The set of pages referenced in the most recent Δ memory references defines the working-set WS. The OS allocates to the process enough frames to maintain the size of the working-set. Keeping track of the working set requires observing the memory accesses (constantly or in time intervals).


Program Locality
Demand paging is transparent to the user program. A program however can enforce locality (at least for data).
Assume a page size of 128 words and consider the following program which clears the elements of a 128 x 128 matrix.
int A[][] = new int[128][128];
for (int j = 0; j < 128; j++)        // column
    for (int i = 0; i < 128; i++)    // row
        A[i][j] = 0;

Program A: clearing the matrix elements column-wise.

Program Locality
The array is stored in memory in row-major order: the rows of a multidimensional array are stored one after the other in linear memory. This is the approach used by C, Java and many other languages, with the notable exception of Fortran. For example, the matrix

1 2 3
4 5 6

is defined in C as

int A[2][3] = { {1,2,3}, {4,5,6} };

and is stored row-wise in word memory: 1, 2, 3 (row 1), then 4, 5, 6 (row 2), from low to high addresses.



Program Locality
Thus, each row of the 128 x 128 matrix occupies one page. If the operating system allocates only one frame (for the data) to process A, the process will cause 128 x 128 = 16384 page faults! This is because the process clears one word in each page (word j), then the next word, ..., thus jumping from page to page in the inner loop.

for (int j = 0; j < 128; j++)
    for (int i = 0; i < 128; i++)
        A[i][j] = 0;


Program Locality
By changing the loop order, the process first finishes one page before going to the next.

int A[][] = new int[128][128];
for (int i = 0; i < 128; i++)
    for (int j = 0; j < 128; j++)
        A[i][j] = 0;

Program B

Now, if the operating system allocates only one frame (for the data) to process B, the process will cause only 128 page faults!

Program Locality
Locality is also influenced by the addressing modes at machine instruction level.
Consider a three-address instruction such as ADD A,B,C, which performs C := A + B. In the worst case the operands A, B, C are located in 3 different pages. Another example is the PDP-11 instruction MOV @(R1)+,@(R2)+, which in the worst case straddles 6 pages.

[Figure: PDP-11 addressing mode 3 for the source operand, via R1]


Virtual Memory
Separation of logical and physical memory
The user / programmer can think of an extremely large virtual address space.

Pure Paging / Paged Segments


Virtual memory can be implemented upon both memory allocation schemes.

Execution of large programs


which do not fit into physical memory in their entirety.

Better multiprogramming
as there can be more programs in memory.

Not suitable for hard real-time systems!


Virtual memory is the antithesis of hard real-time computing. This is because the response times cannot be guaranteed owing to the fact that processes may influence each other (page device queue, thrashing, ...).

