
Parkinson's law

- "Work expands so as to fill the time available for its completion."

Corollaries:
- "If you wait until the last minute, it only takes a minute to do."
- "In ten hours a day you have time to fall twice as far behind your
commitments as in five hours a day."
- "Programs expand to fill the memory available to them.
CPSC 457
Memory

Fall 2017

Contains slides from Mea Wang, Andrew Tanenbaum and Herbert Bos, Silberschatz, Galvin and Gagne

Overview
● address spaces (logical vs physical)
● address binding
● memory management unit
● swapping
● memory management
○ fixed/dynamic partitioning
○ bitmaps, linked lists
○ placement algorithms: first fit, best fit, worst fit, next fit, quick fit
● virtual memory & paging
○ pages, frames, demand paging, page table, page fault, effective access time

[figure: the memory hierarchy ― CPU registers, level 1 cache, level 2 cache, RAM, hard drive/SSD (virtual memory); cost and speed increase toward the top, capacity toward the bottom]
Addresses

● most programs need memory to run


● we expect computers to run multiple programs simultaneously
⇒ OS needs to manage memory for processes

● some issues related to memory management:


○ OS must give each process some portion of available memory (address space)
□ Which part of memory? How much?
○ OS needs to protect memory given to one process from other processes
□ How?
○ If programmers do not know where the program will be loaded, how do they write code?
□ eg. how to write JMP instruction?
● working with physical (direct) addresses is not a good solution …
Working with physical addresses
● Problem: we need to load two programs into physical memory…

[figure: two compiled programs, each assembled to start at address 0; one contains a JMP 20 instruction, the other a JMP 8. How can both be loaded into the same RAM?]

Working with physical addresses
● Problem: we need to load two programs into physical memory…
[figure: both programs loaded into RAM: one at addresses 0–20, the other at addresses 1000–1024 with its JMP 20 instruction unchanged]
Working with physical addresses
● Problem: we need to load two programs into physical memory…
[figure: the same layout: one program at addresses 0–20, the other at addresses 1000–1024]

● the 2nd program would not work: its JMP 20 would jump to the wrong address, into the other program's code
● another problem is the lack of memory protection
Base and Limit Registers ― address protection in hardware

● a pair of base and limit registers defines the allowed range of addresses available to the CPU
○ base = starting physical address of the process's memory
○ limit = size of the process's memory region
● the base and limit registers can only be modified in kernel mode
● the CPU checks every memory access generated by a process
● when a process tries to access an invalid address → trap to the OS
● base & limit registers are stored in the PCB

Base and Limit Registers ― address protection in hardware

[figure: for every address generated by the CPU, the hardware checks address ≥ base and address < base + limit; if both checks pass, the access goes to memory, otherwise it traps to the OS with an addressing error]
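
To make the check concrete, here is a minimal C sketch of what the hardware does (struct and function names are made up for illustration): the access is allowed only if base ≤ address < base + limit, otherwise it traps to the OS.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical registers loaded by the kernel on a context switch. */
struct base_limit {
    uint32_t base;   /* first physical address the process may use */
    uint32_t limit;  /* size of the allowed region                  */
};

/* Mimics the hardware check: true = access allowed, false = trap to OS. */
static bool access_ok(struct base_limit r, uint32_t addr)
{
    return addr >= r.base && addr < r.base + r.limit;
}

int main(void)
{
    struct base_limit r = { .base = 1000, .limit = 64 };
    uint32_t tests[] = { 1000, 1063, 1064, 20 };

    for (int i = 0; i < 4; i++)
        printf("address %u -> %s\n", (unsigned)tests[i],
               access_ok(r, tests[i]) ? "memory access allowed"
                                      : "trap to OS (addressing error)");
    return 0;
}
```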

Address binding

● problem: when programs are compiled, the physical address space of the process is not known
● possible solution: programs could be expressed in a way that allows them to be relocated
○ when needed, we can bind the addresses to the actual physical memory location
● addresses in a program are represented in different ways at different stages of a program’s life
○ source code addresses are usually symbolic
□ eg. int main()
○ addresses in compiled code can bind to relocatable addresses
□ eg. main = “14 bytes from beginning of this module”
○ before execution, the loader will bind relocatable addresses to physical addresses
□ eg. main = “14 bytes from beginning of this module” ⇒ "1000 + 14 = 1014"
● each binding maps one address space to another

Binding of instructions and data to memory
address binding of instructions and data to memory addresses can
happen at three different stages

● compile time - slowest


○ if memory location known a priori, absolute code can be
generated and stored
○ we must recompile code if starting location changes ☹

● load time - much faster


○ the compiled code must be stored as relocatable code
○ binding is done before program starts executing

● execution time - fastest with HW support


○ if process can be moved during its execution, binding is
done at run-time, dynamically
○ most flexible, but need hardware support for address maps
(e.g., memory management unit) ☺
Logical & physical addresses

● we can achieve execution-time address-binding and memory-protection by 'virtualizing memory'


● OS gives each process a logical address space (aka virtual address space)
○ a contiguous space, ranging from 0 to MAX
○ as process executes, addresses generated by the CPU are logical addresses
○ if logical address does not fall into the logical address space range → violation (trap)
● physical address - a real memory address
○ logical addresses are mapped to physical addresses before reaching memory
via memory management unit (MMU)
○ physical address space of a process is the subset of RAM allocated to a process
○ physical address space = the set of physical addresses that the process's logical addresses map to

Memory-Management Unit (MMU)

● MMU is a hardware device that maps virtual addresses to physical addresses


● needs to be super fast, and is often part of a CPU
● many possible implementations
● a CPU executing a process uses logical addresses; it never sees the real physical addresses
● execution-time binding occurs automatically whenever memory reference is made

[figure: the CPU issues a logical address, the MMU translates it, and the resulting physical address goes to memory]

MMU ― Relocation Registers

● a simple MMU implementation:


○ logical address space starts at 0
○ CPU has a relocation register
○ value in the relocation register is added to every
address generated by a process at the time
it is sent to memory

MMU ― Relocation and Limit Registers
● combining relocation register and limit register
● relocation (base) register = smallest allowed physical memory address
● limit register = the size of the chunk of physical memory a process is allowed to use
● achieves execution-time binding + memory protection
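
A small C sketch of the combined scheme (the names and the sample values are assumptions for illustration): the MMU first compares the logical address against the limit register, then adds the relocation register to produce the physical address, giving both protection and execution-time binding.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-process MMU state, e.g. saved in the PCB. */
struct mmu {
    uint32_t relocation; /* base: smallest physical address of the process */
    uint32_t limit;      /* size of the chunk the process may use          */
};

/* Execution-time binding: logical -> physical, with protection. */
static uint32_t translate(struct mmu m, uint32_t logical)
{
    if (logical >= m.limit) {
        fprintf(stderr, "trap: logical address %u outside limit %u\n",
                (unsigned)logical, (unsigned)m.limit);
        exit(EXIT_FAILURE);           /* stand-in for a trap to the OS */
    }
    return m.relocation + logical;    /* add the relocation register */
}

int main(void)
{
    struct mmu m = { .relocation = 1000, .limit = 64 };

    /* A "JMP 20" issued by the process is transparently sent to 1020. */
    printf("logical 20 -> physical %u\n", (unsigned)translate(m, 20));
    return 0;
}
```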

Hardware Support for Relocation and Limit Registers
[figure: the two example programs loaded into RAM: one at physical addresses 0–20 with base register 0 and limit register 512, the other at physical addresses 1000–1024 with base register 1000 and limit register 64]

● each process gets its own pair of base/limit registers
● the register values could be part of the PCB
Swapping

● a process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution
● backing store – fast disk large enough to accommodate copies of all memory images for all processes
● swapping allows the OS to run more processes than the available physical memory would otherwise allow

Swapping

● Does the swapped out process need to swap back into the same physical addresses?
○ depends on address binding method
○ much easier if MMU is used
○ must be careful with pending I/O, especially when using memory-mapped device registers
○ I/O results could be sent to kernel, then to the process (double-buffering)
● warning: context switch time can be extremely high
● standard swapping not used in modern operating systems
● although modified versions of swapping are used on many systems (eg. Linux, Windows)
○ swapping is disabled initially
○ swapping enabled if more than threshold amount of memory allocated
○ swapping is disabled again once memory demand reduced below threshold

Swapping and memory
● memory allocation changes as processes are swapped out and swapped in
● the shaded regions are unused memory

[figure: memory allocation over time as A is swapped out, B is swapped out, and A is swapped back in]


Memory allocation
● How much memory should OS allocate to each process?
● Fact: most programs increase their memory usage during execution
● Possible solution: swap process out, find a bigger free memory chunk, swap it back in
● Better solution: OS allocates extra memory for each process

Memory allocation

● at some point OS needs to


○ find a free chunk of memory
○ then mark it as used
○ and later free it up
● simple approaches can lead to fragmentation
○ lots of tiny free chunks of memory
○ none of them big enough to satisfy any requests
● OS needs to manage the memory in an efficient way
○ fast (searching, allocating, freeing, …)
○ minimize fragmentation
● two general approaches: fixed partitioning and dynamic partitioning

Fixed partitioning
● memory is divided into equal sized partitions
● example:
○ total memory = 64MB, partition size = 8MB → 8 partitions
○ OS usually reserves some memory for itself (eg. 1 partition, or 8MB)
○ let's load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: memory divided into eight 8MB partitions]

Fixed partitioning
● memory is divided into equal sized partitions
● example:
○ total memory = 64MB, partition size = 8MB → 8 partitions
○ OS usually reserves some memory for itself (eg. 1 partition, or 8MB)
○ let's load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: the OS occupies the first 8MB partition; P1 (4MB), P2 (8MB) and P3 (10MB) are loaded into the remaining partitions]

● Problems:

Fixed partitioning
● memory is divided into equal sized partitions
● example:
○ total memory = 64MB, partition size = 8MB → 8 partitions
○ OS usually reserves some memory for itself (eg. 1 partition, or 8MB)
○ let's load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: the OS occupies the first 8MB partition; P1 (4MB), P2 (8MB) and P3 (10MB) are loaded into the remaining partitions]

● Problems:
○ internal fragmentation: memory internal to a partition becomes fragmented
○ leads to low memory utilization if partitions are big
○ in this example: actual free memory = 34 MB, but usable free memory = only 24 MB
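
The free-memory numbers on this slide follow from a short calculation. The sketch below (in C, assuming the 64MB total and 8MB partitions from the example) rounds each process up to whole partitions and adds up the waste.

```c
#include <stdio.h>

#define TOTAL_MB   64
#define PART_MB     8
#define NUM_PARTS  (TOTAL_MB / PART_MB)

int main(void)
{
    /* OS takes one partition; then P1, P2, P3 are loaded. */
    int sizes_mb[] = { 8 /* OS */, 4 /* P1 */, 8 /* P2 */, 10 /* P3 */ };
    int n = sizeof sizes_mb / sizeof sizes_mb[0];

    int parts_used = 0, internal_waste = 0, used_mb = 0;
    for (int i = 0; i < n; i++) {
        int parts = (sizes_mb[i] + PART_MB - 1) / PART_MB;  /* round up */
        parts_used     += parts;
        internal_waste += parts * PART_MB - sizes_mb[i];
        used_mb        += sizes_mb[i];
    }

    int free_parts = NUM_PARTS - parts_used;
    printf("actual free memory: %d MB\n", TOTAL_MB - used_mb);    /* 34 */
    printf("usable free memory: %d MB\n", free_parts * PART_MB);  /* 24 */
    printf("internal fragmentation: %d MB\n", internal_waste);    /* 10 */
    return 0;
}
```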
Dynamic partitioning
● create partitions that can fit a request perfectly
● example:
○ total memory = 64MB, minus 8MB taken by OS
○ load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: the 64MB of memory before the processes are loaded]

Dynamic partitioning
● create partitions that can fit a request perfectly
● example:
○ total memory = 64MB, minus 8MB taken by OS
○ load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: OS (8MB), P1 (4MB), P2 (8MB) and P3 (10MB) packed back to back, leaving a single 34MB free block]

● no more internal fragmentation, but what if P2 finishes, and P4 (18MB) gets added?
○ in this example: actual free memory = 34 MB, usable free memory = 34 MB

Dynamic partitioning
● create partitions that can fit a request perfectly
● example:
○ total memory = 64MB, minus 8MB taken by OS
○ load 3 processes: P1 (4MB), P2 (8MB), P3 (10MB)

[figure: after P2 finishes and P4 (18MB) is loaded: OS (8MB), P1 (4MB), an 8MB hole, P3 (10MB), P4 (18MB), and a 16MB hole]

● no more internal fragmentation, but what if P2 finishes, and P4 (18MB) gets added?
● external fragmentation: the memory that is external to all partitions becomes increasingly fragmented, leading to low memory utilization
○ in this example: actual free memory = 24 MB, usable free memory = 24 MB, largest free chunk = 16 MB
● eg. P5 (17MB) could not start, despite there being enough free RAM
Memory compaction
● memory compaction is a mechanism to deal with fragmentation
● from time to time, the OS re-arranges the used blocks of memory so that they are contiguous
● free blocks are merged into a single large block
● CPU intensive operation

[figure: before compaction: OS, P1, an 8MB hole, P3, P4, and a 16MB hole; after compaction: OS, P1, P3 and P4 packed together, followed by a single 24MB free block]

Implementation

● how do we keep track of free memory and allocated memory?


● we need a data structure
○ that is efficient at searching free space
○ that is efficient at reclaiming free space
○ "deals" with fragmentation

Bitmaps & fixed partitions
● memory is divided into equal-sized partitions, as small as a few words or as large as several KB
● OS maintains a bitmap, 1 bit per partition, where 0=free, 1=occupied

■ Problems:
● searching is O(N), N = size of bitmap
● smaller partitions ⇒ less fragmentation, but larger bitmap
● larger partitions ⇒ smaller bitmap, but more fragmentation
● compromise between efficiency and fragmentation
● note: larger bitmap also implies more wasted memory
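
A minimal C sketch of the bitmap idea (the bitmap contents and the request size are made up, and each bit is stored in a byte for clarity): allocation scans for a run of consecutive free partitions, which is why searching is O(N) in the size of the bitmap.

```c
#include <stdio.h>

#define NUM_PARTITIONS 16

/* 1 bit per partition: 0 = free, 1 = occupied (kept as bytes for clarity). */
static unsigned char bitmap[NUM_PARTITIONS] =
    { 1,1,0,0, 1,0,0,0, 1,1,1,0, 0,0,1,1 };

/* Find `count` consecutive free partitions; returns first index or -1. */
static int find_free_run(int count)
{
    int run = 0;
    for (int i = 0; i < NUM_PARTITIONS; i++) {
        run = bitmap[i] ? 0 : run + 1;
        if (run == count)
            return i - count + 1;
    }
    return -1;
}

static void mark(int start, int count, int value)
{
    for (int i = 0; i < count; i++)
        bitmap[start + i] = (unsigned char)value;
}

int main(void)
{
    int start = find_free_run(3);          /* allocate 3 partitions */
    if (start >= 0) {
        mark(start, 3, 1);
        printf("allocated partitions %d..%d\n", start, start + 2);
        mark(start, 3, 0);                 /* later: free them again */
    }
    return 0;
}
```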
Linked lists & dynamic partitioning
● memory is divided into segments of dynamic size
● OS maintains a list of allocated and free memory segments (holes), sorted by address

● searching is O(N), where N = number of segments in the linked list


● reclaiming free space can be O(1)
○ if doubly-linked list is used and linked-list data is stored within segments
Memory management with linked lists - reclaiming space

[figure: freeing a segment whose neighbour on one side is already free gives a two-way merge; free neighbours on both sides give a three-way merge]
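
A hedged C sketch of the merges shown above (the node layout is one possible choice, not the only one): when a segment is freed, the allocator looks at its neighbours in the address-ordered doubly linked list and absorbs any that are already free, giving the two-way and three-way merges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* One entry in an address-ordered, doubly linked segment list. */
struct segment {
    size_t start, length;
    bool   free;
    struct segment *prev, *next;
};

/* Fold segment s into its left neighbour and unlink it. */
static void absorb(struct segment *into, struct segment *s)
{
    into->length += s->length;
    into->next = s->next;
    if (s->next)
        s->next->prev = into;
    /* a real allocator would also release the node s here */
}

/* Mark a segment free and merge it with any free neighbours. */
static void release(struct segment *s)
{
    s->free = true;
    if (s->next && s->next->free)   /* two-way merge with right neighbour */
        absorb(s, s->next);
    if (s->prev && s->prev->free)   /* two-way merge with left neighbour  */
        absorb(s->prev, s);         /* both together = three-way merge    */
}

int main(void)
{
    struct segment a = { 0, 100, true,  NULL, NULL };
    struct segment b = { 100, 50, false, &a,  NULL };
    struct segment c = { 150, 80, true,  &b,  NULL };
    a.next = &b;
    b.next = &c;

    release(&b);   /* freeing the middle segment triggers a three-way merge */
    printf("merged segment: start=%zu length=%zu\n", a.start, a.length);
    return 0;
}
```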

Memory allocation
● algorithms for finding a free space (hole) in a linked list:
○ first fit - find the first hole that is big enough, leftover space becomes new hole
○ best fit - find the smallest hole that is big enough, leftover (tiny) space becomes new hole
○ next fit - same as first fit, but start searching at the location of last placement
○ worst fit - find the largest hole, leftover space is likely to be usable
○ quick fit - maintain separate lists for common request sizes
□ leads to faster search, but more complicated management
● Example: the request is to find memory for 2 units
[figure: a list of holes of sizes 3, 5 and 2 units: first fit picks the hole of size 3, best fit the hole of size 2, worst fit the hole of size 5, and next fit resumes the search from the last placement]
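
The placement policies differ only in how they scan the hole list. The C sketch below uses the hole sizes 3, 5, 2 from the example and shows first fit and best fit; worst fit and next fit are noted in comments.

```c
#include <stdio.h>

/* Hole sizes from the example, in units, in list (address) order. */
static int holes[] = { 3, 5, 2 };
static const int nholes = sizeof holes / sizeof holes[0];

/* First fit: the first hole that is big enough. */
static int first_fit(int request)
{
    for (int i = 0; i < nholes; i++)
        if (holes[i] >= request)
            return i;
    return -1;
}

/* Best fit: the smallest hole that is big enough. */
static int best_fit(int request)
{
    int best = -1;
    for (int i = 0; i < nholes; i++)
        if (holes[i] >= request && (best < 0 || holes[i] < holes[best]))
            best = i;
    return best;
}

int main(void)
{
    int request = 2;
    printf("first fit -> hole of size %d\n", holes[first_fit(request)]);  /* 3 */
    printf("best fit  -> hole of size %d\n", holes[best_fit(request)]);   /* 2 */
    /* worst fit would pick the largest hole (size 5); next fit is first
       fit restarted from wherever the previous placement ended. */
    return 0;
}
```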


Virtual memory
● virtual memory is a memory management technique
● allows the OS to present a process with a logical address space that appears contiguous
● physical address space can be discontiguous
● some parts of logical address space can be mapped to a backing store
● improves memory management
● allows parts of programs to be 'swapped' in/out

[figure: a process's virtual memory, with some parts mapped to physical memory (RAM), which also holds another process's memory, and some parts mapped to disk]
Paging
● most common implementation of virtual memory
● virtual address space is divided into pages
○ fixed size blocks
○ usually a power of 2, range 512B ― 16MB
● physical memory is divided into (page) frames
○ same size as pages
● pages map to frames via a lookup table called the page table (logical → physical address mapping)
● avoids external fragmentation, since there are no holes
● each process has its own page table (ptr stored in PCB)

[figure: pages 0 … n-1 of a process's virtual memory mapped through the page table to frames 0 … m-1 of physical memory (RAM)]

Paging

● if a program tries to address a page that does not map to physical memory
○ CPU issues a trap (called page fault)
○ OS suspends the process
○ OS loads the page from disk
○ OS updates the page table
○ OS resumes the process
● if OS only loads pages as a result of page fault, we call that demand paging
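
The steps above can be sketched as code; this is only an illustration (the page table layout and the disk-loading helper are hypothetical), not how a real OS implements the fault path.

```c
#include <stdbool.h>
#include <stdio.h>

#define NUM_PAGES 8

struct pte {          /* hypothetical page table entry */
    bool present;     /* is the page currently mapped to a frame? */
    int  frame;       /* frame number, valid only if present      */
};

static struct pte page_table[NUM_PAGES];   /* all pages start out not present */

/* Stand-in for the OS work done while the faulting process is suspended. */
static int load_page_from_disk(int page)
{
    printf("page fault: loading page %d from disk\n", page);
    return page + 1;   /* pretend frame number chosen by the OS */
}

static int access_page(int page)
{
    if (!page_table[page].present) {            /* CPU traps: page fault  */
        int frame = load_page_from_disk(page);  /* OS loads the page      */
        page_table[page].present = true;        /* OS updates page table  */
        page_table[page].frame   = frame;       /* OS resumes the process */
    }
    return page_table[page].frame;              /* access continues       */
}

int main(void)
{
    printf("page 3 -> frame %d\n", access_page(3));  /* faults, then loads */
    printf("page 3 -> frame %d\n", access_page(3));  /* already present    */
    return 0;
}
```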

Page table

● OS keeps track of all free frames


● we still need memory management techniques
● to run a program of size N pages, the OS needs to find N free frames and load the program into them
● page table translates logical addresses → physical addresses
● we need a similar mechanism for the backing store
○ backing store is also split into pages
● still have internal fragmentation

Paging example
● virtual address space = 64KB
● physical address space = 32KB
● page size = 4KB
● calculate:
○ frame size = ?
○ # of pages = ?
○ # of frames = ?

Paging example
● virtual address space = 64KB
● physical address space = 32KB
● page size = 4KB
● calculate:
○ frame size = 4KB (same as page size)
○ # of pages = 16 (64KB / 4KB)
○ # of frames = 8 (32KB / 4KB)

Paging example
● Assume page size = 2KB, and a process needs 71 KB to load. How many pages do we need?

Paging example
● Assume page size = 2KB, and a process needs 71 KB to load. How many pages do we need?

○ 71 KB / 2 KB = 35 pages with 1 KB left over → we need 36 pages → OS needs to find 36 free frames

○ do they need to be contiguous? NO!

○ OS allocates 36 frames anywhere

● Observations:

○ one frame will have 1KB of unused space (internal fragmentation)

○ no external fragmentation since there are no holes between frames

○ but what if there are no free frames?
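
The page count is a ceiling division; a few lines of C reproduce the 71 KB / 2 KB numbers from this slide.

```c
#include <stdio.h>

int main(void)
{
    unsigned process_kb = 71, page_kb = 2;

    /* round up: 71 / 2 = 35 remainder 1  ->  36 pages (and frames) needed */
    unsigned pages = (process_kb + page_kb - 1) / page_kb;

    printf("pages needed: %u\n", pages);                        /* 36 */
    printf("internal fragmentation in last page: %u KB\n",
           pages * page_kb - process_kb);                        /* 1  */
    return 0;
}
```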

Demand paging performance
● paging performance is commonly evaluated by the effective access time (EAT) for a memory access
● Let:
○ p = probability of page fault, i.e. the page fault rate (0 <= p <= 1)
□ p = 0 → all pages are in memory, no page faults
□ p = 1 → all pages are on disk, every memory access is a page fault
○ ma = memory access time
○ pfst = page fault service time, i.e. how long it takes to service a page fault
● Then:
○ effective access time (EAT) = (1 - p) * ma + p * pfst
● Non-realistic example: calculate EAT if the page fault probability is 50%, ma = 1ms and pfst = 10ms
○ EAT = (1 - 0.5) * 1ms + 0.5 * 10ms = 0.5ms + 5ms = 5.5ms
● Realistic example: calculate EAT if the page fault probability is 1/1000, ma = 100ns and pfst = 10ms
○ EAT = (1 - 0.001) * 100ns + 0.001 * 10,000,000ns = 99.9ns + 10,000ns ≈ 10,099.9ns (about 101× slower than a plain memory access)
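
The formula is easy to check numerically; this small C program reproduces the realistic example (all times in nanoseconds), including the roughly 101x slowdown.

```c
#include <stdio.h>

/* effective access time = (1 - p) * ma + p * pfst */
static double eat(double p, double ma, double pfst)
{
    return (1.0 - p) * ma + p * pfst;
}

int main(void)
{
    double ma   = 100.0;    /* memory access time: 100 ns            */
    double pfst = 10e6;     /* page fault service time: 10 ms        */
    double p    = 0.001;    /* one access in a thousand page-faults  */

    printf("EAT = %.1f ns (%.0fx slower than RAM alone)\n",
           eat(p, ma, pfst), eat(p, ma, pfst) / ma);
    return 0;
}
```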
Address translation scheme
● Address generated by CPU is divided into:
○ Page number (p) – used as an index into a page table which contains base address of
corresponding frame in physical memory
○ Page offset (d) – combined with base address to define the physical memory address that is
sent to the memory unit

○ for a given logical address space of size 2^m and page size 2^n, the page number is the upper m - n bits and the page offset is the lower n bits

● Example: 16-bit logical address 1010101001100101b, and page size 2^10
○ page number = 101010b (the upper 6 bits)
○ page offset = 1001100101b (the lower 10 bits)
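
Because the page size is a power of two, the split is a shift and a mask. This C sketch uses the 16-bit address and 2^10 page size from the example.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const unsigned n = 10;          /* page size = 2^10 = 1024 bytes */
    uint16_t logical = 0xAA65;      /* 1010101001100101b             */

    uint16_t page   = logical >> n;               /* upper m-n bits: 101010b = 42 */
    uint16_t offset = logical & ((1u << n) - 1);  /* lower n bits                 */

    printf("page number = %u, offset = %u\n", page, offset);
    return 0;
}
```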
Paging hardware

Paging Model of Logical and Physical Memory

Paging example
32 byte memory
4 byte pages

Free frames

[figure: the list of free frames before allocation and after allocation]


Page Table Implementation

● page table is kept in main memory


● page-table base register (PTBR) points to the page table
● page-table length register (PTLR) indicates size of the page table
● in this scheme every data/instruction access requires at least two memory accesses
○ one access to the page table + one access for the data/instruction itself
● the two memory access problem can be solved by using a special fast-lookup hardware cache
called associative memory or translation lookaside buffers (TLBs)
○ TLBs are extremely fast and extremely small (64 to 1K entries)
○ on TLB miss, value is loaded into TLB for faster access next time

Associative memory
● associative memory ― parallel search on content

● TLB stores a subset of the page table


● TLB searches based on page #, returns corresponding frame #
● search is done in parallel
● if TLB does not contain page #, it must be obtained from page table in memory
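
A toy C sketch of the lookup order (table sizes, contents and the replacement policy are made up): try the TLB first; on a miss, fall back to the in-memory page table and refill a TLB entry so the next access to the same page hits.

```c
#include <stdio.h>

#define TLB_SIZE   4
#define NUM_PAGES  16

struct tlb_entry { int valid, page, frame; };

static struct tlb_entry tlb[TLB_SIZE];
static int page_table[NUM_PAGES] = { 7, 3, 0, 9, 1, 5 };  /* remaining entries default to 0 */
static int next_victim = 0;               /* trivial replacement policy */

static int lookup_frame(int page)
{
    for (int i = 0; i < TLB_SIZE; i++)    /* hardware searches in parallel */
        if (tlb[i].valid && tlb[i].page == page)
            return tlb[i].frame;          /* TLB hit */

    int frame = page_table[page];         /* TLB miss: extra memory access */
    tlb[next_victim] = (struct tlb_entry){ 1, page, frame };
    next_victim = (next_victim + 1) % TLB_SIZE;
    return frame;
}

int main(void)
{
    printf("page 3 -> frame %d (miss)\n", lookup_frame(3));
    printf("page 3 -> frame %d (hit)\n",  lookup_frame(3));
    return 0;
}
```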

Paging hardware with TLB

Memory protection

● memory protection implemented by associating protection bit with each frame to indicate if
read-only or read-write access is allowed
● can also add other protection bits (execute only, etc)
● Valid-invalid bit attached to each entry in the page table:
○ “valid” indicates that the associated page is in the process’ logical address space, and is
thus a legal page
○ “invalid” indicates that the page is not in the process’ logical address space
○ Or use page-table length register (PTLR)
● Any violations result in a trap to the kernel

Typical structure of page table entry
● present/absent bit (aka valid/invalid bit): invalid → page fault
● modified bit (aka dirty bit): set by hardware automatically on write access
● referenced bit: set by hardware automatically on any access
● protection bits: various bits, eg. read/write/execute
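
One possible way to express these fields in C is a bit-field struct; the field names and widths below are illustrative assumptions, not the layout of any particular architecture.

```c
#include <stdint.h>
#include <stdio.h>

/* An illustrative page table entry layout using bit-fields. */
struct pte {
    uint32_t present    : 1;   /* valid/invalid; 0 -> page fault        */
    uint32_t modified   : 1;   /* dirty bit; set by hardware on write   */
    uint32_t referenced : 1;   /* set by hardware on any access         */
    uint32_t protection : 3;   /* eg. read / write / execute bits       */
    uint32_t frame      : 20;  /* physical frame number                 */
};

int main(void)
{
    struct pte e = { .present = 1, .protection = 5 /* r-x */, .frame = 42 };
    printf("pte size: %zu bytes, frame: %u\n", sizeof e, (unsigned)e.frame);
    return 0;
}
```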
Structure of the Page Table

● page tables can get huge using straight-forward methods


○ consider a 32-bit logical address space (still common)
○ page size of 4 KB (2^12)
○ page table would have 1 million entries (2^32 / 2^12 = 2^20)
○ if each entry is 4 bytes → the page table would take 4 MB of memory
○ for 64-bit systems, the page table can get "very" big
● Hierarchical Paging
● Hashed Page Tables
● Inverted Page Tables
● … stay tuned

Summary
● address space, logical/virtual, physical
● MMU, base & limit registers
● address binding - compile- / load- / execution-time
● swapping
● free memory management
○ fixed/dynamic partitioning
○ bitmaps, linked lists
○ placement algorithms: first fit, best fit, worst fit, next fit, quick fit
● virtual memory
○ pages, frames, demand paging, page table, page fault, effective access time

Reference: 3.1 - 3.2.2, 3.3 (Modern Operating Systems); 8.1 - 8.5, 9.1 - 9.2 (Operating System Concepts)
Review

● What is an address space?


● Name two registers used in a simple MMU implementation.
● Explain logical address / physical address.
● What is the purpose of an MMU?
● Best fit memory allocation is faster than first fit. True or False
● Virtual address space is the same as physical address space. True or False
● Page size is the same as frame size. True or False
● Define: page, frame, demand paging, page table, page fault

Questions?

