Memory Management

Pritee Parwekar
MEMORY MANAGEMENT
PHYSICAL TYPES
 Semiconductor
     RAM
 Magnetic
     Disk & Tape
 Optical
     CD & DVD
 Others
     Bubble
     Hologram
MEMORY HIERARCHY
 Registers
     In CPU
 Internal or Main memory
     May include one or more levels of cache
     “RAM”
 External memory
     Backing store
THE BOTTOM LINE
 How much?
Capacity
 How fast?
Time is money
 How expensive?
HIERARCHY LIST
 Registers
 L1 Cache
 L2 Cache
 Main memory
 Disk cache
 Disk
 Optical
 Tape
MEMORY HIERARCHY
 Main Memory
RAM
Volatile, unless backed up with battery
Stores active programs and data
Communicates directly with the CPU
MEMORY HIERARCHY
 Auxiliary Memory
Magnetic (disks & tapes)
Non-volatile
Saves files (programs & data)
Communicates with the CPU through a controller
MEMORY HIERARCHY
[Figure: block diagram with CPU, cache, main memory, I/O processor, magnetic disks, and magnetic tapes]
[Figure: levels in the memory hierarchy (Level 1, Level 2, ..., Level n) below the CPU; distance from the CPU in access time increases with each level, and the figure shows the size of the memory at each level]
MEMORY HIERARCHY OF A MODERN COMPUTER SYSTEM
 By taking advantage of the principle of locality:
 Present the user with as much memory as is available in the cheapest
technology.
 Provide access at the speed offered by the fastest technology.
[Figure: hierarchy from the processor (control, datapath, registers) outward]
Level                                    Speed (ns)                   Size (bytes)
Registers                                1s                           100s
On-chip and second-level cache (SRAM)    10s                          Ks
Main memory (DRAM)                       100s                         Ms
Secondary storage (disk)                 10,000,000s (10s ms)         Gs
Tertiary storage (tape)                  10,000,000,000s (10s sec)    Ts
TABLE OF TECH AND ACCESS TIME
Memory technology    Typical access time
SRAM                 0.5–5 ns
DRAM                 50–70 ns
Magnetic disk        5,000,000–20,000,000 ns

MEMORY
 SRAM:
     Value is stored on a pair of inverting gates
     Very fast but takes up more space than DRAM (4 to 6 transistors)
 DRAM:
     Value is stored as a charge on a capacitor (must be refreshed)
     Very small but slower than SRAM (factor of 5 to 10)
MEMORY CELL OPERATION
 Hit: data appears in some block in the upper level (example: Block X)
 Hit Rate: the fraction of memory accesses found in the upper level
 Miss: data needs to be retrieved from a block in the lower level (Block Y)
 Miss Rate = 1 - (Hit Rate)
 Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor
[Figure: the processor exchanges data with the upper-level memory (holding Blk X), which exchanges blocks with the lower-level memory (holding Blk Y)]
MEMORY HIERARCHY: HOW DOES IT WORK?
 Temporal Locality (Locality in Time):
=> Keep the most recently accessed data items closer to the processor
 Spatial Locality (Locality in Space):
=> Move blocks consisting of contiguous words to the upper levels
PROBLEM 1
 Consider a two-level memory hierarchy (M1, M2). t1 and t2 are the access times of M1 and M2, and h1 is the hit ratio of M1. Calculate the effective access time.
EAT = h1*t1 + (1-h1)*(t1+t2)
If the cost per unit of M1 is C1 and the cost per unit of M2 is C2, then the average cost of memory (with M1 and M2 here denoting the sizes of the two levels) =
(C1*M1 + C2*M2) / (M1+M2)
PROBLEM 2
 Suppose the access time of cache memory is 80 ns and that of main memory is 800 ns. It is estimated that the memory requests are for read. The hit ratio for read access is 0.8.
Determine the average access time of the system considering only memory read cycles.
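The effective-access-time and average-cost formulas from Problem 1 can be checked numerically. The sketch below is a minimal illustration, assuming a miss costs t1 + t2 (the cache is checked first, then main memory is accessed). The function names and the cost figures in the usage lines are hypothetical, chosen only for illustration.

```python
def effective_access_time(h1, t1, t2):
    """Effective access time of a two-level hierarchy.

    Assumption: on a miss the request pays t1 (cache check) plus t2.
    """
    return h1 * t1 + (1 - h1) * (t1 + t2)


def average_cost(c1, size1, c2, size2):
    """Average cost per unit of memory across two levels."""
    return (c1 * size1 + c2 * size2) / (size1 + size2)


# Read cycles with the Problem 2 figures: 80 ns cache, 800 ns main memory, h = 0.8
read_time = effective_access_time(0.8, 80, 800)
print(round(read_time, 3))  # 240.0 (ns)

# Hypothetical costs: 1 unit of M1 at cost 10, 9 units of M2 at cost 1
print(average_cost(10, 1, 1, 9))  # 1.9
```

Under the alternative convention where a miss costs only t2, the second term becomes (1 - h1) * t2 and the result is lower.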
CACHE MEMORY
 A special hierarchical level memory

 Why is cache memory needed ?
WHY IS CACHE MEMORY NEEDED?
 When a program references a memory location, it is likely to reference that same memory location again soon.
WHY IS CACHE MEMORY NEEDED ?
 A small but fast cache memory, in which the contents of the most commonly accessed locations are maintained, can be placed between the CPU and the main memory.
 When a program executes, the cache memory is searched first.
CACHE MEMORY
 Why is cache fast ?
WHY IS CACHE MEMORY FAST?
 A cache memory has fewer locations than a main memory, which reduces the access time
 The cache is placed both physically closer and logically closer to the CPU than the main memory
WHY IS CACHE MEMORY FAST
 A computer without a cache usually needs a few bus cycles to synchronize the CPU with the bus.
 A cache memory can be positioned closer to the CPU.
CACHE MEMORY
 High speed (towards CPU speed)
 Small size (power & cost)
[Figure: the CPU accesses the fast cache; on a hit the data comes from the cache, on a miss it comes from the slow main memory]
95% hit ratio
Access = 0.95 Cache + 0.05 Mem
CACHE MEMORY
[Figure: the CPU issues a 30-bit address; main memory holds 1 Gword, while the cache holds 1 Mword and needs only 20 address bits]
CACHE MEMORY
[Figure: main memory addresses 00000000 to 3FFFFFFF must be mapped onto cache addresses 00000 to FFFFF (address mapping)]
CACHE OPERATION - OVERVIEW
 CPU requests contents of memory location
 Check cache for this data
 If present, get from cache (fast)
 If not present, read required block from main memory to cache
 Then deliver from cache to CPU
 Cache includes tags to identify which block of main memory is in each cache slot
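The read steps above can be sketched in a few lines of code. This is a minimal illustration rather than a model of real hardware: it assumes a direct-mapped placement of blocks, a Python list standing in for main memory, and hypothetical names (`SimpleCache`, `read`).

```python
class SimpleCache:
    """Toy cache that follows the read flow: check, fetch on miss, deliver."""

    def __init__(self, num_lines, block_size, memory):
        self.num_lines = num_lines
        self.block_size = block_size
        self.memory = memory              # list of words standing in for main memory
        self.tags = [None] * num_lines    # tag per slot; None means the slot is empty
        self.data = [None] * num_lines    # cached block per slot
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block = address // self.block_size
        line = block % self.num_lines     # assumed direct-mapped placement
        tag = block // self.num_lines
        if self.tags[line] == tag:        # tag match: the block is present
            self.hits += 1
        else:                             # miss: read the block from main memory
            self.misses += 1
            start = block * self.block_size
            self.data[line] = self.memory[start:start + self.block_size]
            self.tags[line] = tag
        return self.data[line][address % self.block_size]  # deliver from cache


memory = list(range(64))                  # word at address a holds the value a
cache = SimpleCache(num_lines=4, block_size=4, memory=memory)
values = [cache.read(a) for a in (0, 1, 2, 16, 0)]
print(values, cache.hits, cache.misses)   # [0, 1, 2, 16, 0] 2 3
```

Addresses 0 and 16 fall in blocks 0 and 4, which share line 0 in this small cache, so the final read of address 0 misses again.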
CACHE READ OPERATION - FLOWCHART
CACHE AND MAIN MEMORY
CACHE DESIGN
 Size
 Mapping Function
 Replacement Algorithm
 Write Policy
 Block Size
 Number of Caches
CACHE
 Small amount of fast memory

 Sits between normal main memory and CPU
 May be located on CPU chip or module
CACHE-MAPPING FUNCTIONS
 DIRECT MAPPING
 ASSOCIATIVE MAPPING
 SET ASSOCIATIVE MAPPING
DIRECT MAPPED CACHE
 Mapping: each memory block maps to exactly one location in the cache:
(Block address) mod (Number of lines in cache)
 The number of lines is typically a power of two, so the cache location is obtained from the low-order bits of the block address.
[Figure: an 8-line direct-mapped cache (lines 000 to 111); memory blocks 00001, 00101, 01001, 01101, 10001, 10101, 11001, and 11101 all map to cache line 001]
DIRECT MAPPING
CACHE LINE TABLE
Cache line    Main memory blocks held
0             0, m, 2m, 3m, …, 2^s - m
1             1, m+1, 2m+1, …, 2^s - m + 1
…             …
m-1           m-1, 2m-1, 3m-1, …, 2^s - 1

CACHE/MAIN MEMORY STRUCTURE
DIRECT MAPPING FROM CACHE TO MAIN MEMORY
DIRECT MAPPING CACHE ORGANIZATION
DIRECT MAPPING EXAMPLE
 If you have a 24-bit address in direct mapping with a block size of 4 words and 1K lines in the cache, how would the address be partitioned for the cache?
ANSWER
Direct Mapping Address Partitions

 Tag (12 bits)
 line identifier (10 bits)
 word id (2 bits)
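The partition above (12-bit tag, 10-bit line, 2-bit word) can be derived and applied in code. This is a minimal sketch assuming a word-addressed memory and power-of-two sizes; the function names and the sample address 0xABCDEF are hypothetical.

```python
def field_widths(address_bits, num_lines, block_words):
    """Return (tag, line, word) widths for a direct-mapped cache."""
    word_bits = block_words.bit_length() - 1   # log2 for powers of two
    line_bits = num_lines.bit_length() - 1
    tag_bits = address_bits - line_bits - word_bits
    return tag_bits, line_bits, word_bits


def split_address(address, tag_bits, line_bits, word_bits):
    """Split an address into its (tag, line, word) fields."""
    word = address & ((1 << word_bits) - 1)
    line = (address >> word_bits) & ((1 << line_bits) - 1)
    tag = (address >> (word_bits + line_bits)) & ((1 << tag_bits) - 1)
    return tag, line, word


widths = field_widths(address_bits=24, num_lines=1024, block_words=4)
print(widths)  # (12, 10, 2)

tag, line, word = split_address(0xABCDEF, *widths)
print(hex(tag), hex(line), word)  # 0xabc 0x37b 3
```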
PROBLEM
 How many total bits are required for a 16 KB direct-mapped cache with 4-word blocks, if main memory is 8 MB?
DIRECT MAPPING
ADDRESS STRUCTURE
Tag (s-r)    Line or Slot (r)    Word (w)
7 bits       14 bits             2 bits
 23 bit address
 2 bit word identifier (4 byte block)
 21 bit block identifier
 7 bit tag (=21-14)
 14 bit slot or line
 No two blocks in the same line have the same Tag field
 Check contents of cache by finding line and checking Tag
PROBLEM
 How many total bits are required for a direct-mapped cache with 8 KB of data and 4-word blocks, assuming a 32-bit address?
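One way to work this kind of problem is to count the data, tag, and valid bits per line and multiply by the number of lines. The sketch below is only an illustration under stated assumptions: 32-bit (4-byte) words, byte addressing, and one valid bit per line. The function name is hypothetical.

```python
def direct_mapped_total_bits(data_bytes, block_words, address_bits,
                             word_bytes=4, valid_bits=1):
    """Total storage bits (data + tag + valid) of a direct-mapped cache.

    Assumptions: byte-addressable memory, power-of-two sizes,
    word_bytes bytes per word, valid_bits valid bits per line.
    """
    block_bytes = block_words * word_bytes
    num_lines = data_bytes // block_bytes
    index_bits = num_lines.bit_length() - 1      # selects the line
    offset_bits = block_bytes.bit_length() - 1   # selects the byte in the block
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_line = block_bytes * 8 + tag_bits + valid_bits
    return num_lines * bits_per_line


total = direct_mapped_total_bits(8 * 1024, block_words=4, address_bits=32)
print(total, total // 1024)  # 75776 74  (bits, Kbits)
```

With these assumptions the cache has 512 lines of 128 data bits, 19 tag bits, and 1 valid bit each.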
ASSOCIATIVE MAPPING
 A main memory block can load into any line of cache
 Memory address is interpreted as tag and word
 Tag uniquely identifies block of memory
 Every line’s tag is examined for a match
 Cache searching gets expensive
ASSOCIATIVE MAPPING FROM CACHE TO MAIN MEMORY
FULLY ASSOCIATIVE CACHE ORGANIZATION
ASSOCIATIVE MAPPING
ADDRESS STRUCTURE
Tag        Word
22 bits    2 bits
 22 bit tag
 Compare tag field with tag entry in cache to check for hit
 Least significant 2 bits of address identify which word is required from data block
SET ASSOCIATIVE MAPPING
 Cache is divided into a number of sets
 Each set contains a number of lines
 A given block maps to any line in a given set
     e.g. Block B can be in any line of set i
 e.g. 2 lines per set
     2 way associative mapping
     A given block can be in one of 2 lines in only one set
 The cache memory is divided into 'v' sets, each consisting of 'n' cache lines. A block from Main memory is first mapped onto a specific cache set, and then it can be placed anywhere within that set. This type of mapping offers a good trade-off between implementation cost and efficiency.
 Cache set number = (Main memory block number) MOD (Number of sets in the cache memory)
 If there are 'n' cache lines in a set, the cache placement is called n-way set associative, i.e. if there are two blocks or cache lines per set, it is a 2-way set associative cache mapping, and if there are four blocks or cache lines per set, it is a 4-way set associative cache mapping.
 NOTE: The Main memory is not physically partitioned in the given way, but this is the view of Main memory that the cache sees.
 NOTE: We are dividing both Main Memory and cache memory into blocks of the same size.
PROBLEM
 We have a Main Memory of size 4 GB, with each byte directly addressable. Main memory is divided into blocks of 32 bytes each. The cache memory is 512 KB. Two-way set associative mapping is used for the cache. Find the distribution of bits in the address.
SOLUTION
 Let us assume we have a Main Memory of size 4GB (2^32), with each byte directly addressable by a 32-bit address. We will divide Main memory into blocks of 32 bytes each (2^5). Thus there are 128M (i.e. 2^32/2^5 = 2^27) blocks in Main memory.
We have a Cache memory of 512KB (i.e. 2^19), divided into blocks of 32 bytes each (2^5). Thus there are 16K (i.e. 2^19/2^5 = 2^14) blocks, also known as Cache slots or Cache lines, in cache memory. It is clear from the above numbers that there are more Main memory blocks than Cache slots.
SOLUTION
 Let us try 2-way set associative cache mapping, i.e. 2 cache lines per set. We will divide the 16K cache lines into sets of 2, and hence there are 8K (2^14/2 = 2^13) sets in the Cache memory.
Cache Size = (Number of Sets) * (Size of each set) * (Cache line size)
So even using the above formula we can find the number of sets in the Cache memory, i.e.
2^19 = (Number of Sets) * 2 * 2^5
Number of Sets = 2^19 / (2 * 2^5) = 2^13.
SOLUTION
 When an address is mapped to a set, the direct mapping scheme is used, and then associative mapping is used within a set.
The format for an address has 13 bits in the set field, which identifies the set in which the addressed word will be found if it is in the cache. There are five bits for the word field as before, and there is a 14-bit tag field; together these make up the 32 bits of the address:
Tag (14 bits)    Set (13 bits)    Word (5 bits)
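The field widths in this solution can be reproduced with a few lines of arithmetic. The sketch below assumes byte addressing and power-of-two sizes; the function name `set_associative_fields` is hypothetical.

```python
def set_associative_fields(memory_bytes, cache_bytes, block_bytes, ways):
    """Return (tag, set, word) field widths in bits."""
    address_bits = memory_bytes.bit_length() - 1   # log2 for powers of two
    word_bits = block_bytes.bit_length() - 1
    num_sets = cache_bytes // (block_bytes * ways)
    set_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits


GB, KB = 2 ** 30, 2 ** 10
print(set_associative_fields(4 * GB, 512 * KB, 32, ways=2))  # (14, 13, 5)
```

Setting `ways=1` gives the direct-mapped split for the same cache, (13, 14, 5); doubling the associativity moves one bit from the set field to the tag.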
PROBLEM
 A computer system has 8 GB of word-addressable main memory and a 256 KB unified cache memory with 64-word blocks. How many bits are required for the set and word-offset fields if 4-way set associative mapping is used?
PROBLEM
 An address for a byte-addressable memory presented to the cache unit is divided as follows: 13-bit tag, 14-bit line index and 5-bit offset.
What is the cache size in bytes?
What is the cache mapping scheme?
PROBLEM
 A block set-associative cache consists of 64 blocks divided into 4-block sets. The main memory contains 4096 blocks, each consisting of 128 words of 16 bits length.
i) How many bits are there in a main memory address?
ii) How many bits are there in each of the TAG, SET and WORD fields?
PROBLEM
 Main memory is 16 MB with a block size of 16 words. Cache memory is 64 KB. Find the number of bits used in the TAG field.
EXAMPLE
 In the example below we have chosen block 14 from Main memory and compared where the different mapping schemes place it. In a Direct Mapped cache it can be placed in Frame 6, since 14 mod 8 = 6. In a Set associative cache it can be placed in set 2.
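The placement of block 14 can be reproduced with two modulo operations. The sketch assumes an 8-frame cache and, for the set-associative case, 2 frames per set (so 4 sets); the set count is inferred from the stated result "set 2" and is an assumption, as are the function names.

```python
def direct_mapped_frame(block, num_frames):
    """Frame index under direct mapping."""
    return block % num_frames


def set_index(block, num_frames, ways):
    """Set index under set-associative mapping with `ways` frames per set."""
    num_sets = num_frames // ways
    return block % num_sets


print(direct_mapped_frame(14, num_frames=8))   # 6
print(set_index(14, num_frames=8, ways=2))     # 2
print(set_index(14, num_frames=8, ways=8))     # 0 (fully associative: one set)
```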
MAPPING FROM MAIN MEMORY TO CACHE: V ASSOCIATIVE
MAPPING FROM MAIN MEMORY TO CACHE: K-WAY ASSOCIATIVE
K-WAY SET ASSOCIATIVE CACHE ORGANIZATION
ADDRESS STRUCTURE
Tag      Set       Word
9 bit    13 bit    2 bit
 Use set field to determine cache set to look in
 Compare tag field to see if we have a hit
 e.g.
Address     Tag    Data        Set number
1FF 7FFC    1FF    12345678    1FFF
001 7FFC    001    11223344    1FFF
TWO WAY SET ASSOCIATIVE MAPPING EXAMPLE
SUMMARY
 Address length = (s + w) bits
 Number of addressable units = 2^(s+w) words or bytes
 Block size = line size = 2^w words or bytes
 Number of blocks in main memory = 2^s
 Number of lines in set = k
 Number of sets = v = 2^d
 Number of lines in cache = kv = k * 2^d
 Size of tag = (s - d) bits
Therefore, a cache line's tag size depends on three factors:
 Size of cache memory;
 Associativity of cache memory;
 Cacheable range of operating memory.
Here,
Stag: size of cache tag, in bits;
Smemory: cacheable range of operating memory, in bytes;
Scache: size of cache memory, in bytes;
A: associativity of cache memory, in ways.
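A relation consistent with these definitions is Stag = log2(Smemory * A / Scache), because the tag is what remains of the address after removing the set-index and offset bits. The sketch below evaluates it, assuming power-of-two sizes and byte addressing; the function name `tag_size_bits` is hypothetical.

```python
def tag_size_bits(memory_bytes, cache_bytes, ways):
    """Tag width in bits: log2(Smemory * A / Scache), power-of-two sizes assumed."""
    ratio = memory_bytes * ways // cache_bytes
    return ratio.bit_length() - 1   # exact log2 for a power of two


GB, KB = 2 ** 30, 2 ** 10
print(tag_size_bits(4 * GB, 512 * KB, ways=2))   # 14
print(tag_size_bits(4 * GB, 512 * KB, ways=1))   # 13 (direct mapped)
```

A larger cache shortens the tag, while higher associativity or a larger cacheable range lengthens it.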
EXAMPLE
 Three small 4-word caches: direct mapped, two-way set associative, fully associative
 How many misses in the sequence of block addresses: 0, 8, 0, 6, 8?
 How does this change with 8 words, 16 words?
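The miss counts for the first question can be found by simulation. The sketch below assumes each cache holds four one-word blocks and that LRU replacement is used within a set; both are assumptions, as is the function name `count_misses`. Setting `ways=1` gives direct mapping and `ways=num_blocks` gives a fully associative cache.

```python
from collections import OrderedDict


def count_misses(block_addresses, num_blocks, ways):
    """Count misses in a set-associative cache with LRU replacement per set."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]   # oldest entry first
    misses = 0
    for block in block_addresses:
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)        # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:          # set full: evict least recently used
                lines.popitem(last=False)
            lines[block] = True
    return misses


sequence = [0, 8, 0, 6, 8]
print(count_misses(sequence, num_blocks=4, ways=1))  # 5 (direct mapped)
print(count_misses(sequence, num_blocks=4, ways=2))  # 4 (two-way)
print(count_misses(sequence, num_blocks=4, ways=4))  # 3 (fully associative)
```

Calling the same function with `num_blocks=8` or `num_blocks=16` explores the second question.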
 When does set associative mapping reduce to direct mapping, and when to fully associative mapping?
VALID BIT / DIRTY BIT
Valid Bit: When a program is first loaded into main memory, the cache is cleared, and so while a program is executing, a valid bit is needed to indicate whether or not the slot holds a line that belongs to the program being executed.
VALID BIT / DIRTY BIT
Dirty Bit: keeps track of whether or not a line has been modified while it is in the cache. A slot that is modified must be written back to the main memory before the slot is reused for another line.
REPLACEMENT
ALGORITHM
 Optimal Replacement: replace a block that will not be needed again. If all blocks currently in Cache Memory will be used again, replace the one that will not be used for the longest time in the future.
 Random selection: replace a randomly selected block among all blocks currently in Cache Memory.
REPLACEMENT
ALGORITHM
 FIFO (first-in first-out): replace the block that has been in Cache Memory for the longest time.
 LRU (Least recently used): replace the block in Cache Memory that has not been used for the longest time.
 LFU (Least frequently used): replace the block in Cache Memory that has been used the least number of times.
REPLACEMENT ALGORITHMS
 For Associative & Set-Associative Cache
Which location should be emptied when the cache is full and a miss occurs?
 First In First Out (FIFO)
 Least Recently Used (LRU)
 Distinguish an Empty location from a Full one
 Valid Bit
FIFO
CPU reference    A     B     C     A     D     E     A     D     C     F
Result           Miss  Miss  Miss  Hit   Miss  Miss  Miss  Hit   Hit   Miss
Cache line 1     A     A     A     A     A     E     E     E     E     E
Cache line 2     -     B     B     B     B     B     A     A     A     A
Cache line 3     -     -     C     C     C     C     C     C     C     F
Cache line 4     -     -     -     -     D     D     D     D     D     D
Hit Ratio = 3 / 10 = 0.3
LRU (cache contents listed from most to least recently used)
CPU reference    A     B     C     A     D     E     A     D     C     F
Result           Miss  Miss  Miss  Hit   Miss  Miss  Hit   Hit   Hit   Miss
Most recent      A     B     C     A     D     E     A     D     C     F
                 -     A     B     C     A     D     E     A     D     C
                 -     -     A     B     C     A     D     E     A     D
Least recent     -     -     -     -     B     C     C     C     E     A
Hit Ratio = 4 / 10 = 0.4
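The two hit ratios can be reproduced with a short simulation of a 4-line fully associative cache on the reference string A B C A D E A D C F. This is a minimal sketch; the function names are hypothetical.

```python
from collections import OrderedDict, deque


def fifo_hits(references, capacity):
    """Count hits when the block resident longest is replaced (FIFO)."""
    resident, arrival_order, hits = set(), deque(), 0
    for ref in references:
        if ref in resident:
            hits += 1                       # a hit does not change FIFO order
        else:
            if len(resident) == capacity:
                resident.discard(arrival_order.popleft())
            resident.add(ref)
            arrival_order.append(ref)
    return hits


def lru_hits(references, capacity):
    """Count hits when the least recently used block is replaced (LRU)."""
    resident, hits = OrderedDict(), 0       # oldest use first
    for ref in references:
        if ref in resident:
            hits += 1
            resident.move_to_end(ref)       # a hit refreshes recency
        else:
            if len(resident) == capacity:
                resident.popitem(last=False)
            resident[ref] = True
    return hits


refs = list("ABCADEADCF")
print(fifo_hits(refs, 4) / len(refs))  # 0.3
print(lru_hits(refs, 4) / len(refs))   # 0.4
```

The difference comes from the seventh reference: FIFO has already evicted A when E arrived, while LRU kept A because it was used recently and evicted B instead.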
LOAD-THROUGH
STORE-THROUGH
Load-Through: When the CPU needs to read a word from the memory, the block containing the word is brought from MM to CM, while at the same time the word is forwarded to the CPU.
Store-Through: If store-through is used, a word to be stored from CPU to memory is written to both CM (if the word is in there) and MM.
WRITE METHODS
 Note: Words in a cache have been viewed simply as copies of words from main memory that are read from the cache to provide faster access. However, this viewpoint changes when the CPU writes.
 There are 3 possible write actions:
     Write the result into the main memory
     Write the result into the cache
     Write the result into both main memory and cache memory
WRITE METHODS
 Write Through: A cache architecture in which data is written to main memory at the same time as it is cached.
 Write Back / Copy Back: On a cache hit, the CPU writes only to the cache; the modified line is copied back to main memory when its slot is reused. On a cache miss, the CPU performs a write to main memory.
On a cache write miss:
 Write Allocate: loads the memory block into the cache and updates the cache block
 No-Write Allocate: bypasses the cache and writes the word directly into memory.
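The effect of the write policy on main-memory traffic can be illustrated with a toy model. The sketch below is a deliberate simplification, assuming a one-line cache, write-allocate on every miss, and a final flush of any dirty line; the function name and policy strings are hypothetical. Block fetches caused by allocation are not counted, only writes to main memory.

```python
def main_memory_writes(written_blocks, policy):
    """Count writes to main memory for a one-line, write-allocate cache.

    policy is "write-through" or "write-back" (hypothetical labels).
    """
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy: " + policy)
    cached_block, dirty, writes = None, False, 0
    for block in written_blocks:
        if block != cached_block:               # write miss: allocate the block
            if dirty:
                writes += 1                     # copy the dirty line back first
            cached_block, dirty = block, False
        if policy == "write-through":
            writes += 1                         # every store also goes to memory
        else:
            dirty = True                        # write-back: only mark the line
    if dirty:
        writes += 1                             # flush the last dirty line
    return writes


stores = [0, 0, 0, 1, 1]                        # block numbers being written
print(main_memory_writes(stores, "write-through"))  # 5
print(main_memory_writes(stores, "write-back"))     # 2
```

Repeated stores to the same block are where write-back saves traffic, at the cost of tracking a dirty bit per line.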
CACHE CONFLICT
 A sequence of accesses to memory repeatedly overwriting the same cache entry.
 This can happen if two blocks of data, which are mapped to the same set of cache locations, are needed simultaneously.
