Pritee Parwekar

MEMORY MANAGEMENT
PHYSICAL TYPES

 Semiconductor
 RAM

 Magnetic

 Disk & Tape

 Optical

 CD & DVD

 Others

 Bubble

 Hologram
MEMORY HIERARCHY

 Registers
 In CPU

 Internal or Main memory

 May include one or more levels of cache

 “RAM”

 External memory

 Backing store

THE BOTTOM LINE

 How much?
Capacity

 How fast?

Time is money

 How expensive?

HIERARCHY LIST

 Registers
 L1 Cache

 L2 Cache

 Main memory

 Disk cache

 Disk

 Optical

 Tape

MEMORY HIERARCHY
 Main Memory

RAM

Volatile, unless backed up with battery

Stores active programs and data

Communicates directly with the CPU

MEMORY HIERARCHY
 Auxiliary Memory

Magnetic (disks & tapes)

Non volatile

Saves files (programs & data)

Communicates with the CPU through a controller

MEMORY HIERARCHY
[Figure: block diagram of the hierarchy showing the CPU, cache, main memory, I/O processor, magnetic disks, and magnetic tapes.]

[Figure: levels in the memory hierarchy. Level 1, Level 2, ..., Level n are stacked below the CPU; distance from the CPU in access time increases from Level 1 to Level n, and the width of each level shows the size of the memory at that level.]

MEMORY HIERARCHY OF A MODERN COMPUTER SYSTEM
 By taking advantage of the principle of locality:
 Present the user with as much memory as is available in the cheapest
technology.
 Provide access at the speed offered by the fastest technology.

[Figure: levels of the hierarchy, from the processor (control and datapath) outward.]

Level                                      Speed (ns)                   Size (bytes)
Registers                                  1s                           100s
Cache (on-chip and second-level, SRAM)     10s                          Ks
Main memory (DRAM)                         100s                         Ms
Secondary storage (disk)                   10,000,000s (10s ms)         Gs
Tertiary storage (tape)                    10,000,000,000s (10s sec)    Ts

TABLE OF TECH AND ACCESS TIME

Memory technology    Typical access time
SRAM                 0.5–5 ns
DRAM                 50–70 ns
Magnetic disk        5,000,000–20,000,000 ns


MEMORY
 SRAM:
 Value is stored on a pair of inverting gates
 Very fast but takes up more space than DRAM (4 to 6 transistors per bit)
 DRAM:
 Value is stored as a charge on a capacitor (must be refreshed)
 Very small but slower than SRAM (factor of 5 to 10)
MEMORY CELL OPERATION

 Hit: the data appears in some block in the upper level (example: Block X)
 Hit Rate: the fraction of memory accesses found in the upper level
 Miss: the data needs to be retrieved from a block in the lower level (Block Y)
 Miss Rate = 1 - (Hit Rate)
 Miss Penalty: time to replace a block in the upper level + time to deliver the block to the processor

[Figure: the processor exchanges data with the upper-level memory (holding Blk X), which exchanges blocks with the lower-level memory (holding Blk Y).]
MEMORY HIERARCHY: HOW DOES IT WORK?

 Temporal Locality (Locality in Time):
=> Keep the most recently accessed data items closer to the processor
 Spatial Locality (Locality in Space):
=> Move blocks consisting of contiguous words to the upper levels

[Figure: upper-level memory (Blk X) next to the processor and lower-level memory (Blk Y) behind it.]
PROBLEM 1
 Consider a two-level memory hierarchy (M1, M2).
t1 and t2 are the access times of M1 and M2, and h1 is the hit ratio of
M1. Calculate the effective access time.

EAT = h1*t1 + (1-h1)*(t1+t2)
If the cost per unit of M1 is C1 and the cost per unit of M2
is C2, then the average cost per unit of memory =
(C1*M1 + C2*M2) / (M1+M2), where M1 and M2 here stand for the sizes of the two memories
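
The two formulas in this problem can be checked numerically. The following is a minimal sketch, assuming a miss costs t1 + t2 (the cache is checked first, then main memory); the `hierarchical` flag, the function names, and the sample numbers are illustrative assumptions, not values from the slides.

```python
def effective_access_time(h1, t1, t2, hierarchical=True):
    """Average access time of a two-level hierarchy (M1, M2).

    hierarchical=True  : a miss costs t1 + t2 (M1 is probed before M2).
    hierarchical=False : a miss costs t2 only (M1 and M2 probed in parallel).
    """
    miss_time = t1 + t2 if hierarchical else t2
    return h1 * t1 + (1 - h1) * miss_time


def average_cost_per_unit(c1, s1, c2, s2):
    """Average cost per unit of storage; s1 and s2 are the sizes of M1 and M2."""
    return (c1 * s1 + c2 * s2) / (s1 + s2)


# Made-up sample values: 90% hit ratio, t1 = 10 ns, t2 = 100 ns.
print(round(effective_access_time(0.9, 10, 100), 3))
print(round(average_cost_per_unit(c1=100, s1=1, c2=1, s2=99), 3))
```

The cost formula shows why the hierarchy pays off: a small, expensive M1 barely moves the average cost per unit when M2 is much larger.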

PROBLEM 2
 Suppose the access time of cache memory is 80 ns
and that of main memory is 800 ns. It is estimated
that the memory requests are for read. The hit ratio
for read accesses is 0.8.
Determine the average access time of the system
considering only memory read cycles.
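
One possible way to work this problem is sketched below. The slide does not say whether a read miss pays for the cache probe as well as the main-memory access, so both conventions are computed; the variable names are assumptions made for this sketch.

```python
cache_ns = 80    # cache access time from the problem
main_ns = 800    # main memory access time from the problem
hit_ratio = 0.8  # hit ratio for read accesses

# Convention 1: on a miss the cache is probed first, then main memory is read.
avg_hierarchical = hit_ratio * cache_ns + (1 - hit_ratio) * (cache_ns + main_ns)

# Convention 2: on a miss only the main-memory access time is charged.
avg_simple = hit_ratio * cache_ns + (1 - hit_ratio) * main_ns

print(round(avg_hierarchical), "ns if a miss costs cache + main memory")
print(round(avg_simple), "ns if a miss costs main memory only")
```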

CACHE MEMORY

 A special hierarchical level memory


 Why is cache memory needed ?

WHY IS CACHE MEMORY NEEDED?

 When a program references a memory location, it is likely to
reference that same memory location again soon.

WHY IS CACHE MEMORY NEEDED ?

 A small but fast cache memory, in which the contents of the most
commonly accessed locations are maintained, can be placed
between the CPU and the main memory.

 When a program executes, the cache memory is searched first.

CACHE MEMORY

 Why is cache fast ?

WHY IS CACHE MEMORY FAST?

 A cache memory has fewer locations than a main memory, which
reduces the access time
 The cache is placed both physically closer and logically closer
to the CPU than the main memory

WHY IS CACHE MEMORY FAST?

 A cacheless computer usually needs a few bus cycles to
synchronize the CPU with the bus.
 A cache memory can be positioned closer to the CPU.

CACHE MEMORY
 High speed (towards CPU speed)
 Small size (power & cost)

[Figure: the CPU accesses the cache (fast). On a hit the cache supplies the data; on a miss the request goes to main memory (slow).]

With a 95% hit ratio:
Access = 0.95 Cache + 0.05 Mem
CACHE MEMORY

[Figure: the CPU issues a 30-bit address. Main memory holds 1 Gword; the cache holds 1 Mword and needs only 20 address bits.]

CACHE MEMORY
[Figure: address mapping. Cache addresses run from 00000 to FFFFF; main memory addresses run from 00000000 to 3FFFFFFF.]

CACHE OPERATION - OVERVIEW

 CPU requests contents of memory location
 Check cache for this data
 If present, get from cache (fast)
 If not present, read required block from main memory to cache
 Then deliver from cache to CPU
 Cache includes tags to identify which block of main memory is in each cache slot
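
The read sequence above can be sketched in a few lines. This is a toy model, not the organization from the slides: the class name `TinyCache`, the choice of direct placement of blocks into slots, and the sample memory contents are all assumptions made for illustration.

```python
class TinyCache:
    """Toy cache: each slot holds (tag, block) or None when empty."""

    def __init__(self, memory, num_slots, block_size):
        self.memory = memory
        self.num_slots = num_slots
        self.block_size = block_size
        self.slots = [None] * num_slots
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_number = address // self.block_size
        slot = block_number % self.num_slots   # assumed direct placement
        tag = block_number // self.num_slots   # identifies which block is in the slot
        entry = self.slots[slot]
        if entry is not None and entry[0] == tag:
            self.hits += 1                     # present: get from cache
        else:
            self.misses += 1                   # not present: read block from main memory
            start = block_number * self.block_size
            self.slots[slot] = (tag, self.memory[start:start + self.block_size])
        # Deliver the requested word from the cache to the CPU.
        return self.slots[slot][1][address % self.block_size]


memory = list(range(100, 164))  # 64 words; word at address a holds a + 100
cache = TinyCache(memory, num_slots=4, block_size=4)
values = [cache.read(a) for a in (0, 1, 2, 16, 0)]
print(values, cache.hits, cache.misses)
```

Addresses 0 and 16 fall in different blocks that share slot 0, so the final read of address 0 misses again even though it was cached earlier.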
CACHE READ OPERATION - FLOWCHART

CACHE AND MAIN MEMORY

CACHE DESIGN

 Size
 Mapping Function

 Replacement Algorithm

 Write Policy

 Block Size

 Number of Caches

CACHE

 Small amount of fast memory


 Sits between normal main memory and CPU

 May be located on CPU chip or module

CACHE-MAPPING FUNCTIONS

 DIRECT MAPPING
 ASSOCIATIVE MAPPING

 SET ASSOCIATIVE MAPPING

DIRECT MAPPED CACHE

 Mapping: each memory block maps to exactly one location in the cache:
(Block address) mod (Number of lines in cache)
 The number of lines is typically a power of two, so the
cache location is obtained from the low-order bits of the block address.


[Figure: an eight-line direct-mapped cache with lines 000 to 111. Memory blocks 00001, 01001, 10001, and 11001 map to line 001; blocks 00101, 01101, 10101, and 11101 map to line 101.]
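
The mod rule and the low-order-bits shortcut give the same line whenever the number of lines is a power of two. A minimal sketch using the block addresses from the figure (the function name is an assumption):

```python
def direct_mapped_line(block_address, num_lines):
    """Cache line for a block: (block address) mod (number of lines)."""
    return block_address % num_lines


NUM_LINES = 8  # eight lines, 000 to 111
blocks = [0b00001, 0b00101, 0b01001, 0b01101, 0b10001, 0b10101, 0b11001, 0b11101]
lines = [direct_mapped_line(b, NUM_LINES) for b in blocks]

# With a power-of-two line count, masking the low-order bits is equivalent.
masked = [b & (NUM_LINES - 1) for b in blocks]

for b, line in zip(blocks, lines):
    print(f"block {b:05b} -> line {line:03b}")
```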
DIRECT MAPPING

CACHE LINE TABLE

Cache line    Main memory blocks held
0             0, m, 2m, 3m, ..., 2^s - m
1             1, m+1, 2m+1, ..., 2^s - m + 1
...
m-1           m-1, 2m-1, 3m-1, ..., 2^s - 1


CACHE/MAIN MEMORY STRUCTURE

DIRECT MAPPING FROM CACHE TO MAIN
MEMORY

DIRECT MAPPING CACHE ORGANIZATION

DIRECT MAPPING EXAMPLE

 If you have a 24-bit address in direct mapping with a block size
of 4 words and 1K lines in the cache, how would the address be
partitioned for the cache?

ANSWER

Direct Mapping Address Partitions


 Tag (12 bits)

 line identifier (10 bits)

 word id (2 bits)
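
The partition can be derived and then applied to a concrete address. The sketch below assumes power-of-two sizes; the function names and the sample address 0xABCDEF are illustrative choices, not part of the original problem.

```python
def direct_mapped_fields(address_bits, block_words, num_lines):
    """Return (tag_bits, line_bits, word_bits) for a direct-mapped cache."""
    word_bits = block_words.bit_length() - 1  # log2 of a power of two
    line_bits = num_lines.bit_length() - 1
    tag_bits = address_bits - line_bits - word_bits
    return tag_bits, line_bits, word_bits


def split_address(address, line_bits, word_bits):
    """Split an address into (tag, line, word) using the field widths."""
    word = address & ((1 << word_bits) - 1)
    line = (address >> word_bits) & ((1 << line_bits) - 1)
    tag = address >> (word_bits + line_bits)
    return tag, line, word


fields = direct_mapped_fields(address_bits=24, block_words=4, num_lines=1024)
tag, line, word = split_address(0xABCDEF, line_bits=fields[1], word_bits=fields[2])
print(fields, hex(tag), hex(line), word)
```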

PROBLEM

 How many total bits are required for a direct-mapping cache
with 16KB and 4-word blocks, given that main memory is 8MB?

DIRECT MAPPING
ADDRESS STRUCTURE
Tag (s-r)    Line or Slot (r)    Word (w)
7 bits       14 bits             2 bits

 23-bit address
 2-bit word identifier (4-byte block)
 21-bit block identifier
 7-bit tag (= 21 - 14)
 14-bit slot or line
 No two blocks that map to the same line have the same Tag field
 Check contents of cache by finding line and checking Tag
PROBLEM

 How many total bits are required for a direct-mapping cache
with 8KB of data and 4-word blocks, assuming a 32-bit address?
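
One way to approach this problem is to count, per cache line, the data bits, the tag bits, and one valid bit. The sketch below assumes 4-byte (32-bit) words, byte addressing, and a single valid bit per line; none of these are stated on the slide, so the result holds only under those assumptions.

```python
def direct_mapped_total_bits(data_bytes, block_words, address_bits, word_bytes=4):
    """Total storage bits (data + tag + valid) of a direct-mapped cache."""
    block_bytes = block_words * word_bytes
    num_lines = data_bytes // block_bytes
    index_bits = num_lines.bit_length() - 1    # selects the line
    offset_bits = block_bytes.bit_length() - 1  # selects the byte within the block
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_line = block_bytes * 8 + tag_bits + 1  # data + tag + valid bit
    return num_lines * bits_per_line


total = direct_mapped_total_bits(data_bytes=8 * 1024, block_words=4, address_bits=32)
print(total, "bits, i.e.", total / 1024, "Kbits")
```

Under these assumptions there are 512 lines of 128 data bits, each with a 19-bit tag and a valid bit, so the cache needs noticeably more storage than the 64 Kbits of data alone.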

ASSOCIATIVE MAPPING

 A main memory block can load into any line of cache
 Memory address is interpreted as tag and word
 Tag uniquely identifies block of memory
 Every line’s tag is examined for a match
 Cache searching gets expensive

ASSOCIATIVE MAPPING FROM
CACHE TO MAIN MEMORY

FULLY ASSOCIATIVE CACHE
ORGANIZATION

ASSOCIATIVE MAPPING
ADDRESS STRUCTURE

Tag        Word
22 bits    2 bits

 22-bit tag
 Compare tag field with tag entry in cache to check for hit
 Least significant 2 bits of address identify which word is
required from data block

SET ASSOCIATIVE MAPPING

 Cache is divided into a number of sets
 Each set contains a number of lines
 A given block maps to any line in a given set
 e.g. Block B can be in any line of set i
 e.g. 2 lines per set
 2-way associative mapping
 A given block can be in one of 2 lines in only one set

SET ASSOCIATIVE MAPPING

 The cache memory is divided into 'v' sets, each consisting of 'n'
cache lines. A block from Main memory is first mapped onto a
specific cache set, and then it can be placed anywhere within
that set. This type of mapping offers a good trade-off between
implementation cost and efficiency.

SET ASSOCIATIVE MAPPING

 Cache set number = (Main memory block number) MOD
(Number of sets in the cache memory)

SET ASSOCIATIVE MAPPING

 If there are 'n' cache lines in a set, the cache placement is
called n-way set associative. For example, with two blocks
(cache lines) per set it is a 2-way set associative cache
mapping, and with four blocks (cache lines) per set it is a
4-way set associative cache mapping.

SET ASSOCIATIVE MAPPING

 NOTE: The Main memory is not physically partitioned in the given
way, but this is the view of Main memory that the cache sees.

 NOTE: We are dividing both Main Memory and cache memory into
blocks of the same size.

PROBLEM

 We have a Main Memory of size 4GB, with each byte directly
addressable. Main memory is divided into blocks of 32 bytes each.
The cache memory is 512KB. Two-way set associative mapping is
used for the cache. Find the distribution of bits in the address.

SOLUTION
 Let us assume we have a Main Memory of size 4GB (2^32), with
each byte directly addressable by a 32-bit address. We will divide
Main memory into blocks of 32 bytes each (2^5). Thus there are
128M (i.e. 2^32 / 2^5 = 2^27) blocks in Main memory.

 We have a Cache memory of 512KB (i.e. 2^19), divided into blocks
of 32 bytes each (2^5). Thus there are 16K (i.e. 2^19 / 2^5 = 2^14)
blocks, also known as Cache slots or Cache lines, in cache memory.
It is clear from the above numbers that there are more Main memory
blocks than Cache slots.
SOLUTION
 Let us try 2-way set associative cache mapping, i.e. 2 cache
lines per set. We will divide the 16K cache lines into sets of 2,
and hence there are 8K (2^14 / 2 = 2^13) sets in the Cache memory.

 Cache Size = (Number of Sets) * (Size of each set) * (Cache line size)

 So even using the above formula we can find the number of sets
in the Cache memory, i.e.

2^19 = (Number of Sets) * 2 * 2^5

Number of Sets = 2^19 / (2 * 2^5) = 2^13.
SOLUTION
 When an address is mapped to a set, the direct mapping scheme
is used, and then associative mapping is used within a set.

 The format for an address has 13 bits in the set field, which
identifies the set in which the addressed word will be found if it
is in the cache. There are five bits for the word field, as before,
and a 14-bit tag field; together these make up the 32 bits of the
address.
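
The field widths in this solution can be reproduced with a short calculation. A minimal sketch, assuming all sizes are powers of two; the function name is an illustrative choice.

```python
def set_associative_fields(memory_bytes, cache_bytes, block_bytes, ways):
    """Return (tag_bits, set_bits, word_bits) for a set associative cache."""
    address_bits = memory_bytes.bit_length() - 1   # 4GB -> 32
    word_bits = block_bytes.bit_length() - 1       # 32-byte block -> 5
    num_sets = cache_bytes // (block_bytes * ways)  # cache size / (ways * line size)
    set_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - set_bits - word_bits
    return tag_bits, set_bits, word_bits


fields = set_associative_fields(
    memory_bytes=4 * 2**30,  # 4GB main memory
    cache_bytes=512 * 2**10,  # 512KB cache
    block_bytes=32,
    ways=2,
)
print("tag, set, word bits:", fields)
```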

PROBLEM

 A computer system has 8GB of word-addressable main memory and
a 256KB unified cache memory with 64-word blocks. How many bits
are required for the set and word offset fields if 4-way set
associative mapping is used?

PROBLEM

 An address for a byte-addressable memory presented to the cache
unit is divided as follows: 13-bit tag, 14-bit line index, and
5-bit offset.
What is the cache size in bytes?
What is the cache mapping scheme?

PROBLEM

 A block set-associative cache consists of 64 blocks divided into
4-block sets. The main memory contains 4096 blocks, each
consisting of 128 words of 16 bits length.
i) How many bits are there in a main memory address?
ii) How many bits are there in each of the TAG, SET and WORD fields?

PROBLEM

 Main memory is 16MB with a block size of 16 words. Cache memory
is 64KB. Find the number of bits used in the TAG field.

SET ASSOCIATIVE MAPPING
EXAMPLE
 In the example below we have chosen block 14 from Main memory
and compared where it can be placed under the different mapping
schemes. In a Direct Mapped cache it can be placed in Frame 6,
since 14 mod 8 = 6. In a Set associative cache it can be placed
in set 2.
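
The placement of block 14 can be checked with a small helper. This sketch assumes an 8-frame cache that is 2-way set associative in the set associative case (so 4 sets, which gives 14 mod 4 = 2), and it assumes the frames of a set are numbered contiguously; the slide's figure is not available, so both are assumptions.

```python
def candidate_frames(block, num_frames, ways):
    """Frames where a block may be placed; ways=1 is direct mapped,
    ways=num_frames is fully associative."""
    num_sets = num_frames // ways
    set_index = block % num_sets
    # Assumed numbering: set i owns frames i*ways .. i*ways + ways - 1.
    return set_index, [set_index * ways + w for w in range(ways)]


direct = candidate_frames(14, num_frames=8, ways=1)
two_way = candidate_frames(14, num_frames=8, ways=2)
fully = candidate_frames(14, num_frames=8, ways=8)
print(direct, two_way, fully)
```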

MAPPING FROM MAIN MEMORY TO
CACHE:
V ASSOCIATIVE

MAPPING FROM MAIN MEMORY TO
CACHE:
K-WAY ASSOCIATIVE

K-WAY SET ASSOCIATIVE CACHE
ORGANIZATION

SET ASSOCIATIVE MAPPING
ADDRESS STRUCTURE
Tag       Set        Word
9 bits    13 bits    2 bits

 Use set field to determine cache set to look in
 Compare tag field to see if we have a hit
 e.g.

Address     Tag    Data        Set number
1FF 7FFC    1FF    12345678    1FFF
001 7FFC    001    11223344    1FFF
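
The two example rows can be decoded with shifts and masks. The sketch below assumes the address column is written as the 9-bit tag followed by the remaining 15 bits (set and word) in hexadecimal; the helper names are illustrative.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2


def from_slide_notation(tag_hex, rest_hex):
    """Rebuild the 24-bit address from 'tag' and 'set+word' hex groups."""
    return (int(tag_hex, 16) << (SET_BITS + WORD_BITS)) | int(rest_hex, 16)


def decode(address):
    """Split a 24-bit address into (tag, set number, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_number = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (SET_BITS + WORD_BITS)
    return tag, set_number, word


first = decode(from_slide_notation("1FF", "7FFC"))
second = decode(from_slide_notation("001", "7FFC"))
print([hex(x) for x in first], [hex(x) for x in second])
```

Both addresses select the same set (1FFF) and differ only in the tag, which is why they can sit side by side in a 2-way set.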

TWO WAY SET ASSOCIATIVE MAPPING
EXAMPLE

SET ASSOCIATIVE MAPPING
SUMMARY
 Address length = (s + w) bits
 Number of addressable units = 2^(s+w) words or bytes
 Block size = line size = 2^w words or bytes
 Number of blocks in main memory = 2^s
 Number of lines in set = k
 Number of sets = v = 2^d
 Number of lines in cache = kv = k * 2^d
 Size of tag = (s – d) bits

Therefore, a cache line's tag size depends on 3 factors:
 Size of cache memory;
 Associativity of cache memory;
 Cacheable range of operating memory.

Stag = log2(Smemory * A / Scache)

Here,
Stag: size of cache tag, in bits;
Smemory: cacheable range of operating memory, in bytes;
Scache: size of cache memory, in bytes;
A: associativity of cache memory, in ways.
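
The dependence on these three factors can be checked numerically. The sketch below assumes power-of-two sizes and computes the tag width as log2(Smemory * A / Scache), which follows from subtracting the set and offset bits from the address length; the function name is an assumption.

```python
def tag_size_bits(memory_bytes, cache_bytes, ways):
    """Tag width in bits: log2(memory_bytes * ways / cache_bytes)."""
    ratio = memory_bytes * ways // cache_bytes
    return ratio.bit_length() - 1  # exact log2 for a power of two


GB, KB = 2**30, 2**10
two_way = tag_size_bits(4 * GB, 512 * KB, ways=2)
direct = tag_size_bits(4 * GB, 512 * KB, ways=1)
print(two_way, direct)
```

Doubling the associativity halves the number of sets, so one bit moves from the set field to the tag.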
EXAMPLE

 Three small 4-word caches:
Direct mapped, two-way set associative, fully associative

 How many misses in the sequence of block addresses: 0, 8, 0, 6, 8?

 How does this change with 8 words, 16 words?
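
The miss counts can be found by simulation. This is a sketch under stated assumptions: one-word blocks, caches that start empty, and LRU replacement within a set (the slide does not name a replacement policy).

```python
def count_misses(block_addresses, num_blocks, ways):
    """Simulate a cache of num_blocks one-word blocks with the given
    associativity and LRU replacement; return the number of misses."""
    num_sets = num_blocks // ways
    sets = [[] for _ in range(num_sets)]  # each list is ordered LRU -> MRU
    misses = 0
    for block in block_addresses:
        lines = sets[block % num_sets]
        if block in lines:
            lines.remove(block)           # hit: will be re-appended as MRU
        else:
            misses += 1
            if len(lines) == ways:
                lines.pop(0)              # evict the least recently used block
        lines.append(block)
    return misses


sequence = [0, 8, 0, 6, 8]
for size in (4, 8, 16):
    results = {
        "direct": count_misses(sequence, size, ways=1),
        "2-way": count_misses(sequence, size, ways=2),
        "fully": count_misses(sequence, size, ways=size),
    }
    print(size, results)
```

With 4 blocks, blocks 0 and 8 collide in the direct-mapped cache on every access, while the fully associative cache only takes the three compulsory misses.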
 When does set associative mapping reduce to direct mapping,
and when to fully associative mapping?

VALID BIT / DIRTY BIT

 Valid Bit :- When a program is first loaded into main memory,
the cache is cleared, and so while a program is executing, a
valid bit is needed to indicate whether or not the slot holds a
line that belongs to the program being executed.

VALID BIT / DIRTY BIT
Dirty bit :- keeps track of whether or not
a line has been modified while it is in the
cache. A slot that is modified must be
written back to the main memory before the
slot is reused for another line.

REPLACEMENT
ALGORITHM
 Optimal Replacement: replace a block that will not be needed
again in the future. If all blocks currently in Cache Memory will
be used again, replace the one that will not be used for the
longest time.

 Random selection: replace a randomly selected block among all
blocks currently in Cache Memory.
REPLACEMENT
ALGORITHM
 FIFO (first-in first-out): replace the block
that has been in Cache Memory for the
longest time.
 LRU (Least recently used): replace the
block in Cache Memory that has not been
used for the longest time.
 LFU (Least frequently used): replace the
block in Cache Memory that has been
used for the least number of times.

REPLACEMENT ALGORITHMS
 For Associative & Set-Associative Cache:
which location should be emptied when the cache is full and a
miss occurs?
 First In First Out (FIFO)
 Least Recently Used (LRU)
 Distinguish an Empty location from a Full one
 Valid Bit

REPLACEMENT ALGORITHMS
CPU reference   A     B     C     A     D     E     A     D     C     F
Result          Miss  Miss  Miss  Hit   Miss  Miss  Miss  Hit   Hit   Miss
Cache (FIFO)    A     A     A     A     A     E     E     E     E     E
                -     B     B     B     B     B     A     A     A     A
                -     -     C     C     C     C     C     C     C     F
                -     -     -     -     D     D     D     D     D     D

Hit Ratio = 3 / 10 = 0.3

REPLACEMENT ALGORITHMS
CPU reference   A     B     C     A     D     E     A     D     C     F
Result          Miss  Miss  Miss  Hit   Miss  Miss  Hit   Hit   Hit   Miss
Cache (LRU)     A     B     C     A     D     E     A     D     C     F
                -     A     B     C     A     D     E     A     D     C
                -     -     A     B     C     A     D     E     A     D
                -     -     -     -     B     C     C     C     E     A

(Rows are ordered from most recently used to least recently used.)

Hit Ratio = 4 / 10 = 0.4
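
Both hit ratios can be reproduced by simulating the reference string A B C A D E A D C F on a four-slot cache. A minimal sketch using the Python standard library; the function names are illustrative.

```python
from collections import OrderedDict, deque


def fifo_hits(references, capacity):
    """Count hits when the block resident longest is replaced."""
    queue = deque()
    hits = 0
    for ref in references:
        if ref in queue:
            hits += 1                     # a hit does not change FIFO order
        else:
            if len(queue) == capacity:
                queue.popleft()           # evict the oldest arrival
            queue.append(ref)
    return hits


def lru_hits(references, capacity):
    """Count hits when the least recently used block is replaced."""
    cache = OrderedDict()                 # ordered LRU -> MRU
    hits = 0
    for ref in references:
        if ref in cache:
            hits += 1
            cache.move_to_end(ref)        # mark as most recently used
        else:
            if len(cache) == capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[ref] = True
    return hits


references = list("ABCADEADCF")
print("FIFO hit ratio:", fifo_hits(references, 4) / len(references))
print("LRU hit ratio:", lru_hits(references, 4) / len(references))
```

The difference comes from the seventh reference: FIFO has already evicted A because it arrived first, while LRU kept it because it was used again at the fourth reference.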

LOAD-THROUGH
STORE-THROUGH

 Load-Through: When the CPU needs to read a word from the memory,
the block containing the word is brought from MM to CM, while at
the same time the word is forwarded to the CPU.

 Store-Through: If store-through is used, a word to be stored from
CPU to memory is written to both CM (if the word is in there)
and MM.
WRITE METHODS
 Note: Words in a cache have been viewed simply as copies of
words from main memory that are read from the cache to provide
faster access. However, this viewpoint changes when the CPU
writes data.
 There are 3 possible write actions:
 Write the result into the main memory
 Write the result into the cache
 Write the result into both main memory and cache memory
WRITE METHODS

 Write Through: A cache architecture in which data is written to
main memory at the same time as it is cached.

 Write Back / Copy Back: On a cache hit, the CPU writes only to
the cache; the modified line is copied back to main memory when
its slot is reused. On a cache miss, the CPU performs the write
to main memory.

On a write miss:

 Write Allocate: loads the memory block into the cache and
updates the cache block

 No-Write Allocate: bypasses the cache and writes the word
directly into the memory.
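
The effect of these policies on main-memory traffic can be illustrated with a toy model. The sketch below is a simplification under loud assumptions: the cache has a single line, only writes are simulated, the cost of fetching a block on write allocate is ignored, and a dirty line is flushed at the end. The function name and the sample write sequence are made up for illustration.

```python
def memory_writes(write_blocks, policy, write_allocate=True):
    """Count main-memory writes for a one-line toy cache.

    policy is "write-through" or "write-back".
    """
    cached_block = None
    dirty = False
    count = 0
    for block in write_blocks:
        if block != cached_block:          # write miss
            if not write_allocate:
                count += 1                 # bypass the cache, write memory directly
                continue
            if policy == "write-back" and dirty:
                count += 1                 # copy the modified line back before reuse
            cached_block = block           # allocate the line for this block
            dirty = False
        if policy == "write-through":
            count += 1                     # every write also goes to main memory
        else:
            dirty = True                   # write-back: only mark the line dirty
    if policy == "write-back" and dirty:
        count += 1                         # final flush of the dirty line
    return count


writes = [1, 1, 1, 2, 2, 1]  # block numbers written by the CPU
print("write-through:", memory_writes(writes, "write-through"))
print("write-back:", memory_writes(writes, "write-back"))
print("no-write-allocate:", memory_writes(writes, "write-through", write_allocate=False))
```

Repeated writes to the same block are where write-back saves traffic: only the eviction (or final flush) reaches main memory.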

CACHE CONFLICT

 A sequence of accesses to memory repeatedly overwriting the
same cache entry.

 This can happen if two blocks of data, which are mapped to the
same set of cache locations, are needed simultaneously.
