Augmenting Cache Architecture
Filter Cache
Insert another small cache (the filter cache) between the CPU and L1
CPU -> Filter cache -> L1 -> L2
The tiny filter cache has a lower hit ratio, but each access costs far less energy than an L1 access
Net effect: reduces power
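A minimal sketch of the idea, assuming a 16-entry direct-mapped filter cache with 16-byte lines (sizes are illustrative, not from the slides): a hit in the filter cache avoids the more expensive L1 access, while a miss fills the entry from L1.

```c
#include <stdint.h>
#include <stdbool.h>

#define FC_LINES   16   /* assumed filter cache size */
#define LINE_BYTES 16   /* assumed line size */

typedef struct {
    bool     valid[FC_LINES];
    uint32_t line[FC_LINES];   /* line address stored per entry */
} filter_cache_t;

/* Returns true on a filter-cache hit; on a miss the entry is filled
 * (modelling the refill from L1) so the next access to that line hits. */
bool fc_access(filter_cache_t *fc, uint32_t addr)
{
    uint32_t line = addr / LINE_BYTES;
    uint32_t idx  = line % FC_LINES;
    if (fc->valid[idx] && fc->line[idx] == line)
        return true;           /* hit: the L1 access (and its energy) is avoided */
    fc->valid[idx] = true;     /* miss: fetch the line from L1, then forward it */
    fc->line[idx]  = line;
    return false;
}
```

The energy win comes from the common case: most accesses land in the tiny array, which is far cheaper to read than a full L1 lookup.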
Block Buffer
Aim is to reduce accesses to the memory core
Block buffering:
Save the last accessed cache line address (and its data) in a buffer
If the next access is to the same line, read directly from the buffer
Saves an access to the memory core whenever spatial locality exists
Can be extended to a small fully associative buffer
[Figure: block-buffered cache - the address splits into Tag | Index | Offset; the tag/data buffers are checked first (valid + tag match -> hit) before the tag and data arrays]
Techniques:
Direct mapping
Fully associative mapping
Set associative mapping
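The single-entry block buffer above can be sketched as follows (line size is an assumption): the buffer remembers the last line read from the data array, and a repeat access to the same line is served without touching the memory core.

```c
#include <stdint.h>
#include <stdbool.h>

#define LINE_BYTES 32   /* assumed line size */

/* Single-entry block buffer: holds the address of the last line
 * read out of the memory core. */
typedef struct {
    bool     valid;
    uint32_t last_line;
} block_buffer_t;

/* Returns true when the access is served from the buffer,
 * i.e. an access to the memory core is saved. */
bool bb_access(block_buffer_t *bb, uint32_t addr)
{
    uint32_t line = addr / LINE_BYTES;
    if (bb->valid && bb->last_line == line)
        return true;           /* spatial locality: served from the buffer */
    bb->valid = true;          /* read the line from the core into the buffer */
    bb->last_line = line;
    return false;
}
```

The fully associative extension simply keeps several such entries and compares the incoming line address against all of them.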
Scratch Pad Memory
Compiler-managed memory
Part of the memory space, directly addressed
Can be on-chip SRAM
Fast, predictable, and low power vs. a cache
Used in embedded processors, e.g. the IBM Cell
What data/code should reside in the scratch pad?
Compiler decision
[Figure: the CPU address space is split - a scratch pad region maps to on-chip memory (1 cycle), while the rest of the space goes through the cache (1 cycle on an on-chip hit) to off-chip memory (10-20 cycles)]
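A sketch of the address-space split, with a hypothetical memory map (base, size, and the miss latency are assumptions; the 1-cycle and 10-20-cycle figures follow the diagram): scratch pad accesses are deterministic, cache accesses depend on hit or miss.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical map: the scratch pad occupies a fixed, directly
 * addressed 16 KiB window; everything else goes through the cache. */
#define SPM_BASE 0x10000000u
#define SPM_SIZE 0x00004000u

bool is_scratchpad(uint32_t addr)
{
    return addr >= SPM_BASE && addr < SPM_BASE + SPM_SIZE;
}

/* Latency model from the figure: 1 cycle for the scratch pad,
 * 1 cycle for an on-chip cache hit, ~15 cycles (assumed midpoint
 * of 10-20) for a miss that goes off-chip. */
int access_cycles(uint32_t addr, bool cache_hit)
{
    if (is_scratchpad(addr))
        return 1;                  /* deterministic: no tag check, no miss */
    return cache_hit ? 1 : 15;     /* cache path: fast on hit, slow on miss */
}
```

This predictability is exactly why the compiler can place hot, latency-critical data in the scratch pad: its access time never varies.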
Tag Comparison
Basic idea:
Need to reduce tag comparisons to save power
Techniques:
Conventional tag and data access
All tag and data arrays are accessed simultaneously
More power consumption, but performance is high
Sequential tag and data access
Tags are checked first, then only the matching data array is read: power saving, but a performance penalty
Way prediction (e.g. based on hit/miss history)
Hit on the predicted way
Produces the result in a single cycle
Miss on the predicted way
Produces the result in more than one cycle
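The way-prediction timing can be sketched for one set of a 4-way cache (the MRU predictor, the 2-cycle mispredict path, and the miss penalty are illustrative assumptions): only the predicted way's tag is probed first; the remaining ways are checked only when that probe fails.

```c
#include <stdint.h>

#define WAYS 4

typedef struct {
    uint32_t tag[WAYS];
    int      predicted;   /* MRU way, used as the prediction (assumed policy) */
} pred_set_t;

/* Returns the cycle count for the access and retrains the predictor.
 * A real cache would also allocate on a miss; omitted in this sketch. */
int wp_access(pred_set_t *s, uint32_t tag)
{
    if (s->tag[s->predicted] == tag)
        return 1;                     /* hit on the predicted way: single cycle */
    for (int w = 0; w < WAYS; w++) {
        if (w != s->predicted && s->tag[w] == tag) {
            s->predicted = w;         /* retrain prediction to the hit way */
            return 2;                 /* mispredict: one extra cycle */
        }
    }
    return 2 + 15;                    /* true miss: fill from the next level (assumed) */
}
```

Power is saved because the common case reads one tag and one data way instead of all four.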
Transformations and Disks
Basic idea:
Data transformation techniques change how data is represented and accessed
Assume the data is large, so disk accesses are needed
Techniques:
Loop fusion
Loop fission
Loop Fusion vs. Loop Fission
Which one is preferable?
Transformations and Disks
Assume the a and b arrays are located on different disks
Loop fusion may be bad:
both disks are continuously accessed
Loop fission (the reverse transformation) could be preferable:
the idle disk can be put into a low-power mode
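The trade-off can be made concrete with a pair of loops over a and b (the element-wise operations are placeholders): the fused version interleaves accesses to both arrays every iteration, while the fissioned version touches only one array per phase, leaving the other array's disk idle long enough to spin down.

```c
#include <stddef.h>

/* Fused: one pass touches both a[] and b[] each iteration, so if the
 * arrays live on different disks, both disks stay active throughout. */
void fused(double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i] * 2.0;
        b[i] = b[i] + 1.0;
    }
}

/* Fissioned: each loop touches only one array, so the other array's
 * disk can sit in a low-power mode for the whole phase. */
void fissioned(double *a, double *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = a[i] * 2.0;
    for (size_t i = 0; i < n; i++)
        b[i] = b[i] + 1.0;
}
```

Both versions compute the same result; the choice is purely about the access pattern seen by each disk.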