Types of Cache Misses
More Cache Basics
L1 caches are split into separate instruction and data caches;
L2 and L3 are unified
Victim caches
Cache compression
Victim Caches
Replacement Policies
NRU (not recently used): every block in a set has a bit; the bit
is cleared to zero when the block is touched; if all bits in the
set are zero, set them all to one; a block whose bit is 1 is evicted
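The NRU rule above can be sketched for a single cache set as follows. This is a minimal illustration, not a hardware description: the class name `NRUSet`, the choice to evict the first way whose bit is 1, and the assumption that tags are integers are all illustrative and not from the slides.

```python
class NRUSet:
    """One cache set following the convention above:
    bit 0 = recently touched, bit 1 = eviction candidate."""

    def __init__(self, ways):
        self.tags = [None] * ways   # None marks an empty way
        self.bits = [1] * ways      # every way starts as a candidate

    def _touch(self, way):
        self.bits[way] = 0
        if all(b == 0 for b in self.bits):
            # All blocks look recently used: make all bits one again.
            self.bits = [1] * len(self.bits)

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.tags:
            self._touch(self.tags.index(tag))
            return True
        # Assumption: pick the first way whose bit is 1 as the victim.
        victim = self.bits.index(1)
        self.tags[victim] = tag
        self._touch(victim)
        return False
```

Because the bits are reset whenever they all reach zero, at least one way always has its bit set to 1, so a victim can always be found.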
Tolerating Miss Penalty
Stream Buffers
When you read the top of the queue, bring in the next line
[Figure: L1 cache backed by a stream buffer that holds sequential lines]
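The stream buffer behavior can be sketched as a small FIFO of sequential line addresses. This is a simplified model under stated assumptions: the name `StreamBuffer`, the `depth` parameter, and the policy of restarting the stream on a miss that is not at the head of the queue are illustrative choices, not taken from the slides.

```python
from collections import deque


class StreamBuffer:
    """FIFO of prefetched sequential line addresses sitting next to L1."""

    def __init__(self, depth=4):
        self.depth = depth
        self.queue = deque()
        self.next_line = None

    def allocate(self, miss_line):
        """Start a new stream with the lines that follow miss_line."""
        self.queue = deque(miss_line + 1 + i for i in range(self.depth))
        self.next_line = miss_line + 1 + self.depth

    def lookup(self, line):
        """Called on an L1 miss; only the head of the queue is checked."""
        if self.queue and self.queue[0] == line:
            self.queue.popleft()               # the line moves into L1
            self.queue.append(self.next_line)  # bring in the next line
            self.next_line += 1
            return True
        # Assumption: a miss here discards the old stream and starts a new one.
        self.allocate(line)
        return False
```

For a sequential access pattern, the first miss allocates the stream and each later miss is served from the head of the queue while the buffer stays full.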
Stride-Based Prefetching
[Figure: state machine for the stride predictor. Each table entry holds: PC tag, prev_addr, stride, state. States: init, steady, trans, no-pred. Transitions are labeled "correct", "incorrect", or "incorrect (update stride)".]
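A stride predictor with the four states named above can be sketched as a table-driven state machine. The transition table below is an assumption: it follows the classic reference prediction table scheme rather than anything readable from the figure, and the names `TRANSITIONS` and `StridePrefetcher` are hypothetical. The sketch also keys the table by full PC (no tag or finite capacity) and issues a prefetch only in the steady state, which is a conservative simplification.

```python
# state -> (next state if correct, next state if incorrect,
#           whether an incorrect prediction updates the stride)
# Assumed transitions, modeled on the classic reference prediction table.
TRANSITIONS = {
    "init":    ("steady", "trans",   True),
    "steady":  ("steady", "init",    False),
    "trans":   ("steady", "no-pred", True),
    "no-pred": ("trans",  "no-pred", True),
}


class StridePrefetcher:
    def __init__(self):
        self.table = {}  # PC -> [prev_addr, stride, state]

    def access(self, pc, addr):
        """Record a load at pc to addr; return a prefetch address or None."""
        entry = self.table.get(pc)
        if entry is None:
            self.table[pc] = [addr, 0, "init"]
            return None
        prev_addr, stride, state = entry
        correct = addr == prev_addr + stride
        on_correct, on_incorrect, update = TRANSITIONS[state]
        if correct:
            state = on_correct
        else:
            state = on_incorrect
            if update:
                stride = addr - prev_addr
        self.table[pc] = [addr, stride, state]
        # Simplification: prefetch only once the stride has been confirmed.
        if state == "steady" and stride != 0:
            return addr + stride
        return None
```

With a load that walks an array in 8-byte steps, the first two accesses train the entry and the third confirms the stride, after which each access returns the next expected address.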
Prefetching
Intel 80-Core Prototype Polaris
Example Intel Studies
[Figure: eight cores (C), each with its own L1; four L2 caches; an interconnect connecting to the L3, the memory interface, and the IO interface.]
L3 cache sizes up to 32 MB
From Zhao et al., CMP-MSI Workshop 2007
Shared Vs. Private Caches in Multi-Core
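[Figure: two four-processor organizations (P1 to P4), each processor with its own L1. In one, every processor has a private L2; in the other, all four processors share a single L2.]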
UCA and NUCA
Large NUCA
Mapping
[Figure: CPU attached to a large NUCA cache]
Migration
Search
Replication
Shared NUCA Cache