Chapter 8
Multiprocessors
Shared Memory Architectures
Prof. Jerry Breecher
CSCI 240
Fall 2003
Chapter Overview
We're going to cover only one section of this chapter: the part about
how caches from multiple processors interact with each other.
8.1 Introduction: the big picture
8.3 Centralized Shared Memory Architectures
Chap. 8 - Multiproces
Introduction
8.1 Introduction
Example: Pentium System Organization
[Figure: block diagram of a Pentium system, with the PCI bus and I/O buses labeled.]
[Figure: multiple processors, each with its own registers and caches, attached to a chipset and memory.]
Conceptual Model
[Figure: processors (P) and memory (M) connected by a network/bus.]
Message Passing Multicomputers
[Figure: multicomputer nodes connected by a network.]
Large-Scale MP Designs
Memory: distributed, with nonuniform memory access time (NUMA) and a
scalable interconnect (distributed memory).
[Figure: access latencies labeled 1 cycle, 40 cycles, and 100 cycles, with the labels "Low Latency" and "High Reliability".]
Shared Memory Architectures
8.1 Introduction
8.3 Centralized Shared Memory Architectures
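[Figure: a CPU with its cache, memory, and an I/O device, shown in three panels. A' and B' are the cached copies of memory locations A and B.]

Panel                                          Cache A'  Cache B'  Memory A  Memory B
a) Cache and memory coherent: A' = A, B' = B   100       200       100       200
b) Cache holds 550 for A; memory holds 100     550       200       100       200
c) Input 440 to B                              100       200       100       440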
Mechanism: Write Back
  How It Works: Write modified data from cache to memory only when necessary.
  Performance: Good, because it doesn't tie up memory bandwidth.

Mechanism: Write Through
  How It Works: Write modified data from cache to memory immediately.
  Coherency Issues: Modified values are always written to memory; data always matches.
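The difference between the two mechanisms can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real cache: the class names, the `flush` method, and the write counter are all assumptions made for the example. It counts how many writes reach memory and which cached values memory has not yet seen.

```python
class Memory:
    """Backing store that counts how many writes reach it."""
    def __init__(self):
        self.data = {}
        self.writes = 0

    def write(self, addr, value):
        self.data[addr] = value
        self.writes += 1


class WriteThroughCache:
    """Every store goes to memory immediately, so memory always matches."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}

    def store(self, addr, value):
        self.lines[addr] = value
        self.memory.write(addr, value)


class WriteBackCache:
    """Stores stay in the cache; memory is updated only when flushed."""
    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()

    def store(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def flush(self):
        # Hypothetical stand-in for "when necessary" (eviction, remote request).
        for addr in sorted(self.dirty):
            self.memory.write(addr, self.lines[addr])
        self.dirty.clear()


def stale_addresses(cache):
    """Addresses whose cached value differs from what memory holds."""
    return {a for a, v in cache.lines.items() if cache.memory.data.get(a) != v}
```

With the stores A=1, A=2, A=3, B=7, the write-through cache sends four writes to memory and leaves nothing stale. The write-back cache sends none until `flush()`, which then needs only two writes, but in the meantime memory disagrees with the cache for both A and B.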
Informally:
  Any read must return the most recent write.
  This is too strict and too difficult to implement.
Better:
  Any write must eventually be seen by a read.
  All writes are seen in proper order (serialization).
Two rules to ensure this:
  If P writes x and P1 reads it, P's write will be seen by P1 if the
  read and write are sufficiently far apart.
  Writes to a single location are serialized: they are seen in one order.
    The latest write will be seen.
    Otherwise a processor could see writes in an illogical order
    (an older value after a newer value).
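The serialization rule can be phrased as a small check. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, and each written value is assumed to be distinct so that a value identifies one write. It returns False when a processor observes an older value after a newer one.

```python
def sees_writes_in_order(write_order, observed):
    """Return True if `observed` never shows an older value after a newer one.

    write_order: values written to one location, in their serialized order
                 (assumed distinct, so each value identifies one write).
    observed:    values one processor read from that location, in read order.
    """
    position = {value: i for i, value in enumerate(write_order)}
    latest = -1
    for value in observed:
        if position[value] < latest:
            return False  # an older write showed up after a newer one
        latest = position[value]
    return True
```

A reader may miss intermediate writes (reading 20 and then 40 after writes of 10, 20, 40 is fine), but reading 40 and then 20 violates the rule.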
Test_and_set(lock)
shared_data = xyz;
Clear(lock);

TYPE             Shared?     Writable
Code             Shared      No         No Need.
Private Data     Exclusive   Yes        Write Back
Shared Data      Shared      Yes        Write Back *
Interlock Data   Shared      Yes        Write Through **

*  Write Back gives good performance, but if you use write through
   here, there will be performance degradation.
** Write through here means the lock state is seen immediately.
   You want a write through here to flush the cache.
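The test-and-set pattern can be tried out in Python. This is a sketch under explicit assumptions: Python exposes no hardware test-and-set instruction, so the hypothetical `TestAndSetLock` class emulates the atomic step with a `threading.Lock`, and the shared counter stands in for `shared_data`. It shows the shape of the pattern (spin on test-and-set, update, clear), not how the hardware implements it.

```python
import threading
import time


class TestAndSetLock:
    """Emulated interlock; the inner Lock stands in for hardware atomicity."""
    def __init__(self):
        self._flag = False
        self._atomic = threading.Lock()

    def test_and_set(self):
        """Atomically set the flag and return its previous value."""
        with self._atomic:
            old = self._flag
            self._flag = True
            return old

    def clear(self):
        with self._atomic:
            self._flag = False


def increment_shared(lock, counter, times):
    for _ in range(times):
        while lock.test_and_set():   # old value True: someone else holds it
            time.sleep(0)            # yield so the holder can make progress
        counter["value"] += 1        # critical section on the shared data
        lock.clear()


def demo(num_threads=4, times=200):
    lock = TestAndSetLock()
    counter = {"value": 0}
    threads = [
        threading.Thread(target=increment_shared, args=(lock, counter, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

Because every increment happens between a successful `test_and_set` and the matching `clear`, `demo(4, 200)` returns 800.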
Potential HW Coherency Solutions
State machine for CPU requests, for each cache block
Applies to Write Back data.

Cache block states: Invalid, Shared (read only), Exclusive (read/write)

From state   CPU request   Bus action                To state
Invalid      CPU Read      Place read miss on bus    Shared
Invalid      CPU Write     Place write miss on bus   Exclusive
Shared       CPU Write     Place write miss on bus   Exclusive
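The CPU-side state machine fits in a lookup table. The following is a minimal sketch, with hypothetical names throughout. It assumes every request is to the address already held by the block, so any request not listed in the table is treated as a hit that leaves the state unchanged and puts nothing on the bus. Replacement misses are not modeled.

```python
# (current state, CPU request) -> (next state, message placed on the bus)
CPU_TRANSITIONS = {
    ("Invalid", "read"): ("Shared", "read miss"),
    ("Invalid", "write"): ("Exclusive", "write miss"),
    ("Shared", "write"): ("Exclusive", "write miss"),
}


def cpu_request(state, request):
    """Return (next_state, bus_message); bus_message is None for a hit."""
    return CPU_TRANSITIONS.get((state, request), (state, None))
```

For example, `cpu_request("Shared", "write")` gives `("Exclusive", "write miss")`, while `cpu_request("Exclusive", "write")` is a hit and gives `("Exclusive", None)`.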
State machine for bus requests, for each cache block
Appendix E gives details of bus requests.

Cache block states: Invalid, Shared (read only), Exclusive (read/write)

From state   Bus request                 Action                                   To state
Shared       Write miss for this block   none                                     Invalid
Exclusive    Write miss for this block   Write back block (abort memory access)   Invalid
Exclusive    Read miss for this block    Write back block (abort memory access)   Shared
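The bus-side (snooping) state machine can be written the same way. This sketch uses hypothetical names and assumes the bus request is for the address this block holds. Any pair not in the table, such as a read miss seen by a Shared block, leaves the block alone.

```python
# (current state, bus request for this block) -> (next state, write back?)
BUS_TRANSITIONS = {
    ("Shared", "write miss"): ("Invalid", False),
    ("Exclusive", "write miss"): ("Invalid", True),
    ("Exclusive", "read miss"): ("Shared", True),
}


def snoop(state, bus_request):
    """Return (next_state, must_write_back) for a request seen on the bus."""
    return BUS_TRANSITIONS.get((state, bus_request), (state, False))
```

Only an Exclusive block ever needs a write back, because it is the only state in which the cache may hold a value that memory has not seen.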
Example

Steps:
  P1: Write 10 to A1
  P1: Read A1
  P2: Read A1
  P2: Write 20 to A1
  P2: Write 40 to A2

Each step is traced in a table with these columns:
  Processor 1 (P1): State, Addr, Value
  Processor 2 (P2): State, Addr, Value
  Bus: Action, Proc., Addr, Value
  Memory: Addr, Value

State diagram (this is the cache for P1):
  Invalid to Shared: read miss on bus
  Invalid to Exclusive: write miss on bus
  Shared to Exclusive: CPU write, place write miss on bus
  Shared to Invalid: remote write or miss
  Exclusive to Invalid: remote write or miss, write back
  Exclusive to Shared: remote read, write back
  Exclusive stays Exclusive: CPU read hit, CPU write hit
Example: Step 1

step                 P1                 P2                 Bus                        Memory
                     State  Addr Value  State  Addr Value  Action Proc. Addr Value    Addr Value
P1: Write 10 to A1   Excl.  A1   10                        WrMs   P1    A1
P1: Read A1
P2: Read A1
P2: Write 20 to A1
P2: Write 40 to A2

P1 places a write miss on the bus and goes from Invalid to Exclusive.
Example: Step 2

step                 P1                 P2                 Bus                        Memory
                     State  Addr Value  State  Addr Value  Action Proc. Addr Value    Addr Value
P1: Write 10 to A1   Excl.  A1   10                        WrMs   P1    A1
P1: Read A1          Excl.  A1   10
P2: Read A1
P2: Write 20 to A1
P2: Write 40 to A2

The read hits in P1's Exclusive block, so nothing goes on the bus.
Example: Step 3

step                 P1                 P2                 Bus                        Memory
                     State  Addr Value  State  Addr Value  Action Proc. Addr Value    Addr Value
P1: Write 10 to A1   Excl.  A1   10                        WrMs   P1    A1
P1: Read A1          Excl.  A1   10
P2: Read A1                             Shar.  A1          RdMs   P2    A1
                     Shar.  A1   10                        WrBk   P1    A1   10       A1   10
                                        Shar.  A1   10     RdDa   P2    A1   10       A1   10
P2: Write 20 to A1
P2: Write 40 to A2

P2 places a read miss on the bus and goes from Invalid to Shared.
P1 sees the remote read, writes the block back, and goes from Exclusive to Shared.
Example: Step 4

step                 P1                 P2                 Bus                        Memory
                     State  Addr Value  State  Addr Value  Action Proc. Addr Value    Addr Value
P1: Write 10 to A1   Excl.  A1   10                        WrMs   P1    A1
P1: Read A1          Excl.  A1   10
P2: Read A1                             Shar.  A1          RdMs   P2    A1
                     Shar.  A1   10                        WrBk   P1    A1   10       A1   10
                                        Shar.  A1   10     RdDa   P2    A1   10       A1   10
P2: Write 20 to A1   Inv.               Excl.  A1   20     WrMs   P2    A1            A1   10
P2: Write 40 to A2

P2 places a write miss on the bus and goes from Shared to Exclusive.
P1 sees the remote write and goes from Shared to Invalid. Memory still holds 10.
Example: Step 5

step                 P1                 P2                 Bus                        Memory
                     State  Addr Value  State  Addr Value  Action Proc. Addr Value    Addr Value
P1: Write 10 to A1   Excl.  A1   10                        WrMs   P1    A1
P1: Read A1          Excl.  A1   10
P2: Read A1                             Shar.  A1          RdMs   P2    A1
                     Shar.  A1   10                        WrBk   P1    A1   10       A1   10
                                        Shar.  A1   10     RdDa   P2    A1   10       A1   10
P2: Write 20 to A1   Inv.               Excl.  A1   20     WrMs   P2    A1            A1   10
P2: Write 40 to A2                                         WrMs   P2    A2            A1   10
                                        Excl.  A2   40     WrBk   P2    A1   20       A1   20

P2 places a write miss for A2 on the bus. The write back of A1 indicates that
A2 replaces A1 in P2's cache block, so the modified value 20 finally reaches memory.
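The five-step example can be reproduced with a small simulator. This is a sketch under stated assumptions, not the protocol as any real machine implements it: each cache holds a single block (so A2 replaces A1, as the write back in step 5 suggests), the class and method names are hypothetical, and the bus is just a list of (action, processor, address, value) tuples using the abbreviations from the trace (WrMs, RdMs, WrBk, RdDa).

```python
class Cache:
    """One-block cache: a state plus the address and value it holds."""
    def __init__(self, name):
        self.name = name
        self.state, self.addr, self.value = "Invalid", None, None


class System:
    def __init__(self, names=("P1", "P2")):
        self.caches = {n: Cache(n) for n in names}
        self.memory = {}
        self.bus = []  # (action, processor, address, value)

    def _write_back(self, cache):
        self.bus.append(("WrBk", cache.name, cache.addr, cache.value))
        self.memory[cache.addr] = cache.value

    def _miss(self, cache, action, addr):
        """Place a miss on the bus, let other caches snoop, evict the victim."""
        self.bus.append((action, cache.name, addr, None))
        for other in self.caches.values():
            if other is cache or other.addr != addr or other.state == "Invalid":
                continue
            if other.state == "Exclusive":
                self._write_back(other)       # remote copy was modified
            other.state = "Shared" if action == "RdMs" else "Invalid"
        if cache.state == "Exclusive" and cache.addr != addr:
            self._write_back(cache)           # block being replaced is dirty

    def read(self, proc, addr):
        cache = self.caches[proc]
        if cache.addr == addr and cache.state != "Invalid":
            return cache.value                # read hit, no bus traffic
        self._miss(cache, "RdMs", addr)
        cache.state, cache.addr = "Shared", addr
        cache.value = self.memory.get(addr)
        self.bus.append(("RdDa", proc, addr, cache.value))
        return cache.value

    def write(self, proc, addr, value):
        cache = self.caches[proc]
        if not (cache.addr == addr and cache.state == "Exclusive"):
            self._miss(cache, "WrMs", addr)   # write hit only if Exclusive
        cache.state, cache.addr, cache.value = "Exclusive", addr, value


def run_example():
    system = System()
    system.write("P1", "A1", 10)
    system.read("P1", "A1")
    system.read("P2", "A1")
    system.write("P2", "A1", 20)
    system.write("P2", "A2", 40)
    return system
```

After `run_example()`, `system.bus` lists the same seven bus actions as the Bus column of the trace, P1's block is Invalid, P2 holds A2 = 40 in the Exclusive state, and memory holds A1 = 20.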
Summary
8.1 Introduction: the big picture
8.3 Centralized Shared Memory Architectures
We've looked at what happens to caches when multiple processors or
devices access the same memory.