Anda di halaman 1dari 13

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB

ECE232: Hardware Organization and Design


Part 11: Pipelining
Chapter 4/6
http://www.ecs.umass.edu/ece/ece232/
ECE232: Pipelining I 2 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
CPI Calculation
CPI stands for average number of Cycles Per Instruction
Assume an instruction mix of 24% loads, 12% stores, 44% R-
format, 18% branches, and 2% jumps
CPI = 0.24 * 5 + 0.12 * 4 + 0.44 * 4 + 0.18 * 3 + 0.02 * 3 =
4.04
Speedup?
Question: Can we achieve a CPI of 1???
ECE232: Pipelining I 3 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Speeding up through pipelining
Ann, Brian, Cathy, Dave each have one load
of clothes to wash, dry, and fold
Washer takes 30 minutes
Dryer takes 30 minutes
Folder takes 30 minutes
Stasher takes 30 minutes
to put clothes into drawers
A B C D
ECE232: Pipelining I 4 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Sequential Laundry
Sequential laundry takes 8 hours for 4 loads
If they learned pipelining, how long would laundry take?
30
T
a
s
k
O
r
d
e
r
B
C
D
A
Time
30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
6 PM 7 8
9 10 11
12 1
2 AM
ECE232: Pipelining I 5 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Pipelined Laundry: Start work ASAP
Pipelined laundry takes 3.5 hours for 4 loads!
T
a
s
k
O
r
d
e
r
12 2 AM 6 PM 7 8
9 10 11 1
Time
B
C
D
A
30 30 30 30 30 30 30
ECE232: Pipelining I 6 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Pipelining Lessons
Pipelining doesnt help
latency of single task, it helps
throughput of entire workload
Multiple tasks operating
simultaneously using
different resources
Potential speedup = Number
pipe stages
Pipeline rate limited by
slowest pipeline stage
Unbalanced lengths of pipe
stages reduces speedup
Time to fill pipeline and
time to drain it reduces
speedup
6 PM 7 8 9
Time
B
C
D
A
30 30 30 30 30 30 30
T
a
s
k
O
r
d
e
r
ECE232: Pipelining I 7 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Pipelining Instructions
Time (in cycles)
I
n
s
t
r
u
c
t
i
o
n
F D EX M W
F D EX M W
F D EX M W
F D EX M W
F D EX M
F D EX
Fetch = 10 ns
Decode = 6 ns
Execute = 8 ns
Memory = 10 ns
Write back = 6 ns
W
M W
ECE232: Pipelining I 8 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Single Cycle, Multiple Cycle, vs. Pipeline
Clk
Cycle 1
Ifetch Reg Exec Mem Wr
Cycle 2 Cycle 3 Cycle 4 Cycle 5 Cycle 6 Cycle 7 Cycle 8 Cycle 9 Cycle 10
Load Ifetch Reg Exec Mem Wr
Ifetch Reg Exec Mem
Load Store
Ifetch Reg Exec Mem Wr Store
Clk
Load Store Waste
Ifetch
R-type
Ifetch Reg Exec Mem Wr R-type
Cycle 1 Cycle 2
Multiple Cycle Implementation:
Pipeline Implementation:
Single Cycle Implementation:
ECE232: Pipelining I 9 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Why Pipeline?
Suppose we execute 100 instructions
Single Cycle Machine
45 ns/cycle x 1 CPI x 100 inst = 4500 ns
Multicycle Machine
10 ns/cycle x 4.04 CPI (for the given inst mix) x 100 inst
= 4040 ns
Instruction mix of 24% loads, 12% stores, 44% R-format, 18%
branches, and 2% jumps
Ideal pipelined machine (with 5 stages)
10 ns/cycle x (1 CPI x 100 inst + 4 cycle drain) = 1040 ns
Speedup=4.33 vs. single-cycle
3.88 vs. multi-cycle (for the given inst mix)
ECE232: Pipelining I 10 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Why Pipeline? Because the resources are there!
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Inst 1
Inst 2
Inst 3
Inst 5
Inst 4
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
ECE232: Pipelining I 11 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Pipelining Rules
Forward traveling signals at each stage are latched
Only perform logic on signals in the same stage
signal labeling useful to prevent errors,
e.g., IR
R
, IR
A
, IR
M
, IR
W
Backward travelling signals at each stage represent hazards
A
L
U
IMem Reg DMem Reg
Inst 1 Inst 2 Inst 3 Inst 4 Inst 5
ECE232: Pipelining I 12 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
MIPS Pipelined Datapath
State registers between pipeline stages to isolate them
Read
Address
Instruction
Memory
Add
P
C
4
Write Data
Read Addr 1
Read Addr 2
Write Addr
Register
File
Read
Data 1
Read
Data 2
16 32
ALU
Shift
left 2
Add
Data
Memory
Address
Write Data
Read
Data I
F
e
t
c
h
/
D
e
c
D
e
c
/
E
x
e
c
E
x
e
c
/
M
e
m
M
e
m
/
W
B
IF:IFetch ID:Dec EX:Execute MEM:
MemAccess
WB:
WriteBack
System Clock
Sign
Extend
Inst 1 Inst 2 Inst 3 Inst 4 Inst 5
ECE232: Pipelining I 13 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Pipeline Hazards
Data hazards: an instruction uses the result of a previous
instruction (RAW)
ADD R1, R2, R3 or SW R1, 4(R2)
SUB R4, R1, R5 LW R3, 4(R2)
Control hazards: the address of the next instruction to be
executed depends on a previous instruction
BEQ R1,R2,CONT
SUB R6,R7,R8

CONT: ADD R3,R4,R5


Structural hazards: two instructions need access to the same
resource
e.g., single memory shared for instruction fetch and
load/store
ECE232: Pipelining I 14 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
lw
Inst 1
Inst 2
Inst 4
Inst 3
A
L
U Mem Reg Mem Reg
A
L
U Mem Reg Mem Reg
A
L
U Mem Reg Mem Reg
A
L
U Mem Reg Mem Reg
A
L
U Mem Reg Mem Reg
Structural Hazard
Reading data from
memory
Reading instruction
from memory
Fix with separate instruction and data memories (I$ and D$)
ECE232: Pipelining I 15 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Data Hazards (RAW)
Time (in cycles)
F
I
n
s
t
r
u
c
t
i
o
n
D EX M W
F D EX M W
Write Data to R1 Here
Get data from R1 Here
ADD R1, R2, R3
SUB R4, R1, R5
ECE232: Pipelining I 16 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
stall
stall
One Way to handle a Data Hazard
I
n
s
t
r.
O
r
d
e
r
add $1,
A
L
U IM Reg DM Reg
sub $4,$1,$5
A
L
U IM Reg DM Reg
By waiting
introducing
stalls but
impacts CPI
stall
ECE232: Pipelining I 17 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Inst 1
Inst 2
Inst 3
Inst 5
Inst 4
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
A
L
U Im Reg Dm Reg
Must allow Wr/Rd in REG in same cycle
Split cycle into two halves
ECE232: Pipelining I 18 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
stall
stall
Only two stall cycles
I
n
s
t
r.
O
r
d
e
r
add $1,
A
L
U IM Reg DM Reg
sub $4,$1,$5
and $6,$1,$7
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
Write in 1
st
half,
Read in 2
nd
half
ECE232: Pipelining I 19 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Another Way to Fix a Data Hazard
I
n
s
t
r.
O
r
d
e
r
add $1,
A
L
U IM Reg DM Reg
sub $4,$1,$5
and $6,$1,$7
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
by forwarding
xor $4,$1,$5
or $8,$1,$9
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
Time
ECE232: Pipelining I 20 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Register File (write and then read)
I
n
s
t
r.
O
r
d
e
r
Time (clock cycles)
Inst 1
Inst 2
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
Fix register file
access hazard by
doing reads in the
second half of the
cycle and writes in
the first half
add $1,
or $8,$1,$9
clock edge that controls loading
of pipeline state registers
ECE232: Pipelining I 21 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Internal data forwarding
I
n
s
t
r.
O
r
d
e
r
add $1,
A
L
U IM Reg DM Reg
sub $4,$1,$5
and $6,$1,$7
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
Fix data hazards
by forwarding
results as soon as
they are available
to where they are
needed
xor $4,$1,$5
or $8,$1,$9
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
ALU-to-ALU forwarding vs. full forwarding
ECE232: Pipelining I 22 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Forwarding with Load-use Data Hazards
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
A
L
U IM Reg DM Reg
sub needs to stall
Will still need one stall cycle even with forwarding
I
n
s
t
r.
O
r
d
e
r
lw $1,4($2)
sub $4,$1,$5
and $6,$1,$7
xor $4,$1,$5
or $8,$1,$9
ECE232: Pipelining I 23 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Injecting Bubbles
Read
Address
Instruction
Memory
Add
P
C
4
Write Data
Read Addr 1
Read Addr 2
Write Addr
Register
File
Read
Data 1
Read
Data 2
16 32
ALU
Shift
left 2
Add
Data
Memory
Address
Write Data
Read
Data I
F
e
t
c
h
/
D
e
c
D
e
c
/
E
x
e
c
E
x
e
c
/
M
e
m
M
e
m
/
W
B
IF ID EX MEM WB
System Clock
Sign
Extend
Inst -2 Inst -1 lw sub and
Inst -1 lw bubble sub and
Inst 2
Inst 1
lw
sub
and
ECE232: Pipelining I 24 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
3 Types of Data Hazards
RAW (read after write)
only hazard for fixed pipelines
later instruction must read after earlier instruction writes
WAW (write after write)
variable-length pipeline
later instruction must write after earlier instruction writes
WAR (write after read)
instruction with late read (e.g., waiting for an execution unit)
later instruction must write after earlier instruction reads
div $1,$4,$3
add $1,$2,$5
mlt $4,$1,$3
add $1,$2,$5
add $1,$2,$3
sub $4,$1,$5
W M EX D F
W M EX D F
W M EX D F
s4 s5 E1 s2 s1 s3 W E3 E2 D F
E2 E1 E3 W E5 E4 D F
W M EX D F
ECE232: Pipelining I 25 Adapted from Computer Organization and Design, Patterson&Hennessy,UCB, Kundu,UMass Koren
Control Hazard
Time (in cycles)
F
I
n
s
t
r
u
c
t
i
o
n
D EX M W
F D EX M W
Destination Available Here
Need Destination Here
JR R25
...
XX: ADD ...
Simple solution: Flush Instruction fetch until branch resolved

Anda mungkin juga menyukai