
ELEN 350

Single Cycle Datapath

Adapted from the lecture notes of John Kubiatowicz (UCB) and Hank Walker (TAMU)
The Big Picture: The Performance Perspective
° Performance of a machine is determined by:
• Instruction count
• Clock cycle time
• Clock cycles per instruction (CPI)
° Processor design (datapath and control) will determine:
• Clock cycle time
• Clock cycles per instruction

° Single cycle processor:
• Advantage: One clock cycle per instruction
• Disadvantage: long cycle time
How to Design a Processor: step-by-step
° 1. Analyze instruction set => datapath requirements
• the meaning of each instruction is given by the register transfers
• datapath must include storage elements for ISA registers
- possibly more
• datapath must support each register transfer

° 2. Select set of datapath components and establish clocking methodology

° 3. Assemble datapath meeting the requirements

° 4. Analyze implementation of each instruction to determine setting of control points that effect the register transfer.

° 5. Assemble the control logic


The MIPS Instruction Formats
° All MIPS instructions are 32 bits long. The three instruction formats:

• R-type
  bits    31-26   25-21   20-16   15-11   10-6    5-0
  field   op      rs      rt      rd      shamt   funct
  width   6 bits  5 bits  5 bits  5 bits  5 bits  6 bits

• I-type
  bits    31-26   25-21   20-16   15-0
  field   op      rs      rt      immediate
  width   6 bits  5 bits  5 bits  16 bits

• J-type
  bits    31-26   25-0
  field   op      target address
  width   6 bits  26 bits

° The different fields are:
• op: operation of the instruction
• rs, rt, rd: the source and destination register specifiers
• shamt: shift amount
• funct: selects the variant of the operation in the “op” field
• address / immediate: address offset or immediate value
• target address: target address of the jump instruction
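
The field boundaries above can be sketched as plain wire slices. This is only an illustration, not part of the original notes: the module name inst_fields and its port names are hypothetical.

```verilog
// Hypothetical helper (not from the lecture): splits a 32-bit MIPS
// instruction word into the fields of the R-, I-, and J-type formats.
module inst_fields(inst, op, rs, rt, rd, shamt, funct, imm16, target);
  input  [31:0] inst;
  output [5:0]  op, funct;
  output [4:0]  rs, rt, rd, shamt;
  output [15:0] imm16;
  output [25:0] target;

  assign op     = inst[31:26];  // all formats
  assign rs     = inst[25:21];  // R-type and I-type
  assign rt     = inst[20:16];  // R-type and I-type
  assign rd     = inst[15:11];  // R-type only
  assign shamt  = inst[10:6];   // R-type only
  assign funct  = inst[5:0];    // R-type only
  assign imm16  = inst[15:0];   // I-type only
  assign target = inst[25:0];   // J-type only
endmodule
```

All slices are produced for every instruction; it is the control logic that decides which of them are meaningful for a given opcode.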
Step 1a: The MIPS-lite Subset
° ADD and SUB (R-type: op | rs | rt | rd | shamt | funct; 6, 5, 5, 5, 5, 6 bits)
• addu rd, rs, rt
• subu rd, rs, rt
° OR Immediate (I-type: op | rs | rt | immediate; 6, 5, 5, 16 bits)
• ori rt, rs, imm16
° LOAD and STORE Word (I-type)
• lw rt, rs, imm16
• sw rt, rs, imm16
° BRANCH (I-type)
• beq rs, rt, imm16
Logical Register Transfers
° RTL gives the meaning of the instructions
° All start by fetching the instruction
op | rs | rt | rd | shamt | funct = MEM[ PC ]
op | rs | rt | Imm16 = MEM[ PC ]

inst    Register Transfers

addu    R[rd] <– R[rs] + R[rt]; PC <– PC + 4
subu    R[rd] <– R[rs] – R[rt]; PC <– PC + 4
ori     R[rt] <– R[rs] | zero_ext(Imm16); PC <– PC + 4
lw      R[rt] <– MEM[ R[rs] + sign_ext(Imm16) ]; PC <– PC + 4
sw      MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4
beq     if ( R[rs] == R[rt] ) then PC <– PC + 4 + ( sign_ext(Imm16) || 00 )
        else PC <– PC + 4
Step 1: Requirements of the Instruction Set
° Memory
• instruction & data
° Registers (32 x 32)
• read RS
• read RT
• write RT or RD
° PC
° Extender
° Add and Sub register or extended immediate
° Add 4 or extended immediate to PC
Step 2: Components of the Datapath
° Combinational Elements
° Storage Elements
• Clocking methodology
Combinational Logic Elements (Basic Building Blocks)
° Adder: 32-bit inputs A and B plus CarryIn; outputs 32-bit Sum and Carry
° MUX: 32-bit inputs A and B plus Select; 32-bit output Y
° ALU: 32-bit inputs A and B plus OP; 32-bit output Result
Verilog Implementation of Basic Blocks

module adder(sum, carry_out, a, b, carry_in);
  input [31:0] a, b;
  input carry_in;
  output [31:0] sum;
  output carry_out;

  // 33-bit result: the top bit is the carry out; carry_in is included in the sum
  assign {carry_out, sum} = a + b + carry_in;
endmodule

module mux(y, a, b, select);
  input [31:0] a, b;
  input select;
  output [31:0] y;

  // select = 1 passes a, select = 0 passes b
  assign y = (select) ? a : b;
endmodule
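
The slides give Verilog for the adder and the mux but not for the ALU. A minimal sketch of one possible ALU follows. It assumes the 3-bit ALUctr encoding listed later in these notes (000 And, 001 Or, 010 Add, 110 Subtract, 111 Set-on-less-than); the module and port names, the zero output, and the default case are illustrative choices, not the lecture's implementation.

```verilog
// Hypothetical ALU sketch (not from the lecture).
// alu_ctr encoding assumed from the ALUctr table later in the notes.
module alu(result, zero, a, b, alu_ctr);
  input  [31:0] a, b;
  input  [2:0]  alu_ctr;
  output [31:0] result;
  output        zero;
  reg    [31:0] result;

  always @(a or b or alu_ctr)
    case (alu_ctr)
      3'b000:  result = a & b;                                       // And
      3'b001:  result = a | b;                                       // Or
      3'b010:  result = a + b;                                       // Add
      3'b110:  result = a - b;                                       // Subtract
      3'b111:  result = ($signed(a) < $signed(b)) ? 32'd1 : 32'd0;   // Set-on-less-than
      default: result = 32'd0;                                       // unused encodings (assumption)
    endcase

  // zero is asserted when the result is all zeros (used by beq after a subtract)
  assign zero = (result == 32'd0);
endmodule
```

The $signed casts require Verilog-2001 or later.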
Storage Element: Register (Basic Building Block)
° Register
• Similar to the D Flip Flop except
- 32-bit input (Data In) and output (Data Out)
- Write Enable input
• Write Enable:
- negated (0): Data Out will not change
- asserted (1): Data Out will become Data In
Verilog Implementation of Basic Blocks

module register(data_out, clk, data_in, write_en);
  input [31:0] data_in;
  input clk, write_en;
  output [31:0] data_out;

  reg [31:0] data_out;

  // non-blocking assignment models the clocked update
  always @(posedge clk)
  begin
    if (write_en)
      data_out <= data_in;
  end
endmodule
Storage Element: Register File
° Register File consists of 32 registers:
• Two 32-bit output busses: busA and busB
• One 32-bit input bus: busW
• Three 5-bit register specifiers: RW, RA, RB
° Register is selected by:
• RA (number) selects the register to put on busA (data)
• RB (number) selects the register to put on busB (data)
• RW (number) selects the register to be written via busW (data) when Write Enable is 1
° Clock input (CLK)
• The CLK input is a factor ONLY during write operation
• During read operation, behaves as a combinational logic block:
- RA or RB valid => busA or busB valid after “access time.”
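
The behaviour described above (combinational reads, clocked write) can be sketched as follows. This is a minimal model under stated assumptions, not the lecture's code: the module and port names are hypothetical, and the hardwired-zero read of register 0 is the usual MIPS convention rather than something stated on the slide.

```verilog
// Hypothetical register file sketch (not from the lecture).
module regfile(bus_a, bus_b, clk, reg_wr, rw, ra, rb, bus_w);
  input         clk, reg_wr;
  input  [4:0]  rw, ra, rb;
  input  [31:0] bus_w;
  output [31:0] bus_a, bus_b;

  reg [31:0] regs [0:31];   // 32 registers of 32 bits

  // Reads are combinational: the clock plays no role.
  // Assumption: register 0 always reads as zero (MIPS convention).
  assign bus_a = (ra == 5'd0) ? 32'd0 : regs[ra];
  assign bus_b = (rb == 5'd0) ? 32'd0 : regs[rb];

  // The write happens only on the clock edge, and only when enabled.
  always @(posedge clk)
    if (reg_wr)
      regs[rw] <= bus_w;
endmodule
```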
Storage Element: Idealized Memory
° Memory (idealized)
• One input bus: Data In (32 bits)
• One output bus: Data Out (32 bits)
° Memory word is selected by:
• Address selects the word to put on Data Out
• Write Enable = 1: address selects the memory word to be written via the Data In bus
° Clock input (CLK)
• The CLK input is a factor ONLY during write operation
• During read operation, behaves as a combinational logic block:
- Address valid => Data Out valid after “access time.”
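
A minimal sketch of such an idealized memory, for illustration only. The module name, port names, and the 256-word size are assumptions; the slide does not specify a capacity.

```verilog
// Hypothetical idealized memory sketch (not from the lecture).
module ideal_mem(data_out, clk, addr, data_in, wr_en);
  input         clk, wr_en;
  input  [31:0] addr, data_in;
  output [31:0] data_out;

  reg [31:0] mem [0:255];   // assumed size: 256 words

  // Word-addressed: drop the two byte-offset bits of the address.
  // The read is combinational, as on the slide.
  assign data_out = mem[addr[9:2]];

  // The write is clocked and gated by the write enable.
  always @(posedge clk)
    if (wr_en)
      mem[addr[9:2]] <= data_in;
endmodule
```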
Clocking Methodology (Simple View)
[Figure: storage elements separated by combinational logic, all driven by Clk]

° All storage elements are clocked by the same clock edge

° CLK-to-Q + Longest Delay Path < Cycle Time

° What is the longest path?

Step 3: Assemble Datapath meeting our requirements
° Register Transfer Requirements => Datapath Assembly

° Instruction Fetch

° Read Operands and Execute Operation
3a: Overview of the Instruction Fetch Unit
° The common RTL operations
• Fetch the Instruction: mem[PC]
• Update the program counter:
- Sequential Code: PC <- PC + 4
- Branch and Jump: PC <- “something else”

[Figure: the PC (clocked by Clk) supplies the Address to the Instruction Memory, which outputs the 32-bit Instruction Word; the Next Address Logic computes the new PC]
3b: Add & Subtract
° R[rd] <- R[rs] op R[rt] Example: addu rd, rs, rt
• Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields
• ALUctr and RegWr: control logic after decoding the instruction

[Figure: R-type format (op, rs, rt, rd, shamt, funct). Register file with Rw = Rd, Ra = Rs, Rb = Rt (5 bits each); busA and busB (32 bits) feed the ALU, controlled by ALUctr; the ALU Result returns on busW and is written when RegWr is asserted]
Register-Register Timing: One complete cycle
[Timing diagram: signals change from Old Value to New Value in this order within one Clk cycle]
• PC: after Clk-to-Q
• Rs, Rt, Rd, Op, Func: after Instruction Memory Access Time
• ALUctr, RegWr: after Delay through Control Logic
• busA, busB: after Register File Access Time
• busW: after ALU Delay
• Register Write occurs at the clock edge that ends the cycle
3c: Logical Operations with Immediate
° R[rt] <- R[rs] op ZeroExt[imm16]

[Figure: I-type format (op, rs, rt, immediate); there is no rd field, so the destination is rt. ZeroExt places 16 zero bits above the 16-bit immediate. Datapath changes: a mux controlled by RegDst selects Rd or Rt as Rw; a mux controlled by ALUSrc selects busB or ZeroExt(imm16) as the second ALU input]
3d: Load Operations
° R[rt] <- Mem[R[rs] + SignExt[imm16]] Example: lw rt, rs, imm16

[Figure: I-type format (op, rs, rt, immediate). Datapath changes: an Extender controlled by ExtOp replaces ZeroExt; the ALU result drives the Data Memory address (Adr); a mux controlled by W_Src selects the ALU result or the Data Memory output for busW; MemWr drives the memory WrEn]
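
The Extender shared by ori (zero extension) and lw/sw (sign extension) can be sketched in one line. The module and port names are hypothetical; the encoding ExtOp = 0 for zero extension and ExtOp = 1 for sign extension is taken from the control-signal table later in these notes.

```verilog
// Hypothetical extender sketch (not from the lecture).
module extender(ext_out, imm16, ext_op);
  input  [15:0] imm16;
  input         ext_op;    // 0 = zero-extend, 1 = sign-extend
  output [31:0] ext_out;

  // Sign extension replicates bit 15 into the upper 16 bits;
  // zero extension fills the upper 16 bits with zeros.
  assign ext_out = ext_op ? {{16{imm16[15]}}, imm16}
                          : {16'b0, imm16};
endmodule
```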
3e: Store Operations
° Mem[ R[rs] + SignExt[imm16] ] <- R[rt] Example: sw rt, rs, imm16

[Figure: I-type format (op, rs, rt, immediate). Datapath change: busB is connected to the Data In port of the Data Memory; MemWr enables the write. The other elements (RegDst mux, Extender with ExtOp, ALUSrc mux, W_Src mux) are unchanged]
3f: The Branch Instruction
[I-type format: op (6 bits), rs (5 bits), rt (5 bits), immediate (16 bits)]

° beq rs, rt, imm16
• mem[PC] Fetch the instruction from memory
• Equal <- R[rs] == R[rt] Calculate the branch condition
• if (Equal) Calculate the next instruction’s address
- PC <- PC + 4 + ( SignExt(imm16) x 4 )
• else
- PC <- PC + 4
Datapath for Branch Operations
° beq rs, rt, imm16 Datapath generates condition (equal)

[Figure: busA and busB from the register file feed an “Equal?” comparator that produces Cond. In the fetch unit, one adder computes PC + 4, a second adder adds the extended imm16 (PC Ext, with 00 appended), and a mux controlled by nPC_sel selects the next PC, which also supplies the Inst Address]
Putting it All Together: A Single Cycle Datapath
[Figure: the complete datapath. Instruction<31:0> from the Inst Memory supplies Rs <25:21>, Rt <20:16>, Rd <15:11>, and Imm16 <15:0>. The fetch unit (PC, two adders, PC Ext, mux) is controlled by nPC_sel and Equal. The RegDst mux (1 = Rd, 0 = Rt) selects Rw; the register file is written when RegWr is asserted. The Extender (ExtOp) and the ALUSrc mux (0 = busB, 1 = extended imm16) feed the ALU (ALUctr). The ALU result addresses the Data Memory (WrEn = MemWr, Data In = busB). The MemtoReg mux (0 = ALU, 1 = memory) drives busW]
An Abstract View of the Critical Path
° Register file and ideal memory:
• The CLK input is a factor ONLY during write operation
• During read operation, behave as combinational logic:
- Address valid => Output valid after “access time.”

Critical Path (Load Operation) =
PC’s Clk-to-Q +
Instruction Memory’s Access Time +
Register File’s Access Time +
ALU to Perform a 32-bit Add +
Data Memory Access Time +
Setup Time for Register File Write +
Clock Skew

[Figure: PC and Next Address logic -> Ideal Instruction Memory -> 32 32-bit Registers (Rw, Ra, Rb from Rd, Rs, Rt; Imm 16 bits) -> ALU (inputs A, B) -> Ideal Data Memory (Address, Data In); PC, registers, and data memory are clocked by Clk]
An Abstract View of the Implementation
[Figure: the Control block receives the Instruction from the Ideal Instruction Memory, sends Control Signals to the Datapath, and receives Conditions back. The Datapath contains the PC and Next Address logic, 32 32-bit Registers (Rw, Ra, Rb from Rd, Rs, Rt), the ALU (inputs A, B), and the Ideal Data Memory (Address, Data In, Data Out)]
Recap: A Single Cycle Datapath
° Rs, Rt, Rd and Imm16 hardwired into datapath from Fetch Unit
° We have everything except control signals (underlined in the figure)
• Today’s lecture will show you how to generate the control signals

[Figure: the single cycle datapath. The Instruction Fetch Unit (nPC_sel, Clk) delivers Instruction<31:0> with Rs <25:21>, Rt <20:16>, Rd <15:11>, Imm16 <15:0>. Control signals: RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. The ALU produces Zero]
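Recap: Meaning of the Control Signals
° nPC_MUX_sel: 0 => PC <– PC + 4
               1 => PC <– PC + 4 + ( SignExt(Im16) || 00 )

° Later in lecture: higher-level connection between mux and branch cond

[Figure: fetch unit. The PC addresses the Inst Memory; one adder computes PC + 4, a second adder adds the extended imm16 (PC Ext, with 00 appended); a mux controlled by nPC_MUX_sel selects the next PC]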
Recap: Meaning of the Control Signals
° ExtOp: “zero”, “sign”
° ALUsrc: 0 => regB; 1 => immed
° ALUctr: “add”, “sub”, “or”
° MemWr: 1 => write memory
° MemtoReg: 0 => ALU; 1 => Mem
° RegDst: 0 => “rt”; 1 => “rd”
° RegWr: 1 => write register

[Figure: the datapath with the RegDst mux (1 = Rd, 0 = Rt), register file (RegWr), Extender (ExtOp), ALUSrc mux (0 = busB, 1 = immediate), ALU (ALUctr, Equal), Data Memory (MemWr), and MemtoReg mux (0 = ALU, 1 = memory)]
RTL: The Add Instruction
[R-type format: op (6 bits), rs (5 bits), rt (5 bits), rd (5 bits), shamt (5 bits), funct (6 bits)]

° add rd, rs, rt
• mem[PC] Fetch the instruction from memory
• R[rd] <- R[rs] + R[rt] The actual operation
• PC <- PC + 4 Calculate the next instruction’s address
Instruction Fetch Unit at the Beginning of Add
° Fetch the instruction from Instruction memory: Instruction <- mem[PC]
• This is the same for all instructions

[Figure: fetch unit. The PC addresses the Inst Memory, which outputs Instruction<31:0>; the two adders, PC Ext, and the mux controlled by nPC_MUX_sel compute the next PC]
The Single Cycle Datapath during Add
[R-type format: op, rs, rt, rd, shamt, funct]

° R[rd] <- R[rs] + R[rt]

Control signal settings shown on the datapath:
nPC_sel = +4
RegDst = 1
RegWr = 1
ExtOp = x
ALUSrc = 0
ALUctr = Add
MemWr = 0
MemtoReg = 0
Instruction Fetch Unit at the End of Add
° PC <- PC + 4
• This is the same for all instructions except: Branch and Jump

[Figure: fetch unit with the mux controlled by nPC_MUX_sel selecting input 0 (PC + 4) rather than input 1 (branch target)]
The Single Cycle Datapath during Or Immediate
[I-type format: op, rs, rt, immediate]

° R[rt] <- R[rs] or ZeroExt[Imm16]

Control signal settings (left blank on the slide):
nPC_sel =
RegDst =
RegWr =
ExtOp =
ALUSrc =
ALUctr =
MemWr =
MemtoReg =
The Single Cycle Datapath during Load
[I-type format: op, rs, rt, immediate]

° R[rt] <- Data Memory {R[rs] + SignExt[imm16]}

Control signal settings shown on the datapath:
nPC_sel = +4
RegDst = 0
RegWr = 1
ExtOp = 1
ALUSrc = 1
ALUctr = Add
MemWr = 0
MemtoReg = 1
The Single Cycle Datapath during Store
[I-type format: op, rs, rt, immediate]

° Data Memory {R[rs] + SignExt[imm16]} <- R[rt]

Control signal settings (left blank on the slide):
nPC_sel =
RegDst =
RegWr =
ExtOp =
ALUSrc =
ALUctr =
MemWr =
MemtoReg =
The Single Cycle Datapath during Store
[I-type format: op, rs, rt, immediate]

° Data Memory {R[rs] + SignExt[imm16]} <- R[rt]

Control signal settings shown on the datapath:
nPC_sel = +4
RegDst = x
RegWr = 0
ExtOp = 1
ALUSrc = 1
ALUctr = Add
MemWr = 1
MemtoReg = x
The Single Cycle Datapath during Branch
[I-type format: op, rs, rt, immediate]

° if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0

Control signal settings shown on the datapath:
nPC_sel = “Br”
RegDst = x
RegWr = 0
ExtOp = x
ALUSrc = 0
ALUctr = Sub
MemWr = 0
MemtoReg = x
Instruction Fetch Unit at the End of Branch
[I-type format: op, rs, rt, immediate]

° if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4

° What is encoding of nPC_sel?
• Direct MUX select?
• Branch / not branch
° Let’s choose second option

nPC_sel   zero?   MUX
0         x       0
1         0       0
1         1       1

[Figure: fetch unit. nPC_sel and Zero are combined to form nPC_MUX_sel, which selects PC + 4 (input 0) or the branch target (input 1) as the next PC]
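
The chosen encoding (nPC_sel means “branch”, and the mux select is nPC_sel AND Zero) can be sketched as next-address logic. This is an illustration under stated assumptions: the module and signal names are hypothetical, and a full 32-bit PC is used for simplicity.

```verilog
// Hypothetical next-address logic sketch (not from the lecture).
module next_pc(pc_next, pc, imm16, npc_sel, zero);
  input  [31:0] pc;
  input  [15:0] imm16;
  input         npc_sel, zero;
  output [31:0] pc_next;

  // Sequential address.
  wire [31:0] pc_plus4 = pc + 32'd4;

  // SignExt(imm16) || 00: sign-extend, then append two zero bits
  // (equivalent to multiplying the word offset by 4).
  wire [31:0] offset = {{14{imm16[15]}}, imm16, 2'b00};

  // Second option from the slide: the mux select is 1 only when the
  // instruction is a branch AND the ALU reports Zero.
  wire take_branch = npc_sel & zero;

  assign pc_next = take_branch ? (pc_plus4 + offset) : pc_plus4;
endmodule
```

With npc_sel = 0 the value of zero is ignored, which matches the first row of the truth table above.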
Step 4: Given Datapath: RTL -> Control
[Figure: Instruction<31:0> from the Inst Memory is split into Op, Fun, Rt, Rs, Rd, and Imm16. Op and Fun feed the Control block, which generates nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr, and MemtoReg for the DATA PATH; the datapath returns Zero to the control]
A Summary of Control Signals

inst    Register Transfer

ADD     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”
SUB     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”
ORi     R[rt] <– R[rs] | zero_ext(Imm16); PC <– PC + 4
        ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”
LOAD    R[rt] <– MEM[ R[rs] + sign_ext(Imm16) ]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”
STORE   MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”
BEQ     if ( R[rs] == R[rt] ) then PC <– PC + 4 + ( sign_ext(Imm16) || 00 ) else PC <– PC + 4
        nPC_sel = “Br”, ALUctr = “sub”
A Summary of the Control Signals
(See Appendix A for the func and op encodings; func is a don’t care for the non-R-type instructions.)

              add     sub       ori     lw      sw      beq       jump
func          100000  100010    x       x       x       x         x
op            000000  000000    001101  100011  101011  000100    000010
RegDst        1       1         0       0       x       x         x
ALUSrc        0       0         1       1       1       0         x
MemtoReg      0       0         0       1       x       x         x
RegWrite      1       1         1       1       0       0         0
MemWrite      0       0         0       0       1       0         0
nPCsel        0       0         0       0       0       1         0
Jump          0       0         0       0       0       0         1
ExtOp         x       x         0       1       1       x         x
ALUctr<2:0>   Add     Subtract  Or      Add     Add     Subtract  xxx

R-type: op | rs | rt | rd | shamt | funct (add, sub)
I-type: op | rs | rt | immediate (ori, lw, sw, beq)
J-type: op | target address (jump)


The Concept of Local Decoding
              R-type    ori      lw       sw       beq       jump
op            00 0000   00 1101  10 0011  10 1011  00 0100   00 0010
RegDst        1         0        0        x        x         x
ALUSrc        0         1        1        1        0         x
MemtoReg      0         0        1        x        x         x
RegWrite      1         1        1        0        0         0
MemWrite      0         0        0        1        0         0
Branch        0         0        0        0        1         0
Jump          0         0        0        0        0         1
ExtOp         x         0        1        1        x         x
ALUop<N:0>    “R-type”  Or       Add      Add      Subtract  xxx

[Figure: Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and produces ALUctr (3 bits) for the ALU]
ALU
The Encoding of ALUop
[Figure: Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and produces ALUctr (3 bits)]

° In this exercise, ALUop has to be 2 bits wide to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform:
- (2) Or, (3) Add, and (4) Subtract

° To implement the full MIPS ISA, ALUop has to be 3 bits to represent:
• (1) “R-type” instructions
• “I-type” instructions that require the ALU to perform:
- (2) Or, (3) Add, (4) Subtract, and (5) And (Example: andi)

                   R-type    ori   lw    sw    beq       jump
ALUop (Symbolic)   “R-type”  Or    Add   Add   Subtract  xxx
ALUop<2:0>         1 00      0 10  0 00  0 00  0 01      xxx
The Decoding of the “func” Field
[Figure: Main Control takes op (6 bits) and produces ALUop (N bits); ALU Control (Local) takes ALUop and func (6 bits) and produces ALUctr (3 bits)]

                   R-type    ori   lw    sw    beq       jump
ALUop (Symbolic)   “R-type”  Or    Add   Add   Subtract  xxx
ALUop<2:0>         1 00      0 10  0 00  0 00  0 01      xxx

R-type format: op | rs | rt | rd | shamt | funct

P. 286 text:
funct<5:0>   Instruction Operation
10 0000      add
10 0010      subtract
10 0100      and
10 0101      or
10 1010      set-on-less-than

ALUctr<2:0>  ALU Operation
000          And
001          Or
010          Add
110          Subtract
111          Set-on-less-than
The Truth Table for ALUctr
                   R-type    ori   lw    sw    beq
ALUop (Symbolic)   “R-type”  Or    Add   Add   Subtract
ALUop<2:0>         1 00      0 10  0 00  0 00  0 01

funct<3:0>   Instruction Op.
0000         add
0010         subtract
0100         and
0101         or
1010         set-on-less-than

ALUop                   func                            ALU         ALUctr
bit<2> bit<1> bit<0>    bit<3> bit<2> bit<1> bit<0>     Operation   bit<2> bit<1> bit<0>
0      0      0         x      x      x      x          Add         0      1      0
0      x      1         x      x      x      x          Subtract    1      1      0
0      1      x         x      x      x      x          Or          0      0      1
1      x      x         0      0      0      0          Add         0      1      0
1      x      x         0      0      1      0          Subtract    1      1      0
1      x      x         0      1      0      0          And         0      0      0
1      x      x         0      1      0      1          Or          0      0      1
1      x      x         1      0      1      0          Set on <    1      1      1
The Logic Equation for ALUctr<2>
ALUop                   func
bit<2> bit<1> bit<0>    bit<3> bit<2> bit<1> bit<0>     ALUctr<2>
0      x      1         x      x      x      x          1
1      x      x         0      0      1      0          1
1      x      x         1      0      1      0          1

The last two rows differ only in func<3>, which makes func<3> a don’t care

° ALUctr<2> = !ALUop<2> & ALUop<0>
            + ALUop<2> & !func<2> & func<1> & !func<0>
The Logic Equation for ALUctr<1>
ALUop                   func
bit<2> bit<1> bit<0>    bit<3> bit<2> bit<1> bit<0>     ALUctr<1>
0      0      0         x      x      x      x          1
0      x      1         x      x      x      x          1
1      x      x         0      0      0      0          1
1      x      x         0      0      1      0          1
1      x      x         1      0      1      0          1

° ALUctr<1> = !ALUop<2> & !ALUop<1>
            + ALUop<2> & !func<2> & !func<0>
The Logic Equation for ALUctr<0>
ALUop                   func
bit<2> bit<1> bit<0>    bit<3> bit<2> bit<1> bit<0>     ALUctr<0>
0      1      x         x      x      x      x          1
1      x      x         0      1      0      1          1
1      x      x         1      0      1      0          1

° ALUctr<0> = !ALUop<2> & ALUop<1>
            + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
            + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
The ALU Control Block
[Figure: ALU Control (Local) takes func (6 bits) and ALUop (3 bits) and produces ALUctr (3 bits)]

° ALUctr<2> = !ALUop<2> & ALUop<0>
            + ALUop<2> & !func<2> & func<1> & !func<0>

° ALUctr<1> = !ALUop<2> & !ALUop<1>
            + ALUop<2> & !func<2> & !func<0>

° ALUctr<0> = !ALUop<2> & ALUop<1>
            + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
            + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
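
The three equations translate directly into continuous assignments. The sketch below is a transcription for illustration; the module and port names are hypothetical, and only func<3:0> is actually used, as in the equations.

```verilog
// Hypothetical ALU control sketch: a direct transcription of the
// three sum-of-products equations above (not the lecture's own code).
module alu_control(alu_ctr, alu_op, func);
  input  [2:0] alu_op;
  input  [5:0] func;
  output [2:0] alu_ctr;

  assign alu_ctr[2] = (~alu_op[2] &  alu_op[0]) |
                      ( alu_op[2] & ~func[2] & func[1] & ~func[0]);

  assign alu_ctr[1] = (~alu_op[2] & ~alu_op[1]) |
                      ( alu_op[2] & ~func[2] & ~func[0]);

  assign alu_ctr[0] = (~alu_op[2] &  alu_op[1]) |
                      ( alu_op[2] & ~func[3] &  func[2] & ~func[1] &  func[0]) |
                      ( alu_op[2] &  func[3] & ~func[2] &  func[1] & ~func[0]);
endmodule
```

As a spot check against the truth table: alu_op = 3'b001 (beq) gives alu_ctr = 3'b110 (Subtract), and alu_op = 3'b100 with func<3:0> = 4'b0101 (or) gives alu_ctr = 3'b001 (Or).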
Step 5: Logic for each control signal

° nPC_sel <= (OP == `BEQ) ? `Br : `plus4;

° ALUsrc <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;

° ALUctr <= (OP == `Rtype) ? funct :
            (OP == `ORi) ? `ORfunction :
            (OP == `BEQ) ? `SUBfunction : `ADDfunction;

° ExtOp <= _____________

° MemWr <= _____________

° MemtoReg <= _____________

° RegWr <= _____________

° RegDst <= _____________


Step 5: Logic for each control signal

° nPC_sel <= (OP == `BEQ) ? `Br : `plus4;

° ALUsrc <= ((OP == `Rtype) || (OP == `BEQ)) ? `regB : `immed;

° ALUctr <= (OP == `Rtype) ? funct :
            (OP == `ORi) ? `ORfunction :
            (OP == `BEQ) ? `SUBfunction : `ADDfunction;

° ExtOp <= (OP == `ORi) ? `ZEROextend : `SIGNextend;

° MemWr <= (OP == `Store) ? 1 : 0;

° MemtoReg <= (OP == `Load) ? 1 : 0;

° RegWr <= ((OP == `Store) || (OP == `BEQ)) ? 0 : 1;

° RegDst <= ((OP == `Load) || (OP == `ORi)) ? 0 : 1;


The “Truth Table” for the Main Control
[Figure: Main Control takes op (6 bits) and produces RegDst, ALUSrc, ..., and ALUop (3 bits); ALU Control (Local) takes ALUop and func (6 bits) and produces ALUctr (3 bits)]

                   R-type    ori      lw       sw       beq       jump
op                 00 0000   00 1101  10 0011  10 1011  00 0100   00 0010
RegDst             1         0        0        x        x         x
ALUSrc             0         1        1        1        0         x
MemtoReg           0         0        1        x        x         x
RegWrite           1         1        1        0        0         0
MemWrite           0         0        0        1        0         0
nPC_sel            0         0        0        0        1         0
Jump               0         0        0        0        0         1
ExtOp              x         0        1        1        x         x
ALUop (Symbolic)   “R-type”  Or       Add      Add      Subtract  xxx
ALUop <2>          1         0        0        0        0         x
ALUop <1>          0         1        0        0        0         x
ALUop <0>          0         0        0        0        1         x
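
One way to realize this truth table is a case statement on the opcode. The sketch below is an assumption-laden illustration, not the lecture's implementation: the module and port names are hypothetical, every don’t care (x) is arbitrarily driven to 0, and unlisted opcodes fall into a default that disables all writes.

```verilog
// Hypothetical main control sketch built from the truth table above.
// Assumptions: don't-care entries are driven to 0; unknown opcodes
// produce all zeros so that no register or memory write occurs.
module main_control(op, reg_dst, alu_src, mem_to_reg, reg_wr, mem_wr,
                    npc_sel, jump, ext_op, alu_op);
  input  [5:0] op;
  output       reg_dst, alu_src, mem_to_reg, reg_wr, mem_wr;
  output       npc_sel, jump, ext_op;
  output [2:0] alu_op;

  // Bit order: RegDst ALUSrc MemtoReg RegWrite MemWrite nPC_sel Jump ExtOp ALUop<2:0>
  reg [10:0] ctl;
  assign {reg_dst, alu_src, mem_to_reg, reg_wr, mem_wr,
          npc_sel, jump, ext_op, alu_op} = ctl;

  always @(op)
    case (op)
      6'b000000: ctl = 11'b1_0_0_1_0_0_0_0_100;  // R-type
      6'b001101: ctl = 11'b0_1_0_1_0_0_0_0_010;  // ori
      6'b100011: ctl = 11'b0_1_1_1_0_0_0_1_000;  // lw
      6'b101011: ctl = 11'b0_1_0_0_1_0_0_1_000;  // sw
      6'b000100: ctl = 11'b0_0_0_0_0_1_0_0_001;  // beq
      6'b000010: ctl = 11'b0_0_0_0_0_0_1_0_000;  // jump
      default:   ctl = 11'b0;                    // unlisted opcodes (assumption)
    endcase
endmodule
```

Packing the outputs into one vector keeps each table column on a single line, which makes it easy to compare the code against the truth table row by row.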
The “Truth Table” for RegWrite
              R-type    ori      lw       sw       beq       jump
op            00 0000   00 1101  10 0011  10 1011  00 0100   00 0010
RegWrite      1         1        1        0        0         0

° RegWrite = R-type + ori + lw
  = !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)
  + !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)
  + op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)

[Figure: op<5>..op<0> feed six AND gates that decode R-type, ori, lw, sw, beq, and jump; an OR gate over the R-type, ori, and lw terms produces RegWrite]
PLA Implementation of the Main Control
[Figure: op<5>..op<0> feed six AND gates that decode R-type, ori, lw, sw, beq, and jump; OR gates over these terms produce RegWrite, ALUSrc, RegDst, MemtoReg, MemWrite, Branch, Jump, ExtOp, ALUop<2>, ALUop<1>, and ALUop<0>]
Putting it All Together: A Single Cycle Processor
[Figure: the complete processor. Main Control takes op = Instr<31:26> and produces RegDst, ALUSrc, ..., and ALUop (3 bits); ALU Control takes ALUop and func = Instr<5:0> and produces ALUctr (3 bits). The Instruction Fetch Unit (nPC_sel, Clk) delivers Instruction<31:0> with Rs <25:21>, Rt <20:16>, Rd <15:11>, Imm16 = Instr<15:0>. The datapath is as before: RegDst mux, register file (RegWr), Extender (ExtOp), ALUSrc mux, ALU (ALUctr, Zero), Data Memory (MemWr), and MemtoReg mux]
Recap: An Abstract View of the Critical Path (Load)
° Register file and ideal memory:
• The CLK input is a factor ONLY during write operation
• During read operation, behave as combinational logic:
- Address valid => Output valid after “access time.”

Critical Path (Load Operation) =
PC’s Clk-to-Q +
Instruction Memory’s Access Time +
Register File’s Access Time +
ALU to Perform a 32-bit Add +
Data Memory Access Time +
Setup Time for Register File Write +
Clock Skew

[Figure: PC and Next Address logic -> Ideal Instruction Memory -> 32 32-bit Registers (Rw, Ra, Rb from Rd, Rs, Rt; Imm 16 bits) -> ALU (inputs A, B) -> Ideal Data Memory (Address, Data In); PC, registers, and data memory are clocked by Clk]
Worst Case Timing (Load)
[Timing diagram: signals change from Old Value to New Value in this order within one Clk cycle]
• PC: after Clk-to-Q
• Rs, Rt, Rd, Op, Func: after Instruction Memory Access Time
• ALUctr, ExtOp, ALUSrc, MemtoReg, RegWr: after Delay through Control Logic
• busA: after Register File Access Time
• busB: after Delay through Extender & Mux
• Address: after ALU Delay
• busW: after Data Memory Access Time
• Register Write occurs at the clock edge that ends the cycle
Drawback of this Single Cycle Processor
° Long cycle time:
• Cycle time must be long enough for the load instruction:
PC’s Clock-to-Q +
Instruction Memory Access Time +
Register File Access Time +
ALU Delay (address calculation) +
Data Memory Access Time +
Register File Setup Time +
Clock Skew

° Cycle time for load is much longer than needed for all other instructions
Summary
° Single cycle datapath => CPI = 1, clock cycle time (CCT) => long

° 5 steps to design a processor
• 1. Analyze instruction set => datapath requirements
• 2. Select set of datapath components & establish clock methodology
• 3. Assemble datapath meeting the requirements
• 4. Analyze implementation of each instruction to determine setting of control points that effect the register transfer.
• 5. Assemble the control logic

° Control is the hard part

° MIPS makes control easier
• Instructions same size
• Source registers always in same place
• Immediates same size, location
• Operations always on registers/immediates

[Figure: the five components of a computer: Processor (Control and Datapath), Memory, Input, Output]
