Digital Integrated Circuits
A Design Perspective
Jan M. Rabaey
Anantha Chandrakasan
Borivoje Nikolic
Arithmetic Circuits
January, 2003
EE141 Digital Integrated Circuits, 2nd edition
A Generic Digital Processor
[Figure: block diagram of a processor with MEMORY, INPUT-OUTPUT, CONTROL, and DATAPATH blocks]
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
An Intel Microprocessor
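[Figure: Intel microprocessor adder datapath with 9-1 and 5-1 multiplexers, CARRYGEN and SUMSEL blocks, an output register (REG, clocked by ck1) driving the cache, and the logical unit (LU); scale bar 1000 um]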
Bit-Sliced Design
[Figure: four bit slices (Bit 0 to Bit 3), each containing a register, adder, shifter, and multiplexer between Data-In and Data-Out, with Control lines running across the slices]
[Figure: bit-sliced datapath floorplan for bit slices 0 to 63: multiplexers, shifter, adder stages 1 to 3 separated by wiring channels, sum select, and loopback buses]
Itanium Integer Datapath
Full-Adder
[Figure: full-adder block with inputs A, B and output Sum]
The Binary Adder
[Figure: full-adder block with inputs A, B and output Sum]
$S = A \oplus B \oplus C_i$
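A minimal behavioral sketch of the sum equation in Python. The carry-out expression (majority of the three inputs) is the standard full-adder carry and is an assumption here, since only the sum equation is given; the name `full_adder` is illustrative.

```python
from itertools import product


def full_adder(a: int, b: int, ci: int) -> tuple[int, int]:
    """Return (sum, carry_out) for one-bit inputs a, b, ci."""
    s = a ^ b ^ ci                       # S = A xor B xor Ci
    co = (a & b) | (b & ci) | (a & ci)   # assumed: majority function
    return s, co


# The two outputs together encode the arithmetic sum of the three inputs.
for a, b, ci in product((0, 1), repeat=3):
    s, co = full_adder(a, b, ci)
    assert a + b + ci == s + 2 * co
```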
Express Sum and Carry as a function of P, G, D
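One common way to express the outputs through these signals, sketched in Python. The definitions G = AB (generate), D = A'B' (delete), and P = A xor B (propagate) are the usual textbook ones and are assumed here; the function names are illustrative.

```python
from itertools import product


def pgd(a: int, b: int) -> tuple[int, int, int]:
    """Return (P, G, D) for one bit position (assumed definitions)."""
    p = a ^ b              # propagate: carry-out follows carry-in
    g = a & b              # generate: carry-out is 1 regardless of carry-in
    d = (1 - a) & (1 - b)  # delete: carry-out is 0 regardless of carry-in
    return p, g, d


def sum_and_carry(a: int, b: int, ci: int) -> tuple[int, int]:
    """S = P xor Ci, Co = G + P*Ci."""
    p, g, _ = pgd(a, b)
    return p ^ ci, g | (p & ci)


for a, b, ci in product((0, 1), repeat=3):
    s, co = sum_and_carry(a, b, ci)
    assert a + b + ci == s + 2 * co   # matches binary addition
    assert sum(pgd(a, b)) == 1        # exactly one of P, G, D is active
```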
[Figure: complementary static CMOS full-adder schematic with inputs A, B, Ci, internal node X, and outputs S and Co]
28 Transistors
A Better Structure: The Mirror Adder
[Figure: mirror adder schematic with Kill, Generate, "0"-Propagate, and "1"-Propagate paths; inputs A, B, Ci; outputs Co and S]
24 transistors
Mirror Adder
Stick Diagram
[Figure: stick diagram of the mirror adder between the VDD and GND rails, with gate inputs A, B, Ci and output Co]
The Mirror Adder
The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carry-generation circuitry.
When laying out the cell, the most critical issue is minimizing the capacitance at node Co. Reducing the diffusion capacitances is particularly important.
The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
The transistors connected to Ci are placed closest to the output.
Only the transistors in the carry stage have to be sized for speed. All transistors in the sum stage can be minimum size.
Transmission Gate Full Adder
[Figure: transmission-gate full adder with Setup, Sum Generation, and Carry Generation sections; signals A, B, P, Ci, S, Co]
One-phase dynamic CMOS adder
The Ripple-Carry Adder
[Figure: 4-bit ripple-carry adder with inputs A0, B0 .. A3, B3 and outputs S0..S3]
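A small behavioral sketch of a ripple-carry adder in Python, assuming bits are stored least-significant first. The helper names `to_bits` and `ripple_carry_add` are illustrative. Each stage must wait for the carry of the stage before it, which is why the delay grows with the number of bits.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Little-endian bit list of `value`, `width` bits long."""
    return [(value >> i) & 1 for i in range(width)]


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists; return (sum_bits, carry_out)."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))  # carry ripples to the next stage
    return sum_bits, carry


# 11 + 6 = 17: the low four bits are 0001 and the carry-out is 1.
print(ripple_carry_add(to_bits(11, 4), to_bits(6, 4)))
```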
Inversion Property
[Figure: two full-adder (FA) cells side by side, each with inputs A, B, Ci and outputs S, Co]
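The property is usually stated as: inverting all inputs of a full adder inverts both outputs. That statement is assumed here (the slide's equations did not survive), and the Python sketch below checks it exhaustively. Function names are illustrative.

```python
from itertools import product


def full_adder(a: int, b: int, ci: int) -> tuple[int, int]:
    """Return (sum, carry_out) of a one-bit full adder."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)


def inv(x: int) -> int:
    """Logical inversion of a single bit."""
    return 1 - x


def inversion_holds() -> bool:
    """Check S(A', B', Ci') = S' and Co(A', B', Ci') = Co' for all inputs."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(inv(a), inv(b), inv(ci))
        if (s_inv, co_inv) != (inv(s), inv(co)):
            return False
    return True


print(inversion_holds())
```

Because of this symmetry, a stage that receives an inverted carry can be fed inverted operands instead of adding an inverter to the carry path.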
Minimize Critical Path by Reducing Inverting Stages
[Figure: 4-bit ripple-carry adder with inputs A0, B0 .. A3, B3 and outputs S0..S3]
Carry Look-Ahead Adders
Look-Ahead: Topology
Expanding Lookahead equations:
$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} C_{o,k-2})$
All the way:
$C_{o,k} = G_k + P_k (G_{k-1} + P_{k-1} (\cdots + P_1 (G_0 + P_0 C_{i,0})))$
[Figure: transistor-level lookahead gate (VDD at top) computing Co,3 from G0..G3, P0..P3, and Ci,0]
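The fully expanded form can be sketched in Python as a sum of products, so that each carry depends only on the P and G signals and on Ci,0, never on another carry. Taking P = A xor B and G = AB is an assumption of this sketch, and the function names are illustrative.

```python
def lookahead_carry(k, p, g, c_in):
    """Co,k = Gk + Pk*Gk-1 + ... + Pk*...*P1*G0 + Pk*...*P0*Ci,0."""
    result = g[k]
    prefix = p[k]                    # running product Pk * Pk-1 * ...
    for j in range(k - 1, -1, -1):
        result |= prefix & g[j]
        prefix &= p[j]
    return result | (prefix & c_in)


def lookahead_add(a, b, width=4, c_in=0):
    """Add two integers using expanded lookahead carries."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    p = [x ^ y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]
    carries = [c_in] + [lookahead_carry(k, p, g, c_in) for k in range(width)]
    total = sum((p[k] ^ carries[k]) << k for k in range(width))
    return total, carries[width]


print(lookahead_add(9, 7))  # 9 + 7 = 16: sum bits 0000, carry-out 1
```

The number of product terms grows with k, which is why the flat expansion is practical only for a small number of bits.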
Manchester Carry Chain
[Figure: Manchester carry-chain gate schematics with signals Pi (propagate), Gi (generate), Di (delete), Ci, and Co]
Manchester Carry Chain
[Figure: 4-bit Manchester carry chain with propagate signals P0..P3, generate signals G0..G3, carry-in Ci,0, and carries C0..C3]
Manchester Carry Chain
Stick Diagram
[Figure: stick diagram with a Propagate/Generate row (Pi, Gi, Pi+1, Gi+1 and carries Ci-1, Ci, Ci+1) between VDD and GND, above an Inverter/Sum row]
Carry-Bypass Adder
Also called Carry-Skip
[Figure: top, a 4-bit ripple chain of full adders (FA) with inputs P0, G0 .. P3, G3 and carries Ci,0, Co,0 .. Co,3; bottom, the same chain with a multiplexer, controlled by BP, that selects Co,3]
$BP = P_0 P_1 P_2 P_3$
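A behavioral sketch of one 4-bit bypass block in Python, assuming P = A xor B and G = AB. When all four propagate signals are 1, the multiplexer forwards Ci,0 directly; otherwise it takes the rippled carry. Function names are illustrative.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Little-endian bit list of `value`."""
    return [(value >> i) & 1 for i in range(width)]


def bypass_block(a_bits, b_bits, c_in):
    """One carry-bypass block; returns (sum_bits, carry_out, bp)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    sum_bits, carry = [], c_in
    for pi, gi in zip(p, g):
        sum_bits.append(pi ^ carry)
        carry = gi | (pi & carry)        # slow path: ripple through the block
    bp = int(all(p))                     # BP = P0 P1 P2 P3
    c_out = c_in if bp else carry        # multiplexer selects the fast path
    return sum_bits, c_out, bp


# 0101 + 1010 has every propagate bit set, so the carry-in is bypassed.
print(bypass_block(to_bits(5, 4), to_bits(10, 4), 1))
```

The bypass never changes the logical result; it only shortens the path the carry has to travel when the whole block propagates.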
Carry-Bypass Adder (cont.)
[Figure: 16-bit carry-bypass adder in four M-bit stages (bits 0-3, 4-7, 8-11, 12-15), each with a Setup block; the delays t_setup and t_bypass are marked]
Carry Ripple versus Carry Bypass
[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder; the curves cross at roughly N = 4..8]
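A first-order delay model in Python that reproduces the shape of this comparison. The two formulas are the commonly quoted textbook expressions and are assumptions here, as are the unit values chosen for every delay parameter; real crossover points depend on the actual gate delays.

```python
def ripple_delay(n, t_carry=1.0, t_sum=1.0):
    """Assumed model: t = (N - 1) * t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n, m=4, t_setup=1.0, t_carry=1.0, t_bypass=1.0, t_sum=1.0):
    """Assumed model for N bits split into stages of M bits:
    t = t_setup + M*t_carry + (N/M - 1)*t_bypass + (M - 1)*t_carry + t_sum.
    """
    stages = n // m
    return (t_setup + m * t_carry + (stages - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)


for n in (4, 8, 16, 32, 64):
    print(n, ripple_delay(n), bypass_delay(n))
```

Both delays are linear in N, but the bypass adder's slope is smaller by about a factor of M, so it wins once N is large enough to amortize the fixed overhead.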
Carry-Select Adder
[Figure: carry-select block: Setup produces P,G; a carry vector is selected; Sum Generation follows]
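A behavioral sketch of carry selection in Python. Each block computes its result for both possible carry-in values, and the incoming carry then picks one. Integer `+` stands in for the two per-block adders, and the block size of 4 is an assumption of this sketch.

```python
def carry_select_add(a, b, n_bits=16, block=4, c_in=0):
    """Add two n_bits-wide integers block by block; return (sum, carry_out)."""
    mask = (1 << block) - 1
    result, carry = 0, c_in
    for lo in range(0, n_bits, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        with_zero = a_blk + b_blk        # computed assuming carry-in = 0
        with_one = a_blk + b_blk + 1     # computed assuming carry-in = 1
        chosen = with_one if carry else with_zero   # multiplexer
        result |= (chosen & mask) << lo
        carry = chosen >> block          # selects the next block's result
    return result, carry


print(carry_select_add(0xFFFF, 1))   # wraps to 0 with carry-out 1
```

In hardware both candidate results are ready at about the same time, so only the multiplexer delay per block remains on the carry path.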
Carry Select Adder: Critical Path
[Figure: 16-bit carry-select adder in four 4-bit stages (bits 0-3, 4-7, 8-11, 12-15), each with a Setup block; the critical path is highlighted]
Linear Carry Select
[Figure: linear carry-select adder with four equal 4-bit stages: bits 0-3, 4-7, 8-11, 12-15]
Square Root Carry Select
[Figure: square-root carry-select adder with stages of increasing width: bits 0-1, 2-4, 5-8, 9-13, 14-19]
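The stage boundaries follow a simple pattern: each stage is one bit wider than the one before it. A short Python sketch that generates them, assuming the first stage is 2 bits wide as in the figure labels; the function name is illustrative.

```python
def sqrt_select_stages(n_bits, first=2):
    """Return (low_bit, high_bit) pairs for stages of width first, first+1, ..."""
    stages = []
    lo, size = 0, first
    while lo < n_bits:
        hi = min(lo + size, n_bits) - 1   # last stage is clipped to n_bits
        stages.append((lo, hi))
        lo, size = hi + 1, size + 1
    return stages


print(sqrt_select_stages(20))
# [(0, 1), (2, 4), (5, 8), (9, 13), (14, 19)]
```

Since stage widths grow linearly, the number of stages needed for N bits grows roughly as the square root of N, which is where the name comes from.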
Adder Delays - Comparison
[Figure: adder delay (0 to 50) versus number of bits N (0 to 60), with curves for linear select and square root select]
O Operator
Definition
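The operator is commonly defined on (generate, propagate) pairs as (g, p) o (g', p') = (g + p g', p p'), with the more significant pair on the left. That definition is assumed here, since the slide body was not recovered. The Python sketch below implements it and checks associativity exhaustively.

```python
from itertools import product


def o(left, right):
    """(g, p) o (g', p') = (g + p*g', p*p'); `left` is more significant."""
    g, p = left
    g2, p2 = right
    return (g | (p & g2), p & p2)


def is_associative() -> bool:
    """Check (x o y) o z == x o (y o z) over all one-bit (g, p) pairs."""
    pairs = list(product((0, 1), repeat=2))
    return all(o(o(x, y), z) == o(x, o(y, z))
               for x, y, z in product(pairs, repeat=3))


print(is_associative())
```

Associativity is what allows group signals to be combined in any bracketing, for example as a tree instead of a chain.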
Properties of the O operator
Group Generate and Propagate
Look-Ahead - Basic Idea
[Figure: N-bit adder with inputs A0, B0 .. AN-1, BN-1, propagate signals P0..PN-1, carries Ci,0 .. Ci,N-1, and sums S0..SN-1]
$C_{o,k} = f(A_k, B_k, C_{o,k-1}) = G_k + P_k C_{o,k-1}$
Logarithmic Look-Ahead Adder
[Figure: a function F of inputs A0..A7 built as a linear chain, with $t_p \sim N$, and as a binary tree, with $t_p \sim \log_2(N)$]
Brent-Kung BLC adder
Folding of the inverse tree
Folding the inverse tree
Dense tree with minimum fan-out
Dense tree with simple connections
Carry Lookahead Trees
$C_{o,0} = G_0 + P_0 C_{i,0}$
$C_{o,1} = G_1 + P_1 G_0 + P_1 P_0 C_{i,0}$
$C_{o,2} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{i,0}$
$\quad = (G_2 + P_2 G_1) + (P_2 P_1)(G_0 + P_0 C_{i,0}) = G_{2:1} + P_{2:1} C_{o,0}$
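The grouping in the last line generalizes to a tree: group signals over spans of 1, 2, 4, ... bits are merged level by level. The Python sketch below uses a dense, Kogge-Stone-style schedule chosen for brevity; it is one possible arrangement, not necessarily the tree drawn on these slides, and P = A xor B is assumed.

```python
def prefix_carries(p, g, c_in=0):
    """Return [Co,0, ..., Co,n-1] using log2(n) levels of group merging."""
    n = len(p)
    G, P = list(g), list(p)
    d = 1
    while d < n:
        # Each level merges the group ending at i with the group ending at i-d.
        # Both comprehensions read the previous level's G and P.
        G, P = (
            [G[i] | (P[i] & G[i - d]) if i >= d else G[i] for i in range(n)],
            [P[i] & P[i - d] if i >= d else P[i] for i in range(n)],
        )
        d *= 2
    # G[i], P[i] now span bits i..0, so Co,i = G(i:0) + P(i:0) * Ci,0.
    return [G[i] | (P[i] & c_in) for i in range(n)]


def tree_add(a, b, width=8, c_in=0):
    """Add two integers using the prefix carries; return (sum, carry_out)."""
    p = [((a ^ b) >> i) & 1 for i in range(width)]
    g = [((a & b) >> i) & 1 for i in range(width)]
    carries = [c_in] + prefix_carries(p, g, c_in)
    total = sum((p[i] ^ carries[i]) << i for i in range(width))
    return total, carries[width]


print(tree_add(200, 100))   # 300 = 256 + 44, so (44, 1)
```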
[Figure: tree adder prefix graphs with input pairs (Ai, Bi) and sum outputs Si]
Sparse Trees
16-bit radix-2 sparse tree with sparseness of 2
[Figure: prefix graph with input pairs (a0, b0) .. (a15, b15) and sum outputs S0..S15]
Tree Adders
Brent-Kung Tree
[Figure: Brent-Kung prefix graph with input pairs (Ai, Bi) and sum outputs Si]
[Figure: domino propagate and generate gates, clocked by Clk, with inputs ai and bi]
$P_i = a_i + b_i$
$G_i = a_i b_i$
Example: Domino Adder
[Figure: domino gates, clocked by Clk_k, that form the group signals P_{i:i-2k+1} (Propagate) and G_{i:i-2k+1} (Generate) from P_{i:i-k+1}, G_{i:i-k+1}, P_{i-k:i-2k+1}, and G_{i-k:i-2k+1}]
Example: Domino Sum
[Figure: domino sum-select gate with keeper; clocks Clk and Clkd; inputs Gi:0, Si0, Si1; output Sum]
Adders Summary
Multipliers
The Binary Multiplication
        1 0 1 0 1 0    Multiplicand
x           1 0 1 1    Multiplier
        1 0 1 0 1 0
      1 0 1 0 1 0
    0 0 0 0 0 0        Partial products
+ 1 0 1 0 1 0
  1 1 1 0 0 1 1 1 0    Result
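The worked example can be reproduced with a short shift-and-add sketch in Python: one partial product per multiplier bit, each shifted left by its bit position. The function name is illustrative.

```python
def shift_add_multiply(multiplicand: int, multiplier: int):
    """Return (product, partial_products) for non-negative integers."""
    partials = []
    shift = 0
    while multiplier:
        bit = multiplier & 1
        # A 0 multiplier bit contributes an all-zero row, as in the example.
        partials.append((multiplicand << shift) if bit else 0)
        multiplier >>= 1
        shift += 1
    return sum(partials), partials


product, rows = shift_add_multiply(0b101010, 0b1011)
print(bin(product))   # 0b111001110
```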
The Array Multiplier
[Figure: 4x4 array multiplier: partial products of X3..X0 gated by Y0..Y3 are summed by rows of half adders (HA) and full adders (FA), producing Z7..Z0]
The MxN Array Multiplier
[Figure: array of HA and FA cells with two equal-length critical paths highlighted, Critical Path 1 and Critical Path 2]
Carry-Save Multiplier
[Figure: 4x4 carry-save multiplier array: a row of four HAs, two rows of HA, FA, FA, FA, and a final row of HA, FA, FA, HA]
Multiplier Floorplan
[Figure: floorplan of a 4x4 multiplier built from HA Multiplier Cells, FA Multiplier Cells, and Vector Merging Cells; inputs X3..X0 and Y0..Y3; carry (C) and sum (S) signals pass between rows; outputs Z7..Z0]
X and Y signals are broadcast through the complete array.
Wallace-Tree Multiplier
[Figure: dot diagrams by bit position (6 down to 0): (a) partial products, (b) first stage, (c) and (d) later stages, with FA and HA groupings marked]
Wallace-Tree Multiplier
[Figure: 4x4 Wallace-tree multiplier: the partial products x_i y_j (i, j = 0..3) are reduced by a first stage with two HAs and a second stage with four FAs, followed by a final adder producing z7..z0]
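A word-level Python sketch of the reduction idea: groups of three rows are compressed into a sum row and a carry row until only two rows remain, and a final adder merges them. This simplifies the bit-level tree in the figure (it uses no half adders and compresses whole rows), so treat it as an illustration of the principle only.

```python
def carry_save(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression: x + y + z == s + c, with no carry propagation."""
    s = x ^ y ^ z
    c = ((x & y) | (y & z) | (x & z)) << 1
    return s, c


def wallace_multiply(x: int, y: int, n_bits: int = 4) -> int:
    """Multiply by reducing the partial-product rows three at a time."""
    rows = [(x << j) if (y >> j) & 1 else 0 for j in range(n_bits)]
    while len(rows) > 2:
        reduced = []
        for i in range(0, len(rows) - 2, 3):
            reduced.extend(carry_save(*rows[i:i + 3]))
        reduced.extend(rows[len(rows) - len(rows) % 3:])  # leftover rows pass through
        rows = reduced
    return sum(rows)   # stands in for the final carry-propagate adder


print(wallace_multiply(13, 11))   # 143
```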
Wallace-Tree Multiplier
[Figure: two arrangements of full adders (FA) compressing inputs y0..y5 into carry (C) and sum (S) outputs, with carries Ci-1 entering from and Ci leaving to neighboring bit slices]
Wallace Tree Mult. Performance
Wallace Tree Multiplier Complexity
4:2 Adder
Eight-input Tree
Architectural comparison of multiplier solutions
SPIM Architecture
SPIM Pipe Timing
SPIM Microphotograph
SPIM clock generator circuit
Binary Tree Multiplier Performance
Binary Tree Multiplier Complexity
Multipliers Summary
Booth encoding
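A sketch of radix-4 (modified) Booth recoding in Python. Which Booth variant the slide covers is not recoverable, so the radix-4 form is an assumption. Each pair of multiplier bits, together with the top bit of the previous pair, becomes one digit in {-2, -1, 0, 1, 2}, which halves the number of partial products.

```python
def booth_radix4_digits(y: int, n_bits: int) -> list[int]:
    """Recode an n_bits-wide two's-complement multiplier (n_bits even).

    Returns digits least-significant first, with digit k weighted by 4**k.
    """
    y &= (1 << n_bits) - 1
    digits = []
    prev = 0                               # implicit bit to the right of bit 0
    for i in range(0, n_bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_value(digits: list[int]) -> int:
    """Signed value represented by the recoded digits."""
    return sum(d * 4 ** k for k, d in enumerate(digits))


print(booth_radix4_digits(6, 4))   # [-2, 2], since -2 + 2*4 = 6
```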
Tree Multiplier with Booth Encoding
The f.p. addition algorithm
The f.p. multiplication algorithm
Mantissa multiplication
Exponent addition
Mantissa normalization and exponent adjustment (if needed)
Rounding of the result
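The four steps can be sketched in Python on a toy format. The representation is an assumption of this sketch: a number is a tuple (sign, exponent, mantissa) with value (-1)^sign * mantissa * 2^exponent, where the mantissa is a normalized p-bit integer. Rounding is round-to-nearest-even; zeros, infinities, NaNs, and exponent overflow are not handled.

```python
def fp_mul(x, y, p=8):
    """Multiply two toy floating-point numbers with p-bit mantissas (p >= 2)."""
    sx, ex, mx = x
    sy, ey, my = y
    sign = sx ^ sy
    prod = mx * my                    # step 1: mantissa product, 2p-1 or 2p bits
    exp = ex + ey                     # step 2: exponent addition
    shift = prod.bit_length() - p     # step 3: normalize back to p bits
    q = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and q & 1):   # step 4: round to nearest even
        q += 1
        if q == 1 << p:               # rounding overflowed: renormalize
            q >>= 1
            shift += 1
    return sign, exp + shift, q


def fp_value(x) -> float:
    """Decode a toy floating-point tuple to a Python float."""
    sign, exp, mant = x
    return (-1) ** sign * mant * 2.0 ** exp


a = (0, -7, 192)   # 1.5
b = (0, -6, 160)   # 2.5
print(fp_value(fp_mul(a, b)))   # 3.75
```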
Dividers
Iterative Division (Newton-Raphson)
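A sketch of Newton-Raphson division in Python, assuming the usual formulation: iterate x_{k+1} = x_k (2 - b x_k) toward 1/b, then multiply by a. The divisor is first scaled into [0.5, 1), and the linear starting guess 48/17 - (32/17) m is a common choice that is assumed here. Only positive divisors are handled.

```python
import math


def newton_divide(a: float, b: float, iterations: int = 4) -> float:
    """Approximate a / b for b > 0 using the Newton-Raphson reciprocal."""
    m, e = math.frexp(b)              # b = m * 2**e with 0.5 <= m < 1
    x = 48 / 17 - 32 / 17 * m         # initial estimate of 1/m
    for _ in range(iterations):
        x = x * (2.0 - m * x)         # the error 1 - m*x squares each step
    return math.ldexp(a * x, -e)      # undo the scaling: a / b = (a / m) * 2**-e


print(newton_divide(1.0, 3.0))
```

Each iteration costs two multiplications and one subtraction, and roughly doubles the number of correct bits.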
Quadratic Convergence of the Newton Method
Properties of the Newton Method
Iterative Division (Goldschmidt)
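A sketch of Goldschmidt division in Python, assuming the usual formulation: numerator and denominator are multiplied by the same factor F_k = 2 - D_k until the denominator converges to 1, at which point the numerator is the quotient. The divisor is first scaled into [0.5, 1); only positive divisors are handled.

```python
import math


def goldschmidt_divide(a: float, b: float, iterations: int = 6) -> float:
    """Approximate a / b for b > 0 by driving the denominator to 1."""
    m, e = math.frexp(b)          # b = m * 2**e with 0.5 <= m < 1
    n, d = a, m
    for _ in range(iterations):
        f = 2.0 - d               # correction factor
        n *= f                    # these two multiplications are independent,
        d *= f                    # so hardware can run them in parallel
    return math.ldexp(n, -e)      # undo the scaling of the divisor


print(goldschmidt_divide(1.0, 3.0))
```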
Modified Goldschmidt Algorithm
(correction of round-off errors)
The Link between the Newton-Raphson and Goldschmidt Methods
Comparison between Newton and Goldschmidt methods
The Newton and Goldschmidt methods are essentially equivalent.
Both methods exhibit asymptotically quadratic convergence.
Both methods are able to correct round-off errors.
The Goldschmidt method directly computes the a/b ratio.
Shifters
The Binary Shifter
[Figure: one-bit shifter, bit-slice i, with control signals Right, nop, and Left; inputs Ai and Ai-1; outputs Bi and Bi-1]
The Barrel Shifter
[Figure: barrel shifter with inputs A3..A0, outputs B3..B0, shift controls Sh0..Sh3, and output buffers]
$\mathrm{Width}_{barrel} \sim 2\, p_m M$
Logarithmic Shifter
[Figure: logarithmic shifter with three stages controlled by Sh1, Sh2, Sh4 and their complements; inputs A3..A0; outputs B3..B0]
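A behavioral Python sketch of a logarithmic shifter: the shift amount is decomposed into powers of two, and each stage either shifts by its fixed amount (1, 2, or 4) or passes the data through. A logical right shift and an 8-bit data width are assumptions of this sketch.

```python
def log_shift_right(value: int, shift: int, width: int = 8) -> int:
    """Shift `value` right by 0..7 positions using stages of 1, 2, and 4."""
    value &= (1 << width) - 1
    for stage_amount in (1, 2, 4):
        if shift & stage_amount:     # stage control: Sh1, Sh2, Sh4
            value >>= stage_amount   # shift stage
        # otherwise the complement control passes the data unchanged
    return value


print(bin(log_shift_right(0b10110000, 5)))   # 0b101
```

Covering shifts of 0 to 2^k - 1 takes only k stages, compared with one control line per shift amount in a barrel shifter.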
0-7 bit Logarithmic Shifter
[Figure: 0-7 bit logarithmic shifter with inputs A3..A0 and outputs Out3..Out0]
ALUs
Two-bit MUX
Two-bit MUX Truth Table
Two-bit Selector Truth Table
Carry-chain Truth Table
ALU block diagram (Mead-Conway)
ALU Operations
MIPS-X Instruction Format
Pipeline dependencies in MIPS-X
Die Photo of MIPS-X
MIPS-X Architecture
MIPS-X Instruction Cache-miss timing
MIPS-X Tag Memory
MIPS-X Valid Store Array
RAM Sense Amplifier
CMOS Dual-port Register Cell
Self-timed bit-line write circuit
Register bypass logic
Schematic of comparator circuit
Squash FSM
Cache-miss FSM