[Figure: a combinational logic circuit (In to Out) compared with a sequential logic circuit (In to Out, with State fed back)]
Combinational Sequential
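nMOS switches (closed when the control input is high):
Series: Y = X if A AND B
Parallel: Y = X if A OR B
pMOS switches (closed when the control input is low):
Series: Y = X if $\overline{A}$ AND $\overline{B}$ $= \overline{A + B}$
Parallel: Y = X if $\overline{A}$ OR $\overline{B}$ $= \overline{AB}$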
[Figure: pull-up network (PUN) connected to VDD, with transistor source (S) and drain (D) terminals labeled]
Complementary CMOS Logic Style
Example Gate: NAND
Example Gate: NOR
Complex CMOS Gate
[Figure: complex CMOS gate with inputs A, B, C, D]
$OUT = \overline{D + A \cdot (B + C)}$
Constructing a Complex Gate
[Figure: step-by-step construction of a complex gate with output F, inputs A, B, C, D, and subnetworks SN1 to SN4]
[Figure: RC ladder with node capacitances C1, C2, C3, ..., CN]
$t_{pd} \approx \sum_{\text{nodes } i} R_{i\text{-to-source}} \, C_i$
Example: 2-input NAND
• Estimate rising and falling propagation delays
of a 2-input NAND driving h identical gates.
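[Figure: 2-input NAND with all transistors of width 2; output Y carries 6C of diffusion capacitance plus 4hC from the h copies, and internal node x carries 2C. Rising equivalent circuit: R driving (6+4h)C on Y]
$t_{pdr} = (6 + 4h)RC$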
Estimate the worst case falling propagation delays of
a 2-input NAND driving h identical gates
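Suppose A = 1 and B = 0 before the transition, so the output is high and the internal node x is charged through the upper nMOS transistor. When B rises, both the 2C on node x and the (6+4h)C on Y must discharge to ground.
[Figure: falling equivalent circuit, R/2 to node x (2C) followed by R/2 to Y ((6+4h)C)]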
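[Figure: 2-input NAND driving h copies; falling equivalent circuit with R/2 to node x (2C) and R/2 to Y ((6+4h)C)]
$t_{pdf} = (2C)\left(\frac{R}{2}\right) + \left[(6+4h)C\right]\left(\frac{R}{2} + \frac{R}{2}\right) = (7 + 4h)RC$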
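The two NAND estimates can be reproduced with a small Elmore-delay calculation. The sketch below is illustrative only: the function names and the list-of-(R, C) representation are assumptions, and delays are reported in units of RC (R = C = 1).

```python
def elmore_delay(stages):
    """Elmore delay of an RC ladder.

    stages: list of (resistance, capacitance) pairs ordered from the
    supply rail toward the output. Each capacitance is multiplied by the
    total resistance between it and the rail.
    """
    total = 0.0
    r_path = 0.0
    for r, c in stages:
        r_path += r
        total += r_path * c
    return total


def nand2_delays(h, R=1.0, C=1.0):
    """Rising and falling propagation delay of a 2-input NAND driving h copies."""
    load = (6 + 4 * h) * C
    # Rising: a single pMOS of resistance R charges the output node.
    t_pdr = elmore_delay([(R, load)])
    # Falling: two series nMOS of R/2 each; node x (2C) sits between them.
    t_pdf = elmore_delay([(R / 2, 2 * C), (R / 2, load)])
    return t_pdr, t_pdf


if __name__ == "__main__":
    for h in (1, 2, 4):
        print(h, nand2_delays(h))
```

For any h the results match (6 + 4h)RC for the rising case and (7 + 4h)RC for the falling case.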
Delay Components
• Delay has two parts
– Parasitic delay, gate driving its own internal
diffusion capacitance
• 6 or 7 RC
• Independent of load
– Effort delay: depends on the ratio of external load capacitance to input capacitance
– Effort delay changes with transistor width
• Proportional to load capacitance
• Logical effort and Electrical effort
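As a rough illustration of the split, the rising NAND2 delay of (6 + 4h)RC separates into a load-independent parasitic part and a load-dependent effort part. The sketch below assumes a normalizing unit delay of 3RC (a common convention for a unit inverter, not stated in these slides); the function names are hypothetical.

```python
from fractions import Fraction

TAU_IN_RC = 3  # assumption: one unit delay tau equals 3RC


def nand2_delay_parts(h, parasitic_rc=6):
    """Return (parasitic, effort, total) delay in units of RC.

    parasitic_rc is 6 for the rising case and 7 for the falling case.
    """
    parasitic = Fraction(parasitic_rc)   # independent of load
    effort = Fraction(4 * h)             # proportional to the fanout h
    return parasitic, effort, parasitic + effort


def normalized(h, parasitic_rc=6):
    """Same three components expressed in units of tau."""
    return tuple(part / TAU_IN_RC for part in nand2_delay_parts(h, parasitic_rc))


if __name__ == "__main__":
    print(nand2_delay_parts(3))  # RC units
    print(normalized(3))         # tau units
```

Under that assumption the effort term per unit of fanout is 4/3 and the parasitic term is 2, so the total can be read as (4/3)h + 2 in units of tau.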
Contamination Delay
• Best-case (contamination) delay can be
substantially less than propagation delay.
• Ex: If both inputs fall simultaneously, the
output should be pulled up in half the time
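[Figure: 2-input NAND driving h copies; both pMOS transistors conduct in parallel, giving an equivalent resistance of R/2 into (6+4h)C]
$t_{cdr} = \left(\frac{R}{2}\right)(6 + 4h)C = (3 + 2h)RC$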
CIRCUIT FAMILIES
• Static CMOS
• Ratioed Circuits
• Cascode Voltage Switch Logic (CVSL)
• Dynamic Circuits
• Pass-transistor Circuits
Static CMOS Circuits
In Static CMOS circuits with n inputs, 2n transistors are
needed.
nMOS block is a dual of the pMOS block.
Whatever is in series in the nMOS block appears in parallel in the
pMOS block, and vice versa.
Ignoring leakage, CMOS gates consume power only during input
transitions.
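The series/parallel duality can be checked mechanically. The sketch below is an illustration, not a design tool: it represents a network as nested tuples (a made-up representation), builds the pMOS network as the dual of the nMOS network, and verifies that exactly one of the two conducts for every input combination. The example pull-down network conducts when D + A·(B + C) is true.

```python
from itertools import product


def conducts(net, inputs, on_level):
    """True if the switch network conducts. A leaf is an input name; the
    switch is closed when that input equals on_level (1 for nMOS, 0 for pMOS)."""
    if isinstance(net, str):
        return inputs[net] == on_level
    kind, *children = net
    states = [conducts(child, inputs, on_level) for child in children]
    return all(states) if kind == "series" else any(states)


def dual(net):
    """Swap series and parallel connections throughout the network."""
    if isinstance(net, str):
        return net
    kind, *children = net
    swapped = "parallel" if kind == "series" else "series"
    return (swapped, *[dual(child) for child in children])


def count_transistors(net):
    """One transistor per leaf."""
    if isinstance(net, str):
        return 1
    return sum(count_transistors(child) for child in net[1:])


def is_complementary(pdn, pun, names):
    """Exactly one of pull-down and pull-up must conduct for every input."""
    for values in product([0, 1], repeat=len(names)):
        inputs = dict(zip(names, values))
        if conducts(pdn, inputs, 1) == conducts(pun, inputs, 0):
            return False
    return True


pdn = ("parallel", "D", ("series", "A", ("parallel", "B", "C")))
pun = dual(pdn)

if __name__ == "__main__":
    print(pun)
    print(is_complementary(pdn, pun, "ABCD"))
    print(count_transistors(pdn) + count_transistors(pun))
```

With four inputs the two networks together use eight transistors, consistent with the 2n count.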
Static complementary gate structure
[Figure: inputs drive a pull-up network and a pull-down network (connected to VSS); the two networks meet at the output]
Pull-up/pull-down network design
[Figure: pull-up/pull-down networks for the invert, or, and and functions]
AOI CMOS Gate
• An Or-And-Invert (OAI) CMOS gate is similar to the AOI gate except that it is an
implementation of product-of-sums realization of a function
• The N-tree is implemented as follows:
– Each product term is a set of parallel transistors for each input in the term
– All product terms (parallel groups) are put in series
– The complete function is again assumed to be an inverted representation
• The P-tree can be implemented as the dual of the N-tree
• Note: AO and OA gates (non-inverted function representation) can be
implemented directly on the P-tree if inverted inputs are available
Properties of CMOS Gates
Pseudo-nMOS Circuits
Ganged CMOS
Source-Follower Pull-up Logic (SFPL)
Pseudo-nMOS Circuits
• Adding a single pFET to otherwise nFET-only
circuit produces a logic family that is called
pseudo-nMOS
– Fewer transistors than CMOS
– For N inputs, only (N+1) FETs are required
– Pull-up device: the pFET is biased active since
the grounded gate gives VSGp = VDD
– Pull-down device: the nFET logic array acts as a
large switch between the output f and
ground
– However, since the pFET is always biased
on, VOL can never reach the ideal value
of 0 V
• A simple inverter using pseudo-nMOS is shown in
Figure 2
Figure 1 General structure of a pseudo-nMOS logic gate
Figure 2 Pseudo-nMOS inverter
Ganged CMOS (Symmetric Circuits)
A B C
R. W. Knepper
SC571, page 5-25
Cascode Voltage Switch Logic (CVSL)
[Figure: CVSL gates with complementary outputs Q and its complement, driven by inputs a, b, c and their complements]
Dynamic CMOS Logic
TRANSMISSION GATES
• This is the reason that N-Channel transistors are used in the pull-down
network and P-Channel in the pull-up network of a CMOS gate.
Otherwise the noise margin would be significantly reduced.
Transmission Gates
• Pass transistors produce degraded outputs
• Transmission gates pass both 0 and 1 well
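g = 0, gb = 1: switch off, a and b are disconnected
g = 1, gb = 0: switch on, a is connected to b
Input    Output (g = 1, gb = 0)
0        strong 0
1        strong 1
[Figure: transmission gate schematic with control inputs g and gb and terminals a and b, plus three equivalent symbols]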
Transmission Gates
• Implementing XOR gates
– With NAND gates and inverters:
• If A is high, B is passed
through the gate to the
output
• If A is low, -B is passed
through the gate to the
output
Pass Transistor
• At right,
– (a) is a 2-input NAND
pass transistor circuit
– (b) is a 2-input NOR pass
transistor circuit
[Figure: pass-transistor network with pass variables as inputs and control variables on the gates, producing output F]
Basic logic functions in CPL
[Figure: basic CPL gates built from inputs A, B, C and their complements]
CPL Logic
[Figure: CPL circuits with internal nodes n1 to n4 and outputs Q, Qb: (a) XOR gate, (b) sum circuit producing S; full-adder circuits on inputs A, B, C with outputs S and Cout]
Double Pass-Transistor Logic (DPL):
[Figure: DPL AND/NAND gate and DPL XOR/XNOR gate on inputs A and B, each with complementary outputs O]
Double Pass-Transistor Logic (DPL):
[Figure: DPL circuits using transistor pairs n1, p1, n2, p2 with outputs Q, Qb: (a) XOR, (b) one-bit full-adder sum circuit producing S]
Double Pass-Transistor Logic (DPL):
AND/NAND
DPL Full Adder
[Figure: DPL full adder on inputs A, B, C with output S, built from a multiplexer stage followed by a buffer stage, supplied from Vcc]
The critical path traverses two transistors only
(not counting the buffer)
OR/NOR
Dynamic CMOS
• In static circuits at every point in time (except when
switching) the output is connected to either GND or
VDD via a low resistance path.
– fan-in of n requires 2n (n N-type + n P-type)
devices
Y
The Foot
• What if pulldown network is ON during
precharge?
• Use series evaluation transistor to prevent
fight.
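[Figure: dynamic gates with a precharge transistor on output Y and pull-down function f on the inputs; a footed gate adds a series foot transistor, an unfooted gate omits it]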
Logical Effort
            Inverter              NAND2                 NOR2
unfooted    gd = 1/3, pd = 2/3    gd = 2/3, pd = 3/3    gd = 1/3, pd = 3/3
footed      gd = 2/3, pd = 3/3    gd = 3/3, pd = 4/3    gd = 2/3, pd = 5/3
[Figure: transistor-level schematics of the unfooted and footed dynamic inverter, NAND2, and NOR2 with transistor widths annotated]
Issues in Dynamic Design 1: Charge Leakage
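[Figure: dynamic gate with precharge transistor Mp, evaluate transistor Me, input A, and load CL; waveforms of CLK and VOut across the Precharge and Evaluate phases, with the leakage sources marked]
[Figure: the same gate with an added transistor Mkp at the output node, and a two-input version (A, B) with internal capacitance CB]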
Charge Sharing Example
[Figure: clocked dynamic gate with inputs A, B, C and their complements; CL = 50 fF, Ca = 15 fF, Cb = 15 fF, Cc = 15 fF, Cd = 10 fF]
Charge Sharing
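[Figure: dynamic gate with precharge transistor Mp, series nMOS transistors Ma (input A) and Mb (input B = 0), evaluate transistor Me, internal node X with capacitances Ca and Cb, and output load CL]
case 1) if $|\Delta V_{out}| < V_{Tn}$:
$C_L V_{DD} = C_L V_{out}(t) + C_a \left(V_{DD} - V_{Tn}(V_X)\right)$
or
$\Delta V_{out} = V_{out}(t) - V_{DD} = -\frac{C_a}{C_L}\left(V_{DD} - V_{Tn}(V_X)\right)$
case 2) if $|\Delta V_{out}| > V_{Tn}$:
$\Delta V_{out} = -V_{DD}\,\frac{C_a}{C_a + C_L}$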
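The two charge-sharing cases can be evaluated numerically. The sketch below is a simplification under stated assumptions: the threshold voltage is treated as a constant (the body effect on V_Tn(V_X) is ignored), and the supply and threshold values in the example are hypothetical. Only CL = 50 fF and Ca = 15 fF come from the charge-sharing example.

```python
def charge_sharing_delta(c_l, c_a, vdd, vtn):
    """Return the output voltage change caused by charge sharing.

    Case 1 (small drop, |dV| < vtn): node X only charges to vdd - vtn.
    Case 2 (large drop, |dV| > vtn): output and node X equalize.
    Assumes a constant threshold voltage vtn (no body effect).
    """
    # Both case conditions reduce to the same boundary:
    # c_a * (vdd - vtn) compared with vtn * c_l.
    if c_a * (vdd - vtn) < vtn * c_l:
        return -(c_a / c_l) * (vdd - vtn)      # case 1
    return -vdd * c_a / (c_a + c_l)            # case 2


if __name__ == "__main__":
    # Hypothetical supply and threshold: VDD = 2.5 V, VTn = 0.5 V.
    print(charge_sharing_delta(50e-15, 15e-15, 2.5, 0.5))
    print(charge_sharing_delta(50e-15, 5e-15, 2.5, 0.5))
```

With these values the 15 fF case falls under case 2, while a smaller internal capacitance of 5 fF falls under case 1.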
Solution to Charge Redistribution
[Figure: dynamic gate (Clk, Mp, Me; A = 0, B = 0; Out1 = 1, load CL1) driving a second gate with input In (Out2 = 0, load CL2). Plot: voltage versus time (ns) for Clk, In, Out1, and Out2]
Issues in Dynamic Design 4: Clock Feedthrough
[Figure: dynamic gate with inputs In3 and In4; plot of voltage versus time (ns) for In & Clk and Out, showing clock feedthrough on the output]
Other Effects
• Capacitive coupling
• Substrate coupling
• Minority charge injection
• Supply noise (ground bounce)
Cascading Dynamic Gates
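Inputs to a dynamic gate must rise monotonically during evaluation:
– 0 -> 1 is allowed
– 1 -> 1 during evaluation is allowed
– But not 1 -> 0, which violates monotonicity
[Figure: two cascaded dynamic gates (input A, clocked transistors Me) with voltage waveforms, including Out2, across the Precharge, Evaluate, Precharge phases]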
Domino Gates
• Follow dynamic stage with inverting static gate
– Dynamic / static pair is called domino gate
– Produces monotonic outputs
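[Figure: domino AND gate, a dynamic NAND followed by a static inverter, with nodes W, X, Y, Z and inputs A, B, C; waveforms across the Precharge, Evaluate, Precharge phases; equivalent symbol using HI-skew (H) static gates]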
Domino Optimizations
• Each domino gate triggers next one, like a string of dominos
toppling over
• Gates evaluate sequentially but precharge in parallel
• Thus evaluation is more critical than precharge
• HI-skewed static stages can perform logic
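[Figure: domino 8-input multiplexer with selects S0 to S7 and data inputs D0 to D7; two dynamic stages combined by a HI-skew (H) static gate to produce Y]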
Dual-Rail Domino
• Domino only performs noninverting functions:
– AND, OR but not NAND, NOR, or XOR
• Dual-rail domino solves this problem
– Takes true and complementary inputs
– Produces true and complementary outputs
sig_h  sig_l  Meaning
0      0      Precharged
0      1      ‘0’
1      0      ‘1’
1      1      invalid
[Figure: dual-rail domino gate with outputs Y_l and Y_h, pull-down networks f and its complement, and shared inputs]
Example: AND/NAND
• Given A_h, A_l, B_h, B_l
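• Compute Y_h = $AB$, Y_l = $\overline{AB}$
• Pulldown networks are conduction
complements
[Figure: dual-rail AND/NAND gate; Y_h = A*B from A_h in series with B_h, Y_l = $\overline{A*B}$ from A_l in parallel with B_l]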
Example: XOR/XNOR
• Sometimes possible to share transistors
[Figure: dual-rail XOR/XNOR gate with shared transistors; Y_h = A xor B, Y_l = A xnor B, using A_h, A_l, B_h, B_l]
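The dual-rail encoding and the two example gates can be modeled at the logic level. This is a behavioral sketch only (no timing or precharge clocking); the function names and the (sig_h, sig_l) tuple convention are assumptions made for illustration.

```python
def encode(bit):
    """Encode a logic value as a (sig_h, sig_l) pair."""
    return (bit, 1 - bit)


def decode(sig_h, sig_l):
    """Interpret a dual-rail pair according to the encoding table."""
    table = {(0, 0): "precharged", (0, 1): 0, (1, 0): 1, (1, 1): "invalid"}
    return table[(sig_h, sig_l)]


def and_nand(a, b):
    """Dual-rail AND/NAND: Y_h = A_h AND B_h, Y_l = A_l OR B_l."""
    y_h = a[0] & b[0]
    y_l = a[1] | b[1]
    return (y_h, y_l)


def xor_xnor(a, b):
    """Dual-rail XOR/XNOR built from true and complementary rails."""
    y_h = (a[0] & b[1]) | (a[1] & b[0])
    y_l = (a[0] & b[0]) | (a[1] & b[1])
    return (y_h, y_l)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b,
                  decode(*and_nand(encode(a), encode(b))),
                  decode(*xor_xnor(encode(a), encode(b))))
    # Precharged inputs leave both outputs precharged.
    print(decode(*and_nand((0, 0), (0, 0))))
```

Because each rail only ever rises during evaluation, both gates remain monotonic while still providing the inverted function on the complementary rail.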
np-CMOS
NORA Logic
NP Domino
Zipper CMOS
• The NP-Domino or NORA logic is very
susceptible to noise and leakage.
• Zipper Domino has the same structure, but
the precharge transistors are left slightly ON
during evaluation.
Leakage
• Dynamic node floats high during evaluation
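– Transistors are leaky ($I_{OFF} \neq 0$)
– Dynamic value will leak away over time
– Formerly milliseconds, now nanoseconds
• Use keeper to hold dynamic node
– Must be weak enough not to fight evaluation
[Figure: dynamic gate with input A, dynamic node X, HI-skew (H) output inverter driving Y, and a weak keeper of width k on node X]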
Charge Sharing
• Dynamic gates suffer from charge sharing
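[Figure: dynamic gate with inputs A and B = 0, internal node x (capacitance Cx), and output Y (capacitance CY); waveform showing charge-sharing noise on Y when A rises]
$V_x = V_Y = \frac{C_Y}{C_x + C_Y} V_{DD}$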
Secondary Precharge
• Solution: add secondary precharge transistors
– Typically need to precharge every other node
• Big load capacitance CY helps as well
[Figure: dynamic gate with inputs A and B and a secondary precharge transistor on internal node x]
Noise Sensitivity
• Dynamic gates are very sensitive to noise
– Inputs: $V_{IH} \approx V_{tn}$
– Outputs: floating output is susceptible to noise
• Noise sources
– Capacitive crosstalk
– Charge sharing
– Power supply noise
– Feedthrough noise
– And more!
Power
• Domino gates have high activity factors
– Output evaluates and precharges
• If output probability = 0.5, α = 0.5
– Output rises and falls on half the cycles
– Clocked transistors have α = 1
– For a 2-input NAND, αCMOS = 3/16, αDynamic = 1/4
• Leads to very high power consumption
• However, glitching does not occur in dynamic
logic.
• The load capacitances are lower.
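The static and dynamic activity factors of a NAND gate can be worked out from input probabilities. The sketch below assumes independent inputs that are each 1 with probability 1/2, which is an assumption for illustration and not stated in the slides; the function name is hypothetical.

```python
from fractions import Fraction


def nand_activity(n_inputs):
    """Return (static_alpha, dynamic_alpha) for an n-input NAND.

    Assumes independent inputs, each 1 with probability 1/2.
    """
    # The NAND output is 0 only when every input is 1.
    p_zero = Fraction(1, 2 ** n_inputs)
    p_one = 1 - p_zero
    # Static CMOS: a 0 -> 1 transition needs output 0 then output 1.
    static_alpha = p_zero * p_one
    # Dynamic: the output is precharged high every cycle, so it rises
    # again whenever the previous evaluation discharged it.
    dynamic_alpha = p_zero
    return static_alpha, dynamic_alpha


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, nand_activity(n))
```

For two inputs this gives 3/16 for static CMOS and 1/4 for the dynamic gate.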
MODL
• It is often necessary to compute multiple
functions where one is a subfunction of the
other or shares a subfunction.
• One very typical example is the carry in
addition:
$c_1 = g_1 + p_1 c_0$
$c_2 = g_2 + p_2 (g_1 + p_1 c_0)$
$c_3 = g_3 + p_3 (g_2 + p_2 (g_1 + p_1 c_0))$
$c_4 = g_4 + p_4 (g_3 + p_3 (g_2 + p_2 (g_1 + p_1 c_0)))$
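The nesting of the carry equations, where each carry reuses the previous one as a subfunction, can be sketched as a simple recurrence. The slides do not define g and p, so the sketch assumes the usual generate (a AND b) and propagate (a XOR b) signals; the helper names are hypothetical.

```python
def to_bits(value, width):
    """Little-endian bit list: index 0 is the least significant bit."""
    return [(value >> i) & 1 for i in range(width)]


def carries(a_bits, b_bits, c0):
    """Return [c1, c2, ..., cn] using c_i = g_i + p_i * c_(i-1).

    Assumes g_i = a_i AND b_i (generate) and p_i = a_i XOR b_i (propagate).
    """
    result = []
    carry = c0
    for a, b in zip(a_bits, b_bits):
        g = a & b
        p = a ^ b
        carry = g | (p & carry)   # each carry is a subfunction of the next
        result.append(carry)
    return result


if __name__ == "__main__":
    # 11 + 6 = 17, which overflows four bits, so c4 is set.
    print(carries(to_bits(11, 4), to_bits(6, 4), 0))
```

In a MODL gate the intermediate carries c1 to c3 are tapped from internal nodes of the c4 pull-down network instead of being recomputed in separate gates.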
MODL Carry Chains
MODL
• Beware of sneak paths.
• Certain inputs must be mutually exclusive.
Domino Summary
• Domino logic is attractive for high-speed circuits
– 1.3 – 2x faster than static CMOS
– But many challenges:
• Monotonicity, leakage, charge sharing, noise
• Widely used in high-performance
microprocessors in 1990s when speed was king
• Largely displaced by static CMOS now that power
is the limiter
• Still used in memories for area efficiency
POWER DISSIPATION
Power is drawn from a voltage source attached to
the VDD pin(s) of a chip.
Average Power: $P_{avg} = \frac{E}{T} = \frac{1}{T}\int_0^T i_{DD}(t)\,V_{DD}\,dt$
Overview of Power Dissipation
Ptotal = Pdynamic+Pstatic
Power Consumption (Pdynamic)
Dynamic power consumption: Pdynamic = Pswitching + Pshortcircuit
Switching load capacitances
– Charging and discharging capacitors
Short Circuit Power Consumption (Pshort-circuit)
– Short circuit path between supply rails during switching
Power Dissipation Sources
Subthreshold leakage
Gate leakage
Junction leakage
Contention current
Dynamic Power
Dynamic power is required to charge and discharge load
capacitances when transistors switch.
One cycle involves a rising and falling output.
On rising output, charge Q = CVDD is required
On falling output, charge is dumped to GND
This repeats $T f_{sw}$ times over an interval of T
[Figure: inverter with supply Vdd, input Vin, output Vout, and load CL switching at frequency fsw]
Dynamic Power
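$P_{dynamic} = \frac{1}{T}\int_0^T i_{DD}(t)\,V_{DD}\,dt$
$= \frac{V_{DD}}{T}\int_0^T i_{DD}(t)\,dt$
$= \frac{V_{DD}}{T}\left[T f_{sw} C V_{DD}\right]$
$= C V_{DD}^2 f_{sw}$
[Figure: supply VDD delivering current iDD(t) to a load capacitance C switching at frequency fsw]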
Dynamic Power
Suppose the system clock frequency = f
Let fsw = αf, where α = activity factor
If the signal is a clock, α = 1
If the signal switches once per cycle, α = ½
Dynamic gates:
Switch either 0 or 2 times per cycle, α = ½
Static gates:
Depends on design, but typically α = 0.1
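Combining the activity factor with the switching-power result gives P = α C VDD² f. The sketch below simply evaluates that expression; the capacitance, voltage, and frequency values are hypothetical and chosen only to show the scaling.

```python
def dynamic_power(alpha, capacitance, vdd, frequency):
    """Switching power in watts: alpha * C * VDD^2 * f."""
    return alpha * capacitance * vdd ** 2 * frequency


if __name__ == "__main__":
    # Hypothetical block: 1 nF of switched capacitance, 1.0 V supply, 1 GHz clock.
    static_logic = dynamic_power(0.1, 1e-9, 1.0, 1e9)   # typical static gates
    clock_net = dynamic_power(1.0, 1e-9, 1.0, 1e9)      # clock signal
    half_vdd = dynamic_power(0.1, 1e-9, 0.5, 1e9)       # same logic at half VDD
    print(static_logic, clock_net, half_vdd)
```

The quadratic dependence on VDD is visible in the last case: halving the supply cuts the switching power to a quarter.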
[Figure: CMOS inverter (Vin, Vout, CL) with a large capacitive load and with a small capacitive load, where ISC ≈ IMAX. Plot versus r for W/L|P = 7.2 µm/1.2 µm and W/L|N = 2.4 µm/1.2 µm, with curves for VDD = 5 V and VDD = 3.3 V]
The power dissipation due to short circuit
currents is minimized by matching the rise/fall
times of the input and output signals.
Dynamic Power Reduction
$P_{switching} = \alpha C V_{DD}^2 f$
Try to minimize:
– Activity factor
– Capacitance
– Supply voltage
– Frequency
Voltage Scaling
– Gate capacitance
– Fewer stages of logic
– Small gate sizes
Wire capacitance
– Good floorplanning to keep communicating blocks close
to each other
– Drive long wires with inverters or buffers rather than
complex gates
Clock Gating
The best way to reduce the activity is to turn off the clock
to registers in unused blocks
– Saves clock activity (a = 1)
– Eliminates all switching activity in the block
– Requires determining if block will be used
Voltage / Frequency