Anda di halaman 1dari 41

EE 587

SoC Design & Test


Partha Pande
School of EECS
Washington State University
pande@eecs.wsu.edu

SoC Physical Design Issues


Power and Clock Distribution

Overview

Reading

HJSChapter11PowerGridandClockDesign
Forbackgroundinformationrefertochapter5ofthetextbook

Purpose of Power Distribution


Goal of power distribution system is to deliver the required
current across the chip while maintaining the voltage levels
necessary for proper operation of logic circuits
Must route both power and ground to all gates
Design Challenges:
How many power and ground pins should we allocate?
Which layers of metal should be used to route
power/ground?
How wide should be make the wire to minimize voltage
drops and reliability problems
How do we maintain VDD and Gnd within noise budget?
How do we verify overall power distribution system?
4

Design Example

Target Impedance of the power grid

VDD ( FractionalNoiseBudget )
I DD

For a supply voltage of 1.2 V and a supply current of 100 A,

with a 10 % noise budget the power grid impedance can be 1.2


m

Power Distribution Issues - IR Drop


< Vdd

Vdd

n4

n3

< Vdd

n2
n1

Narrow line widths increase


metal line resistance
As current flows through power
grid, voltage drops occur
Actual voltage supplied to
transistors is less than Vdd
Impacts speed and functionality
Need to choose wire widths to
handle current demands of
each segment

n8
n5

n6

n7

Block Interaction yields IR Drop

Plots courtesy of Simplex Solutions, Inc.


7

Power Grid Issues Static IR Drop

Block placement and


global power routing
determines IR drop
on the chip
Possible solutions
Rearrange blocks
More Vdd pins
Connect bottom
portion of grid to
top portion

Plot courtesy of Simplex Solutions, Inc.


8

Power Grid Issues Static IR Drop

If we connect bottom
portion of grid to top
portion, the IR drop is
reduced significantly
However, this is only
one part of the
problem
We must also
examine
electromigration

Plot courtesy of Simplex Solutions, Inc.


9

Power Grid Issues - Electromigration

n4

n3

n2
n1

As current flows down


narrow wires, metal begins
to migrate
Metal lines break over time
due to metal fatigue
Based on average/peak
current density
Need to widen wires
enough to avoid this
phenomenon

n8
n5

n6

n7
10

Case Study IR and EM Tradeoff

11

Ldi/dt Effects in the Power Supply

In addition to IR drop, power system inductance is also an issue


Inductance may be due to power pin, power bump or power grid
Overall voltage drop is:
Vdrop = IR + Ldi/dt

Distribute decoupling capacitors (decaps) liberally throughout design


Capacitors store up charge
Can provide instantaneous source of current for switching

12

On-chip Decoupling Caps

On-chip decaps help to stabilize the power grid voltage


First line of defense against noise which can extend beyond 10GHz
Simple Example:

Drop across inductors = 2 x L x di/dt = 2 x 0.2nH x 20mA/100ps =


80mV (problematic if supply is 1.2V)
Actual power pad or bump may need to support thousands of inverters
Use capacitors to supply instantaneous charge to inverters

13

Making a Decoupling Cap

Decaps are basically NMOS transistors. Top plate is polysilicon,


bottom-plate is inverted channel, insulator is gate oxide.
Connect poly to Vdd and source/drain to Vss

Low-frequency capacitance is roughly COX W L.


Since these are large capacitance to be used at high frequencies, more
accurate representation is needed

14

How much Decoupling Cap?

To estimate required decap value, run SPICE on patch of chip area with
power grid, part of logic block, and sprinkle of decaps

Amount of decap depends on:


Acceptable ripple on Vdd-Vss (typically 10% noise budget)
Switching activity of logic circuits (usually need 10X switched cap)
Current provided by power grid (di/dt)
Required frequency response (high frequency operation)
How much decap exists ( non-switching diffusion, gate, wire caps)

15

Simulation with Decoupling Caps

Plot center region of grid


Local Vdd-Vss in worst-case
location in patch (center)
First dip can be dealt with by a
low-frequency on-chip voltage
regulator and low-frequency
decaps
Steady-state ripple is
controlled by high-frequency
decoupling caps
Adjust location of decaps until
the ripple is within noise
budget
16

Designing Power Distribution


Floorplanner should be aware of IR+Ldi/dt drop and EM
problems and design accordingly
Requires knowledge of current distributions and voltage
drop constraints of blocks being placed
Provide adequate number of VDD and Gnd pins
Route power distribution system according to current demands
of the blocks
Widen wires based on expected current density in branches
Distribute decoupling capacitors liberally throughout design
Verify full chip with IR/EM tools

17

Reducing the Effects of IR drop and Ldi/dt


Stagger the firing of buffers (bad idea: increases skew)
Use different power grid tap points for clock buffers (but it
makes routing more complicated for automated tools)
Use smaller buffers (but it degrades edge rates/increases delay)
Make power busses wider (requires area but should do it)
Use more Vdd/Vss pins; adjust locations of Vdd/Vss pins
Put in power straps where needed to deliver current
Place decoupling capacitors wherever there is free space
Integrate decoupling capacitors into buffer cells
These caps act
as decoupling
caps when they
are not
switching

18

Power Routing Examples

Block A

Single Trunk

Block B

Block A

Block B

Multiple Trunks

19

Simple Routing Examples

Block A

Block B

Double-Ended Connections

Block A

Block B

Wider Trunks

20

Interleaved Power/Ground Routing

Interleaved Vdd/Vss
21

Flip-Flop and Clock Design


Flip-flops and latches are used to gate signals in sequential
logic designs.
The critical parameters of setup and hold times
Clock design is also a complex issue in DSM due to RC delay
components in the interconnect and the power dissipation.
We will look at clock trees, H-trees and clock grids.
An overall examination of the issues of clocks skew, IR drop and
signal integrity, and how to manage them using circuit
techniques.
Lets start by revisiting the expected limits of clock speed

22

Clocked D Flip-flop
Very useful FF
Widely used in IC design for temporary storage of data
May be edge-triggered (Flip-flop) or level-sensitive (transparent
latch)

data
CK

Clk Q

output

Qn+1

0
1

0
1

23

Latch vs. Flip-flop


Latch (level-sensitive, transparent)
When the clock is high it passes In value to Out
When the clock is low, it holds value that In had when the clock fell
Flip-Flop (edge-triggered, non transparent)
On the rising edge of clock (pos-edge trig), it transfers the value of In to Out
It holds the value at all other times.

In

Out

Clk

In
Out

CLK

Latch

In
Clk

Out

In
Out
CLK

Flip-Flop
24

Alternative View
Clocks serve to slow down signals that are too fast
Flip-flops / latches act as barriers

Latch

Flip-Flop

DQ

DQ
Clk

Ld
Clk
Soft Barrier

Clk 0->1

Clk=1

DQ

DQ

Ld

Ld

Clk
Hard Barrier

Clk

With a latch, a signal can propagate through while the clock is high
With a Flip-flop, the signal only propagates through on the rising edge
Flip-flops consist of two latch like elements (master and slave latch)
25

Clocking Overhead
FF and Latches have setup and hold times that must be satisfied:

Flip Flop

will work

wont work

Din
Clk

Latch

may work

Din

Thold

Qout

Tsetup

Clk

Thold

Qout

Tsetup + Tclk-q

Td-q

If Din arrives before setup time and is stable after the hold time, FF will work; if Din
arrives after hold time, it will fail; in between, it may or may not work; FF delays the
slowest signal by the setup + clk-q delay in the worst case
Latch has small setup and hold times; but it delays the late arriving signals by Td-q
26

Clock Skew
Not all clocks arrive at the same time, i.e., they may be skewed.
SKEW = mismatch in the delays between arrival times of clock edges at FFs
SKEW causes two problems:

Tclk-q
Fix critical path

Logic
Td

Tcycle = Td +Tsetup + Tclk-q + Tskew


Shows up as a SETUP time violation

Shows up as a HOLD time violation

Flop

when Tskew + Thold > Tclk-q

Early

Td=0

Flop

The part can get the wrong answer

Late

Flop

Flop

The cycle time gets longer by the skew

Tsetup

Insert buffer
Delay elements

Early

Late
27

Overhead for a Clock


CMOS FO4 delay is roughly 425ps/um x Leff
For 0.13um, FO4 delay 40 - 50ps
For a 1GHz clock, this allows < 20 FO4 gate delays/cycle
Clock overhead (including margins for setup/hold)
2 FF/Latches cost about 2-3 FO4 delays
skew costs approximately 2-3 FO4 delays
Overhead of clock is roughly 4-6 FO4 delays
14-16 FO4 delays left to work with for logic
Need to reduce skew and FF cost
Tcycle

Skew Tclk-q

Tlogic

CLOCK

28

Signal Integrity Issues at FFs


What happens if a glitch occurs in a clock signal?
Positive-Edge Triggered Flip-Flop

DQ
Clk
Clk

Flip-flop captures and propagates incorrect data


Could view any signal that, if glitched, could cause a logic upset
as a clock signal
Need to space out clocks/signals or shield them
29

Signal Integrity Issues at FFs


What happens if a glitch occurs in data signal?
Positive-Edge Triggered Flip-Flop

DQ
Clk
Clk

Flip-flop captures and propagates incorrect data


Need to insure that data signal is stable during FF setup time
Shielding with stable signals or spacing is needed
30

Sources of Clock Skew


Main sources:
1. Imbalance between different paths from clock source to FFs
interconnect length determines RC delays
capacitive coupling effects cause delay variations
buffer sizing
number of loads driven
2. Process variations across die
interconnect and devices have different statistical variations
Secondary Sources:
1. IR drop in power supply
2. Ldi/dt drop in supply
31

IR Drop Impacts on Clock Skew

Ideal Vdd
- Low delay
- Low skew
Delay (latency)

Skew

Conservative Vdd
- High delay
- Low skew

Actual IR drop impact


- delay about 5-15% larger
- skew about 25-30% larger
32

Power dissipation in Clocks


Significant power dissipation can occur in clocks in highperformance designs:

clock switches on every cycle so P= CV2f (i.e., =1)


clock capacitance can be ~nF range, say 1nF = 1000pF
assuming a power supply of 1.8V, CV = 1800pC of charge
if clock switches every 2ns (500MHz), thats 0.9A
for VDD = 1.8V, P=IV=0.9(1.8)=1.6W in the clock circuit alone

Much of the power (and the skew) occurs in the final drivers due
to the sizing up of buffers to drive the flip-flops
Key to reducing the power is to examine equation CV 2f and reduce
the terms wherever possible
VDD is usually given to us; would not want to reduce swing due
to coupling noise, etc.
Look more closely at C and f
33

Reducing Power in Clocking


Gated Clocks:
can gate clock signals through AND gate before applying to
flip-flop; this is more of a total chip power savings
all clock trees should have the same type of gating whether
they are used or not, and at the same level - total balance
Reduce overall capacitance (again, shielding vs. spacing)
shield

clock

shield

(a) higher total cap./less area

Signal 1

clock

Signal 2

(b) lower cap./ more area

Tradeoff between the two approaches due to coupling noise


approach (a) is better for inductive noise; (b) is better for
capacitive noise
34

Clock Design Objectives


Now that we understand the role of the clock and some of the
key issues, how do we design it?
Minimize the clock skew (in presence of IR drop)
Minimize the clock delay (latency)
Minimize the clock power (and area)
Maximize noise immunity (due to coupling effects)
Maximize the clock reliability (signal EM)
Problems that we will have to deal with
Routing the clock to all flip-flops on the chip
Driving unbalanced loading, which will not be known until
the chip is nearly completed
On-chip process/temperature variations
35

Clock Design

Tree

Minimal area cost


Requires clock-tree
management
Use a large superbuffer to
drive downstream buffers
Balancing may be an
issue

Multi-stage clock tree

Secondary clock drivers

Main clock
driver

36

Clock Configurations

H-Tree

Place clock root at


center of chip and
distribute as an H
structure to all areas of
the chip
Clock is delayed by an
equal amount to every
section of the chip
Local skew inside blocks
is kept within tolerable
limits
37

Clock Configurations

Grid

Greater area cost


Easier skew control
Increased power
consumption
Electromigration risk
increased at drivers
Severely restricts
floorplan and routing

38

Clock Design and Verification

Many design styles


Low-speed designs: regular signals, symmetric tree
Medium-speed designs: balanced H-tree
High-speed designs
Balanced buffered H-tree
Grid

Clock verification is more complex in DSM


RC Interconnect delays
Signal integrity (capacitive coupling, inductance)
IR drop
Signal Electromigration
Clock Jitter

Normal Distribution

Jitter

1
0.8
0.6
Prob

time-domain variation of a given clock signal due to random


noise, IR drop, temperature, etc.

Series1
0.4
0.2
0
-3

-2

-1

sigma
Clock
edge

39

Clock Design Today

Old methodology

Advanced Clock Verification

Route clock
Route rest of nets
Extract clock parasitics
Perform timing verification
Balance clock by snaking
route in reserved areas

IR Drop and Ldi/dt effects


Coupling capacitance
Electromigration checks
Full-chip skew/slew
analysis
Jitter analysis
Inductance Effects
Process variations
40

Good Practices in Clock Design


Try to achieve the lowest Latency (Super Buffer/H-tree)
Control transition times (keep edge rates sharp)
Use 1 type of clock buffer for good matching (except perhaps in
the last leg where you need to have adjustable buffers)
Have min/max line lengths for good matching
Determine whether spacing or shielding provides better tradeoff
Use integral decoupling in buffers to reduce IR and Ldi/dt

41