Feat Size (um) Vdd 0.1 Jan-85 Jan-88 Jan-91 Jan-94 Jan-97 Jan-00 Jan-03
Slide 3
Bad News
Voltage scaling has stopped as well
kT/q does not scale Vth scaling has power consequences
Energy
Performance
Slide 5
Energy
Performance
Slide 6
Energy
Performance
Slide 7
10
Stat.Carr.Chain
Stat.Carr.Sel
Dom. LF
Dom. KS
10
Stat. LF
Stat. 84421
10
10
Slide 8
Key Observation:
Define the Energy/Delay sensitivity of parameter
For example Vdd:
Sens (Vdd ) =
E D
Vdd Vdd
Vdd =Vdd *
[V =1.3 Vt =0.22]
dd N
10
[V =1.2 Vt =0.26]
dd N
10
3
[V =0.68, Vt =0.31]
dd N
[V =1 Vt =0.30]
dd N
10
1
10
Slide 10
Leakage Energy
Matching marginal costs for Vdd and Vth
1.2 1 0.8 Voltage 0.6 0.4 0.2 0 -2 10 Leakage Ratio 0.2 Vth 0.1 Vdd 0.5
0.3
10 Activity Factor
-1
Slide 11
Reducing waste
Stop using energy for stuff that does not produce results Stop waiting for stuff that you dont need (parallelism)
Problem reformulation
Reduce work (less energy and less delay)
Slide 12
But costs
Performance Drop in Vdd, Gnd
Range of Applicability
Power supply gating
Done to remove leakage power But slows down the circuit
Adds series resistance to the supply Makes circuit worse when energy sensitivity is high
Energy/op
Performance
Slide 15
Parallelism
If the application has data parallelism
Existing Processors
1
Watts/(Spec*L*Vdd )
2
0.1
0.01 1 10
Slide 17
Problem Reformulation
Best way to save energy is to do less work
Energy directly reduced by the reduction in work But required time for the function decreases as well
Convert this into extra power gains
User Performance
Slide 18
Cost of Variation
Variability changes position of the optimal curves
Need to margin Vth, Vdd to ensure circuit always works
10
1
Vth = 120 mV
Energy
10
10
Slide 19
-1
Vth = 0 mV
10
-1
Performance
10
Partial Compensation
Adjust Vdd after you get part back
Compensates very well for small deviations in Vth
10
1
Vth = 120 mV
Energy
10
0
10
Slide 20
-1
Vth = 0 mV
10
-1
Performance
10
Slide 22
RAZOR FF
Slide 23
Future Systems
Some simple math
Assume scaling continues Dies dont shrink in size Average power/gate must decrease by 2x / generation
Lower energy/op
E/P will decrease Vdd, sizes, etc will reduce Build simpler architectures
Performance
Exploit Specialization
Optimize execution units for different applications
Reformulate the hardware to reduce needed work Can improve energy efficiency for a class of applications
SRF Clusters
Slide 26
Exploit Integration
If both those techniques dont work
Still can increase integration by at least 1.4x
Slide 27
TI - OMAP2420
Specialization And power domains
Most units are off
OMAP 2420
5 Power Domains #1: MCU Core #2: DSP Core #3: Graphic Accelerator #4: Core + Periph. #5: Always On logic
Royannez, et al, 90nm Low Leakage SoC Design Techniques for Wireless Applications, ISSCC 2005
Slide 28
Conclusions
Unfortunately power is an old problem
Magic bullets have mostly been spent
Power will be addressed by application-level optimization, parallelism/specialized functional units, and more adaptive control Performance growth will slow
NRE costs will be a very large issue More performance will require application specialization
Slide 30