
Power of Realtime 3D-Rendering

Raja Koduri

We ate our GPU cake ("vuoi la botte piena e la moglie ubriaca": wanting the barrel full and the wife drunk)

And had more too! 16+ years of (sugar) high!
In every GPU generation:
- More performance and performance-per-watt
- More programmability, precision and features
- While maintaining compatibility with 8+ generations of APIs

Basic Equations
Chip Power = Static Power + Dynamic Power
System Power = CPU + GPU + Other

Static power is leakage of inactive transistors:
$\text{Static Power} = N \cdot V \cdot e^{-V_t}$

Dynamic power is from active switching transistors:
$\text{Dynamic Power} = A \cdot N \cdot C \cdot F \cdot V^2$

N: number of transistors
V: voltage
Vt: threshold voltage
A: activity factor
F: frequency
C: capacitance per transistor

Energy = Power(Watts) * Time
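The relations on this slide can be tried out numerically. The sketch below is only illustrative: the function names are made up here, the slide gives proportionalities rather than calibrated constants, and all inputs are normalized to 1.0 instead of being real chip data.

```python
import math

def static_power(n, v, vt):
    """Leakage term from the slide: N * V * e^(-Vt)."""
    return n * v * math.exp(-vt)

def dynamic_power(a, n, c, f, v):
    """Switching term from the slide: A * N * C * F * V^2."""
    return a * n * c * f * v ** 2

def energy(power_watts, seconds):
    """Energy = Power (Watts) * Time."""
    return power_watts * seconds

# Normalized, hypothetical comparison: same chip and activity,
# but frequency halved and voltage lowered by 20%.
base = dynamic_power(a=1.0, n=1.0, c=1.0, f=1.0, v=1.0)
scaled = dynamic_power(a=1.0, n=1.0, c=1.0, f=0.5, v=0.8)
ratio = scaled / base
print(f"dynamic power ratio: {ratio:.2f}")
```

Because voltage enters squared, the assumed 20% voltage reduction contributes more than its share: together with the halved frequency, dynamic power drops to about a third of the baseline.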

System Categories and GPU Power Limits

Mobile Devices (phones, tablets, etc.): 0-10 Watts TDP
Mobile Computers (computers with a battery): 10-50 Watts TDP
Desktops, Servers, Consoles, etc. (always plugged in): 50-300 Watts TDP

TDP (thermal design power): the maximum amount of power that the thermal system can sustain.

Disruptive transition - Chip technology limits

Sub-Moore's-law scaling of performance-per-watt

N wants to go up, but C and V are scaling at a sub-Moore rate.
GPUs scale better than CPUs.
Beware of the fine print, though.

Disruptive transition - Market shift

Towards lower-power computers

- Battery life
- Thermal limits
- Acoustics
- $-per-kilowatt-hour
- Green

System categories & Volumes


[Figure: system categories by power range. Mobile Devices (0-10 W), Mobile Computers (10-50 W), Desktops (50-300 W).]

HW vendors are compelled to succeed in mobile markets.

CPU v/s GPU v/s FixedFunction


$\text{Chip Power} = N \cdot V \cdot e^{-V_t} + A \cdot N \cdot C \cdot F \cdot V^2$

CPUs
- Prioritize frequency: higher V and Vt
- Spend N on caches, cores, flexibility and compute quality
GPUs
- Lower frequency and voltage
- Spend N on shaders, textures, pixels, etc.
FixedFunction
- Lowest N, F and V for a given task

CPU v/s GPU


If your workload parallelizes well on a GPU, use the GPU.
It's only a win if
GPU_Workload_power * GPU_Executiontime + CPU_GPU_feeding_power * CPU_GPU_feeding_time
is less than
CPU_Workload_power * CPU_Executiontime
Optimize for system energy.
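The comparison on this slide is a straightforward energy inequality, so it can be written as a small check. This is a sketch only: the function name and every number in the example are hypothetical, chosen to show the arithmetic rather than to describe a real system.

```python
def gpu_offload_energy(gpu_power_w, gpu_time_s,
                       feed_power_w, feed_time_s,
                       cpu_power_w, cpu_time_s):
    """Return (gpu_path_joules, cpu_path_joules, gpu_wins).

    gpu_path = GPU workload energy + energy the CPU spends feeding the GPU
    cpu_path = energy of running the whole workload on the CPU
    """
    gpu_path = gpu_power_w * gpu_time_s + feed_power_w * feed_time_s
    cpu_path = cpu_power_w * cpu_time_s
    return gpu_path, cpu_path, gpu_path < cpu_path

# Hypothetical numbers: 20 W GPU for 10 ms, 5 W of CPU feeding for 4 ms,
# versus 15 W of CPU for 50 ms.
gpu_j, cpu_j, gpu_wins = gpu_offload_energy(20.0, 0.010, 5.0, 0.004, 15.0, 0.050)
print(f"GPU path: {gpu_j:.3f} J, CPU path: {cpu_j:.3f} J, GPU wins: {gpu_wins}")
```

Note that the GPU draws more power than the CPU in this made-up case and still wins, because it finishes so much sooner. The feeding term is what can flip the result for small workloads.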

A Peek inside the power management black box


State of Art - Static Power Management


$\text{Static Power} = N \cdot V \cdot e^{-V_t}$

Power off unused areas (power gating)
- When not in use, Static Power ~= 0

Fine print
- Latency with each power toggle (a few nanoseconds to a few hundred microseconds)
- Switching too aggressively may cause performance problems; switching too conservatively may waste power

Dynamic Power Management


$\text{Dynamic Power} = A \cdot N \cdot C \cdot F \cdot V^2$

Primary OS+HW strategy is to control F & V, based on the history of A.
Applications have control of activity (A).
Lower A leads to lower F and V, and so to lower power.

Fine print
Switching F & V states can take anywhere from a few milliseconds to a few seconds!
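To make "control F & V based on history of A" concrete, here is a toy governor. It is an assumption-laden sketch, not how any real driver works: the state table, the averaging window, and the names `HistoryGovernor` and `relative_dynamic_power` are all invented for illustration, and the switching latency from the fine print is not modeled.

```python
from collections import deque

# Hypothetical (frequency, voltage) states, normalized to the top state.
STATES = [(0.3, 0.7), (0.5, 0.8), (0.8, 0.9), (1.0, 1.0)]

class HistoryGovernor:
    """Pick an (F, V) state from a moving average of recent activity."""

    def __init__(self, window=4):
        self.history = deque(maxlen=window)

    def pick_state(self, activity):
        """Record one activity sample (0.0 to 1.0) and return an (F, V) state."""
        self.history.append(activity)
        avg = sum(self.history) / len(self.history)
        # Lowest state whose frequency still covers the average activity.
        for f, v in STATES:
            if f >= avg:
                return f, v
        return STATES[-1]

def relative_dynamic_power(activity, f, v):
    """A * F * V^2 with N and C held constant (normalized away)."""
    return activity * f * v ** 2

gov = HistoryGovernor(window=2)
for sample in (0.2, 1.0, 1.0):
    f, v = gov.pick_state(sample)
    print(sample, (f, v), round(relative_dynamic_power(sample, f, v), 3))
```

Because the governor reacts to history, a burst of activity only raises the state after the average has caught up. That lag is one reason applications that keep activity low and steady tend to stay in lower power states.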


Activity->Performance->Power
[Figure: Activity/Performance/Power sample illustration. Activity, Frequency and Power (normalized, ticks at 0.3, 0.5, 0.8 and 1.0) plotted against Time, with the state-switch latency marked.]

Power Management system diagram


[Diagram: components of the power management system]
Applications
GPU API/Runtime
GPU User-Mode Drivers
GPU Power Management Driver
Power microcontroller
GPU Kernel-Mode Driver
GPU

Basic Application Optimizations for Power

Tip 1: Control the frame rate to the minimum desired.
Pumping out more frames than the user can see is wasteful anyway.
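One simple way to apply Tip 1 is a frame cap that sleeps away whatever is left of each frame budget. The sketch below is an assumption, not something from the talk: `run_capped` and its `clock` and `sleep` parameters are hypothetical names, and a real application would more likely rely on vsync or the platform's frame pacing.

```python
import time

def run_capped(render_frame, target_fps, num_frames,
               clock=time.monotonic, sleep=time.sleep):
    """Render num_frames at no more than target_fps.

    Returns the total time spent sleeping, i.e. idle time the hardware
    can use to drop into lower power states.
    """
    budget = 1.0 / target_fps
    slept = 0.0
    for _ in range(num_frames):
        start = clock()
        render_frame()
        remaining = budget - (clock() - start)
        if remaining > 0:
            sleep(remaining)
            slept += remaining
    return slept

# Usage: cap a trivially cheap "menu" frame at 60 fps for three frames.
idle = run_capped(lambda: None, target_fps=60, num_frames=3)
print(f"idle time over 3 frames: {idle * 1000:.1f} ms")
```

The `clock` and `sleep` parameters exist only so the loop can be exercised with a fake clock. With the defaults it uses `time.monotonic` and `time.sleep`.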


Wasteful frame-rate - case study


[Figure: 50 Watt System. Plotted quantities: Frames-per-sec and Power in Watts. Annotations: simple rendering (like menus), complex rendering (like game play), Power Limit @50W.]

Case study - What happens in a 25 watt system?


[Figure: 25 Watt System. Plotted quantities: Frames-per-sec and Power in Watts. Annotations: simple rendering (like menus), complex rendering (like game play), 60 fps app, thermal limit, Power Limit @25W.]

Real world good example


Basic Application Optimizations for Power

Tip 2: Optimize the frame rendering time to a minimum.
Don't stop optimizing when you hit your minimum frame rate targets!

Tip 2 - Ideal rendering algorithm case 1


[Figure: Activity versus Time in Milliseconds for the ideal case (case 1).]

Case 1 Energy = 4*Pmax + 12*Pstatic
Pstatic is near zero in power-optimized GPUs, so
Case 1 Energy ~= 4*Pmax

Tip 2 - Sub-optimal rendering algorithm case 2


[Figure: Activity versus Time in Milliseconds for the sub-optimal case (case 2).]

Case 1 Energy = 4*Pmax
Case 2 Energy = 16*Pmax
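The two cases can be compared with a few lines of arithmetic. This sketch reads the slides' figures as a 16 ms frame in which case 1 is busy at Pmax for 4 ms and idle for 12 ms, while case 2 stays at Pmax for all 16 ms. The 25 W value for Pmax and the function name `frame_energy_mj` are assumptions made for illustration.

```python
def frame_energy_mj(busy_ms, frame_ms, p_max_w, p_static_w=0.0):
    """Energy for one frame in millijoules (watts * milliseconds).

    Busy time is charged at p_max_w, the rest of the frame at p_static_w.
    """
    idle_ms = frame_ms - busy_ms
    return busy_ms * p_max_w + idle_ms * p_static_w

P_MAX = 25.0  # hypothetical Pmax in watts

case1 = frame_energy_mj(busy_ms=4, frame_ms=16, p_max_w=P_MAX)   # 4*Pmax
case2 = frame_energy_mj(busy_ms=16, frame_ms=16, p_max_w=P_MAX)  # 16*Pmax
print(f"case 1: {case1} mJ, case 2: {case2} mJ, ratio: {case2 / case1}")
```

Both cases deliver the frame inside the same 16 ms budget, so the frame rate target alone would not distinguish them. Under these assumptions the energy differs by a factor of four.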


Basic Application Optimizations for Power

Tip 3: Don't scatter work in a frame (coalesce).
Scattered work leaves insufficient idle intervals for power-state reduction.

Don't scatter work in a frame

[Figure: Activity versus Time in Milliseconds, with work scattered across the frame.]

May not be enough idle time for switching to a lower power state.
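A rough way to see the effect is to count only the idle gaps that are longer than the power-state switch latency. Everything in this sketch is hypothetical: the function name `usable_idle_ms`, the busy intervals, the 16 ms frame and the 5 ms latency are placeholders chosen to illustrate the point.

```python
def usable_idle_ms(busy_intervals, frame_ms, switch_latency_ms):
    """Sum the idle gaps in one frame that exceed the switch latency.

    busy_intervals is a list of (start_ms, end_ms) pairs within the frame.
    Gaps no longer than the latency are too short to reach a lower state.
    """
    gaps = []
    cursor = 0.0
    for start, end in sorted(busy_intervals):
        if start > cursor:
            gaps.append(start - cursor)
        cursor = max(cursor, end)
    if frame_ms > cursor:
        gaps.append(frame_ms - cursor)
    return sum(g for g in gaps if g > switch_latency_ms)

scattered = [(0, 1), (4, 5), (8, 9), (12, 13)]  # 4 ms of work, spread out
coalesced = [(0, 4)]                            # the same 4 ms, back to back

print(usable_idle_ms(scattered, frame_ms=16, switch_latency_ms=5))
print(usable_idle_ms(coalesced, frame_ms=16, switch_latency_ms=5))
```

Both schedules do the same amount of work, but in the scattered one every gap is 3 ms, shorter than the assumed latency, so none of the idle time is usable. The coalesced one leaves a single 12 ms gap.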



Basic Application Optimizations for Power

Tip 4: Avoid spin-loops.
E.g., the CPU waiting on the GPU. This looks like real work to the CPU.
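The difference between spinning and blocking can be shown with a plain thread standing in for the GPU. This is only an analogy under stated assumptions: `fake_gpu_job`, `spin_wait` and `blocking_wait` are hypothetical names, and a real application would wait on whatever fence or sync object its graphics API provides.

```python
import threading
import time

def fake_gpu_job(done_flag, duration_s=0.02):
    """Stand-in for GPU work: finish after duration_s, then signal."""
    time.sleep(duration_s)
    done_flag.set()

def spin_wait(done_flag):
    """Anti-pattern: poll in a tight loop. Returns the number of polls."""
    polls = 0
    while not done_flag.is_set():
        polls += 1
    return polls

def blocking_wait(done_flag, timeout_s=1.0):
    """Preferred: block until signalled so the CPU core can go idle."""
    return done_flag.wait(timeout_s)

flag = threading.Event()
worker = threading.Thread(target=fake_gpu_job, args=(flag,))
worker.start()
signalled = blocking_wait(flag)
worker.join()
print("signalled:", signalled)
```

While `spin_wait` runs, the core stays fully active, which is what the power manager sees as real work. `blocking_wait` hands the wait to the OS, so the core is free to drop to a lower power state until the signal arrives.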


Advanced topics and Research Opportunities


Scheduling for power optimization

A complex subject:
- Dynamic scheduling systems based on power and thermal feedback
- Hardware v/s software schedulers
- Scheduling CPU and GPU
- Many more topics

FixedFunction Revenge
Premature declaration of the death of fixed-function in GPU hardware
What new candidates can we move to FixedFunction?
What interface principles should FixedFunction hardware adopt to be friendly to mainstream programmers?

Advanced power and thermal management models

Can we build predictive models to augment current reactive models?
Should there be APIs for apps to influence power states and monitor feedback?

Plenty of opportunity for new power optimized rendering approaches


2560x1600 screen @60Hz ~ 250 MPixels/Sec
A 25W GPU today:
- 5800 MPixels/Sec
- 17400 MTexels/Sec
- 696000 MFLOPS
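The headroom implied by these figures follows from simple division. The sketch below uses only the numbers on the slide; the variable names are made up, and treating the ratios as a per-displayed-pixel budget is an interpretation rather than something the slide states.

```python
# Display demand: every pixel of a 2560x1600 screen, 60 times a second.
screen_pixels_per_sec = 2560 * 1600 * 60

# Throughput figures quoted on the slide for a 25 W GPU.
gpu_pixels_per_sec = 5800e6
gpu_texels_per_sec = 17400e6
gpu_flops = 696000e6

pixel_headroom = gpu_pixels_per_sec / screen_pixels_per_sec
texels_per_pixel = gpu_texels_per_sec / screen_pixels_per_sec
flops_per_pixel = gpu_flops / screen_pixels_per_sec

print(f"screen demand: {screen_pixels_per_sec / 1e6:.1f} MPixels/Sec")
print(f"fill-rate headroom: {pixel_headroom:.1f}x")
print(f"texels per displayed pixel: {texels_per_pixel:.1f}")
print(f"FLOPs per displayed pixel: {flops_per_pixel:.0f}")
```

The screen needs about 246 MPixels/Sec, which matches the slide's "~ 250". The quoted GPU can fill roughly 24 times that, which is the slack new power-optimized rendering approaches could trade for energy.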


Backup


Static Power Management


$\text{Static Power} = N \cdot V \cdot e^{-V_t}$

Reduction choices
- Reduce N, but that reduces capabilities
- Reduce V, but that limits performance and is not in your control (process limits)
- Raise Vt, but that means slower transistors and lower performance

Static power is the dominant battery-life factor for common usage scenarios.

Tip 2 - Anomalous Case 3


[Figure: Activity versus Time in Milliseconds for the anomalous case (case 3).]

Case 1 Energy = 4*Pmax
Case 3 Energy = 8*0.7*P?
