$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
Execution time B / Execution time A = 25 / 20 = 1.25
A is 1.25 times faster than B (the speedup of A over B is 1.25).
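The ratio above can be checked with a few lines of Python. This is a minimal sketch: the function name `speedup` and its parameter names are illustrative choices, not part of the slides, and the times 25 and 20 are taken from the example without assuming a unit.

```python
def speedup(time_slow: float, time_fast: float) -> float:
    """Return how many times faster the quicker machine is.

    Performance is the reciprocal of execution time, so the
    performance ratio equals the inverse ratio of the times.
    """
    if time_slow <= 0 or time_fast <= 0:
        raise ValueError("execution times must be positive")
    return time_slow / time_fast


# B takes 25 time units, A takes 20 (values from the example above).
n = speedup(time_slow=25, time_fast=20)
print(n)  # 1.25
```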
cycle time = time between ticks = seconds per cycle
clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
A 2 GHz clock has a cycle time of 1 / (2 × 10^9) s = 0.5 nanoseconds (ns)
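The conversion between clock rate and cycle time is just a reciprocal. The sketch below uses a hypothetical helper, `cycle_time_seconds`, to reproduce the 2 GHz figure; the name is an assumption made for illustration.

```python
def cycle_time_seconds(clock_rate_hz: float) -> float:
    """Cycle time (seconds per cycle) is the reciprocal of the clock rate."""
    if clock_rate_hz <= 0:
        raise ValueError("clock rate must be positive")
    return 1.0 / clock_rate_hz


# 2 GHz = 2 × 10^9 cycles per second.
t = cycle_time_seconds(2e9)
print(f"{t * 1e9:.2f} ns")  # 0.50 ns
```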
Don't panic: you can easily work this out from basic principles.
[Figure: timeline with successive intervals labeled 4th, 5th, 6th, ... along a time axis]
$$\text{total clock cycles} = \sum_{i=1}^{n} \text{CPI}_i \times I_i$$
$$\sum_{i} P_i = 1$$
Instruction type   CPI   Frequency
ALU                1     50%
Branch             2     20%
Load               2     20%
Store              2     10%
Average CPI is
1 × 50% + 2 × 20% + 2 × 20% + 2 × 10%
= 0.5 + 0.4 + 0.4 + 0.2 = 1.5
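The weighted average above can be sketched in Python. The dictionary layout and the name `average_cpi` are assumptions made for illustration; the CPI values and frequencies come from the instruction mix listed above.

```python
# Instruction mix: type -> (CPI, fraction of executed instructions).
MIX = {
    "ALU": (1, 0.50),
    "Branch": (2, 0.20),
    "Load": (2, 0.20),
    "Store": (2, 0.10),
}


def average_cpi(mix: dict[str, tuple[float, float]]) -> float:
    """Weighted average CPI: the sum of CPI_i times fraction_i over all types."""
    total_fraction = sum(fraction for _, fraction in mix.values())
    # The fractions must sum to 1, allowing for floating-point rounding.
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(cpi * fraction for cpi, fraction in mix.values())


print(f"{average_cpi(MIX):.2f}")  # 1.50
```

Multiplying the average CPI by the instruction count gives the total clock cycles, which is the same sum written in terms of fractions rather than counts.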
[Figure: two plots, SPECint and SPECfp; vertical axes run from 0 to 9, horizontal axes from 50 to 250]
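$$\text{Speedup} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$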
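A short Python sketch of the speedup formula follows. The function name `overall_speedup` and the sample inputs (half the execution time enhanced, sped up by a factor of 2) are hypothetical values chosen for illustration and do not come from the slides.

```python
def overall_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced.

    fraction_enhanced: share of the original execution time that benefits (0..1).
    speedup_enhanced:  how much faster that share runs after the enhancement.
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    unenhanced_part = 1.0 - fraction_enhanced
    enhanced_part = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced_part + enhanced_part)


# Hypothetical case: 50% of the time is enhanced and runs 2x faster.
print(f"{overall_speedup(0.5, 2.0):.3f}")  # 1.333
```

Even an arbitrarily large `speedup_enhanced` cannot push the result past 1 / (1 − fraction_enhanced), which is why the fraction of time affected matters as much as the size of the enhancement.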
Examples:
All instructions require an instruction fetch, but only a
fraction require a data fetch/store => optimize
instruction access over data access
Programs spend a lot of time accessing memory and
exhibit spatial and temporal locality => design a storage
hierarchy such that the most frequent accesses go to the
smallest (fastest) memories