Anda di halaman 1dari 35

http://repo-ck.com/bench/cpu_schedulers_compared.

pdf

CPU SCHEDULERS COMPARED


By graysky
20-Oct-2012

I Introduction d i
Purpose P Benchmark details Test CPUs/systems /

Purpose p
3

Con Kolivas Brain Fuck Scheduler (bfs) was designed to provide superior desktop interactivity and responsiveness to machines running it.1 However, it was not implicitly designed to provide superior performance. The purpose of this study was to evaluate the Completely Fair Scheduler (cfs) in the vanilla Linux kernel and the bfs in the corresponding kernel patched with the ck1 patchset. Seven (7) different machines were used to see if differences exist and, to what degree they scale using performance based metrics. Again, these end-points were never factors in the primary design goals of the bfs.

(1) http://ck.kolivas.org/patches/bfs/sched-BFS.txt

Benchmark Details
4

The collective benchmark was called from a simple Bash script that ran each individual benchmark task multiple times at least ten in order to get a decent number of observations to power a statistical comparison. Test machines were booted into either the stock kernel or into a corresponding ck1 patched kernel and then challenged with three different benchmark tasks. The time to complete each task was captured to a log file and the test repeated:
1.

Compilation using gcc to `make -jx bzImage` for a preconfigured linux kernel v3.6.2.2 Compression using lrzip to compress the source tree for the linux kernel v3.6.2. Video compression using ffmpeg to transcode a 720p MPEG2 clip to a 360p video suitable for playback on a smartphone.

2.

3.

(2) In the `make make -jx jx bzImage bzImage` statement, statement x=(number of physical cores + virtual cores)+1. cores)+1 I am aware that it is recommended NOT to use the +1 for kernels running the bfs but felt that in order to fairly compare both schedulers, this needed to be held constant.

CPUs Compared p with Core Count


5

Athlon XP 3200+ Intel E5200 Intel Atom 330 Intel i7-2620M Intel X3360 Intel i7-3770K Dual Intel E5620
= Physical core = Hyperthreaded core

=1 1 core =2 cores =4 cores =4 cores =4 cores =8 cores =16 cores

Each test machine ran Arch Linux x86_64 x86 64 except for the Athlon XP which ran Arch i686 due to its lack of 64-bit support.

R l and Results dC Conclusion l i


Compression C i B Benchmark h k Make Benchmark Video Encoding Benchmark Conclusion

Compression p Benchmark Results


7

CPU AMD Athlon XP Intel E5200 Intel Atom 330 Intel i7-2620M Intel X3360 Intel i7-3770K Dual Intel E5620

Average Time (sec)


Vanilla
558.3320 166.3141 470.0283 81.2978 9 68.2635 35.2904 27.7919

CK Kernel is
Difference
-10.5753 -0.5519 -15.2098 -1.808 -0.4648 -0.9594 +0.4797

CK-Patched
547.7577 165.7622 454.8185 79.4898 67.7987 34.3310 28.2716

Result
1.9 % faster 0.3 % faster 3.2 % faster 2.2 % faster 0.7 % faster 2.7 % faster 1.7 % slower

Small (1-3 %) efficiency/speed gains were observed almost universally across th test the t t systems t when h timing ti i compression i using i lrzip. l i Th notable The t bl exception ti being the multi-socket machine which was around 2 % slower.

Make Benchmark Results


8

CPU AMD Athlon XP Intel E5200 Intel Atom 330 Intel i7-2620M Intel X3360 Intel i7-3770K Dual Intel E5620

Average Time (sec)


Vanilla
1,120.6486 374.0274 1,568.5016 192.5477 9 127.6179 68.9835 74.1218

CK Kernel is
Difference
-25.4671 -7.9362 -22.0212 -1.8765 -0.3839 -1.0164 -5.8553

CK-Patched
1,095.6486 366.0912 1,546.4804 190.6712 127.2340 67.9671 68.2665

Result
2.3 % faster 2.1 % faster 1.4 % faster 1.4 % faster 0.3 % faster 1.5 % faster 7.9 % faster

Small to moderate (2-8 %) efficiency/speed gains were observed across all t t systems test t with ith the th gcc-based b d endpoint. d i t Of note t is i the th multi-socket lti k t dual d l quad machine which saw the largest boost using the bfs of nearly 8 %.

Video Benchmark Results


9

CPU AMD Athlon XP Intel E5200 Intel Atom 330 Intel i7-2620M Intel X3360 Intel i7-3770K Dual Intel E5620

Average Time (sec)


Vanilla
380.8340 102.3183 471.0781 50.1631 39.3656 19.6863 30.9037

CK Kernel is
Difference
+5.5866 -2.4439 -27.6331 -1.4142 -1.7232 -1.5819 -0.1896

CK-Patched
386.4206 99.8744 443.4450 48.7489 37.6724 18.1044 30.7141

Result
1.5 % slower 2.4 % faster 5.9 % faster 2.8 % faster 4.4 % faster 8.0 % faster 0.6 % faster

Small to moderate (2-8 %) efficiency/speed gains were observed almost universally i ll across the th test t t systems t when h timing ti i ffmpeg ff video id encoding. di H Here the oldest CPU showed a slight decrease in speed of around 1.5 %.

Conclusion
10

In addition to the primary design goals of the bfs, increased desktop interactivity and responsiveness, kernels patched with the ck1 patch set including the bfs outperformed the vanilla kernel using the cfs at nearly all the performance-based benchmarks tested. Further study with a larger test set could be conducted, but based on the small test set of 7 PCs evaluated, these increases in process queuing, efficiency/speed are, on the whole, independent of CPU type (mono, dual, quad, hyperthreaded, etc.), CPU architecture (32-bit and 64-bit), 64 bit) and of CPU multiplicity (mono or dual socket). socket) Moreover, several modern CPUs (Intel C2D and Ci7) that represent common workstations and laptops, consistently outperformed the vanilla kernel at t all ll benchmarks. b h k Efficiency Effi i and d speed d gains i were small ll to t moderate. d t Feel free to contact the author with questions, suggestions, or rants: graysky AT archlinux DOT us

11

A Appendix/Supporting d /S Information I f
List of Li f software f used d Additional hardware details Statistical relevance of results and more details

Software Used
12

The requisite software packages used for each task came from the official Arch Linux repos with the exception of the linux-ck packages which came from the unofficial linux-ck repo.3 Package names including version numbers:

linux-3.6.2-1 linux-ck-3.6.2-1

linux-ck uses bfs v0.425 contained in the 3.6-ck1 patchset.4

gcc-4.7.2-1 472 1 ffmpeg-1:1.0-1 lrzip-0.614-1 p

Finally, the Oneway ANOVA plots presented in the Appendix were generated using version 10.0 JMP.5
(3) https://wiki.archlinux.org/index.php/repo-ck (4) http://ck.kolivas.org/patches/3.0/3.6/3.6-ck1/patch-3.6-ck1.bz2 (5) http://www.jmp.com

Software Used
13

The Bash script used to drive the benchmarks and to create the log file is available in grayskys github.6 Users may edit the initial variables in the script to repeat this study on their own systems. The Linux source code can be downloaded from http://kernel.org which will provide substrate for the Make and for the Compression benchmarks. Due to copyright limitations, I am unable to provide the 2 min 720p MPEG clip, but users seeking to run the Video benchmark need only provide their own clip and edit the video clip name variable in the script as a workaround. workaround The raw data generated during the course of this study and also presented in the Oneway analyses on the slides to follow is available as a tab d l i t d text delaminated t t file. fil 7

(6) https://github.com/graysky2/bin (7) http://repo-ck.com/bench/raw_data.txt

Additional Hardware Details


14

CPU Model AMD Athlon XP 3200+ Intel E5200 Intel Atom 330 Intel i7 i7-2620M 2620M Intel X3360 Intel i7-3770K Dual Intel E5620

Clock Speed (GHz) 2.20 3.33 1.60 2 70 2.70 3.40 4.50 2.83

RAM (GB) 1.0 4.0 2.0 40 4.0 8.0 16.0 8.0

Comment Overclocked 12.5x266

Overclocked 8.5x400 Overclocked 45x100 Dual socket machine

Note that the overclocked systems have been deemed stable overclocks though hours of punishment without errors including torture testing with mprime, linpack, systester, and with gcc. For more on Linux stress testing, see: https://wiki.archlinux.org/index.php/Stress_Test

15

Oneway Analysis Athl XP/Compression Athlon XP/C i

16

Oneway Analysis Athl XP/Make Athlon XP/M k

17

Oneway Analysis Athl XP/Video Athlon XP/Vid

18

Oneway Analysis E5200/C E5200/Compression i

19

Oneway Analysis E5200/M k E5200/Make

20

Oneway Analysis E5200/Video E5200/Vid

21

Oneway Analysis At 330/Compression Atom 330/C i

22

Oneway Analysis At 330/M Atom 330/Make k

23

Oneway Analysis At 330/Vid Atom 330/Video

24

Oneway Analysis i7 2620M/C i7-2620M/Compression i

25

Oneway Analysis i7 2620M/M k i7-2620M/Make

26

Oneway Analysis i7 2620M/Vid i7-2620M/Video

27

Oneway Analysis X3360/C X3360/Compression i

28

Oneway Analysis X3360/M k X3360/Make

29

Oneway Analysis X3360/Video X3360/Vid

30

Oneway Analysis i7 3370K/C i7-3370K/Compression i

31

Oneway Analysis i7 3370K/M k i7-3370K/Make

32

Oneway Analysis i7 3370K/Vid i7-3370K/Video

33

Oneway Analysis D l E5620/C Dual E5620/Compression i

34

Oneway Analysis D l E5620/M Dual E5620/Make k

35

Oneway Analysis D l E5620/Vid Dual E5620/Video

Anda mungkin juga menyukai