BFS vs. mainline scheduler benchmarks and measurements
Posted Sep 7, 2009 13:37 UTC (Mon) by mingo (subscriber, #31122)
In reply to: BFS vs. mainline scheduler benchmarks and measurements by kragil
Parent article: BFS vs. mainline scheduler benchmarks and measurements
> What I take from this discussion is that Kernel devs live in a world where Intels fastest chips in multi socket systems are low end and they will cater only to the enterprise bullcrap that pays their bills.
I certainly don't live in such a world, and I use a bog-standard dual-core system as my main desktop. I also have an 833 MHz Pentium-3 laptop that I booted into a new kernel 4 times today alone:
  #0, d5f8b495, Mon_Sep__7_08_39_36_CEST_2009: 0 kernels/hour
  #1, b9e808ca, Mon_Sep__7_09_19_47_CEST_2009: 1 kernels/hour
  #2, b9e808ca, Mon_Sep__7_10_26_28_CEST_2009: 1 kernels/hour
  #3, b9e808ca, Mon_Sep__7_14_58_48_CEST_2009: 0 kernels/hour

  $ head /proc/cpuinfo
  processor	: 0
  vendor_id	: GenuineIntel
  cpu family	: 6
  model		: 8
  model name	: Pentium III (Coppermine)
  stepping	: 10
  cpu MHz		: 846.242
  cache size	: 256 KB

  $ uname -a
  Linux m 2.6.31-rc9-tip-01360-gb9e808c-dirty #1178 SMP Mon Sep 7 22:38:18 CEST 2009 i686 i686 i386 GNU/Linux
And that test system does that every day; today isn't a special day. Look at the build count: #1178. This means that I have booted more than a thousand development kernels on this system already.
Now, to reply to your suggestion: for scheduler performance I picked the 8-core system because that's where I do scheduler tests: it allows me to characterise that system _and_ also allows me to characterise lower-performance systems to a fair degree.
Check out the updated jpgs with quad-core results.
See how similar the single-socket quad results are to the 8-core results I posted initially? People who do scheduler development use this trick frequently: most of the "obvious" results can be downscaled as a ballpark figure.
(The reason for that is very fundamental: you don't see new scheduler limitations pop up as you go down in the number of cores. The larger system already includes all the limitations the scheduler has on 4, 2 or 1 cores and already reflects those properties, so there are no surprises. Plus, testing is a lot faster: it took me 8 hours today to get all the results from the quad system, and this is right before the 2.6.32 merge window opens, when Linux maintainers like me are very busy.)
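As a rough illustration of that downscaling trick: on Linux you can confine a workload to a subset of cores with taskset (a standard util-linux tool), which lets an 8-core box approximate a quad. The benchmark command below is just a placeholder, not the exact tool used for these tests:

```shell
# Emulate a quad-core machine on a larger box by confining the workload
# to CPUs 0-3 (assumes at least 4 online CPUs). Replace the grep with
# e.g. hackbench or a kernel build for a real measurement.
taskset -c 0-3 sh -c 'grep Cpus_allowed_list /proc/self/status'
```

The kernel's CPU-affinity mask is inherited across fork/exec, so everything the confined shell spawns stays on those four cores.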
Certainly there are borderline graphs and trickier cases that cannot be downscaled like that; in particular 'interactivity', i.e. all things latency-related, shows up on smaller systems in a more pronounced way.
But when it comes to scheduler design and merge decisions that will trickle down and affect users 1-2 years down the line (once it gets upstream, once distros use the new kernels, once users install the new distros, etc.), I have to "look ahead" quite a bit in terms of the hardware spectrum.
Btw., that's why the Linux scheduler performs so well on quad-core systems today: the groundwork for that was laid two years ago, when scheduler developers were testing on quads. If we discovered fundamental problems on quads _today_, it would be way too late to help Linux users.
Hope this explains why kernel devs are sometimes seen to be ahead of the hardware curve. It's really essential, and it does not mean we are detached from reality.
In any case - if you see any interactivity problems, on any class of systems, please do report them to lkml and help us fix them.