Bug 346021 - Back port of sched autogroup patch to 2.6.36
Status: RESOLVED CANTFIX
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-19 00:50 UTC by Mike Pagano
Modified: 2011-03-17 03:25 UTC
CC List: 15 users

See Also:
Package list:
Runtime testing required: ---


Attachments
backported autogroup sched patch (1800_automated-per-tty-task-groups.patch,14.97 KB, patch)
2010-11-19 01:06 UTC, Mike Pagano
backported autogroup sched patch (1800_automated-per-tty-task-groups.patch,14.94 KB, patch)
2010-11-19 14:13 UTC, Mike Pagano

Description Mike Pagano gentoo-dev 2010-11-19 00:50:26 UTC
Bug to track the progress of backporting the sched autogroup patch to 2.6.36. Looking for volunteers to test the patch.
Comment 1 Mike Pagano gentoo-dev 2010-11-19 01:06:37 UTC
Created attachment 254777 [details, diff]
backported autogroup sched patch

Can people please test and let me know how it goes.

Thanks, 
Mike
Comment 2 jon R-B 2010-11-19 02:44:17 UTC
In file included from kernel/sched.c:1931:
kernel/sched_autogroup.c: In function ‘autogroup_create’:
kernel/sched_autogroup.c:56: error: ‘struct task_group’ has no member named ‘autogroup’
make[1]: *** [kernel/sched.o] Error 1
Comment 3 jon R-B 2010-11-19 02:57:56 UTC
http://ompldr.org/vNjg0aQ

is what I used on gentoo-sources-2.6.36 (sorry, I can't remember the URL where I got it from).
It applies fine, compiles and works.

To test, I ran make -j30 while watching a YouTube video.
Comment 4 kfm 2010-11-19 04:02:11 UTC
Regarding the error reported in Comment 2, line #62 of the patch appears suspect:

#if (defined(CONFIG_SCHED_AUTOGROUP) && defined(CONFIG_SCHED_DEBUG))

It shouldn't depend on SCHED_DEBUG; otherwise, a build failure is inevitable if only SCHED_AUTOGROUP is enabled.
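
A minimal, self-contained sketch of the guard problem follows. It is not taken from the patch: the struct layout, the `id` field and the `autogroup_attach` helper are hypothetical stand-ins, and it assumes that the region guarded at line #62 is the declaration of the `autogroup` member. The point is only that a member must be guarded by the same condition as the code that uses it.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate a .config in which only autogroup is enabled. */
#define CONFIG_SCHED_AUTOGROUP 1
/* CONFIG_SCHED_DEBUG is deliberately left undefined. */

/* Hypothetical stand-in for the real structure. */
struct autogroup {
    int id;
};

struct task_group {
    int shares;
    /*
     * The member and its users must share one guard. If this were
     *   #if defined(CONFIG_SCHED_AUTOGROUP) && defined(CONFIG_SCHED_DEBUG)
     * the member would vanish in this configuration and the function
     * below would fail with "has no member named 'autogroup'".
     */
#ifdef CONFIG_SCHED_AUTOGROUP
    struct autogroup *autogroup;
#endif
};

#ifdef CONFIG_SCHED_AUTOGROUP
/* Hypothetical user of the member, guarded by the same symbol. */
static void autogroup_attach(struct task_group *tg, struct autogroup *ag)
{
    tg->autogroup = ag;
}
#endif
```

Changing the guard inside `struct task_group` to the two-symbol form from the patch reproduces the class of error shown in Comment 2.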

Readers may also be interested to note that Lennart Poettering has a solution which is enacted from userspace (CONFIG_CGROUPS required):

http://www.webupd8.org/2010/11/alternative-to-200-lines-kernel-patch.html
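
As I understand it, the userspace approach boils down to each interactive shell creating its own cpu cgroup directory and writing its PID into that directory's `tasks` file. The sketch below expresses that idea in C rather than shell. The function name `join_own_cgroup` is hypothetical, the base directory is passed in because the mount point of the cgroup hierarchy differs between systems, and the `tasks` file is the cgroup v1 interface, so treat this as an illustration under those assumptions rather than the linked recipe itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create <base>/<pid> and write <pid> into <base>/<pid>/tasks.
 * Returns 0 on success and -1 on any error.
 */
static int join_own_cgroup(const char *base, pid_t pid)
{
    char path[4096];
    int n;
    FILE *f;
    int ok;

    n = snprintf(path, sizeof path, "%s/%ld", base, (long)pid);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    /* Mode 0700 keeps the group private to its owner. */
    if (mkdir(path, 0700) != 0 && errno != EEXIST)
        return -1;

    n = snprintf(path, sizeof path, "%s/%ld/tasks", base, (long)pid);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    f = fopen(path, "w");
    if (f == NULL)
        return -1;

    /* Writing a PID to "tasks" moves that task into the group. */
    ok = fprintf(f, "%ld\n", (long)pid) > 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A shell wrapper would call `join_own_cgroup(base, getpid())` before starting its work, where `base` is a writable directory inside a mounted cpu cgroup hierarchy.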
Comment 5 Fernando (likewhoa) 2010-11-19 04:45:02 UTC
(In reply to comment #3)
> http://ompldr.org/vNjg0aQ
> 
> is what I used on gentoo-source-2.6.36 (sorry can't remember the url where I
> got it from)
> applies fine, compiles and works.
> 
> I did a make -j30 while watching a youtube to test
> 

This works for me on ~amd64 with gentoo-sources-2.6.36.
Comment 6 Mike Pagano gentoo-dev 2010-11-19 12:22:16 UTC
(In reply to comment #3)
> http://ompldr.org/vNjg0aQ
> 
> is what I used on gentoo-source-2.6.36 (sorry can't remember the url where I
> got it from)

I would like to know the source of this; mine is from the author of the original patch. I'll look at the differences, but I need to know the source.
Comment 7 Fernando (likewhoa) 2010-11-19 12:53:30 UTC
(In reply to comment #6)
> In reply to comment #3)
> > http://ompldr.org/vNjg0aQ
> > 
> > is what I used on gentoo-source-2.6.36 (sorry can't remember the url where I
> > got it from)
> 
> Would like to know the source of this, mine is from the author of the original
> patch.  I'll look at the differences but I need to know the source.
> 

The source url of comment #3 was posted at http://www.phoronix.com/forums/showthread.php?p=156862#post156862.


Comment 8 Mike Pagano gentoo-dev 2010-11-19 13:41:47 UTC
(In reply to comment #7)
> (In reply to comment #6)
> The source url of comment #3 was posted at
> http://www.phoronix.com/forums/showthread.php?p=156862#post156862.
> 

At the very least, that patch is missing the code from the author's version that uses the runqueue lock. I will post a version that does not require SCHED_DEBUG, but I don't recommend that anyone use the patch linked from that German blog.

Comment 9 Mike Pagano gentoo-dev 2010-11-19 14:13:42 UTC
Created attachment 254827 [details, diff]
backported autogroup sched patch

Updated patch. Please test this patch, thanks.
Comment 10 Fernando (likewhoa) 2010-11-19 21:32:24 UTC
(In reply to comment #9)
> Created an attachment (id=254827) [details]
> backported autogroup sched patch
> 
> Updated patch. Please test this patch, thanks.
> 

Testing now while running make -j60 and playing a 1080p YouTube video at full screen, with a load average of 62.0 30.0 12.0. Feels good, man.
Comment 11 jon R-B 2010-11-20 19:23:45 UTC
patches, compiles, boots, responds

looking good
Comment 12 Mike Pagano gentoo-dev 2010-11-21 01:41:52 UTC
Masked version of gentoo-sources released containing this patch.

Unmask gentoo-sources-2.6.36-r2 to test.
Comment 13 MMelchert 2010-11-21 07:05:48 UTC
Just unmasked 2.6.36-r2; it compiles and runs as expected.
I have had a different patch against 2.6.36 since 11/19, but
it definitely is nicer to have it in the Gentoo repository.
Thanks, Mike.
Comment 14 Andreas Sturmlechner gentoo-dev 2010-11-21 15:56:38 UTC
(In reply to comment #12)
> Masked version of gentoo-sources released containing this patch.
> 
> Unmask gentoo-sources-2.6.36-r2 to test.
> 

Manually updated tuxonice-sources with the new gentoo patchset to confirm that it too builds and runs fine! :)
Comment 15 Krzysztof Pawlik (RETIRED) gentoo-dev 2010-11-21 16:10:01 UTC
(In reply to comment #14)
> (In reply to comment #12)
> > Masked version of gentoo-sources released containing this patch.
> > 
> > Unmask gentoo-sources-2.6.36-r2 to test.
> > 
> 
> Manually updated tuxonice-sources with the new gentoo patchset to confirm that
> it too builds and runs fine! :)

I will update tuxonice-sources and ck-sources once a gentoo-sources version that includes this patch is unmasked.
Comment 16 Andreas Sturmlechner gentoo-dev 2010-11-21 16:32:30 UTC
The patch isn't perfect it seems - now I get dropouts in Amarok which shouldn't happen on a modern dual core system. Are there any recommended/mandatory control group/CFQ options to set alongside the new automation option?
Comment 17 Andreas Sturmlechner gentoo-dev 2010-11-21 16:44:37 UTC
(In reply to comment #16)
> The patch isn't perfect it seems - now I get dropouts in Amarok which shouldn't
> happen on a modern dual core system. Are there any recommended/mandatory
> control group/CFQ options to set alongside the new automation option?
> 

Under not so heavy load tbh, emerge checking dependencies is enough to make Amarok cringe now...
Comment 18 MMelchert 2010-11-21 18:49:03 UTC
Tested mplayer-1.0_rc4_p20101114 and amarok-1.4.10_p20090130-r4
(no amarok2 on this system) on an ~amd64 installation, running a kernel make -j64
while playing a movie and an mp3 stream/file, and didn't encounter any
dropouts like those reported in comment #17. But this is plain gentoo-sources-2.6.36-r2.
Comment 19 Andreas Sturmlechner gentoo-dev 2010-11-22 00:04:46 UTC
(In reply to comment #18)
> tested mplayer-1.0_rc4_p20101114 and amarok-1.4.10_p20090130-r4
> (no amarok2 on system) on ~amd64 installation running kernel make -j64 
> while playing movie and mp3 stream/file and didn't encounter any 
> dropouts like #17 reported. but this is plain gentoo-sources-2.6.36-r2.
> 

There won't be any difference to gentoo-sources runtime-wise imo. Sorry to spoil the fun but I'm able to reproduce this issue any time I'm firing up 'emerge -uvaDN world' and it definitely wasn't present without the patch:

amarok-2.3.2-r1: sound stuttering
kaffeine-9999 TV: sound stops, then video, eventually returns or can't come back again
kmplayer-9999 XVid movie: system completely stalled for ages (a double-digit seconds span), no mouse, redrawing, anything

This is happening on a fairly decently equipped ThinkPad X200s with a Core2 Duo @1.86 GHz. My system might be exotic in that it is equipped with a Seagate Momentus XT 500GB HDD including a 4GB SLC SSD cache, with the portage tree quite likely residing inside the SSD cache due to its regular use (and emerge --sync really is pretty fast).

Someone else with an SSD (and the portage tree residing on it) could shed light on this.
Comment 20 Andreas Sturmlechner gentoo-dev 2010-11-22 00:39:33 UTC
Another test case:

emerge =vanilla-sources-2.6.37_rc3 just made kaffeine cringe again during unpacking (to tmpfs), NOT in the install phase.
Comment 21 kfm 2010-11-22 01:05:28 UTC
I think this ought to be designated as an experimental feature for the following reasons:

  * it's now in genpatches-base and may thus propagate to other ebuilds but ...
  * has had limited testing
  * may not be a guaranteed win across all workloads
  * is not yet featured in any mainline stable kernel

Perhaps something like this ...

    bool "Automatic process group scheduling (EXPERIMENTAL)"
    depends on EXPERIMENTAL
    select CGROUPS
    select CGROUP_SCHED
    select FAIR_GROUP_SCHED
    help
      This option optimizes the scheduler for common desktop workloads by
      automatically creating and populating task groups.  This separation
      of workloads isolates aggressive CPU burners (like build jobs) from
      desktop applications.  Task group autogeneration is currently based
      upon task session. This feature has not yet been tested with a wide
      variety of workloads and, therefore, performance regressions may be
      seen in certain situations. If in doubt, say N.
Comment 22 kfm 2010-11-22 01:24:00 UTC
Not to diminish the potential importance of this patch, but it should also be noted that there are already very effective methods for preventing build jobs from monopolising CPU resources as it stands. For example:

  schedtool -B emerge <package>

Alas, there's no way that I know of to make that a default in make.conf although it can be handled with tools such as schedtoold (it used to be in portage but seems to have been dropped at some point):

  http://www.darav.de/schedtoold.html
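
For anyone curious what `schedtool -B` does under the hood: it switches the task to the SCHED_BATCH policy before running the command. A rough C equivalent is sketched below. The helper names `set_batch_policy` and `exec_as_batch` are my own, and the fallback definition of `SCHED_BATCH` is only there in case the headers do not expose it; this is an illustration, not schedtool's actual source.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes SCHED_BATCH in <sched.h> on glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <unistd.h>

#ifndef SCHED_BATCH
#define SCHED_BATCH 3 /* Linux value, used only if the header hides it */
#endif

/* Switch the calling process to SCHED_BATCH. Returns 0 on success. */
static int set_batch_policy(void)
{
    /* Non-realtime policies require a static priority of 0. */
    struct sched_param param = { .sched_priority = 0 };

    return sched_setscheduler(0, SCHED_BATCH, &param);
}

/*
 * Set the batch policy, then replace this process with argv.
 * Returns -1 only if something failed; on success it does not return.
 */
static int exec_as_batch(char *const argv[])
{
    if (set_batch_policy() != 0)
        return -1;
    execvp(argv[0], argv);
    return -1;
}
```

The policy is inherited across fork and exec, so every compiler process spawned by the build keeps the batch classification.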
Comment 23 Andreas Sturmlechner gentoo-dev 2010-11-22 01:36:56 UTC
Yes, this should be deemed experimental. I would also refrain from crediting it with common desktop workload optimization as long as most people seem to test this with make -j64. I mean, how common is that on a desktop compared to unpacking random stuff whilst listening to music?
Comment 24 kfm 2010-11-22 02:57:12 UTC
Re: Comment 23 - that's the exact point that Lennart made on the LKML thread, although there is clearly some disagreement about what exactly constitutes a 'desktop' workload:

  http://lkml.org/lkml/2010/11/16/420

You might want to report your experiences upstream as the discussion here is most likely flying under the radar, so to speak.
Comment 25 Yang Dehua 2010-11-22 05:29:41 UTC
Just tried it on my G5 running PPC64, but it didn't work. During the yaboot boot process, I got errors something like:
...
Invalid memory access at   %SRR0:000...
...
0>
The only thing I could do is to shutdown at that stage.
Comment 26 Yang Dehua 2010-11-22 05:51:05 UTC
(In reply to comment #25)
> Just tried in my G5 running PPC64, but it didn't work. In yaboot process, it
> got errors something like:
> ...
> Invalid memory access at   %SRR0:000...
> ...
> 0>
> The only thing I could do is to shutdown at that stage.
> 

To be exact, the error message is:

Please wait, loading kernel...
  Elf64 kernel loaded...
Invalid memory access at %SRR0:00000000.00c00000 %SRR1:10000000.00083030
Apple PowerMac7,2 5.1.4f0 BootROM built on 11/21/03 at 17:41:48
Copyright 1994-2003, Apple computer, Inc.
All rights reserved
Welcome to Open Firmware, the system time and date is: 05:25:07 11/22/2010
To continue booting, type "mac-boot" and press return.
To shut down, type "shut-down" and press return.
  ok
0>
Comment 27 Mike Pagano gentoo-dev 2010-11-22 13:34:29 UTC
To everyone: I wouldn't kill yourselves testing this patch, as it is evolving fairly quickly and enough people have posted adverse effects.

It's flaky enough that I won't release a gentoo-sources containing it unless that version is package masked.

If I have time to look at backporting V4, I will release a new masked version, but even that version needs changes, as anyone following lkml has seen.

OTOH, I plan on using it on my desktop systems, since it's been great for me.

Thanks to everyone for your help. You are all a big part of making Gentoo a better distribution.

-Mike

Comment 28 Pacho Ramos gentoo-dev 2011-01-09 20:59:34 UTC
Was this finally included in 2.6.37?
Comment 29 Andreas Sturmlechner gentoo-dev 2011-01-09 22:28:18 UTC
(In reply to comment #28)
> Was this finally included in 2.6.37?
> 

I would hope not, as it only made things worse, and not only on my system.
Comment 30 kfm 2011-01-10 06:54:58 UTC
No, it's not in 2.6.37.
Comment 31 Mike Pagano gentoo-dev 2011-03-11 16:13:35 UTC
Closing until something new occurs in this area
Comment 33 kfm 2011-03-17 03:25:52 UTC
I'm not sure if there is any interest in backporting this at this point but, just in case anyone strays upon this and is at all interested, commit 3ff6dcac735704824c1dff64dc6863c390d364cc should be taken into consideration:

 "sched: Fix poor interactivity on UP systems due to group scheduler nice
         tune bug
    
  Michael Witten and Christian Kujau reported that the autogroup
  scheduling feature hurts interactivity on their UP systems.
   
  It turns out that this is an older bug in the group scheduling code,
  and the wider appeal provided by the autogroup feature exposed it
  more prominently."