After emerge is suspended with SIGTSTP (Ctrl-Z) and resumed via fg, it reschedules too quickly: it sees an 'idle' load average and starts the maximum number of '--jobs'.
Steps to Reproduce:
1. emerge --load-average=1 --jobs 5 -1 firefox qtcore qtgui pypy gentoo-sources vanilla-sources libreoffice
3. Wait for load to drop
After fg, the maximum number of jobs is started, because the load is low at that moment.
After a few seconds the load skyrockets; note the demanding packages I chose for the example.
Register a SIGCONT handler and don't schedule new jobs for a few seconds after it fires. Reschedule once the load from the running compilations kicks in.
(In reply to Tomáš "tpruzina" Pružina (amd64 AT) from comment #0)
> 1. emerge --load-average=1 --jobs 5 -1 firefox qtcore qtgui pypy
> gentoo-sources vanilla-sources libreoffice
Your example seems somewhat contrived. Having such a large difference between your --load-average and --jobs values doesn't make much sense in practice, does it?
Anyway, I suppose that we could add a separate option for dampening of the scheduler. Without such an option, we may not be able to avoid over-dampening in some cases (see bug #438650).
(In reply to Zac Medico from comment #1):
It indeed doesn't; I was just making sure that the example is clear enough.
I ran into this when recompiling a stable chroot with ~glibc,
when I suddenly needed to convert a video, paused the process ... and this hit me.
I considered fixing this myself via dirty hack, something like:
1. signal handler registers SIGCONT
2. set self.no_reschedule = true
4. set self.no_reschedule = false
And ofc, I would make the scheduler instances wait while no_reschedule == true.
Haven't gotten around to this yet; it should be a matter of a few lines of code, I suppose.
Also, I kinda don't like to mess with other people's code unless I absolutely have to.
Yeah, using a SIGCONT handler like that makes sense. The Scheduler class already handles SIGINT and SIGTERM, so I suppose it can handle SIGCONT as well. The signal handler can save a timestamp in a variable, and we can have the scheduling code check for that and expire it at an appropriate time.
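For illustration, the timestamp approach could look roughly like the sketch below. It is not the actual Scheduler code: the ResumeTracker class, its method names, and the 5 second quiet period are all hypothetical.

```python
import signal
import time


class ResumeTracker:
    """Hypothetical helper, NOT Portage's Scheduler: remembers when SIGCONT arrived."""

    def __init__(self, quiet_period=5.0):
        # Assumed value: how long after a resume new jobs are held back.
        self.quiet_period = quiet_period
        self._sigcont_time = None

    def install(self):
        # Register the handler (Unix only; must be called from the main thread).
        signal.signal(signal.SIGCONT, self._sigcont_handler)

    def _sigcont_handler(self, signum, frame):
        # Only record a timestamp here; the scheduling code does the real work.
        self._sigcont_time = time.time()

    def recently_resumed(self, now=None):
        """Return True while the quiet period after the last SIGCONT is active."""
        if self._sigcont_time is None:
            return False
        if now is None:
            now = time.time()
        if now - self._sigcont_time >= self.quiet_period:
            # Expire the timestamp so later checks are cheap.
            self._sigcont_time = None
            return False
        return True
```

The scheduling code would then skip starting new jobs whenever recently_resumed() returns True, and carry on as usual once the timestamp has expired.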
Created attachment 352820 [details, diff]
Untested 'concept' patch
Created a 'concept patch'; not really sure this thingy even works, since I have not tested it at all.
time.clock() is used instead of time.time to avoid potential problems if system time is changed during emerge.
Perhaps even other functions should be using time.clock instead of time.time, unless time is used to output time to user.
Oh, patch lacks registering the handler, mea culpa, it's getting late...
It's not good to sleep in an event-loop driven program like this, because it will starve the event loop. What we really want to do is add some logic to the Scheduler._job_delay() method. Instead of sleeping, it uses self._event_loop.timeout_add() to delay scheduling until a later time.
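To show the difference between sleeping and deferring, here is a rough sketch that uses Python's asyncio loop as a stand-in for Portage's event loop. It is not Portage code: schedule_or_defer and its parameters are hypothetical, and loop.call_later() is only an analogue of the timeout_add() call mentioned above.

```python
import asyncio


def schedule_or_defer(loop, resume_time, quiet_period, start_jobs):
    """Hypothetical sketch: start jobs now, or ask the loop to do it later.

    resume_time is a value previously taken from loop.time().
    Returns True if jobs were started immediately, False if deferred.
    """
    remaining = resume_time + quiet_period - loop.time()
    if remaining > 0:
        # Do not block here: register a timer and return, so the loop
        # keeps servicing other events while the quiet period runs out.
        loop.call_later(remaining, start_jobs)
        return False
    start_jobs()
    return True
```

Because the function returns immediately in both cases, the loop keeps handling other events (process exits, output, signals) during the delay.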
Also, it looks like time.clock() is deprecated in python3.3. We can simply use time.time() like _job_delay() does. If the current time is less than the previously recorded time (due to a system clock adjustment), then the check simply returns False (so it doesn't try to dampen scheduling in this case). This type of behavior should suffice in place of time.clock().
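A minimal sketch of that check, assuming a hypothetical should_dampen() helper (the name, parameters, and quiet period are illustrative and not taken from the actual fix):

```python
import time


def should_dampen(resume_time, quiet_period, now=None):
    """Hypothetical check: True while scheduling should still be held back."""
    if resume_time is None:
        # No SIGCONT has been recorded, so there is nothing to dampen.
        return False
    if now is None:
        now = time.time()
    if now < resume_time:
        # The system clock was stepped backwards; give up on dampening
        # rather than risk holding jobs back for an unbounded time.
        return False
    return (now - resume_time) < quiet_period
```

For example, with a recorded time of 100.0 and a quiet period of 5.0, a current time of 102.0 dampens, while 106.0 (expired) and 90.0 (clock moved backwards) do not.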
This is fixed in git:
This is fixed in 22.214.171.124 and 2.2.0_alpha188.