With Linux's autogroup scheduling feature (CONFIG_SCHED_AUTOGROUP), setting a nice value on a per-process basis only affects scheduling decisions relative to the other threads in the same session (typically: the same terminal window). See the section "The nice value and group scheduling" in the sched(7) man page. In practice, this means that portage "just" setting the process nice value has no effect relative to processes in other sessions while autogroup scheduling is active (which is probably the case on most desktop systems). Hence portage should set the autogroup's nice value to PORTAGE_NICENESS and restore the original value afterwards. Reproducible: Always
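For reference, a minimal sketch of how the autogroup's nice value could be inspected from Python. It assumes the line format shown in sched(7) (for example "/autogroup-25 nice 0"); the function names are made up for illustration and are not part of portage.

```python
import re
from pathlib import Path
from typing import Optional

# Line format as documented in sched(7), e.g. "/autogroup-25 nice 0".
_AUTOGROUP_RE = re.compile(r"^/autogroup-(\d+) nice (-?\d+)$")


def parse_autogroup(line: str) -> Optional[int]:
    """Return the nice value from an autogroup line, or None if it does not parse."""
    match = _AUTOGROUP_RE.match(line.strip())
    return int(match.group(2)) if match else None


def read_autogroup_nice(path: Path = Path("/proc/self/autogroup")) -> Optional[int]:
    """Read the autogroup nice value; None if the file is missing or unreadable."""
    try:
        return parse_autogroup(path.read_text())
    except OSError:
        # No /proc/self/autogroup (kernel built without CONFIG_SCHED_AUTOGROUP)
        # or no permission to read it.
        return None
```

Comparing this value with the process's own nice value shows whether a per-process nice setting matters outside the current session.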
Created attachment 692715 [details, diff] 0001-PORTAGE_NICENESS-Consider-autogroup-scheduling.patch
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=a4d882964ee1931462f911d0c46a80e27e59fa48 commit a4d882964ee1931462f911d0c46a80e27e59fa48 Author: Florian Schmaus <flo@geekplace.eu> AuthorDate: 2021-03-21 11:07:38 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-13 21:45:32 +0000 PORTAGE_NICENESS: Consider autogroup scheduling With Linux's autogroup scheduling feature (CONFIG_SCHED_AUTOGROUP) setting a nice value on a per-process base has only an effect for scheduling decisions relative to the other threads in the same session (typically: the same terminal window). See the section "The nice value and group scheduling" in the sched(7) man page. Basically this means that portage "just" setting the nice value, has no effect in presence of autogroup scheduling being active (which is probably true for most (desktop) user systems). This commit changes emerge to set the autogroup's nice value, instead of the processes' nice value, in case autogroups are present (detected by the existence of /proc/self/autogroup). The tricky part about autogroup nice values is that we want restore the orignal nice value once we are finished. As otherwise, the session, e.g. your terminal, would continue using this value, and so would subsequently executed processes. For that we use Python's atexit functinaly, to register a function that will restore the orignal nice value of the autogroup. Bug: https://bugs.gentoo.org/777492 Signed-off-by: Florian Schmaus <flo@geekplace.eu> Signed-off-by: Zac Medico <zmedico@gentoo.org> lib/_emerge/actions.py | 36 +++++++++++++++++++++++++++++++++--- man/make.conf.5 | 10 +++++++++- 2 files changed, 42 insertions(+), 4 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c0f239a182977dcd67e5592dfedc5f07d927ad53 commit c0f239a182977dcd67e5592dfedc5f07d927ad53 Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-13 22:03:39 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-13 22:08:03 +0000 sys-apps/portage: Bump to version 3.0.20 for final EAPI 8 #777492: PORTAGE_NICENESS: Consider autogroup scheduling #794166: setup.py: prefer setuptools over distutils Bug: https://bugs.gentoo.org/785484 Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/794166 Package-Manager: Portage-3.0.20, Repoman-3.0.3 Signed-off-by: Zac Medico <zmedico@gentoo.org> sys-apps/portage/Manifest | 1 + sys-apps/portage/portage-3.0.20.ebuild | 266 +++++++++++++++++++++++++++++++++ 2 files changed, 267 insertions(+)
Had this error reported in #gentoo-portage:

> # PORTAGE_NICENESS=20 emerge -av --depclean
> OSError: [Errno 22] Invalid argument
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/usr/lib/python-exec/python3.9/emerge", line 51, in <module>
>     retval = emerge_main()
>   File "/usr/lib/python3.9/site-packages/_emerge/main.py", line 1319, in emerge_main
>     return run_action(emerge_config)
>   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2999, in run_action
>     apply_priorities(emerge_config.target_config.settings)
>   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2635, in apply_priorities
>     nice(settings)
>   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2672, in nice
>     out.eerror("%s\n" % str(e))
> OSError: [Errno 22] Invalid argument
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=ac4f07b4b04aadf57f78cb21729e1f5439609f81 commit ac4f07b4b04aadf57f78cb21729e1f5439609f81 Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-14 06:23:56 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-14 06:26:00 +0000 Revert "PORTAGE_NICENESS: Consider autogroup scheduling" This reverts commit a4d882964ee1931462f911d0c46a80e27e59fa48. It triggered this regression: # PORTAGE_NICENESS=20 emerge -av --depclean OSError: [Errno 22] Invalid argument During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python-exec/python3.9/emerge", line 51, in <module> retval = emerge_main() File "/usr/lib/python3.9/site-packages/_emerge/main.py", line 1319, in emerge_main return run_action(emerge_config) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2999, in run_action apply_priorities(emerge_config.target_config.settings) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2635, in apply_priorities nice(settings) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2672, in nice out.eerror("%s\n" % str(e)) OSError: [Errno 22] Invalid argument Bug: https://bugs.gentoo.org/777492#c4 Signed-off-by: Zac Medico <zmedico@gentoo.org> lib/_emerge/actions.py | 36 +++--------------------------------- man/make.conf.5 | 10 +--------- 2 files changed, 4 insertions(+), 42 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=19043cf034fc35902ada6c59b377d8edd57afd2e commit 19043cf034fc35902ada6c59b377d8edd57afd2e Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-14 06:37:50 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-14 06:41:12 +0000 sys-apps/portage: 3.0.20-r1 revbump Reverting the patch from bug 777492 because it triggered this regression: # PORTAGE_NICENESS=20 emerge -av --depclean OSError: [Errno 22] Invalid argument During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python-exec/python3.9/emerge", line 51, in <module> retval = emerge_main() File "/usr/lib/python3.9/site-packages/_emerge/main.py", line 1319, in emerge_main return run_action(emerge_config) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2999, in run_action apply_priorities(emerge_config.target_config.settings) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2635, in apply_priorities nice(settings) File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2672, in nice out.eerror("%s\n" % str(e)) OSError: [Errno 22] Invalid argument Bug: https://bugs.gentoo.org/785484 Bug: https://bugs.gentoo.org/777492#c4 Package-Manager: Portage-3.0.20, Repoman-3.0.3 Signed-off-by: Zac Medico <zmedico@gentoo.org> sys-apps/portage/Manifest | 1 + .../portage/{portage-3.0.20.ebuild => portage-3.0.20-r1.ebuild} | 6 +++++- 2 files changed, 6 insertions(+), 1 deletion(-)
The valid range of nice values is -20 to 19 (cf. nice(1)), so PORTAGE_NICENESS=20 is outside this range. Python's os.nice() appears to be lenient about out-of-range input, i.e. os.nice(20) sets the value to 19. Writing 20 to /proc/self/autogroup, however, fails with errno set to "Invalid argument". We now have multiple options:
- Verify that PORTAGE_NICENESS is within the valid range, and return a clear error message to the user if it is not
- Verify PORTAGE_NICENESS and automatically cap it to the valid range if it is not (this appears to be the current behavior)
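The two options can be sketched as follows. This is only an illustration; the helper names and the error message are hypothetical, and only the range [-20, 19] comes from nice(1).

```python
NICE_MIN, NICE_MAX = -20, 19  # valid range per nice(1)


def clamp_nice(value: int) -> int:
    """Option 2: silently cap the value to the valid range."""
    return max(NICE_MIN, min(NICE_MAX, value))


def validate_nice(raw: str) -> int:
    """Option 1: reject out-of-range values with a clear message."""
    value = int(raw)  # raises ValueError for non-numeric input
    if not NICE_MIN <= value <= NICE_MAX:
        raise ValueError(
            f"PORTAGE_NICENESS={value} is outside the valid range "
            f"[{NICE_MIN}, {NICE_MAX}]"
        )
    return value
```

Capping keeps existing configurations with PORTAGE_NICENESS=20 working, while validation makes the misconfiguration visible to the user.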
Should be fixed with https://github.com/gentoo/portage/pull/727/commits/deeb59a5256f3fcc1299fdc8a4b861ecc5ffdf74
(In reply to Zac Medico from comment #4)
> Had this error reported in #gentoo-portage:
>
> > # PORTAGE_NICENESS=20 emerge -av --depclean
> > OSError: [Errno 22] Invalid argument
> >
> > During handling of the above exception, another exception occurred:
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/python-exec/python3.9/emerge", line 51, in <module>
> >     retval = emerge_main()
> >   File "/usr/lib/python3.9/site-packages/_emerge/main.py", line 1319, in emerge_main
> >     return run_action(emerge_config)
> >   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2999, in run_action
> >     apply_priorities(emerge_config.target_config.settings)
> >   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2635, in apply_priorities
> >     nice(settings)
> >   File "/usr/lib/python3.9/site-packages/_emerge/actions.py", line 2672, in nice
> >     out.eerror("%s\n" % str(e))
> > OSError: [Errno 22] Invalid argument

Another interesting failure mode was posted by Guilherme Amadio in #gentoo-dev:

> ~ $ emerge --sync
> OSError: [Errno 22] Invalid argument
>
> During handling of the above exception, another exception occurred:
>
> Traceback (most recent call last):
>   File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", line 51, in <module>
>     retval = emerge_main()
>   File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py", line 1319, in emerge_main
>     return run_action(emerge_config)
>   File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2999, in run_action
>     apply_priorities(emerge_config.target_config.settings)
>   File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2635, in apply_priorities
>     nice(settings)
>   File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2672, in nice
>     out.eerror("%s\n" % str(e))
> BlockingIOError: [Errno 11] write could not complete without blocking
(In reply to Zac Medico from comment #9) > (In reply to Zac Medico from comment #4) > Another interesting failure mode was posted by Guilherme Amadio in > #gentoo-dev: > > > ~ $ emerge --sync > > OSError: [Errno 22] Invalid argument I believe that this is fixed with the new version of the patchset (available in PR #727), which writes the return value of os.nice() to /proc/self/autogroup. I've asked Amadio in #gentoo-dev to test it.
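A rough sketch of the idea in the new patchset, i.e. writing whatever os.nice() returns rather than the raw PORTAGE_NICENESS value. This is not the code from PR #727; the function name and the injectable nice_func parameter are assumptions made so the example can be exercised without changing the real process priority.

```python
import os
from pathlib import Path


def apply_nice(increment: int, autogroup_file: Path, nice_func=os.nice) -> int:
    """Apply the increment and mirror the resulting nice value to the autogroup.

    os.nice() adds `increment` to the process niceness and returns the new
    value. As observed in this thread, an out-of-range request is capped,
    so the returned value is always one the autogroup file accepts.
    """
    new_value = nice_func(increment)
    autogroup_file.write_text(str(new_value))
    return new_value
```

With PORTAGE_NICENESS=20 this would write 19 to the autogroup file instead of the invalid 20.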
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=055abe523c2c3f6c8f1dccfb53565209222f90c1 commit 055abe523c2c3f6c8f1dccfb53565209222f90c1 Author: Florian Schmaus <flo@geekplace.eu> AuthorDate: 2021-03-21 11:07:38 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-14 21:20:06 +0000 PORTAGE_NICENESS: Consider autogroup scheduling With Linux's autogroup scheduling feature (CONFIG_SCHED_AUTOGROUP) setting a nice value on a per-process base has only an effect for scheduling decisions relative to the other threads in the same session (typically: the same terminal window). See the section "The nice value and group scheduling" in the sched(7) man page. Basically this means that portage "just" setting the nice value, has no effect in presence of autogroup scheduling being active (which is probably true for most (desktop) user systems). This commit changes emerge to set the autogroup's nice value, instead of the processes' nice value, in case autogroups are present (detected by the existence of /proc/self/autogroup). The tricky part about autogroup nice values is that we want restore the orignal nice value once we are finished. As otherwise, the session, e.g. your terminal, would continue using this value, and so would subsequently executed processes. For that we use Python's atexit functinaly, to register a function that will restore the orignal nice value of the autogroup. Users may have set PORTAGE_NICENESS to a value outside of the range of valid nice values [-20, 19]. Calling os.nice() with such a value will simply cap the process's nice value, but writing this invalid value to the autogoup pseudo-file will fail with "Invalid argument". Since os.nice() returns the current nice value, we simply use the returned value to set the autogroup nice value. Portage would previously always change the nice value to zero, even if the user did not explicitly request so. 
Now we do not change the nice value unless requested. Closes: https://github.com/gentoo/portage/pull/727 Bug: https://bugs.gentoo.org/777492 Signed-off-by: Florian Schmaus <flo@geekplace.eu> Signed-off-by: Zac Medico <zmedico@gentoo.org> lib/_emerge/actions.py | 48 +++++++++++++++++++++++++++++++++++++++++++++--- man/make.conf.5 | 10 +++++++++- 2 files changed, 54 insertions(+), 4 deletions(-)
With the current version of https://github.com/gentoo/portage/pull/727 (0bc947e), I get the following:

$ emerge -1p portage # no setting, returns early with nice="do-not-change"

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] sys-apps/portage-3.0.20-r1

$ nice -n -10 -- emerge -1p portage # fails to set niceness, returns early with nice="do-not-change"
nice: cannot set niceness: Permission denied

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] sys-apps/portage-3.0.20-r1

$ nice -n 10 -- emerge -1p portage # still returns early with nice="do-not-change"

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] sys-apps/portage-3.0.20-r1

$ env PORTAGE_NICENESS=-1 emerge -1p portage # debug: autogroup_nice_value=0
 * Failed to change nice value to -1
 * [Errno 1] Operation not permitted
PermissionError: [Errno 1] Operation not permitted

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", line 51, in <module>
    retval = emerge_main()
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py", line 1319, in emerge_main
    return run_action(emerge_config)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 3015, in run_action
    apply_priorities(emerge_config.target_config.settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2635, in apply_priorities
    nice(settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2689, in nice
    out.eerror("%s\n" % str(e))
PermissionError: [Errno 1] Operation not permitted
====================================
Error in portage.process.run_exitfuncs
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2680, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
[Errno 11] write could not complete without blocking
====================================
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2680, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
BlockingIOError: [Errno 11] write could not complete without blocking

$ env PORTAGE_NICENESS=1 emerge -1p portage # debug: current_nice_value=1, autogroup_nice_value=0
OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", line 51, in <module>
    retval = emerge_main()
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py", line 1319, in emerge_main
    return run_action(emerge_config)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 3015, in run_action
    apply_priorities(emerge_config.target_config.settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2635, in apply_priorities
    nice(settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2689, in nice
    out.eerror("%s\n" % str(e))
BlockingIOError: [Errno 11] write could not complete without blocking
====================================
Error in portage.process.run_exitfuncs
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2680, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
[Errno 11] write could not complete without blocking
====================================
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2680, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
BlockingIOError: [Errno 11] write could not complete without blocking
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c9af8d3c9f5854407f1306c3adaceee1bb4ac07c commit c9af8d3c9f5854407f1306c3adaceee1bb4ac07c Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-14 21:34:41 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-14 21:44:22 +0000 sys-apps/portage: 3.0.20-r2 revbump Update the patch for bug 777492. See: https://github.com/gentoo/portage/pull/727 Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/785484 Package-Manager: Portage-3.0.20, Repoman-3.0.3 Signed-off-by: Zac Medico <zmedico@gentoo.org> sys-apps/portage/Manifest | 1 + .../portage/{portage-3.0.20-r1.ebuild => portage-3.0.20-r2.ebuild} | 6 ++++-- 2 files changed, 5 insertions(+), 2 deletions(-)
Just as a relevant piece of information, on this system (a CentOS 7 where I have the prefix installed), autogroup scheduling is disabled:

$ sysctl kernel.sched_autogroup_enabled
kernel.sched_autogroup_enabled = 0

If I enable it by hand and try again, I only get a failure when using negative niceness:

$ nice -n -10 emerge -1p portage # fails only to set niceness, takes early-return path with nice=do-not-change
nice: cannot set niceness: Permission denied

These are the packages that would be merged, in order:

Calculating dependencies... done!

$ env PORTAGE_NICENESS=-10 emerge -1p portage
 * Failed to change nice value to -10
 * [Errno 1] Operation not permitted
PermissionError: [Errno 1] Operation not permitted

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", line 51, in <module>
    retval = emerge_main()
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py", line 1319, in emerge_main
    return run_action(emerge_config)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 3011, in run_action
    apply_priorities(emerge_config.target_config.settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2635, in apply_priorities
    nice(settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2684, in nice
    out.eerror("%s\n" % str(e))
PermissionError: [Errno 1] Operation not permitted
I think the patch should be updated to check if autogroup scheduling is actually enabled before trying to write PORTAGE_NICENESS to /proc/self/autogroup.
(In reply to Guilherme Amadio from comment #15) > I think the patch should be updated to check if autogroup scheduling is > actually enabled before trying to write PORTAGE_NICENESS to > /proc/self/autogroup. We want to handle an error after the fact in order to avoid TOCTOU races as discussed in #gentoo-dev. Thanks @dwfreed for the TOCTOU reference: https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use
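To illustrate the "handle the error after the fact" approach: instead of checking a sysctl first and then writing (the state may change in between), simply attempt the write and treat any OSError as "could not set". This is a minimal sketch with a hypothetical function name, not the portage code.

```python
from pathlib import Path


def try_set_autogroup_nice(
    value: int, autogroup_file: Path = Path("/proc/self/autogroup")
) -> bool:
    """Attempt the write directly; report failure instead of pre-checking."""
    try:
        # "r+" never creates the file, so a missing autogroup file
        # surfaces as FileNotFoundError (an OSError subclass).
        with autogroup_file.open("r+") as f:
            f.write(str(value))
    except OSError:
        # Covers PermissionError, BlockingIOError and plain EINVAL alike.
        # The write is flushed when the "with" block exits, still inside "try".
        return False
    return True
```

The caller can then decide whether a False result deserves a warning or can be ignored silently.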
(In reply to Guilherme Amadio from comment #15) > I think the patch should be updated to check if autogroup scheduling is > actually enabled before trying to write PORTAGE_NICENESS to > /proc/self/autogroup. The code already does check if CONFIG_SCHED_AUTOGROUP is enabled by silently bailing out if /proc/self/autogroup does not exist.
The file /proc/self/autogroup is there whether autogroup scheduling is enabled or not, so just trying to open the file is not a good way to detect whether it's enabled. This is wrong:

    autogroup_file = Path("/proc/self/autogroup")
    try:
        f = autogroup_file.open("r+")
    except EnvironmentError:
        # Autogroup scheduling is not enabled on this system.
        return

Opening succeeds when autogroup scheduling is disabled, so that comment is quite misleading. The error happens only when you try to write to the file as a normal user (e.g. on prefix), and you get the error I posted in the previous comment. I think you need something like this to avoid problems:

    def nice(settings):
        nice_value: str = settings.get("PORTAGE_NICENESS", "")
        if not nice_value:
            return

        sched_autogroup = Path("/proc/sys/kernel/sched_autogroup_enabled")
        try:
            with sched_autogroup.open("r") as f:
                sched_autogroup_enabled = int(f.readline())
            if not sched_autogroup_enabled:
                return
        except EnvironmentError:
            # Failed to check if autogroup scheduling is enabled
            return

        try:
            current_nice_value = os.nice(...)
        (remainder of the function as before ...)

With this, things work fine for me when autogroup scheduling is disabled. Also, the exception that happens in the lambda registered to be called at exit is not caught in the current version of the code, and it probably should be, because it is the write to the file that raises the exception when autogroup scheduling is disabled.
(In reply to Florian Schmaus from comment #17)
> (In reply to Guilherme Amadio from comment #15)
> > I think the patch should be updated to check if autogroup scheduling is
> > actually enabled before trying to write PORTAGE_NICENESS to
> > /proc/self/autogroup.
>
> The code already does check if CONFIG_SCHED_AUTOGROUP is enabled by silently
> bailing out if /proc/self/autogroup does not exist.

This check is wrong, because CONFIG_SCHED_AUTOGROUP can be enabled, and the file will exist and open just fine, but the feature can actually be disabled via sysctl, which causes the problem I reported. That's why I'm saying you need to check explicitly whether it's enabled, not by opening the file, because the file is there even when the feature is disabled.
(In reply to Guilherme Amadio from comment #19) > This check is wrong, because CONFIG_SCHED_AUTOGROUP can be enabled, and the > file will exist, and it will open just fine, but the feature can be actually > disabled via sysctl Fair point. Based on your code I came up with https://github.com/gentoo/portage/pull/728/commits/6a62b5a47194002b29da580339eb6daddbcef298
Thanks, that's an improvement. When autogroup scheduling is disabled, it works. However, when it's enabled, I still get this:

$ env PORTAGE_NICENESS=-10 emerge -1p portage # operation not permitted, fine
 * Failed to change nice value to -10
 * [Errno 1] Operation not permitted

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] sys-apps/portage-3.0.20-r2

$ env PORTAGE_NICENESS=10 emerge -1p portage # unhandled exceptions, not fine
OSError: [Errno 22] Invalid argument

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", line 51, in <module>
    retval = emerge_main()
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py", line 1319, in emerge_main
    return run_action(emerge_config)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 3023, in run_action
    apply_priorities(emerge_config.target_config.settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2635, in apply_priorities
    nice(settings)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2693, in nice
    portage.atexit_register(
BlockingIOError: [Errno 11] write could not complete without blocking
====================================
Error in portage.process.run_exitfuncs
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2694, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
[Errno 11] write could not complete without blocking
====================================
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/process.py", line 193, in run_exitfuncs
    func(*targs, **kargs)
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py", line 2694, in <lambda>
    lambda value: autogroup_file.write_text(value),
  File "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/pathlib.py", line 1256, in write_text
    return f.write(data)
BlockingIOError: [Errno 11] write could not complete without blocking

So, you need to list BlockingIOError in the except block, as on prefix we are a regular user and cannot write to the file even when autogroup scheduling is enabled. Also, if possible, you should only register the function to be called at exit if you actually changed the value. Right now there are two failures: one when the initial write to the file fails, and another when the write is attempted again from atexit._run_exitfuncs (which doesn't catch the exception).
Never mind, this prefix system is on a 2.6.32 kernel, so for me just echoing 10 to /proc/self/autogroup fails even as root and with the feature enabled. I guess the error above will only happen on prefix systems with such old kernels. The current version of the code should be fine to merge then.
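Independent of the old-kernel issue, the earlier point about registering the exit handler only after a successful write, and guarding the handler itself, could look roughly like this. It is a sketch only: it uses the standard library's atexit.register as a stand-in for portage.atexit_register, and the function names are hypothetical.

```python
import atexit
from pathlib import Path


def write_autogroup(path: Path, value: int) -> bool:
    """Write a nice value to the autogroup file; never raises OSError."""
    try:
        path.write_text(str(value))
    except OSError:
        return False
    return True


def set_with_restore(
    path: Path, new_value: int, old_value: int, register=atexit.register
) -> bool:
    """Set the autogroup nice value and schedule a restore only on success."""
    if not write_autogroup(path, new_value):
        # Nothing was changed, so there is nothing to restore at exit.
        return False
    # The restore goes through the same guarded helper, so a failure at
    # exit time does not produce a second traceback.
    register(write_autogroup, path, old_value)
    return True
```

This avoids the double failure described above: a failed initial write no longer leaves behind a handler that fails again at exit.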
I just tried on another prefix where the kernel is a bit newer (3.10), and there I don't see any problem writing to /proc/self/autogroup whether the feature is enabled or not. If the error I reported only happens on ancient kernels, I guess we can keep the version without the extra check I suggested in the end, to avoid the race condition mentioned by zmedico. As long as nothing is tried unless PORTAGE_NICENESS is set, it should be ok. It may just be worth adding a comment somewhere stating that for very old kernels you may not be able to set PORTAGE_NICENESS because the autogroup scheduling feature is just not functional.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=797c0e1ba74de82ff9ddf5aee2f7dd05eb8fe0a9 commit 797c0e1ba74de82ff9ddf5aee2f7dd05eb8fe0a9 Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-17 17:29:38 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-17 17:37:06 +0000 Revert "sys-apps/portage: 3.0.20-r2 revbump" This reverts commit c9af8d3c9f5854407f1306c3adaceee1bb4ac07c due to another regression. See: https://github.com/gentoo/portage/pull/728 Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/785484 Package-Manager: Portage-3.0.20, Repoman-3.0.3 Signed-off-by: Zac Medico <zmedico@gentoo.org> sys-apps/portage/Manifest | 1 - .../{portage-3.0.20-r2.ebuild => portage-3.0.20-r1.ebuild} | 8 ++++---- 2 files changed, 4 insertions(+), 5 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=03520f0ac680d6af62176beb4a072750c11c0b49 commit 03520f0ac680d6af62176beb4a072750c11c0b49 Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-17 17:43:49 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-17 17:45:00 +0000 Revert "PORTAGE_NICENESS: Consider autogroup scheduling" This reverts commit 055abe523c2c3f6c8f1dccfb53565209222f90c1 due to another regression. See: https://github.com/gentoo/portage/pull/728 Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/785484 Signed-off-by: Zac Medico <zmedico@gentoo.org> lib/_emerge/actions.py | 48 +++--------------------------------------------- man/make.conf.5 | 10 +--------- 2 files changed, 4 insertions(+), 54 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=2b0d1682428d1f718d6556c0b06da02768b9494e commit 2b0d1682428d1f718d6556c0b06da02768b9494e Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-18 06:52:24 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-18 06:55:44 +0000 sys-apps/portage: 3.0.20-r3 revbump Reverting the remaining patch from bug 777492 due to a reported regression. Reported-by: Joonas Niilola <juippis@gentoo.org> Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/785484 Package-Manager: Portage-3.0.20, Repoman-3.0.3 Signed-off-by: Zac Medico <zmedico@gentoo.org> sys-apps/portage/Manifest | 1 + .../portage/{portage-3.0.20-r1.ebuild => portage-3.0.20-r3.ebuild} | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=8e47286b7082aac21fe25402a1f9d03db968cd30 commit 8e47286b7082aac21fe25402a1f9d03db968cd30 Author: Zac Medico <zmedico@gentoo.org> AuthorDate: 2021-06-18 06:58:10 +0000 Commit: Zac Medico <zmedico@gentoo.org> CommitDate: 2021-06-18 06:59:17 +0000 Revert "pid-ns-init: Carry the autogroup's nice value into the new session" This reverts commit 209be9a8bee13384dd04a4762436b4c2a5e35bc6 due to another regression. Reported-by: Joonas Niilola <juippis@gentoo.org> Bug: https://bugs.gentoo.org/777492 Bug: https://bugs.gentoo.org/785484 Signed-off-by: Zac Medico <zmedico@gentoo.org> bin/pid-ns-init | 22 +--------------------- 1 file changed, 1 insertion(+), 21 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/portage.git/commit/?id=875c8dbcc6e9c98d289ec1869c61fbcc4da5864c commit 875c8dbcc6e9c98d289ec1869c61fbcc4da5864c Author: Florian Schmaus <flow@gentoo.org> AuthorDate: 2021-06-29 08:54:19 +0000 Commit: Michał Górny <mgorny@gentoo.org> CommitDate: 2021-08-18 16:52:30 +0000 pid-ns-init: Consider autogroup scheduling With Linux's autogroup scheduling feature (CONFIG_SCHED_AUTOGROUP) setting a nice value on a per-process base has only an effect for scheduling decisions relative to the other threads in the same session (typically: the same terminal window). See the section "The nice value and group scheduling" in the sched(7) man page. Basically this means that portage "just" setting the nice value, has no effect in presence of autogroup scheduling being active (which is probably true for most (desktop) user systems). This commit changes pid-ns-init to set the autogroup's nice value in case autogroups are present (detected by the existence of /proc/self/autogroup). My initial attempt to consider autogroup scheduling revolved around nice() in actions.py setting the autogroup nice value and restoring the original value with an atexit handler. See 055abe523c2c ("PORTAGE_NICENESS: Consider autogroup scheduling"). However this is fragile if the performing process is unprivileged (think of a user calling "ebuild foo-1.0.0.ebuild manifest") as Linux employs a rate limiting to autogroup changes by unprivileged processes [1]. Eventually this means portage can only reliable set the autogroup value within the pid-ns-init helper, where a new session is created. We only set the autogroup value within the new session, which relieves portage from restoring the original value, as the autogroup will cease to exist once the session exists, i.e. with the termination of the pid-ns-init helper. Note that the pid-ns-init helper is an optional portage feature 'pid-sandbox'. 
Only if this is enabled, portage will set the autogroup's nice value. 1: https://github.com/torvalds/linux/blob/fd0aa1a4567d0f09e1bfe367a950b004f99ac290/kernel/sched/autogroup.c#L226-L227 Bug: https://bugs.gentoo.org/777492 Signed-off-by: Florian Schmaus <flow@gentoo.org> Closes: https://github.com/gentoo/portage/pull/728 Signed-off-by: Michał Górny <mgorny@gentoo.org> bin/pid-ns-init | 20 +++++++++++++++++++- man/make.conf.5 | 5 ++++- 2 files changed, 23 insertions(+), 2 deletions(-)
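As a side note on the tracebacks earlier in this bug: Python maps errno values to OSError subclasses, so the three failures seen here (errno 1, 11 and 22) show up as PermissionError, BlockingIOError and plain OSError. The sketch below assumes that the kernel's rate limit referenced in [1] is reported to user space as EAGAIN (errno 11), which would match the BlockingIOError above; the category labels are made up for illustration.

```python
import errno


def classify_autogroup_error(exc: OSError) -> str:
    """Map an OSError from writing /proc/self/autogroup to a rough cause."""
    if exc.errno == errno.EAGAIN:
        # Assumed to be the rate limit for unprivileged autogroup changes.
        return "rate-limited"
    if exc.errno == errno.EPERM:
        # E.g. an unprivileged process requesting a negative nice value.
        return "not-permitted"
    if exc.errno == errno.EINVAL:
        # E.g. a value outside [-20, 19].
        return "invalid-argument"
    return "other"
```

Since all three are OSError subclasses, a single `except OSError` around the write is enough to catch them; the classification only matters for the message shown to the user.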