Created attachment 883433
custom openrc service for nvidia-powerd

On both Linux & Windows laptops, since version 530, Nvidia has disabled manual control of the power limit via nvidia-smi:

```
>>> sudo nvidia-smi -pl=80
Changing power management limit is not supported for GPU: 00000000:01:00.0.
Treating as warning and moving on.
All done.
```

See this bug report for more info: <https://github.com/NVIDIA/open-gpu-kernel-modules/issues/483>

In my case, I am limited to 60W where previously some of my workloads could use close to 100W.

The solution is nvidia-powerd. Starting nvidia-powerd as root allows the GPU to increase its own power limit. Not as good as manual control, but still allows full utilization of the laptop GPU.

`nvidia-powerd.service` is usable for systemd-based installs of Gentoo. However, there is no equivalent for OpenRC-based installs. I hope that `nvidia-powerd` can be provided for OpenRC as well.

Attached is the service definition I'm using for now. I tested using `watch -n1 nvidia-smi` to monitor the power usage & limit as I enabled & disabled the service.
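For anyone repeating the test described above, a narrower query than the full `nvidia-smi` table may make a change in the limit easier to spot. This is a minimal sketch, assuming `nvidia-smi` is in PATH; the helper name `gpu_power` is hypothetical, while `power.draw` and `power.limit` are query fields listed by `nvidia-smi --help-query-gpu`.

```shell
#!/bin/sh
# Print the current power draw and power limit, one CSV line per GPU.
# gpu_power is a hypothetical helper name, not part of any NVIDIA tool.
gpu_power() {
	if ! command -v nvidia-smi >/dev/null 2>&1; then
		echo "nvidia-smi not found in PATH" >&2
		return 1
	fi
	nvidia-smi --query-gpu=power.draw,power.limit --format=csv,noheader
}
```

Calling `gpu_power` under load, once with the service stopped and once with it started, should show whether the reported limit moves.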
Created attachment 883434
nvidia-powerd.initd

Can you try this simplified script instead?

As far as I can tell (similarly to nvidia-persistenced), nvidia-powerd is supposed to be able to create its own pidfile and fork to the background without command_background=true. I'm unsure whether you already tried that, found it failing, and added the setting for that reason.

***Note that if this fails, it may sit in the foreground and prevent the boot from continuing.

Unfortunately I cannot test this myself given powerd will exit after detecting an unsupported configuration, but I do see:

$ strings /opt/bin/nvidia-powerd | grep nvidia-powerd.pid
/var/run/nvidia-powerd.pid

Also, I don't think this can take arguments? If so, command_args/confd can be skipped.
(In reply to Ionen Wolkens from comment #1)
> Created attachment 883434 [details]
> nvidia-powerd.initd
>
> Can you try this simplified script instead?
>
> As far as I can tell (similarly to nvidia-persistenced), nvidia-powerd is
> supposed to be able to create its own pidfile and fork the background
> without command_background=true. Unsure if you've already tried and failed,
> and thus added it.
>
> ***Note that in the event this fails, it may sit there in foreground and may
> prevent continuing with booting.
>
> Unfortunately cannot test myself given powerd will exit after detecting an
> unsupported configuration, but I do see:
>
> $ strings /opt/bin/nvidia-powerd | grep nvidia-powerd.pid
> /var/run/nvidia-powerd.pid
>
> Also don't think this can take arguments? So can skip command_args/confd.

Indeed, unlike `nvidia-persistenced`, `nvidia-powerd` doesn't background automatically, so I had to add `command_background=true`. You're also right that as far as I can tell, it doesn't take any arguments.
Thanks, I'll do it like you did then.
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8c3f7ffc9f5a88869bc70150eddf8465c8d5c70d

commit 8c3f7ffc9f5a88869bc70150eddf8465c8d5c70d
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-01-28 12:15:42 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-01-28 16:32:24 +0000

    x11-drivers/nvidia-drivers: add nvidia-powerd openrc script

    Untested given requires specific hardware to even start the daemon
    which I do not have. Please report if any issues.

    Not worth revbumps, can let it propagate with rebuilds during
    kernel upgrades.

    Use /var/run rather than /run given nvidia hardcodes path to the
    pid file and /run may be incorrect if /var/run is not a symlink.
    Albeit with command_background=true openrc is technically the one
    handling it (may avoid duplicates, again can't test).

    Closes: https://bugs.gentoo.org/923117
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 x11-drivers/nvidia-drivers/files/nvidia-powerd.initd          | 11 +++++++++++
 x11-drivers/nvidia-drivers/nvidia-drivers-525.147.05.ebuild   |  1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.146.02.ebuild   |  1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.154.05.ebuild   |  1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.43.23.ebuild    |  1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-545.29.06-r1.ebuild |  1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-550.40.07.ebuild    |  1 +
 7 files changed, 17 insertions(+)
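For readers without access to the attachments, a service definition along the lines discussed in this thread could look like the sketch below. It is assembled only from details mentioned in the comments (the `/opt/bin/nvidia-powerd` path, the hardcoded `/var/run/nvidia-powerd.pid`, `command_background=true`, and `need dbus`) and is not necessarily identical to the committed `nvidia-powerd.initd`.

```shell
#!/sbin/openrc-run
# Sketch only, reconstructed from the discussion; not a copy of the committed file.

command="/opt/bin/nvidia-powerd"
# Per the reporter's testing, nvidia-powerd does not background itself,
# so OpenRC is asked to background it and manage the pidfile.
command_background=true
# /var/run rather than /run, matching the path hardcoded in the binary.
pidfile="/var/run/nvidia-powerd.pid"

depend() {
	# The daemon uses dbus, so require the dbus service to be started first.
	need dbus
}
```

On an OpenRC system a file like this would typically live at `/etc/init.d/nvidia-powerd` and be enabled with `rc-update add nvidia-powerd default`.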
Ionen, thanks!

> (only for use with specific laptops)

But is it installed on every amd64?

> need dbus

But there is no dbus dependency...
(In reply to Alexander Kurakin from comment #5)
> Ionen, thanks!
>
> > (only for use with specific laptops)
>
> But is installed on every amd64?

https://projects.gentoo.org/qa/policy-guide/installed-files.html#pg0301

> > need dbus
>
> But there is no the dbus dependency...

Note that the ebuild was already installing this before now:

insinto /usr/share/dbus-1/system.d
doins nvidia-dbus.conf

But it only has dbus in DEPEND for tools, and I don't really know much about what dbus is used for here.
(In reply to Sam James from comment #6)
> > > need dbus
> >
> > But there is no the dbus dependency...
>
> Note that the ebuild already installs before now:
> insinto /usr/share/dbus-1/system.d
> doins nvidia-dbus.conf
>
> But it does only have dbus in DEPEND for tools. But I don't really know much
> about what dbus is used for here.

I just get

> Service `nvidia-powerd' needs non existent service `dbus'

on each startup. This is a clean install with only `x11-drivers/nvidia-drivers` emerged.
Yeah, it does lack the dbus dependency with USE=-tools, which felt kind of harmless (a bit in optfeature territory), but openrc now makes that more annoying. It's kind of unlikely to be missing in general, but I could see this happening on a headless cuda server, not counting "these" users.

I admit it is kinda tempting to switch to a hard dependency, because three things can dlopen() libdbus-1.so.3 and it used to cause message spam when it's missing (even if not entirely broken). I noticed that before with 32bit because I didn't have multilib dbus. I think nvidia may or may not have improved that since, though.

$ grep -F libdbus-1.so.3 *
grep: 32/libnvidia-glcore.so.550.40.07: binary file matches
grep: 32/libnvidia-eglcore.so.550.40.07: binary file matches
grep: libnvidia-eglcore.so.550.40.07: binary file matches
grep: libnvidia-glcore.so.550.40.07: binary file matches
grep: nvidia-powerd: binary file matches

In the case of nvidia-powerd it's not optional, and I think glcore/eglcore use dbus to communicate with it to increase power usage when needed.

But well, I guess I could gate it behind USE=powerd. This may disrupt existing powerd users, but there are so few that I don't think it's worth a news item, nor worth being enabled by default (MULTILIB_USEDEP could disrupt current users instead). Hopefully they will notice the new USE and find it obvious they need to enable it.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1d31c1c2f2808bce35615c3b445c70deaa039032

commit 1d31c1c2f2808bce35615c3b445c70deaa039032
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-02-07 01:32:00 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-02-07 02:21:41 +0000

    x11-drivers/nvidia-drivers: move nvidia-powerd behind IUSE=powerd

    Mostly due to the openrc service's "need dbus" which is an annoyance
    when dbus is missing (preventing from being a runtime-only
    optfeature), but even with systemd this now also allows to properly
    check for multilib on dbus (used by 32bit glcore/eglcore libraries
    to communicate with powerd).

    *Technically* needs a revbump given --changed-use does not know it
    needs to rebuild here, but given this only cause problems for rare
    users without dbus (e.g. headless cuda servers with USE=-tools) and
    will propagate with kernel updates+rebuilds let's not bother every
    users over this.

    Hopefully users of powerd (incl. for systemd which may have more
    existing ones) will notice the new USE and enable it.

    Also re-arrange arm64 handling, it makes more sense to mask the USE
    on arm64 than keep it as a no-op by checking if use !amd64.
    Exception to this is 0/550 branch which started to include a arm64
    nvidia-powerd build (albeit do not think hardware that need this
    even exists yet). Hope did not break installation there given did
    not test on arm64, please report if so.

    Bug: https://bugs.gentoo.org/923117
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 profiles/arch/arm64/package.use.mask                  |  4 ++++
 x11-drivers/nvidia-drivers/metadata.xml               |  1 +
 .../nvidia-drivers/nvidia-drivers-525.147.05.ebuild   |  9 ++++++---
 .../nvidia-drivers/nvidia-drivers-535.146.02.ebuild   |  9 ++++++---
 .../nvidia-drivers/nvidia-drivers-535.154.05.ebuild   |  9 ++++++---
 .../nvidia-drivers/nvidia-drivers-535.43.25.ebuild    |  9 ++++++---
 .../nvidia-drivers/nvidia-drivers-545.29.06-r1.ebuild |  9 ++++++---
 .../nvidia-drivers/nvidia-drivers-550.40.07.ebuild    | 17 +++++++++++------
 8 files changed, 46 insertions(+), 21 deletions(-)
Ionen, big thanks!

> it does lack the dbus dependency with USE=-tools

Oh, sorry, my setup was actually USE="-tools -X".

P.S. I read here: https://download.nvidia.com/XFree86/Linux-x86_64/530.41.03/README/dynamicboost.html that the `cpufreq` infrastructure must be enabled.
(In reply to Alexander Kurakin from comment #10)
> P.S. Read here:
> https://download.nvidia.com/XFree86/Linux-x86_64/530.41.03/README/dynamicboost.html
> that `cpufreq` infrastructure must be enabled.

Guess there's no harm in checking for it given there's a USE to make the check conditional now.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f4f24ee482eba4cf13f895368918584f79cf8324

commit f4f24ee482eba4cf13f895368918584f79cf8324
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-02-07 09:44:30 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-02-07 09:49:53 +0000

    x11-drivers/nvidia-drivers: check for CPU_FREQ with USE=powerd

    Bug: https://bugs.gentoo.org/923117
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 x11-drivers/nvidia-drivers/nvidia-drivers-525.147.05.ebuild   | 1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.146.02.ebuild   | 1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.154.05.ebuild   | 1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-535.43.25.ebuild    | 1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-545.29.06-r1.ebuild | 1 +
 x11-drivers/nvidia-drivers/nvidia-drivers-550.40.07.ebuild    | 1 +
 6 files changed, 6 insertions(+)
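To check by hand for the option this commit tests, one can look for `CONFIG_CPU_FREQ` in the kernel configuration. The following is a minimal sketch, assuming a plain-text config file; the helper name `has_cpu_freq` and the example path are assumptions for illustration, not something the ebuild provides.

```shell
#!/bin/sh
# Succeed if the given plain-text kernel config has CONFIG_CPU_FREQ built in.
# has_cpu_freq is a hypothetical helper; $1 is the path to the config file.
has_cpu_freq() {
	grep -q '^CONFIG_CPU_FREQ=y$' "$1"
}

# Example usage (the path is an assumption; adjust to your kernel sources):
# has_cpu_freq /usr/src/linux/.config && echo "CPU_FREQ is enabled"
```

A disabled option appears in the config as the comment `# CONFIG_CPU_FREQ is not set`, which the anchored pattern deliberately does not match.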