Bug 752018

Summary: x11-drivers/nvidia-drivers-455.28 - bootup/nvidia-smi hangs indefinitely unless pm set to 1
Product: Gentoo Linux Reporter: f0o <f0o>
Component: Current packages Assignee: David Seifert <soap>
Status: RESOLVED DUPLICATE    
Severity: normal CC: ionen, jstein
Priority: Normal    
Version: unspecified   
Hardware: AMD64   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---

Description f0o 2020-10-31 10:37:38 UTC
The root of the issue is in /lib/udev/nvidia-udev.sh (x11-drivers/nvidia-drivers/files/nvidia-udev.sh) with a plain call to /opt/bin/nvidia-smi on line 12.
I altered it to call `/opt/bin/nvidia-smi -pm 1` instead, which enables persistence mode and turns the GPU back on.

Arguably, doing this in udev seems to be the wrong approach altogether. Especially when multiple GPUs are involved, you might not want the NVIDIA GPU to be online all the time.
I believe removing the file altogether might also work, but that requires the user to run `/opt/bin/nvidia-smi -pm 1` before using the GPU, for example with Xorg or CUDA.

Anyway, as a low-effort fix, adding `-pm 1` to the nvidia-smi call solved the issue for me.
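
For readers who want to see the shape of that change, here is a minimal sketch. It is not the actual contents of nvidia-udev.sh: the `NVIDIA_SMI` variable and the `enable_persistence` function are hypothetical names introduced for illustration, and the check for a missing binary is an added assumption. Only the `/opt/bin/nvidia-smi -pm 1` invocation comes from the report above.

```shell
#!/bin/sh
# Hypothetical sketch, not the real nvidia-udev.sh shipped by the ebuild.
# NVIDIA_SMI and enable_persistence are illustrative names only.

# Path to the tool as given in the report; overridable for testing.
NVIDIA_SMI="${NVIDIA_SMI:-/opt/bin/nvidia-smi}"

enable_persistence() {
    # Assumption: do nothing (and succeed) if the binary is not installed.
    [ -x "${NVIDIA_SMI}" ] || return 0

    # Instead of a plain nvidia-smi call, request persistence mode.
    "${NVIDIA_SMI}" -pm 1
}

enable_persistence
```

Running the same command by hand as root (`/opt/bin/nvidia-smi -pm 1`) should have the same effect without touching the udev helper.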

Reproducible: Sometimes

Steps to Reproduce:
1. Reboot.
2. In some cases NVIDIA power saving turns the GPU off, at which point nvidia-smi hangs forever on boot. Rebooting multiple times eventually brings it back on.
3. ???



01:00.0 3D controller: NVIDIA Corporation TU117GLM [Quadro T2000 Mobile / Max-Q] (rev a1)
        Subsystem: Dell TU117GLM [Quadro T2000 Mobile / Max-Q]
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_drm, nvidia

[ebuild   R    ] x11-drivers/nvidia-drivers-455.28:0/455::gentoo  USE="X driver kms (libglvnd) multilib tools -compat -gtk3 -static-libs -uvm -wayland" ABI_X86="32 (64) (-x32)" 0 KiB
Comment 1 David Seifert gentoo-dev 2020-11-18 18:42:10 UTC
Can you report this upstream please?
Comment 2 Ionen Wolkens gentoo-dev 2021-03-06 08:16:24 UTC
The call to nvidia-smi will be removed, and if you want to set persistence mode I believe you want the nvidia-persistenced init script (this script will also be reworked a bit, and a systemd unit is coming as well).

I'll mark this as yet another duplicate of the older bug. Even if it never caused problems for me, nvidia-udev.sh clearly caused a lot of trouble.

*** This bug has been marked as a duplicate of bug 454740 ***