Bug 693384 - x11-drivers/nvidia-drivers: (with sys-auth/elogind?) won't resume after suspend
Summary: x11-drivers/nvidia-drivers: (with sys-auth/elogind?) won't resume after suspend
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Ionen Wolkens
URL:
Whiteboard:
Keywords: PullRequest
Duplicates: 860291
Depends on:
Blocks:
 
Reported: 2019-09-03 08:54 UTC by Necktwi Ozfguah
Modified: 2024-09-16 09:59 UTC
CC List: 22 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge_info.txt,7.86 KB, text/plain)
2021-06-22 19:41 UTC, wolfwood

Description Necktwi Ozfguah 2019-09-03 08:54:14 UTC
This bug is the same as https://bugs.gentoo.org/114213 but with elogind. Resume works fine when the system is suspended with `echo mem > /sys/power/state`, but if I suspend with `loginctl suspend`, resume gives a black screen; I have to ssh in and restart xdm to get back into gnome-shell.


```
$ emerge --info
Portage 2.3.69 (python 3.6.5-final-0, default/linux/amd64/17.0/desktop/gnome, gcc-8.3.0, glibc-2.29-r2, 4.19.66-gentoo x86_64)
=================================================================
System uname: Linux-4.19.66-gentoo-x86_64-Intel-R-_Core-TM-_i5-9600K_CPU_@_3.70GHz-with-gentoo-2.6
KiB Mem:    16135080 total,  11962144 free
KiB Swap:          0 total,         0 free
Head commit of repository gentoo: 75c5d9a19576231aa822810b0dd49a7870c68697

sh bash 4.4_p23-r1
ld GNU ld (Gentoo 2.32 p2) 2.32.0
distcc 3.3.2 x86_64-pc-linux-gnu [disabled]
app-shells/bash:          4.4_p23-r1::gentoo
dev-lang/perl:            5.28.2-r1::gentoo
dev-lang/python:          2.7.15::gentoo, 3.6.5::gentoo
dev-util/cmake:           3.14.6::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.41.2::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r4::gentoo
sys-devel/automake:       1.11.6-r3::gentoo, 1.16.1-r1::gentoo
sys-devel/binutils:       2.32-r1::gentoo
sys-devel/gcc:            8.3.0-r1::gentoo
sys-devel/gcc-config:     2.0::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.19::gentoo (virtual/os-headers)
sys-libs/glibc:           2.29-r2::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: git
    sync-uri: https://github.com/gentoo/gentoo.git
    priority: -1000

crossdev
    location: /usr/local/portage-crossdev
    masters: gentoo
    priority: 10

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=native -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O2 -pipe"
GENTOO_MIRRORS="ftp://ftp.swin.edu.au/gentoo http://ftp.swin.edu.au/gentoo https://gentoo.c3sl.ufpr.br/ http://gentoo.c3sl.ufpr.br/ rsync://gentoo.c3sl.ufpr.br/gentoo/ http://gentoo.gossamerhost.com rsync://gentoo.gossamerhost.com/gentoo-distfiles/ ftp://mirrors.tera-byte.com/pub/gentoo http://gentoo.mirrors.tera-byte.com/ rsync://mirrors.tera-byte.com/gentoo ftp://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ https://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ http://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ https://mirrors.163.com/gentoo/ http://mirrors.163.com/gentoo/ https://mirrors.tuna.tsinghua.edu.cn/gentoo http://ftp.fi.muni.cz/pub/linux/gentoo/ ftp://ftp.fi.muni.cz/pub/linux/gentoo/ rsync://ftp.fi.muni.cz/pub/linux/gentoo/"
LANG="en_US.utf8"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac acl acpi alsa amd64 berkdb bluetooth branding bzip2 cairo cdda cdr cli colord crypt cups cxx dbus dri dts dvd dvdr eds egl elogind emboss encode evo exif fam flac fortran gdbm gif glamor gnome gnome-keyring gnome-online-accounts gpm gstreamer gtk iconv icu introspection ipv6 jpeg lcms ldap libnotify libsecret libtirpc mad mng mp3 mp4 mpeg multilib nautilus ncurses networkmanager nls nptl ogg opengl openmp pam pango pcre pdf png policykit ppds pulseaudio qt5 readline sdl seccomp spell split-usr ssl startup-notification svg tcpd tiff truetype udev udisks unicode upower usb vorbis wayland wxwidgets x264 xattr xcb xml xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="libinput" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" NETBEANS_MODULES="apisupport cnd groovy gsf harness ide identity 
j2ee java mobility nb php profiler soa visualweb webcommon websvccommon xml" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="intel i965 nvidia vmware" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
```
Comment 1 Necktwi Ozfguah 2019-09-03 12:43:11 UTC
I'm sorry, the related bug is https://bugs.gentoo.org/287710, not the one mentioned above.
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2019-09-04 05:54:23 UTC
It isn't even clear where the problem arises: could be a display driver or somewhere much higher up the software stack. Until you find the cause of the problem, there is no bug to fix. Please use our official support forums (IRC, mailing lists, web forums) to get support in fixing your problem, and file bug reports when you find actual bugs.
Comment 3 Necktwi Ozfguah 2019-09-04 16:00:25 UTC
It's not a bug! That's harsh! Here are the system logs from when the system is suspended with the power button and from when it is suspended with `echo mem > /sys/power/state`:

power button: https://pastebin.com/KUvVuam8
/sys/power/state: https://pastebin.com/SJkvM1q2
Comment 4 Mart Raudsepp gentoo-dev 2019-09-04 16:09:24 UTC
My understanding from the IRC support channel, where Necktwi comes from, is that the problem is with elogind too; so this is an elogind or related bug, not gnome-shell. A direct forced suspend via the kernel file echo resumes fine; loginctl suspend doesn't resume fine.
Comment 5 Mart Raudsepp gentoo-dev 2019-09-04 20:08:28 UTC
I just said it's not a gnome-shell bug. Please wrangle properly...
Comment 6 Mart Raudsepp gentoo-dev 2019-09-05 05:21:36 UTC
sigh...
Comment 7 Necktwi Ozfguah 2019-09-12 18:13:26 UTC
I've merged gnome-light with VIDEO_CARDS="intel i965 nvidia vmware virtualbox". Can this create any problem? vmware and virtualbox are there to boot the Gentoo physical partition from Windows.
Comment 8 Necktwi Ozfguah 2019-10-08 14:27:20 UTC
But then `echo mem > /sys/power/state` works!
Comment 9 Sven Eden 2019-10-09 14:30:39 UTC
(In reply to Necktwi Ozfguah from comment #8)
> But then `echo mem > /sys/power/state` works!

And that's exactly what elogind does, it writes the chosen method (See /etc/elogind/logind.conf) into /sys/power/state.

The difference is, that elogind and systemd-login tell programs that are registered, that a suspension/hibernation is due, and that they have to prepare for it.
After wakeup, both tell these programs that the machine is resumed, and that they shall resume themselves, too.

One effect of this is that NetworkManager reconnects after resume.

If waking up generally works when done by hand (meaning: without messaging registered programs), then I daresay that something is not waking up properly.

The message is either "PrepareForSleep" or "PrepareForShutdown" via dbus through org.freedesktop.login1.Manager, and it's sent either with the value 'true' before shutdown/hibernate/suspend, or 'false' after wakeup.

Something that should react to PrepareForSleep:false doesn't react as it should.
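
To check whether the signal described above is actually emitted around a suspend, the system bus can be watched while suspending. This is a minimal sketch, assuming `dbus-monitor` from sys-apps/dbus is installed and that the current user may add match rules on the system bus (root may be needed). The function names `classify_sleep_line` and `watch_sleep_signals` are made up for illustration.

```shell
# Turn the argument line that dbus-monitor prints for a PrepareForSleep
# signal ("   boolean true" / "   boolean false") into a readable message.
# Any other line produces no output.
classify_sleep_line() {
    case "$1" in
        *"boolean true"*)  echo "PrepareForSleep(true): about to sleep" ;;
        *"boolean false"*) echo "PrepareForSleep(false): resumed" ;;
    esac
}

# Watch the system bus for logind's PrepareForSleep signal.
# Blocks until interrupted with Ctrl+C.
watch_sleep_signals() {
    dbus-monitor --system \
        "type='signal',interface='org.freedesktop.login1.Manager',member='PrepareForSleep'" |
    while IFS= read -r line; do
        classify_sleep_line "$line"
    done
}
```

Run `watch_sleep_signals` in one terminal (or over ssh) and `loginctl suspend` in another; if the "resumed" line never shows up after wakeup, the signal itself is missing rather than a listener misbehaving.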
Comment 10 Sven Eden 2019-10-21 09:36:31 UTC
Another idea:

Is your whole system built with USE="-consolekit elogind -systemd"?
Comment 11 Necktwi Ozfguah 2019-10-21 12:56:26 UTC
USE="wayland egl -tracker bluetooth nls vaapi"

I've isolated the issue. It arises when I change VIDEO_CARDS="intel i965" to VIDEO_CARDS="intel i965 nvidia virtualbox vmware". I also noticed that the "gnome on Xorg" option is no longer available on the gdm login screen; now it only has the "gnome, Xsession, custom" options.

Though I changed back to VIDEO_CARDS="intel i965" and did --newuse --deep @world, the issue won't go away, and there is still no "gnome on Xorg" option on the gdm login screen.

And now I also get
```
!!! existing preserved libs:
>>> package: x11-drivers/nvidia-drivers-435.21
 *  - /usr/lib64/OpenCL/vendors/nvidia/libOpenCL.so.1
 *  - /usr/lib64/OpenCL/vendors/nvidia/libOpenCL.so.1.0.0
 *      used by /usr/lib64/libavfilter.so.7.40.101 (media-video/ffmpeg-4.1.3)
 *      used by /usr/lib64/libavutil.so.56.22.100 (media-video/ffmpeg-4.1.3)
Use emerge @preserved-rebuild to rebuild packages using these libraries
```
I don't understand why this message is shown even though nvidia-drivers has been unmerged by the --newuse @world, and it won't go away even if I do `emerge @preserved-rebuild`.
Comment 12 Necktwi Ozfguah 2019-10-21 13:05:56 UTC
```
!!! existing preserved libs:
>>> package: x11-drivers/nvidia-drivers-435.21
```

is no more after I did `eselect opencl set beignet`.

How to get back the "gnome on Xorg" option in gdm login screen?
Comment 13 Andreas Sturmlechner gentoo-dev 2019-10-21 13:06:35 UTC
this is not the Gnome on Xorg bug.
Comment 14 Necktwi Ozfguah 2019-10-21 13:15:47 UTC
But I lost the "gnome on Xorg" option only when this issue arose!

changing to VIDEO_CARDS="intel i965 nvidia virtualbox vmware" from VIDEO_CARDS="intel i965" caused this issue.
Comment 15 Andreas Sturmlechner gentoo-dev 2019-10-21 13:17:08 UTC
this bug is completely unrelated to gnome, please don't spam it further.
Comment 16 Necktwi Ozfguah 2019-10-22 15:49:28 UTC
With VIDEO_CARDS="intel i965" I tried changing profile from 17.1/desktop/gnome to 17.1 and emerge -ca. Then again changed profile to desktop/gnome and sudo emerge --deep --with-bdeps=y --changed-use --update --ask --verbose -k @world. but this didn't fix the issue.

https://pastebin.com/4Nws3qC3 is the system log from suspend to resume.
Comment 17 Necktwi Ozfguah 2019-10-22 18:39:04 UTC
I have installed a fresh Gentoo.
Suspend worked fine when VIDEO_CARDS="intel i965".
but stopped working when I used VIDEO_CARDS="intel i965 nvidia" and emerge --newuse
Comment 18 Necktwi Ozfguah 2019-10-24 01:57:35 UTC
blacklisting nvidia, nvidia_modeset, nvidia_drm fixed this issue.
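
For anyone wanting to repeat this test, the blacklist can be written as a modprobe.d fragment. A minimal sketch: the file name `blacklist-nvidia.conf` and the helper name `write_nvidia_blacklist` are assumptions, only the three module names come from the comment above. Note that `blacklist` only stops automatic loading by alias; an explicit `modprobe nvidia` still works.

```shell
# Write a modprobe.d fragment that blacklists the three modules named in
# the comment above. $1 is the target file; the default name is an assumption.
write_nvidia_blacklist() {
    conf="${1:-/etc/modprobe.d/blacklist-nvidia.conf}"
    for module in nvidia nvidia_modeset nvidia_drm; do
        printf 'blacklist %s\n' "$module"
    done > "$conf"
}
```

Calling `write_nvidia_blacklist` as root and rebooting should leave the modules unloaded, which can be checked with `lsmod | grep nvidia`.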
Comment 19 Sven Eden 2019-11-27 19:13:07 UTC
(In reply to Necktwi Ozfguah from comment #17)
> I have installed a fresh Gentoo.
> Suspend worked fine when VIDEO_CARDS="intel i965".
> but stopped working when I used VIDEO_CARDS="intel i965 nvidia" and emerge
> --newuse

On my laptop I have:
VIDEO_CARDS="fbdev intel i965 nvidia"

On my desktop I have:
VIDEO_CARDS="fbdev nvidia"

Both resume fine from suspension, both from console and from Plasma started via SDDM.
I did not try out Wayland, though.

If "Gnome on Xorg" is missing, I suppose you are on Wayland? Is your system built with USE="wayland" activated?

I am asking, because all three of your logs show the same workflow... There is no hint about anything being "forgotten" when suspending using elogind.
...which would cause me to wonder anyway, as elogind does _nothing_ different.
Comment 20 Necktwi Ozfguah 2019-11-27 19:24:43 UTC
Yes, I have USE="wayland", but on the gdm login screen I choose "Gnome on Xorg", and it falls back to Xorg anyway due to the nvidia card.
Comment 21 Necktwi Ozfguah 2019-12-02 13:02:31 UTC
I've removed the wayland USE flag and did emerge --newuse. It remerged nvidia-drivers along with other packages, but it didn't fix the issue. I also tried setting
options nvidia NVreg_DynamicPowerManagement=0x02
as per http://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/dynamicpowermanagement.html but it didn't work.
Comment 22 Andreas Sturmlechner gentoo-dev 2019-12-02 13:12:09 UTC
(In reply to Necktwi Ozfguah from comment #17)
> Suspend worked fine when VIDEO_CARDS="intel i965".
> but stopped working when I used VIDEO_CARDS="intel i965 nvidia" and emerge
> --newuse
(In reply to Necktwi Ozfguah from comment #18)
> blacklisting nvidia, nvidia_modeset, nvidia_drm fixed this issue.
Per this information I'm not sure what elogind is supposed to do in $summary.

Adding nvidia-drivers maintainer and keeping myself in CC for the time being.
Comment 23 Sven Eden 2019-12-05 07:57:42 UTC
(In reply to Necktwi Ozfguah from comment #18)
> blacklisting nvidia, nvidia_modeset, nvidia_drm fixed this issue.

Oh! Now I get it! You have a hybrid-laptop, right?

Yes, there are countless threads over several years that the nvidia-drivers cause trouble on resuming from suspend and/or hibernate.

I can replicate your problem easily by starting anything with primusrun/optirun and then suspending. Simple solution: Don't do that.

Maybe you could add hook scripts to unload the nvidia module(s) prior to suspending/hibernating and reload them after wakeup. But that would kill anything running that uses the nvidia drivers.

This *seems* to be a problem only for hybrid laptops without a MUX, like mine. On my nvidia-based desktop, suspend/hibernate work without any issue.

Absolutely not elogind related, and a long known problem with the proprietary nvidia drivers on linux.
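
The hook-script idea mentioned above could look roughly like this. It is a sketch under assumptions, not a tested fix: the function name `nvidia_module_hook`, the `run` helper and the `DRY_RUN` switch are invented for illustration, and only the three module names from comment #18 are handled (an additionally loaded `nvidia_uvm` would have to be removed first). Unloading fails while anything still uses the GPU.

```shell
# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 is "pre" before sleeping and "post" after wakeup.
nvidia_module_hook() {
    case "${1-}" in
        pre)
            # Unload in reverse dependency order.
            run modprobe -r nvidia_drm nvidia_modeset nvidia
            ;;
        post)
            # Loading nvidia_drm pulls in the modules it depends on.
            run modprobe nvidia_drm
            ;;
        *)
            return 1
            ;;
    esac
}
```

To try it as a hook, the functions would go into an executable file under /lib64/elogind/system-sleep/ with a final line `nvidia_module_hook "$@"`. Running it once with `DRY_RUN=1` shows what would be executed without touching the modules.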
Comment 24 Necktwi Ozfguah 2019-12-06 04:04:40 UTC
But remember **I was able to do a proper resume if I suspend with `echo mem > /sys/power/state`.**

Mine is not a laptop and i'm not running anything with primusrun/optirun. But recently I was able to offload render applications to Nvidia card.
Comment 25 Mi Yu 2019-12-24 06:27:07 UTC
I can reproduce this with nvidia-drivers and elogind with the same behavior: `echo mem > /sys/power/state` works, but `loginctl suspend` resumes to a blinking cursor and then to a black screen. I'm using a plain xorg-server setup with `exec bspwm` in xinitrc, so it should not be related to gnome-shell, but has to do either with nvidia-drivers or elogind. I am running a Quadro P2000 Mobile on my first-gen ThinkPad P1. My package versions:

```
$ emerge -pv gentoo-sources nvidia-drivers elogind xorg-server

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] sys-kernel/gentoo-sources-5.4.6:5.4.6::gentoo  USE="-build -experimental -symlink" 0 KiB
[ebuild   R    ] sys-auth/elogind-241.4::gentoo  USE="acl pam -debug -doc -policykit (-selinux)" 0 KiB
[ebuild   R    ] x11-base/xorg-server-1.20.6:0/1.20.6::gentoo  USE="elogind ipv6 libressl suid udev xorg xvfb -debug -dmx -doc -kdrive -libglvnd -minimal (-selinux) -static-libs -systemd -unwind -wayland -xcsecurity -xephyr -xnest" 0 KiB
[ebuild   R    ] x11-drivers/nvidia-drivers-440.44-r1:0/440::gentoo  USE="X acpi driver kms multilib tools -compat -gtk3 -libglvnd -static-libs -uvm -wayland" ABI_X86="(64) -32 (-x32)" 0 KiB

Total: 4 packages (4 reinstalls), Size of downloads: 0 KiB
```

Per NVIDIA's documentation [1], I've tried installing a suspend hook for elogind:

```
$ cat /lib64/elogind/system-sleep/20-nvidia 
#!/bin/sh
case "${1-}" in
    'pre')
        logger -t "elogind" -s "/usr/bin/nvidia-sleep.sh suspend"
        /usr/bin/nvidia-sleep.sh suspend
        logger -t "elogind" -s "/usr/bin/nvidia-sleep.sh: done"
        ;;

    'post')
        logger -t "elogind" -s "/usr/bin/nvidia-sleep.sh resume"
        /usr/bin/nvidia-sleep.sh resume
        logger -t "elogind" -s "/usr/bin/nvidia-sleep.sh: done"
        ;;

    *)
        exit 1
        ;;
esac
```

/usr/bin/nvidia-sleep.sh is unpacked from NVIDIA's official drivers, which essentially does `echo suspend > /proc/driver/nvidia/suspend`. From syslog I can verify that during both suspend and resume the script executes successfully, but I still have a black screen. `cat /sys/power/mem_sleep` shows that my suspend mode is "deep," which shouldn't affect the NVIDIA card.
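
The kernel lists all supported suspend modes in /sys/power/mem_sleep and marks the active one with square brackets. A small sketch for extracting it, so it can be included in reports; the function name `active_sleep_mode` is made up for illustration.

```shell
# $1: contents of /sys/power/mem_sleep, e.g. "s2idle [deep]".
# Prints the mode inside the square brackets, or nothing if there is none.
active_sleep_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}
```

Usage: `active_sleep_mode "$(cat /sys/power/mem_sleep)"`.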

I've also disabled all other power management tools except NVIDIA's built-in PRIME renderer offload [2]. I suspect this might have caused the issue, but a similar setup on Arch Linux seems to work fine. I'll test with a non-PRIME setup and/or systemd once I get the chance.

[1]: https://download.nvidia.com/XFree86/Linux-x86_64/440.44/README/powermanagement.html
[2]: https://download.nvidia.com/XFree86/Linux-x86_64/440.44/README/primerenderoffload.html
Comment 26 consus 2020-01-24 16:11:57 UTC
I can confirm this. Writing "mem" directly to /sys/power/state works great, "loginctl suspend" resumes to a black screen. However, if I run "loginctl suspend" from a tty (it does not matter whether X is running or not, what matters is running loginctl from a tty) everything is great: no problems, resuming to a tty works, switching back to X works.

Hardware: Matebook X Pro (2018)
Kernel: 5.4.13 (vanilla-kernel-bin)

$ qlist -Iv | grep -E '/(vanilla-kernel|i3|elogind)'
sys-auth/elogind-241.4
sys-kernel/vanilla-kernel-bin-5.4.13
sys-kernel/vanilla-kernel-bin-5.4.10-r1
x11-wm/i3-gaps-4.16.1-r2

$ lsmod | grep -E '(intel|nvidia|nouveau)'
intel_rapl_msr         20480  0
intel_powerclamp       20480  0
intel_cstate           16384  0
intel_uncore          114688  0
intel_rapl_perf        16384  0
intel_wmi_thunderbolt    20480  0
intel_lpss_pci         20480  2
intel_lpss             16384  1 intel_lpss_pci
intel_pch_thermal      16384  0
intel_gtt              24576  1 i915
intel_rapl_common      28672  2 intel_rapl_msr,processor_thermal_device
intel_xhci_usb_role_switch    16384  0
intel_soc_dts_iosf     20480  1 processor_thermal_device

$ grep VIDEO /etc/portage/make.conf 
VIDEO_CARDS="intel i965"

$ grep modesetting /var/log/Xorg.0.log 
[   512.691] (==) Matched modesetting as autoconfigured driver 1
[   512.691] (II) LoadModule: "modesetting"
[   512.691] (II) Loading /usr/lib64/xorg/modules/drivers/modesetting_drv.so
[   512.691] (II) Module modesetting: vendor="X.Org Foundation"
[   512.691] (II) modesetting: Driver for Modesetting Kernel Drivers: kms

$ grep -E '\(EE|WW\)' /var/log/Xorg.0.log 
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   512.691] (WW) Warning, couldn't open module intel
[   512.691] (EE) Failed to load module "intel" (module does not exist, 0)
[   512.691] (WW) Warning, couldn't open module fbdev
[   512.691] (EE) Failed to load module "fbdev" (module does not exist, 0)
[   512.691] (WW) Warning, couldn't open module vesa
[   512.691] (EE) Failed to load module "vesa" (module does not exist, 0)
Comment 27 consus 2020-01-24 16:40:51 UTC
Also the keyboard is not working, so I believe this is actually an Xorg bug. If I press the "power" button long enough, X is terminated and the console is shown with the usual OpenRC shutdown messages. And the keyboard is working again.
Comment 28 nicky12801 2020-05-22 10:55:54 UTC
Got it to work by putting this into /lib64/elogind/system-sleep/20-nvidia:

```
#!/bin/sh
case "${1-}" in
    'pre')
	chvt 63
	echo "suspend" > /proc/driver/nvidia/suspend
        ;;

    'post')
	echo "resume" > /proc/driver/nvidia/suspend
        ;;
    *)
        exit 1
        ;;
esac
```

This does almost everything that calling nvidia-sleep.sh would do, except restoring the previous TTY.

For whatever reason, suspending Nvidia (by writing to /proc/driver/nvidia/suspend) cannot be done while X is on the current TTY. So nvidia-sleep.sh first switches to TTY #63, which works fine. When resuming, however, it then tries to switch back to whatever TTY X was on, which freezes the system (I have no idea why). This is what my 20-nvidia script doesn't include.

After editing 20-nvidia, suspend/resume works fine, the only annoyance being that I have to manually ctrl-alt-F1 to get back to X.

I still have no clue why writing "mem" to /sys/power/state manually works fine, but calling `loginctl suspend` requires suspending the driver.

Anyway, I hope this helps someone and perhaps provides some more information to go on regarding the cause of the problem.
Comment 29 igel 2020-06-07 12:00:03 UTC
(In reply to nicky12801 from comment #28)
> Got it to work by putting this into /lib64/elogind/system-sleep/20-nvidia:
> #!/bin/sh
> case "${1-}" in
>     'pre')
> 	chvt 63
> 	echo "suspend" > /proc/driver/nvidia/suspend
>         ;;
> 
>     'post')
> 	echo "resume" > /proc/driver/nvidia/suspend
>         ;;
>     *)
>         exit 1
>         ;;
> esac

I had the same issue (even with nvidia* modules blacklisted) and this script fixed the issue for me. Might I suggest adding "(sleep 2 && chvt 1)&" in the 'post' case to avoid having to ctrl+alt+F1
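
Putting the script from comment #28 and the suggestion above together gives something like the following. This is a sketch, not a verified fix: `nvidia_vt_hook` and the variables `NVIDIA_SUSPEND_FILE`, `CHVT`, `X_VT` and `RESUME_DELAY` are names invented here so the logic can be tried without an NVIDIA GPU, and the assumption that X runs on VT 1 has to be adapted to the actual setup.

```shell
# Overridable defaults; only the /proc path, VT 63 and the 2 second
# delay come from the comments above.
NVIDIA_SUSPEND_FILE="${NVIDIA_SUSPEND_FILE:-/proc/driver/nvidia/suspend}"
CHVT="${CHVT:-chvt}"
X_VT="${X_VT:-1}"                  # assumption: X runs on VT 1
RESUME_DELAY="${RESUME_DELAY:-2}"

# $1 is "pre" before sleeping and "post" after wakeup.
nvidia_vt_hook() {
    case "${1-}" in
        pre)
            # Leave the X VT first, as comment #28 reports the driver
            # cannot be suspended while X is on the current VT.
            "$CHVT" 63
            echo suspend > "$NVIDIA_SUSPEND_FILE"
            ;;
        post)
            echo resume > "$NVIDIA_SUSPEND_FILE"
            # Switch back in the background so the hook returns at once.
            (sleep "$RESUME_DELAY" && "$CHVT" "$X_VT") &
            ;;
        *)
            return 1
            ;;
    esac
}
```

Installed as /lib64/elogind/system-sleep/20-nvidia, the file would end with `nvidia_vt_hook "$@"`.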
Comment 30 Mi Yu 2020-08-10 17:35:48 UTC
Thanks to the comments above, I got it working with /usr/bin/nvidia-sleep.sh, which has the added benefit of detecting one's current vconsole number and restoring it upon resume. In /lib64/elogind/system-sleep/20-nvidia:

#!/bin/sh
case "${1-}" in
    'pre')
        /usr/bin/nvidia-sleep.sh suspend
        ;;
    'post')
        /usr/bin/nvidia-sleep.sh resume &
        ;;
    *)
        exit 1
        ;;
esac

The & sign in the 'post' case is odd: I don't understand why it should be there, but without it elogind still resumes to a blank console with a blinking cursor.

Is there any way we can move this bug forward? I see that the status of the bug is unconfirmed, but emerging a system with USE="elogind -systemd" and VIDEO_CARDS="nvidia" consistently reproduces this bug, regardless of kernel version or DE/WM. In nvidia-drivers' pkg_postinst I also noticed this message:

> To enable nvidia sleep services under systemd, run these commands:
>	systemctl enable nvidia-suspend.service
>	systemctl enable nvidia-hibernate.service
>	systemctl enable nvidia-resume.service
> Set the NVreg_TemporaryFilePath kernel module parameter to a
> suitable path in case the default of /tmp does not work for you

It seems that, for feature parity under elogind, we would need to provide the same functionalities as these systemd services. Is there a way to add the appropriate elogind hook to the nvidia-drivers package?
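
One gap between the hook above and the three systemd services is that the hook always passes `suspend`, even when hibernating. A sketch of how the distinction could be made, assuming elogind passes the sleep type as the second hook argument the way systemd's system-sleep hooks do, and assuming nvidia-sleep.sh accepts `hibernate` as an action; the function name `nvidia_sleep_action` is made up for illustration.

```shell
# Map the hook arguments ($1 = pre|post, $2 = sleep type) to the action
# passed to nvidia-sleep.sh. Returns 1 for anything unexpected.
nvidia_sleep_action() {
    case "${1-}:${2-}" in
        pre:hibernate) echo hibernate ;;
        pre:*)         echo suspend ;;
        post:*)        echo resume ;;
        *)             return 1 ;;
    esac
}
```

In a hook this could be used as `action="$(nvidia_sleep_action "$@")" && /usr/bin/nvidia-sleep.sh "$action"`.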
Comment 31 Sven Eden 2020-09-01 06:04:54 UTC
(In reply to Mi Yu from comment #30)
> Thanks to the comments above, I got it working with
> /usr/bin/nvidia-sleep.sh, which has the added benefit of detecting one's
> current vconsole number and restoring it upon resume. In
> /lib64/elogind/system-sleep/20-nvidia:
> 
> #!/bin/sh
> case "${1-}" in
>     'pre')
>         /usr/bin/nvidia-sleep.sh suspend
>         ;;
>     'post')
>         /usr/bin/nvidia-sleep.sh resume &
>         ;;
>     *)
>         exit 1
>         ;;
> esac
> 

I have just suggested a similar approach in https://github.com/elogind/elogind/issues/140#issuecomment-684391794

> It seems that, for feature parity under elogind, we would need to provide
> the same functionalities as these systemd services. Is there a way to add
> the appropriate elogind hook to the nvidia-drivers package?

Currently the nvidia-drivers package provides the systemd service files, correct? Maybe the package could have an "elogind" flag on its own, so it would provide a system-sleep hook script instead when enabled?

If you look at my comment behind that link above, you'll see that the approach, although it works, results in elogind logging an exit status of 2 from the script, and I don't know why.
So if anybody has an idea why this is happening, I'd appreciate any help I can get, as I would like to fix that before the v246-series of elogind hits RC status.

Thank you all for your (almost heavenly) patience!
Comment 32 bfrg 2020-09-01 17:39:48 UTC
I'm using the old 390.138-r4 nvidia-drivers because of an older graphics card (NVS 4200M). Unfortunately, this version doesn't seem to provide the nvidia-sleep.sh script.
Comment 33 Ionen Wolkens gentoo-dev 2021-06-12 13:32:54 UTC
Would've helped if someone set the package name at some point.

Is this still an issue? Last I tried elogind it resumed fine from suspend without any special hooks, but I haven't tested it much, or perhaps it just doesn't apply to my hardware (i.e. I don't have a dual-GPU laptop or similar).
Comment 34 Frederik Pfautsch 2021-06-12 13:57:09 UTC
(In reply to Ionen Wolkens from comment #33)
> Would've helped if someone set the package name at some point.
> 
> Is this still an issue? Last I tried elogind it resumed fine from suspend
> without any special hooks, but I haven't tested it much or perhaps just
> doesn't apply to my hardware (i.e. I don't have a dual gpu laptop or similar)

For me (GTX 1650) it is still relevant. Tried several different things mentioned in the wiki and elsewhere (also the script from #28), nothing seems to work. If anybody is interested in specific logs/settings I am happy to provide them and try out stuff.

After resume I just get a blank screen (no cursor). Magic SysRq REIS followed by (login and) openrc (or reboot) usually helps but at that point my session is gone. Using GDM (gnome-shell) with XOrg. No hooks (empty /lib64/elogind/system-sleep/).

Nvidia-drivers-465.31
elogind-246.10-r1 (started at boot)
gentoo-sources-5.12.9

Basically followed https://wiki.gentoo.org/wiki/GNOME/GNOME_Without_systemd/Gentoo
Nothing regarding nvidia in dmesg
Comment 35 Ionen Wolkens gentoo-dev 2021-06-12 23:55:23 UTC
Unfortunately if the script doesn't work for you then I don't think there's anything I can do, your case may be an nvidia upstream issue.

And as far as nvidia-sleep.sh goes, it "shouldn't" be necessary with elogind's HandleNvidiaSleep option, so there's no sense in installing a hook at this point (it'd conflict if anything).

Do verify that you have NVreg_PreserveVideoMemoryAllocations set to 0 in /etc/modprobe.d/nvidia.conf, as =1 tends to cause issues (your mileage may vary).

I believe the original issue is technically fixed, but I'll leave this bug open as a reference for now. If someone who still has issues has sane solutions to suggest that could benefit from ebuild changes, I'm open to them.
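
A quick way to check the configured value is to pull it out of the modprobe.d file. A minimal sketch, assuming the option is set on an `options nvidia ...` line; the function name `preserve_alloc_setting` is made up for illustration. It only reads the configuration file, so a module loaded before the file was edited may still run with the old value.

```shell
# Print the value assigned to NVreg_PreserveVideoMemoryAllocations in the
# modprobe.d file given as $1; prints nothing if the option is not set there.
preserve_alloc_setting() {
    sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}nvidia[[:space:]].*NVreg_PreserveVideoMemoryAllocations=\([0-9]\{1,\}\).*/\1/p' "$1"
}
```

Usage: `preserve_alloc_setting /etc/modprobe.d/nvidia.conf`.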
Comment 36 wolfwood 2021-06-22 19:41:30 UTC
Created attachment 717717 [details]
emerge --info

I am also having this issue (echo mem > /sys/power/state works but loginctl suspend does not) on a desktop with an nvidia gpu.  I can ssh in after the failed resume but there is no amount of VT switching that brings X back.

I already had NVreg_PreserveVideoMemoryAllocations=0 in /etc/modprobe.d/nvidia.conf so that is not the issue.

it seems that the nvidia specific logic in elogind is not working as intended.
Comment 37 wolfwood 2021-06-22 20:38:05 UTC
additionally, sudo bash -c "/usr/bin/nvidia-sleep.sh suspend; echo mem > /sys/power/state; /usr/bin/nvidia-sleep.sh resume" works (ie nvidia-sleep.sh doesn't break anything) but "/usr/bin/nvidia-sleep.sh suspend; loginctl suspend; /usr/bin/nvidia-sleep.sh resume" does not (neither does the /lib64/elogind/system-sleep/20-nvidia script which seems to either never have been called or have had a permission issue because it never created the directory /var/run/nvidia-sleep)

finally, and bizarrely, _none_ of the methods work if I am just at a console and have not started X11 yet.
Comment 38 arcctgx 2021-09-05 22:05:35 UTC
I'm affected by this problem as well. When suspend is activated using "echo mem > /sys/power/state" resume works fine. Same for hibernation. But ever since I installed elogind I get a blank screen on resume when suspend or hibernate has been triggered from the XFCE menus.

I'm stuck with x11-drivers/nvidia-drivers-390.144 because of an older GPU (NVIDIA GTS 450). I had no problems before I started to use elogind. After reading the comments here I initially thought I could use the nvidia-sleep.sh script from the newer driver version, but it refers to the file /proc/driver/nvidia/suspend, which I don't seem to have on my system (presumably because of the older driver version).

So I'd say this bug will be relevant for as long as x11-drivers/nvidia-drivers-390.144 is supported in Gentoo.
Comment 39 Piotr Karbowski (RETIRED) gentoo-dev 2021-09-06 21:01:51 UTC
(In reply to wolfwood from comment #36)
> Created attachment 717717 [details]
> emerge --info
> 
> I am also having this issue (echo mem > /sys/power/state works but loginctl
> suspend does not) on a desktop with an nvidia gpu.  I can ssh in after the
> failed resume but there is no amount of VT switching that brings X back.
> 
> I already had NVreg_PreserveVideoMemoryAllocations=0 in
> /etc/modprobe.d/nvidia.conf so that is not the issue.
> 
> it seems that the nvidia specific logic in elogind is not working as
> intended.

Looking at the elogind code, there's a quirk for nvidia cards: if it detects an nvidia GPU it will put it to sleep via its interface, which is not done when you write the 'mem' string into /sys/power/state.

It would be good if you could locally patch your elogind and see whether it changes anything if you remove these lines from src/sleep/sleep.c:

473         /* See whether we have an nvidia card to put to sleep */
474         if ( m->handle_nvidia_sleep )
475                 have_nvidia = nvidia_sleep(m, verb, &vtnr);

and 

484         /* Wakeup a possibly put to sleep nvidia card */
485         if (have_nvidia)
486                 nvidia_sleep(m, "resume", &vtnr);

This way the elogind suspend should work in the very same way as writing mem to /sys/power/state. Let me know.
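
On Gentoo a local patch like this is usually applied through Portage's user-patch directory rather than by editing the ebuild. A minimal sketch, assuming a patch file removing the lines quoted above has already been prepared; the function name `install_elogind_patch` and the file name `no-nvidia-sleep.patch` are hypothetical.

```shell
# Copy a patch into Portage's user-patch directory for sys-auth/elogind.
# $1: patch file (must end in .patch or .diff to be picked up)
# $2: optional root prefix, empty for the live system
install_elogind_patch() {
    dest="${2-}/etc/portage/patches/sys-auth/elogind"
    mkdir -p "$dest" && cp "$1" "$dest/"
}
```

After `install_elogind_patch no-nvidia-sleep.patch` (as root), rebuilding with `emerge --oneshot sys-auth/elogind` should apply the patch during the prepare phase; the build log mentions user patches when they are applied.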
Comment 40 cddr 2021-09-10 02:25:24 UTC
(In reply to Piotr Karbowski from comment #39)
> (In reply to wolfwood from comment #36)
> > Created attachment 717717 [details]
> > emerge --info
> > 
> > I am also having this issue (echo mem > /sys/power/state works but loginctl
> > suspend does not) on a desktop with an nvidia gpu.  I can ssh in after the
> > failed resume but there is no amount of VT switching that brings X back.
> > 
> > I already had NVreg_PreserveVideoMemoryAllocations=0 in
> > /etc/modprobe.d/nvidia.conf so that is not the issue.
> > 
> > it seems that the nvidia specific logic in elogind is not working as
> > intended.
> 
> Looking at elogind code, there's a quirk for nvidia cards, If it detects
> nvidia GPU it will put it to sleep via its interface, which is not done
> when you write the 'mem' string into /sys/power/state.
> 
> Would be good if you could locally patch your elogind and see if it changes
> anything if you remove lines of src/sleep/sleep.c
> 
> 473         /* See whether we have an nvidia card to put to sleep */
> 474         if ( m->handle_nvidia_sleep )
> 475                 have_nvidia = nvidia_sleep(m, verb, &vtnr);
> 
> and 
> 
> 484         /* Wakeup a possibly put to sleep nvidia card */
> 485         if (have_nvidia)
> 486                 nvidia_sleep(m, "resume", &vtnr);
> 
> This way the elogind suspend should work in the very same way as writing
> mem to /sys/power/state. Let me know.

I am using the latest nvidia-drivers-470.63 and a 5.14 kernel. I have also been affected by this issue, and the elogind system-sleep shell script didn't work for me. I have now tried locally patching elogind as suggested, but that doesn't seem to work either. For the time being I have resorted to disabling the lid button and using echo 'mem' > /sys/power/state to manually put the laptop to sleep. Waking up seems to work fine with this configuration.
Comment 41 arcctgx 2021-09-10 21:11:56 UTC
(In reply to Piotr Karbowski from comment #39)
> Would be good if you could locally patch your elogind and see if it changes
> anything if you remove lines of src/sleep/sleep.c
> 
> 473         /* See whether we have an nvidia card to put to sleep */
> 474         if ( m->handle_nvidia_sleep )
> 475                 have_nvidia = nvidia_sleep(m, verb, &vtnr);
> 
> and 
> 
> 484         /* Wakeup a possibly put to sleep nvidia card */
> 485         if (have_nvidia)
> 486                 nvidia_sleep(m, "resume", &vtnr);
> 
> This way the elogind suspend should work in the very same way as writing
> mem to /sys/power/state. Let me know.

As you suggested, I rebuilt sys-auth/elogind-246.10-r1 with these lines removed, but I'm still getting the blank screen after resume with x11-drivers/nvidia-drivers-390.144.
Comment 42 Piotr Karbowski (RETIRED) gentoo-dev 2021-09-10 21:26:09 UTC
You might want to take this up on the elogind issue tracker on GitHub and help them reproduce the problem you are facing.
Comment 43 Ionen Wolkens gentoo-dev 2021-09-10 21:58:02 UTC
If nvidia-drivers-390.144 legacy drivers are involved, current solutions may not necessarily apply (nvidia changed many things regarding sleep since then).

Not that I think I can help with this; you are typically lucky if these work at all nowadays, given that NVIDIA's LTS driver support is very minimal.
Comment 44 Sven Eden 2023-02-09 14:38:27 UTC
(In reply to arcctgx from comment #41)
> (In reply to Piotr Karbowski from comment #39)
> > Would be good if you could locally patch your elogind and see if it changes
> > anything if you remove lines of src/sleep/sleep.c
> > 
> > 473         /* See whether we have an nvidia card to put to sleep */
> > 474         if ( m->handle_nvidia_sleep )
> > 475                 have_nvidia = nvidia_sleep(m, verb, &vtnr);
> > 
> > and 
> > 
> > 484         /* Wakeup a possibly put to sleep nvidia card */
> > 485         if (have_nvidia)
> > 486                 nvidia_sleep(m, "resume", &vtnr);
> > 
> > This way the elogind suspend should work in the very same way as writing
> > mem to /sys/power/state. Let me know.
> 
> As you suggested I rebuilt sys-auth/elogind-246.10-r1 with these lines
> removed, but I'm still getting blank screen issue after resume with
> x11-drivers/nvidia-drivers-390.144.

Those lines will not do anything unless you set "HandleNvidiaSleep" to "yes" in the elogind config. It defaults to "no", so removing those lines in a patch does nothing at all.
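
To see which value is actually in effect, the setting can be read back from the config file. The following is a rough sketch only: the function name is hypothetical, and the file to inspect is passed as an argument because which elogind config file carries the option (for example one under /etc/elogind/) is an assumption that may differ between elogind versions.

```shell
# Print the HandleNvidiaSleep value set in the given config file, or "no"
# (the default mentioned above) if it is unset, commented out, or the file
# is missing. The function name and the path argument are illustrative.
handle_nvidia_sleep_value() {
    local conf="$1" value
    # Only uncommented assignments match; the last one found wins.
    value=$(sed -n 's/^[[:space:]]*HandleNvidiaSleep[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*$/\1/p' \
        "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-no}"
}
```

If this prints "no" for your config, the quoted lines in src/sleep/sleep.c were never reached and removing them could not have changed the behaviour.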
Comment 45 Akio Takano 2024-04-11 06:33:37 UTC
elogind-252.23 (not in the tree) fixed this for me.
Comment 46 Nguyen Thai Ngoc Duy 2024-04-11 16:07:25 UTC
(In reply to Akio Takano from comment #45)
> elogind-252.23 (not in the tree) fixed this for me.

Perhaps it's 98c92fead (logind: fix abnormal switching causing the screen to go black, 2023-10-25). I only skimmed through commit messages though.
Comment 47 Romel Salwi 2024-04-26 11:00:31 UTC
I haven't come across this issue in general usage, but I do get this 'black out' screen (with no output on the monitor) when resuming the host after a VM shutdown.

I generally pass through my GPU (2080Ti) with the help of libvirt hooks. After shutting down the VM and logging back into the host system, I've observed that I'm unable to switch to a console terminal, and/or get no output once the monitor has been turned off for any reason (generally DPMS).

A simple workaround which has resolved this issue in my use case:

edit the file /etc/modprobe.d/nvidia.conf
and enable "options nvidia-drm fbdev=1"
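
To double-check that the option is really active in the file (and not commented out), something along these lines could be used. It is only a sketch: the function name is made up, and it only inspects the one file it is given, not the whole of /etc/modprobe.d.

```shell
# Succeed if the given modprobe config has an uncommented
# "options nvidia-drm ... fbdev=1" line. The function name is illustrative.
nvidia_drm_fbdev_enabled() {
    local conf="${1:-/etc/modprobe.d/nvidia.conf}"
    grep -Eq '^[[:space:]]*options[[:space:]]+nvidia[-_]drm[[:space:]]+(.*[[:space:]])?fbdev=1([[:space:]]|$)' \
        "$conf" 2>/dev/null
}
```

For example: nvidia_drm_fbdev_enabled && echo "fbdev=1 is set"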

Since I was unable to get any output on the console, I thought of looking into enabling nvidia's fbdev.

Hope it helps.
Comment 48 Sven Eden 2024-07-18 06:01:17 UTC
(In reply to Nguyen Thai Ngoc Duy from comment #46)
> (In reply to Akio Takano from comment #45)
> > elogind-252.23 (not in the tree) fixed this for me.
> 
> Perhaps it's 98c92fead (logind: fix abnormal switching causing the screen to
> go black, 2023-10-25). I only skimmed through commit messages though.

It is more likely that elogind now forks off a subprocess to do the sleep/suspend work, so it now works more like systemd + systemd-sleep.

This way elogind can keep running and thus receive and react to the D-Bus messages it gets prior to sleep and after wakeup.
Without the forking, the messages go nowhere.

For more information, please see:
https://github.com/elogind/elogind/issues/234#issuecomment-1712908507
Comment 49 Andreas Sturmlechner gentoo-dev 2024-08-28 08:47:56 UTC
*** Bug 860291 has been marked as a duplicate of this bug. ***
Comment 50 Larry the Git Cow gentoo-dev 2024-09-14 10:52:40 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a0f953fc4720a191288b7e28c1df4ab50eb9a598

commit a0f953fc4720a191288b7e28c1df4ab50eb9a598
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-09-06 09:13:41 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-09-14 10:49:26 +0000

    x11-drivers/nvidia-drivers: use PreserveVideoMemoryAllocations=1
    
    (Disclaimer: I do not have the right setup to test any of this, but
    doing it blind given the increasing amount of affected users -- call
    for testing has shown that it should work as expected albeit users
    with more custom setups such as suspend without elogind/systemd will
    need to pay attention to warnings given breakage *is* expected).
    
    Was formerly disabled because it broke sleep with elogind, systemd if
    the units were not enabled, and other custom methods to enable sleep.
    
    However, =0 is limited and is seemingly broken with wayland (typically
    resulting in graphical corruption after resume). GDM straight up refuse
    to show a wayland session if it's not set wrt bug #873160, and several
    Plasma 6 users been reporting issues and its upstream also instructs
    distributions to set this.
    
    So this adds a elogind hook, enables systemd units by default (much
    like it is for the elogind hook), and at least warn for the last case
    which is considered semi-unsupported.
    
    elogind does have its own HandleNvidiaSleep option, but it is intended
    for old drivers which did not ship a nvidia-sleep.sh and reports seem to
    show that it may not be working properly. Ebuild warns that it should be
    disabled instead, and also tries to warn if there is old custom scripts
    installed by the user.
    
    One downside of hook vs the option is that hooks are not told if using
    suspend or hibernate and this sends the wrong message to the drivers
    (albeit not known to be an issue at the moment).
    
    May not fix everything wrt bug #693384, but believe this is the best
    we can do downstream unless someone knows better, and so closing it.
    There are plenty of issues unrelated to elogind too, ideally would
    need users to compare with systemd before filing more elogind sleep
    bugs unless know exactly what is causing issues in elogind.
    
    wrt bug #873160, this only fix *one* thing that the gdm udev rules
    check and so may not mean will necessarily start seeing wayland in
    gdm. Rules currently need =1, systemd-only, and a non-hybrid setup
    (aka just nvidia, no offloading). See also the general bug #939201.
    
    Straight-to-stable may not be the best idea, but wanted to simplify
    and not revbump the 3 .conf, duplicate them, and adjust every ebuilds
    further for this (believe it *should* be ok, or at least not make
    things worse for typical users). Also want to deliver the fix early
    to plasma 6 users newly using wayland by default.
    
    Closes: https://bugs.gentoo.org/693384
    Closes: https://bugs.gentoo.org/873160
    Closes: https://github.com/gentoo/gentoo/pull/38482
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 x11-drivers/nvidia-drivers/files/nvidia-470.conf   |  8 ++--
 x11-drivers/nvidia-drivers/files/nvidia-545.conf   |  8 ++--
 x11-drivers/nvidia-drivers/files/nvidia-555.conf   |  8 ++--
 .../nvidia-drivers/files/system-sleep.elogind      |  7 +++
 ....ebuild => nvidia-drivers-470.256.02-r1.ebuild} | 56 ++++++++++++++++++++++
 ....ebuild => nvidia-drivers-525.147.05-r1.ebuild} | 56 ++++++++++++++++++++++
 ....ebuild => nvidia-drivers-535.183.01-r1.ebuild} | 56 ++++++++++++++++++++++
 ....ebuild => nvidia-drivers-550.107.02-r1.ebuild} | 56 ++++++++++++++++++++++
 ...1.ebuild => nvidia-drivers-550.40.71-r1.ebuild} | 56 ++++++++++++++++++++++
 ...3.ebuild => nvidia-drivers-560.35.03-r1.ebuild} | 56 ++++++++++++++++++++++
 10 files changed, 358 insertions(+), 9 deletions(-)
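
For readers unfamiliar with sleep hooks, the kind of hook the commit message describes can be sketched roughly as below. This is a hypothetical illustration and not the contents of system-sleep.elogind: it assumes the hook is called with "pre" before sleeping and "post" after waking up, that the driver's helper lives at /usr/bin/nvidia-sleep.sh, and SLEEP_HELPER is an invented variable that only exists so the helper can be swapped out.

```shell
# Hypothetical hook logic: forward "pre"/"post" to the driver's sleep helper.
# SLEEP_HELPER and the function name are illustrative assumptions.
nvidia_sleep_hook() {
    local helper="${SLEEP_HELPER:-/usr/bin/nvidia-sleep.sh}"
    case "$1" in
        # Always "suspend": per the commit message, the hook is not told
        # whether this is a suspend or a hibernate.
        pre)  "$helper" suspend ;;
        post) "$helper" resume ;;
        *)    return 0 ;;
    esac
}
```

Setting SLEEP_HELPER=echo and calling nvidia_sleep_hook pre shows which verb would be passed on without touching the driver.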
Comment 51 Steve Evans 2024-09-16 09:18:00 UTC
This breaks one of my machines. It is using mythtv, so the suspend is performed by it. On restore the display doesn't restart. I need to run rc-service display-manager restart to get the display back.
Comment 52 Steve Evans 2024-09-16 09:34:55 UTC
Ignore my last post. I found the problem. I already had a script of my own to perform an nvidia-sleep.sh suspend/resume. Now that the driver also provides a script to do that it was being run twice, causing the problem. Removing my script has fixed it.
Comment 53 Ionen Wolkens gentoo-dev 2024-09-16 09:52:00 UTC
(In reply to Steve Evans from comment #52)
> Ignore my last post. I found the problem. I already had a script of my own
> to perform an nvidia-sleep.sh suspend/resume. Now that the driver also
> provides a script to do that it was being run twice, causing the problem.
> Removing my script has fixed it.
About that, the nvidia-drivers ebuild already "tries" to warn about duplicate scripts, but depending on the script and how you run it, it could've missed it (or you could've simply missed the message).

if [[ $(realpath "${EROOT}"{/etc,{/usr,}/lib*}/elogind/system-sleep | sort | uniq | \
    xargs -d'\n' grep -Ril nvidia 2>/dev/null | wc -l) -gt 2 ]]
then
    ewarn
    ewarn "!!! WARNING !!!"
    ewarn "Detected a custom script at ${EROOT}{/etc,{/usr,}/lib*}/elogind/system-sleep"
    ewarn "referencing NVIDIA. This version of ${PN} has installed its own"
    ewarn "hook at ${EROOT}/usr/lib/elogind/system-sleep/nvidia and it is recommended"
    ewarn "to remove the custom one to avoid potential issues."
    ewarn
    ewarn "Feel free to ignore this warning if you know the other NVIDIA-related"
    ewarn "scripts can be used together. The warning will be removed in the future."
fi
Comment 54 Ionen Wolkens gentoo-dev 2024-09-16 09:59:01 UTC
(In reply to Ionen Wolkens from comment #53)
>  -gt 2 ]]
And before someone asks, the -gt 2 is intentional because the ebuild installs the script at two locations, so it finds two (only one is run depending on which elogind version is used, for compatibility given that the path changed).
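
The counting can be tried outside the ebuild with a small stand-alone function. This is a simplified sketch of the same pipeline, assuming GNU realpath, xargs and grep; the function name is made up, and the directories are passed as arguments instead of being expanded from ${EROOT}.

```shell
# Count files under the given directories that mention "nvidia"
# (case-insensitive). Directories are resolved first so that the same
# location reached through a symlink is only searched once.
# The function name is illustrative.
count_nvidia_hooks() {
    realpath -- "$@" 2>/dev/null | sort -u |
        xargs -r -d '\n' grep -Ril nvidia 2>/dev/null | wc -l
}
```

With only the two copies installed by the ebuild this prints 2, so the warning stays quiet; any additional script mentioning NVIDIA pushes the count above 2.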