Bug 860291 - sys-auth/elogind causes X11 crash on resume from suspend: suspend hangs with black screen with nvidia-drivers
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal major
Assignee: Andreas Sturmlechner
URL: https://github.com/elogind/elogind/is...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-07-23 17:32 UTC by Avalon Williams
Modified: 2022-09-24 09:04 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
X11 logs (Xorg.0.log,51.11 KB, text/x-log)
2022-07-23 17:32 UTC, Avalon Williams
Description Avalon Williams 2022-07-23 17:32:30 UTC
I'm running Gentoo with elogind/openrc, and since I installed it a couple of months ago I've had issues with suspend/resume, ever since setting up the NVIDIA drivers. I'm on an Optimus laptop (ThinkPad T15g Gen 1 with an RTX 2070) and use the proprietary drivers.

Depending on the setting of HandleNvidiaSleep and/or the presence of a script that does the same thing using /usr/bin/nvidia-sleep.sh, I get two different behaviors. If I have HandleNvidiaSleep=yes in my config, or if I have a suspend script that calls nvidia-sleep.sh (or one that does chvt and such manually), my system hangs indefinitely with a black screen (the backlight is still on, however, and according to the kernel logs the system has not yet entered a sleep state). However, if I do not have any settings for handling NVIDIA sleep, I get a black screen for around 20-30 seconds, and then it goes to sleep; when my laptop wakes up, my X server segfaults.
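
For illustration, a suspend script of the kind described above might look like the following minimal sketch. It assumes the usual elogind hook convention ($1 is pre or post, $2 is the sleep action) and that nvidia-sleep.sh accepts suspend, hibernate and resume as its argument. The file name, the nvidia_hook function and the NVIDIA_SLEEP override are hypothetical and are not taken from the reporter's setup.

```shell
#!/bin/sh
# Hypothetical hook, e.g. saved as /etc/elogind/system-sleep/nvidia.sh.
# Assumption: elogind calls hooks with $1 = pre|post and $2 = the action.

# NVIDIA_SLEEP is an illustrative override so the logic can be dry-run.
NVIDIA_SLEEP="${NVIDIA_SLEEP:-/usr/bin/nvidia-sleep.sh}"

nvidia_hook() {
    when="$1"
    what="$2"
    case "$when:$what" in
        pre:suspend)   "$NVIDIA_SLEEP" suspend ;;
        pre:hibernate) "$NVIDIA_SLEEP" hibernate ;;
        post:*)        "$NVIDIA_SLEEP" resume ;;
        *)             return 0 ;;   # other actions are ignored in this sketch
    esac
}

# Only act when invoked with arguments, as elogind would do.
if [ "$#" -ge 2 ]; then
    nvidia_hook "$1" "$2"
fi
```

Setting NVIDIA_SLEEP=echo before calling nvidia_hook prints the argument that would be passed instead of touching the driver, which is a safe way to check the dispatch logic.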

These problems occur regardless of the setting of SuspendMode (I tried setting it to deep to see if that would help; it didn't). Otherwise I have default configs.
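
To check which suspend variant the kernel will actually use, the bracketed entry in /sys/power/mem_sleep can be read. The sketch below assumes that SuspendMode ends up in that kernel interface, which this report does not confirm; active_sleep_mode is a hypothetical helper name.

```shell
# Print the active entry (the one in square brackets) of a mem_sleep-style
# file, e.g. "s2idle [deep]" yields "deep".
# Assumption: the file defaults to the kernel's /sys/power/mem_sleep.
active_sleep_mode() {
    file="${1:-/sys/power/mem_sleep}"
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$file"
}
```

Calling active_sleep_mode without an argument before and after changing SuspendMode would show whether the setting has any effect on the running system.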

I'm on elogind version 246.10-r2, kernel version 5.15.48-gentoo-dist, and NVIDIA driver version 510.73.05-r1, though it should be noted that these issues have persisted across updates of all three of these, as well as updates of X.

I submitted a bug report to the upstream elogind (https://github.com/elogind/elogind/issues/234), but the repository seems to not have had any contributions since earlier this year unfortunately.


Reproducible: Always

Steps to Reproduce:
1. Run loginctl suspend (on X11) or close laptop lid
2. Wait until backlight turns off (can take a while to actually suspend)
3. Resume
Actual Results:  
The X11 server segfaults

Expected Results:  
The X11 server resumes from suspend normally

Portage 3.0.30 (python 3.9.13-final-0, default/linux/amd64/17.1/desktop, gcc-11.3.0, glibc-2.34-r13, 5.15.48-gentoo-dist x86_64)
=================================================================
System uname: Linux-5.15.48-gentoo-dist-x86_64-Intel-R-_Core-TM-_i7-10750H_CPU_@_2.60GHz-with-glibc2.34
KiB Mem:    40737708 total,  37470784 free
KiB Swap:   67108860 total,  67108860 free
Timestamp of repository gentoo: Tue, 21 Jun 2022 23:30:01 +0000
Head commit of repository gentoo: 2a75273025f7eb6434e1c847471f4a13a9f8345b
Timestamp of repository steam-overlay: Sun, 12 Jun 2022 09:02:56 +0000
Head commit of repository steam-overlay: 23a727b7f9d868134563b44dcf4ebba5dd46b5a2

sh bash 5.1_p16
ld GNU ld (Gentoo 2.37_p1 p2) 2.37
app-misc/pax-utils:        1.3.3::gentoo
app-shells/bash:           5.1_p16::gentoo
dev-java/java-config:      2.3.1::gentoo
dev-lang/perl:             5.34.1-r3::gentoo
dev-lang/python:           3.9.13::gentoo, 3.10.4::gentoo
dev-lang/rust-bin:         1.60.0::gentoo
dev-util/cmake:            3.22.4::gentoo
dev-util/meson:            0.61.4-r2::gentoo
sys-apps/baselayout:       2.8::gentoo
sys-apps/openrc:           0.44.10::gentoo
sys-apps/sandbox:          2.29::gentoo
sys-devel/autoconf:        2.71-r1::gentoo
sys-devel/automake:        1.16.5::gentoo
sys-devel/binutils:        2.37_p1-r2::gentoo
sys-devel/binutils-config: 5.4.1::gentoo
sys-devel/clang:           14.0.4::gentoo
sys-devel/gcc:             11.3.0::gentoo
sys-devel/gcc-config:      2.5-r1::gentoo
sys-devel/libtool:         2.4.7::gentoo
sys-devel/llvm:            14.0.4::gentoo
sys-devel/make:            4.3::gentoo
sys-kernel/linux-headers:  5.15-r3::gentoo (virtual/os-headers)
sys-libs/glibc:            2.34-r13::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-jobs: 1
    sync-rsync-extra-opts: 

local
    location: /var/db/repos/local
    masters: gentoo

steam-overlay
    location: /var/db/repos/steam-overlay
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/steam-overlay.git
    masters: gentoo

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--jobs 12"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-march=native -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O2 -pipe"
GENTOO_MIRRORS="http://www.gtlib.gatech.edu/pub/gentoo rsync://rsync.gtlib.gatech.edu/gentoo https://gentoo.osuosl.org/ http://gentoo.osuosl.org/ https://mirrors.rit.edu/gentoo/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j13 -l4"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/zsh"
USE="X a52 aac acl acpi alsa amd64 appindicator bluetooth branding bzip2 cairo cdda cdr cli crypt cups dbus device-mapper dist-kernel djvu dri dts dvd dvdr elogind encode exif ffmpeg flac fontconfig fortran gdbm gif gpm gstreamer gtk gui gzip iconv icu initramfs ipv6 jpeg lcms libglvnd libnotify libtirpc lvm mad man mng mp3 mp4 mpeg multilib ncurses nls nptl nvidia ogg opengl openmp pam pango pcre pdf pipewire png policykit postscript ppds pulseaudio qt5 readline screencast sdl seccomp spell split-usr ssl startup-notification svg tiff truetype udev udisks unicode upower usb vorbis vulkan wayland webp wxwidgets x264 xattr xcb xml xv xvid zip zlib zsh-completion" ABI_X86="32 64" ADA_TARGET="gnat_2020" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput synaptics" KERNEL="linux" L10N="en en-US" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" 
RUBY_TARGETS="ruby27" USERLAND="GNU" VIDEO_CARDS="intel i965 iris nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LEX, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Avalon Williams 2022-07-23 17:32:48 UTC
Created attachment 793523 [details]
X11 logs
Comment 2 Piotr Karbowski (RETIRED) gentoo-dev 2022-07-26 11:26:35 UTC
This seems to affect only users with the binary nvidia drivers; at the very least, the intel and amdgpu drivers do not run into this problem.
Comment 3 alpir 2022-07-28 13:23:34 UTC
I can confirm that.
I have Gnome (OpenRC) installed with nvidia-drivers-515.57 (also tried 510.73.05-r1).

With gdm and Gnome (wayland) and Gnome (X11) - the same behaviour.

Openbox with gdm - the same. 

Old system with Openbox and sddm - all is good.
Comment 4 Piotr Karbowski (RETIRED) gentoo-dev 2022-07-28 18:05:19 UTC
Can you add the information about Wayland to https://github.com/elogind/elogind/issues/234?
Comment 5 alpir 2022-07-29 04:59:14 UTC
With gdm and Gnome (wayland) and Gnome (X11):

1. With the current stable Gentoo kernel, 5.15.52, the system can't suspend; it hangs with a black screen, and the only way out is REISUB.
2. With kernel 5.18.14 the system suspends but can't resume; it hangs with a black screen, and the only way out is REISUB.
3. Gnome with sddm - the same.

I didn't find anything in the logs that clearly indicates the reason

Sway from git + sddm - the same behavior.

Openbox with gdm - the same.

Old system with Openbox and sddm - all is OK.

I tried to suspend with echo mem > /sys/power/state, but the system does not suspend and immediately returns to Gnome. I don't remember exactly, but I think it gave the error "broken pipe".
Comment 6 Sven Eden 2022-09-24 09:04:47 UTC
nvidia Quadro T2000 with (binary) nvidia drivers 515.76, on Plasma 5.25.5 (X11), OpenRC + elogind, started using SDDM, powering three Monitors (2 x Full HD @ 60 Hz, 1 x UHD (2K) @ 144 Hz)

Suspend and resume work fine as often as I like. I use "loginctl suspend" to go to sleep. It takes a few seconds because of the ZFS pool over 3 external HDs I have and the 12TB backup drive, which isn't the fastest when it comes to wakeup/park.

The only issue I have after wakeup is that I have to shut down akonadi and "kill -9" the private mysql (mariadb) server instance it uses before kontact works.

These are the changes I made to logind.conf:

========
 ~ $ grep -P -v '^#' /etc/elogind/logind.conf 

[Login]

[Sleep]
AllowSuspendInterrupts=yes
SuspendMode=deep s2idle
========

In addition to that, I have two hook scripts to handle my IPSec connection and to put boinc projects to sleep. Don't know if they might be of any help...

========
 ~ $ cat /etc/elogind/system-sleep/boinc.sh
#!/bin/bash

WHEN="$1"
WHAT="$2"

if [[ "xpre" = "x$WHEN" ]]; then
        /usr/bin/logger -t "$WHAT" -s "boinc.sh $WHEN start"
        /etc/init.d/boinc suspend
        /usr/bin/logger -t "$WHAT" -s "boinc.sh $WHEN done"
elif [[ "xpost" = "x$WHEN" ]]; then
        /usr/bin/logger -t "$WHAT" -s "boinc.sh $WHEN start"
        /etc/init.d/boinc resume
        /usr/bin/logger -t "$WHAT" -s "boinc.sh $WHEN done"
fi

exit 0
========

and

========
 ~ $ cat /etc/elogind/system-sleep/network_off.sh 
#!/bin/bash

WHEN="$1"
WHAT="$2"

if [[ "xpre" = "x$WHEN" ]]; then
        /usr/bin/logger -t "$WHAT" -s "network_off.sh $WHEN start"
        /etc/init.d/ntp-client stop
        /etc/init.d/ipsec stop
        /etc/init.d/iptables stop
        /etc/init.d/ip6tables stop
        /usr/bin/logger -t "$WHAT" -s "network_off.sh $WHEN done"
elif [[ "xpost" = "x$WHEN" ]]; then
        /usr/bin/logger -t "$WHAT" -s "network_off.sh $WHEN start"
        /etc/init.d/ip6tables restart
        /etc/init.d/iptables restart
        /etc/init.d/ntp-client restart
        /etc/init.d/ipsec restart
        /usr/bin/logger -t "$WHAT" -s "network_off.sh $WHEN done"
fi

exit 0
========

(Maybe I should add another script for Akonadi/Kontact...)

--- Important --- : 

elogind does _NOT_ do any magical tricks. It does what systemd-sleep does, which simply is:

1) Check whether suspend/hybrid-sleep/hibernation is possible
   (Like: Is something blocking it? Is there enough space for hibernation?)

2) Execute hook scripts (Maybe break the attempt off if some hook script fails)

3) Inform all processes that have registered for it, that suspend/hibernate is going to happen via dbus. (Maybe break off if some process blocks suspension)

4) "Do it" by doing what you would do on the console, like:
   `echo "mem" > /sys/power/state`
   (No joke, that is exactly what is done: Write to /sys/power/state)

5) Execute hook scripts on wakeup

6) Inform all processes that have registered for it, that resume/wakeup has happened.
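
The steps above can be sketched roughly as follows. This is an illustration only, not elogind's actual code: run_hooks and sketch_suspend are hypothetical names, the D-Bus notifications are reduced to comments, and aborting on a failed hook is a simplification. The state file is a required parameter so that the sketch can be tried on a scratch file instead of the real /sys/power/state.

```shell
# Run every executable file in hook_dir as: <hook> <when> <what>.
run_hooks() {
    hook_dir="$1"; when="$2"; what="$3"
    for hook in "$hook_dir"/*; do
        [ -x "$hook" ] || continue
        "$hook" "$when" "$what" || return 1
    done
}

# Usage: sketch_suspend <hook_dir> <state_file>
sketch_suspend() {
    hook_dir="$1"
    state_file="$2"
    # 1) Is suspend possible? The kernel lists supported states in this file.
    grep -qw mem "$state_file" || return 1
    # 2) Pre-sleep hooks; this sketch simply aborts if one of them fails.
    run_hooks "$hook_dir" pre suspend || return 1
    # 3) (A D-Bus "about to sleep" notification would be sent here.)
    # 4) The actual suspend request: write the state name to the file.
    echo mem > "$state_file"
    # 5) Post-wakeup hooks.
    run_hooks "$hook_dir" post suspend
    # 6) (A D-Bus "resumed" notification would be sent here.)
}
```

Pointing it at a temporary directory of hooks and a scratch file containing "freeze mem disk" shows the order in which hooks and the state write happen without suspending the machine.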

-----

So, whenever anything odd happens, or suspend/hybrid/hibernate/resume/wakeup doesn't work correctly, elogind is, in 99.9% of the cases I have ever encountered, not responsible and merely the messenger in /var/log/messages.

-----

Q: "But echoing "mem" to /sys/power/state works! elogind must be doing something wrong here!"

A: Then it is either a hook script, or a dbus notification that did not work as expected.
If you are using nvidia-drivers, then check whether you have enabled "HandleNvidiaSleep" in /etc/elogind/logind.conf and disable it.
(I haven't needed it in years with Nvidia Quadro M1200 and Quadro T2000 cards.)
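
A quick way to see whether the option is currently set is sketched below. handle_nvidia_sleep is a hypothetical helper; it assumes plain key=value lines in the file and treats the last uncommented assignment as the effective one.

```shell
# Print the value of the last uncommented HandleNvidiaSleep= line in a
# logind.conf-style file; prints nothing if the option is not set.
handle_nvidia_sleep() {
    conf="${1:-/etc/elogind/logind.conf}"
    sed -n 's/^[[:space:]]*HandleNvidiaSleep[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}
```

If this prints yes, commenting the line out or setting it to no and then retrying suspend would be the test suggested above.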