Since upgrading to sys-kernel/gentoo-sources-6.6.13, the system sometimes fails to resume, typically after fewer than a handful of suspend/resume cycles. My initial suspects were the (experimental) rtw88_usb module (now blacklisted) and nvidia-drivers-535.154.05 (installed at the same time, since reverted to the version I used with the 6.1.57 kernel). The failure still occurs, so neither seems to be the cause.

Reproducible: Sometimes

Steps to Reproduce:
1. systemctl suspend
2. press power button
3. go back to (1)

Actual Results:
Fans spin up, but no video signal, no network activity, and no messages written to the log.

Expected Results:
System resumes, showing i3-lock.

Portage 3.0.61 (python 3.11.7-final-0, default/linux/amd64/17.1/desktop/gnome/systemd, gcc-13, glibc-2.38-r10, 6.6.13-gentoo x86_64)
=================================================================
System uname: Linux-6.6.13-gentoo-x86_64-Intel-R-_Core-TM-_i7-4790K_CPU_@_4.00GHz-with-glibc2.38
KiB Mem: 16337688 total, 13290024 free
KiB Swap: 4194300 total, 4194300 free
Timestamp of repository gentoo: Mon, 05 Feb 2024 01:00:01 +0000
Timestamp of repository guru: Mon, 05 Feb 2024 12:03:24 +0000
Head commit of repository guru: 0d9e1051657bdfdbbe6a38d4773a93ba28f6bf39
sh bash 5.1_p16-r6
ld GNU ld (Gentoo 2.41 p4) 2.41.0
app-misc/pax-utils: 1.3.7::gentoo
app-shells/bash: 5.1_p16-r6::gentoo
dev-build/autoconf: 2.13-r8::gentoo, 2.71-r6::gentoo
dev-build/automake: 1.15.1-r2::gentoo, 1.16.5-r2::gentoo
dev-build/cmake: 3.27.9::gentoo
dev-build/libtool: 2.4.7-r1::gentoo
dev-build/make: 4.4.1-r1::gentoo
dev-build/meson: 1.3.0-r2::gentoo
dev-java/java-config: 2.3.3-r1::gentoo
dev-lang/perl: 5.38.2-r1::gentoo
dev-lang/python: 3.9.18::gentoo, 3.10.13::gentoo, 3.11.7::gentoo, 3.12.1_p1::gentoo
dev-lang/rust: 1.74.1::gentoo
sys-apps/baselayout: 2.14-r1::gentoo
sys-apps/sandbox: 2.38::gentoo
sys-apps/systemd: 254.8-r1::gentoo
sys-devel/binutils: 2.41-r3::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/clang: 16.0.6::gentoo, 17.0.6::gentoo
sys-devel/gcc: 13.2.1_p20240113-r1::gentoo
sys-devel/gcc-config: 2.11::gentoo
sys-devel/lld: 17.0.6::gentoo
sys-devel/llvm: 16.0.6::gentoo, 17.0.6::gentoo
sys-kernel/linux-headers: 6.6::gentoo (virtual/os-headers)
sys-libs/glibc: 2.38-r10::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: webrsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    volatile: True
    sync-webrsync-verify-signature: true

guru
    location: /var/db/repos/guru
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/guru.git
    masters: gentoo
    volatile: False

local
    location: /usr/local/portage
    masters: gentoo
    volatile: True

steam-overlay
    location: /var/lib/layman/steam-overlay
    masters: gentoo
    priority: 50
    volatile: True

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="ftp://ftp.free.fr/mirrors/ftp.gentoo.org/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
LINGUAS="en en_US de de_DE fr fr_FR"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi alsa amd64 apache2 branding bzip2 cairo cdda cdr cli colord crypt cups dbus dri dts dvd dvdr eds emacs encode evo exif ffmpeg flac fortran gdbm gif gnome-keyring gnome-shell gnutls gpm gstreamer gtk gtk3 gui iconv icu introspection ipv6 jpeg keyring lcms libnotify libtirpc mmx mp3 mp4 mpeg multilib ncurses networkmanager nls ogg opengl openmp opus pam pango pcre pdf png policykit ppds pulseaudio readline sdl seccomp sound spell split-usr ssl startup-notification svg sysprof systemd test-rust theora tiff truetype udev udisks unicode upower usb vcd vhosts vorbis vulkan wxwidgets x264 xattr xcb xft xinerama xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 ntrip navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en en-US de de-DE fr" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres15" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" RUBY_TARGETS="ruby31" VIDEO_CARDS="nvidia nouveau" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Created attachment 884421 [details] systemd journal from beginning of suspend to next boot
Created attachment 884422 [details] kernel .config
I would be happy to provide additional information. But, given the absence of useful error messages, I don't know how. Any help for narrowing down the cause would be appreciated. Also, up to kernel 6.1.57 suspend/resume worked flawlessly, often with 15+ suspend/resume cycles between reboots.
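One possible way to gather more data when the log stays empty is the kernel's pm_test facility, which runs the suspend sequence only up to a chosen stage and then resumes on its own after a few seconds. The following is a minimal sketch, not a confirmed procedure for this bug: it assumes the kernel was built with CONFIG_PM_DEBUG (not verified against the attached .config), that it is run as root, and the function names pm_level_supported and run_pm_test are made up for illustration.

```shell
#!/bin/sh
# Sketch: exercise one stage of the suspend path via /sys/power/pm_test.
# Assumption: kernel built with CONFIG_PM_DEBUG; run as root.
PM_TEST=/sys/power/pm_test

# Return 0 if level $2 appears in the list $1, as printed by the sysfs file,
# e.g. "[none] core processors platform devices freezer".
# The brackets mark the currently selected level and are stripped first.
pm_level_supported() {
    for level in $(printf '%s' "$1" | tr -d '[]'); do
        [ "$level" = "$2" ] && return 0
    done
    return 1
}

# Run a single test suspend at the given level, then switch test mode off.
run_pm_test() {
    if [ ! -w "$PM_TEST" ]; then
        echo "pm_test not available (CONFIG_PM_DEBUG missing or not root)" >&2
        return 1
    fi
    if ! pm_level_supported "$(cat "$PM_TEST")" "$1"; then
        echo "unsupported pm_test level: $1" >&2
        return 1
    fi
    echo "$1" > "$PM_TEST"
    echo mem > /sys/power/state   # returns once the test cycle has finished
    echo none > "$PM_TEST"
}
```

Calling `run_pm_test freezer`, then `devices`, `platform`, `processors` and `core` in turn, several times each, might show at which stage the hang first appears. Since the failure is sporadic, a clean run at one level does not prove that level is fine.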
What brand / model of system is this?
System is an Intel i7-4790K on an MSI Z97S SLI Krait Edition (Intel Z97 Express Chipset) with an NVIDIA GeForce GTX 960. Custom built around 8 years ago, never had any suspend/resume issues.
Firstly, can you test with nouveau to rule out nvidia-drivers? This would be a requirement if the bug needs to go upstream, because the proprietary nvidia module marks the kernel as tainted. Secondly, are you able to do a git bisect so we can determine whether a specific commit is causing this issue?
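For reference, a bisect between two mainline releases could look roughly like the sketch below. Several things here are assumptions: KERNEL_DIR is a hypothetical path to a clone of the mainline kernel tree, the tags v6.1 (good) and v6.6 (bad) are inferred from the gentoo-sources versions in this report and mainline may behave differently, and the helper names bisect_steps and start_bisect are made up for illustration.

```shell
#!/bin/sh
# Sketch of a manual kernel bisect. Nothing runs until a function is called.
# Assumption: KERNEL_DIR is a git clone of the mainline kernel (hypothetical path).
KERNEL_DIR="${KERNEL_DIR:-/usr/src/linux-git}"

# Rough estimate of the test rounds needed for N candidate commits:
# the smallest number of halvings that narrows N commits down to one.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Mark the known endpoints; git then checks out a commit in the middle.
start_bisect() {
    cd "$KERNEL_DIR" || return 1
    git bisect start
    git bisect bad v6.6     # assumption: mainline 6.6 shows the hang
    git bisect good v6.1    # assumption: mainline 6.1 resumes reliably
    echo "roughly $(bisect_steps "$(git rev-list --count v6.1..v6.6)") rounds"
}

# After building, booting and suspend-testing each checked-out commit:
#   git bisect good    # resume survived enough cycles to be trusted
#   git bisect bad     # resume hung
# When git names the first bad commit:
#   git bisect log > bisect.log
#   git bisect reset
```

Because the hang is sporadic, a commit should only be marked good after many more suspend cycles than it usually takes to trigger the failure. One wrong "good" verdict sends the bisect in the wrong direction.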
I'm not sure that is realistic. Nouveau does not seem to support power management on my graphics card. Since this is my primary desktop machine and the error only occurs sporadically, it would probably take several days to determine whether any given kernel revision is "good" or "bad". As a first step, I think I will try a 6.1.74 longterm kernel. If that works, there would at least be a stable and still supported baseline.
Whenever you do test, I would try the latest 6.1.x.
One more data point:

gentoo-sources-6.1.74 (latest stable 6.1.x)
nvidia-drivers-535.146.02
RinCat/RTL88x2BU-Linux-Driver (from GitHub @ 7bdc911e1c14c...)

Uptime 4 days, a dozen or so suspend cycles, no issues.
So, I guess that qualifies as "good" and rules out hardware defects.
(In reply to Christian D. from comment #9)
> One more data point:
>
> gentoo-sources-6.1.74 (latest stable 6.1.x)
> nvidia-drivers-535.146.02
> RinCat/RTL88x2BU-Linux-Driver (from GitHub @ 7bdc911e1c14c...)
>
> Uptime 4 days, a dozen or so suspend cycles, no issues.
> So, I guess that qualifies as "good" and rules out hardware defects.

Can you attach the output of lspci -vv?
Created attachment 885155 [details] lspci -vv

Generating it emits the following to stderr (probably irrelevant):
pcilib: sysfs_read_vpd: read failed: No such device
Can I also have your full boot log, please?
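On a systemd system, the kernel messages of a boot can be dumped from the journal. The sketch below is one way to do it. The function name save_boot_log and the default file name boot.log are made up for illustration. The dmesg fallback is an assumption for systems without a journal and only covers the current boot.

```shell
#!/bin/sh
# Sketch: save the kernel messages of one boot to a file.
#   $1: boot offset as understood by journalctl (0 = current, -1 = previous)
#   $2: output file (hypothetical default name)
save_boot_log() {
    offset=${1:-0}
    out=${2:-boot.log}
    if command -v journalctl >/dev/null 2>&1; then
        # -k limits output to kernel messages, -b selects the boot.
        journalctl -k -b "$offset" --no-pager > "$out"
    else
        # Without a journal only the current boot's ring buffer is available.
        dmesg > "$out"
    fi
}
```

For example, `save_boot_log 0 boot-current.log` captures the running boot, and `save_boot_log -1 boot-previous.log` captures the boot before it, provided the journal is stored persistently.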
I'm commenting because this may be related to an issue I have had since the update.

Since upgrading to kernel 6.6.x (testing with gentoo-kernel-bin), the system often fails to boot: the kernel panics even before it prints anything, which is interesting. It gets stuck at checking TPM PCR 9 even though I disabled Secure Boot, so it may be related to a GRUB update. Occasionally it does boot, but only rarely, and I don't yet know what makes the difference; I need to dig into it more.

When it does boot, it always fails to resume when I put the laptop to sleep (black screen). I also noticed that some filesystem kernel modules, and possibly many other modules, weren't loaded at all.

Everything works correctly when I downgrade to 6.1.x. Have there been changes in the kernel that I'm unaware of and that would be likely to cause such issues? Let me know if I should open a separate bug report with more details. For now I'm using an older kernel until I have dug into the code further to find the cause.

I suspect several things:
1) nvidia-driver (open)
2) dracut initramfs not doing something properly
3) GRUB not loading the kernel properly, perhaps because the PCR check failed even though I disabled Secure Boot. That doesn't quite make sense, though, since the kernel does boot sometimes, so I suspect 1) and 2) more. It may still be GRUB, but for some entirely different reason.
PS: It could be an issue with the kernel code itself. That seems unlikely to me, but I believe it is possible.