Bug 923947 - sys-kernel/gentoo-sources-6.6.13 failure to resume after suspend
Summary: sys-kernel/gentoo-sources-6.6.13 failure to resume after suspend
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-06 16:00 UTC by Christian D.
Modified: 2024-05-08 19:27 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
systemd journal (file_923947.txt,5.45 KB, text/plain)
2024-02-06 16:03 UTC, Christian D.
kernel .config (.config,143.16 KB, text/plain)
2024-02-06 16:05 UTC, Christian D.
lspci -vv (lspci-vv.txt,33.93 KB, text/plain)
2024-02-16 14:46 UTC, Christian D.

Description Christian D. 2024-02-06 16:00:42 UTC
Since upgrading to sys-kernel/gentoo-sources-6.6.13, the system sometimes (after fewer than a handful of suspend/resume cycles) fails to resume.

My initial suspicion was the (experimental) rtw88_usb module (now blacklisted) or nvidia-drivers-535.154.05 (installed at the same time, since reverted to the version I used with the 6.1.57 kernel). The failure still occurs, so neither seems to be the cause.

Reproducible: Sometimes

Steps to Reproduce:
1. systemctl suspend
2. press power button
3. go back to (1)
Actual Results:  
Fans spin up, but no video signal, no network activity, and no messages written to the log.

Expected Results:  
System resumes, showing i3-lock. 

Portage 3.0.61 (python 3.11.7-final-0, default/linux/amd64/17.1/desktop/gnome/systemd, gcc-13, glibc-2.38-r10, 6.6.13-gentoo x86_64)
=================================================================
System uname: Linux-6.6.13-gentoo-x86_64-Intel-R-_Core-TM-_i7-4790K_CPU_@_4.00GHz-with-glibc2.38
KiB Mem:    16337688 total,  13290024 free
KiB Swap:    4194300 total,   4194300 free
Timestamp of repository gentoo: Mon, 05 Feb 2024 01:00:01 +0000
Timestamp of repository guru: Mon, 05 Feb 2024 12:03:24 +0000
Head commit of repository guru: 0d9e1051657bdfdbbe6a38d4773a93ba28f6bf39

sh bash 5.1_p16-r6
ld GNU ld (Gentoo 2.41 p4) 2.41.0
app-misc/pax-utils:        1.3.7::gentoo
app-shells/bash:           5.1_p16-r6::gentoo
dev-build/autoconf:        2.13-r8::gentoo, 2.71-r6::gentoo
dev-build/automake:        1.15.1-r2::gentoo, 1.16.5-r2::gentoo
dev-build/cmake:           3.27.9::gentoo
dev-build/libtool:         2.4.7-r1::gentoo
dev-build/make:            4.4.1-r1::gentoo
dev-build/meson:           1.3.0-r2::gentoo
dev-java/java-config:      2.3.3-r1::gentoo
dev-lang/perl:             5.38.2-r1::gentoo
dev-lang/python:           3.9.18::gentoo, 3.10.13::gentoo, 3.11.7::gentoo, 3.12.1_p1::gentoo
dev-lang/rust:             1.74.1::gentoo
sys-apps/baselayout:       2.14-r1::gentoo
sys-apps/sandbox:          2.38::gentoo
sys-apps/systemd:          254.8-r1::gentoo
sys-devel/binutils:        2.41-r3::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/clang:           16.0.6::gentoo, 17.0.6::gentoo
sys-devel/gcc:             13.2.1_p20240113-r1::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-devel/lld:             17.0.6::gentoo
sys-devel/llvm:            16.0.6::gentoo, 17.0.6::gentoo
sys-kernel/linux-headers:  6.6::gentoo (virtual/os-headers)
sys-libs/glibc:            2.38-r10::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: webrsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    volatile: True
    sync-webrsync-verify-signature: true

guru
    location: /var/db/repos/guru
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/guru.git
    masters: gentoo
    volatile: False

local
    location: /usr/local/portage
    masters: gentoo
    volatile: True

steam-overlay
    location: /var/lib/layman/steam-overlay
    masters: gentoo
    priority: 50
    volatile: True

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="ftp://ftp.free.fr/mirrors/ftp.gentoo.org/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LEX="flex"
LINGUAS="en en_US de de_DE fr fr_FR"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi alsa amd64 apache2 branding bzip2 cairo cdda cdr cli colord crypt cups dbus dri dts dvd dvdr eds emacs encode evo exif ffmpeg flac fortran gdbm gif gnome-keyring gnome-shell gnutls gpm gstreamer gtk gtk3 gui iconv icu introspection ipv6 jpeg keyring lcms libnotify libtirpc mmx mp3 mp4 mpeg multilib ncurses networkmanager nls ogg opengl openmp opus pam pango pcre pdf png policykit ppds pulseaudio readline sdl seccomp sound spell split-usr ssl startup-notification svg sysprof systemd test-rust theora tiff truetype udev udisks unicode upower usb vcd vhosts vorbis vulkan wxwidgets x264 xattr xcb xft xinerama xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 ntrip navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" INPUT_DEVICES="libinput" KERNEL="linux" L10N="en en-US de de-DE fr" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-1" POSTGRES_TARGETS="postgres15" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" RUBY_TARGETS="ruby31" VIDEO_CARDS="nvidia nouveau" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy 
condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Christian D. 2024-02-06 16:03:34 UTC
Created attachment 884421 [details]
systemd journal

from beginning of suspend to next boot
Comment 2 Christian D. 2024-02-06 16:05:30 UTC
Created attachment 884422 [details]
kernel .config
Comment 3 Christian D. 2024-02-06 16:10:17 UTC
I would be happy to provide additional information but, given the absence of useful error messages, I don't know what would help. Any help with narrowing down the cause would be appreciated.

Also, up to kernel 6.1.57 suspend/resume worked flawlessly, often with 15+ suspend/resume cycles between reboots.
Comment 4 Mike Pagano gentoo-dev 2024-02-06 17:36:07 UTC
What brand / model of system is this?
Comment 5 Christian D. 2024-02-06 18:41:57 UTC
System is an Intel i7-4790K on an MSI Z97S SLI Krait Edition (Intel Z97 Express Chipset) with an NVIDIA GeForce GTX 960. 

Custom built around 8 years ago, never had any suspend/resume issues.
Comment 6 Mike Pagano gentoo-dev 2024-02-07 18:39:33 UTC
First, can you test with nouveau to rule out nvidia-drivers? This would be required if the bug needs to go upstream, since the proprietary driver marks the kernel as tainted.

Second, are you able to do a git bisect so we can determine if a specific commit is causing this issue?
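
A bisect of this kind could be started roughly as follows. This is only a sketch: it assumes a local clone of the upstream kernel git tree, and that the tags passed in really behave as "good" and "bad" on the affected machine. `start_kernel_bisect` is a hypothetical helper name, not part of git or Gentoo.

```shell
# Hypothetical helper: begin a bisect between a known-good and a known-bad ref.
# $1: path to a kernel git clone (assumed to exist)
# $2: ref where suspend/resume works (for example v6.1)
# $3: ref where resume fails (for example v6.6)
start_kernel_bisect() {
    repo="$1"
    good="$2"
    bad="$3"
    # "git bisect start <bad> <good>" records both endpoints and checks out
    # a commit roughly halfway between them for the first test build.
    git -C "$repo" bisect start "$bad" "$good"
}
```

After each test kernel is built and exercised, the checked-out commit is marked with `git bisect good` or `git bisect bad`, and `git bisect reset` returns the tree to its original state once the search is done.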
Comment 7 Christian D. 2024-02-07 19:55:30 UTC
I'm not sure that is realistic. Nouveau does not seem to support power management on my graphics card. Since this is my primary desktop machine and the error only occurs sporadically, it would probably take several days to determine whether any given kernel revision is "good" or "bad".

I think, as a first step, I will try a 6.1.74 longterm kernel. If that works, at least there would be a stable and still supported baseline.
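
Because the failure is sporadic, the suspend/resume cycles could in principle be automated instead of pressing the power button each time. The following is a minimal sketch, assuming the board's RTC is able to wake the machine from suspend and that the function runs as root. `run_suspend_cycles` and the `RTCWAKE`/`SYSTEMCTL` override variables are hypothetical names used only for illustration.

```shell
# Hypothetical helper: run a number of suspend/resume cycles unattended.
# $1: number of cycles (default 10)
# $2: seconds until the RTC alarm fires (default 60)
# $3: seconds to wait after each cycle before the next one (default 30)
run_suspend_cycles() {
    cycles="${1:-10}"
    wake_after="${2:-60}"
    settle="${3:-30}"
    i=1
    while [ "$i" -le "$cycles" ]; do
        echo "suspend cycle $i of $cycles"
        # Only program the RTC alarm ("-m no" does not suspend by itself) ...
        "${RTCWAKE:-rtcwake}" -m no -s "$wake_after" || return 1
        # ... then suspend through systemd, as in the steps to reproduce.
        "${SYSTEMCTL:-systemctl}" suspend || return 1
        sleep "$settle"
        i=$((i + 1))
    done
}
```

Called as `run_suspend_cycles 20 60 30`, this would attempt 20 cycles. If a resume hangs, the loop simply stops, and the last "suspend cycle" line printed shows how many cycles succeeded before the failure.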
Comment 8 Mike Pagano gentoo-dev 2024-02-08 20:30:58 UTC
I would try the latest 6.1.x whenever you do test.
Comment 9 Christian D. 2024-02-12 09:56:54 UTC
One more data point:

gentoo-sources-6.1.74 (latest stable 6.1.x)
nvidia-drivers-535.146.02
RinCat/RTL88x2BU-Linux-Driver (from GitHub @ 7bdc911e1c14c...)

Uptime 4 days, a dozen or so suspend cycles, no issues. 
So, I guess that qualifies as "good" and rules out hardware defects.
Comment 10 Mike Pagano gentoo-dev 2024-02-16 13:33:56 UTC
(In reply to Christian D. from comment #9)
> One more data point:
> 
> gentoo-sources-6.1.74 (latest stable 6.1.x)
> nvidia-drivers-535.146.02
> RinCat/RTL88x2BU-Linux-Driver (from GitHub @ 7bdc911e1c14c...)
> 
> Uptime 4 days, a dozen or so suspend cycles, no issues. 
> So, I guess that qualifies as "good" and rules out hardware defects.

Can you attach the output of lspci -vv?
Comment 11 Christian D. 2024-02-16 14:46:03 UTC
Created attachment 885155 [details]
lspci -vv

Generating it emits the following on stderr (probably irrelevant):
pcilib: sysfs_read_vpd: read failed: No such device
Comment 12 Mike Pagano gentoo-dev 2024-02-16 17:13:45 UTC
Can I have your full boot log as well, please?
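
One way to capture the full boot log, sketched under the assumption that the system logs to the systemd journal (the reporter's profile uses systemd). `save_boot_log` and the `JOURNALCTL` override variable are hypothetical names, not existing tools.

```shell
# Hypothetical helper: write the complete journal of one boot to a file.
# $1: boot offset as understood by "journalctl -b"
#     (0 = current boot, -1 = previous boot; default 0)
# $2: output file (default boot-log.txt)
save_boot_log() {
    boot="${1:-0}"
    out="${2:-boot-log.txt}"
    # --no-pager sends plain text to stdout so it can be redirected.
    "${JOURNALCTL:-journalctl}" --no-pager -b "$boot" > "$out"
}
```

For a boot that ended in a failed resume and a hard reset, `save_boot_log -1 failed-boot.txt` run from the following boot would be the relevant call, provided the journal is stored persistently.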
Comment 13 admin 2024-05-08 19:25:38 UTC
I'm commenting because this may be related to an issue I have had since the update.

Since the upgrade to kernel 6.6.x (testing with gentoo-kernel-bin), the system often fails to boot. The kernel panics before it prints anything, which is interesting: it is stuck at checking TPM PCR 9 even though I disabled Secure Boot. This may be related to a GRUB update.

Occasionally it does boot, though rarely. I don't yet know what makes the difference and need to dig into it further.
When it does boot, it always fails to resume after I put the laptop to sleep (black screen). I also noticed that some filesystem kernel modules, and possibly many other modules, were not loaded at all.

Everything works correctly when I downgrade to 6.1.x. Have there been changes in the kernel that I am unaware of and that would be likely to cause such issues?

Let me know if I should file a separate bug report with more details. For now I'm using an older kernel until I have dug into the code further to find the cause.

I suspect several things:
1) nvidia-driver (open)
2) dracut initramfs not doing something properly
3) GRUB not loading the kernel properly, perhaps because the PCR check failed even though I disabled Secure Boot. That doesn't quite make sense, since the kernel does boot sometimes, so I suspect 1) and 2) more. It may still be GRUB, but for a completely different reason.
Comment 14 admin 2024-05-08 19:27:54 UTC
PS: It could be an issue in the kernel code itself. That seems unlikely to me, but it is possible.