Since kernel 5.0, the graphics on my HP laptop (AMD processor with embedded graphics) lock up when resuming from suspend. Specifically, if I log in as root on tty1 and stop xdm, I can issue pm-suspend successfully, and on touching the power button the system resumes and appears to work, but only until I restart xdm. Then the screen goes grey and locks up; after forcing the system to shut down, syslog shows something along the following lines:

kernel: [ 81.096666] [drm:amdgpu_job_timedout] *ERROR* ring gfx timeout, signaled seq=51, emitted seq=52
kernel: [ 81.096671] [drm] IP block:gfx_v8_0 is hung!
kernel: [ 81.096734] [drm] GPU recovery disabled.

If instead I simply log in from the sddm screen (I'm using KDE) and close the lid or otherwise select suspend, then on resume the screen locks and the same messages appear.

The problem remains in the most recent kernel 5.2.1, and if I specify amdgpu.gpu_recovery=1, I get a bit further:

kernel: [ 279.726475] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=57, emitted seq=59
kernel: [ 279.726536] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process X pid 2860 thread X:cs0 pid 2861
kernel: [ 279.726542] [drm] IP block:gfx_v8_0 is hung!
kernel: [ 279.726609] amdgpu 0000:00:01.0: GPU reset begin!
kernel: [ 279.726992] amdgpu 0000:00:01.0: GRBM_SOFT_RESET=0x000F0001
kernel: [ 279.727047] amdgpu 0000:00:01.0: SRBM_SOFT_RESET=0x00000100
kernel: [ 279.863162] [drm] recover vram bo from shadow start
kernel: [ 279.863164] [drm] recover vram bo from shadow done
kernel: [ 279.863166] [drm] Skip scheduling IBs!
kernel: [ 279.863191] amdgpu 0000:00:01.0: GPU reset(2) succeeded!
kernel: [ 280.015794] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!

That remains true with the latest x11-misc/sddm-0.18.1-r1 and media-libs/mesa-19.1.2 (I normally run stable).

You'll be pleased to know I've run git bisect.
I wish I'd painted a wall to give me something to watch :-)

106c7d6148e5aadd394e6701f7e498df49b869d1 is the first bad commit
commit 106c7d6148e5aadd394e6701f7e498df49b869d1
Author: Likun Gao <Likun.Gao@amd.com>
Date:   Thu Nov 8 20:19:54 2018 +0800

    drm/amdgpu: abstract the function of enter/exit safe mode for RLC

    Abstract the function of amdgpu_gfx_rlc_enter/exit_safe_mode and some
    part of rlc_init to improve the reusability of RLC.

    Signed-off-by: Likun Gao <Likun.Gao@amd.com>
    Acked-by: Christian König <christian.koenig@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 8f3b365496f3bbd380a62032f20642ace51c8fef e14ec968011019e3f601df3f15682bb9ae0bafc6 M drivers

I'll attach the .config for both the build identified by bisection and the one I now use with kernel 5.2.1 (which has several changes, none of which bypass the problem).
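For anyone wanting to check their own syslog for the same signature, here is a minimal Python sketch that picks out the "ring ... timeout" lines quoted in this report and extracts the sequence numbers. The regular expression is written against the two log lines shown above only; the names RingTimeout and find_ring_timeouts are made up for this example, and other kernel versions may format the message differently.

```python
import re
from typing import Iterable, List, NamedTuple


class RingTimeout(NamedTuple):
    timestamp: float  # seconds since boot, taken from the "[ ... ]" prefix
    ring: str         # ring name as printed, e.g. "gfx"
    signaled: int     # "signaled seq" value from the message
    emitted: int      # "emitted seq" value from the message


# Assumption: the message has one of the two shapes quoted in this report,
# i.e. with or without the " [amdgpu]" module tag after the function name.
_TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:amdgpu_job_timedout(?: \[amdgpu\])?\]\s+"
    r"\*ERROR\*\s+ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(lines: Iterable[str]) -> List[RingTimeout]:
    """Return one RingTimeout per matching log line, in input order."""
    hits = []
    for line in lines:
        match = _TIMEOUT_RE.search(line)
        if match:
            hits.append(
                RingTimeout(
                    timestamp=float(match.group("ts")),
                    ring=match.group("ring"),
                    signaled=int(match.group("signaled")),
                    emitted=int(match.group("emitted")),
                )
            )
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py < /var/log/messages
    for hit in find_ring_timeouts(sys.stdin):
        print(hit)
```

A line is reported only if it matches the timeout message itself; the follow-up "is hung!" and "GPU reset" lines are deliberately ignored to keep the sketch short.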
Created attachment 583156 [details] emerge --info
Created attachment 583158 [details] Gzipped config for the bisected kernel (config-4.20.0-rc1)
Created attachment 583160 [details] Gzipped config for the current kernel (config-5.2.1)
This bug looks to be the same as: https://bugs.freedesktop.org/show_bug.cgi?id=110258 and perhaps: https://bugzilla.freedesktop.org/show_bug.cgi?id=110457 though the people reporting it there haven't done the git bisect.
I'm in communication with the module author, and currently testing a fix.
(In reply to Paul Gover from comment #5)
> I'm in communication with the module author, and currently testing a fix.

Hi Paul. Any update here on a patch?
Mike, sorry, I forgot about this Gentoo bug. The freedesktop bug was https://bugs.freedesktop.org/show_bug.cgi?id=110258 and the fix https://cgit.freedesktop.org/drm/drm/commit/?h=drm-fixes&id=72cda9bb5e219aea0f2f62f56ae05198c59022a7 has gone into the kernel; I can't actually work out which kernel it turned up in; it was a month or so back. Since it came out, I've been happily running with an unpatched kernel. As far as I'm concerned the bug is fixed, as is the freedesktop one. (I note the latter is still status New; that ought to be changed; it's not my bug, so I don't know if I can/should change it.)
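On the question of which kernel the fix turned up in: one way to check, sketched below in Python, is to ask git for a tag that contains the commit. This assumes a local clone of the kernel tree with release tags fetched, and that the fix reached that tree under the same commit id as in the drm-fixes link above. The helper names tag_from_describe and first_tag_containing are invented for this example; `git describe --contains` is a standard git command.

```python
import re
import subprocess
from typing import Optional


def tag_from_describe(describe_output: str) -> Optional[str]:
    """Extract the tag from `git describe --contains` output.

    The output is a tag name, optionally followed by a "~n" / "^n" path
    from the tag back to the commit. Tag names cannot contain "~" or "^",
    so everything before the first of those characters is the tag.
    """
    text = describe_output.strip()
    if not text:
        return None
    return re.split(r"[~^]", text, maxsplit=1)[0]


def first_tag_containing(repo_dir: str, commit: str) -> Optional[str]:
    """Return a tag that contains `commit`, or None if git finds none."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "describe", "--contains", commit],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # Unknown commit, no tag contains it yet, or not a git repository.
        return None
    return tag_from_describe(result.stdout)
```

Calling first_tag_containing("/usr/src/linux", "72cda9bb5e219aea0f2f62f56ae05198c59022a7") would then give a tag name if the clone has one containing the fix; the repository path here is only a placeholder.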
(In reply to Paul Gover from comment #7)
> Mike, sorry, I forgot about this Gentoo bug. The freedesktop bug was
> https://bugs.freedesktop.org/show_bug.cgi?id=110258
> and the fix
> https://cgit.freedesktop.org/drm/drm/commit/?h=drm-fixes&id=72cda9bb5e219aea0f2f62f56ae05198c59022a7
> has gone into the kernel; I can't actually work out which kernel it turned
> up in; it was a month or so back. Since it came out, I've been happily
> running with an unpatched kernel.
>
> As far as I'm concerned the bug is fixed, as is the freedesktop one. (I
> note the latter is still status New; that ought to be changed; it's not my
> bug, so I don't know if I can/should change it.)

Thanks, Paul. Appreciate the response.