Created attachment 603144 [details]
emerge --info

I have a crash with an AMDGPU. Searches don't reveal similar bugs in this kernel version. Logs show:

[ 2937.188546] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=76632, emitted seq=76634
[ 2937.188551] [drm] GPU recovery disabled.
[ 3564.900439] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
[ 3564.900442] Not tainted 4.19.86-gentoo #1
[ 3564.900443] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3564.900446] kworker/3:2 D 0 252 2 0x80000000
[ 3564.900578] Workqueue: events dm_irq_work_func [amdgpu]
[ 3564.900580] Call Trace:
[ 3564.900591] ? __schedule+0x24f/0x870
[ 3564.900595] schedule+0x32/0x80
[ 3564.900598] schedule_preempt_disabled+0xa/0x10
[ 3564.900602] __mutex_lock.isra.0+0x262/0x4b0
[ 3564.900697] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3564.900796] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3564.900894] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3564.900900] process_one_work+0x1d4/0x3b0
[ 3564.900903] worker_thread+0x4a/0x3d0
[ 3564.900909] kthread+0xfb/0x130
[ 3564.900911] ? process_one_work+0x3b0/0x3b0
[ 3564.900914] ? kthread_park+0x80/0x80
[ 3564.900920] ret_from_fork+0x3a/0x50
[ 3564.900973] INFO: task X:5818 blocked for more than 120 seconds.
[ 3564.900974] Not tainted 4.19.86-gentoo #1
[ 3564.900975] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3564.900976] X D 0 5818 5817 0x00000000
[ 3564.900980] Call Trace:
[ 3564.900984] ? __schedule+0x24f/0x870
[ 3564.900988] schedule+0x32/0x80
[ 3564.900992] schedule_timeout+0x2e7/0x470
[ 3564.901057] ? amdgpu_atom_execute_table_locked+0x12a/0x210 [amdgpu]
[ 3564.901062] ? _raw_spin_unlock_irqrestore+0x1f/0x30
[ 3564.901066] dma_fence_default_wait+0x195/0x280
[ 3564.901069] ? dma_fence_wait_timeout+0xf0/0xf0
[ 3564.901072] dma_fence_wait_timeout+0xd9/0xf0
[ 3564.901137] amdgpu_fence_wait_empty+0x56/0xa0 [amdgpu]
[ 3564.901207] amdgpu_pm_compute_clocks+0x54/0x180 [amdgpu]
[ 3564.901304] dm_pp_apply_display_requirements+0x19c/0x1b0 [amdgpu]
[ 3564.901399] pplib_apply_display_requirements+0x1a7/0x1c0 [amdgpu]
[ 3564.901509] dce110_set_bandwidth+0x20b/0x230 [amdgpu]
[ 3564.901610] dc_commit_state_no_check+0x142/0x4a0 [amdgpu]
[ 3564.901709] dc_commit_state+0x8f/0xb0 [amdgpu]
[ 3564.901807] amdgpu_dm_atomic_commit_tail+0x35f/0xde0 [amdgpu]
[ 3564.901817] ? trace_hardirqs_on+0x31/0xd0
[ 3564.901825] ? wait_for_common+0x137/0x170
[ 3564.901894] ? amdgpu_bo_ref+0x5/0x20 [amdgpu]
[ 3564.901914] commit_tail+0x3c/0x70 [drm_kms_helper]
[ 3564.901930] drm_atomic_helper_commit+0x108/0x110 [drm_kms_helper]
[ 3564.901944] drm_atomic_helper_set_config+0x81/0x90 [drm_kms_helper]
[ 3564.901981] drm_mode_setcrtc+0x19b/0x690 [drm]
[ 3564.901992] ? __switch_to_asm+0x41/0x70
[ 3564.901999] ? __switch_to_asm+0x35/0x70
[ 3564.902006] ? __switch_to_asm+0x41/0x70
[ 3564.902012] ? __switch_to_asm+0x35/0x70
[ 3564.902036] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3564.902056] drm_ioctl_kernel+0xa4/0xf0 [drm]
[ 3564.902075] drm_ioctl+0x1f8/0x3c0 [drm]
[ 3564.902096] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3564.902157] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 3564.902164] do_vfs_ioctl+0x3f5/0x650
[ 3564.902168] ksys_ioctl+0x5e/0x90
[ 3564.902172] __x64_sys_ioctl+0x16/0x20
[ 3564.902177] do_syscall_64+0x5a/0x110
[ 3564.902182] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3564.902186] RIP: 0033:0x7f6bc0ab21e7
[ 3564.902194] Code: Bad RIP value.
[ 3564.902196] RSP: 002b:00007ffe7c2e2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 3564.902199] RAX: ffffffffffffffda RBX: 00007ffe7c2e2f70 RCX: 00007f6bc0ab21e7
[ 3564.902200] RDX: 00007ffe7c2e2f70 RSI: 00000000c06864a2 RDI: 000000000000000b
[ 3564.902202] RBP: 00000000c06864a2 R08: 0000000000000393 R09: 0000558288f39560
[ 3564.902203] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 3564.902205] R13: 000000000000000b R14: 00005582878a2990 R15: 0000558287eaa6b0
[ 3564.902371] INFO: task kworker/3:0:12027 blocked for more than 120 seconds.
[ 3564.902372] Not tainted 4.19.86-gentoo #1
[ 3564.902373] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3564.902374] kworker/3:0 D 0 12027 2 0x80000000
[ 3564.902474] Workqueue: events dm_irq_work_func [amdgpu]
[ 3564.902476] Call Trace:
[ 3564.902481] ? __schedule+0x24f/0x870
[ 3564.902485] schedule+0x32/0x80
[ 3564.902488] schedule_preempt_disabled+0xa/0x10
[ 3564.902491] __mutex_lock.isra.0+0x262/0x4b0
[ 3564.902587] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3564.902680] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3564.902772] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3564.902777] process_one_work+0x1d4/0x3b0
[ 3564.902780] worker_thread+0x4a/0x3d0
[ 3564.902784] kthread+0xfb/0x130
[ 3564.902787] ? process_one_work+0x3b0/0x3b0
[ 3564.902790] ? kthread_park+0x80/0x80
[ 3564.902795] ret_from_fork+0x3a/0x50
[ 3687.780418] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
[ 3687.780422] Not tainted 4.19.86-gentoo #1
[ 3687.780423] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3687.780425] kworker/3:2 D 0 252 2 0x80000000
[ 3687.780556] Workqueue: events dm_irq_work_func [amdgpu]
[ 3687.780558] Call Trace:
[ 3687.780570] ? __schedule+0x24f/0x870
[ 3687.780573] schedule+0x32/0x80
[ 3687.780577] schedule_preempt_disabled+0xa/0x10
[ 3687.780581] __mutex_lock.isra.0+0x262/0x4b0
[ 3687.780677] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3687.780773] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3687.780864] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3687.780870] process_one_work+0x1d4/0x3b0
[ 3687.780873] worker_thread+0x4a/0x3d0
[ 3687.780878] kthread+0xfb/0x130
[ 3687.780881] ? process_one_work+0x3b0/0x3b0
[ 3687.780885] ? kthread_park+0x80/0x80
[ 3687.780890] ret_from_fork+0x3a/0x50
[ 3687.780940] INFO: task X:5818 blocked for more than 120 seconds.
[ 3687.780942] Not tainted 4.19.86-gentoo #1
[ 3687.780943] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3687.780944] X D 0 5818 5817 0x00000000
[ 3687.780947] Call Trace:
[ 3687.780952] ? __schedule+0x24f/0x870
[ 3687.780955] schedule+0x32/0x80
[ 3687.780959] schedule_timeout+0x2e7/0x470
[ 3687.781025] ? amdgpu_atom_execute_table_locked+0x12a/0x210 [amdgpu]
[ 3687.781030] ? _raw_spin_unlock_irqrestore+0x1f/0x30
[ 3687.781035] dma_fence_default_wait+0x195/0x280
[ 3687.781038] ? dma_fence_wait_timeout+0xf0/0xf0
[ 3687.781041] dma_fence_wait_timeout+0xd9/0xf0
[ 3687.781106] amdgpu_fence_wait_empty+0x56/0xa0 [amdgpu]
[ 3687.781176] amdgpu_pm_compute_clocks+0x54/0x180 [amdgpu]
[ 3687.781273] dm_pp_apply_display_requirements+0x19c/0x1b0 [amdgpu]
[ 3687.781369] pplib_apply_display_requirements+0x1a7/0x1c0 [amdgpu]
[ 3687.781480] dce110_set_bandwidth+0x20b/0x230 [amdgpu]
[ 3687.781581] dc_commit_state_no_check+0x142/0x4a0 [amdgpu]
[ 3687.781681] dc_commit_state+0x8f/0xb0 [amdgpu]
[ 3687.781780] amdgpu_dm_atomic_commit_tail+0x35f/0xde0 [amdgpu]
[ 3687.781790] ? trace_hardirqs_on+0x31/0xd0
[ 3687.781798] ? wait_for_common+0x137/0x170
[ 3687.781868] ? amdgpu_bo_ref+0x5/0x20 [amdgpu]
[ 3687.781890] commit_tail+0x3c/0x70 [drm_kms_helper]
[ 3687.781906] drm_atomic_helper_commit+0x108/0x110 [drm_kms_helper]
[ 3687.781921] drm_atomic_helper_set_config+0x81/0x90 [drm_kms_helper]
[ 3687.781958] drm_mode_setcrtc+0x19b/0x690 [drm]
[ 3687.781969] ? __switch_to_asm+0x41/0x70
[ 3687.781975] ? __switch_to_asm+0x35/0x70
[ 3687.781982] ? __switch_to_asm+0x41/0x70
[ 3687.781988] ? __switch_to_asm+0x35/0x70
[ 3687.782011] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3687.782032] drm_ioctl_kernel+0xa4/0xf0 [drm]
[ 3687.782054] drm_ioctl+0x1f8/0x3c0 [drm]
[ 3687.782075] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3687.782136] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 3687.782143] do_vfs_ioctl+0x3f5/0x650
[ 3687.782147] ksys_ioctl+0x5e/0x90
[ 3687.782151] __x64_sys_ioctl+0x16/0x20
[ 3687.782156] do_syscall_64+0x5a/0x110
[ 3687.782161] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3687.782164] RIP: 0033:0x7f6bc0ab21e7
[ 3687.782172] Code: Bad RIP value.
[ 3687.782174] RSP: 002b:00007ffe7c2e2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 3687.782177] RAX: ffffffffffffffda RBX: 00007ffe7c2e2f70 RCX: 00007f6bc0ab21e7
[ 3687.782178] RDX: 00007ffe7c2e2f70 RSI: 00000000c06864a2 RDI: 000000000000000b
[ 3687.782180] RBP: 00000000c06864a2 R08: 0000000000000393 R09: 0000558288f39560
[ 3687.782181] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 3687.782183] R13: 000000000000000b R14: 00005582878a2990 R15: 0000558287eaa6b0
[ 3687.782351] INFO: task kworker/3:0:12027 blocked for more than 120 seconds.
[ 3687.782352] Not tainted 4.19.86-gentoo #1
[ 3687.782353] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3687.782354] kworker/3:0 D 0 12027 2 0x80000000
[ 3687.782454] Workqueue: events dm_irq_work_func [amdgpu]
[ 3687.782456] Call Trace:
[ 3687.782462] ? __schedule+0x24f/0x870
[ 3687.782465] schedule+0x32/0x80
[ 3687.782468] schedule_preempt_disabled+0xa/0x10
[ 3687.782471] __mutex_lock.isra.0+0x262/0x4b0
[ 3687.782567] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3687.782660] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3687.782751] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3687.782755] process_one_work+0x1d4/0x3b0
[ 3687.782759] worker_thread+0x4a/0x3d0
[ 3687.782763] kthread+0xfb/0x130
[ 3687.782765] ? process_one_work+0x3b0/0x3b0
[ 3687.782769] ? kthread_park+0x80/0x80
[ 3687.782773] ret_from_fork+0x3a/0x50
[ 3810.660431] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
[ 3810.660435] Not tainted 4.19.86-gentoo #1
[ 3810.660436] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3810.660438] kworker/3:2 D 0 252 2 0x80000000
[ 3810.660565] Workqueue: events dm_irq_work_func [amdgpu]
[ 3810.660568] Call Trace:
[ 3810.660579] ? __schedule+0x24f/0x870
[ 3810.660583] schedule+0x32/0x80
[ 3810.660586] schedule_preempt_disabled+0xa/0x10
[ 3810.660590] __mutex_lock.isra.0+0x262/0x4b0
[ 3810.660686] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3810.660780] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3810.660873] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3810.660879] process_one_work+0x1d4/0x3b0
[ 3810.660882] worker_thread+0x4a/0x3d0
[ 3810.660887] kthread+0xfb/0x130
[ 3810.660890] ? process_one_work+0x3b0/0x3b0
[ 3810.660893] ? kthread_park+0x80/0x80
[ 3810.660898] ret_from_fork+0x3a/0x50
[ 3810.660949] INFO: task X:5818 blocked for more than 120 seconds.
[ 3810.660951] Not tainted 4.19.86-gentoo #1
[ 3810.660952] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3810.660953] X D 0 5818 5817 0x00000000
[ 3810.660956] Call Trace:
[ 3810.660961] ? __schedule+0x24f/0x870
[ 3810.660964] schedule+0x32/0x80
[ 3810.660968] schedule_timeout+0x2e7/0x470
[ 3810.661033] ? amdgpu_atom_execute_table_locked+0x12a/0x210 [amdgpu]
[ 3810.661038] ? _raw_spin_unlock_irqrestore+0x1f/0x30
[ 3810.661042] dma_fence_default_wait+0x195/0x280
[ 3810.661045] ? dma_fence_wait_timeout+0xf0/0xf0
[ 3810.661048] dma_fence_wait_timeout+0xd9/0xf0
[ 3810.661112] amdgpu_fence_wait_empty+0x56/0xa0 [amdgpu]
[ 3810.661180] amdgpu_pm_compute_clocks+0x54/0x180 [amdgpu]
[ 3810.661275] dm_pp_apply_display_requirements+0x19c/0x1b0 [amdgpu]
[ 3810.661368] pplib_apply_display_requirements+0x1a7/0x1c0 [amdgpu]
[ 3810.661480] dce110_set_bandwidth+0x20b/0x230 [amdgpu]
[ 3810.661581] dc_commit_state_no_check+0x142/0x4a0 [amdgpu]
[ 3810.661682] dc_commit_state+0x8f/0xb0 [amdgpu]
[ 3810.661780] amdgpu_dm_atomic_commit_tail+0x35f/0xde0 [amdgpu]
[ 3810.661790] ? trace_hardirqs_on+0x31/0xd0
[ 3810.661797] ? wait_for_common+0x137/0x170
[ 3810.661867] ? amdgpu_bo_ref+0x5/0x20 [amdgpu]
[ 3810.661890] commit_tail+0x3c/0x70 [drm_kms_helper]
[ 3810.661906] drm_atomic_helper_commit+0x108/0x110 [drm_kms_helper]
[ 3810.661922] drm_atomic_helper_set_config+0x81/0x90 [drm_kms_helper]
[ 3810.661960] drm_mode_setcrtc+0x19b/0x690 [drm]
[ 3810.661970] ? __switch_to_asm+0x41/0x70
[ 3810.661977] ? __switch_to_asm+0x35/0x70
[ 3810.661983] ? __switch_to_asm+0x41/0x70
[ 3810.661989] ? __switch_to_asm+0x35/0x70
[ 3810.662009] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3810.662028] drm_ioctl_kernel+0xa4/0xf0 [drm]
[ 3810.662047] drm_ioctl+0x1f8/0x3c0 [drm]
[ 3810.662068] ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 3810.662129] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 3810.662136] do_vfs_ioctl+0x3f5/0x650
[ 3810.662140] ksys_ioctl+0x5e/0x90
[ 3810.662144] __x64_sys_ioctl+0x16/0x20
[ 3810.662149] do_syscall_64+0x5a/0x110
[ 3810.662155] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 3810.662158] RIP: 0033:0x7f6bc0ab21e7
[ 3810.662165] Code: Bad RIP value.
[ 3810.662167] RSP: 002b:00007ffe7c2e2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 3810.662170] RAX: ffffffffffffffda RBX: 00007ffe7c2e2f70 RCX: 00007f6bc0ab21e7
[ 3810.662172] RDX: 00007ffe7c2e2f70 RSI: 00000000c06864a2 RDI: 000000000000000b
[ 3810.662174] RBP: 00000000c06864a2 R08: 0000000000000393 R09: 0000558288f39560
[ 3810.662175] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 3810.662176] R13: 000000000000000b R14: 00005582878a2990 R15: 0000558287eaa6b0
[ 3810.662342] INFO: task kworker/3:0:12027 blocked for more than 120 seconds.
[ 3810.662344] Not tainted 4.19.86-gentoo #1
[ 3810.662345] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3810.662346] kworker/3:0 D 0 12027 2 0x80000000
[ 3810.662445] Workqueue: events dm_irq_work_func [amdgpu]
[ 3810.662447] Call Trace:
[ 3810.662453] ? __schedule+0x24f/0x870
[ 3810.662456] schedule+0x32/0x80
[ 3810.662459] schedule_preempt_disabled+0xa/0x10
[ 3810.662462] __mutex_lock.isra.0+0x262/0x4b0
[ 3810.662560] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3810.662656] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3810.662751] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3810.662756] process_one_work+0x1d4/0x3b0
[ 3810.662759] worker_thread+0x4a/0x3d0
[ 3810.662764] kthread+0xfb/0x130
[ 3810.662766] ? process_one_work+0x3b0/0x3b0
[ 3810.662770] ? kthread_park+0x80/0x80
[ 3810.662775] ret_from_fork+0x3a/0x50
[ 3933.540409] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
[ 3933.540413] Not tainted 4.19.86-gentoo #1
[ 3933.540414] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3933.540416] kworker/3:2 D 0 252 2 0x80000000
[ 3933.540543] Workqueue: events dm_irq_work_func [amdgpu]
[ 3933.540546] Call Trace:
[ 3933.540556] ? __schedule+0x24f/0x870
[ 3933.540560] schedule+0x32/0x80
[ 3933.540564] schedule_preempt_disabled+0xa/0x10
[ 3933.540567] __mutex_lock.isra.0+0x262/0x4b0
[ 3933.540664] amdgpu_dm_update_connector_after_detect+0x171/0x1d0 [amdgpu]
[ 3933.540759] handle_hpd_irq+0xf4/0x100 [amdgpu]
[ 3933.540852] dm_irq_work_func+0x49/0x60 [amdgpu]
[ 3933.540858] process_one_work+0x1d4/0x3b0
[ 3933.540861] worker_thread+0x4a/0x3d0
[ 3933.540866] kthread+0xfb/0x130
[ 3933.540869] ? process_one_work+0x3b0/0x3b0
[ 3933.540872] ? kthread_park+0x80/0x80
[ 3933.540877] ret_from_fork+0x3a/0x50
[ 3933.652916] [drm] amdgpu_dm_irq_schedule_work FAILED src 1
[ 3933.934713] [drm] amdgpu_dm_irq_schedule_work FAILED src 1
[ 3935.015288] [drm] amdgpu_dm_irq_schedule_work FAILED src 5
[ 3935.274718] [drm] amdgpu_dm_irq_schedule_work FAILED src 5
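
Since the same tasks are reported again roughly every two minutes, it may help to condense the log to one entry per blocked task. What follows is only a rough sketch, not kernel tooling: the helper name summarize_hung_tasks and the SAMPLE excerpt are made up for illustration, and the regular expression assumes the "INFO: task NAME:PID blocked for more than N seconds." wording seen in the log above.

```python
import re
from collections import Counter

# Matches the hung-task header wording seen in the log above, e.g.
# "INFO: task kworker/3:2:252 blocked for more than 120 seconds."
# The task name may itself contain ':' (kworker/3:2), so the PID is
# taken as the last colon-separated number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def summarize_hung_tasks(log_text):
    """Count hung-task reports per (task name, pid). Hypothetical helper."""
    counts = Counter()
    for match in HUNG_TASK_RE.finditer(log_text):
        counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts


# A few header lines copied from the log above (call traces omitted).
SAMPLE = """\
[ 3564.900439] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
[ 3564.900973] INFO: task X:5818 blocked for more than 120 seconds.
[ 3564.902371] INFO: task kworker/3:0:12027 blocked for more than 120 seconds.
[ 3687.780418] INFO: task kworker/3:2:252 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for (name, pid), n in summarize_hung_tasks(SAMPLE).items():
        print(f"{name} (pid {pid}): {n} report(s)")
```

In the full log this would presumably show the two kworker threads stuck in __mutex_lock and X stuck in dma_fence_wait_timeout, each reported repeatedly.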
Maybe related to https://bugzilla.kernel.org/show_bug.cgi?id=201957
Created attachment 603146 [details]
lspci -vvv

Attaching output of lspci -vvv
Which kernel are you running? sys-kernel/gentoo-sources-{version} maybe? What version?
(In reply to Jeroen Roovers from comment #3)
> Which kernel are you running? sys-kernel/gentoo-sources-{version} maybe?
> What version?

Apologies - I thought this was included in `emerge --info`.

$ uname -a
Linux andromeda 4.19.86-gentoo #1 SMP Fri Dec 13 09:08:51 CET 2019 x86_64 Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz GenuineIntel GNU/Linux
My PC crashed again today, this time with a slightly different message:

[19326.999864] amdgpu 0000:02:00.0: GPU fault detected: 147 0x06f08401 for process X pid 5469 thread X:cs0 pid 5474
[19326.999868] amdgpu 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0CB004DE
[19326.999870] amdgpu 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x06084001
[19326.999874] amdgpu 0000:02:00.0: VM fault (0x01, vmid 3, pasid 32768) at page 212862174, read from 'TC7' (0x54433700) (132)
[19327.000804] [drm:gfx_v8_0_priv_reg_irq [amdgpu]] *ERROR* Illegal register access in command stream
[19327.000813] [drm] GPU recovery disabled.
[19337.061185] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=404540, emitted seq=404542
[19337.061190] [drm] GPU recovery disabled.
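
Both crashes end in the same "ring gfx timeout" message, so comparing the two may be useful. The sketch below pulls the ring name and the two sequence numbers out of such a line and reports their difference. Reading that difference as the number of fences emitted but not yet signaled is my interpretation of the message, and the helper name pending_fences is made up for illustration.

```python
import re

# Matches the amdgpu_job_timedout wording seen in this report, e.g.
# "ring gfx timeout, signaled seq=76632, emitted seq=76634"
RING_TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def pending_fences(line):
    """Return (ring, emitted - signaled), or None if the line does not match.

    Hypothetical helper; the gap is assumed to be the number of fences
    that were emitted on the ring but had not signaled at timeout.
    """
    match = RING_TIMEOUT_RE.search(line)
    if match is None:
        return None
    gap = int(match.group("emitted")) - int(match.group("signaled"))
    return match.group("ring"), gap


# The two timeout lines from this report.
LINES = [
    "[ 2937.188546] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx "
    "timeout, signaled seq=76632, emitted seq=76634",
    "[19337.061185] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx "
    "timeout, signaled seq=404540, emitted seq=404542",
]

if __name__ == "__main__":
    for line in LINES:
        print(pending_fences(line))
```

For both crashes the gap comes out as 2 on the gfx ring, so the two timeouts look alike in that respect even though the second one is preceded by a VM fault.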
Can you test with the latest 5.5.x kernel, which is 5.5.10 as of this writing?