Bug 655160 - =sys-kernel/gentoo-sources-4.16.6: blk-mq: unable to handle kernel NULL pointer dereference / blk_mq_timeout_work
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Reported: 2018-05-07 13:14 UTC by Till Schäfer
Modified: 2018-06-12 12:42 UTC

Description Till Schäfer 2018-05-07 13:14:58 UTC
I am hitting the bug described at https://patchwork.kernel.org/patch/10204381/ on several machines running gentoo-sources-4.16.6 with blk-mq enabled. It seems that the bug is not yet fixed upstream. The latest patch is https://patchwork.kernel.org/patch/10333887/.

[    5.611081] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[    5.611093] IP: scsi_times_out+0x17/0x150
[    5.611095] PGD 0 P4D 0 
[    5.611100] Oops: 0000 [#1] SMP PTI
[    5.611102] Modules linked in: bbswitch(O) iwlmvm x86_pkg_temp_thermal iwlwifi nouveau ttm
[    5.611111] CPU: 3 PID: 26 Comm: kworker/3:0H Tainted: G           O     4.16.6-gentoo #1
[    5.611113] Hardware name: LENOVO 20ANCTO1WW/20ANCTO1WW, BIOS GLET78WW (2.32 ) 03/03/2015
[    5.611119] Workqueue: kblockd blk_mq_timeout_work
[    5.611122] RIP: 0010:scsi_times_out+0x17/0x150
[    5.611124] RSP: 0018:ffffc900019e3d78 EFLAGS: 00010246
[    5.611127] RAX: 0000000000000000 RBX: ffff88042a7cd100 RCX: 0000000000000000
[    5.611129] RDX: ffffffff820db520 RSI: 0000000000000000 RDI: ffff88042a7cd100
[    5.611131] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
[    5.611132] R10: 0000000000000000 R11: 0000000000000326 R12: ffff88042a7cd250
[    5.611134] R13: ffff88042a4b3d80 R14: 0000000000000003 R15: 0000000000000004
[    5.611137] FS:  0000000000000000(0000) GS:ffff88043e2c0000(0000) knlGS:0000000000000000
[    5.611139] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.611141] CR2: 0000000000000000 CR3: 000000000360a005 CR4: 00000000001606e0
[    5.611143] Call Trace:
[    5.611148]  blk_mq_terminate_expired+0x45/0x90
[    5.611152]  bt_iter+0x43/0x50
[    5.611156]  blk_mq_queue_tag_busy_iter+0x177/0x280
[    5.611159]  ? blk_mq_add_to_requeue_list+0xc0/0xc0
[    5.611162]  ? blk_mq_add_to_requeue_list+0xc0/0xc0
[    5.611165]  blk_mq_timeout_work+0xcf/0x1b0
[    5.611170]  process_one_work+0x1b4/0x3e0
[    5.611173]  worker_thread+0x26/0x3c0
[    5.611176]  ? trace_event_raw_event_workqueue_execute_start+0xa0/0xa0
[    5.611181]  kthread+0x10e/0x130
[    5.611185]  ? kthread_create_worker_on_cpu+0x40/0x40
[    5.611189]  ret_from_fork+0x35/0x40
[    5.611192] Code: ff 0f 0b e9 25 ff ff ff 66 90 66 2e 0f 1f 84 00 00 00 00 00 41 55 41 54 4c 8d a7 50 01 00 00 55 53 48 8b 87 88 01 00 00 48 89 fb <48> 8b 28 8b 05 e8 8b e4 00 85 c0 0f 8f df 00 00 00 83 bd 2c 01 
[    5.611229] RIP: scsi_times_out+0x17/0x150 RSP: ffffc900019e3d78
[    5.611230] CR2: 0000000000000000
Comment 1 Mike Pagano gentoo-dev 2018-06-04 16:17:19 UTC
Hi,

Did that patch work for you? I see some other SCSI patches coming out in stable in the next few days. You can either test the one you referenced or wait a bit.

Mike
Comment 2 Till Schäfer 2018-06-04 17:49:04 UTC
I have not tested the patch. Furthermore, I have been unable to reproduce this bug recently: I have not seen it on any machine for the last two weeks or so.

It was quite frequent a month ago (every second boot or so). I just picked one machine here and examined the logs: the bug did not occur for 7 boots in a row on 4.16.6. After that, I upgraded to gentoo-sources 4.16.11 and have not seen the error since (about 10 boots).

Well, lovely race conditions… I am unsure whether this is fixed in 4.16.11.
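
For anyone who wants to repeat the log check above across several boots, here is a minimal Python sketch. It assumes one saved kernel log per boot is available as a string, and it treats the oops message plus a RIP line in scsi_times_out (both taken from the trace in the description) as the signature of this bug. The function names and the sample input are hypothetical, not part of any existing tool.

```python
import re

# Signature lines copied from the oops in the description. Treating these two
# patterns as "the same bug" is an assumption, not an upstream definition.
OOPS_RE = re.compile(r"BUG: unable to handle kernel NULL pointer dereference")
RIP_RE = re.compile(r"RIP: (?:\d+:)?scsi_times_out\+0x[0-9a-f]+/0x[0-9a-f]+")


def boot_hit(log_text):
    """Return True if one boot's kernel log contains the scsi_times_out oops."""
    return bool(OOPS_RE.search(log_text)) and bool(RIP_RE.search(log_text))


def count_hits(boot_logs):
    """Count affected boots in a list of per-boot log strings (hypothetical input)."""
    return sum(1 for text in boot_logs if boot_hit(text))


if __name__ == "__main__":
    # Two made-up boots: the first reuses lines from the report, the second is clean.
    sample = [
        "[    5.611081] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000\n"
        "[    5.611122] RIP: 0010:scsi_times_out+0x17/0x150\n",
        "[    1.000000] scsi host0: ahci\n",
    ]
    print(count_hits(sample), "of", len(sample), "boots affected")
```

Since this looks like a race, a run of clean boots only lowers the odds that the bug is still present; it does not prove a fix.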
Comment 3 Mike Pagano gentoo-dev 2018-06-12 12:06:42 UTC
Ok, can you reopen this if you see it again, please?
Comment 4 Till Schäfer 2018-06-12 12:42:22 UTC
(In reply to Mike Pagano from comment #3)
> Ok, can you reopen this if you see it again, please?
Yes.