Bug 368999 - sys-kernel/gentoo-sources-2.6.38-r5: NULL pointer dereference in elv_queue_empty
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: http://thread.gmane.org/gmane.linux.k...
Whiteboard: linux-2.6.39.1, linux-2.6.38.8, gento...
Keywords:
Duplicates: 369843
Depends on:
Blocks:
 
Reported: 2011-05-28 10:05 UTC by Martin von Gagern
Modified: 2011-06-10 02:18 UTC (History)



Attachments
Kernel configuration (config-2.6.38-gentoo-r5,79.67 KB, application/octet-stream)
2011-05-28 10:05 UTC, Martin von Gagern
Back ported patch for gentoo-sources-2.6.38-r5 (2100_state-guards-to-elv-next-request-fix.patch,372 bytes, patch)
2011-06-03 18:44 UTC, Mike Pagano

Description Martin von Gagern 2011-05-28 10:05:53 UTC
Created attachment 274865 [details]
Kernel configuration

While I was plugging in or mounting (I am not sure which) my digital camera over USB, the system switched to text mode and printed the following kernel bug trace. After that I had to reboot using the reset button. The issue isn't easily reproducible: I have used that device often without any issues so far, and even trying again immediately after the reboot caused no problems at all. The trace was copied by hand from a photo taken with that very camera, so it may contain some typos. If you doubt a specific part, I can check it against the photo for you.

BUG: unable to handle kernel
sd 7:0:0:0: [sdc] Incomplete mode parameter data
sd 7:0:0:0: [sdc] Assuming drive cache: write through
sd 7:0:0:0: [sdc] Adjusting the sector count from its reported value: 15954944
sd 7:0:0:0: [sdc] Incomplete mode parameter data
sd 7:0:0:0: [sdc] Assuming drive cache: write through
 sdc: sdc1
sdc: p1 size 15946752 extends beyond EOD, enabling native capacity
NULL pointer dereference at 0000000000000048
IP: [<ffffffff81194bcb>] elv_queue_empty+0x1b/0x40
PGD 3ff03067 PUD 3ff0d067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/virtual/bdi/8:32/uevent
CPU 0
Modules linked in: usb_storage uas snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss tun autofs4 ipv6 btrfs zlib_deflate libcrc32c dm_mod kvm_amd kvm fuse nfs lockd auth_rpcgss nf_conntrack_h323 nf_conntrack_sip nf_conntrack_irc nf_conntrack_ftp nf_conntrack uhci_hcd sunrpc loop lirc_i2c tuner_simple tuner_types tuner msp3400 usbhid nouveau bttv ttm v4l2_common videodev drm_kms_helper v4l2_compat_ioctl32 sr_mod cdrom snd_hda_codec_via lirc_dev ir_sony_decoder drm videobuf_dma_sg i2c_algo_bit ir_jvc_decoder videobuf_core video ir_rc6_decoder ir_rc5_decoder btcx_risc snd_hda_intel snd_hda_codec snd_bt87x snd_hwdep snd_pcm ata_generic ohci_hcd ehci_hcd snd_timer ir_nec_decoder backlight snd usbcore rc_core sym53c8xx pata_atiixp button asus_atk0110 evdev r8169 sg soundcore snd_page_alloc k10temp tveeprom i2c_core pcspkr scsi_transport_spi mii

Pid: 3, comm: ksoftirqd/0 Not tainted 2.6.38-gentoo-r5 #3 System manufacturer System Product Name/M4A785TD-V EVO
RIP: 0010:[<ffffffff81194bcb>] [<ffffffff81194bcb>] elv_queue_empty+0x1b/0x40
RSP: 0018:ffff8800cfc03e08  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88004060d968 RCX: 000000000005c9e8
RDX: ffff8800859d7300 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88004060dc88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8800cfc03e68 R14: ffff88012ecd3040 R15: ffff8800406aa000
FS:  00007f12232fd700(0000) GS:ffff8800cfc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000048 CR3: 00000000af997000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ksoftirqd/0 (pid: 3, threadinfo ffff88012fcb2000, task ffff88012fca8cc0)
Stack:
 ffff88004060d968 ffffffff81197b1b ffff88004060d968 0000000000000292
 ffff88004060d968 ffffffff81197d3a ffff88012ecd3000 ffff88012ecd3000
 ffff8800cfc03e68 ffffffff8125e1aa ffff880055abcac0 0000000000000246
Call Trace:
 <IRQ>
 [<ffffffff81197b1b>] ? __blk_run_queue+0x3b/0x180
 [<ffffffff81197d3a>] ? blk_run_queue+0x2a/0x50
 [<ffffffff8125e1aa>] ? scsi_run_queue+0xda/0x350
 [<ffffffff812602eb>] ? scsi_next_command+0x3b/0x60
 [<ffffffff812605c2>] ? scsi_io_completion+0x252/0x570
 [<ffffffff8119d57d>] ? blk_done_softirq+0x6d/0x80
 [<ffffffff810409c0>] ? __do_softirq+0x90/0x110
 [<ffffffff8100348c>] ? call_softirq+0x1c/0x30
 <EOI>
 [<ffffffff810055ad>] ? do_softirq+0x4d/0x80
 [<ffffffff810402e3>] ? run_ksoftirqd+0xb3/0x1e0
 [<ffffffff81040230>] ? run_ksoftirqd+0x0/0x1e0
 [<ffffffff81040230>] ? run_ksoftirqd+0x0/0x1e0
 [<ffffffff81055676>] ? kthread+0x96/0xa0
 [<ffffffff81003394>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff810555e0>] ? kthread+0x0/0xa0
 [<ffffffff81003390>] ? kernel_thread_helper+0x0/0x10
Code: eb 94 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 08 31 c0 48 3b 3f 48 8b 57 18 74 09 48 83 c4 08 c3 0f 1f 40 00 48 8b 02 <48> 8b 40 48 48 85 c0 74 0c 48 83 c4 08 ff e0 66 0f 1f 44 00 00
RIP  [<ffffffff81194bcb>] elv_queue_empty+0x1b/0x40
 RSP <ffff8800cfc03e08>
CR2: 0000000000000048

ksymoops doesn't have much to add to this, except a disassembly of elv_queue_empty:
  00:   48 83 ec 08               sub    $0x8,%rsp
  04:   31 c0                     xor    %eax,%eax
  06:   48 3b 3f                  cmp    (%rdi),%rdi
  09:   48 8b 57 18               mov    0x18(%rdi),%rdx
  0d:   74 09                     je     28 <_RIP+0x28>
  0f:   48 83 c4 08               add    $0x8,%rsp
  13:   c3                        retq   
  14:   0f 1f 40 00               nopl   0x0(%rax)
  18:   48 8b 02                  mov    (%rdx),%rax
  1b:   48 8b 40 48               mov    0x48(%rax),%rax   <=====
  1f:   48 85 c0                  test   %rax,%rax
  22:   74 0c                     je     40 <_RIP+0x40>
  24:   48 83 c4 08               add    $0x8,%rsp
  28:   ff e0                     jmpq   *%rax
  2a:   66 0f 1f 44 00 00         nopw   0x0(%rax,%rax,1)
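
To cross-check the hand-copied trace, the position of the byte wrapped in `<..>` on the "Code:" line can be compared with the offset reported in "RIP: elv_queue_empty+0x1b". The following is only a sketch, assuming the usual oops convention that the marked byte is the one at RIP; the helper names `marked_byte_index` and `function_start_index` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The "Code:" line from the trace above, split only for readability. */
const char oops_code[] =
    "eb 94 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 "
    "48 83 ec 08 31 c0 48 3b 3f 48 8b 57 18 74 09 48 "
    "83 c4 08 c3 0f 1f 40 00 48 8b 02 <48> 8b 40 48 48 "
    "85 c0 74 0c 48 83 c4 08 ff e0 66 0f 1f 44 00 00";

/* Hypothetical helper: zero-based index of the byte wrapped in <..>,
 * or -1 if no byte in the dump is marked. */
int marked_byte_index(const char *code)
{
    int index = 0;
    const char *p = code;

    assert(code != NULL);
    while (*p) {
        while (*p == ' ')           /* skip separators */
            p++;
        if (*p == '\0')
            break;
        if (*p == '<')              /* this token is the byte at RIP */
            return index;
        while (*p && *p != ' ')     /* skip one unmarked byte token */
            p++;
        index++;
    }
    return -1;
}

/* Hypothetical helper: index within the dump at which the function begins,
 * given the marked index and the "+0x.." offset from the RIP line.
 * A negative result means the function starts before the dump does. */
int function_start_index(int marked_index, int rip_offset)
{
    return marked_index - rip_offset;
}
```

For this report, `marked_byte_index(oops_code)` is 43 and `function_start_index(43, 0x1b)` is 16. Byte 16 of the dump is the `48 83 ec 08` that opens the disassembly above, so the Code line and the RIP offset agree with each other.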
Comment 1 Martin von Gagern 2011-05-28 10:29:31 UTC
Comparing the assembly with the function code:

int elv_queue_empty(struct request_queue *q)
{
	struct elevator_queue *e = q->elevator;

	if (!list_empty(&q->queue_head))
		return 0;

	if (e->ops->elevator_queue_empty_fn)
		return e->ops->elevator_queue_empty_fn(q);

	return 1;
}

 00:   48 83 ec 08         sub    $0x8,%rsp        // setup stack
 04:   31 c0               xor    %eax,%eax        // constant 0
 06:   48 3b 3f            cmp    (%rdi),%rdi      // first if
 09:   48 8b 57 18         mov    0x18(%rdi),%rdx  // e = q->elevator
 0d:   74 09               je     28 <_RIP+0x28>   // skip first if block
 0f:   48 83 c4 08         add    $0x8,%rsp        // clean stack
 13:   c3                  retq                    // return the 0 above
 14:   0f 1f 40 00         nopl   0x0(%rax)        // padding
 18:   48 8b 02            mov    (%rdx),%rax      // e->ops
 1b:   48 8b 40 48  <===>  mov    0x48(%rax),%rax  // ->elevator_queue_empty_fn
 1f:   48 85 c0            test   %rax,%rax        // second if
 22:   74 0c               je     40 <_RIP+0x40>   // skip second if block
 24:   48 83 c4 08         add    $0x8,%rsp        // clean stack
 28:   ff e0               jmpq   *%rax            // tail-call empty_fn
 2a:   66 0f 1f 44 00 00   nopw   0x0(%rax,%rax,1) // padding

So it seems that e->ops == NULL in this case.
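
The reasoning can be reproduced in user space with mock structures. This is a sketch under assumptions, not kernel code: the `mock_*` types are hypothetical stand-ins, and the nine placeholder pointers are chosen only so that the member lands at 0x48 on a 64-bit build, as in the disassembly. The NULL guard is purely illustrative and is not meant to represent the change made by the patch referenced in this bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mock_queue;

/* Hypothetical stand-in for the ops table: nine pointer-sized hooks come
 * first, which puts the member of interest at offset 0x48 on LP64. */
struct mock_elevator_ops {
    void *earlier_hooks[9];
    int (*elevator_queue_empty_fn)(struct mock_queue *q);
};

/* ops sits at offset 0, matching "mov (%rdx),%rax". */
struct mock_elevator_queue {
    struct mock_elevator_ops *ops;
};

struct mock_queue {
    int pending;    /* stand-in for !list_empty(&q->queue_head) */
    struct mock_elevator_queue *elevator;
};

/* Address the CPU would read for ops->elevator_queue_empty_fn.
 * With ops == NULL this is just the member offset, i.e. the CR2 value. */
uintptr_t faulting_address(const struct mock_elevator_ops *ops)
{
    return (uintptr_t)ops +
           offsetof(struct mock_elevator_ops, elevator_queue_empty_fn);
}

/* Same control flow as elv_queue_empty(), plus an illustrative guard. */
int mock_elv_queue_empty(struct mock_queue *q)
{
    struct mock_elevator_queue *e;

    assert(q != NULL);
    e = q->elevator;

    if (q->pending)
        return 0;

    /* Illustrative only: treat a torn-down elevator as "empty". */
    if (!e || !e->ops)
        return 1;

    if (e->ops->elevator_queue_empty_fn)
        return e->ops->elevator_queue_empty_fn(q);

    return 1;
}
```

On a 64-bit build `faulting_address(NULL)` equals 0x48, which matches CR2 in the trace together with RAX being 0 at the marked instruction.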
Comment 2 Chí-Thanh Christopher Nguyễn gentoo-dev 2011-05-29 10:32:01 UTC
This is a regression between 2.6.38.5 and 2.6.38.6.

Patch is referenced in the thread.
Comment 3 Mike Pagano gentoo-dev 2011-06-03 18:44:52 UTC
Created attachment 275705 [details, diff]
Back ported patch for gentoo-sources-2.6.38-r5

Can someone please test this against gentoo-sources-2.6.38-r5 and post the results?
Comment 4 Stratos Psomadakis (RETIRED) gentoo-dev 2011-06-07 13:31:45 UTC
*** Bug 369843 has been marked as a duplicate of this bug. ***
Comment 5 Martin von Gagern 2011-06-07 13:40:49 UTC
(In reply to comment #3)
> Can someone please test this against gentoo-sources-2.6.38-r5 and post the
> results?

As I couldn't reproduce the issue reliably on my system, I can't provide good testing for the fix either. But bug #369843 comment #4 sounds like Vladimir did test the patch (presumably the same), and he could reproduce reliably.
Comment 6 Mike Pagano gentoo-dev 2011-06-07 18:13:39 UTC
This patch is included in gentoo-sources-2.6.38-r7.

Can someone test that?
Comment 7 Mike Pagano gentoo-dev 2011-06-08 18:22:37 UTC
Patch included in gentoo-sources-2.6.38-r7, gentoo-sources-2.6.39-r1
Comment 8 rpansky 2011-06-10 00:29:27 UTC
This bug is probably worth reopening, and the patch needs some more testing. Running 2.6.38-r7 (alongside Xorg and KDE, as in bug #369843), I periodically encounter kernel panics with various messages after inserting a USB stick.

There are two of them (I was too lazy to write the full traces down :-():

1. BUG: unable to handle kernel paging request at 0000000001000000
IP: [<0000000001000000>] 0x1000000

2. kernel BUG at block/blk-core.c:1932!
invalid opcode: 0000 [#1] SMP
Comment 9 rpansky 2011-06-10 02:15:58 UTC
The trace's tail:

...Registers...
...Stack...

Call Trace:
<IRQ>
[<ffffffff811a792d>] ? blk_run_queue+0x1d/0x50
[<ffffffff8124fd33>] ? scsi_run_queue+0xc3/0x360
[<ffffffff81250bfb>] ? scsi_next_command+0x3b/0x60
[<ffffffff81251807>] ? scsi_io_completion+0x337/0x530
[<ffffffff811ac3fd>] ? blk_done_softirq+0x6d/0x80
[<ffffffff81040371>] ? __do_softirq+0x91/0x120
[<ffffffff8100334c>] ? call_softirq+0x1c/0x30
<EOI>
[<ffffffff81004fed>] ? do_softirq+0x4d/0x80
[<ffffffff8103ff07>] ? run_ksoftirqd+0x87/0x150
[<ffffffff8103fe80>] ? run_ksoftirqd+0x0/0x150
[<ffffffff8103fe80>] ? run_ksoftirqd+0x0/0x150
[<ffffffff81054aa6>] ? kthread+0x96/0xa0
[<ffffffff81003254>] ? kernel_thread_helper+0x4/0x10
[<ffffffff81054a10>] ? kthread+0x0/0xa0
[<ffffffff81003250>] ? kernel_thread_helper+0x0/0x10

Code: 00 00 00 b8 00 01 00 00 f0 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c3
66 66 2e 0f 1f 84 00 00 00 00 00 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2
74 06 f3 90 8a 17 eb f6 c3 0f 1f 84 00 00

RIP [<ffffffff81344268>] _raw_spin_lock_irqsave+0x8/0x20
RSP <ffff8800cfa83e20>
CR2: 0000000000000000