I'm getting a kernel panic from time to time (roughly weekly, on an office laptop that is used daily). Two photos of the panic messages can be found at https://www.dropbox.com/sh/lqfsr0b0n6p67vd/evR8ZSJlcA . The first line is:
IP: [<ffffffff8182554d>] smp_irq_move_cleanup_interrupt+0xad/0x130
Possibly related, but I'm unsure:
Looks like that patch should fix it from what I read.
Checking into the latest kernel sources, the fix seems present:
>(Ingo Molnar 2008-08-20 09:07:45 +0200 2247) cfg = irq_cfg(irq);
>(Dimitri Sivanich 2012-10-16 07:50:21 -0500 2248) if (!cfg)
>(Dimitri Sivanich 2012-10-16 07:50:21 -0500 2249) continue;
>(Dimitri Sivanich 2012-10-16 07:50:21 -0500 2250)
>(Thomas Gleixner 2009-11-17 16:46:45 +0100 2251) raw_spin_lock(&desc->lock);
This was introduced with the following commit:
Author: Dimitri Sivanich <email@example.com>
Date: Tue Oct 16 07:50:21 2012 -0500
> x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
> Posting this patch to fix an issue concerning sparse irq's that
> I raised a while back. There was discussion about adding
> refcounting to sparse irqs (to fix other potential race
> conditions), but that does not appear to have been addressed
> yet. This covers the only issue of this type that I've
> encountered in this area.
> A NULL pointer dereference can occur in
> smp_irq_move_cleanup_interrupt() if we haven't yet setup the
> irq_cfg pointer in the irq_desc.irq_data.chip_data.
> In create_irq_nr() there is a window where we have set
> vector_irq in __assign_irq_vector(), but not yet called
> irq_set_chip_data() to set the irq_cfg pointer.
> Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during
> this time, smp_irq_move_cleanup_interrupt() will attempt to
> process the aforementioned irq, but panic when accessing irq_cfg.
> Only continue processing the irq if irq_cfg is non-NULL.
> Signed-off-by: Dimitri Sivanich <firstname.lastname@example.org>
> Cc: Suresh Siddha <email@example.com>
> Cc: Joerg Roedel <firstname.lastname@example.org>
> Cc: Yinghai Lu <email@example.com>
> Cc: Alexander Gordeev <firstname.lastname@example.org>
> Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.com
> Signed-off-by: Ingo Molnar <email@example.com>
That commit landed between 3.7_rc2 and 3.7_rc3, so any 3.7 kernel should include the fix. You can try unstable kernel versions using the following command:
`echo "sys-kernel/gentoo-sources" >> /etc/portage/package.accept_keywords`
Or you can try to apply the patch by downloading it and running the following command:
`cd /usr/src/linux && patch -p1 < /path/to/patch`
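To check whether a given kernel release is at least 3.7 (and so should carry the commit quoted above), something like the following rough sketch may help. The function name `kernel_has_fix` is made up for illustration, and it assumes the release string starts with MAJOR.MINOR the way `uname -r` normally prints it. It only compares version numbers: it cannot tell 3.7_rc1/rc2 (which predate the fix) from the 3.7 release, and it knows nothing about patches backported to older kernels.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the given kernel release is >= 3.7.
# Assumption: the release string begins with MAJOR.MINOR, e.g. "3.7.3-gentoo".
kernel_has_fix() {
    # Keep only the leading MAJOR.MINOR, e.g. "3.7.3-gentoo" -> "3.7"
    ver=$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p')
    # Exit status 2 means the version string could not be parsed
    [ -n "$ver" ] || return 2
    major=${ver%%.*}
    minor=${ver#*.}
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

if kernel_has_fix "$(uname -r)"; then
    echo "running kernel is 3.7 or newer"
else
    echo "running kernel is older than 3.7 (or its version could not be parsed)"
fi
```

Keep in mind that `uname -r` reports the kernel that is currently booted, not the sources installed under /usr/src/linux, so run it after rebooting into the new kernel.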
(In reply to comment #1)
> Looks like that patch should fix it from what I read.
> Which was committed between 3.7_rc2 and 3.7_rc3 so any 3.7 kernel should fix
> this for you.
Ok, I'm running =sys-kernel/gentoo-sources-3.7.3.
Perhaps marking this 'Resolved Test-Request' is appropriate, as my test is rather open-ended (I don't know how to trigger the panic). In case I get the kernel panic again, I'll come back to this bug and we can reopen it.