I get a kernel panic from time to time, roughly once a week on an office laptop that is used daily. Two shots of the panic messages can be found at https://www.dropbox.com/sh/lqfsr0b0n6p67vd/evR8ZSJlcA .

The first line is:

IP: [<ffffffff8182554d>] smp_irq_move_cleanup_interrupt+0xad/0x130

Possibly related, but I'm unsure:

https://patchwork.kernel.org/patch/1600651/
https://bugzilla.redhat.com/show_bug.cgi?id=869341
http://permalink.gmane.org/gmane.linux.redhat.fedora.extras.cvs/890603
From what I read, that patch should fix it. Checking the latest kernel sources, the fix is present:

> 54168ed7f arch/x86/kernel/io_apic_32.c (Ingo Molnar 2008-08-20 09:07:45 +0200 2247) cfg = irq_cfg(irq);
> 94777fc51 arch/x86/kernel/apic/io_apic.c (Dimitri Sivanich 2012-10-16 07:50:21 -0500 2248) if (!cfg)
> 94777fc51 arch/x86/kernel/apic/io_apic.c (Dimitri Sivanich 2012-10-16 07:50:21 -0500 2249) continue;
> 94777fc51 arch/x86/kernel/apic/io_apic.c (Dimitri Sivanich 2012-10-16 07:50:21 -0500 2250)
> 239007b84 arch/x86/kernel/apic/io_apic.c (Thomas Gleixner 2009-11-17 16:46:45 +0100 2251) raw_spin_lock(&desc->lock);

This was introduced with the following commit:

commit 94777fc51b3ad85ff9f705ddf7cdd0eb3bbad5a6
Author: Dimitri Sivanich <sivanich@sgi.com>
Date: Tue Oct 16 07:50:21 2012 -0500

> x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
>
> Posting this patch to fix an issue concerning sparse irq's that
> I raised a while back. There was discussion about adding
> refcounting to sparse irqs (to fix other potential race
> conditions), but that does not appear to have been addressed
> yet. This covers the only issue of this type that I've
> encountered in this area.
>
> A NULL pointer dereference can occur in
> smp_irq_move_cleanup_interrupt() if we haven't yet setup the
> irq_cfg pointer in the irq_desc.irq_data.chip_data.
>
> In create_irq_nr() there is a window where we have set
> vector_irq in __assign_irq_vector(), but not yet called
> irq_set_chip_data() to set the irq_cfg pointer.
>
> Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during
> this time, smp_irq_move_cleanup_interrupt() will attempt to
> process the aforementioned irq, but panic when accessing
> irq_cfg.
>
> Only continue processing the irq if irq_cfg is non-NULL.
>
> Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
> Cc: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: Joerg Roedel <joerg.roedel@amd.com>
> Cc: Yinghai Lu <yinghai@kernel.org>
> Cc: Alexander Gordeev <agordeev@redhat.com>
> Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.com
> Signed-off-by: Ingo Molnar <mingo@kernel.org>

The commit landed between 3.7_rc2 and 3.7_rc3, so any 3.7 kernel should fix this for you.

You can try unstable kernel versions using the following command:

`echo "sys-kernel/gentoo-sources" >> /etc/portage/package.accept_keywords`

Or you can apply the patch yourself by downloading it and running the following command:

`cd /usr/src/linux && patch -p1 < /path/to/patch`
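To make the guard from the commit message concrete, here is a minimal user-space sketch of the idea, not the kernel code itself. The names `vector_slot` and `cleanup_vectors` are hypothetical stand-ins, and `struct irq_cfg` is reduced to a single field. It only shows how skipping entries whose `cfg` pointer is still NULL avoids the dereference that causes the panic.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct irq_cfg (one field only). */
struct irq_cfg {
    int move_in_progress;
};

/* Hypothetical model of one per-CPU vector entry. */
struct vector_slot {
    int irq;               /* -1 means the vector is unused */
    struct irq_cfg *cfg;   /* NULL until irq_set_chip_data() would have run */
};

/*
 * Walk the vector table roughly the way the cleanup handler does and
 * count the entries that are safe to process. Entries whose cfg is still
 * NULL are skipped, mirroring the "if (!cfg) continue;" lines above.
 */
int cleanup_vectors(const struct vector_slot *table, size_t n)
{
    int processed = 0;

    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct irq_cfg *cfg;

        if (table[i].irq < 0)
            continue;      /* unused vector */

        cfg = table[i].cfg;
        if (!cfg)
            continue;      /* vector assigned, but cfg not set up yet */

        if (cfg->move_in_progress)
            continue;      /* simplified: migration not finished yet */

        processed++;
    }
    return processed;
}
```

Without the `if (!cfg)` check, the `cfg->move_in_progress` access would dereference NULL for an entry caught in the window described in the commit message.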
(In reply to comment #1)
> Looks like that patch should fix it from what I read.
>
> [...]
>
> Which was committed between 3.7_rc2 and 3.7_rc3 so any 3.7 kernel should fix
> this for you.

OK, I'm now running =sys-kernel/gentoo-sources-3.7.3.

Marking this 'Resolved Test-Request' is probably appropriate, since my test is rather open-ended (I don't know how to trigger the panic). If I get the kernel panic again, I'll come back to this bug and we can reopen it.