Bug 26967 - Kernel OOPS when PCMCIA card removed
Summary: Kernel OOPS when PCMCIA card removed
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High major
Assignee: x86-kernel@gentoo.org (DEPRECATED)
URL:
Whiteboard:
Keywords:
Duplicates: 26333
Depends on:
Blocks:
 
Reported: 2003-08-19 22:23 UTC by Lindsay Haisley
Modified: 2003-09-24 12:33 UTC


Attachments
Digital photo of kernel oops screen (gentoo_kernel_oops.gif,94.42 KB, image/gif)
2003-08-19 22:26 UTC, Lindsay Haisley
ksymoops output from kernel crash (oops_analysis.txt,3.44 KB, text/plain)
2003-08-20 08:36 UTC, Lindsay Haisley

Description Lindsay Haisley 2003-08-19 22:23:44 UTC
Using kernel 2.4.20-gentoo-r6, pcmcia-cs v3.2.4.  I get a kernel oops when
removing any PCMCIA card from a Dell Latitude CPi D300XT unless I explicitly
eject the card in software first using cardctl.  I compiled a stock 2.4.20
kernel and pcmcia-cs modules for it and card removal is handled properly. 
Modules are loaded and/or unloaded in response to card insertion and removal.

This appears to be a problem with the interrupt handler.  The oops notes a
problem in sched.c, line 1141, along with the observation "interrupt handler -
not syncing".

Reproducible: Always
Steps to Reproduce:
1. Insert a card into the PCMCIA card slot, or boot the machine with a card inserted.
2. After modules are loaded and the card is active, remove the card.

Actual Results:  
Kernel oops.

Expected Results:  
Modules relevant to removed card should be unloaded in response to its removal.

I will amend this bug and attach a rather crappy photo of the kernel oops screen
(sorry about the quality!).
Comment 1 Lindsay Haisley 2003-08-19 22:26:51 UTC
Created attachment 16351
Digital photo of kernel oops screen

Digital photo of kernel oops.  Not great, but relevant information is either
there, or can be inferred.
Comment 2 Tim Yamin (RETIRED) gentoo-dev 2003-08-20 02:40:50 UTC
Thank you, the photo was fine. Can you please:

> [Do this on the buggy gentoo kernel]
> Stick the stuff below into a text file
> emerge ksymoops
> ksymoops < file_with_trace > file_with_output [ as root if you can't find it ]
> And attach file_with_output to Bugzilla...

Also, can you try gentoo-sources-r5 and see if you get the same problem?

Thanks...

--BEGIN-OOPS--
CPU:	0
EIP:	0010:[<c018cd20>]	Not tainted
EFLAGS:	00010002
eax: 00000001   ebx: 00000003   ecx: 00000000   edx: 00000001
esi: c6e8c840   edi: c6e8c858   ebp: c013ddec   esp: c013ddcc
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 0, stackpage=c013d000)
Stack:	66656463 6a696867 6e6d6c6b 7271706f 00000006 c013c000 c6e8c840 c6e8c858
	c6e8c860 c01bd540 c7c94000 c6e8c840 00000000 c150c000 c6f2a4e0 c01d2f93
	c6e8c840 c6e8c840 c01d3d4e c6e8c840 c0108a20 00000001 c6f2a7e0 c7024840
Call Trace:	[<c01bd5b0>] [<c01d2f93>] [<c01d3d4e>] [<c01ed986>] [<c01ebcd4>]
[<c01ed9e0>] [<c01d26cf>] [<c01d0ffa>] [<c01ec243>] [<c01ec2c9>] [<c895c600>]
[<c0220201>] [<c8957930>] [<c89583dc>] [<c8956f75>] [<c895c600>] [<c895e1c0>]
[<c895e130>] [<c019aaed>] [<c01960b3>] [<c0195f86>] [<c0195dcb>] [<c018207e>]
[<c017e660>] [<c01849a3>] [<c017e660>] [<c017e660>] [<c017e683>] [<c017e6e4>]

Code: 0f 0b 75 04 3f c0 2b c0 e9 13 fc ff ff 8d 76 00 55 89 e5 53
<0> Kernel panic: Aiee, killing interrupt handler
In interrupt handler - not syncing
--END-OOPS--
Comment 3 Lindsay Haisley 2003-08-20 08:36:04 UTC
Created attachment 16368
ksymoops output from kernel crash

# ksymoops --system-map=/usr/src/linux/System.map < gentoo_oops.txt > oops_analysis.txt

I trust this is what you want. To the best of my knowledge, this corresponds
to what was running when I shot the oops output which you transcribed, and
everything should match up. The only difference is that I recompiled
pcmcia-cs after going back to the gentoo kernel. If this matters, I can run
the process again, although I don't look forward to hand-copying the screen
output into a text file :-(

This also happened with the r5 kernel. I didn't report it, and hoped that it
might have been fixed in the r6 kernel, which it wasn't, so I decided to report
it.  I do a bit of programming, but I ain't no kernel hacker ;-)  I appreciate
your patience.
Comment 4 Tim Yamin (RETIRED) gentoo-dev 2003-08-26 05:30:26 UTC
Sorry I didn't reply earlier; Bugzilla seems to have a bug where it doesn't notify you about new attachments. Can you recompile your kernel with the following removed [temporarily]: Preemptible Kernel, anything APIC-related, and ACPI?
Comment 5 Tim Yamin (RETIRED) gentoo-dev 2003-08-26 05:33:05 UTC
Never mind that. Looking through the code, an evil, nasty, quick way around this would be to enable Preemptible Kernel. Enable that and essentially you can't have any interrupts as the kernel is preemptible, which should fix that bug.
Comment 6 Tim Yamin (RETIRED) gentoo-dev 2003-08-26 15:34:06 UTC
*** Bug 26333 has been marked as a duplicate of this bug. ***
Comment 7 Jesper Toft 2003-08-27 01:25:13 UTC
Enabling Preemptible Kernel does not fix the problem. Ejecting the card first with 
cardctl works. 
Comment 8 Tim Yamin (RETIRED) gentoo-dev 2003-08-27 01:35:38 UTC
Okay, can you try getting rid of lines 1140 and 1141 from kernel/sched.c and see what happens? [Just comment them out with a "//".]
Comment 9 Jesper Toft 2003-08-27 01:53:18 UTC
Gives: 
 
Oops: 0007 
CPU: 0 
EIP: 0023:[<400e6243>] Not tainted 
EFLAGS: 00010286 
eax: 00000001 ebx: 4014ae00 ecx: bfffafa0 edx: bfffafa0 
esi: 00000001 edi: 0805edd8 ebp: bffffdc48 esp: bffffaf90 
ds: 002b es:002b ss: 002b 
Process devfsd (pid: 160, stackpage=ddddd000) 
<0> Kernel panic: Aiee, killing interrupt handler 
In interrupt handler - not syncing 
 
It all works just fine if I kill devfsd.
 
Comment 10 Tim Yamin (RETIRED) gentoo-dev 2003-08-27 02:21:58 UTC
Try this: [for lines 1140+1141]:

        if (in_interrupt())
                return;

The PCMCIA modules like to call devfs functions [if available] on a timer for some reason. When you get an interrupt, it has no clue what to do with it, so it sends it over to schedule(), which BUG()s out as it's also clueless...
Comment 11 Jesper Toft 2003-08-27 02:48:26 UTC
This works! 
Comment 12 Tim Yamin (RETIRED) gentoo-dev 2003-08-27 02:52:42 UTC
Resolving. I'll try and get this into the next gentoo kernel. Thanks for your help  :-)
Comment 13 Tim Yamin (RETIRED) gentoo-dev 2003-08-28 01:02:52 UTC
*** Bug 27448 has been marked as a duplicate of this bug. ***
Comment 14 Lindsay Haisley 2003-09-22 12:23:40 UTC
I note that this bug is still in gentoo-sources (r7), which I put on a system this past weekend (ca. 9/20/03). This problem apparently occurs in other contexts too, as I found out when a newly installed kernel on a desktop system crashed with the same error. I had to apply the same fix.

Did this slip through the cracks?  Shouldn't it be in gentoo-sources by now?  It's marked as FIXED.
Comment 15 Tim Yamin (RETIRED) gentoo-dev 2003-09-24 12:33:06 UTC
Fixed in CVS, should sync over to Portage soon.