Bug 32251 - kernel crash when using removable IDE hard drive
Summary: kernel crash when using removable IDE hard drive
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: x86-kernel@gentoo.org (DEPRECATED)
Reported: 2003-10-29 02:10 UTC by Andrew Rysavy
Modified: 2003-11-03 08:26 UTC
Description Andrew Rysavy 2003-10-29 02:10:00 UTC
I have a removable hard drive rack for (E)IDE hard drives.
When trying to make use of the hotswap ability of the drive, it causes a 
complete lockup of the system, along with a kernel bug report.
The drive in question was installed as the primary drive on the secondary chain.

Reproducible: Always
Steps to Reproduce:
1. No drive in removable media bay.
2. Put removable drive into bay.



Expected Results:  
It should have allowed the drive to be mounted once it was in place in the 
system.
Even if hot-swap drives are not supported, it should simply refuse to 
acknowledge the drive's existence, not cause a system crash/lockup.

The error output is as follows:

journal-601, buffer write failed
kernel BUG at prints.c:334!
invalid operand:  0000
CPU:    0
EIP:    0010:[<c0220ec7>]    Tainted: GF
EFLAGS: 00010282
eax: 00000024   ebx: f740ce00   ecx: 00000001   edx: f4040000
esi: 00000000   edi: 00000007   ebp: f740ce00   esp: c1b6fed4
ds: 0018   es: 0018   ss: 0018
Process kupdated (pid: 7, stackpage=c1b6f000)
Stack: c032dcb9 c0159200 c033b740 c1b6fef4 f8a7f4ec c022a841 f740ce00 c033b740
0000000a 00000008 00000000 f62ab250 00000000 00000009 f62a2000 00000000
c022de8d f740ce00 f8a7f4ec 00000001 f8a88388 00000004 00000002 00000000
Call Trace:    [<c022a841>] [<c022de8d>] [<c022d247>] [<c021e554>] [<c01d2c71>]
[<c01d2282>] [<c01d257f>] [<c019c1f6>] [<c01d24b0>]

Code: 0f 0b 4e 01 07 12 33 c0 85 db 68 00 92 15 c0 74 10 0f b7 43


Also, right before the crash, the console gave the following messages several 
times in a row:

hda: status error: status=0x00 { }
hda: drive not ready for command
ide0: reset: master: error (0x00?)
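
The empty braces in the "status=0x00 { }" message above can be read as "no status flags set". The sketch below decodes an IDE status byte into flag names. The bit masks follow the ATA status register layout, and the flag names are an assumption based on the labels the 2.4 IDE driver is understood to print; neither is quoted from this bug. The helper names decode_status and format_status are hypothetical.

```python
# Bit masks for the ATA status register, most significant bit first.
# Flag names are assumed to mirror the labels the 2.4 IDE driver prints.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of all flags set in an 8-bit status value."""
    return [name for mask, name in STATUS_BITS if status & mask]


def format_status(status):
    """Render a status byte in the style of the console message."""
    flags = decode_status(status)
    if not flags:
        return f"status=0x{status:02x} {{ }}"
    return f"status=0x{status:02x} {{ {' '.join(flags)} }}"


if __name__ == "__main__":
    print(format_status(0x00))
    print(format_status(0x51))
```

A status of 0x00 has not even the DriveReady bit set, which fits a drive that is not responding at all rather than one reporting a specific error.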


My motherboard (crash occurs with drives connected to onboard controller):
Tyan Tiger-MP
Comment 1 Tim Yamin (RETIRED) gentoo-dev 2003-10-29 14:55:39 UTC
1) Please emerge 'ksymoops'
2) Stick the OOPS into a text file
3) Run 'ksymoops < file_with_oops' on the *faulty* kernel and *paste* the
output into this bug.
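
Why the faulty kernel matters: the oops contains only raw addresses, and a tool like ksymoops has to map each one to the nearest preceding symbol in that kernel's symbol table. The sketch below illustrates the idea only; it is not how ksymoops itself is implemented. The three symbol start addresses are derived (address minus offset) from the second ksymoops output posted in this bug, the sample oops text is made up for the demonstration, and the function names are hypothetical.

```python
import bisect
import re

# Symbol start addresses, derived from entries such as
# "c0289b58 <reiserfs_panic+38/70>" in this bug. A real run would
# load the full table from the System.map of the crashing kernel.
SYMBOLS = sorted([
    (0xC0229340, "kupdate"),
    (0xC0289B20, "reiserfs_panic"),
    (0xC0294C00, "flush_commit_list"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr):
    """Return (symbol, offset) for the nearest symbol at or below addr."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # address lies below every known symbol
    start, name = SYMBOLS[i]
    return name, addr - start


def trace_addresses(oops_text):
    """Extract bracketed addresses such as [<c022a841>] from oops text."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-f]{8})>\]", oops_text)]


if __name__ == "__main__":
    # Made-up sample in the layout of a 2.4 oops.
    sample = "EIP:    0010:[<c0289b58>]\nCall Trace:    [<c0294ea1>] [<c022943b>]"
    for addr in trace_addresses(sample):
        name, offset = resolve(addr)
        print(f"{addr:08x} <{name}+{offset:x}>")
```

Because the lookup simply picks the closest symbol below each address, a symbol table from a different kernel build still produces names, just the wrong ones.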
Comment 2 Brian Jackson (RETIRED) gentoo-dev 2003-10-29 16:16:30 UTC
Umm, 2.4 doesn't support hotplug IDE. Fixing that is way beyond anything we
can do. To be honest, I don't even know if 2.6 handles it (though it has a
better chance).
Comment 3 Tim Yamin (RETIRED) gentoo-dev 2003-10-30 10:20:11 UTC
<quote>
Even if hot-swap drives are not supported, it should simply refuse to 
acknowledge the drive's existence, not cause a system crash/lockup.
</quote>

I've had worse things happening such as RAID failures, and my system went
on happily ever after <tm> so I see no reason why this shouldn't work. I'm
waiting for your stack trace before I can do anything.
Comment 4 Andrew Rysavy 2003-10-31 03:22:01 UTC
plasmaroo:  I compiled and ran the program you told me to.  Here is the output:

>>EIP; c0220ec7 <madvise_willneed+107/180>   <=====

>>ebx; f740ce00 <_end+370193e3/3867c643>
>>edx; f4040000 <_end+33c4c5e3/3867c643>
>>ebp; f740ce00 <_end+370193e3/3867c643>
>>esp; c1b6fed4 <_end+177c4b7/3867c643>

Trace; c022a841 <free_limit+51/170>
Trace; c022de8d <sys_swapon+37d/7a0>
Trace; c022d247 <try_to_unuse+77/360>
Trace; c021e554 <filemap_fdatasync+94/100>
Trace; c01d2c71 <linux_logo16+4b1/c80>
Trace; c01d2282 <linux_logo+1402/1900>
Trace; c01d257f <linux_logo+16ff/1900>
Trace; c019c1f6 <raw_devices+1d76/2400>
Trace; c01d24b0 <linux_logo+1630/1900>

Code;  c0220ec7 <madvise_willneed+107/180>
00000000 <_EIP>:
Code;  c0220ec7 <madvise_willneed+107/180>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c0220ec9 <madvise_willneed+109/180>
   2:   4e                        dec    %esi
Code;  c0220eca <madvise_willneed+10a/180>
   3:   01 07                     add    %eax,(%edi)
Code;  c0220ecc <madvise_willneed+10c/180>
   5:   12 33                     adc    (%ebx),%dh
Code;  c0220ece <madvise_willneed+10e/180>
   7:   c0 85 db 68 00 92 15      rolb   $0x15,0x920068db(%ebp)
Code;  c0220ed5 <madvise_willneed+115/180>
   e:   c0                        (bad)
Code;  c0220ed6 <madvise_willneed+116/180>
   f:   74 10                     je     21 <_EIP+0x21>
Code;  c0220ed8 <madvise_willneed+118/180>
  11:   0f b7 43 00               movzwl 0x0(%ebx),%eax


1 warning issued.  Results may not be reliable.
Comment 5 Tim Yamin (RETIRED) gentoo-dev 2003-10-31 04:53:20 UTC
For some reason, there's a lot of <linux_logo> stuff there, and I assume
that's a reliable trace as ksymoops says it is. Try disabling your framebuffer/...
and see if that would help?
Comment 6 Tim Yamin (RETIRED) gentoo-dev 2003-10-31 04:54:59 UTC
Also, what kernel are you using? Try gentoo-sources/vanilla-sources or try
2.6...
Comment 7 Andrew Rysavy 2003-11-03 07:24:17 UTC
I was using the general gentoo-sources kernel (2.4.20). Per suggestion, I
did try both vanilla-sources (2.4.22) and gentoo-dev-sources (2.6.0-test9).

Both of those kernels had the same problem.

I thought the problem might be with the devfs kernel option, so I tried compiling
without it, but the crash still occurred.

I recompiled without the frame buffer, and did another ksymoops:

>>EIP; c0289b58 <reiserfs_panic+38/70>   <=====

>>ebx; c1fe3800 <_end+1c10663/3862eec3>
>>edx; c0107440 <log_wait+4/c>
>>ebp; c1fe3800 <_end+1c10663/3862eec3>
>>esp; c34a7ebc <_end+30d4d1f/3862eec3>

Trace; c0294ea1 <flush_commit_list+2a1/420>
Trace; c0298e2f <do_journal_end+61f/b10>
Trace; c02980d7 <flush_old_commits+117/1a0>
Trace; c02eab9c <ide_do_request+cc/1d0>
Trace; c02867c0 <reiserfs_write_super+70/80>
Trace; c022a0e0 <sync_supers+d0/160>
Trace; c022904c <sync_old_buffers+3c/b0>
Trace; c022943b <kupdate+fb/140>
Trace; c01de6de <arch_kernel_thread+2e/40>
Trace; c0229340 <kupdate+0/140>

Code;  c0289b58 <reiserfs_panic+38/70>
00000000 <_EIP>:
Code;  c0289b58 <reiserfs_panic+38/70>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c0289b5a <reiserfs_panic+3a/70>
   2:   4e                        dec    %esi
Code;  c0289b5b <reiserfs_panic+3b/70>
   3:   01 3f                     add    %edi,(%edi)
Code;  c0289b5d <reiserfs_panic+3d/70>
   5:   af                        scas   %es:(%edi),%eax
Code;  c0289b5e <reiserfs_panic+3e/70>
   6:   39 c0                     cmp    %eax,%eax
Code;  c0289b60 <reiserfs_panic+40/70>
   8:   85 db                     test   %ebx,%ebx
Code;  c0289b62 <reiserfs_panic+42/70>
   a:   74 0e                     je     1a <_EIP+0x1a>
Code;  c0289b64 <reiserfs_panic+44/70>
   c:   0f b7 43 08               movzwl 0x8(%ebx),%eax
Code;  c0289b68 <reiserfs_panic+48/70>
  10:   89 04 24                  mov    %eax,(%esp,1)
Code;  c0289b6b <reiserfs_panic+4b/70>
  13:   e8 00 00 00 00            call   18 <_EIP+0x18>

1 warning issued.  Results may not be reliable.


...awaiting further instructions...
Comment 8 Andrew Rysavy 2003-11-03 08:09:19 UTC
Okay, I did some more testing and found out that this isn't a kernel bug
at all.

It seems that the problem was actually with my power supply sending a small
surge whenever I "turned on" the hotswap drive - which knocked my root drive
offline.

Sorry about wasting your time on this one...
Comment 9 Tim Yamin (RETIRED) gentoo-dev 2003-11-03 08:26:53 UTC
Thanks, resolving as such.

[ ... which would explain the sync and the reiserfs stuff in the trace, as
your root HD is getting knocked out. Wonders of stacktraces ;-) ]