I have a removable hard drive rack for (E)IDE hard drives. When trying to make use of the hotswap ability of the drive, it causes a complete lockup of the system, along with a kernel bug report. The drive in question was installed as the primary drive on the secondary chain.

Reproducible: Always

Steps to Reproduce:
1. No drive in removable media bay.
2. Put removable drive into bay.

Expected Results:
It should have allowed the drive to be mounted once it was in place in the system. Even if hot-swap drives are not supported, it should simply refuse to acknowledge the drive's existence, not cause a system crash/lockup.

The error output is as follows:

journal-601, buffer write failed
kernel BUG at prints.c:334!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c0220ec7>] Tainted: GF
EFLAGS: 00010282
eax: 00000024 ebx: f740ce00 ecx: 00000001 edx: f4040000
esi: 00000000 edi: 00000007 ebp: f740ce00 esp: c1b6fed4
ds: 0018 es: 0018 ss: 0018
Process kupdated (pid: 7, stackpage=c1b6f000)
Stack: c032dcb9 c0159200 c033b740 c1b6fef4 f8a7f4ec c022a841 f740ce00 c033b740
       0000000a 00000008 00000000 f62ab250 00000000 00000009 f62a2000 00000000
       c022de8d f740ce00 f8a7f4ec 00000001 f8a88388 00000004 00000002 00000000
Call Trace: [<c022a841>] [<c022de8d>] [<c022d247>] [<c021e554>] [<c01d2c71>]
  [<c01d2282>] [<c01d257f>] [<c019c1f6>] [<c01d24b0>]
Code: 0f 0b 4e 01 07 12 33 c0 85 db 68 00 92 15 c0 74 10 0f b7 43

Also, right before the crash, the console gave the following messages several times in a row:

hda: status error: status=0x00 { }
hda: drive not ready for command
ide0: reset: master: error (0x00?)

My motherboard (crash occurs with drives connected to onboard controller): Tyan Tiger-MP
1) Please emerge 'ksymoops'.
2) Stick the OOPS into a text file.
3) Run 'ksymoops < file_with_oops' on the *faulty* kernel and *paste* the output into this bug.
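For readers unfamiliar with what this step does: ksymoops turns the raw addresses in an oops into symbol names. The following is only a minimal sketch of that idea, not ksymoops itself. It assumes a System.map-style table of "address type name" lines, and the function names and the two map entries are hypothetical, chosen for illustration.

```python
import bisect
import re


def parse_system_map(text):
    """Parse System.map-style lines ('c0289b20 T name') into a sorted list."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the nearest symbol at or below address."""
    starts = [addr for addr, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None  # address lies below every known symbol
    base, name = symbols[i]
    return f"{name}+0x{address - base:x}"


def trace_addresses(oops_text):
    """Extract bracketed addresses such as [<c022a841>] from oops text."""
    return [int(m, 16) for m in re.findall(r"\[<([0-9a-f]{8})>\]", oops_text)]


if __name__ == "__main__":
    # Hypothetical map entries, for illustration only.
    sample_map = "c0289b20 T reiserfs_panic\nc0294c00 T flush_commit_list\n"
    table = parse_system_map(sample_map)
    for addr in trace_addresses("EIP: 0010:[<c0289b58>] Call Trace: [<c0294ea1>]"):
        print(f"{addr:08x} {resolve(table, addr)}")
```

Because the lookup only picks the nearest symbol at or below each address, a symbol table that does not match the running kernel will still produce names, just the wrong ones. That is why the decode has to be done against the faulty kernel.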
Umm, 2.4 doesn't support hotplug ide. It's way beyond something we can do to fix that. I don't even know if 2.6 handles it to be honest (though it has a better chance).
<quote> Even if hot-swap drives are not supported, it should simply refuse to acknowledge the drive's existence, not cause a system crash/lockup. </quote>

I've had worse things happen, such as RAID failures, and my system went on happily ever after <tm>, so I see no reason why this shouldn't work. I'm waiting for your stack trace before I can do anything.
plasmaroo: I compiled and ran the program you told me to. Here is the output:

>>EIP; c0220ec7 <madvise_willneed+107/180> <=====
>>ebx; f740ce00 <_end+370193e3/3867c643>
>>edx; f4040000 <_end+33c4c5e3/3867c643>
>>ebp; f740ce00 <_end+370193e3/3867c643>
>>esp; c1b6fed4 <_end+177c4b7/3867c643>

Trace; c022a841 <free_limit+51/170>
Trace; c022de8d <sys_swapon+37d/7a0>
Trace; c022d247 <try_to_unuse+77/360>
Trace; c021e554 <filemap_fdatasync+94/100>
Trace; c01d2c71 <linux_logo16+4b1/c80>
Trace; c01d2282 <linux_logo+1402/1900>
Trace; c01d257f <linux_logo+16ff/1900>
Trace; c019c1f6 <raw_devices+1d76/2400>
Trace; c01d24b0 <linux_logo+1630/1900>

Code; c0220ec7 <madvise_willneed+107/180>
00000000 <_EIP>:
Code; c0220ec7 <madvise_willneed+107/180> <=====
   0: 0f 0b                 ud2a <=====
Code; c0220ec9 <madvise_willneed+109/180>
   2: 4e                    dec %esi
Code; c0220eca <madvise_willneed+10a/180>
   3: 01 07                 add %eax,(%edi)
Code; c0220ecc <madvise_willneed+10c/180>
   5: 12 33                 adc (%ebx),%dh
Code; c0220ece <madvise_willneed+10e/180>
   7: c0 85 db 68 00 92 15  rolb $0x15,0x920068db(%ebp)
Code; c0220ed5 <madvise_willneed+115/180>
   e: c0                    (bad)
Code; c0220ed6 <madvise_willneed+116/180>
   f: 74 10                 je 21 <_EIP+0x21>
Code; c0220ed8 <madvise_willneed+118/180>
  11: 0f b7 43 00           movzwl 0x0(%ebx),%eax

1 warning issued. Results may not be reliable.
For some reason, there's a lot of <linux_logo> stuff there, and I assume that's a reliable trace as ksymoops says it is. Try disabling your framebuffer/... and see if that would help?
Also, what kernel are you using? Try gentoo-sources/vanilla-sources or try 2.6...
I was using the general gentoo-sources kernel (2.4.20). Per suggestion, I did try both vanilla-sources (2.4.22) and gentoo-dev-sources (2.6.0-test9). Both of those kernels had the same problem. I thought the problem might be with the /devfs kernel option, and tried compiling without it, and the crash still occurred. I recompiled without the frame buffer, and did another ksymoops:

>>EIP; c0289b58 <reiserfs_panic+38/70> <=====
>>ebx; c1fe3800 <_end+1c10663/3862eec3>
>>edx; c0107440 <log_wait+4/c>
>>ebp; c1fe3800 <_end+1c10663/3862eec3>
>>esp; c34a7ebc <_end+30d4d1f/3862eec3>

Trace; c0294ea1 <flush_commit_list+2a1/420>
Trace; c0298e2f <do_journal_end+61f/b10>
Trace; c02980d7 <flush_old_commits+117/1a0>
Trace; c02eab9c <ide_do_request+cc/1d0>
Trace; c02867c0 <reiserfs_write_super+70/80>
Trace; c022a0e0 <sync_supers+d0/160>
Trace; c022904c <sync_old_buffers+3c/b0>
Trace; c022943b <kupdate+fb/140>
Trace; c01de6de <arch_kernel_thread+2e/40>
Trace; c0229340 <kupdate+0/140>

Code; c0289b58 <reiserfs_panic+38/70>
00000000 <_EIP>:
Code; c0289b58 <reiserfs_panic+38/70> <=====
   0: 0f 0b           ud2a <=====
Code; c0289b5a <reiserfs_panic+3a/70>
   2: 4e              dec %esi
Code; c0289b5b <reiserfs_panic+3b/70>
   3: 01 3f           add %edi,(%edi)
Code; c0289b5d <reiserfs_panic+3d/70>
   5: af              scas %es:(%edi),%eax
Code; c0289b5e <reiserfs_panic+3e/70>
   6: 39 c0           cmp %eax,%eax
Code; c0289b60 <reiserfs_panic+40/70>
   8: 85 db           test %ebx,%ebx
Code; c0289b62 <reiserfs_panic+42/70>
   a: 74 0e           je 1a <_EIP+0x1a>
Code; c0289b64 <reiserfs_panic+44/70>
   c: 0f b7 43 08     movzwl 0x8(%ebx),%eax
Code; c0289b68 <reiserfs_panic+48/70>
  10: 89 04 24        mov %eax,(%esp,1)
Code; c0289b6b <reiserfs_panic+4b/70>
  13: e8 00 00 00 00  call 18 <_EIP+0x18>

1 warning issued. Results may not be reliable.

...awaiting further instructions...
Okay, I did some more testing and found out that this isn't a kernel bug at all. It seems that the problem was actually with my power supply sending a small surge whenever I "turned on" the hotswap drive - which knocked my root drive offline. Sorry about wasting your time on this one...
Thanks, resolving as such. [ ... which would explain the sync and the reiserfs stuff which is in the trace as your root HD is getting knocked out. Wonders of stacktraces ;-) ]