Summary: | shfs-0.35-r1 makes my system freeze completely | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Kristoffer <krek6597> |
Component: | Current packages | Assignee: | AMD64 Project <amd64> |
Status: | RESOLVED TEST-REQUEST | ||
Severity: | critical | ||
Priority: | High | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Kristoffer
2005-10-25 08:34:53 UTC
*** Bug 110449 has been marked as a duplicate of this bug. ***

To test whether this has to do with getting disconnected, try pulling out the network cable while connected, then browse or unmount, and see what happens. Also check whether dmesg tells you anything just before the lockup. If that is difficult on the same PC, try using the network console (a kernel feature) to log to another PC on the network.

Good ideas! I disabled my internet connection, but that only made konqueror crash and made ls (run from bash) freeze until the connection was established again. I now suspect that symlinks have something to do with all this, since it has crashed almost every time I have tried to access a certain symlink on the remote file system. Anyway, I was able to see the dmesg message once before the system froze, but could not save it, since my system halted ~5 seconds after konqueror crashed. But now the same thing happened without my system crashing! The dmesg output is of exactly the same kind as the one I got before, when the system crashed, and it looks like this:

VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
general protection fault: 0000 [1] SMP CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8890, comm: bash Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8019a305>] <ffffffff8019a305>{iput+53} RSP: 0018:ffff81002811beb8 EFLAGS: 00010202 RAX: 00ebedef00ebedef RBX: ffff810030437b68 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000400 RDI: ffff810030437b68 RBP: ffff810030437b68 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000206 R12: ffff81002867c6b8 R13: 0000000000008000 R14: 0000000000000001 R15: 00007fffffcb85b4 FS: 000000000050cae0(0000) GS:ffffffff806cc800(0000) knlGS:0000000008b14be0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002aaaaaee39d0 CR3: 0000000000101000 CR4: 00000000000006e0 Process bash (pid: 8890, threadinfo ffff81002811a000, task ffff810028a62e30) Stack: ffff81002867c6b0 ffffffff80197e8a ffff81003ef0b0c0 ffff810028a62e30 ffff810029e62000 ffffffff8019ca5e ffff81003e09c6c0 ffff81003e09c6c0 ffff810028a63474 ffffffff8013978a Call Trace:<ffffffff80197e8a>{dput+458} <ffffffff8019ca5e>{__mntput+30} <ffffffff8013978a>{do_exit+634} <ffffffff8026d6e1>{__up_write+49} <ffffffff8013a21c>{do_group_exit+252} <ffffffff8010dc76>{system_call+126} Code: 48 8b 40 28 48 85 c0 74 05 48 89 df ff d0 48 8d 7b 48 48 c7 RIP <ffffffff8019a305>{iput+53} RSP <ffff81002811beb8> <1>Fixing recursive fault but reboot is needed! 
general protection fault: 0000 [2] SMP CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8896, comm: ls Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8018f18c>] <ffffffff8018f18c>{__link_path_walk+3724} RSP: 0018:ffff81002889fd48 EFLAGS: 00010286 RAX: 018000db0004003d RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000001fb6 RDI: ffff81002873f3d0 RBP: ffff81003b25b001 R08: ffff81003b25b000 R09: 000000000000000b R10: 0000000000000000 R11: 0000000000000206 R12: ffff810028741248 R13: ffff81002889fed8 R14: 0000000000000000 R15: 0000000000000004 FS: 0000000000514ae0(0063) GS:ffffffff806cc800(0000) knlGS:0000000008b14be0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002aaaaaebd100 CR3: 000000002872c000 CR4: 00000000000006e0 Process ls (pid: 8896, threadinfo ffff81002889e000, task ffff810028a620b0) Stack: ffff81002889fdac 0000010300000000 0000000100001fb6 ffff81003b25b000 ffff81003e598530 ffff81003e5984c0 ffff81003f37d810 ffff81002889fed8 ffff81003b25b000 ffff81003b25b000 Call Trace:<ffffffff8018f2ac>{link_path_walk+188} <ffffffff8017f0ab>{get_unused_fd+219} <ffffffff8018f562>{path_lookup+450} <ffffffff8018fdcf>{open_namei+175} <ffffffff8017efa7>{filp_open+39} <ffffffff8017f0ab>{get_unused_fd+219} <ffffffff8017f204>{sys_open+84} <ffffffff8010dc76>{system_call+126} Code: f6 40 09 40 74 12 48 8b 47 78 4c 89 ee bb 8c ff ff ff ff 10 RIP <ffffffff8018f18c>{__link_path_walk+3724} RSP <ffff81002889fd48> <0>general protection fault: 0000 [3] SMP CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8897, comm: ls Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8018f18c>] <ffffffff8018f18c>{__link_path_walk+3724} RSP: 0018:ffff81002889fd48 EFLAGS: 00010286 RAX: 018000db0004003d RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000001fb6 RDI: ffff81002873f3d0 RBP: ffff81003ae59001 R08: ffff81003ae59000 R09: 000000000000000b R10: 0000000000000000 R11: 0000000000000206 R12: 
ffff810028741248 R13: ffff81002889fed8 R14: 0000000000000000 R15: 0000000000000004 FS: 0000000000514ae0(0063) GS:ffffffff806cc800(0000) knlGS:0000000008b14be0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002aaaaaebd100 CR3: 0000000028088000 CR4: 00000000000006e0 Process ls (pid: 8897, threadinfo ffff81002889e000, task ffff810028a620b0) Stack: ffff81002889fdac 0000010300000000 0000000100001fb6 ffff81003ae59000 ffff81003e5992b0 ffff81003e599240 ffff81003f37d810 ffff81002889fed8 ffff81003ae59000 ffff81003ae59000 Call Trace:<ffffffff8018f2ac>{link_path_walk+188} <ffffffff8017f0ab>{get_unused_fd+219} <ffffffff8018f562>{path_lookup+450} <ffffffff8018fdcf>{open_namei+175} <ffffffff8017efa7>{filp_open+39} <ffffffff8017f0ab>{get_unused_fd+219} <ffffffff8017f204>{sys_open+84} <ffffffff8010dc76>{system_call+126} Code: f6 40 09 40 74 12 48 8b 47 78 4c 89 ee bb 8c ff ff ff ff 10 RIP <ffffffff8018f18c>{__link_path_walk+3724} RSP <ffff81002889fd48>

I only get the above once. After that, every time I try to access the /mnt/ directory listing in some way, whether through "cd /mnt/" followed by bash completion to see what's in there, running "ls /mnt" (segmentation fault), or browsing with konqueror (which produces the error message "The process for the file protocol died unexpectedly"), the following is added to the dmesg output, with some variation in the registers and the stack:

Kernel BUG at "include/linux/dcache.h":294 invalid operand: 0000 [4] SMP CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8900, comm: ls Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8018de23>] <ffffffff8018de23>{__follow_mount+115} RSP: 0018:ffff8100280cbca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff81003ef0b0c0 RCX: 0000000000000002 RDX: 0000000000000000 RSI: ffff81002867c6b0 RDI: ffff8100369b4750 RBP: ffff8100280cbd38 R08: ffff81003f3b8000 R09: 000000000000000b R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: ffff8100280cbd38 R14: ffff81003ff87ec0 R15: 
0000000000520fa0 FS: 0000000000514ae0(0063) GS:ffffffff806cc800(0000) knlGS:0000000008a57fe0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002aaaab34c000 CR3: 0000000030887000 CR4: 00000000000006e0 Process ls (pid: 8900, threadinfo ffff8100280ca000, task ffff810028a62770) Stack: ffff8100369b4750 ffff8100280cbe68 ffff8100280cbd28 ffffffff8018dfe4 ffff8100280cbed8 00000000fffffff5 ffff81003f3b8000 0000000000000000 ffff81003f3b8004 ffff8100280cbd28 Call Trace:<ffffffff8018dfe4>{do_lookup+100} <ffffffff8018ed25>{__link_path_walk+2597} <ffffffff8018f2ac>{link_path_walk+188} <ffffffff8027029a>{strncpy_from_user+74} <ffffffff8018f562>{path_lookup+450} <ffffffff8018f84e>{__user_walk+62} <ffffffff80189556>{vfs_lstat+38} <ffffffff8018998f>{sys_newlstat+31} <ffffffff8010dc76>{system_call+126} Code: 0f 0b a3 2f 55 4d 80 ff ff ff ff c2 26 01 f0 ff 06 41 bc 01 RIP <ffffffff8018de23>{__follow_mount+115} RSP <ffff8100280cbca8>

Btw, my remote file system is mounted on /mnt/shfs, which obviously has something to do with this problem with /mnt/. Trying to access /mnt/shfs or any of its subdirectories in the same way as described above also produces the same dmesg output.

As for the error message "invalid operand: 0000 [18] SMP": I have only one CPU. For some reason, however, I have SMP compiled into my kernel (CONFIG_SMP=y). Is this the cause of the problem? Or does SMP refer to something else in this case? Should I try recompiling my kernel without SMP and see what happens? I will try to investigate further what exactly causes this behaviour (symlinks, disconnects, or something else) and post any findings here. Thanks for your consideration!

Ok, something seems to be broken with symlinks. After I access symlinks a couple of times, whatever I use to access them hangs or stops responding. This time I got this in dmesg:

VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...
Unable to handle kernel NULL pointer dereference at 0000000000000078 RIP: <ffffffff8019a0a0>{clear_inode+144} PGD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8984, comm: bash Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8019a0a0>] <ffffffff8019a0a0>{clear_inode+144} RSP: 0018:ffff8100300d7ea8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff81002cbe0000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 00000000fffffffa RDI: ffff81002cbe0220 RBP: ffff81002cbe0000 R08: 0000000000000002 R09: ffff8100300d7c56 R10: ffff8100300d7c70 R11: 0000000000000000 R12: ffff8100301996b8 R13: 0000000000008000 R14: 0000000000000001 R15: 00007fffffc23fe4 FS: 000000000088b7a0(0000) GS:ffffffff806cc800(0000) knlGS:00000000089a18c0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000078 CR3: 0000000000101000 CR4: 00000000000006e0 Process bash (pid: 8984, threadinfo ffff8100300d6000, task ffff810031523770) Stack: ffff81002cbe0000 ffffffff8019b6a8 ffff8100301996b0 ffffffff80197e8a ffff81003a3c33c0 ffff810031523770 ffff810028691c00 ffffffff8019ca5e ffff81003ff3d640 ffff81003ff3d640 Call Trace:<ffffffff8019b6a8>{generic_drop_inode+344} <ffffffff80197e8a>{dput+458} <ffffffff8019ca5e>{__mntput+30} <ffffffff8013978a>{do_exit+634} <ffffffff8026d6e1>{__up_write+49} <ffffffff8013a21c>{do_group_exit+252} <ffffffff8010dc76>{system_call+126} Code: 48 8b 40 78 48 85 c0 74 07 48 89 df ff d0 66 90 48 83 bb c8 RIP <ffffffff8019a0a0>{clear_inode+144} RSP <ffff8100300d7ea8> CR2: 0000000000000078 <1>Fixing recursive fault but reboot is needed! My system didn't freeze this time either. Now I'm running without SMP and preemption. 
When running "ls -R" in a directory containing multiple levels of subdirectories and a couple of links back and forth within that directory, I get the following dmesg output:

scheduling while atomic: ls/0x00000001/8823 Call Trace:<ffffffff8047e7da>{schedule+122} <ffffffff8047f57e>{schedule_timeout+30} <ffffffff803e371c>{alloc_skb+108} scheduling while atomic: ls/0x00000001/8823 Call Trace:<ffffffff8047e7da>{schedule+122} <ffffffff8047f57e>{schedule_timeout+30} <ffffffff803e371c>{alloc_skb+108} scheduling while atomic: ls/0x00000001/8823 Call Trace:<ffffffff8047e7da>{schedule+122} <ffffffff80137edf>{current_fs_time+79} <ffffffff80130cf5>{sys_sched_yield+117} <ffffffff8017c99d>{do_coredump+349} <ffffffff8013cca5>{__dequeue_signal+501} <ffffffff8013e7ab>{get_signal_to_deliver+1227} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017abce>{sys_newstat+46} <ffffffff8010f61a>{paranoid_userspace+59} note: ls[8823] exited with preempt_count 1 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at "kernel/signal.c":1492 invalid operand: 0000 [1] CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8823, comm: ls Tainted: P 2.6.13-gentoo-r3 RIP: 0010:[<ffffffff8013df25>] <ffffffff8013df25>{do_notify_parent+85} RSP: 0018:ffff81002b437cf8 EFLAGS: 00010087 RAX: 0000000000000000 RBX: ffff8100298aae50 RCX: 0000000000000011 RDX: ffff8100338e03f0 RSI: 0000000000000011 RDI: ffff8100298aae50 RBP: ffff8100298aae50 R08: ffff81003ec79458 R09: ffff81002b437d60 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8100298aaef8 R13: ffff81000217f440 R14: ffff8100298aaef8 R15: ffff8100298ab428 FS: 0000000000514ae0(0000) GS:ffffffff80697800(0000) knlGS:0000000008a57fe0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000404f60 CR3: 0000000028bd7000 CR4: 00000000000006e0 Process ls (pid: 8823, threadinfo ffff81002b436000, task ffff8100298aae50) Stack: 0000000000000292 0000000040000010 ffff8100284809c0 ffffffff80172dad ffff8100284809c0 ffff81003f2f0400 
0000000000000000 0000000000000001 ffff81002b437e78 0000000000000286 Call Trace:<ffffffff80172dad>{__fput+301} <ffffffff801434d8>{find_task_by_pid_type+8} <ffffffff80134bfb>{will_become_orphaned_pgrp+27} <ffffffff801434d8>{find_task_by_pid_type+8} <ffffffff80135fab>{do_exit+2571} <ffffffff80136168>{do_group_exit+168} <ffffffff8013e7b7>{get_signal_to_deliver+1239} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017abce>{sys_newstat+46} <ffffffff8010f61a>{paranoid_userspace+59} Code: 0f 0b a3 c0 61 4a 80 ff ff ff ff c2 d4 05 8b 83 04 01 00 00 RIP <ffffffff8013df25>{do_notify_parent+85} RSP <ffff81002b437cf8> <1>Fixing recursive fault but reboot is needed! scheduling while atomic: ls/0x00000001/8823 Call Trace:<ffffffff8047e7da>{schedule+122} <ffffffff8010fc83>{show_stack+195} <ffffffff80135697>{do_exit+247} <ffffffff80276527>{do_unblank_screen+119} <ffffffff8010ffe5>{die+69} <ffffffff80110421>{do_invalid_op+145} <ffffffff8013df25>{do_notify_parent+85} <ffffffff8010f1a9>{error_exit+0} <ffffffff8013df25>{do_notify_parent+85} <ffffffff80172dad>{__fput+301} <ffffffff801434d8>{find_task_by_pid_type+8} <ffffffff80134bfb>{will_become_orphaned_pgrp+27} <ffffffff801434d8>{find_task_by_pid_type+8} <ffffffff80135fab>{do_exit+2571} <ffffffff80136168>{do_group_exit+168} <ffffffff8013e7b7>{get_signal_to_deliver+1239} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017abce>{sys_newstat+46} <ffffffff8010f61a>{paranoid_userspace+59}

Note that there was no mention of "VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day...". Another difference is that my /mnt is still usable. When trying "ls /mnt/shfs" I get "ls: reading directory .: Input/output error"; the file system had been unmounted, and remounting it works.

Does it also happen with a vanilla kernel?
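For anyone trying to reproduce this, a minimal sketch of a directory layout like the one described above (nested subdirectories with symlinks pointing back and forth) might look like the following. The function name, the directory and link names, and the depth are made up for illustration and are not taken from the reporter's actual tree; the result would still have to be copied to the remote side of an shfs mount before running "ls -R" on it there.

```python
import os
import tempfile


def build_symlink_tree(root, depth=3):
    """Create nested directories under root with symlinks pointing down and back up.

    Hypothetical layout for illustration only: each level gets a real
    subdirectory, a link into it from the parent, and a link back up.
    """
    os.makedirs(root, exist_ok=True)
    current = root
    links = []
    for level in range(depth):
        child = os.path.join(current, f"level{level}")
        os.mkdir(child)
        # Relative link from the child back up to its parent directory.
        up = os.path.join(child, "up")
        os.symlink("..", up)
        # Relative link in the parent that points forward into the child.
        down = os.path.join(current, f"to_level{level}")
        os.symlink(f"level{level}", down)
        links.extend([up, down])
        current = child
    return links


if __name__ == "__main__":
    tree_root = tempfile.mkdtemp(prefix="symlink-tree-")
    created = build_symlink_tree(tree_root)
    print(f"created {len(created)} symlinks under {tree_root}")
```

Relative link targets are used so that the links still resolve after the tree is copied elsewhere.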
With a vanilla kernel as of today (2.6.14, also with neither SMP nor preemption) the error is just as it was earlier (that is, without system freezes) and still occurs when, for example, listing a directory containing symlinks recursively. dmesg output:

scheduling while atomic: ls/0x00000001/8852 Call Trace:<ffffffff804878ea>{schedule+122} <ffffffff8048868e>{schedule_timeout+30} <ffffffff803eabf3>{__alloc_skb+131} scheduling while atomic: ls/0x00000001/8852 Call Trace:<ffffffff804878ea>{schedule+122} <ffffffff8048868e>{schedule_timeout+30} <ffffffff803eabf3>{__alloc_skb+131} scheduling while atomic: ls/0x00000001/8852 Call Trace:<ffffffff804878ea>{schedule+122} <ffffffff80136bdf>{current_fs_time+79} <ffffffff8012f925>{sys_sched_yield+117} <ffffffff8017b40d>{do_coredump+349} <ffffffff80186783>{dput+35} <ffffffff8013b955>{__dequeue_signal+501} <ffffffff8013d53b>{get_signal_to_deliver+1243} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017967e>{sys_newstat+46} <ffffffff8010f572>{paranoid_userspace+59} note: ls[8852] exited with preempt_count 1 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at kernel/signal.c:1500 invalid operand: 0000 [1] CPU 0 Modules linked in: 8139cp shfs nvidia sk98lin 8139too Pid: 8852, comm: ls Tainted: P 2.6.14 #1 RIP: 0010:[<ffffffff8013ccd1>] <ffffffff8013ccd1>{do_notify_parent+81} RSP: 0018:ffff81002d8a9cf8 EFLAGS: 00010087 RAX: 0000000000000000 RBX: ffff8100320a4b30 RCX: 0000000000000011 RDX: ffff810032720af0 RSI: 0000000000000011 RDI: ffff8100320a4b30 RBP: ffff8100320a4b30 R08: ffff81003ec68458 R09: ffff81002d8a9d60 R10: 0000000000000001 R11: 0000000000000000 R12: ffff8100320a4bd8 R13: ffff81000217f450 R14: ffff8100320a4bd8 R15: ffff8100320a5108 FS: 0000000000514ae0(0000) GS:ffffffff806ac800(0000) knlGS:00000000089a18c0 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000967130 CR3: 00000000315f8000 CR4: 00000000000006e0 Process ls (pid: 8852, threadinfo ffff81002d8a8000, task ffff8100320a4b30) Stack: 
ffff81002f8c3ac0 0000000040000010 ffff81002f8c3ac0 ffffffff80171601 ffff81002f8c3ac0 ffff810030ce4140 0000000000000000 ffff810030ce4140 0000000000000001 0000000000000296 Call Trace:<ffffffff80171601>{__fput+305} <ffffffff80142218>{find_task_by_pid_type+8} <ffffffff8013392b>{will_become_orphaned_pgrp+27} <ffffffff80142218>{find_task_by_pid_type+8} <ffffffff80134cbb>{do_exit+2587} <ffffffff80134e6f>{do_group_exit+159} <ffffffff8013d547>{get_signal_to_deliver+1255} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017967e>{sys_newstat+46} <ffffffff8010f572>{paranoid_userspace+59} Code: 0f 0b 68 09 bb 4b 80 c2 dc 05 66 66 90 66 90 8b 83 04 01 00 RIP <ffffffff8013ccd1>{do_notify_parent+81} RSP <ffff81002d8a9cf8> <1>Fixing recursive fault but reboot is needed! scheduling while atomic: ls/0x00000001/8852 Call Trace:<ffffffff804878ea>{schedule+122} <ffffffff8010fbb3>{show_stack+195} <ffffffff80134397>{do_exit+247} <ffffffff802c6287>{do_unblank_screen+119} <ffffffff8010ff31>{die+81} <ffffffff801101f1>{do_invalid_op+145} <ffffffff8013ccd1>{do_notify_parent+81} <ffffffff80132b2e>{vprintk+606} <ffffffff8010f199>{error_exit+0} <ffffffff8013ccd1>{do_notify_parent+81} <ffffffff80186783>{dput+35} <ffffffff80171601>{__fput+305} <ffffffff80142218>{find_task_by_pid_type+8} <ffffffff8013392b>{will_become_orphaned_pgrp+27} <ffffffff80142218>{find_task_by_pid_type+8} <ffffffff80134cbb>{do_exit+2587} <ffffffff80134e6f>{do_group_exit+159} <ffffffff8013d547>{get_signal_to_deliver+1255} <ffffffff8010dd5f>{do_signal+159} <ffffffff8017967e>{sys_newstat+46} <ffffffff8010f572>{paranoid_userspace+59}

Hi, I don't use Gentoo/amd64 and cannot support this platform. Re-assigning to the AMD64 arch team to see if they can help. Best regards, Stu

shfs has been dead upstream for a long time and will probably be removed from the tree soon, so you should use sshfs-fuse instead. I realize it's not yet marked stable, but it's worth giving it a try. If you do, please report back :)

I can't reproduce it here... Using gentoo-sources-2.6.14-r6 I could load the module, mount the fs, browse, and unmount without any problem. Can you please try another kernel version or sshfs-fuse?

I marked sshfs-fuse-1.2 stable; please try it. At least it runs very well here.
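Regarding the earlier suggestion to use the network console to capture kernel messages on another PC: netconsole sends kernel log lines as UDP datagrams, so the receiving machine only needs something listening on the configured port. A minimal sketch of such a receiver follows. The function names and the log file name are made up for illustration, and port 6666 is assumed here as the usual default target port; adjust it to whatever the sending kernel is actually configured to use.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which netconsole datagrams can be received.

    The port is an assumption; it must match the sender's configuration.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect(sock, count):
    """Receive `count` datagrams and return their payloads as text."""
    messages = []
    for _ in range(count):
        data, _sender = sock.recvfrom(65535)
        # Kernel messages are plain text; replace any undecodable bytes.
        messages.append(data.decode("utf-8", errors="replace"))
    return messages


def log_forever(sock, log_path="netconsole.log"):
    """Append every received message to log_path until interrupted."""
    with open(log_path, "a", encoding="utf-8") as log:
        while True:
            for message in collect(sock, 1):
                log.write(message)
                # Flush each message so the tail survives a sender freeze.
                log.flush()
```

On the logging machine, calling `log_forever(open_listener())` would keep appending to the file, so the last lines before a freeze are preserved even when the crashing machine cannot save them itself.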