Hi,

I'm running a Linux terminal server where some users use shfsmount. With both of these combinations:

2.6.14-hardened-r1 with net-fs/shfs-0.35-r2
2.6.11-hardened-r15 with net-fs/shfs-0.35-r1

shfs causes a kernel oops, which after a short while results in a lockup of the machine.

2.6.14-hardened-r1 and net-fs/shfs-0.35-r2
==========================================
SHell File System, (c) 2002-2004 Miroslav Spousta
Unable to handle kernel paging request at virtual address f889f000
 printing eip:
f8835b73
*pgd = 44f001
*pmd = 54a9063
Oops: 0002 [#1]
SMP
Modules linked in: shfs
CPU: 2
EIP: 0060:[<f8835b73>] Not tainted VLI
EFLAGS: 00010206 (2.6.14-hardened-r1-rz3)
EIP is at sock_read+0x63/0xfffe44f0 [shfs]
eax: 00001008 ebx: f889f000 ecx: 00000400 edx: d59f5800
esi: d53da000 edi: f889f000 ebp: 00001000 esp: d4eca7fc
ds: 007b es: 007b ss: 0068
Process java (pid: 535, threadinfo=d4eca000 task=d5063030)
Stack: 00001008 d512f1a0 d4eca000 00000000 80000004 00000000 d53dc007 d53dc004
       f883a43e d512f180 f883620e d53dc004 d53dc007 ffffffff d53dc03e d59f5800
       f8837f4b d59f5800 f889f000 00001000 d4eca8b8 00001000 00001000 00001000
Call Trace:
 [<f883a43e>] __func__.5+0x3fc/0xfffdffbe [shfs]
 [<f883620e>] reply+0x5e/0xfffe3e50 [shfs]
 [<f8837f4b>] shell_read+0x31b/0xfffe23d0 [shfs]
 [<f88330ee>] fcache_file_read+0x25e/0xfffe7170 [shfs]
 [<c014a80d>] find_busiest_group+0x10d/0x370
 [<c014ad83>] load_balance_newidle+0x43/0x110
 [<c01491dd>] activate_task+0x8d/0xa0
 [<c0149b5a>] try_to_wake_up+0x2da/0x340
 [<c014b661>] __wake_up_common+0x41/0x70
 [<c014b6ce>] __wake_up+0x3e/0x60
 [<c018069c>] set_page_address+0xac/0x190
 [<c01805d9>] page_address+0xb9/0xd0
 [<c017fd6f>] kmap_high+0x13f/0x1f0
 [<c0322626>] skb_dequeue+0x46/0x60
 [<c037143c>] unix_stream_recvmsg+0x10c/0x480
 [<c0322770>] skb_queue_tail+0x20/0x50
 [<f883393e>] shfs_file_readpage+0x9e/0xfffe6760 [shfs]
 [<c031bea3>] sock_recvmsg+0xf3/0x110
 [<c03879c6>] __reacquire_kernel_lock+0x26/0x50
 [<c0385fa0>] schedule+0x6b0/0xd70
 [<c01491dd>] activate_task+0x8d/0xa0
 [<c0149b5a>] try_to_wake_up+0x2da/0x340
 [<c014b661>] __wake_up_common+0x41/0x70
 [<c014b6ce>] __wake_up+0x3e/0x60
 [<c025171c>] __up+0x1c/0x20
 [<c03858b3>] __up_wakeup+0x7/0xc
 [<f883963e>] .text.lock.shell+0x1b/0xfffe09dd [shfs]
 [<f883a4e6>] __func__.5+0x4a4/0xfffdffbe [shfs]
 [<f8837bc6>] shell_open+0x86/0xfffe24c0 [shfs]
 [<f883a4ee>] __func__.5+0x4ac/0xfffdffbe [shfs]
 [<f883a4e6>] __func__.5+0x4a4/0xfffdffbe [shfs]
 [<c0171893>] add_to_page_cache+0xc3/0xd0
 [<c0179414>] read_pages+0xf4/0x140
 [<c0176bde>] __alloc_pages+0x30e/0x4a0
 [<c017955d>] __do_page_cache_readahead+0xfd/0x170
 [<c0179739>] blockable_page_cache_readahead+0x59/0xe0
 [<c0179996>] page_cache_readahead+0x126/0x1b0
 [<c0172568>] do_generic_mapping_read+0x648/0x660
 [<c01a3e1c>] link_path_walk+0x5c/0xe0
 [<c0172892>] __generic_file_aio_read+0x202/0x240
 [<c0172580>] file_read_actor+0x0/0x110
 [<c01a28d9>] permission+0x89/0xa0
 [<c0172a04>] generic_file_read+0xb4/0xd0
 [<c0165af0>] autoremove_wake_function+0x0/0x60
 [<c0193140>] generic_file_llseek+0x30/0xf0
 [<c0193896>] vfs_read+0xb6/0x180
 [<c0193c41>] sys_read+0x51/0x80
 [<c0133069>] syscall_call+0x7/0xb
Code: c0 0f 85 86 02 00 00 8b 4c 24 44 8b 6c 24 4c 8b 81 84 02 00 00 85 c0 7e 5b 39 c5 8b b1 80 02 00 00 89 df 0f 4f e8 89 e9 c1 e9 02 <f3> a5 89 e9 83 e1 03 74 02 f3 a4 8b 54 24 44 01 eb 8b 82 84 02

2.6.11-hardened-r15 and net-fs/shfs-0.35-r1
===========================================
Trying to vfree() nonexistent vm area (f88c8000)
Badness in __vunmap at mm/vmalloc.c:368
 [<c0181378>] vfree+0x28/0x40
 [<f88328bf>] free_fcache+0x2f/0xe0 [shfs]
 [<f8832cf7>] fcache_file_close+0x57/0xa0 [shfs]
 [<f883414e>] shfs_file_release+0xde/0x160 [shfs]
 [<c01708e1>] __do_page_cache_readahead+0xb1/0x160
 [<c036a1ee>] __wait_on_bit_lock+0x4e/0x70
 [<c0240b93>] radix_tree_gang_lookup_tag+0x63/0x80
 [<c01696c2>] find_get_pages_tag+0x72/0x80
 [<c0173ab6>] pagevec_lookup_tag+0x36/0x40
 [<c01b0f2e>] mpage_writepages+0x15e/0x3d0
 [<c0240b93>] radix_tree_gang_lookup_tag+0x63/0x80
 [<c01696c2>] find_get_pages_tag+0x72/0x80
 [<c0173ab6>] pagevec_lookup_tag+0x36/0x40
 [<c0168d36>] wait_on_page_writeback_range+0x76/0x130
 [<c01b3705>] inotify_dentry_parent_queue_event+0x35/0xc0
 [<c018bbb2>] __fput+0x162/0x180
 [<c018a039>] filp_close+0x59/0x90
 [<c018a0dd>] sys_close+0x6d/0x90
 [<c01317b7>] syscall_call+0x7/0xb

# emerge info
Portage 2.0.54 (default-linux/x86/2006.0, gcc-3.3.6, glibc-2.3.5-r2, 2.6.14-hardened-r1-rz3 i686)
=================================================================
System uname: 2.6.14-hardened-r1-rz3 i686 Intel(R) Xeon(TM) CPU 3.00GHz
Gentoo Base System version 1.6.14
ccache version 2.3 [disabled]
dev-lang/python: 2.3.5-r2, 2.4.2
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -D_FILE_OFFSET_BITS=64"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -D_FILE_OFFSET_BITS=64"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://gentoo-mirror/gentoo ftp://ftp.uni-erlangen.de/pub/mirrors/gentoo"
LANG="de_DE@euro"
LINGUAS="de"
MAKEOPTS="-j1"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://gentoo-mirror/gentoo-portage"
USE="x86 X acpi acpi4linux apache2 avi bash-completion berkdb bigger-fonts bitmap-fonts browserplugin bzip2 bzlib cli crypt ctype cups dba devfs26 dnd dri dvdread eds emboss exif expat fam fastbuild font-server foomaticdb force-cgi-redirect fortran freetype ftp gd gdbm gif gimp gimpprint gstreamer gtk gtk2 hardened hardenedphp iconv idn imagemagick imap imlib innodb ithreads java jpeg jpeg2k junit kde lcms libg++ libgd libwww mailbox maildir mbox md5sum memlimit mhash mime mmx mmx2 mng mozcalendar mozdevelop mozilla mozsvg mozxmlterm ncurses network nls nocd nptl nsplugin ogg pam pam_console pcre pdf pdflib perl png posix qt quicktime readline rtc samba scanner session silverxp simplexml soap sockets spl sse sse2 ssl svg tcpd tga tiff tokenizer truetype truetype-fonts type1 type1-fonts udev uudeview v4l v4l2 vorbis xml xml2 xprint xsl xvid zlib linguas_de userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LC_ALL, LDFLAGS
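
For anyone reading the first trace: the number after "Oops:" is the x86 page-fault error code. Below is a minimal sketch of how its three low bits are usually read on i386. The function name is made up for illustration and is not part of any kernel tool.

```python
def decode_oops_error_code(code_hex):
    """Decode the low three bits of an i386 page-fault error code.

    Hypothetical helper for illustration only.
    """
    code = int(code_hex, 16)
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


# The value reported in this bug.
print(decode_oops_error_code("0002"))
```

For the "Oops: 0002" reported here, this reads as a kernel-mode write to a page that is not present. That seems consistent with edi, the destination register of the string copy marked in the Code: line, holding the faulting address f889f000.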
Did you compile shfs with *exactly* the same gcc version as your kernel?
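
One rough way to check this is to compare the compiler recorded in /proc/version with the currently selected gcc. The sketch below assumes the older "(gcc version X.Y.Z ...)" wording in /proc/version; the function names and the sample string are hypothetical and not taken from the reporter's machine.

```python
import re


def kernel_gcc_version(proc_version_text):
    """Extract the compiler version from /proc/version-style text.

    Assumes the older "(gcc version X.Y.Z ...)" wording; returns None
    when that wording is not found.
    """
    match = re.search(r"gcc version (\d+(?:\.\d+)*)", proc_version_text)
    return match.group(1) if match else None


def same_compiler(proc_version_text, gcc_dumpversion_text):
    """True only if a kernel compiler version was found and it matches."""
    kernel = kernel_gcc_version(proc_version_text)
    current = gcc_dumpversion_text.strip()
    return kernel is not None and kernel == current


# Made-up sample input for illustration.
sample_proc_version = (
    "Linux version 2.6.14-example (root@host) "
    "(gcc version 3.3.6 (Gentoo 3.3.6)) #1 SMP"
)
print(same_compiler(sample_proc_version, "3.3.6\n"))
print(same_compiler(sample_proc_version, "3.4.4\n"))
```

On a live system the two inputs would come from reading /proc/version and from the output of `gcc -dumpversion`. This only compares against the compiler that is selected now, so it cannot prove which compiler actually built an already installed module.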
(In reply to comment #1)
> Did you compile shfs with *exactly* the same gcc version as your kernel?

I'm not *exactly* sure, so I recompiled the kernel (this time 2.6.14-hardened-r5-rz1) and shfs, and _until now_ no error has occurred. I'm closing this bug and will reopen it if the errors appear again. Thanks.
Sorry, but the problem still exists. The gcc used for the kernel and the shfs module is exactly the same:

Unable to handle kernel paging request at virtual address f889a000
 printing eip:
f8837b7c
*pgd = 435001
*pmd = 54a9063
Oops: 0002 [#1]
SMP
Modules linked in: shfs
CPU: 3
EIP: 0060:[<f8837b7c>] Not tainted VLI
EFLAGS: 00010206 (2.6.14-hardened-r5-rz1)
EIP is at sock_read+0x5b/0xffff94df [shfs]
eax: 00001008 ebx: f889a000 ecx: 00000400 edx: da1c3400
esi: f09d0000 edi: f889a000 ebp: 00001000 esp: d50c38f0
ds: 007b es: 007b ss: 0068
Process java (pid: 21393, threadinfo=d50c3000 task=f3c78550)
Stack: 00001008 d8948ba0 00000004 00000000 00000282 da1c3400 f4f90004 f883c15f
       d8948b80 f88381d4 f4f90004 da1c3400 ffffffff f4f9003c 00000000 f8839ff9
       da1c3400 f889a000 00001000 d50c39b0 00001000 00001000 00001000 00000001
Call Trace:
 [<f883c15f>] __func__.5+0x401/0xffff52a2 [shfs]
 [<f88381d4>] reply+0x5f/0xffff8e8b [shfs]
 [<f8839ff9>] shell_read+0x42c/0xffff7433 [shfs]
 [<f88350c6>] fcache_file_read+0x1f3/0xffffc12d [shfs]
 [<c014986e>] __wake_up_common+0x3f/0x5e
 [<c01498cd>] __wake_up+0x40/0x56
 [<c03623de>] unix_write_space+0x7c/0x98
 [<c024c1e6>] copy_to_user+0x4c/0x66
 [<c0316feb>] skb_dequeue+0x47/0x58
 [<c0364b77>] unix_stream_recvmsg+0x226/0x4a6
 [<c0317123>] skb_queue_tail+0x20/0x48
 [<c036428a>] unix_stream_sendmsg+0x1de/0x41d
 [<c0310a3d>] sock_recvmsg+0x116/0x16a
 [<c017cd05>] page_address+0xa1/0xc2
 [<c017c502>] kmap_high+0x134/0x1f1
 [<c017cd05>] page_address+0xa1/0xc2
 [<f8835925>] shfs_file_readpage+0x71/0xffffb74c [shfs]
 [<c0172e74>] __rmqueue+0xb9/0xf4
 [<c0172f1b>] rmqueue_bulk+0x6c/0x76
 [<c0172da3>] prep_new_page+0x47/0x5f
 [<c01732fe>] buffered_rmqueue+0x104/0x202
 [<c018630b>] vmap_pte_range+0x19/0xd7
 [<c01864c4>] map_vm_area+0xfb/0x12b
 [<c0186a3f>] __vmalloc_area+0xd2/0x128
 [<c0186b45>] vmalloc+0x2a/0x2e
 [<f883482e>] alloc_fcache+0x68/0xffffc83a [shfs]
 [<f8834a45>] fcache_file_open+0x69/0xffffc624 [shfs]
 [<f8835f26>] shfs_file_open+0xac/0xffffb186 [shfs]
 [<c014986e>] __wake_up_common+0x3f/0x5e
 [<c0175eb7>] __do_page_cache_readahead+0xa9/0x164
 [<c01760c3>] blockable_page_cache_readahead+0x55/0xc3
 [<c024c0fc>] __copy_to_user_ll+0x6a/0x84
 [<c016f2e3>] file_read_actor+0x9b/0x117
 [<c016e7f3>] find_get_page+0x3d/0x4b
 [<c016ed93>] do_generic_mapping_read+0x1db/0x690
 [<c016f51e>] __generic_file_aio_read+0x1bf/0x21a
 [<c016f248>] file_read_actor+0x0/0x117
 [<c018f0a7>] get_unused_fd+0xdb/0x107
 [<c016f6a7>] generic_file_read+0xba/0xd8
 [<c0162f3e>] autoremove_wake_function+0x0/0x57
 [<c019a4c0>] sys_fstat64+0x31/0x36
 [<c018fd3a>] vfs_read+0x1a5/0x1aa
 [<c0190066>] sys_read+0x51/0x80
 [<c0131deb>] sysenter_past_esp+0x54/0x79
 [<c013007b>] show_regs+0xa9/0x18a
Code: 9c 02 00 00 01 75 7c 8b 4c 24 40 8b 6c 24 48 8b 81 84 02 00 00 85 c0 7e 5a 39 e8 8b b1 80 02 00 00 89 df 0f 4c e8 89 e9 c1 e9 02 <f3> a5 89 e9 83 e1 03 74 02 f3 a4 8b 54 24 40 8b 82 84 02 00 00

The java process mentioned is the Eclipse SDK. The oops makes the mounted filesystem inaccessible, and processes trying to access the mountpoint hang:

# ps -o state,ppid,user,pid,command ax | egrep '^(Z|D)'
D 1 im 419 konqueror [kdeinit] -mimetype inode/directory system:/
D 1 im 9594 ls -aF -Alh /home/im/public_html/eclipse_workspace/mountpoint

There is no way to kill these processes because they are in state "D".
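
The same filter as the ps | egrep pipeline can be sketched in Python for scripted monitoring. The function name is hypothetical, and the sample input reuses the two hung processes from this comment plus a made-up header and sshd line.

```python
def stuck_processes(ps_output):
    """Return (state, ppid, user, pid, command) tuples for D or Z processes.

    Expects lines in the column order of
    `ps -o state,ppid,user,pid,command ax`; header and malformed
    lines are skipped.
    """
    rows = []
    for line in ps_output.splitlines():
        # Split into at most five fields because the command contains spaces.
        fields = line.strip().split(None, 4)
        if len(fields) < 5 or fields[0] not in ("D", "Z"):
            continue
        state, ppid, user, pid, command = fields
        if not (ppid.isdigit() and pid.isdigit()):
            continue
        rows.append((state, int(ppid), user, int(pid), command))
    return rows


# Two lines from this report plus a made-up header and sshd entry.
sample = """S PPID USER PID COMMAND
S 1 root 300 /usr/sbin/sshd
D 1 im 419 konqueror [kdeinit] -mimetype inode/directory system:/
D 1 im 9594 ls -aF -Alh /home/im/public_html/eclipse_workspace/mountpoint
"""
for row in stuck_processes(sample):
    print(row)
```

Processes in state "D" are in uninterruptible sleep, so signals, including SIGKILL, are not delivered until the blocking kernel operation returns.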
Can you test this with the vanilla kernel? Also, are you using wireless (with the connection silently dropping and picking up?)
(In reply to comment #4)
> Can you test this with the vanilla kernel? Also, are you using wireless (with
> the connection silently dropping and picking up?)

No wireless connection. Sorry, I can't test any further since I migrated from shfs to sshfs-fuse yesterday. I'm sick of my users complaining about not being able to use their mounts and, even worse, sick of the server locking up. So far, sshfs-fuse works like a charm.
Marking as fixed as I am unable to reproduce, and reporter is no longer using shfs.