Bug 127044 - net-fs/shfs-0.35-r2 causes kernel oops "Unable to handle kernel paging request"
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High critical
Assignee: Saleem Abdulrasool (RETIRED)
Reported: 2006-03-21 00:59 UTC by Marcel Meckel
Modified: 2006-03-23 10:16 UTC
Description Marcel Meckel 2006-03-21 00:59:55 UTC
Hi,

I'm running a Linux terminal server where some users use shfsmount. The problem with

  2.6.14-hardened-r1
  net-fs/shfs-0.35-r2

and

  2.6.11-hardened-r15
  net-fs/shfs-0.35-r1

is that shfs causes a kernel oops, which after a short while results in a lockup of the machine.


2.6.14-hardened-r1 and net-fs/shfs-0.35-r2
==========================================


SHell File System, (c) 2002-2004 Miroslav Spousta
Unable to handle kernel paging request at virtual address f889f000
 printing eip:
f8835b73
*pgd =   44f001
*pmd =  54a9063
Oops: 0002 [#1]
SMP
Modules linked in: shfs
CPU:    2
EIP:    0060:[<f8835b73>]    Not tainted VLI
EFLAGS: 00010206   (2.6.14-hardened-r1-rz3)
EIP is at sock_read+0x63/0xfffe44f0 [shfs]
eax: 00001008   ebx: f889f000   ecx: 00000400   edx: d59f5800
esi: d53da000   edi: f889f000   ebp: 00001000   esp: d4eca7fc
ds: 007b   es: 007b   ss: 0068
Process java (pid: 535, threadinfo=d4eca000 task=d5063030)
Stack: 00001008 d512f1a0 d4eca000 00000000 80000004 00000000 d53dc007 d53dc004
       f883a43e d512f180 f883620e d53dc004 d53dc007 ffffffff d53dc03e d59f5800
       f8837f4b d59f5800 f889f000 00001000 d4eca8b8 00001000 00001000 00001000
Call Trace:
 [<f883a43e>] __func__.5+0x3fc/0xfffdffbe [shfs]
 [<f883620e>] reply+0x5e/0xfffe3e50 [shfs]
 [<f8837f4b>] shell_read+0x31b/0xfffe23d0 [shfs]
 [<f88330ee>] fcache_file_read+0x25e/0xfffe7170 [shfs]
 [<c014a80d>] find_busiest_group+0x10d/0x370
 [<c014ad83>] load_balance_newidle+0x43/0x110
 [<c01491dd>] activate_task+0x8d/0xa0
 [<c0149b5a>] try_to_wake_up+0x2da/0x340
 [<c014b661>] __wake_up_common+0x41/0x70
 [<c014b6ce>] __wake_up+0x3e/0x60
 [<c018069c>] set_page_address+0xac/0x190
 [<c01805d9>] page_address+0xb9/0xd0
 [<c017fd6f>] kmap_high+0x13f/0x1f0
 [<c0322626>] skb_dequeue+0x46/0x60
 [<c037143c>] unix_stream_recvmsg+0x10c/0x480
 [<c0322770>] skb_queue_tail+0x20/0x50
 [<f883393e>] shfs_file_readpage+0x9e/0xfffe6760 [shfs]
 [<c031bea3>] sock_recvmsg+0xf3/0x110
 [<c03879c6>] __reacquire_kernel_lock+0x26/0x50
 [<c0385fa0>] schedule+0x6b0/0xd70
 [<c01491dd>] activate_task+0x8d/0xa0
 [<c0149b5a>] try_to_wake_up+0x2da/0x340
 [<c014b661>] __wake_up_common+0x41/0x70
 [<c014b6ce>] __wake_up+0x3e/0x60
 [<c025171c>] __up+0x1c/0x20
 [<c03858b3>] __up_wakeup+0x7/0xc
 [<f883963e>] .text.lock.shell+0x1b/0xfffe09dd [shfs]
 [<f883a4e6>] __func__.5+0x4a4/0xfffdffbe [shfs]
 [<f8837bc6>] shell_open+0x86/0xfffe24c0 [shfs]
 [<f883a4ee>] __func__.5+0x4ac/0xfffdffbe [shfs]
 [<f883a4e6>] __func__.5+0x4a4/0xfffdffbe [shfs]
 [<c0171893>] add_to_page_cache+0xc3/0xd0
 [<c0179414>] read_pages+0xf4/0x140
 [<c0176bde>] __alloc_pages+0x30e/0x4a0
 [<c017955d>] __do_page_cache_readahead+0xfd/0x170
 [<c0179739>] blockable_page_cache_readahead+0x59/0xe0
 [<c0179996>] page_cache_readahead+0x126/0x1b0
 [<c0172568>] do_generic_mapping_read+0x648/0x660
 [<c01a3e1c>] link_path_walk+0x5c/0xe0
 [<c0172892>] __generic_file_aio_read+0x202/0x240
 [<c0172580>] file_read_actor+0x0/0x110
 [<c01a28d9>] permission+0x89/0xa0
 [<c0172a04>] generic_file_read+0xb4/0xd0
 [<c0165af0>] autoremove_wake_function+0x0/0x60
 [<c0193140>] generic_file_llseek+0x30/0xf0
 [<c0193896>] vfs_read+0xb6/0x180
 [<c0193c41>] sys_read+0x51/0x80
 [<c0133069>] syscall_call+0x7/0xb
Code: c0 0f 85 86 02 00 00 8b 4c 24 44 8b 6c 24 4c 8b 81 84 02 00 00 85 c0 7e 5b 39 c5 8b b1 80 02 00 00 89 df 0f 4f e8 89 e9 c1 e9 02 <f3> a5 89 e9 83 e1 03 74 02 f3 a4 8b 54 24 44 01 eb 8b 82 84 02
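
For readers decoding the trace above: the value after `Oops:` is the x86 page-fault error code. The following is a minimal sketch, assuming the standard x86 bit layout (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name `decode_oops_error_code` is hypothetical; the register values are copied from the trace.

```python
def decode_oops_error_code(code_hex):
    """Decode the low three bits of an x86 page-fault error code.

    Assumes the standard x86 layout: bit 0 = protection violation
    (otherwise page not present), bit 1 = write access (otherwise read),
    bit 2 = user mode (otherwise kernel mode).
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


# Values copied from the oops above.
info = decode_oops_error_code("0002")
fault_address = 0xF889F000  # "virtual address f889f000"
edi = 0xF889F000            # edi: destination pointer of an x86 string copy
# Assumption: the faulting instruction is "rep movsl", as the "<f3> a5"
# bytes in the Code: line suggest, so ecx counts 32-bit words left to copy.
copy_len = 0x00000400 * 4

print(info)
print("fault at copy destination:", fault_address == edi)
print("bytes still to copy:", copy_len)
```

Read this way, the oops looks like a kernel-mode write to an unmapped page at the address held in edi, with a full 4096 bytes (matching ebp = 0x1000) still to be copied. This is only an interpretation of the register dump, not a confirmed diagnosis.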


2.6.11-hardened-r15 and net-fs/shfs-0.35-r1
===========================================

Trying to vfree() nonexistent vm area (f88c8000)
Badness in __vunmap at mm/vmalloc.c:368
 [<c0181378>] vfree+0x28/0x40
 [<f88328bf>] free_fcache+0x2f/0xe0 [shfs]
 [<f8832cf7>] fcache_file_close+0x57/0xa0 [shfs]
 [<f883414e>] shfs_file_release+0xde/0x160 [shfs]
 [<c01708e1>] __do_page_cache_readahead+0xb1/0x160
 [<c036a1ee>] __wait_on_bit_lock+0x4e/0x70
 [<c0240b93>] radix_tree_gang_lookup_tag+0x63/0x80
 [<c01696c2>] find_get_pages_tag+0x72/0x80
 [<c0173ab6>] pagevec_lookup_tag+0x36/0x40
 [<c01b0f2e>] mpage_writepages+0x15e/0x3d0
 [<c0240b93>] radix_tree_gang_lookup_tag+0x63/0x80
 [<c01696c2>] find_get_pages_tag+0x72/0x80
 [<c0173ab6>] pagevec_lookup_tag+0x36/0x40
 [<c0168d36>] wait_on_page_writeback_range+0x76/0x130
 [<c01b3705>] inotify_dentry_parent_queue_event+0x35/0xc0
 [<c018bbb2>] __fput+0x162/0x180
 [<c018a039>] filp_close+0x59/0x90
 [<c018a0dd>] sys_close+0x6d/0x90
 [<c01317b7>] syscall_call+0x7/0xb


# emerge info
Portage 2.0.54 (default-linux/x86/2006.0, gcc-3.3.6, glibc-2.3.5-r2, 2.6.14-hardened-r1-rz3 i686)
=================================================================
System uname: 2.6.14-hardened-r1-rz3 i686 Intel(R) Xeon(TM) CPU 3.00GHz
Gentoo Base System version 1.6.14
ccache version 2.3 [disabled]
dev-lang/python:     2.3.5-r2, 2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r7
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -D_FILE_OFFSET_BITS=64"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -D_FILE_OFFSET_BITS=64"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://gentoo-mirror/gentoo ftp://ftp.uni-erlangen.de/pub/mirrors/gentoo"
LANG="de_DE@euro"
LINGUAS="de"
MAKEOPTS="-j1"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://gentoo-mirror/gentoo-portage"
USE="x86 X acpi acpi4linux apache2 avi bash-completion berkdb bigger-fonts bitmap-fonts browserplugin bzip2 bzlib cli crypt ctype cups dba devfs26 dnd dri dvdread eds emboss exif expat fam fastbuild font-server foomaticdb force-cgi-redirect fortran freetype ftp gd gdbm gif gimp gimpprint gstreamer gtk gtk2 hardened hardenedphp iconv idn imagemagick imap imlib innodb ithreads java jpeg jpeg2k junit kde lcms libg++ libgd libwww mailbox maildir mbox md5sum memlimit mhash mime mmx mmx2 mng mozcalendar mozdevelop mozilla mozsvg mozxmlterm ncurses network nls nocd nptl nsplugin ogg pam pam_console pcre pdf pdflib perl png posix qt quicktime readline rtc samba scanner session silverxp simplexml soap sockets spl sse sse2 ssl svg tcpd tga tiff tokenizer truetype truetype-fonts type1 type1-fonts udev uudeview v4l v4l2 vorbis xml xml2 xprint xsl xvid zlib linguas_de userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LC_ALL, LDFLAGS
Comment 1 Jakub Moc (RETIRED) gentoo-dev 2006-03-21 01:04:35 UTC
Did you compile shfs with *exactly* the same gcc version as your kernel?
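
One way to check this is to compare the compiler recorded in the running kernel's banner (`/proc/version`) with the compiler used to build the module (for example, what `gcc -dumpversion` printed at build time). A minimal sketch follows; the helper names and the sample banner are hypothetical and not taken from this report.

```python
import re


def gcc_version_from_banner(banner):
    """Return the gcc version found in a /proc/version-style banner, or None."""
    match = re.search(r"gcc version (\d+(?:\.\d+)+)", banner)
    return match.group(1) if match else None


def same_compiler(kernel_banner, module_gcc_version):
    """True if the kernel banner reports exactly the given gcc version."""
    return gcc_version_from_banner(kernel_banner) == module_gcc_version


# Hypothetical banner in the usual /proc/version layout; not from the report.
banner = "Linux version 2.6.14-hardened-r1-rz3 (root@host) (gcc version 3.3.6) #1 SMP"
print(gcc_version_from_banner(banner))
print(same_compiler(banner, "3.3.6"))
```

On a live system the banner would be read from `/proc/version` instead of being hard-coded.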
Comment 2 Marcel Meckel 2006-03-21 04:58:35 UTC
(In reply to comment #1)
> Did you compile shfs with *exactly* the same gcc version as your kernel?

I'm not *exactly* sure, so I recompiled the kernel (this time 2.6.14-hardened-r5-rz1) and shfs, and _until now_ no error has occurred. I'm closing this bug and will reopen it if the errors appear again.

Thanks.
Comment 3 Marcel Meckel 2006-03-22 06:46:40 UTC
Sorry,

but the problem still exists. The gcc used for the kernel and the shfs module is exactly the same:

Unable to handle kernel paging request at virtual address f889a000
 printing eip:
f8837b7c
*pgd =   435001
*pmd =  54a9063
Oops: 0002 [#1]
SMP
Modules linked in: shfs
CPU:    3
EIP:    0060:[<f8837b7c>]    Not tainted VLI
EFLAGS: 00010206   (2.6.14-hardened-r5-rz1)
EIP is at sock_read+0x5b/0xffff94df [shfs]
eax: 00001008   ebx: f889a000   ecx: 00000400   edx: da1c3400
esi: f09d0000   edi: f889a000   ebp: 00001000   esp: d50c38f0
ds: 007b   es: 007b   ss: 0068
Process java (pid: 21393, threadinfo=d50c3000 task=f3c78550)
Stack: 00001008 d8948ba0 00000004 00000000 00000282 da1c3400 f4f90004 f883c15f
       d8948b80 f88381d4 f4f90004 da1c3400 ffffffff f4f9003c 00000000 f8839ff9
       da1c3400 f889a000 00001000 d50c39b0 00001000 00001000 00001000 00000001
Call Trace:
 [<f883c15f>] __func__.5+0x401/0xffff52a2 [shfs]
 [<f88381d4>] reply+0x5f/0xffff8e8b [shfs]
 [<f8839ff9>] shell_read+0x42c/0xffff7433 [shfs]
 [<f88350c6>] fcache_file_read+0x1f3/0xffffc12d [shfs]
 [<c014986e>] __wake_up_common+0x3f/0x5e
 [<c01498cd>] __wake_up+0x40/0x56
 [<c03623de>] unix_write_space+0x7c/0x98
 [<c024c1e6>] copy_to_user+0x4c/0x66
 [<c0316feb>] skb_dequeue+0x47/0x58
 [<c0364b77>] unix_stream_recvmsg+0x226/0x4a6
 [<c0317123>] skb_queue_tail+0x20/0x48
 [<c036428a>] unix_stream_sendmsg+0x1de/0x41d
 [<c0310a3d>] sock_recvmsg+0x116/0x16a
 [<c017cd05>] page_address+0xa1/0xc2
 [<c017c502>] kmap_high+0x134/0x1f1
 [<c017cd05>] page_address+0xa1/0xc2
 [<f8835925>] shfs_file_readpage+0x71/0xffffb74c [shfs]
 [<c0172e74>] __rmqueue+0xb9/0xf4
 [<c0172f1b>] rmqueue_bulk+0x6c/0x76
 [<c0172da3>] prep_new_page+0x47/0x5f
 [<c01732fe>] buffered_rmqueue+0x104/0x202
 [<c018630b>] vmap_pte_range+0x19/0xd7
 [<c01864c4>] map_vm_area+0xfb/0x12b
 [<c0186a3f>] __vmalloc_area+0xd2/0x128
 [<c0186b45>] vmalloc+0x2a/0x2e
 [<f883482e>] alloc_fcache+0x68/0xffffc83a [shfs]
 [<f8834a45>] fcache_file_open+0x69/0xffffc624 [shfs]
 [<f8835f26>] shfs_file_open+0xac/0xffffb186 [shfs]
 [<c014986e>] __wake_up_common+0x3f/0x5e
 [<c0175eb7>] __do_page_cache_readahead+0xa9/0x164
 [<c01760c3>] blockable_page_cache_readahead+0x55/0xc3
 [<c024c0fc>] __copy_to_user_ll+0x6a/0x84
 [<c016f2e3>] file_read_actor+0x9b/0x117
 [<c016e7f3>] find_get_page+0x3d/0x4b
 [<c016ed93>] do_generic_mapping_read+0x1db/0x690
 [<c016f51e>] __generic_file_aio_read+0x1bf/0x21a
 [<c016f248>] file_read_actor+0x0/0x117
 [<c018f0a7>] get_unused_fd+0xdb/0x107
 [<c016f6a7>] generic_file_read+0xba/0xd8
 [<c0162f3e>] autoremove_wake_function+0x0/0x57
 [<c019a4c0>] sys_fstat64+0x31/0x36
 [<c018fd3a>] vfs_read+0x1a5/0x1aa
 [<c0190066>] sys_read+0x51/0x80
 [<c0131deb>] sysenter_past_esp+0x54/0x79
 [<c013007b>] show_regs+0xa9/0x18a
Code: 9c 02 00 00 01 75 7c 8b 4c 24 40 8b 6c 24 48 8b 81 84 02 00 00 85 c0 7e 5a 39 e8 8b b1 80 02 00 00 89 df 0f 4c e8 89 e9 c1 e9 02 <f3> a5 89 e9 83 e1 03 74 02 f3 a4 8b 54 24 40 8b 82 84 02 00 00

The java process mentioned is Eclipse SDK.

The oops makes the mounted filesystem inaccessible, and processes trying to access the mountpoint hang:

# ps -o state,ppid,user,pid,command ax | egrep '^(Z|D)'
D     1 im         419 konqueror [kdeinit] -mimetype inode/directory system:/
D     1 im        9594 ls -aF -Alh /home/im/public_html/eclipse_workspace/mountpoint

There is no way to kill these processes because they are in state "D" (uninterruptible sleep).
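
The same filter as the `ps ... | egrep` pipeline above can be sketched in Python, which makes it easier to collect the PIDs of stuck processes for monitoring. This is a minimal sketch that assumes the column order `state,ppid,user,pid,command` used above; the function name `blocked_processes` is hypothetical, and the third sample line is invented for contrast.

```python
def blocked_processes(ps_output):
    """Return (state, pid, command) for processes in state D or Z.

    Expects lines in the column order state, ppid, user, pid, command.
    """
    blocked = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5 or fields[0] not in ("D", "Z"):
            continue
        state, _ppid, _user, pid, command = fields
        blocked.append((state, int(pid), command))
    return blocked


# The two D-state lines from the report, plus an invented sleeping process.
sample = """\
D     1 im         419 konqueror [kdeinit] -mimetype inode/directory system:/
D     1 im        9594 ls -aF -Alh /home/im/public_html/eclipse_workspace/mountpoint
S     1 root       100 /sbin/init
"""
for state, pid, command in blocked_processes(sample):
    print(state, pid, command)
```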
Comment 4 Saleem Abdulrasool (RETIRED) gentoo-dev 2006-03-22 13:09:02 UTC
Can you test this with the vanilla kernel?  Also, are you using wireless (with the connection silently dropping and picking up?)
Comment 5 Marcel Meckel 2006-03-22 22:46:31 UTC
(In reply to comment #4)
> Can you test this with the vanilla kernel?  Also, are you using wireless (with
> the connection silently dropping and picking up?)

No wireless connection. Sorry, I can't test any further since I migrated from shfs to sshfs-fuse yesterday. I'm sick of my users complaining about not being able to use their mounts and, even worse, sick of lockups of the server.

So far, sshfs-fuse works like a charm.
Comment 6 Saleem Abdulrasool (RETIRED) gentoo-dev 2006-03-23 10:16:17 UTC
Marking as fixed as I am unable to reproduce, and reporter is no longer using shfs.