Our server crashed completely because of a bug in XFS. On the 25 computers that depend on this server, however, nothing worked any more and the first errors occurred about 30 minutes earlier. I also have only my users' accounts to go on. The system went down over several hours.

I found this in /var/log/messages:

Apr 30 14:33:43 cipinf2 xfs_iget_core: ambiguous vns: vp/0xd180a800, invp/0xf4a78980
Apr 30 14:33:43 cipinf2 ------------[ cut here ]------------
Apr 30 14:33:43 cipinf2 kernel BUG at fs/xfs/support/debug.c:106!
Apr 30 14:33:43 cipinf2 invalid operand: 0000 [#1]
Apr 30 14:33:43 cipinf2 SMP
Apr 30 14:33:43 cipinf2 CPU: 1
Apr 30 14:33:43 cipinf2 EIP: 0060:[<c027fe44>] Not tainted
Apr 30 14:33:43 cipinf2 EFLAGS: 00010246 (2.6.5-gentoo-r1)
Apr 30 14:33:43 cipinf2 EIP is at cmn_err+0x9d/0xad
Apr 30 14:33:43 cipinf2 eax: 00000040 ebx: 00000000 ecx: 00000097 edx: c042653c
Apr 30 14:33:43 cipinf2 esi: c03e33a1 edi: c0501fbe ebp: 00000293 esp: f68c1a0c
Apr 30 14:33:43 cipinf2 ds: 007b es: 007b ss: 0068
Apr 30 14:33:43 cipinf2 Process nfsd (pid: 2468, threadinfo=f68c0000 task=f7158200)
Apr 30 14:33:43 cipinf2 Stack: c03f9ce9 c03e504d c0501f80 c04dda40 784e4ec7 00000000 e4fba3e0 c024e25d
Apr 30 14:33:43 cipinf2 00000000 c03f2554 d180a800 f4a78980 f7e3a848 c0166a93 f73e5e00 f72f97f8
Apr 30 14:33:43 cipinf2 c04dda40 f71b4928 f68c0000 f73e5e00 f7e3a848 f72f97f4 00000000 00000000
Apr 30 14:33:43 cipinf2 Call Trace:
Apr 30 14:33:43 cipinf2 [<c024e25d>] xfs_iget_core+0x49f/0x5a9
Apr 30 14:33:43 cipinf2 [<c0166a93>] get_new_inode_fast+0x4a/0xda
Apr 30 14:33:43 cipinf2 [<c024e4be>] xfs_iget+0x157/0x189
Apr 30 14:33:43 cipinf2 [<c026ce98>] xfs_vget+0x68/0xdc
Apr 30 14:33:43 cipinf2 [<c01508c7>] mark_buffer_dirty+0x33/0x4b
Apr 30 14:33:43 cipinf2 [<c027f25c>] vfs_vget+0x34/0x38
Apr 30 14:33:43 cipinf2 [<c027ece2>] linvfs_get_dentry+0x53/0x8a
Apr 30 14:33:43 cipinf2 [<c01e5a94>] find_exported_dentry+0x44/0x617
Apr 30 14:33:43 cipinf2 [<c025a66d>] xlog_write+0x10b/0x4e5
Apr 30 14:33:43 cipinf2 [<c022fd22>] xfs_bmbt_get_state+0x2f/0x3b
Apr 30 14:33:43 cipinf2 [<c0226fc1>] xfs_bmap_do_search_extents+0x26b/0x3fa
Apr 30 14:33:43 cipinf2 [<c02271ca>] xfs_bmap_search_extents+0x7a/0x86
Apr 30 14:33:43 cipinf2 [<c022876a>] xfs_bmapi+0x27d/0x141a
Apr 30 14:33:43 cipinf2 [<c033acbd>] alloc_skb+0x47/0xe0
Apr 30 14:33:43 cipinf2 [<c033a3bc>] sock_alloc_send_pskb+0xc5/0x1e4
Apr 30 14:33:43 cipinf2 [<c033a50a>] sock_alloc_send_skb+0x2f/0x33
Apr 30 14:33:43 cipinf2 [<c0354510>] ip_append_data+0x69c/0x74f
Apr 30 14:33:43 cipinf2 [<c034ebe5>] __ip_route_output_key+0x2d/0xd7
Apr 30 14:33:43 cipinf2 [<c0371dfd>] udp_sendmsg+0x34a/0x7e2
Apr 30 14:33:43 cipinf2 [<c0353db7>] ip_generic_getfrag+0x0/0xbd
Apr 30 14:33:43 cipinf2 [<c01e63c5>] export_decode_fh+0x5c/0x78
Apr 30 14:33:43 cipinf2 [<c01e858c>] nfsd_acceptable+0x0/0xfc
Apr 30 14:33:43 cipinf2 [<c01e8869>] fh_verify+0x1e1/0x58b
Apr 30 14:33:43 cipinf2 [<c01e858c>] nfsd_acceptable+0x0/0xfc
Apr 30 14:33:43 cipinf2 [<c0352ea6>] ip_finish_output+0xb4/0x1b5
Apr 30 14:33:43 cipinf2 [<c01ea046>] nfsd_open+0x38/0x17e
Apr 30 14:33:43 cipinf2 [<c01ea7af>] nfsd_write+0x5b/0x345
Apr 30 14:33:43 cipinf2 [<c0372351>] udp_sendpage+0xbc/0x13a
Apr 30 14:33:43 cipinf2 [<c033c398>] skb_copy_and_csum_bits+0x22d/0x2fd
Apr 30 14:33:43 cipinf2 [<c033ae7e>] kfree_skbmem+0x24/0x2c
Apr 30 14:33:43 cipinf2 [<c033aef6>] __kfree_skb+0x70/0xe1
Apr 30 14:33:43 cipinf2 [<c0128a9a>] groups_alloc+0x3e/0xb6
Apr 30 14:33:43 cipinf2 [<c03c61bc>] svcauth_unix_accept+0x25b/0x294
Apr 30 14:33:43 cipinf2 [<c01e78dd>] nfsd_proc_write+0xa8/0xe1
Apr 30 14:33:43 cipinf2 [<c01e6aed>] nfsd_dispatch+0xdc/0x1d9
Apr 30 14:33:43 cipinf2 [<c03c25c6>] svc_process+0x4ad/0x60e
Apr 30 14:33:43 cipinf2 [<c01e686d>] nfsd+0x1ea/0x38e
Apr 30 14:33:43 cipinf2 [<c01e6683>] nfsd+0x0/0x38e
Apr 30 14:33:43 cipinf2 [<c0104dc1>] kernel_thread_helper+0x5/0xb
Apr 30 14:33:43 cipinf2

Reproducible: Didn't try

Steps to Reproduce:

Actual Results:
xfs_check on my two XFS filesystems reported no problems. Both seem to be fine.

/dev/rd/host0/target0/part5 on /home type xfs (rw,usrquota,grpquota) 286 GB
/dev/rd/host0/target0/part6 on /export type xfs (rw,usrquota,grpquota) 251 GB

Expected Results:

Portage 2.0.50-r6 (default-x86-1.4, gcc-3.3.3, glibc-2.3.3_pre20040420-r0, 2.6.5-gentoo-r1)
=================================================================
System uname: 2.6.5-gentoo-r1 i686 Intel(R) Xeon(TM) CPU 2.40GHz
Gentoo Base System version 1.4.10
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium3 -mcpu=pentium4 -msse2 -O3 -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/bind /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium3 -mcpu=pentium4 -msse2 -O3 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="ftp://ftp.uni-erlangen.de/pub/mirrors/gentoo/"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://127.0.0.1/gentoo-portage"
USE="X apache2 apm arts avi berkdb crypt cups encode foomaticdb gdbm gif gnome gpm gtk gtk2 imlib java jpeg kde ldap libg++ libwww mad mikmod motif mpeg mysql ncurses nls oggvorbis opengl oss pam pdflib perl png postgres python qt quicktime readline sdl slang spell ssl svga tcpd truetype x86 xml2 xmms xv zlib"
That's definitely XFS oopsing there. Could you try a vanilla kernel and see if the problem still persists?
This is an upstream problem. If you still have issues with 2.6.7, please file a bug at bugzilla.kernel.org.