What I did:

    cd ~                             # my home in AFS
    tar -cvf tars/tar.tar .backup    # .backup is the mountpoint of the backup volume

After some time the tar hangs. Thereafter every process that attempts to access the directory ~/tars hangs as well, and the hung processes cannot be terminated in any way. Meanwhile the directory remains normally accessible from other AFS clients. Only a reboot resolves the problem.

This in fact makes one of our webservers unusable, because the same thing happens with directories that PHP scripts access. Within a short time several hundred scripts are hanging and the whole server stops serving until it is rebooted.

I have been running OpenAFS 1.3.85 very successfully for several months on the same hardware under:

Portage 2.0.53 (default-linux/amd64/2004.3, gcc-3.4.3, glibc-2.3.4.20041102-r1, 2.6.12 x86_64)
System uname: 2.6.12 x86_64 AMD Opteron(tm) Processor 246

=================================================================
Affected System Information

OpenAFS Version:
* net-fs/openafs
     Available versions: *1.2.10-r1 !1.2.13-r2 1.4.0 1.4.0-r1 1.4.0-r2
     Installed: 1.4.0-r2
     Homepage: http://www.openafs.org/
     Description: The OpenAFS distributed file system

System Information: vanilla kernel

Portage 2.0.54 (default-linux/amd64/2005.0, gcc-3.4.4, glibc-2.3.5-r2, 2.6.12 x86_64)
=================================================================
System uname: 2.6.12 x86_64 AMD Opteron(tm) Processor 246
Gentoo Base System version 1.6.14
ccache version 2.3 [disabled]
dev-lang/python: 2.2.3, 2.3.4-r1, 2.4.2
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=athlon64 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=athlon64 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://pandemonium.tiscali.de/pub/gentoo/"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 X acl alsa avi berkdb bitmap-fonts bzip2 cdb crypt cups curl eds emboss encode expat f77 foomaticdb fortran gd gdbm gif gnome gpm gstreamer gtk gtk2 imagemagick imlib ipv6 java jpeg kde ldap lzw lzw-tiff mhash motif mp3 mpeg mysql ncurses nls opengl pam pcre pdflib perl php png python qt quicktime readline sdl slang spell ssl tcpd tiff truetype truetype-fonts type1-fonts udev usb userlocales xml2 xpm xv zlib userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS
Is this reproducible (I assume the tar command is the main trigger)? If so, can you see if the problem occurs if you either save your tar to local disk, or get your source files from local disk?
(In reply to comment #1)
> Is this reproducible (I assume the tar command is the main trigger)?
> If so, can you see if the problem occurs if you either save your tar to local
> disk, or get your source files from local disk?

The hang does not occur if I save the tar to local disk or if the source of the tar is the local disk.

If I write the tar to my AFS home and the source is the .backup of my home, the tar hangs when the size of the tar is about 1 GB. Thereafter every process that attempts to access the directory containing the tar hangs.

If I write the tar to my AFS home and the source is the local disk, the tar stops with an error message:

    tar: tar.tar: Wrote only 6144 of 10240 bytes
    tar: Error is not recoverable: exiting now

/var/log/messages shows:

    Feb 14 11:11:23 sv4 afs: failed to store file (5)

But thereafter all directories of my home remain normally accessible from the machine.
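The source/destination combinations described above could be run one after another with a small helper like the sketch below. The function name `tar_case`, the labels, and the paths in the usage comments are hypothetical and not taken from the report. Note that if the AFS client hangs in the way described, the script would hang too, so each case is probably best started from its own shell.

```shell
#!/bin/sh
# Run one source/destination combination and report how tar ended.
# Usage: tar_case LABEL SOURCE_DIR DEST_TARBALL
# (tar_case is a hypothetical helper name, not part of tar or OpenAFS.)
tar_case() {
    label=$1
    src=$2
    dest=$3
    # -C changes into the source directory first, so the archive holds relative paths.
    tar -cf "$dest" -C "$src" . 2>/dev/null
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "$label: ok"
    else
        echo "$label: failed (exit $status)"
    fi
}

# Example invocations (paths are assumptions; adjust to the real layout):
#   tar_case afs-to-local   "$HOME/.backup" /tmp/tar.tar
#   tar_case local-to-afs   /tmp/localsrc   "$HOME/tars/tar.tar"
#   tar_case afs-to-afs     "$HOME/.backup" "$HOME/tars/tar.tar"
```

Running the AFS-to-AFS case last keeps the earlier results usable even if that case hangs.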
Have you managed to get 1.4.1-rc6 to work on your system in the meanwhile? It would be nice to know if it solves the problem (more so as it seems very difficult to reproduce).
(In reply to comment #3)
> Have you managed to get 1.4.1-rc6 to work on your system in the meanwhile?

I'll try it this afternoon.

> It would be nice to know if it solves the problem (more so as it seems very
> difficult to reproduce).

I can reproduce it any time.
(In reply to comment #4)
> (In reply to comment #3)
> > Have you managed to get 1.4.1-rc6 to work on your system in the meanwhile?
>
> I'll try it this afternoon

I cannot compile 1.4.1-rc6, not even under

    Linux sv4 2.6.15-gentoo-r5 #1 SMP Wed Feb 15 16:19:18 MET 2006 x86_64 AMD Opteron(tm) Processor 246 AuthenticAMD GNU/Linux

The error is still:

    include/linux/seq_file.h:43: warning: `printk' is an unrecognized format function type
    /root/openafs/openafs-1.4.1-rc6/src/libafs/MODLOAD-2.6.15-gentoo-r5-MP/osi_module.c: In function `afs_ioctl':
    /root/openafs/openafs-1.4.1-rc6/src/libafs/MODLOAD-2.6.15-gentoo-r5-MP/osi_module.c:294: error: `TIF_32BIT' undeclared (first use in this function)
(In reply to comment #4)
> I can reproduce it any time

I made a volume with 2 ISO files, 1.4 gigabytes in total, and created a backup volume of it. Then I tried doing "tar -cf tarfile OldFiles/myisos" from within that volume, and it worked without a problem :(

I'm hoping you'll get 1.4.1-rc to work, or that you have some more hints that would let me reproduce this.
(In reply to comment #6)
> (In reply to comment #4)
> > I can reproduce it any time
>
> I made a volume with 2 ISO files, 1.4 gigabytes in total, and created a
> backup volume of it. Then I tried doing "tar -cf tarfile OldFiles/myisos"
> from within that volume, and it worked without a problem :(
> I'm hoping you'll get 1.4.1-rc to work, or that you have some more hints
> that would let me reproduce this.

Yesterday I booted 2.6.15-gentoo-r5 and re-emerged openafs-kernel-1.4.0 and openafs-1.4.0-r2. Since then the tar problem has disappeared.

The machine has now resumed operation as a webserver, and I'll wait a few days to see whether AFS now works well. The problem of an inaccessible directory and hanging PHP processes occurred only after a varying time after boot, maybe some hours or even two days. It was this problem that led me to experiment with tar to find something reproducible.
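Since the hanging PHP processes only show up some hours or days after boot, it may help to check the process list periodically. Processes that cannot be terminated are often in uninterruptible sleep (state D), though the thread does not confirm that this is the state here. A minimal sketch, assuming a Linux host with procps `ps` (the function names `filter_dstate` and `list_dstate` are hypothetical):

```shell
#!/bin/sh
# Keep only lines whose second field (process state) starts with "D",
# i.e. uninterruptible sleep. Reads "PID STAT COMMAND..." lines on stdin,
# so the filter can be exercised without calling ps.
filter_dstate() {
    awk '$2 ~ /^D/ { print }'
}

# List D-state processes on this machine.
# "pid=,stat=,args=" selects the columns and suppresses the header line.
list_dstate() {
    ps -eo pid=,stat=,args= | filter_dstate
}

# Example (assumption: run from cron or by hand on the webserver):
#   list_dstate | wc -l    # number of processes currently in state D
```

A steadily growing count of D-state processes would match the behaviour described, where several hundred scripts pile up until the server is rebooted.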
(In reply to comment #7)
> Yesterday I booted 2.6.15-gentoo-r5 and re-emerged openafs-kernel-1.4.0 and
> openafs-1.4.0-r2. Since then the tar problem has disappeared.
> The machine has now resumed operation as a webserver, and I'll wait a few days
> to see whether AFS now works well. The problem of an inaccessible directory and
> hanging PHP processes occurred only after a varying time after boot, maybe some
> hours or even two days. It was this problem that led me to experiment with tar
> to find something reproducible.

openafs-1.4.0-r2 and openafs-kernel-1.4.0 have been running well under kernel 2.6.15-gentoo-r5 for several days. So if you want, we can close the bug.

Thanks
Gunther
Gladly :) Thanks for reporting back on this! I guess this means we can start marking openafs-1.4.0-r2 stable in less than two weeks (unless you find new bugs, of course :-P )