Bug 122353 - [OpenAFS] Gentoo amd64: Each process attempting to access a certain directory is blocked
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Unspecified
Hardware: AMD64 Linux
Importance: High blocker
Assignee: Stefaan De Roeck (RETIRED)
Reported: 2006-02-10 04:45 UTC by Hans-Gunther Borrmann
Modified: 2006-02-22 01:55 UTC
Description Hans-Gunther Borrmann 2006-02-10 04:45:34 UTC
What I did:

cd ~   # my home in AFS
tar -cvf tars/tar.tar .backup   # .backup mountpoint of backup volume

After some time the tar hangs. Thereafter every process that attempts to access
the directory ~/tars hangs as well, and it is impossible to terminate these
processes in any way. Meanwhile the directory remains normally accessible from
other AFS clients. Only a reboot resolves the problem.

This in fact makes one of our webservers unusable, because the same thing
happens with directories that PHP scripts access. Within a short time several
hundred scripts are hanging, and the whole server stops serving until it is
rebooted.
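
Processes that cannot be terminated in any way are typically stuck in uninterruptible sleep (state "D" in ps). A minimal sketch, assuming a procps-style ps, of how such processes might be listed on the affected client; the function names filter_dstate and list_dstate are hypothetical and not part of the original report:

```shell
#!/bin/sh
# filter_dstate: read "PID STAT WCHAN COMMAND" lines on stdin and print
# PID, wait channel and command for every process whose state starts with "D"
# (uninterruptible sleep). The header line is skipped.
filter_dstate() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3, $4 }'
}

# list_dstate: apply the filter to the live process table.
# Assumes procps ps; "wchan:32" widens the wait-channel column.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}

# Usage (on the affected machine):
#   list_dstate
```

If the hung tar and PHP processes show up here, the wait channel column may hint at where in the kernel they are blocked.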

I have been running OpenAFS 1.3.85 very successfully for several months on the
same hardware under
Portage 2.0.53 (default-linux/amd64/2004.3, gcc-3.4.3, glibc-2.3.4.20041102-r1, 2.6.12 x86_64)
System uname: 2.6.12 x86_64 AMD Opteron(tm) Processor 246

=================================================================
Affected System Information

OpenAFS Version:
* net-fs/openafs 
     Available versions:  *1.2.10-r1 !1.2.13-r2 1.4.0 1.4.0-r1 1.4.0-r2
     Installed:           1.4.0-r2
     Homepage:            http://www.openafs.org/
     Description:         The OpenAFS distributed file system

System Information:

vanilla kernel

Portage 2.0.54 (default-linux/amd64/2005.0, gcc-3.4.4, glibc-2.3.5-r2, 2.6.12 x86_64)
=================================================================
System uname: 2.6.12 x86_64 AMD Opteron(tm) Processor 246
Gentoo Base System version 1.6.14
ccache version 2.3 [disabled]
dev-lang/python:     2.2.3, 2.3.4-r1, 2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=athlon64 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=athlon64 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://pandemonium.tiscali.de/pub/gentoo/"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 X acl alsa avi berkdb bitmap-fonts bzip2 cdb crypt cups curl eds 
emboss encode expat f77 foomaticdb fortran gd gdbm gif gnome gpm gstreamer 
gtk gtk2 imagemagick imlib ipv6 java jpeg kde ldap lzw lzw-tiff mhash motif 
mp3 mpeg mysql ncurses nls opengl pam pcre pdflib perl php png python qt 
quicktime readline sdl slang spell ssl tcpd tiff truetype truetype-fonts 
type1-fonts udev usb userlocales xml2 xpm xv zlib userland_GNU kernel_linux 
elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS
Comment 1 Stefaan De Roeck (RETIRED) gentoo-dev 2006-02-10 07:54:12 UTC
Is this reproducible (I assume the tar command is the main trigger)?
If so, can you check whether the problem occurs if you either save your tar to local disk, or take your source files from local disk?
Comment 2 Hans-Gunther Borrmann 2006-02-14 02:47:27 UTC
(In reply to comment #1)
> Is this reproducible (I assume the tar command is the main trigger)?
> If so, can you see if the problem occurs if you either save your tar to local
> disk, or get your source files from local disk?

The problem does not occur if I save the tar to local disk or if the source of the tar is the local disk.

If I write the tar to my AFS home and the source is the .backup of my home, the tar hangs once the archive reaches about 1 GB. Thereafter every process that attempts to access the directory containing the tar hangs.
If I write the tar to my AFS home and the source is the local disk, the tar stops with an error message:
tar: tar.tar: Wrote only 6144 of 10240 bytes
tar: Error is not recoverable: exiting now
/var/log/messages shows: Feb 14 11:11:23 sv4 afs: failed to store file (5)
But afterwards all directories of my home remain normally accessible from the machine.
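
The combinations described above could be scripted roughly as follows. This is only a sketch: the helper name try_tar and all paths in the usage comments are placeholders, not taken from the affected system. Note that on an affected client the AFS-to-AFS case may hang uninterruptibly, so it should only be run where a reboot is acceptable.

```shell
#!/bin/sh
# try_tar SRC_DIR DEST_ARCHIVE
# Archive SRC_DIR into DEST_ARCHIVE and report whether tar succeeded.
try_tar() {
    src=$1
    dest=$2
    # -C changes into the parent directory so the archive holds relative paths.
    if tar -cf "$dest" -C "$(dirname "$src")" "$(basename "$src")"; then
        echo "ok: $src -> $dest"
    else
        echo "FAILED: $src -> $dest"
        return 1
    fi
}

# Usage (placeholder paths):
#   try_tar "$HOME/.backup" /tmp/tar.tar            # AFS backup volume -> local disk
#   try_tar /usr/share/doc "$HOME/tars/tar.tar"     # local disk -> AFS home
#   try_tar "$HOME/.backup" "$HOME/tars/tar.tar"    # AFS -> AFS (the case that hung)
```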
Comment 3 Stefaan De Roeck (RETIRED) gentoo-dev 2006-02-14 02:56:03 UTC
Have you managed to get 1.4.1-rc6 to work on your system in the meantime? It would be nice to know whether it solves the problem (all the more so as it seems very difficult to reproduce).
Comment 4 Hans-Gunther Borrmann 2006-02-15 03:48:57 UTC
(In reply to comment #3)
> Have you managed to get 1.4.1-rc6 to work on your system in the meanwhile?

I'll try it this afternoon.

> It would be nice to know if it solves the problem (more so as it seems very
> difficult to reproduce).

I can reproduce it any time.


Comment 5 Hans-Gunther Borrmann 2006-02-15 08:34:39 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Have you managed to get 1.4.1-rc6 to work on your system in the meanwhile?  
> 
> I'll try it this afternoon

I cannot compile 1.4.1-rc6, not even under
Linux sv4 2.6.15-gentoo-r5 #1 SMP Wed Feb 15 16:19:18 MET 2006 x86_64 AMD Opteron(tm) Processor 246 AuthenticAMD GNU/Linux

The error is still

include/linux/seq_file.h:43: warning: `printk' is an unrecognized format function type
/root/openafs/openafs-1.4.1-rc6/src/libafs/MODLOAD-2.6.15-gentoo-r5-MP/osi_module.c: In function `afs_ioctl':
/root/openafs/openafs-1.4.1-rc6/src/libafs/MODLOAD-2.6.15-gentoo-r5-MP/osi_module.c:294: error: `TIF_32BIT' undeclared (first use in this function)

Comment 6 Stefaan De Roeck (RETIRED) gentoo-dev 2006-02-16 03:36:01 UTC
(In reply to comment #4)
> I can reproduce it any time 

I created a volume containing two ISO files, 1.4 gigabytes in total, and made a backup volume of it. Then I ran "tar -cf tarfile OldFiles/myisos" from within that volume, and it worked without a problem :(
I'm hoping you'll get 1.4.1-rc to work, or that you have some more hints that would let me reproduce this.
Comment 7 Hans-Gunther Borrmann 2006-02-16 04:53:33 UTC
(In reply to comment #6)

Yesterday I booted 2.6.15-gentoo-r5 and re-emerged openafs-kernel-1.4.0 and openafs-1.4.0-r2. Since then the tar problem has disappeared.
The machine has now resumed its operation as a webserver, and I'll wait a few days to see whether AFS now works well. The problem of an inaccessible directory and hanging PHP processes occurred only after a varying time after boot, maybe some hours or even two days. It was this problem that led me to experiment with tar to find something reproducible.
Comment 8 Hans-Gunther Borrmann 2006-02-22 01:32:35 UTC
(In reply to comment #7)

openafs-1.4.0-r2 and openafs-kernel-1.4.0 have been running well under kernel 2.6.15-gentoo-r5 for several days. So if you want, we can close the bug.
Thanks
Gunther
Comment 9 Stefaan De Roeck (RETIRED) gentoo-dev 2006-02-22 01:55:27 UTC
Gladly :)

Thanks for reporting back on this!

I guess this means we can start marking openafs-1.4.0-r2 stable in less than two weeks (unless you find new bugs, of course :-P )