
Bug 282928

Summary: dev-util/catalyst doesn't work over NFS
Product: Gentoo Hosted Projects
Component: Catalyst
Reporter: Raúl Porcel (RETIRED) <armin76>
Assignee: Gentoo Catalyst Developers <catalyst>
Status: CONFIRMED
Severity: normal
Priority: High
CC: darkside, kumba, robink
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---

Description Raúl Porcel (RETIRED) gentoo-dev 2009-08-27 16:38:13 UTC
To reproduce, run catalyst -f stage1.spec with /var/tmp/catalyst/tmp on an NFS mount.

Referenced SEEDCACHE does not appear to be a directory, trying to untar...
No Valid Resume point detected, cleaning up...
Removing AutoResume Points: ...
Emptying directory /var/tmp/catalyst/tmp/default/.autoresume-stage1-armv4l-20090827/
Emptying directory /var/tmp/catalyst/tmp/default/stage1-armv4l-20090827/
Traceback (most recent call last):
File "/usr/lib/catalyst/catalyst", line 208, in build_target
    mytarget.run()
File "modules/generic_stage_target.py", line 1260, in run
    apply(getattr(self,x))
File "modules/generic_stage_target.py", line 712, in unpack
    self.clear_chroot()
File "modules/generic_stage_target.py", line 1532, in clear_chroot
    shutil.rmtree(myemp)
File "/usr/lib/python2.5/shutil.py", line 184, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
File "/usr/lib/python2.5/shutil.py", line 182, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/var/tmp/catalyst/tmp/default/stage1-armv4l-20090827/'
!!! catalyst: Error encountered during run of target stage1
Catalyst aborting....
lockfile does not exist '/var/tmp/catalyst/tmp/default/stage1-armv4l-20090827/.catalyst_lock'

This happens with catalyst-9999 and previous versions. From searching the web, it looks like it's a shutil.rmtree issue. However, using the system's rm works fine.

Thanks
Comment 2 Andrew Gaffney (RETIRED) gentoo-dev 2009-08-27 20:26:37 UTC
The problem here is probably the .nfsXXXXXXXXXXXXXX files that appear "randomly" in NFS-mounted directories. There are two different solutions proposed in those 2 links, and neither of them are particularly "good".

1) Catch the OSError and ignore it. This obviously isn't a good idea. The directory is cleared for a reason, and we can't guarantee it's a .nfsXXXXXX file causing the issue without walking the directory tree

2) Try again and hope it works. For this one, we could have it try 5 times, perhaps with a 1s delay in between, which *should* give enough time for the .nfsXXXXX file to disappear. If we can't remove the dir after 5 tries, bail
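
Option 2 could look roughly like the following. This is a minimal sketch in modern Python, not catalyst code: rmtree_with_retry, attempts and delay are hypothetical names, and it only retries on ENOTEMPTY (errno 39, the error in the traceback above). Any other error, or a failure on the last attempt, is re-raised so the caller can bail.

```python
import errno
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=1.0):
    """Remove a tree, retrying while the directory is reported as not empty.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    for attempt in range(1, attempts + 1):
        try:
            shutil.rmtree(path)
            return
        except OSError as exc:
            # Only "Directory not empty" is worth retrying; give up on
            # anything else, and on the final attempt.
            if exc.errno != errno.ENOTEMPTY or attempt == attempts:
                raise
            time.sleep(delay)
```

A retry like this only helps if whatever keeps the .nfsXXXXXX file alive lets go of it within the retry window.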
Comment 3 Raúl Porcel (RETIRED) gentoo-dev 2009-09-07 13:11:53 UTC
(In reply to comment #2)
> The problem here is probably the .nfsXXXXXXXXXXXXXX files that appear
> "randomly" in NFS-mounted directories. There are two different solutions
> proposed in those 2 links, and neither of them are particularly "good".
> 
> 1) Catch the OSError and ignore it. This obviously isn't a good idea. The
> directory is cleared for a reason, and we can't guarantee it's a .nfsXXXXXX
> file causing the issue without walking the directory tree
> 
> 2) Try again and hope it works. For this one, we could have it try 5 times,
> perhaps with a 1s delay in between, which *should* give enough time for the
> .nfsXXXXX file to disappear. If we can't remove the dir after 5 tries, bail
> 

2 looks okay, at least better than 1.

Comment 4 Raúl Porcel (RETIRED) gentoo-dev 2010-05-29 15:38:10 UTC
*poke*
Comment 5 Raúl Porcel (RETIRED) gentoo-dev 2010-12-05 16:59:03 UTC
slacker!
Comment 6 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2011-02-08 21:10:58 UTC
(In reply to comment #2)
> The problem here is probably the .nfsXXXXXXXXXXXXXX files that appear
> "randomly" in NFS-mounted directories. There are two different solutions
> proposed in those 2 links, and neither of them are particularly "good".
> 
> 1) Catch the OSError and ignore it. This obviously isn't a good idea. The
> directory is cleared for a reason, and we can't guarantee it's a .nfsXXXXXX
> file causing the issue without walking the directory tree
> 
> 2) Try again and hope it works. For this one, we could have it try 5 times,
> perhaps with a 1s delay in between, which *should* give enough time for the
> .nfsXXXXX file to disappear. If we can't remove the dir after 5 tries, bail

Neither will work, because it is catalyst itself that is holding the file/dir open.

% lsof /mnt/stagebuilding/tmp/catalyst/tmp/default/stage1-armv7a-20110208/.nfs000000000006106800000017
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF   NODE NAME
catalyst 20736 root    3uW  REG   0,17        0 397416 /mnt/stagebuilding/tmp/catalyst/tmp/default/stage1-armv7a-20110208/.nfs000000000006106800000017

% ps aux|grep 20736
root     20736  0.1  1.7  11260  7376 pts/1    S+   20:59   0:00 /usr/bin/python2.6 -OO /usr/lib/catalyst/catalyst -a -p -c /etc/catalyst/catalyst.conf -f stage1.spec
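
To find which leftovers to hand to lsof after a failed removal, a small walk of the tree is enough. A hedged sketch: find_nfs_placeholders is a hypothetical helper, and it assumes the leftovers follow the .nfsXXXX naming seen above.

```python
import os


def find_nfs_placeholders(root):
    """Return sorted paths under root whose file name starts with '.nfs'.

    Hypothetical diagnostic helper, not part of catalyst.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith(".nfs"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Each returned path can then be checked with lsof as in the output above.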

Other ideas?
Comment 7 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2012-03-06 15:51:35 UTC
For a long time, I've made the lockdir call quite stupid in my overlay's version of catalyst.

http://git.overlays.gentoo.org/gitweb/?p=dev/darkside.git;a=blob;f=dev-util/catalyst/files/0001-generic_stage_target.py-stupify-the-LockDir-call.patch
Comment 8 SpanKY gentoo-dev 2015-10-11 17:28:47 UTC
there was some initial hardlink support in the lockdir module, but it's been removed to simplify things greatly:
http://gitweb.gentoo.org/proj/catalyst.git/commit/?id=f083637554bf5668ec856c56cfaaa76bb343d941

if we want to restore that, the way to do it would be:
 - create a new HardlinkLock class in snakeoil.osutils
 - use same API as snakeoil.osutils.FsLock
 - change LockDir to pick FsLock or HardlinkLock based on FS type in __init__
 - nothing else needs to change :)
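
a rough sketch of what such a class might look like, for illustration only. it is self-contained and does not import snakeoil; the method names acquire_write_lock / release_write_lock are an assumption about the FsLock interface and should be checked against snakeoil before relying on them. the idea is the usual hardlink trick: link() a uniquely named file to the lock path, then trust the link count rather than link()'s return value.

```python
import os
import socket
import time


class HardlinkLock:
    """Hypothetical hardlink-based exclusive lock (not part of snakeoil)."""

    def __init__(self, path):
        self.path = path
        # Unique per host and process; threads sharing a path are not handled.
        self._unique = "%s.%s.%d" % (path, socket.gethostname(), os.getpid())
        self._held = False

    def _try_link(self):
        # Create the uniquely named file, then try to hardlink it to the lock.
        fd = os.open(self._unique, os.O_CREAT | os.O_WRONLY, 0o644)
        os.close(fd)
        try:
            try:
                os.link(self._unique, self.path)
            except OSError:
                # Do not trust the error alone; the link count decides.
                pass
            return os.stat(self._unique).st_nlink == 2
        finally:
            os.unlink(self._unique)

    def acquire_write_lock(self, blocking=True, poll=0.5):
        """Take the lock; return False right away if non-blocking and busy."""
        while True:
            if self._try_link():
                self._held = True
                return True
            if not blocking:
                return False
            time.sleep(poll)

    def release_write_lock(self):
        """Drop the lock if this object holds it."""
        if self._held:
            os.unlink(self.path)
            self._held = False
```

note that this only gives an exclusive lock; there is no separate read lock, and picking between the two classes by FS type is left out here.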
Comment 9 Matt Turner gentoo-dev 2020-03-28 23:25:31 UTC
*** Bug 707698 has been marked as a duplicate of this bug. ***