Hi. With portage 2.0.51_pre20 you cannot keep your distfiles on NFS because, AFAIK, NFS doesn't support locking with flock(). At least, I get an error message saying that it can't create a lock before downloading something.
error message if you would, permissions, etc... Basically, info please.
ok, sorry the distfiles on the server:
exportfs:
/usr/portage/distfiles 192.168.0.0/24(all_squash,anonuid=250,anongid=250,rw,sync)
drwxrwx--- 4 portage portage 53248 9. Sep 20:37 distfiles
emerge gcc
Calculating dependencies ...done!
>>> emerge (1 of 1) sys-devel/gcc-3.4.1-r2 to /
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2815, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1725, in merge
    retval=portage.doebuild(y,"merge",myroot,self.pkgsettings,edebug)
  File "/usr/lib/portage/pym/portage.py", line 2702, in doebuild
    if not fetch(fetchme, mysettings, listonly, fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1979, in fetch
    file_lock = portage_locks.lockfile(mysettings["DISTDIR"]+"/"+locks_in_subdir+"/"+myfile,wantnewlockfile=1)
  File "/usr/lib/portage/pym/portage_locks.py", line 74, in lockfile
    raise ie
IOError: [Errno 37] No locks available
when I umount the distfiles on the client everything works without problems and IMHO the nfs problem can be worked around by using fcntl() with F_SETLK instead of flock()
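For anyone following along, the fcntl()/F_SETLK idea can be sketched in Python roughly as below. This is only an illustration, not portage's code: the helper names lock_with_fcntl and unlock are made up here. It relies on fcntl.lockf(), which Python documents as a wrapper around the fcntl() locking calls, whereas fcntl.flock() maps to flock(). Whether the swap helps on a given NFS mount still depends on the client and server lock services.

```python
import fcntl
import os
import tempfile


def lock_with_fcntl(path):
    """Open (creating if needed) `path` and take an exclusive fcntl-style lock.

    Hypothetical helper: returns the open file descriptor holding the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o660)
    try:
        # LOCK_NB makes the call fail at once instead of blocking forever.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # e.g. ENOLCK ("No locks available") when no lock service answers.
        os.close(fd)
        raise
    return fd


def unlock(fd):
    """Release the lock taken by lock_with_fcntl() and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.portage_lockfile")
    held = lock_with_fcntl(demo)
    print("locked", demo)
    unlock(held)
```

The non-blocking flag is a deliberate choice for the sketch: a caller that wants to wait can retry in a loop and print progress, rather than hanging inside the kernel call.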
Perhaps your NFS permissions are too strict to allow locking. I've used NFS distfiles for a very long time now and never had a problem ... but I mount it without squash options and as root.
Even with no_root_squash and 777 permissions I get this error message. Can you give me your exports line that works? (BTW: with portage 2.0.50 it works this way)
Solved... 2.6.9-rc1-mm1 was broken for nfs locks... Btw, this was reported before in #37344
works for me
*** Bug 64625 has been marked as a duplicate of this bug. ***
Still trying to get my brain around this, but at least I'm seeing a consistent pattern. I'm having lock errors alright, slightly different from the one the original poster reported, and it's happening right after I have a freshly installed portage-2.0.51_rc1:
daimyo ~ # emerge =sys-apps/portage-2.0.50-r11
Calculating dependencies ...done!
>>> emerge (1 of 1) sys-apps/portage-2.0.50-r11 to /
*** Adjusting cvs-src permissions for portage user...
!!! Unable to chgrp of /usr/portage/distfiles to portage, continuing
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2826, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1733, in merge
    retval=portage.doebuild(y,"merge",myroot,self.pkgsettings,edebug)
  File "/usr/lib/portage/pym/portage.py", line 2369, in doebuild
    if not fetch(fetchme, mysettings, listonly=listonly, fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1637, in fetch
    file_lock = portage_locks.lockfile(mysettings["DISTDIR"]+"/"+locks_in_subdir+"/"+myfile,wantnewlockfile=1)
  File "/usr/lib/portage/pym/portage_locks.py", line 48, in lockfile
    os.chown(lockfilename,os.getuid(),portage_data.portage_gid)
OSError: [Errno 1] Operation not permitted: '/usr/portage/distfiles/.locks/portage-2.0.50-r11.tar.bz2.portage_lockfile'
Then I just do the exact same thing again, and it works, compiles, installs, everything. Re-emerging current portage-2.0.51_rc1, emerge the old portage again, start all over, just like above. First time failure, second time success.
Kernel config (2.6.9-rc2-mm1):
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
# CONFIG_NFS_TCP is not set
The NFS share is on a corporate FreeBSD 4.6.2-release server, NFS version unknown. I don't have admin rights on that machine. The exports file looks like this:
/data2 -alldirs -maproot=pkgshare -network 193.41.125.0 -mask 255.255.255.0
rpcinfo -p:
program vers proto port
100000  2    tcp   111   portmapper
100000  2    udp   111   portmapper
100005  3    udp   1023  mountd
100005  3    tcp   1023  mountd
100005  1    udp   1023  mountd
100005  1    tcp   1023  mountd
100003  2    udp   2049  nfs
100003  3    udp   2049  nfs
100003  2    tcp   2049  nfs
100003  3    tcp   2049  nfs
100024  1    udp   1011  status
100024  1    tcp   1022  status
nfsstat shows traffic only for a v3 client, so I suppose that means the server speaks v3, too. Tell me if you need any additional information. Thanks for looking into this again!
Like nick and the reporter mention, flock() isn't NFS friendly.
So I've tested this on my laptop for comparison now, and the error is reproducible on that one, too. Different kernel (2.6.9-rc1-mm5) but with almost identical config, distfiles on the same NFS server, shorter cable length (the desktop is on a CAT5 to the adjacent building that's almost 100 meters long), same effect. Except that I've now discovered that I can repeat the emerge command as much as I want, the failure --> success pattern only shows when there's a single file being taken from distfiles. gcc for example never gets there because the lockfile error just rotates through the list of files to be uncompressed. I get OSError: [Errno 1] Operation not permitted: '/usr/portage/distfiles/.locks/gcc-3.4.2-manpages.tar.bz2.portage_lockfile' first for manpages, then patches, then gcc proper, and back to manpages etc. Never the same file twice in a row, not always in the same order either, but always one file that cannot get locked. I can now only emerge things that rely on single distfiles (like groff a couple of minutes ago, that one compiled and installed cleanly on the second attempt, just like portage in comment #8 above).
*** Bug 64397 has been marked as a duplicate of this bug. ***
I played a hunch and restarted the nfs daemon on the server and the problem was gone. Portage works as expected here now.
The same error appears if the distfiles are stored on a FAT32 partition and symlinked to. As the "chown" command is not allowed on such a filesystem, portage fails. A workaround could of course be passing the portage group ID while mounting, but I have other things on that disk and don't want to change the group settings (that would totally break my system's config). Hacking the portage script in the appropriate lines (in pym/portage_locks.py) might do the trick, but I don't want to mess around with it if there's another way. It would be really nice if you could solve this issue; I don't have the option of moving my distfiles dir to an ext2/3 partition. My suggestion is to provide an additional flag for /etc/make.conf (e.g. FEATURES="distfilesonfat") so that the chown safety check is left out.
Created attachment 40245 [details, diff]
Workaround for distfiles directory on a vfat partition; requires an additional FEATURES flag
Only slight modifications were made to pym/portage_locks.py: it now searches for the flag "vfatdistfiles" in FEATURES (/etc/make.conf) and, if the flag is set, won't try to chown the .locks/<filename>. If the flag is not set, the old behaviour is kept. It works for me, so if you like it, use it :) (please tell me if you do so, just curious if it was helpful!) pi~
Comment on attachment 40245 [details, diff]
Workaround for distfiles directory on a vfat partition; requires an additional FEATURES flag
In pym/portage.py, there's another use of chown which breaks emerge directly after downloading the ebuild's files (same error). I corrected this so that one doesn't have to restart emerge again for the install to work. See new patch.
Created attachment 40250 [details, diff]
Workaround for distfiles directory on a vfat partition, second instance
Directly after downloading the source files the emerge process would abort because chown is called from pym/portage.py as well. I fixed that, also using the 'vfatdistfiles' flag. If it works for you as well, or if you run into trouble, please also stop by here: http://forums.gentoo.org/viewtopic.php?t=224166 and post it there as well. regards, pi~
Any reason you can't just make a ext2 loopback filesystem on your FAT disc ?
The idea behind portage_* files is to remove the portage module itself from the picture. So your circular dep there is bad. Besides that... You've just potentially annihilated all lockfiles for non-root users, including userpriv. NFS does support locking, but NFSv2 is not very good at it. NFSv3 is fine. You have to enable that when you build your kernel. I'm working out a fix, but I need testers periodically.
Test #1: Change all calls to 'flock' to 'lockf'. Search and replace.
Dumb FS Option #1: Check the result from chown's exception.
I'm not completely sure which file you're talking about doing the search/replace in, but I replaced all instances of "fcntl.flock" to "fcntl.lockf" in /usr/lib/portage/pym/portage_locks.py. It didn't solve the problem, but it at least gives me a traceback now.
emerge -f gnome
Calculating dependencies ...done!
>>> emerge (1 of 76) gnome-base/gail-1.6.6 to /
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2826, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1694, in merge
    retval=portage.doebuild(y,"fetch",myroot,self.pkgsettings,edebug,("--pretend" in myopts),fetchonly=1)
  File "/usr/lib/portage/pym/portage.py", line 2369, in doebuild
    if not fetch(fetchme, mysettings, listonly=listonly, fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1637, in fetch
    file_lock = portage_locks.lockfile(mysettings["DISTDIR"]+"/"+locks_in_subdir+"/"+myfile,wantnewlockfile=1)
  File "/usr/lib/portage/pym/portage_locks.py", line 74, in lockfile
    raise ie
IOError: [Errno 37] No locks available
That's a lovely result. I guess I'll have to implement the hardlink test. As for the vfat issues, I'm working that out with a "friendly_chown" function.
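A rough sketch of what a "friendly" chown might look like, for readers wondering about the idea. Only the name friendly_chown comes from the comment above; the body, the return convention and the choice of errno values are assumptions for illustration and not the code that went into portage. The point is simply to treat a refused chown (as on vfat, or on NFS with squashed IDs) as a warning rather than a fatal error.

```python
import errno
import os
import sys

# errno values taken to mean "this filesystem or server will not let us
# change ownership". The exact set is an assumption for this sketch.
REFUSED = (errno.EPERM, errno.EACCES)


def friendly_chown(path, uid, gid):
    """Try to chown `path`; return True on success, False if refused.

    Any other failure (for example a missing file) is still raised.
    """
    try:
        os.chown(path, uid, gid)
    except OSError as e:
        if e.errno in REFUSED:
            sys.stderr.write("Cannot chown %s, continuing\n" % path)
            return False
        raise
    return True
```

A caller could then keep going when the function returns False, which is roughly what the "Dumb FS Option #1" suggestion earlier in the thread asks for, without hiding unrelated errors.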
Yeah well it was only a quick'n'dirty hack, not a full-blown solution. But I needed something that would work _now_, and it does. Still, I'm waiting eagerly for a "real" solution from you devs. I'm not very experienced in Python (doing mostly c++/bash/php stuff), and I didn't look into portage yet before ("just used it"). But as from what I've seen, I really like Python so far, and portage is HUGE! (in terms of "being good") Anyway, waiting for the fix now. pi~
Fiddling around with this, my filesystem went limp when an 'emerge gcc' died horribly ("filesystem is read-only") upon unpacking to /var/tmp/portage. I sighed and went on a business trip, to return only last night... $deity bless ext3's recovery mechanism, I've got 2000 entries in lost+found, but at least I still have a working laptop. :) Meanwhile, the desktop is apparently more sturdy and still open for suggestions. I've checked two things:
1. Changed all occurrences of fcntl.flock to fcntl.lockf
   Same error as reported earlier: OSError: [Errno 1] Operation not permitted: '/usr/portage/distfiles/.locks/blahblah.tar.gz.portage_lockfile'
2. Changed the share from NFS to Samba (the server allows both)
   Same error, both with the old fcntl.flock and the new fcntl.lockf.
Locks are valid on samba... There is an issue with reexporting an NFS share through samba that creates locking issues. Are you doing that?
OriginalServer <------------NFS---------------> You
OriginalServer <--NFS--> SomeServer <--Samba--> You
The lockf fix will be included in _rc2 along with the hardlink-shuffle.
I don't think it's doing that sort of ricocheted export. It's the same machine that serves the distfiles directory (just below /data2/SHARE/PKGSHARE) as both NFS and Samba share. The smb.conf entry looks like this:
[PKGSHARE]
comment = common download pool
browseable = no
writable = yes
only user = no
create mask = 0666
path = /data2/SHARE/PKGSHARE
guest ok = no
oplocks = False
hide dot files = yes
valid users = +pkgshare
while /etc/exports states:
/data2 -alldirs -maproot=pkgshare -network 193.41.125.0 -mask 255.255.255.0
I only have read permission on those files, but I can ask the admin to help if there's anything that needs to be done server-side.
Well, that config makes it pretty obvious. You disabled oplocks. Is there a particular reason? portage-2.0.51_rc3 is out with NFSv2 fixes.
Ok. So NFSv2 fixes aren't in yet. Working on that right now.
RC3 is actually worse for me. Before the only issue was with downloading the distfiles, but now it won't let me emerge a package at all.
emerge -uD world
Calculating world dependencies ...done!
>>> emerge (1 of 1) sys-apps/vixie-cron-4.1-r1 to /
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2844, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1737, in merge
    retval=portage.doebuild(y,"merge",myroot,self.pkgsettings,edebug)
  File "/usr/lib/portage/pym/portage.py", line 2370, in doebuild
    if not fetch(fetchme, mysettings, listonly=listonly, fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1639, in fetch
    file_lock = portage_locks.lockfile(mysettings["DISTDIR"]+"/"+locks_in_subdir+"/"+myfile,wantnewlockfile=1)
  File "/usr/lib/portage/pym/portage_locks.py", line 80, in lockfile
    raise ie
IOError: [Errno 37] No locks available
Just upgraded to portage-2.0.51_rc4 (from rc1) and am now experiencing locking problems with my NFS-mounted distdir. I get no error as such, but portage seems to be waiting on its own lock file.
>>> emerge (1 of 5) x11-misc/shared-mime-info-0.15 to /
Hardlink lockfile: /mnt/nfs/portage/distfiles/.locks/shared-mime-info-0.15.tar.gz.portage_lockfile.hardlock-terminus-21045
Waiting on (hardlink) lockfile: (one '.' per 3 seconds)
/mnt/nfs/portage/distfiles/.locks/shared-mime-info-0.15.tar.gz.portage_lockfile
......................
It will then just sit there forever. If I start an emerge, then open another term and delete the lockfile, portage will carry on as normal; however, cleaning the locks before I start an emerge has no effect. I had no problems with rc1 and have no problems if DISTDIR is on a local filesystem.
Just had a possible solution proposed... emerge nfs-utils
You have to restart nfs/nfsmount after installing that package. FEATURES=-distlocks will let you merge whatever you'd like without concern for locks. Herbie: You had no problems on _rc1? You're certain that you were on _rc1? _rc1 had a much less robust/NFS-stable locking scheme. I'd be quite surprised if it was actually working. Can you post mount options?
*** Bug 65426 has been marked as a duplicate of this bug. ***
------- Additional Comment #1 From Marien Zwart 2004-09-26 10:37 PST ------- I had the same issue. 'FEATURES="-distlocks" emerge package' disables locking, allowing me to merge things. After merging nfs-utils on the client and doing /etc/init.d/netmount restart (and using /usr/lib/portage/bin/clean_locks --force /usr/portage/distfiles/.locks, but not sure that was necessary) things started working again. If this works for other people too, I suggest adding this to portage's output when waiting on locks.
RC4 looks a lot better, but still not quite right for me.
emerge -uD world
Calculating world dependencies ...done!
>>> emerge (1 of 2) x11-misc/shared-mime-info-0.15 to /
Hardlink lockfile: /usr/portage/distfiles/.locks/shared-mime-info-0.15.tar.gz.portage_lockfile.hardlock-hostname-10138
Waiting on (hardlink) lockfile: (one '.' per 3 seconds)
/usr/portage/distfiles/.locks/shared-mime-info-0.15.tar.gz.portage_lockfile
...............................
Portage doesn't appear to ever stop waiting (I waited probably more than 5 minutes on a different attempt). I tried deleting shared-mime-info-0.15.tar.gz.portage_lockfile during the wait and the emerge then resumed normally. Doing an emerge -f on the NFS server while the NFS client is waiting on the lockfile will also clear the lock and allow the emerge to resume. This is occurring both when the distfiles exist and when they need to be downloaded.
yep, quite sure. Just tried regressing back to 2.0.51_rc1 which solved the problem. Upon emerging _rc4 again I get the output I posted above. Emerging/reemerging nfs-utils has no effect here. server export options: rw,no_subtree_check,no_root_squash,async client mount options: rw,rsize=8192,wsize=8192,nfsvers=3,hard
Same here... rc1 works, rc4 doesn't
Everyone reporting that _rc4 is broken for them:
emerge nfs-utils ; /etc/init.d/nfsmount restart
If it doesn't work, then please post the following:
uname -a
mount | grep mountpoint_goes_here
How is it mounted? Re-export of another share?
You may need to run /usr/lib/portage/bin/clean-locks --force
I re-emerged nfs-utils and started nfsmount. I also ran /usr/lib/portage/bin/clean-locks --force. Now the emerge hangs whether the distfile exists or not with the following output:
emerge -f gnome
Calculating dependencies ...done!
>>> emerge (1 of 76) gnome-base/gail-1.6.6 to /
CTRL-C now has no effect, nor does deleting the lockfile (or doing an emerge -f on the server).
uname -a
Linux wod28910rn 2.6.8-gentoo-r4 #1 Mon Sep 13 04:53:12 EST 2004 i686 Intel(R) Pentium(R) 4 CPU 2.00GHz GenuineIntel GNU/Linux
mount | grep /usr/portage
192.168.0.3:/usr/portage on /usr/portage type nfs (rw,hard,intr,tcp,nfsvers=3,addr=192.168.0.3)
(on NFS server) grep /usr/portage /etc/exports
/usr/portage 192.168.0.2(rw,no_root_squash,sync)
The NFS share is just the /usr/portage directory on the server's / partition. The partition is reiser4 (kernel is gentoo-dev-sources with reiser4 patched in).
uname -a:
Linux terminus 2.6.8-gentoo-r3 #3 Wed Sep 22 21:06:20 BST 2004 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
client mount opts: rw,rsize=8192,wsize=8192,nfsvers=3,hard
server export opts: rw,no_subtree_check,no_root_squash,async
Tried reemerging nfs-utils, restarting services, changing various mount options, running clean-locks, all on both client and server. Always get the same result: portage waits indefinitely for its own lock file to disappear.
Not sure if it's relevant or whether it is intended behaviour, but upon starting an emerge portage creates two lock files in distfiles/.locks. For example:
$ rm -f distfiles/.locks/*
$ emerge -f nfs-utils
Calculating dependencies ...done!
>>> emerge (1 of 1) net-fs/nfs-utils-1.0.6-r4 to /
Hardlink lockfile: /mnt/nfs/portage/distfiles/.locks/nfs-utils-1.0.6.tar.gz.portage_lockfile.hardlock-terminus-10187
Waiting on (hardlink) lockfile: (one '.' per 3 seconds)
/mnt/nfs/portage/distfiles/.locks/nfs-utils-1.0.6.tar.gz.portage_lockfile
......
(pressed Ctrl-C to stop the emerge here)
$ ls distfiles/.locks
nfs-utils-1.0.6.tar.gz.portage_lockfile
nfs-utils-1.0.6.tar.gz.portage_lockfile.hardlock-terminus-10208
Herbie: Two lockfiles is a locking technique. Create a system+process unique file and then, using the desired lockfile name, hardlink it. If it succeeds and/or the link count on the unique file is 2, then you have the lock. What is the underlying FS for the NFS mount?
Charlie: Could you try a non-reiser4 partition as a test please?
I'll post a patch shortly with tons of debug.
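The two-lockfile technique described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the portage implementation: the function names are invented, it makes a single attempt instead of waiting, and the ".hardlock-<host>-<pid>" suffix merely imitates the file names seen in the output quoted in this thread.

```python
import errno
import os
import socket
import tempfile


def hardlink_lock(lockfilename):
    """Make one attempt to take `lockfilename` by hard-linking to it.

    Returns the path of our unique file when the lock is ours, else None.
    """
    unique = "%s.hardlock-%s-%d" % (
        lockfilename, socket.gethostname(), os.getpid())
    # Step 1: create the system+process unique file.
    os.close(os.open(unique, os.O_WRONLY | os.O_CREAT, 0o660))
    # Step 2: hardlink it to the desired lockfile name.
    try:
        os.link(unique, lockfilename)
    except OSError as e:
        if e.errno != errno.EEXIST:
            os.unlink(unique)
            raise
    # Step 3: trust the link count rather than link()'s return value,
    # since over NFS a link() that succeeded can still report an error.
    if os.stat(unique).st_nlink == 2:
        return unique
    os.unlink(unique)
    return None


def hardlink_unlock(lockfilename, unique):
    """Drop the lock by removing both names."""
    os.unlink(lockfilename)
    os.unlink(unique)


if __name__ == "__main__":
    lock = os.path.join(tempfile.mkdtemp(), "demo.portage_lockfile")
    mine = hardlink_lock(lock)
    print("got lock:", mine is not None)
    hardlink_unlock(lock, mine)
```

Because a process that dies between the lock and unlock steps leaves both names behind, something still has to remove stale files afterwards, which matches the behaviour reported in the surrounding comments.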
Linux excelsior.weeve.org 2.6.9-rc2 #9 Fri Sep 24 16:37:35 MDT 2004 sparc64 sun4u TI UltraSparc IIe (Hummingbird) GNU/Linux mounted with options (rw,soft,intr,addr=192.168.0.1) Using autofs to manage the mount point rather than nfs-utils. Underlying FS is ext3
underlying filesystem is reiserfs (v3) here.
http://zarquon.twobit.net/gentoo/portage/portage_locks.py-2.0.51_rc4-debug.diff
patch /usr/lib/portage/pym/portage_locks.py < portage_locks.py-2.0.51_rc4-debug.diff
That will produce a tremendous amount of output on most portage operations. Just log it all, and post it for me.
Created attachment 40478 [details] portage-2.0.51_rc4 lockfile debug output
Ok. There's a typo in the original patch, I changed the patch on my server. Herbie: Look for mylsd in /usr/lib/portage/pym/portage_locks.py and change it to mylsf. Before starting that output, please run /usr/lib/portage/bin/clean_locks --force
Created attachment 40479 [details] portage-2.0.51_rc4 lockfile debug output
Created attachment 40481 [details] debug output
I get the same results when using reiserfs v3.
Created attachment 40483 [details] weeve's ncftp emerge log using second patch
http://zarquon.twobit.net/gentoo/portage/portage_locks.py-2.0.51_rc4-debug2.diff All I really need is the 'Exception' line after "Lock failed"
What I get here is; Exception: [Errno 17] File exists
with debug2:
lockfile(): Calling hardlink_lockfile()
lockfile(): Hardlink: Attempting link.
lockfile(): Hardlink: Link failed.
Exception: [Errno 17] File exists
lockfile(): hardlink_is_mine() Entered
Well... I had NFSv2 and now NFSv3 up to rc1 and everything was fine. With rc3 (never used rc2) I couldn't download or emerge anything. Same problems that everyone is getting with NFS in both v2 and v3... However, now with rc4 I can't even emerge sync. Here's the error I get with that:
receiving file list ... 98728 files to consider
delete_one: rmdir "/usr/portage/distfiles" failed: Device or resource busy
Number of files: 98728
Number of files transferred: 0
Total file size: 76526147 bytes
Total transferred file size: 0 bytes
Literal data: 0 bytes
Matched data: 0 bytes
File list size: 2293018
Total bytes written: 184
Total bytes read: 2293139
wrote 184 bytes read 2293139 bytes 97588.21 bytes/sec
total size is 76526147 speedup is 33.37
rsync error: some files could not be transferred (code 23) at main.c(1064)
Ok. On a clean _rc4, try this. I think it'll fix it. http://zarquon.twobit.net/gentoo/portage/portage_locks.py-2.0.51_rc4.diff
yep, that works for me :)
This patch works for me, except for when I use ctrl-c or kill the process while the lock is in place. If this happens, the lock is not removed and I see the same behavior as before. Executing "/usr/lib/portage/bin/clean_locks --force" does not remove the lock, but deleting the files in /usr/portage/distfiles/.locks does. Also, is the lock intending to prevent multiple machines from downloading the same file at once, or just multiple processes on the same machine? If it is the former, then I do see one other issue. Neither the NFS server nor client honor the lock put in place by the other machine. I haven't tested whether or not the NFS clients will honor locks by other clients.
That would be the catch with NFS locks and killing an app. Once we get portage fully handling signals, it should always clean up after itself, but you can still do things to prevent/break that. The hardlink method uses reference/link counts. So it's not a FS layer lock. It's an atomic lock due to the nature of linking. All locking is cooperative anyway, but portage, assuming you update all boxes, will pay attention to the lock. I'll add another message about the clean_locks tool.
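To illustrate the "clean up after itself" point, here is a small sketch of releasing a lock file when Ctrl-C interrupts the work. Everything in it is an assumption for illustration: held_lockfile is a made-up name, and the lock is a plain O_EXCL marker file on a local filesystem rather than portage's hardlink scheme. It covers only SIGINT, which Python delivers as KeyboardInterrupt; a SIGKILL can never be caught, and SIGTERM would need its own handler.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def held_lockfile(path):
    """Hold `path` as a marker lock and always remove it on the way out."""
    # O_EXCL makes creation fail if the lock file already exists.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o660)
    os.close(fd)
    try:
        yield path
    finally:
        # Runs on normal exit and also when Ctrl-C raises KeyboardInterrupt
        # inside the with-block, so no stale lock is left behind.
        os.unlink(path)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.portage_lockfile")
    try:
        with held_lockfile(demo):
            raise KeyboardInterrupt  # stands in for the user pressing Ctrl-C
    except KeyboardInterrupt:
        pass
    print("lock left behind:", os.path.exists(demo))
```

As the comment above says, this can still be defeated, for instance by a hard kill or a crashed machine, so a cleanup tool remains necessary.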
Works for me as well.
rc4 fails here now. rc1 was broken but all I had to do was reload the nfs daemon and all was well (portage mounted nfsv3 on ext3). Now when I try to emerge it just hangs with:
Waiting on (hardlink) lockfile: (one '.' per 3 seconds)
/usr/portage/distfiles/.locks/shared-mime-info-0.15.tar.gz.portage_lockfile
.........(infinity)
Any insight guys?
Paul: 2.0.51_rc5
rc6 works partially... you can now merge with distfiles on NFS, but when you try emerge sync you get
delete_one: rmdir "/usr/portage/distfiles" failed: Device or resource busy
and emerge tries to sync until it hits the maximum number of retries. Any ideas?
bug 65519
Just tried _rc6, and it's still yelling at me if I run emerge -v, but it does the trick:
Calculating world dependencies ...done!
>>> emerge (1 of 10) media-libs/libexif-0.6.10 to /
*** Adjusting cvs-src permissions for portage user...
!!! Unable to chgrp of /usr/portage/distfiles to portage, continuing
Cannot chown a lockfile. This could cause inconvenience later.
That last line gets repeated for each file being unpacked whenever there are multiple files called by an ebuild. But it works over NFS again, that's the important bit. :)
Rotating through the different hosts I have got running Gentoo on, there's one that doesn't behave. Identical setup with distfiles on the same NFS share as the two others that work fine now, only significant differences being a 2.4.27 kernel rather than 2.6.9.something on the others, and the underlying filesystem being Reiser3 (instead of XFS and ext3 respectively). This is the result:
# emerge -uDv world
Calculating world dependencies ...done!
>>> emerge (1 of 17) dev-libs/expat-1.95.8 to /
*** Adjusting cvs-src permissions for portage user...
!!! Unable to chgrp of /usr/portage/distfiles to portage, continuing
Cannot chown a lockfile. This could cause inconvenience later.
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2885, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1776, in merge
    retval=portage.doebuild(y,"merge",myroot,self.pkgsettings,edebug)
  File "/usr/lib/portage/pym/portage.py", line 2380, in doebuild
    if not fetch(fetchme, mysettings, listonly=listonly, fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1649, in fetch
    file_lock = portage_locks.lockfile(mysettings["DISTDIR"]+"/"+locks_in_subdir+"/"+myfile,wantnewlockfile=1)
  File "/usr/lib/portage/pym/portage_locks.py", line 114, in lockfile
    raise e
IOError: [Errno 13] Permission denied
Same thing after an 'emerge metadata', portage-2.0.51_rc6. Any ideas?
If you are still on 51_rc6:
FEATURES=-distlocks emerge portage
If you are on other versions and this doesn't work, use a rescue portage, and then update to _rc7 or later.
rc9 breaks things for me again.
emerge --oneshot binutils
Calculating dependencies ...done!
>>> emerge (1 of 1) sys-devel/binutils-2.15.92.0.2-r1 to
Portage hangs here (the file already exists). Could this (from the ChangeLog) have anything to do with it:
" 08 Oct 2004; Brian Harring <ferringb@gentoo.org> portage_locks.py: Reverted to using flock by default- if it fails (unavailable), -then- use lockf, then hardlink."
I don't have that problem with rc9. Things work as expected on all three hosts I've tested it on, provided I have FEATURES="-distlocks" in /etc/make.conf. From seeing distlocks mentioned in /etc/make.conf.example, I suppose you intend to keep this behaviour indefinitely. Any plans on silencing that "!!! unable to chgrp" warning, then? Portage just isn't its same old friendly self unless it stops yelling at me every time I emerge something... :)
Of course it works correctly when using -distlocks, this prevents the problem code from executing. It doesn't bother me at all that the locks aren't working correctly since I can just use the -distlocks feature, but this is still a bug so I'm reporting it.
Charlie, if you could post the full output, that would be appreciated.
ls -li /usr/portage/distfiles/.locks
I'm not Charlie, but here's what it looks like on my share:
$ ls -li /usr/portage/distfiles/.locks/
total 0
963201 -rw-rw---- 1 600 600 0 Sep 28 21:30 expat-1.95.8.tar.gz.portage_lockfile
963206 -rw-rw-rw- 1 1012 600 0 Sep 24 11:10 frozen-bubble-client-0.0.3.tar.bz2.portage_lockfile
963205 -rw-rw---- 1 600 600 0 Sep 22 10:53 gcc-3.4.2-manpages.tar.bz2.portage_lockfile
963204 -rw-rw---- 1 600 600 0 Sep 22 10:52 gcc-3.4.2.tar.bz2.portage_lockfile
963209 -rw-rw---- 1 600 600 0 Sep 25 01:29 gpgme-0.9.0.tar.gz.portage_lockfile
963214 -rw-rw---- 1 600 600 0 Oct 14 11:34 kdebase-3.3.1.tar.bz2.portage_lockfile
963207 -rw-rw-rw- 1 1012 600 0 Sep 24 11:14 matritsa-0.1.2.tar.gz.portage_lockfile
963208 -rw-rw---- 1 600 600 0 Sep 25 01:15 modutils-2.4.26.tar.bz2.portage_lockfile
963213 -rw-rw---- 1 600 600 0 Oct 12 09:51 portage-2.0.51_rc7.tar.bz2.portage_lockfile
963212 -rw-rw---- 1 600 600 0 Sep 28 10:50 ppp-2.4.2-mppe-mppc-1.1.patch.gz.portage_lockfile
963210 -rw-rw---- 1 600 600 0 Oct 1 23:44 readline50-004.portage_lockfile
963211 -rw-rw---- 1 600 600 0 Oct 12 09:28 shadow-4.0.4.1.tar.bz2.portage_lockfile
963202 -rw-rw---- 1 600 600 0 Sep 24 10:51 winesetuptk-0.7.tar.gz.portage_lockfile
963203 -rw-rw-rw- 1 1012 600 0 Sep 24 11:04 xdelta-1.1.3.tar.gz.portage_lockfile
Ulrich: You can delete all those.
That was the entire output of the failed emerge. Here is what my .locks dir looks like after a failed emerge of sgml-common:
ls -li /usr/portage/distfiles/.locks
total 0
5473702 -rw-rw---- 1 root portage 0 Oct 15 12:46 sgml-common-0.6.3.tgz.portage_lockfile
I unfixed it for _rc10. Strange stuff really. It shouldn't be broken.
Can anyone verify that CURRENT (2.0.51 or 2.0.51_rc10) fixes the problem once and for all?
It runs fine, except that the lock isn't removed if I cancel a download with CTRL+C
It works for me too.
No problems here, either, except for the warnings ("!!! Unable to chgrp..." and "Couldn't chown a lockfile. This could cause inconvenience...") Anyway, as you've released 2.0.51 already, I suppose you know it works... Congratulations!
A similar bug exists when /usr/portage/distfiles is mounted via cifs. The emerge process fails with the following output.
>>> emerge (1 of 1) sys-kernel/gentoo-dev-sources-2.6.9-r1 to /
*** Adjusting cvs-src permissions for portage user...
Traceback (most recent call last):
  File "/usr/bin/emerge", line 2991, in ?
    mydepgraph.merge(mydepgraph.altlist())
  File "/usr/bin/emerge", line 1839, in merge
    retval=portage.doebuild(y,"merge",myroot,self.pkgsettings,edebug)
  File "/usr/lib/portage/pym/portage.py", line 2506, in doebuild
    if not fetch(fetchme, mysettings, listonly=listonly, fetchonly=fetchonly):
  File "/usr/lib/portage/pym/portage.py", line 1849, in fetch
    portage_locks.unlockfile(file_lock)
  File "/usr/lib/portage/pym/portage_locks.py", line 162, in unlockfile
    raise IOError, "Failed to unlock file '%s'\n" % lockfilename
IOError: Failed to unlock file '/usr/portage/distfiles/.locks/linux-2.6.9.tar.bz2.portage_lockfile'
This is under portage-2.0.51-r2
For the CIFS case, the traceback implies that a standard lockf lock was obtainable but was not unlockable afterward. This seems like either a bug or possibly just a deficiency in the file system itself. You can of course get around it by using FEATURES="-distlocks".
This was fixed a while back. flock, then lockf, then hardlink is the locking approach. If hardlink isn't possible, well, you're screwed :)
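The flock, then lockf, then hardlink order can be pictured with a sketch like the one below. It is an illustration under assumptions, not the code in portage_locks.py: lock_with_fallback and the UNAVAILABLE tuple are invented names, the set of errno values treated as "mechanism unavailable" is a guess, and the hardlink stage is only signalled to the caller rather than performed.

```python
import errno
import fcntl
import os
import tempfile

# errno values taken to mean "this locking call is unusable on this
# filesystem" (assumed set; ENOLCK is the "No locks available" error).
UNAVAILABLE = (errno.ENOLCK, errno.ENOSYS, errno.EOPNOTSUPP)


def lock_with_fallback(path, strategies=None):
    """Try each kernel locking call in turn on `path`.

    Returns (name, fd) for the first call that works, or ("hardlink", None)
    when none is usable and a hardlink lock would be the last resort.
    """
    if strategies is None:
        strategies = [("flock", fcntl.flock), ("lockf", fcntl.lockf)]
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o660)
    for name, lock in strategies:
        try:
            lock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return name, fd
        except OSError as e:
            if e.errno not in UNAVAILABLE:
                # A real failure (e.g. the lock is held): do not mask it.
                os.close(fd)
                raise
    os.close(fd)
    return "hardlink", None


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.portage_lockfile")
    used, handle = lock_with_fallback(demo)
    print("locked with", used)
    if handle is not None:
        os.close(handle)
```

Passing the strategies in as a list keeps the order explicit and makes it easy to exercise the fallback path without an actual NFS mount.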
I seem to get this problem on the latest ~x86 version of portage.
The -distlocks in Comment #80 helped fix the problem on 2.2 though
speaking on the topic of locks why not put the locks in /var/lock?
(In reply to comment #84)
> speaking on the topic of locks why not put the locks in /var/lock?
Locks are supposed to prevent readers from reading incomplete files and to prevent multiple writers. If the locks are in /var/lock instead of on NFS, then when distfiles are shared over the network the entire purpose of locking is lost, because /var/lock is not a shared resource; multiple writers or uninformed readers become possible.