The problem occurred when I had mounted /var/db using aufs (making it writable) from a read-only filesystem. I guess the same error will occur with unionfs instead of aufs and perhaps also with certain nfs setups:
There was a package move of an installed package (in my case app-emacs/ebuild-mode -> app-emacs/gentoo-syntax), and because of this portage always died while trying to rename /var/db/pkg/app-emacs/ebuild-mode-1.5-r3 to /var/db/pkg/app-emacs/gentoo-syntax-1.5-r3:
Traceback (most recent call last):
  File "/usr/bin/emerge", line 5575, in ?
    retval = emerge_main()
  File "/usr/bin/emerge", line 5288, in emerge_main
    if portage._global_updates(trees, mtimedb["updates"]):
  File "/usr/lib/portage/pym/portage.py", line 8249, in _global_updates
    moves = vardb.move_ent(update_cmd)
  File "/usr/lib/portage/pym/portage.py", line 5220, in move_ent
OSError: [Errno 18] Invalid cross-device link
The cause is clearly that the rename crosses a filesystem boundary (see URL). So I suggest that portage catch the above error and fall back to a copy/remove operation for the affected part of the tree in such a case.
I will attach a function which does this (i.e. which I suggest calling instead of os.rename from within portage). (For some reason, shutil.move() threw strange errors in my tests.)
Created attachment 122877: substitute for os.rename
The attached function might be used instead of os.rename from within portage:
It first attempts os.rename; if this fails because of filesystem boundaries, it copies the corresponding tree and removes the old one.
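The attachment itself is not reproduced in this thread. As a rough sketch of the behaviour described above (try os.rename first, and only on a cross-device failure copy the tree and remove the original), something like the following could work. The function name rename_or_copy is hypothetical and the code assumes a modern Python 3, not the python-2.4 in use here. Note that the fallback path is not atomic.

```python
import errno
import os
import shutil


def rename_or_copy(src, dst):
    """Rename src to dst; on EXDEV fall back to copy + remove (not atomic)."""
    try:
        os.rename(src, dst)
        return
    except OSError as e:
        # Only the cross-device case is handled; any other error propagates.
        if e.errno != errno.EXDEV:
            raise
    if os.path.isdir(src) and not os.path.islink(src):
        # Copy the whole tree (symlinks stay symlinks), then drop the original.
        # copytree expects that dst does not exist yet.
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        # Regular file or symlink: copy data and metadata, then unlink.
        shutil.copy2(src, dst, follow_symlinks=False)
        os.unlink(src)
```

A caller would use rename_or_copy(old_path, new_path) wherever os.rename(old_path, new_path) was used before.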
There's a shutil.move(src, dest) function that seems to be made for this type of thing, so hopefully we can just use that.
Created attachment 122886: portage-2.1.3_rc5 patch to replace os.rename() with shutil.move()
This is fixed in svn r6968.
(In reply to comment #2)
> There's an shutil.move(src, dest) function that seem to be made
> for this type of thing, so hopefully we can just use that.
But as I said, shutil.move throws strange errors in my tests (python-2.4.4-r4), so I doubt that this solves the problem:
With /var/db mounted by aufs+squashfs, a simple attempt to call shutil.move to rename a directory throws the error:
  File "/usr/lib/python2.4/shutil.py", line 189, in move
    raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
shutil.Error: Cannot move a directory '/var/db/pkg/app-emacs' into itself '/var/db/pkg/app-emacs-bak'.
(In reply to comment #4)
> shutil.move("/var/db/pkg/app-emacs", "/var/db/pkg/app-emacs-bak")
> File "/usr/lib/python2.4/shutil.py", line 189, in move
> raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
> shutil.Error: Cannot move a directory '/var/db/pkg/app-emacs' into itself
That error is triggered by the flawed logic in this function:
def destinsrc(src, dst):
    return abspath(dst).startswith(abspath(src))
(In reply to comment #5)
Oh, I see... so it is a bug, but hopefully not relevant for portage.
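To illustrate why that prefix test misfires: "/var/db/pkg/app-emacs-bak" starts with the string "/var/db/pkg/app-emacs", so a sibling directory is mistaken for a subdirectory. The sketch below reproduces the check and compares it with one possible component-aware variant. Both function names are hypothetical, and the second is only an illustration, not the fix adopted in Python or portage.

```python
import os


def destinsrc_naive(src, dst):
    # Plain string-prefix test, as in the function quoted above.
    return os.path.abspath(dst).startswith(os.path.abspath(src))


def destinsrc_by_component(src, dst):
    # Hypothetical variant: append a separator so only whole
    # path components can match.
    src = os.path.abspath(src)
    dst = os.path.abspath(dst)
    if not src.endswith(os.sep):
        src += os.sep
    if not dst.endswith(os.sep):
        dst += os.sep
    return dst.startswith(src)


src = "/var/db/pkg/app-emacs"
dst = "/var/db/pkg/app-emacs-bak"
print(destinsrc_naive(src, dst))                  # True: sibling mistaken for a child
print(destinsrc_by_component(src, dst))           # False
print(destinsrc_by_component(src, src + "/sub"))  # True: a real subdirectory
```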
Created attachment 122922: portage-2.1.3_rc5 patch to replace os.rename() with portage.movefile() in cases where the parent directory might change
Since shutil.move() can cause problems, I've converted the relevant cases to use portage.movefile() instead.
Thanks for the patch.
(In reply to comment #7)
> in cases where the parent directory might change
The problem with unionfs/aufs is that there is usually a filesystem boundary within the *same* parent directory unless the file was freshly created/modified (which includes a "move" of the parent directory). One might consider the failure of rename() in such a case as a bug of unionfs/aufs, but it is really not possible to do an atomic rename() in such a case which is a POSIX requirement...
(In reply to comment #8)
> One might consider the
> failure of rename() in such a case as a bug of unionfs/aufs, but it is really
> not possible to do an atomic rename() in such a case which is a POSIX
I'd prefer to stick with rename(2) operations wherever possible. If the unionfs/aufs documentation explicitly states that a specific case will result in EXDEV, then we have an argument against using rename(2). The docs that I've seen seem to indicate that rename operations on files should succeed.
When you issue rename(2) to the file on aufs, aufs may copyup it to the higher writable branch. If this behaviour is not what you want, then you should rename(2) it on the lower branch directly.
E: error (either unionfs or vfs)
none = file does not exist
file = file is a file
dir = file is a empty directory
child= file is a non-empty directory
wh = file is a directory containing only whiteouts; this makes it logically
        none  file  dir   child wh
file    o     o     E     E     E
dir     o     E     o     E     o
child   X     E     X     E     X
wh      o     E     o     E     o
Since I can only use aufs at the moment, I had not checked the unionfs documentation and was therefore not aware that files might be treated differently from directories here. I have now tested with aufs, and it behaves the same way as the unionfs documentation describes:
Renaming files or empty directories succeeds without any problems in all reasonable situations, only renaming a *nonempty* directory fails (if the directory was not in the writable branch before), i.e. the bug was originally triggered only because it was in the "child-child" situation (unionfs terminology).
So it seems that your patch is the perfect solution...
This has been released in 2.1.3_rc6.