My use case might be a little exotic, but I'm using git-annex (http://git-annex.branchable.com/) to sync my packages between multiple hosts over a bad internet connection. For this reason every file is a symlink:

$ ls -l /usr/portage/packages/sys-process/htop-1.0.2.tbz2
lrwxrwxrwx 1 portage portage 203 Apr 5 19:53 /usr/portage/packages/sys-process/htop-1.0.2.tbz2 -> ../.git/annex/objects/zq/XK/SHA256E-s90880--fff5625837a11539d684641c5e7690bdf6776e6838a17ef47bdba145ef48a822.2.tbz2/SHA256E-s90880--fff5625837a11539d684641c5e7690bdf6776e6838a17ef47bdba145ef48a822.2.tbz2

I expected this not to be a problem as long as the content is the same. But the portage Python library very often uses 'lstat' instead of 'stat', which returns information about the symlink itself rather than about the file it points to. As a result the package size is completely wrong. This has various consequences, e.g. "emaint binhost --fix" clears the Packages file completely.

Is that intended? Why is it designed that way? Is there a chance to change that?

Reproducible: Always

Steps to Reproduce:

$ cd /usr/portage/packages/
$ ls -l x11-wm/i3-4.5.1.tbz2
-rw-r--r-- 1 root root 742462 Apr 5 22:35 x11-wm/i3-4.5.1.tbz2
$ python -c '
> import portage;
> bintree = portage.db[portage.settings["EROOT"]]["bintree"];
> print([i for i in bintree.dbapi.cpv_all() if str(i) == "x11-wm/i3-4.5.1"])'
['x11-wm/i3-4.5.1']
$ mv x11-wm/i3-4.5.1.tbz2 .
$ ln -s /usr/portage/packages/i3-4.5.1.tbz2 x11-wm/i3-4.5.1.tbz2
$ ll x11-wm/i3-4.5.1.tbz2
lrwxrwxrwx 1 root root 35 Sep 10 22:48 x11-wm/i3-4.5.1.tbz2 -> /usr/portage/packages/i3-4.5.1.tbz2
$ python -c '
> import portage;
> bintree = portage.db[portage.settings["EROOT"]]["bintree"];
> print([i for i in bintree.dbapi.cpv_all() if str(i) == "x11-wm/i3-4.5.1"])'
[]

Actual Results:
Symlinks are not resolved in portage's bintree or in emaint binhost.

Expected Results:
Symlinks are resolved and the actual file is used for comparisons of e.g. size or mtime.
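To illustrate the lstat/stat difference described in this report independently of portage, here is a minimal sketch using only the Python standard library. The file names and the 90880-byte payload are placeholders borrowed from the listing above, and the helper `describe` is hypothetical, not portage code.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (is_symlink, lstat_size, stat_size) for path."""
    link_info = os.lstat(path)    # metadata of the directory entry itself
    target_info = os.stat(path)   # metadata after following symlinks
    return (stat.S_ISLNK(link_info.st_mode),
            link_info.st_size,
            target_info.st_size)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the git-annex object that holds the real package data.
    target = os.path.join(tmp, "payload.tbz2")
    with open(target, "wb") as f:
        f.write(b"x" * 90880)

    # Stand-in for the symlink that sits in the package directory.
    link = os.path.join(tmp, "htop-1.0.2.tbz2")
    os.symlink(target, link)

    is_link, lsize, ssize = describe(link)
    print("symlink:", is_link)
    print("lstat size:", lsize)   # size of the link itself, not the package
    print("stat size:", ssize)    # size of the file the link points to
```

Any code that compares the lstat size against a recorded package size would see a mismatch for such a file, which is consistent with the symptoms reported here.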
The lstat usage is for handling the older $PKGDIR layout which had all the tbz2 files in a $PKGDIR/All subdirectory, and had symlinks in the same place as the tbz2 files in the current layout. Supporting symlinks like yours raises the question of what should happen when binary packages need to be updated for package moves. Would you want portage to replace your symlink with a modified copy of the original tbz2 file, or would this interfere with your setup? In case you aren't familiar with package moves, see here: http://devmanual.gentoo.org/ebuild-writing/ebuild-maintenance/index.html#moving-ebuilds
Ok, I see your point. But in my opinion moving the symlink is the only thing that makes sense. A symlink stays a symlink. Portage doesn't touch the target of the symlink; the symlink itself is moved to a new location and still points to the right target.

So far I have found problems in the following files:

* portage/emaint/modules/binhost/binhost.py
* portage/dbapi/bintree.py
* gentoolkit/eclean/search.py
(In reply to Florian Eitel from comment #2)
> Ok, I see your point. But in my opinion moving the symlink is the only thing
> that makes sense. A symlink stays a symlink. Portage doesn't touch the
> target of the symlink; the symlink itself is moved to a new location
> and still points to the right target.

Current versions of portage expect the internal CATEGORY, PF, and ebuild name to be consistent with the package move. Also, the internal metadata has to be modified when the dependencies of the package need to be updated for package moves.
Note that the existing code should work fine if you replace your symlinks with hardlinks.
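As a rough illustration of why hardlinks behave differently from symlinks here, the following standard-library sketch shows that lstat on a hardlink reports a regular file with the real size. The file names and the 742462-byte size are placeholders taken from the reproduction steps, and hardlinks only work when both names live on the same filesystem.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the file stored elsewhere in the tree.
    original = os.path.join(tmp, "object.tbz2")
    with open(original, "wb") as f:
        f.write(b"x" * 742462)

    # A hardlink is a second directory entry for the same inode.
    hardlink = os.path.join(tmp, "i3-4.5.1.tbz2")
    os.link(original, hardlink)

    info = os.lstat(hardlink)
    is_regular = stat.S_ISREG(info.st_mode)
    size = info.st_size
    same_file = os.path.samefile(original, hardlink)

    print("regular file:", is_regular)
    print("lstat size:", size)
    print("same inode as original:", same_file)
```

Because there is no link to resolve, lstat and stat return the same metadata for a hardlink.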
Ok, this is a bigger problem than I expected.

1.) The current state is: treat all symlinks as invalid data.

2.) My expectation is: if portage accesses or modifies the file *content*, then the target of the symlink gets changed. But all operations on the file itself (delete, move) should only modify the symlink. I think this is the intuitive approach used by most programs.

3.) Another way (which fits my requirements even better): if the file content changes, then the symlink is replaced with a new file. The target is not touched. For all operations on the file itself the target stays untouched as well.

But of course this is only my view. So, what is the decision?

For me the third way is the safest, because all the access code can stay as it is and the target is not modified at all. Only the lstat calls get replaced with stat; the rest stays as is. (bintree.py handles moves by copying the content and deleting the original, not by renaming.)

But I see that a change can lead to trouble because it is more complicated than expected. If you decide to change it, I'm of course willing to help and test.
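The third option above could be sketched roughly as follows. This is not portage code: `replace_with_modified_copy` and the `transform` callback are hypothetical names, and the sketch assumes that the modified data is written to a temporary file which is then renamed over the symlink. On POSIX systems a rename onto a symlink replaces the link itself and leaves its target alone.

```python
import os
import tempfile


def replace_with_modified_copy(path, transform):
    """Replace path (possibly a symlink) with a regular file holding
    transform(old_content); a symlink target is left untouched."""
    with open(path, "rb") as f:          # open() follows the symlink
        data = f.read()
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be in the same directory so that the
    # final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(transform(data))
        os.replace(tmp_path, path)       # overwrites the link, not its target
    except BaseException:
        os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "annex-object")
    with open(target, "wb") as f:
        f.write(b"old")
    link = os.path.join(root, "pkg.tbz2")
    os.symlink(target, link)

    # Placeholder for whatever metadata update a package move requires.
    replace_with_modified_copy(link, lambda d: d + b"-moved")

    link_is_symlink = os.path.islink(link)
    with open(link, "rb") as f:
        new_content = f.read()
    with open(target, "rb") as f:
        target_content = f.read()
    print(link_is_symlink, new_content, target_content)
```

Note that mkstemp creates the file with restrictive permissions, so a real implementation would also need to restore the intended mode and ownership.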