Currently, if the rootfs and $PORTAGE_TMPDIR are on the same mountpoint, the merge process attempts to make a hardlink and then overwrites the on-rootfs file by moving it into place. Due to kernel internals, the kernel gets confused: if such a file from the rootfs is open, the kernel reports its path as if it were still in the portage tmp directory. One can see this, for example, with lsof:

thunderbi 5729 21485 piotr DEL REG 254,2 96515 /var/portage/tmp/portage/media-libs/mesa-17.1.5/image/usr/lib64/libglapi.so.0.0.0
thunderbi 5729 21485 piotr DEL REG 254,2 1241934 /var/portage/tmp/portage/x11-libs/libdrm-2.4.82/image/usr/lib64/libdrm.so.2.4.0
thunderbi 5729 21485 piotr DEL REG 254,2 96535 /var/portage/tmp/portage/media-libs/mesa-17.1.5/image/usr/lib64/libgbm.so.1.0.0
thunderbi 5729 21485 piotr DEL REG 254,2 96531 /var/portage/tmp/portage/media-libs/mesa-17.1.5/image/usr/lib64/libGL.so.1.2.0
thunderbi 5729 21485 piotr DEL REG 254,2 96538 /var/portage/tmp/portage/media-libs/mesa-17.1.5/image/usr/lib64/libEGL.so.1.0.0
thunderbi 5729 21485 piotr DEL REG 254,2 20960 /

It would be nice if portage had an optional feature to force the fallback mechanism, which is used anyway when the portage tmp directory is outside of the rootfs or the merge crosses mount points.
Hello Zac, any luck getting this in? I just ran `lsof | grep deleted` on my fresh system with a single filesystem and saw that all of the deleted/replaced inodes display `/var/portage/tmp/portage/`. Or could you point me to roughly where the code that controls this lives, so that I could contribute a pull request with this feature? -- Piotr.
Replacement of the executable via rename directly from $PORTAGE_TMPDIR is what triggers this, correct? If we add a workaround, I would like to choose the most efficient one available, avoiding a copy if possible. Here are a couple of cases we can test:

Double rename approach:
1) Rename file to a temporary name in the same directory as the target location.
2) Rename again, this time replacing the target location.

Hardlink approach:
1) Create temporary named hardlink to source location in same directory as target location.
2) Remove source location.
3) Rename temporary file, replacing the target location.

Will either of these approaches work? If not, is there any alternative to copying the file?
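For anyone who wants to try the two approaches listed above outside of portage, here is a minimal Python sketch. It is not portage code: the function names `double_rename` and `hardlink_then_rename`, the helper `_temp_name_next_to` and the `.merge-sketch-` prefix are made up for illustration, and it assumes src and dest are on the same filesystem (os.link fails otherwise). Picking the temporary name is racy here, which is acceptable for a test but not for a real merge.

```python
import os
import tempfile


def _temp_name_next_to(dest):
    """Return an unused temporary path in the same directory as dest."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(prefix=".merge-sketch-", dir=dest_dir)
    os.close(fd)
    os.unlink(tmp)  # only the name is needed; os.link() fails if it exists
    return tmp


def double_rename(src, dest):
    """Double rename approach: stage next to dest, then replace dest."""
    tmp = _temp_name_next_to(dest)
    os.rename(src, tmp)   # 1) move into the target directory under a temp name
    os.rename(tmp, dest)  # 2) same-directory rename replaces the target


def hardlink_then_rename(src, dest):
    """Hardlink approach: link next to dest, drop src, then replace dest."""
    tmp = _temp_name_next_to(dest)
    os.link(src, tmp)     # 1) temporary hardlink in the target directory
    os.unlink(src)        # 2) remove the source location
    os.rename(tmp, dest)  # 3) same-directory rename replaces the target
```

Neither function copies file data, so both keep the inode of the file from the image directory. Whether /proc/<pid>/exe then shows the expected path still has to be checked on a real system.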
I prefer to be able to test things myself when possible. If my suggested workarounds are not effective, then please show me how I can test it myself using shell commands.
I have a hard time reproducing it now with pure shell, but here's an example using busybox and portage, with portage's tmp directory on the same mountpoint as the rootfs:

arifal ~ # /bin/busybox sleep 3600 &
[1] 28758
arifal ~ # pid=$!
arifal ~ # ls -l /proc/$pid/exe
lrwxrwxrwx 1 root root 0 2018-02-04 10:19 /proc/28758/exe -> /bin/busybox*
arifal ~ # emerge -1 busybox >/dev/null 2>&1
arifal ~ # ls -l /proc/$pid/exe
lrwxrwxrwx 1 root root 0 2018-02-04 10:19 /proc/28758/exe -> '/var/portage/tmp/portage/sys-apps/busybox-1.28.0/image/bin/busybox (deleted)'
arifal ~ # kill $pid
[1] + terminated /bin/busybox sleep 3600

Perhaps that will let you reproduce it locally.
Oh wow, I forgot that this kernel bug changed the way it works since 3.15. I even reported it to lkml back then (https://lkml.org/lkml/2014/9/6/120). So if the rename comes from the same mount point but from another directory, it will trigger. The in-shell steps to reproduce:

mkdir -p /root/tmp/test2
/bin/busybox sleep 3600 &
pid=$!
ls -l /proc/$pid/exe
cp /bin/busybox /root/tmp/test2/busybox
mv /root/tmp/test2/busybox /bin/busybox
ls -l /proc/$pid/exe
kill $pid

It does not, however, trigger if the rename has src and dest in the same directory. I looked around movefile.py and I am not sure where `hardlink_candidates` comes from, but it seems that if the files did go through the hardlink_candidates routine, it would work: creating a hardlink from the portage tmp directory to, for example, '/bin/.busybox.portage.XXX' and then using os.rename from '.busybox.portage.XXX' to busybox does not trigger the in-kernel bug. So the best solution for kernels >=3.15, one that would in fact not copy things, would be to use hardlink + rename (via hardlink_candidates). As that is two calls instead of a single os.rename, maybe check whether the source file has the exec bit, and then force it into hardlink_candidates? The name of this bug is now misleading, sorry about that. It seems that the hardlink is a solution here. :)
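To make the exec-bit idea above concrete, here is a rough Python sketch. It is not taken from movefile.py and does not use hardlink_candidates; `has_exec_bit`, `merge_file` and the staging name (modelled on the '.busybox.portage.XXX' example, with the pid standing in for XXX) are hypothetical, and it assumes src and dest share a filesystem.

```python
import os
import stat


def has_exec_bit(path):
    """True if path is a regular file with any execute bit set."""
    mode = os.lstat(path).st_mode
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    return stat.S_ISREG(mode) and bool(mode & exec_bits)


def merge_file(src, dest):
    """Move src over dest, staging executables through a same-directory hardlink.

    Returns the name of the strategy used ("link+rename" or "rename").
    """
    if not has_exec_bit(src):
        os.rename(src, dest)  # plain single rename for non-executables
        return "rename"
    dest_dir, dest_name = os.path.split(os.path.abspath(dest))
    # Hypothetical staging name in the destination's parent directory.
    staged = os.path.join(dest_dir, ".%s.portage.%d" % (dest_name, os.getpid()))
    os.link(src, staged)     # hardlink into the destination's parent directory
    os.rename(staged, dest)  # rename within one directory replaces dest
    os.unlink(src)           # drop the link left in the build image
    return "link+rename"
```

With this split, only files carrying an exec bit pay for the extra link and unlink calls; everything else still goes through a single os.rename.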
Originally I saw the issue with files that had >1 hardlink in the portage tmp dir, like git or perl (not anymore). So even if the source has >1 link, doing an os.link into the dest's parent dir followed by os.rename still seems like a valid solution.
Hi Zac, any update on that? I really could use a switch for that. It's a kernel issue, but without a workaround in portage I have a hard time tracking /proc/*/exe paths.
Created attachment 532650
movefile: hardlink before rename

Hopefully this patch does the trick. Please test.
If this is still an issue, please test out the patch.
It is still an issue because of the decades-old bug in the kernel. The patch changes the logic but still has the same result: the exe symlink of the pid in the /proc filesystem will contain invalid data. Currently it shows the $PORTAGE_TMPDIR path, and after the patch is applied it shows the *_portage_merge_* name. The problem is that you have N>1 links to a single inode, and the kernel updates the exe symlink based on the first one it looks up. When the merge from $D to the rootfs is done across mount point boundaries, it fails to hardlink and falls back to copy. That would be the solution here: simply do not even try the hardlinking.
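A rough idea of what "always take the copy path" could look like, as a standalone Python sketch rather than actual portage code. The function name `copy_then_rename` and the `.copy-sketch-` prefix are invented for illustration. The point is that the installed file gets a fresh inode that never had a link under $PORTAGE_TMPDIR, at the cost of copying the data.

```python
import os
import shutil
import tempfile


def copy_then_rename(src, dest):
    """Install src at dest via a fresh copy that never shares an inode with src."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    fd, tmp = tempfile.mkstemp(prefix=".copy-sketch-", dir=dest_dir)
    os.close(fd)
    try:
        shutil.copy2(src, tmp)  # data and metadata go into a new inode
        os.rename(tmp, dest)    # same-directory rename replaces dest
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)      # do not leave the temp file behind on failure
        raise
    os.unlink(src)              # the copy in the image dir is no longer needed
```

The trade-off is the one discussed earlier in this bug: it avoids the stale path, but every merged file is read and written once more.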
Feel free to close it as WONTFIX if you so desire; there are workarounds that can be used and, after all, it is a kernel bug, not a portage one. That said, I am not entirely sure what the benefit of hardlinking before the rename is, since the rename is guaranteed to be atomic in the VFS, and there is no benefit in keeping the file in both the new location and $D for a brief period before it is unlinked. Within the boundaries of the same mount point, os.link+os.rename+os.unlink and a plain os.rename yield exactly the same results, and there is no race condition possible there.
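The end-state comparison above can be checked with a small script. This is only a sketch: `end_state` and the `.staged` suffix are made-up names, it runs in a scratch directory rather than on a real rootfs, and it compares only the final on-disk state (inode, link count, source removed). It says nothing about what /proc/<pid>/exe reports.

```python
import os
import tempfile


def end_state(strategy):
    """Run one strategy in a scratch tree; return (same_inode, src_gone, nlink)."""
    with tempfile.TemporaryDirectory() as root:
        image = os.path.join(root, "image")
        live = os.path.join(root, "live")
        os.mkdir(image)
        os.mkdir(live)
        src = os.path.join(image, "tool")
        dest = os.path.join(live, "tool")
        for path, text in ((src, "new"), (dest, "old")):
            with open(path, "w") as f:
                f.write(text)
        src_inode = os.stat(src).st_ino
        if strategy == "rename":
            os.rename(src, dest)
        else:  # "link+rename+unlink"
            staged = dest + ".staged"  # hypothetical temporary name
            os.link(src, staged)
            os.rename(staged, dest)
            os.unlink(src)
        st = os.stat(dest)
        return (st.st_ino == src_inode, not os.path.exists(src), st.st_nlink)


print(end_state("rename"))
print(end_state("link+rename+unlink"))
```

On a typical Linux filesystem both calls should report the same tuple, which matches the claim that the two sequences differ only in the intermediate steps.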