Bug 687236 - app-misc/ca-certificates: use 'doins' instead of 'cp -pPR' to workaround musl bug
Summary: app-misc/ca-certificates: use 'doins' instead of 'cp -pPR' to workaround musl...
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
Keywords: NeedPatch
Depends on:
Blocks: musl-porting
Reported: 2019-06-03 00:45 UTC by Joshua Kinard
Modified: 2020-03-07 17:03 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---

Replace 'cp -pPR' with 'doins -r' in src_install (ca-certificates-20190110.3.43_ebuild-musl-workaround.diff, 386 bytes, patch)
2019-06-03 00:45 UTC, Joshua Kinard
strace before recompilation (strace.before_recompilation,4.48 KB, text/plain)
2020-01-21 22:30 UTC, Andrew Aladjev
strace after recompilation (strace.after_recompilation,3.34 KB, text/plain)
2020-01-21 22:30 UTC, Andrew Aladjev
cross vs native build diff (cross_vs_native.diff,172.11 KB, patch)
2020-01-22 08:09 UTC, Andrew Aladjev
coreutils 8.30 musl guessing (musl_guessing.patch.xz,136.29 KB, patch)
2020-02-13 18:08 UTC, Andrew Aladjev
musl_guessing.8.31.patch.xz (musl_guessing.8.31.patch.xz,92.25 KB, application/x-xz)
2020-03-07 17:03 UTC, Andrew Aladjev

Description Joshua Kinard gentoo-dev 2019-06-03 00:45:10 UTC
Created attachment 578488 [details, diff]
Replace 'cp -pPR' with 'doins -r' in src_install

There appears to be a rather weird bug in the 'cp' command when compiled against sys-libs/musl, such that 'cp -pPR' against a symlink will fail with the error "cp: failed to preserve ownership for <FILE>: Not supported".


Running these commands in my MIPS musl chroot reproduces the error:

root@octane ~ # mkdir d1
root@octane ~ # cd d1
root@octane ~/d1 # touch p1
root@octane ~/d1 # ln -s p1 lp1
root@octane ~/d1 # cd ..
root@octane ~ # cp -a d1 d2
cp: failed to preserve ownership for d2/lp1: Not supported
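
The reproduction above can be wrapped in a small script that works in a scratch directory and cleans up after itself. This is only a sketch: the function name check_cp_symlink and its optional "which cp binary" argument are illustrative and not part of the original report.

```shell
#!/bin/sh
# Sketch: re-run the reproduction in a throwaway directory.
# check_cp_symlink and its optional argument are hypothetical helpers.
check_cp_symlink() {
    cp_bin=${1:-cp}
    scratch=$(mktemp -d) || return 2
    mkdir "$scratch/d1"
    touch "$scratch/d1/p1"
    ln -s p1 "$scratch/d1/lp1"
    # GNU cp documents -a as "-dR --preserve=all", so the symlink is copied
    # as a symlink and cp then tries to preserve its attributes.
    "$cp_bin" -a "$scratch/d1" "$scratch/d2"
    status=$?
    rm -rf "$scratch"
    return "$status"
}

if check_cp_symlink; then
    echo "cp -a copied the symlink without errors"
else
    echo "cp -a reported an error (see the message above)"
fi
```

Passing an explicit path, for example check_cp_symlink /usr/bin/cp, makes it possible to test one specific binary when more than one cp is installed.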

This is a problem with app-misc/ca-certificates because in the ebuild's 'src_install', the command 'cp -pPR image/* "${D}"/ || die' is executed, which will trigger this musl bug because the files in ${D}/work/image/etc/ssl/certs/* are symlinks to the actual certificates in ${D}/work/image/usr/share/ca-certificates/mozilla/*.

I tested replacing the 'cp' command with 'doins -r' and this resolves the issue in my scenario.  Manually checking file permissions and ownership of the modified approach versus a text-merge on my x86_64 box shows no discernible difference in the ebuild's operation.
Comment 1 Felix Janda 2019-06-03 05:42:37 UTC
Could you report that this still happens to upstream?

I cannot reproduce it on musl arm. Maybe it is mips specific.
Comment 2 Mike Gilbert gentoo-dev 2019-06-03 18:03:19 UTC
Calling doins -r would introduce a duplicate ${EPREFIX} into the resulting installation image.

src_unpack, src_prepare, and src_compile would require adjustment to remove ${EPREFIX} from the "image" directory.
Comment 3 Joshua Kinard gentoo-dev 2019-06-04 05:47:32 UTC
(In reply to Mike Gilbert from comment #2)
> Calling doins -r would introduce a duplicate ${EPREFIX} into the resulting
> installation image.
> src_unpack, src_prepare, and src_compile would require adjustment to remove
> ${EPREFIX} from the "image" directory.

It does not appear to have done that in the chroot I have.  Both /etc/ssl/certs and /usr/share/ca-certificates/mozilla are valid directories from the root of the chroot.  And the symlinks in /etc/ssl/certs are properly pointed at the correct certs in /usr/share/ca-certificates/mozilla.

Perhaps the doins logic has gotten smarter in recent portage releases?
Comment 4 Joshua Kinard gentoo-dev 2019-06-04 05:54:00 UTC
Okay, in my case, EPREFIX isn't set to anything, since I'm basically doing a normal install.  I don't recall ever having used EPREFIX before, hence not recognizing it.  I can see that if it's defined, that can introduce problems, but at least for the case of a normal install, doins does work.
Comment 5 Mike Gilbert gentoo-dev 2019-06-04 14:25:04 UTC
(In reply to Joshua Kinard from comment #4)

Right, most people would not have EPREFIX defined; it's a rather niche feature. I just don't want to break things for the people who do. :)
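
The EPREFIX concern can be illustrated with plain path arithmetic. The sketch below assumes, as comment #2 implies, that the files staged in the ebuild's "image" directory already carry ${EPREFIX} in their paths, and that doins installs relative to ${ED} (which is ${D} plus ${EPREFIX}). The helper name show_dests and all paths are made up for illustration.

```shell
#!/bin/sh
# Sketch: where a staged file would land with each install method.
# show_dests is a hypothetical helper; the paths are illustrative only.
show_dests() {
    D=$1
    EPREFIX=$2
    ED=$D$EPREFIX                    # ED is D with EPREFIX appended
    staged=$EPREFIX/etc/ssl/certs    # path inside image/, assumed to include EPREFIX
    echo "cp -pPR: $D$staged"        # cp copies relative to D
    echo "doins -r: $ED$staged"      # doins installs relative to ED
}

show_dests /var/tmp/image ""                  # normal install: both agree
show_dests /var/tmp/image /home/user/gentoo   # Prefix install: EPREFIX appears twice
```

With an empty EPREFIX the two destinations are identical, which matches the observation in comment #4. With a non-empty EPREFIX the doins destination contains the prefix twice.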
Comment 6 Joshua Kinard gentoo-dev 2019-06-12 19:50:20 UTC
(In reply to Felix Janda from comment #1)
> Could you report that this still happens to upstream?
> I cannot reproduce it on musl arm. Maybe it is mips specific.

Okay, it looks like I had two copies of 'cp' in my PATH.  /bin/cp installed by coreutils does not exhibit the problem, while /usr/bin/cp, which must be from the OpenADK base that I started from, did.  Possible that it's a side-effect of replacing the libraries the OpenADK binaries were linked against w/ the Gentoo ones that introduced subtle breakage.  Getting rid of /usr/bin/cp appears to have fixed the problem.

Going to close as INVALID.
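
A duplicate binary like the one described in this comment can be spotted by walking PATH by hand. This is a minimal sketch; list_in_path is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print every executable named $1 found in PATH, in lookup order.
# list_in_path is a hypothetical helper name.
list_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

list_in_path cp
```

More than one line of output means the shell picks whichever entry comes first in PATH. In bash, type -a cp gives similar information.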
Comment 7 Andrew Aladjev 2020-01-21 21:13:08 UTC
Hello. I think you closed this issue too early. This really is a musl bug: I can reproduce it with "emerge -v1 portage" inside a musl x86_64 system without qemu. I have only one "bin/cp" and it is from coreutils.

I see that other people have reported this bug. I am going to find out how to reproduce and debug it, and provide more info for upstream.
Comment 8 Andrew Aladjev 2020-01-21 21:25:27 UTC
I've received:

cp: failed to preserve ownership for /var/tmp/portage/sys-apps/portage-2.3.84-r1/image/./usr/bin/quickpkg: Not supported
cp: failed to preserve ownership for /var/tmp/portage/sys-apps/portage-2.3.84-r1/image/./usr/bin/egencache: Not supported

ls -la /var/tmp/portage/sys-apps/portage-2.3.84-r1/image/./usr/bin/:

quickpkg -> ../lib/python-exec/python-exec2
egencache -> ../lib/python-exec/python-exec2

ls -la /var/tmp/portage/sys-apps/portage-2.3.84-r1/image/./usr/lib/python-exec/python-exec2:

No such file or directory

So the issue is about broken (dangling) symlinks.
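
To check an image directory for symlinks whose targets do not exist, something like the following can be used. It is a sketch; list_dangling_symlinks is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list symlinks under directory $1 whose target does not exist.
# list_dangling_symlinks is a hypothetical helper name.
list_dangling_symlinks() {
    # -type l selects symlinks; "test -e" follows the link and fails
    # when the target is missing, so only dangling links are printed.
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Calling it on the image directory from this comment would be expected to list usr/bin/quickpkg and usr/bin/egencache, since their python-exec2 target is absent at that point.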
Comment 9 Andrew Aladjev 2020-01-21 21:36:20 UTC
buildah run 82abc5b40984 -- sh -c 'cd /tmp && ln -sf fit musl; cp -a musl musl2; ls -la'

cp: failed to preserve ownership for musl2: Not supported

musl -> fit
musl2 -> fit

The issue is very easy to reproduce. I will try to find a solution soon.
Comment 10 Andrew Aladjev 2020-01-21 22:27:05 UTC
This issue cannot be reproduced after rebuilding sys-apps/coreutils. For example:

PYTHON_TARGETS='python3_6' emerge -v1 sys-apps/portage # failed, checked twice!

USE='-nls -acl' emerge -v1 sys-apps/coreutils # ok
PYTHON_TARGETS='python3_6' emerge -v1 sys-apps/portage # work fine, checked twice!

This is a super strange bug. Maybe something is wrong with the coreutils cross-compilation.

I will attach two strace logs. The first one shows cp before recompilation and the second one shows cp after recompilation.
Comment 11 Andrew Aladjev 2020-01-21 22:30:20 UTC
Created attachment 603924 [details]
strace before recompilation
Comment 12 Andrew Aladjev 2020-01-21 22:30:37 UTC
Created attachment 603926 [details]
strace after recompilation
Comment 13 Andrew Aladjev 2020-01-21 22:39:23 UTC
This is the main diff:

-lstat("musl2", {}) = 0
-newfstatat(AT_FDCWD, "musl2", {}, AT_SYMLINK_NOFOLLOW) = 0
-fcntl(1, F_GETFL)                       = 0x1 (flags O_WRONLY)
-lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Invalid seek)
+utimensat(AT_FDCWD, "musl2", [{}, {}], AT_SYMLINK_NOFOLLOW) = 0
+llistxattr("musl", NULL, 0)             = 0
+llistxattr("musl", 0x7fffab8f3320, 0)   = 0

This issue is very unusual; I will continue tomorrow =).
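
For comparing two strace logs like the attached ones, it can help to reduce each log to its sequence of syscall names before diffing, so that addresses and buffer contents do not add noise. This is a sketch under the assumption that the logs were written without PID prefixes (for example with strace -o before.log cp -a musl musl2). The helper names and log file names are hypothetical.

```shell
#!/bin/sh
# Sketch: compare two strace logs by syscall name only.
# syscall_names and compare_syscalls are hypothetical helper names.
syscall_names() {
    # Keep only the name in front of the opening parenthesis;
    # lines that do not look like a syscall are dropped.
    sed -n 's/^\([a-z_][a-z0-9_]*\)(.*/\1/p' "$1"
}

compare_syscalls() {
    a=$(mktemp) || return 2
    b=$(mktemp) || return 2
    syscall_names "$1" > "$a"
    syscall_names "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example (file names are placeholders):
# compare_syscalls before.log after.log
```

The exit status follows diff: 0 when both logs contain the same sequence of syscalls, 1 when they differ.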
Comment 14 Andrew Aladjev 2020-01-22 08:09:53 UTC
Created attachment 603936 [details, diff]
cross vs native build diff
Comment 15 Andrew Aladjev 2020-01-22 08:22:18 UTC
We can see that the problem is inside autoconf. It has some wrong cross-compilation "guessing" for a musl host, which should be found and fixed.

If you are reading this comment - please do not ever use autotools in your new projects. Use cmake or something else. Autotools is the biggest pain for everyone who has tried to fight with it.
Comment 16 Andrew Aladjev 2020-01-22 09:54:45 UTC
There are a bunch of guesses to fix. People have already fixed them here

So we shouldn't worry. It looks like this patch will be available in the next coreutils release. For now we can just recompile coreutils before using it inside a new musl container.
Comment 17 Rich Felker 2020-01-22 13:56:35 UTC
The gnulib guessing might bury the problem so it doesn't happen again, but it's not the right fix. The problem is in cp.c, where it's treating EOPNOTSUPP from lchmod as an error rather than silently ignoring it. While lchmod is not a standard function anyway, fchmodat is and specifies EOPNOTSUPP for systems where changing mode of a symbolic link is not supported, so it makes sense for lchmod to do the same (we simply define it in terms of fchmodat with AT_SYMLINK_NOFOLLOW).

gnulib has a replacement for lchmod that just does chmod, which is dangerously wrong (caller is supposed to avoid calling it on links, but that has race conditions where it follows the link and modifies the link target's permissions). The function in musl safely avoids this, but coreutils' cp is then treating the reported avoidance as an error.
Comment 18 Andrew Aladjev 2020-02-13 18:08:06 UTC
Created attachment 613624 [details, diff]
coreutils 8.30 musl guessing
Comment 19 Andrew Aladjev 2020-02-13 18:09:22 UTC
Thank you, we will wait for a proper solution. For now I've backported a patch that adds the musl gnulib guessing to coreutils 8.30, the most popular version.
Comment 20 Andrew Aladjev 2020-03-07 17:01:22 UTC
Yesterday I noticed that a new coreutils release has appeared: 8.32. It already has the musl guessing inside.

I want to upload the musl guessing patch for 8.31.
Comment 21 Andrew Aladjev 2020-03-07 17:03:39 UTC
Created attachment 617424 [details]