When an ebuild tries to write directly to /dev/stderr while inside a chroot, sandbox fails with the error:

 * ACCESS DENIED: open_wr: /dev/stderr

This happens with the stable Portage release in the latest stage3, sys-apps/portage-3.0.61-r1::gentoo, and prevents dev-util/radare2 from compiling inside a chroot.

The reason is that /proc/self/fd/2 points to the *full path* including the chroot directory, instead of a path relative to the chroot's root directory. I have no idea what is causing this incorrect symlink.

A minimal ebuild to reproduce this would be:

EAPI=8
inherit edo
S=$WORKDIR
SLOT=0
src_configure() {
	edo /bin/bash -c 'ls -l /dev/stderr'
	edo /bin/bash -c 'ls -l /proc/self/fd'
	edo /bin/bash -c 'ls -l /dev/pts'
	edo /bin/bash -c 'echo a >&2'
	edo /bin/bash -c 'echo a >/dev/stderr'
}
I feel like I remember another bug for this somewhere (a way more recent one).
Created attachment 883166
repository containing the failing ebuild

I've attached a test.tar.xz containing the ebuild above and the repository metadata needed to reproduce the issue with the steps below:

mkdir /tmp/stage3 && cd /tmp/stage3
wget https://distfiles.gentoo.org/releases/amd64/autobuilds/20240121T170320Z/stage3-amd64-openrc-20240121T170320Z.tar.xz
tar xpvf stage3-*.tar.xz --xattrs-include='*.*' --numeric-owner
mount --types proc /proc proc
mount --rbind /sys sys
mount --make-rslave sys
mount --rbind /dev dev
mount --make-rslave dev
mount --bind /run run
mount --make-slave run
chroot . /bin/bash
source /etc/profile
export PS1="(chroot) ${PS1}"
emerge-webrsync
wget <url/to/test.tar.xz>
tar xf test.tar.xz -C /tmp
ebuild /tmp/test/test/test/test-0.ebuild configure

Output:

Appending /tmp/test to PORTDIR_OVERLAY...
bash: line 1: /var/cache/distfiles/.__portage_test_write__: Permission denied
Adjusting permissions recursively: '/var/cache/distfiles'
>>> Unpacking source...
>>> Source unpacked in /var/tmp/portage/test/test-0/work
>>> Preparing source in /var/tmp/portage/test/test-0/work ...
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/test/test-0/work ...
 * /bin/bash -c ls -l /dev/stderr
lrwxrwxrwx 1 root root 15 26 Jan 11:56 /dev/stderr -> /proc/self/fd/2
 * /bin/bash -c ls -l /proc/self/fd
total 0
lr-x------ 1 portage portage 64 26 Jan 12:05 0 -> /var/tmp/stage3/dev/null
lrwx------ 1 portage portage 64 26 Jan 12:05 1 -> /var/tmp/stage3/dev/pts/5
lrwx------ 1 portage portage 64 26 Jan 12:05 2 -> /var/tmp/stage3/dev/pts/5
lr-x------ 1 portage portage 64 26 Jan 12:05 3 -> /proc/34/fd
 * /bin/bash -c ls -l /dev/pts
total 0
crw--w---- 1 1000 tty 136, 0 26 Jan 12:05 0
crw--w---- 1 1000 tty 136, 1 26 Jan 11:57 1
crw--w---- 1 1000 tty 136, 2 26 Jan 11:56 2
crw--w---- 1 root tty 136, 3 26 Jan 12:04 3
crw--w---- 1 1000 tty 136, 4 26 Jan 12:00 4
crw--w---- 1 portage portage 136, 5 26 Jan 12:05 5
c--------- 1 root root 5, 2 26 Jan 11:56 ptmx
 * /bin/bash -c echo a >&2
a
 * /bin/bash -c echo a >/dev/stderr
 * ACCESS DENIED: open_wr: /dev/stderr
/bin/bash: line 1: /dev/stderr: Permission denied
 * ERROR: test/test-0::build failed (configure phase):
 *   Failed to run command: /bin/bash -c echo a >/dev/stderr
 *
 * Call stack:
 *   ebuild.sh, line 136: Called src_configure
 *   environment, line 514: Called edo '/bin/bash' '-c' 'echo a >/dev/stderr'
 *   environment, line 453: Called die
 * The specific snippet of code:
 *   "$@" || die -n "Failed to run command: $@"
 *
 * If you need support, post the output of `emerge --info '=test/test-0::build'`,
 * the complete build log and the output of `emerge -pqv '=test/test-0::build'`.
 * The complete build log is located at '/var/tmp/portage/test/test-0/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/test/test-0/temp/environment'.
 * Working directory: '/var/tmp/portage/test/test-0/work'
 * S: '/var/tmp/portage/test/test-0/work'
 * ----------------------- SANDBOX ACCESS VIOLATION SUMMARY -----------------------
 * LOG FILE: "/var/tmp/portage/test/test-0/temp/sandbox.log"
 *
VERSION 1.0
FORMAT: F - Function called
FORMAT: S - Access Status
FORMAT: P - Path as passed to function
FORMAT: A - Absolute Path (not canonical)
FORMAT: R - Canonical Path
FORMAT: C - Command Line

F: open_wr
S: deny
P: /dev/stderr
A: /dev/stderr
R: /dev/stderr
C: /bin/bash -c echo a >/dev/stderr
 * --------------------------------------------------------------------------------
(In reply to Sam James from comment #1)
> I feel like I remember another bug for this somewhere (a way more recent
> one).

I don't think that bug is very related. Even if you were to whitelist the files properly, this bug is caused by /proc/self/fd/2 pointing to a file that doesn't exist.
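The claim above can be checked without going through an ebuild. The following is a minimal sketch, assuming a Linux /proc and a POSIX shell; the function name check_fd_target is made up for illustration and is not part of Portage or sandbox. It reads the symlink behind a file descriptor and reports whether that path exists in the current root.

```shell
#!/bin/sh
# Hypothetical helper (not from Portage or sandbox): report whether the path
# behind /proc/self/fd/<fd> exists when looked up from the current root.
check_fd_target() {
    fd=$1
    # readlink inherits fds 0 and 2 from the calling shell, so this works for
    # stdin and stderr. For fd 1 it would report the command-substitution pipe.
    target=$(readlink "/proc/self/fd/$fd") || return 2
    case $target in
        /*) ;;  # an absolute path, checked below
        *)  echo "fd $fd -> $target (not a filesystem path)"
            return 0 ;;
    esac
    if [ -e "$target" ]; then
        echo "fd $fd -> $target (resolves)"
    else
        echo "fd $fd -> $target (dangling in this root)"
        return 1
    fi
}

# "|| true" keeps the script usable under "set -e".
check_fd_target 2 || true
```

Run from an ebuild phase in the affected chroot, this would be expected to print the "dangling" line for fd 2, since the target would carry the path prefix from outside the chroot.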
(In reply to Esteve Varela Colominas from comment #3)
> (In reply to Sam James from comment #1)
> > I feel like I remember another bug for this somewhere (a way more recent
> > one).
>
> I don't think that bug is very related. Even if you were to whitelist the
> files properly, this bug is caused by /proc/self/fd/2 pointing to a file
> that doesn't exist.

The first bug I added to See Also is mostly "general /dev/std*". The one I have in mind is a more specific one, like bug 540828, but newer.
This appears to be caused by FEATURES="pid-sandbox".
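If pid-sandbox is indeed the trigger, turning it off should make the symptom disappear, at the cost of losing PID namespace isolation for builds. A minimal sketch of that setting, assuming the usual /etc/portage/make.conf location; this has not been verified against the test ebuild in this report:

```shell
# /etc/portage/make.conf (fragment)
# FEATURES is an incremental variable, so a leading "-" removes a feature
# that is enabled by default.
FEATURES="-pid-sandbox"
```

The same value can also be set in the environment for a single ebuild or emerge invocation, which is handy for a one-off comparison.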
I can reproduce the issue using the provided instructions, but I'm not much closer to resolving it. I think we need to better understand exactly what sequence of forks/unshares/mounts causes /tmp/stage3 to appear in the /proc/self/fd symlink targets. I haven't quite figured out how to debug that.
There's definitely something that breaks when we unshare(CLONE_NEWNS|CLONE_NEWPID). Any files that were opened between the chroot() and the unshare() call have their paths updated in /proc to include the full path as it would exist outside the chroot.

I'm unable to reproduce the behavior with Portage inside a systemd-nspawn container, so I suspect it does something to make the kernel behave better in this scenario.
(In reply to Mike Gilbert from comment #7)
> I'm unable to reproduce the behavior with Portage inside a systemd-nspawn
> container, so I suspect it does something to make the kernel behave better
> in this scenario.

I poked around in systemd-nspawn and found that, instead of chroot, it calls a mount_switch_root function, which in turn calls this interesting function that combines pivot_root and umount2:

https://github.com/systemd/systemd/blob/main/src/shared/mount-util.c#L456C1-L470C2
> static int mount_switch_root_pivot(int fd_newroot, const char *path) {
>         assert(fd_newroot >= 0);
>         assert(path);
>
>         /* Let the kernel tuck the new root under the old one. */
>         if (pivot_root(".", ".") < 0)
>                 return log_debug_errno(errno, "Failed to pivot root to new rootfs '%s': %m", path);
>
>         /* Get rid of the old root and reveal our brand new root. (This will always operate on the top-most
>          * mount on our cwd, regardless what our current directory actually points to.) */
>         if (umount2(".", MNT_DETACH) < 0)
>                 return log_debug_errno(errno, "Failed to unmount old rootfs: %m");
>
>         return 0;
> }
With the information above, I've been able to combine unshare+pivot_root to achieve the same effect as systemd-nspawn does. In the reproduction steps above, replace everything between unpacking the stage3 and entering the chroot with the following:

unshare -m
mkdir old-root new-root
mount --bind . new-root
cd new-root
mount --types proc /proc proc
mount --rbind /sys sys
mount --make-rslave sys
mount --rbind /dev dev
mount --make-rslave dev
mount --bind /run run
mount --make-slave run
pivot_root . old-root
exec chroot . /bin/bash

"fixed" output:

# ebuild /tmp/test/test/test/test-0.ebuild clean configure
Appending /tmp/test to PORTDIR_OVERLAY...
>>> Unpacking source...
>>> Source unpacked in /var/tmp/portage/test/test-0/work
>>> Preparing source in /var/tmp/portage/test/test-0/work ...
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/test/test-0/work ...
 * /bin/bash -c ls -l /dev/stderr
lrwxrwxrwx 1 root root 15 Jan 27 12:54 /dev/stderr -> /proc/self/fd/2
 * /bin/bash -c ls -l /proc/self/fd
total 0
lr-x------ 1 portage portage 64 Jan 27 13:06 0 -> /dev/null
lrwx------ 1 portage portage 64 Jan 27 13:06 1 -> /dev/pts/4
lrwx------ 1 portage portage 64 Jan 27 13:06 2 -> /dev/pts/4
lr-x------ 1 portage portage 64 Jan 27 13:06 3 -> /proc/34/fd
 * /bin/bash -c ls -l /dev/pts
total 0
crw--w---- 1 1000 tty 136, 0 Jan 27 13:05 0
crw--w---- 1 1000 tty 136, 1 Jan 27 12:54 1
crw--w---- 1 root tty 136, 2 Jan 27 13:06 2
crw--w---- 1 1000 tty 136, 3 Jan 27 13:01 3
crw--w---- 1 portage portage 136, 4 Jan 27 13:06 4
c--------- 1 root root 5, 2 Jan 27 12:54 ptmx
 * /bin/bash -c echo a >&2
a
 * /bin/bash -c echo a >/dev/stderr
a
>>> Source configured.
Ok, so I think there is probably no way to "fix" pid-sandbox when it is used after chroot() without some kernel changes. On the sandbox side, one solution would be to just trust any operation on /proc/self/fd/*. That makes us less "secure", but sandbox is not really a security tool anyway.
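To make the sandbox-side idea concrete, here is a rough sketch of the rule "trust any operation on /proc/self/fd/*", written as shell only for illustration. sandbox itself is written in C, and the function names below are hypothetical, not taken from its source. The check is purely textual and follows at most one symlink hop, because fully resolving the path is exactly what fails in the chroot.

```shell
#!/bin/sh
# Hypothetical illustration only; not sandbox's actual logic.

# Succeed if the argument is literally /proc/self/fd/<digits>.
matches_proc_self_fd() {
    case $1 in
        /proc/self/fd/|/proc/self/fd/*[!0-9]*) return 1 ;;
        /proc/self/fd/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeed if the path is /proc/self/fd/<digits>, or a symlink whose immediate
# target is, as /dev/stderr -> /proc/self/fd/2 in the output above.
is_trusted_fd_path() {
    matches_proc_self_fd "$1" && return 0
    [ -L "$1" ] || return 1
    matches_proc_self_fd "$(readlink "$1")"
}

if is_trusted_fd_path /dev/stderr; then
    echo "/dev/stderr would be allowed without resolving its final target"
fi
```

The design choice here is to match on the name rather than on where the descriptor ends up, which is what makes the rule independent of the broken symlink target, and also what makes it less strict.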