This happens with systemd-231 and sandbox-2.11-r2. It also happens with the current git master versions of each package.

floppym@naomi ~/src/systemd $ .libs/test-fs-util
 * /tmp/portage/sys-apps/sandbox-9999/work/sandbox-9999/libsandbox/libsandbox.c:check_syscall():990: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
/usr/lib64/libsandbox.so(+0xbda2)[0x7f1b4bc0cda2]
/usr/lib64/libsandbox.so(+0xbeb5)[0x7f1b4bc0ceb5]
/usr/lib64/libsandbox.so(+0x5489)[0x7f1b4bc06489]
/usr/lib64/libsandbox.so(opendir+0x29)[0x7f1b4bc090c9]
/lib/systemd/libsystemd-shared-231.so(get_files_in_directory+0x43)[0x7f1b4b83f5a3]
.libs/test-fs-util(+0x16c0)[0x55715efcd6c0]
.libs/test-fs-util(+0x1a6b)[0x55715efcda6b]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7f1b4b2567f0]
.libs/test-fs-util(+0xe89)[0x55715efcce89]
/proc/17909/cmdline: .libs/test-fs-util
Aborted (core dumped)
Same issue when I set the proper LD_LIBRARY_PATH; just including this so nobody asks about that /lib/systemd/ in the first paste.

floppym@naomi ~/src/systemd $ LD_LIBRARY_PATH=.libs .libs/test-fs-util
 * /tmp/portage/sys-apps/sandbox-9999/work/sandbox-9999/libsandbox/libsandbox.c:check_syscall():990: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
/usr/lib64/libsandbox.so(+0xbda2)[0x7fa873532da2]
/usr/lib64/libsandbox.so(+0xbeb5)[0x7fa873532eb5]
/usr/lib64/libsandbox.so(+0x5489)[0x7fa87352c489]
/usr/lib64/libsandbox.so(opendir+0x29)[0x7fa87352f0c9]
.libs/libsystemd-shared-231.so(get_files_in_directory+0x7b)[0x7fa873303b89]
.libs/test-fs-util(+0x16c0)[0x55f4a0faf6c0]
.libs/test-fs-util(+0x1a6b)[0x55f4a0fafa6b]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7fa872d147f0]
.libs/test-fs-util(+0xe89)[0x55f4a0faee89]
/proc/17926/cmdline: .libs/test-fs-util
Aborted (core dumped)
Links to relevant code:
https://github.com/systemd/systemd/blob/master/src/test/test-fs-util.c#L77
https://github.com/systemd/systemd/blob/master/src/basic/fs-util.c#L447
test-fs-util.c calls these tests in this order:
- ...
- test_readlink_and_make_absolute
- test_get_files_in_directory
- ...

test_readlink_and_make_absolute() does:
- create a dir
- chdir into it
- rmdir it while still staying in it

and this is what happens in test_get_files_in_directory():
```
#0 sb_unwrapped_getcwd_DEFAULT (buf=buf@entry=0x7ffff7f86010 "", size=size@entry=8192) at sandbox-2.10/libsandbox/wrapper-funcs/__wrapper_simple.c:42
#1 egetcwd (buf=buf@entry=0x7ffff7f86010 "", size=size@entry=8192) at sandbox-2.10/libsandbox/libsandbox.c:352
#2 erealpath (name=name@entry=0x555555555b4f ".", resolved=resolved@entry=0x7ffff7f86010 "") at sandbox-2.10/libsandbox/canonicalize.c:90
#3 canonicalize (path=path@entry=0x555555555b4f ".", resolved_path=resolved_path@entry=0x7ffff7f86010 "") at sandbox-2.10/libsandbox/libsandbox.c:175
#4 resolve_path (path=path@entry=0x555555555b4f ".", follow_link=follow_link@entry=0) at sandbox-2.10/libsandbox/libsandbox.c:228
#5 check_syscall (flags=0, file=0x555555555b4f ".", func=0x7ffff7bcdc66 "opendir", sb_nr=16, sbcontext=0x7ffff7dd82f0 <sbcontext>) at sandbox-2.10/libsandbox/libsandbox.c:911
#6 before_syscall (dirfd=dirfd@entry=-100, sb_nr=sb_nr@entry=16, func=func@entry=0x7ffff7bcdc66 "opendir", file=file@entry=0x555555555b4f ".", flags=flags@entry=0) at sandbox-2.10/libsandbox/libsandbox.c:1068
#7 opendir_DEFAULT (name=0x555555555b4f ".") at sandbox-2.10/libsandbox/wrapper-funcs/__wrapper_simple.c:52
#8 get_files_in_directory () at systemd-231/src/basic/fs-util.c:458
#9 test_get_files_in_directory () at src/test/test-fs-util.c:81
```
Simply put, sandbox's wrapper of opendir() tries to call getcwd() in a nonexistent dir and treats the resulting error as fatal.
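The create/chdir/rmdir sequence described above can be tried without sandbox or systemd. The following is a minimal sketch, assuming Linux and Python 3; `getcwd_error_in_removed_dir` is a hypothetical helper name and not part of either project. It only shows what getcwd() reports once the working directory is gone.

```python
import errno
import os
import tempfile


def getcwd_error_in_removed_dir():
    """Return the errno name getcwd() fails with in a removed cwd, or None if it succeeds."""
    saved = os.getcwd()
    path = tempfile.mkdtemp()          # create a dir
    try:
        os.chdir(path)                 # chdir into it
        os.rmdir(path)                 # rmdir it while still staying in it
        try:
            os.getcwd()
        except OSError as exc:
            return errno.errorcode.get(exc.errno)
        return None
    finally:
        os.chdir(saved)                # always return to a valid working directory
```

On a typical Linux system the helper is expected to return `ENOENT`, which corresponds to the "No such file or directory" text in the ISE report.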
*** Bug 598810 has been marked as a duplicate of this bug. ***
I get this (with my opendir precheck patch included, but this shouldn't be relevant here as the path isn't longer than PATH_MAX):

$ ../src/sandbox.sh
============================= Gentoo path sandbox ==============================
Detection of the support files.
Verification of the required files.
Setting up the required environment variables.
The protected environment has been started.
--------------------------------------------------------------------------------
Process being started in forked instance.
[s] leio@prometheus ~/gentoo/sandbox/leio $ mkdir t
[s] leio@prometheus ~/gentoo/sandbox/leio $ cd t
[s] leio@prometheus ~/gentoo/sandbox/leio/t $ rmdir ../t
[s] leio@prometheus ~/gentoo/sandbox/leio/t $ ls
 * libsandbox.c:check_syscall():990: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
/home/leio/gentoo/sandbox/libsandbox/.libs/libsandbox.so(+0xbbb2)[0x7f5442cedbb2]
/home/leio/gentoo/sandbox/libsandbox/.libs/libsandbox.so(+0xbcc5)[0x7f5442cedcc5]
/home/leio/gentoo/sandbox/libsandbox/.libs/libsandbox.so(+0x5408)[0x7f5442ce7408]
/home/leio/gentoo/sandbox/libsandbox/.libs/libsandbox.so(opendir+0x4c)[0x7f5442cea0bc]
ls[0x403621]
/lib64/libc.so.6(__libc_start_main+0xf0)[0x7f5442969720]
ls[0x4047b9]
/proc/15602/cmdline: ls --color=auto
Aborted (core dumped)

So with that separate testcase it aborts "ls" the same way, but I don't observe any getcwd stuff in the coredump, as it aborts out cleanly (as far as not segfaulting is concerned; I'm not saying an abort here is appropriate in any way):

#0 0x00007f03e0c231b8 in raise () from /lib64/libc.so.6
#1 0x00007f03e0c2460a in abort () from /lib64/libc.so.6
#2 0x00007f03e0f94ccf in __sb_ebort (file=file@entry=0x7f03e0f975cb "libsandbox.c", func=func@entry=0x7f03e0f97820 <__func__.8243> "check_syscall", line_num=line_num@entry=990, format=format@entry=0x7f03e0f97588 "ISE: %s(%s)\n\tabs_path: %s\n\tres_path: %s\n") at sb_efuncs.c:138
#3 0x00007f03e0f8e408 in check_syscall (sbcontext=0x7f03e11a3300 <sbcontext>, flags=<optimized out>, file=0x2348830 ".", func=0x7f03e0f97f46 "opendir", sb_nr=<optimized out>) at libsandbox.c:989
#4 before_syscall (dirfd=<optimized out>, sb_nr=<optimized out>, func=<optimized out>, file=0x2348830 ".", flags=<optimized out>) at libsandbox.c:1069
#5 0x00007f03e0f8e476 in before_syscall (dirfd=dirfd@entry=-100, sb_nr=sb_nr@entry=16, func=func@entry=0x7f03e0f97f46 "opendir", file=file@entry=0x2348830 ".", flags=flags@entry=0) at libsandbox.c:1082
#6 0x00007f03e0f910bc in opendir_DEFAULT (name=0x2348830 ".") at wrapper-funcs/__wrapper_simple.c:52
#7 0x0000000000403621 in ?? ()
#8 0x00007f03e0c10720 in __libc_start_main () from /lib64/libc.so.6
#9 0x00000000004047b9 in ?? ()

Looking into it, the trace doesn't include any getcwd, but that might be because "ls" is doing open + getdents instead of opendir (which is actually NOT a syscall). If I modify my opendir test program to opendir(".") and execute it without using absolute paths (with relative paths it would also abort inside bash, in a third way, in execve), it still aborts, and there is no getcwd.

Looking at Mike's output, it looks like a clean abort as well, so I'm not sure where the segfault in the getcwd stuff comes from for you, Jan. But maybe I'm misreading something and I don't see it because the systemd test does some other sequence of things.

Nevertheless, it is still an abort in many cases (opendir, execve, ...) with a gone CWD, which doesn't happen outside sandbox. It seems like something sandbox would struggle with because it tries to do various stat and other calls and probably loses the fd handle keeping this thing around, or something of that sort. Maybe at least giving an ENOENT errno here would get this sane enough?
Though outside sandbox this isn't what happens:

ls case:
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
getdents(3, 0xa78cd0, 32768) = -1 ENOENT (No such file or directory)
(open succeeds, fstat too; it fails with ENOENT only once it tries to read directory entries)

my opendir testcase:
open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
brk(NULL) = 0xbaf000

There all seems good outside sandbox, without any errors whatsoever, so ENOENT wouldn't seem to be strictly correct either.

So I guess my point is that the systemd order of calls isn't the only case that aborts; many other cases do as well. Hopefully one fix would cover them all, but I'm not sure how to go about this. I'm also not sure why we go through all the check_syscall stuff for something that isn't a syscall, but I see the same kind of aborts with execve when I try to execute something in bash via a relative path, so...
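The strace observation above (open and fstat on "." both succeed after the directory is removed) can be checked with a short script. This is a minimal sketch, assuming Linux and Python 3; `open_dot_in_removed_dir` is a hypothetical helper name. It does not go through libsandbox, so it only shows the unsandboxed behaviour.

```python
import os
import stat
import tempfile


def open_dot_in_removed_dir():
    """Open "." from a removed cwd and report whether fstat() still sees a directory."""
    saved = os.getcwd()
    path = tempfile.mkdtemp()
    try:
        os.chdir(path)
        os.rmdir(path)
        # No path lookup through the removed name is needed: "." refers to
        # the directory inode the process still holds as its cwd.
        fd = os.open(".", os.O_RDONLY | os.O_DIRECTORY)
        try:
            return stat.S_ISDIR(os.fstat(fd).st_mode)
        finally:
            os.close(fd)
    finally:
        os.chdir(saved)
```

The point of the comparison is that the kernel itself does not need an absolute path for ".", whereas a path-based checker does.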
(In reply to Mart Raudsepp from comment #5)
The code that's failing looks roughly like this:
```
if (egetcwd() == FAIL) {
    abort();
}
```
So that's why you don't see egetcwd() in the stacktrace. I caught it by stepping in gdb.

Also, the proper bug tracking the core issue behind this class of errors is https://bugs.gentoo.org/show_bug.cgi?id=598810 (which was closed as a non-issue). This bug is only about workarounds for the systemd test suite.
*** Bug 623846 has been marked as a duplicate of this bug. ***
Do we have a modern reproducer of this bug? Today's systemd-9999 tests work just fine:

336/512 test-fs-util OK 0.13 s

I get similar success if I run 'pwd' in a deleted directory from under sandbox. I think it's because opendir() is backed by the kernel's openat(). Please also post your 'emerge --info' if it happens to be a non amd64-glibc system.
I hit it with dev-python/pyproject2setuppy-1 (just copy the ebuild from -2), with any python3* target:

tests/test_flit.py::FlitBasicTest::test_real_build_system <- tests/base.py
 * /tmp/portage/sys-apps/sandbox-2.18/work/sandbox-2.18/libsandbox/libs$ ndbox.c:check_syscall():974: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
/tmp/portage/dev-python/pyproject2setuppy-1/temp/environment: line 2834: 65 Aborted (core dumped) pytest -vv
 * ERROR: dev-python/pyproject2setuppy-1::gentoo failed (test phase):
 * Tests fail with python3.8
Thanks, that helps. The smaller reproducer is:

$ cat test_bug.py
import unittest
import importlib
import os
import sys
from tempfile import TemporaryDirectory

def make_package():
    d = TemporaryDirectory()
    os.chdir(d.name)
    with open('foo', 'w') as f:
        pass
    return d

class FlitBasicTest(unittest.TestCase):
    def test_1(self):
        with TemporaryDirectory():
            with make_package():
                sys.path.insert(0, '.')

    def test_2(self):
        backend = importlib.import_module('foo')

$ sandbox pytest
=================================================================== test session starts ====================================================================
platform linux -- Python 3.6.9, pytest-4.5.0, py-1.5.4, pluggy-0.11.0
rootdir: /home/slyfox/dev/bugs/sandbox/opendir-dot
collected 2 items
test_bug.py .
 * ../../sandbox-2.18/libsandbox/libsandbox.c:check_syscall():974: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
Sandboxed process killed by signal: Aborted
Aborted (core dumped)

It's not very deterministic and sometimes does not cause sandbox to crash. That is probably due to GC order evicting temp directories.
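One likely mechanism in the reproducer above is that a TemporaryDirectory gets cleaned up while the process is still chdir'ed into it, so a later import through the '.' entry on sys.path ends up scanning a removed directory. The following is a minimal sketch of just that cleanup step, assuming Linux and Python 3; `cwd_survives_tempdir_cleanup` is a hypothetical helper name.

```python
import os
from tempfile import TemporaryDirectory


def cwd_survives_tempdir_cleanup():
    """Return True if getcwd() still works after the temp dir we chdir'ed into is cleaned up."""
    saved = os.getcwd()
    try:
        with TemporaryDirectory() as name:
            os.chdir(name)
        # Leaving the "with" block removed the directory, but the process
        # is still "inside" it.
        try:
            os.getcwd()
        except FileNotFoundError:
            return False
        return True
    finally:
        os.chdir(saved)
```

If the helper returns False, any code that later resolves a relative path in that process is in the same situation as the `ls` and `touch` cases in this bug.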
'perf trace' shows a shorter reproducer:

$ LANG=C sandbox 'mkdir /tmp/zzz; cd /tmp/zzz; rmdir /tmp/zzz; ls'
 * ../../sandbox-2.18/libsandbox/libsandbox.c:check_syscall():974: failure (No such file or directory):
 * ISE: opendir(.)
   abs_path: (null)
   res_path: (null)
/usr/lib64/libsandbox.so(+0xd1b5)[0x7fc1e4cad1b5]
/usr/lib64/libsandbox.so(+0xd2ce)[0x7fc1e4cad2ce]
/usr/lib64/libsandbox.so(+0x5fa2)[0x7fc1e4ca5fa2]
/usr/lib64/libsandbox.so(opendir+0x4c)[0x7fc1e4ca952c]
ls(+0xb3e8)[0x56104ddbe3e8]
ls(+0x548d)[0x56104ddb848d]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7fc1e4aeaf1b]
ls(+0x612a)[0x56104ddb912a]
/proc/1827363/cmdline: ls
/bin/bash: line 1: 1827363 Aborted ls
An even fancier case of an escape from a deleted directory that sandbox fails on:

$ LANG=C sandbox 'mkdir /tmp/zzz; cd /tmp/zzz; rmdir /tmp/zzz; touch ../foo'
 * ../../sandbox-2.18/libsandbox/libsandbox.c:check_syscall():974: failure (No such file or directory):
 * ISE: open_wr(../foo)
   abs_path: (null)
   res_path: (null)
/usr/lib64/libsandbox.so(+0xd1b5)[0x7fceea4b61b5]
/usr/lib64/libsandbox.so(+0xd2ce)[0x7fceea4b62ce]
/usr/lib64/libsandbox.so(+0x5fa2)[0x7fceea4aefa2]
/usr/lib64/libsandbox.so(open+0x8c)[0x7fceea4b4bcc]
touch(+0x47aa)[0x5586d17837aa]
touch(+0x39c4)[0x5586d17829c4]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7fceea2fbf1b]
touch(+0x3e0a)[0x5586d1782e0a]
/proc/3649526/cmdline: touch ../foo

Here the Linux kernel (AFAIU) does not provide us any facility to resolve '../foo' down to '/tmp/foo' so that sandbox can perform accessibility checks. We can either 'always-allow' or 'always-block' operations against a deleted directory. 'always-allow' would be an easy way to escape sandbox. My only concern is that the escape can be done by accident (the equivalent of 'rm -rf $WORKDIR').

Given that one:
- can't delete a non-empty directory
- can't create files in a deleted directory
- can't getcwd() within a deleted directory

I suggest just making the sandbox error nicer and suggesting that users fix their software to avoid manipulating the filesystem from within a deleted directory.
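The 'always-block' option discussed above can be sketched as follows. This is an illustration only, assuming POSIX paths and Python 3; `check_write`, its `allowed_prefixes` argument and the injectable `getcwd` parameter are all hypothetical and are not libsandbox's API. Unlike the real code it normalises paths lexically and does not resolve symlinks.

```python
import os


def check_write(path, allowed_prefixes, getcwd=os.getcwd):
    """Decide whether a write to `path` is allowed; returns (allowed, reason)."""
    if not os.path.isabs(path):
        try:
            cwd = getcwd()
        except OSError:
            # The cwd was removed, so '../foo' cannot be turned into an
            # absolute path. Deny the access instead of aborting the process.
            return False, "from deleted directory"
        path = os.path.join(cwd, path)
    # Simplification: lexical normalisation only, symlinks are not resolved.
    path = os.path.normpath(path)
    for prefix in allowed_prefixes:
        prefix = os.path.normpath(prefix)
        if path == prefix or path.startswith(prefix.rstrip(os.sep) + os.sep):
            return True, "allowed"
    return False, "outside allowed prefixes"
```

Passing `getcwd` as a parameter is only there so the deleted-directory branch can be exercised without actually removing the working directory. Absolute paths never reach that branch, so they are still checked normally even when the cwd is gone.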
Created attachment 600598 0001-libsandbox-libsandbox.c-add-errno-output-for-interna.patch
Created attachment 600600 0002-check_syscall-turn-internal-sandbox-violation-into-d.patch
Will this eventually trigger sandbox failure or just an error message?
It will deny the write, as in the example from the commit message:
"""
Report after the change looks like:
$ ./sandbox.sh 'mkdir /tmp/zzz; cd /tmp/zzz; rmdir /tmp/zzz; touch ../foo'
 * ACCESS DENIED: open_wr: '../foo' (from deleted directory, see https://bugs.gentoo.org/590084)
 * ACCESS DENIED: utimensat: '../foo' (from deleted directory, see https://bugs.gentoo.org/590084)
touch: cannot touch '../foo': Permission denied
"""
For comparison, 'mkdir /tmp/zzz; cd /tmp/zzz; rmdir /tmp/zzz; touch ../foo' just works when run outside sandbox.
https://bugs.gentoo.org/534172#c8 makes it into a non-fatal error.
I've merged your changes and released 2.18, but it turns out the new test XPASSes when run via ebuild. Could you investigate?
I'll have a look.
(In reply to Sergei Trofimovich from comment #19)
> I'll have a look.

Ah, it's a bad test. It fails only on the 32-bit ABI because the test runs the system's 'touch' (64-bit ABI). The test should use a locally built binary for the 32-bit ABI to match the tested libsandbox. I'll try to write a patch.
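The mismatch described above (a 64-bit tool run against a 32-bit libsandbox) can be detected by comparing the ELF class of the two files, presumably because the dynamic loader will not preload a library of a different ELF class into the process. The following is a minimal sketch, assuming Python 3; `elf_class` and `same_word_size` are hypothetical helper names, and this is not how tests/script-16.sh guards against the problem. Word size is only a coarse stand-in for the ABI.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if `path` is not a recognisable ELF file."""
    with open(path, "rb") as f:
        ident = f.read(5)
    if len(ident) < 5 or ident[:4] != b"\x7fELF":
        return None
    # e_ident[EI_CLASS]: 1 = ELFCLASS32, 2 = ELFCLASS64
    return {1: 32, 2: 64}.get(ident[4])


def same_word_size(binary, library):
    """True if both files are ELF objects of the same class (32-bit or 64-bit)."""
    a, b = elf_class(binary), elf_class(library)
    return a is not None and a == b
```

A test harness could call something like `same_word_size` on the tool and on the freshly built libsandbox.so and skip the test when it returns False.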
Created attachment 642870 0001-tests-script-16.sh-mark-as-passing-only-for-native-A.patch

0001-tests-script-16.sh-mark-as-passing-only-for-native-A.patch should fix the test (at least it does for me on the live ebuild for x86 and amd64 ABIs).
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/proj/sandbox.git/commit/?id=818d14f59a5bcf3cc9e8e88a993abc5605ed0b26

commit 818d14f59a5bcf3cc9e8e88a993abc5605ed0b26
Author:     Sergei Trofimovich <slyfox@gentoo.org>
AuthorDate: 2020-05-31 09:55:41 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2020-05-31 10:31:10 +0000

    tests/script-16.sh: mark as passing only for native ABI

    All scripts assume that ran tools match tested sandbox's ABI.
    Most scripts have a guard against ABI check, but script-16 was missing it.

    It's a follow-up commit to 24fd102c9976 ("check_syscall(): turn internal sandbox violation into denywrite")

    Reported-by: Michał Górny
    Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
    Closes: https://bugs.gentoo.org/590084
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 tests/script-16.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)