Currently, PMS describes the commands for interacting with the sandbox only sparsely. It does not even say what the sandbox is, and even with that extra knowledge the description is not enough to use the commands portably, let alone to implement a sandbox. I think we should describe, at the very least:

1. What are the default path lists? We need to know that to know when to use add*.
2. How do the different path lists interact?
2a. Does write imply read? Or are we expected to pass the same path to both?
2b. How do the paths stack? Given x=/foo and y=/foo/bar, which one takes precedence for /foo/bar? The more specific one (y), or just the stronger of the two?
2c. How do addwrite and addpredict interact? And adddeny?
3. How far does write prevention go? Do we disallow opening a file in write mode? Is access() supposed to report it as non-writable?
*** Bug 483238 has been marked as a duplicate of this bug. ***
Let's note what sandbox does right now.

> 1. What are the default path lists? We need to know that to know when to use
> add*.

(I will use r=READ, w=WRITE, p=PREDICT, d=DENY.)

/etc/sandbox.conf *always adds* the following paths to sandbox by default:

w: /dev/fd /proc/self/fd /dev/zero /dev/null /dev/full /dev/console /dev/tty /dev/vc/ /dev/pty /dev/tts /dev/ptmx /dev/pts/ /dev/shm /tmp/ /var/tmp/ ${HOME}/.bash_history

Now, if Portage didn't set any, then /etc/sandbox.d would default to:

r: /
# all but the first supposedly needed for configure tests
w: ${PWD} /usr/tmp/conftest /usr/lib*/conftest /usr/lib*/cf

Additionally, some packages (icedtea, openssl, fontconfig) install some entries on my system:

p: /dev/crypto /proc/self/coredump_filter /var/cache/fontconfig

What Portage sets is (some of those are conditional on features and limited in scope):

r: / ${PORTAGE_TMPDIR} ${CCACHE_DIR}
w: ${PORTAGE_TMPDIR} ${DISTCC_LOG%/*} ${CCACHE_DIR} ${DISTCC_DIR} ${PORTAGE_CONFIGROOT}etc/portage/suidctl.conf /selinux/context /sys/fs/selinux/context ${RPMDIR}
p: # see below

So basically Portage sets r/w unconditionally, overriding /etc/sandbox.d (but still including /etc/sandbox.conf). 'p' is more complex because:

a. Portage sets it to ${PORTAGE_GPG_DIR} if that variable is set. If it is not set, then 'p' is left unset and /etc/sandbox.d applies instead...
b. Portage adds / for src_test() (wtf?!).
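To make the layering above easier to follow, here is a rough Python sketch of how the three sources seem to combine. This is only a model of the behaviour described in this comment, not code taken from sandbox or Portage. The function name `effective_lists` and the dict layout are made up for illustration, and the src_test() special case is deliberately not modelled.

```python
def effective_lists(sandbox_conf, sandbox_d, portage):
    """Combine the three sources into the lists a build would see.

    sandbox_conf: entries from /etc/sandbox.conf (always added)
    sandbox_d:    defaults from /etc/sandbox.d
    portage:      lists exported by Portage; a missing key means "unset"
    All arguments are hypothetical dicts mapping 'r'/'w'/'p' to path lists.
    """
    result = {}
    for kind in ("r", "w", "p"):
        always = list(sandbox_conf.get(kind, []))
        if kind in portage:
            # Portage set the list: /etc/sandbox.d is overridden.
            chosen = list(portage[kind])
        else:
            # Portage left it unset: /etc/sandbox.d applies instead.
            chosen = list(sandbox_d.get(kind, []))
        result[kind] = always + chosen
    return result


if __name__ == "__main__":
    conf = {"w": ["/tmp/", "/var/tmp/"]}
    defaults = {"r": ["/"], "w": ["${PWD}"], "p": ["/dev/crypto"]}
    # PORTAGE_GPG_DIR unset, so Portage does not provide 'p' here.
    from_portage = {"r": ["/", "${PORTAGE_TMPDIR}"], "w": ["${PORTAGE_TMPDIR}"]}
    print(effective_lists(conf, defaults, from_portage))
```

With these inputs, 'w' ends up as the /etc/sandbox.conf entries plus Portage's, while 'p' falls back to the /etc/sandbox.d entries.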
Hmm, on the big question list I forgot one important bit: how to deal with symlinks?

In the following algo, I will use the following path terms:
- 'apparent path' is the path as accessed, which might be a symlink,
- 'real path' is the path with all symlinks resolved (if possible),
- 'path' is the 'apparent path' if the syscall operates directly on symlinks rather than their targets, and the 'real path' otherwise.

Action terms:
- accept = allow the syscall and break the algo,
- reject = deny the syscall, report violation, trigger failure and break the algo,
- silently reject = deny the syscall without failure and break the algo.

It seems that the access control algo in sandbox works in first-match mode, with some symlink resolution magic, in the following order:

1. reject if apparent path is on deny list,
2. if syscall would operate on symlink target (rather than symlink itself):
2a. reject if path is on deny list,
3. if in read-ish syscall:
3a. accept if path is on read list,
3b. silently reject if access() syscall,
4. if in write-ish syscall:
4a. reject if path is on write deny list (that doesn't seem to be settable...),
4b. accept if path is on write list,
4c. [there seems to be an obsolete hack for changing symlinks to protected targets here but it should no longer do anything given that 4b uses apparent path if the function operates on symlinks]
4d. accept if path matches the real path of /proc/self/fd,
4e. accept (pass through to the syscall so that it reports the error) if parent directory does not exist,
4f. silently reject if path is on predict list,
4g. silently reject if path ends with .pyc/.pyo (seriously?),
4h. silently reject if access() syscall,
5. reject.
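The decision order above can be sketched in Python roughly as follows. This is a model of my reading of the algorithm, not sandbox's actual code. `check_access`, the `lists` dict and all parameter names are hypothetical. List membership is assumed to be a plain prefix test, and the obsolete symlink hack is left out.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"                  # allow the syscall
    REJECT = "reject"                  # deny and report a violation
    SILENT_REJECT = "silently reject"  # deny without reporting a violation


def on_list(path, entries):
    # Assumption: membership is a plain string-prefix test.
    return any(path.startswith(entry) for entry in entries)


def check_access(apparent, real, lists, *, write, follows_symlinks,
                 is_access=False, parent_exists=True, proc_fd_real=None):
    """First-match evaluation; every 'return' breaks the algo."""
    # 'path' is the real path only when the syscall follows symlinks.
    path = real if follows_symlinks else apparent

    if on_list(apparent, lists["deny"]):
        return Verdict.REJECT
    if follows_symlinks and on_list(path, lists["deny"]):
        return Verdict.REJECT

    if not write:  # read-ish syscall
        if on_list(path, lists["read"]):
            return Verdict.ACCEPT
        if is_access:
            return Verdict.SILENT_REJECT
    else:          # write-ish syscall
        if on_list(path, lists["write_deny"]):
            return Verdict.REJECT
        if on_list(path, lists["write"]):
            return Verdict.ACCEPT
        if proc_fd_real is not None and path.startswith(proc_fd_real):
            return Verdict.ACCEPT
        if not parent_exists:
            # Pass through so the syscall itself reports the error.
            return Verdict.ACCEPT
        if on_list(path, lists["predict"]):
            return Verdict.SILENT_REJECT
        if path.endswith((".pyc", ".pyo")):
            return Verdict.SILENT_REJECT
        if is_access:
            return Verdict.SILENT_REJECT

    return Verdict.REJECT


if __name__ == "__main__":
    example = {"deny": [], "read": ["/"], "write": ["/var/tmp/"],
               "write_deny": [], "predict": ["/var/cache/fontconfig"]}
    target = "/var/cache/fontconfig/cache-1"
    print(check_access(target, target, example,
                       write=True, follows_symlinks=True))
```

Written this way, it is easy to see that a deny entry beats everything else and that the predict list is only consulted for write-ish calls.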
A few more fun facts about paths in SANDBOX*:

1. Sandbox always adds both the specified path and its realpath, i.e. addfoo /symlink also lists whatever path the symlink points to at the time sandbox processes the environment variables.
2. Paths are matched using dumb prefix matching, so addfoo /foo also matches /foobar.
3. A trailing slash can be used to restrict the match to a single directory tree, and it matches the directory itself as well.
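A small Python sketch of the three rules above, to show the consequences concretely. The helper names `expand_entry` and `matches` are made up, and how sandbox really handles the trailing-slash case internally is an assumption on my part; only the observable behaviour is modelled.

```python
import os


def expand_entry(entry):
    """Return the entry plus its realpath, resolved at call time."""
    real = os.path.realpath(entry)
    return [entry] if real == entry else [entry, real]


def matches(path, entry):
    """Dumb prefix matching, with the trailing-slash special case."""
    if entry.endswith("/"):
        # Matches the directory itself and everything below it,
        # but not siblings that merely share the prefix.
        return path == entry.rstrip("/") or path.startswith(entry)
    return path.startswith(entry)


if __name__ == "__main__":
    for path, entry in [("/foobar", "/foo"), ("/foobar", "/foo/"),
                        ("/foo", "/foo/"), ("/dev/nullfoobar", "/dev/null")]:
        print(path, entry, matches(path, entry))
```

Note that an entry such as /dev/null, listed without a trailing slash, also covers /dev/nullfoobar under this model.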
For completeness, a few notes on control variables:

- processes in sandbox can freely manipulate path lists by altering the sandbox vars,
- path sets are passed down the process tree but not upwards, i.e. add* in a subshell should not affect the parent shell.
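The "down but not up" behaviour is just ordinary environment inheritance. A minimal Python illustration, assuming the write list lives in a colon-separated SANDBOX_WRITE variable (the helper name and the example paths are made up, and no actual sandbox is involved here):

```python
import os
import subprocess
import sys


def child_extends_list(var="SANDBOX_WRITE", extra="/opt/example"):
    """Start a child that appends to the list; return (parent, child) views."""
    # Environment handed to the child; the parent's own copy stays in 'env'.
    env = dict(os.environ, **{var: "/var/tmp"})
    child_code = (
        "import os\n"
        f"os.environ[{var!r}] += ':' + {extra!r}\n"
        f"print(os.environ[{var!r}])\n"
    )
    done = subprocess.run([sys.executable, "-c", child_code], env=env,
                          capture_output=True, text=True, check=True)
    # The child's change is visible only in its own output.
    return env[var], done.stdout.strip()


if __name__ == "__main__":
    parent_view, child_view = child_extends_list()
    print("parent:", parent_view)
    print("child: ", child_view)
```

The child sees its extended list, while the value held by the parent is unchanged after the child exits.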
Now a few notes on sydbox as an alternative:

1. Sydbox supports separate read, write and exec sandboxing. I suppose we'll map PMS read -> read+exec.
2. Each of those can work either in blacklist or whitelist mode (but not both). The whitelist mode should match sandbox's read/write lists.
3. There are also filters that silence access violations. The write filter should be equivalent to the predict list.
4. I don't see an obvious way to implement deny. However, given that it's not used at all in Gentoo and that people can't rely on sandbox being present, I guess we can live with it being stubbed.
5. Sydbox can be configured using magic commands. They allow adding and removing paths from the lists but not querying them.
6. Sydbox uses one state for the whole process tree. Changes made in subprocesses affect the parent process as well.
7. The lists are implemented using rsync patterns, which provide great flexibility. If we really wanted to, we could copy the current sandbox behavior.
8. Symlink behavior is TBD but it is probably more correct than sandbox's.
Now, for a few more fun facts:

1. PMS specifies that add* take directories only, but it's very common for ebuilds to pass file paths there.
2. Sandbox does not follow normal ACL rules when applying restrictions, e.g. it does not complain if you create a file in a directory to which writing is forbidden (as long as the file path is whitelisted).
3. Combined with stupid prefix matching, ebuilds can freely create /dev/nullfoobar and so on.
I don't think this bug is actionable as-is. A patch with improved wording for the spec would be a better basis for further discussion.