So I investigated bug #342681 and the test 20 and realized that sandbox currently does not handle mkdirat very well. The problem shows itself if you have a directory, open a file director to that directory and try with mkdirat create a directory in the old directory with the same name as its parent. For example if you have code that mkdir("~/one"); fd=open("~/one"); mkdirat(fd,"one"); this works fine outside of sandbox, but will fail inside sandbox because erealpath/canonicalize checks ~/one, and sends back to the program calling mkdirat that EEXISTS. however the following works fine inside of sandbox as well: mkdir("~/one"); fd=open("~/one"); mkdirat(fd,"two");
Created attachment 252287 [details] sample code to show the behaviour I know my comment may sound vague, but here is code that shows the behaviour. This is what happends n my system if I run it: xake@lillen ~/devel $ cc -o poc sandbox-poc.c xake@lillen ~/devel $ rm -rf one && ./poc Have created and open dir "one" About to create dir "one/one"... Have created "one/one" xake@lillen ~/devel $ sandbox [s] xake@lillen ~/devel $ rm -rf one && ./poc Have created and open dir "one" About to create dir "one/one"... Failed create "one/one": -1, File exists [s] xake@lillen ~/devel $ As you may see it runs fine outside of sandbox, but inside of sandbox it does not, and with SANDBOX_DEBUG you will get: ... About to create dir "one/one"... EARLY FAIL mkdirat(one[/home/xake/devel/one]) @ lstat: Success Failed create "one/one": -1, File exists ... So it checks one level too low...
So mkdirat_pre_check canonicalize the pathname, and then it EEXISTS if the canonical filepath already exists. This is done before before_syscall() is run. The problem is that it seems like if you give canonicalize a relative path it always return CWD/RELPATH. So if you for example try to with mkdirat(A,A) this will fail since CWD/A does already exists. Likewise does sandbox return EEXISTS for mkdirat(A,B) if CWD/B exists (this is what happends in tars testsuite). Why I mentioned before_syscall() is that sandbox seems to create the file on the right place becouse if mkdirat_pre_check pass then before_syscall() will run and resolvs the dirfd and creates the file in the right place. So the way to fix this seems to be to have mkdirat_pre_check if (dirfd == AT_FDCWD) { readlink(/proc/self/fd/dirfd) (like before_syscall do) to resolv the path to dirfd/RELPATH } else { do the canonicalize to CWD/RELPATH (since AT_FDCWD == -100, and does not resolv to anything inside of /proc/self/fd/)}. Then look if the result exists. Does this sound correct?
Created attachment 252345 [details, diff] Make mkdirat_pre_check resolv the dirfd if dirfd != AT_FDCWD quick, dirty patch. Need some cleanup since there are some things I am uncertain about when it comes to ENAMETOOLONG and so. However this patch does not break anything in the sandbox testsuite, and it makes the tar testsuite (at least tar.git) work with sandbox.
(In reply to comment #3) > Created an attachment (id=252345) [details] > Make mkdirat_pre_check resolv the dirfd if dirfd != AT_FDCWD > > quick, dirty patch. Need some cleanup since there are some things I am > uncertain about when it comes to ENAMETOOLONG and so. However this patch does > not break anything in the sandbox testsuite, and it makes the tar testsuite (at > least tar.git) work with sandbox. tar-1.24-r2 still breaks emerges here
(In reply to comment #4) > tar-1.24-r2 still breaks emerges here With, or without the patch from this bug applied (to sandbox)? The patch is not in portage yet. Try to make sense...
He's saying tar-1.24-r2 should be masked.
(In reply to comment #6) > He's saying tar-1.24-r2 should be masked. Ryan has it exactly right. This bug (sandbox patch not applied) causes emerge to silently corrupt your system. Items that trigger it, like tar 1.24, need to be masked other wise your system gets corrupted and emerge/portage thinks all is ok. IMHO this is really bad. I understand that emerge/portage depends on some tools working correctly and hense does not check some for some errors. In this case, since tar keeps getting rereleased, it _might_ be a good idea to add a check on the tar rc to emerge/portage. From a user's POV I would _much_ rather emerge tell me some ebuilds have failed and the install is corrupt rather than have to parse logs and manually reemerge 10-20 ebuilds each time this happens (third time here). Thanks
(In reply to comment #7) > (In reply to comment #6) > > He's saying tar-1.24-r2 should be masked. > > Ryan has it exactly right. This bug (sandbox patch not applied) causes emerge > to silently corrupt your system. Items that trigger it, like tar 1.24, need to > be masked other wise your system gets corrupted and emerge/portage thinks all > is ok. IMHO this is really bad. > > I understand that emerge/portage depends on some tools working correctly and > hense does not check some for some errors. In this case, since tar keeps > getting rereleased, it _might_ be a good idea to add a check on the tar rc to > emerge/portage. From a user's POV I would _much_ rather emerge tell me some > ebuilds have failed and the install is corrupt rather than have to parse logs > and manually reemerge 10-20 ebuilds each time this happens (third time here). > > Thanks > BTW this time xorg-server-1.9.2 was the one that hit the bug - a missing /usr/bin/X is not fun...
tar-1.25 also has this problem and causes silent corruption. Here qt-core-4.7.1 did not setup files needed by qt-sql-4.71 causing that emerge to fail. Works with tar-1.23-r3 installed. Please mask tar-1.25 Here all the 1.24 and 1.25 releases of tar silently corrupt my install. This is NOT good.
Works fine here. Open a new bug that has some details (files missing, error message, emerge --info, etc.).
Nevermind, confirmed on IRC it's the same issue.
So where should we go from this? Should we do this in another way or is the method in the patch acceptable? If so should we fraction out this method and make it into a function before_syscall() and sb_mkdirat_pre_check() can share? Comment, comment, so we can fix this bug?
i'm guessing comments #4 - comments #11 are dupes of completely different bugs and have no business being here. mkdirat() isnt the only pre-check func that mishandles dirfd. simply grep the source to see more: mkdirat_pre_check.c: /* XXX: need to check pathname with dirfd */ openat_pre_check.c: if (dirfd == AT_FDCWD || pathname[0] == '/') { unlinkat_pre_check.c: /* XXX: need to check pathname with dirfd */ so rather than duplicate code, ive unified things: http://git.overlays.gentoo.org/gitweb/?p=proj/sandbox.git;a=commitdiff;h=c473c10a447a285f8c7b762f34c0650f587e1ff4 and added test cases: http://git.overlays.gentoo.org/gitweb/?p=proj/sandbox.git;a=commitdiff;h=46b108abbbc306357d949fd4cfd91ddab9749b00
Xake: can you verify the change i committed ?
(In reply to comment #14) > Xake: can you verify the change i committed ? > My preliminary testing suggests that your fix works for me.:-) Planning on a release?
that commit had a slight bug. this fixes it. http://git.overlays.gentoo.org/gitweb/?p=proj/sandbox.git;a=commitdiff;h=4a383c33005f7ffad6edeed01f78d8e2cca5203d