Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 98419

Summary: [sandbox] built-in `getcwd()' implementation broken
Product: Portage Development Reporter: bartron <bartron>
Component: SandboxAssignee: Sandbox Maintainers <sandbox>
Status: RESOLVED REMIND    
Severity: normal CC: ccb921215, sam
Priority: High    
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=149982
https://bugs.gentoo.org/show_bug.cgi?id=261936
https://bugs.gentoo.org/show_bug.cgi?id=447970
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: getcwd_glibc.diff
getcwd-test.c

Description bartron 2005-07-08 18:26:42 UTC
(1) in `__syscall_egetcwd()', line #151:

        `st' is used uninitialized... should be initialized with "/".

(2) in `recurser()', line #131

        `path_buf' is overwritten with "/.." ... but should have "/.." appended.
Comment 1 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-09 02:36:48 UTC
Something like this:

-----
Index: getcwd.c
===================================================================
--- getcwd.c    (revision 125)
+++ getcwd.c    (working copy)
@@ -128,7 +128,7 @@
        if (strlen(path_buf) + 4 > path_size) {
            goto oops;
        }
-       snprintf(path_buf, 4, "/..");
+       snprintf(path_buf + strlen(path_buf), 4, "/..");
        if (recurser(path_buf, path_size, root_dev, root_ino) == 0)
                return 0;

@@ -147,6 +147,8 @@
     size_t olderrno;

     olderrno = errno;
+    if (lstat("/", &st) < 0)
+           return -1;
     len = -1;
     cwd = recurser(buf, size, st.st_dev, st.st_ino);
     if (cwd) {
@@ -175,6 +177,7 @@
        if (path == NULL)
            return NULL;
     }
+    snprintf(buf, 2, ".");
     ret = __syscall_egetcwd(path, alloc_size);
     if (ret >= 0)
     {
@@ -182,6 +185,7 @@
            buf = realloc(path, ret);
        if (buf == NULL)
            buf = path;
+       printf("getcwd (%d) = '%s'\n", strlen(buf), buf);
        return buf;
     }
     if (buf == NULL)
@@ -197,11 +201,12 @@
        __set_errno(0);
        tmpbuf = getcwd(buf, size);

-       if (tmpbuf) {
+       if (tmpbuf)
                lstat(buf, &st);
-       } else {
+       else if (ENAMETOOLONG == errno)
+               return __egetcwd(buf, SB_PATH_MAX);
+       else
                return tmpbuf;
-       }

        if (errno) {
                /* If lstat() failed with eerror = ENOENT, then its
-----

This btw still only have the 4096 max path it can return ...
Comment 2 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-10 06:33:08 UTC
Anyhow, besides the path size limit, does the first 3 hunks look like the fixes
you suggested?
Comment 3 bartron 2005-07-10 18:03:17 UTC

    
Comment 4 bartron 2005-07-10 18:03:17 UTC
  
  For standard usage, yes.  Problem is still PATH_MAX... at the point 
the test in bug #21766 fails, `cwd_len' is somewhere between (PATH_MAX + 1)
and (PATH_MAX + 9)...

  The problem with `egetcwd()' in sandbox 1.2.10 is it tries to stat() 
every directory component relative to "/" on its way back... in the 
outermost recursion it's slightly above PATH_MAX... and fails...

  The generic one from glibc, on the other hand, uses paths relative 
to ".", basically a gigantic chain of "../" + currently searched directory 
name.  That means, unless the average component length is less than two, 
it can handle more than PATH_MAX... in the scenario with `confdir3' 
(8 bytes + '/'), limit is 12267 (give or take a few, depending on how 
long the parts before `/confdir3/...' are, but definetely plenty enough).


  (unfortunately, there's a small error in getcwd.c that looks like is
sitting there forever... when it runs out of "../.."s, assuming `dots[]' 
is 

  "../../../../..", 

after extending it becomes 

  "../../../../..../../../../.."

(note missing slash)... limiting depth to 76.  Looking at older sandbox 
versions, `dots[]' is a bit longer there, but is it enough?)
Comment 5 bartron 2005-07-10 18:04:48 UTC
Created attachment 63103 [details, diff]
getcwd_glibc.diff
Comment 6 bartron 2005-07-10 18:04:48 UTC
Created attachment 63103 [details, diff]
getcwd_glibc.diff

  
  Here's a patch for glibc getcwd.c that removes said limit.
Comment 7 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-11 00:10:42 UTC
Ok, here is the thing ... I have tried with longer dot list ... and it always
fails like you said around 12 thousand.  The problem however is that at least on
amd64 (can't remember with my p4) the limit is not long enough (need about 13
thousand or 14 ...).

Anyhow, somewhere getting too engrossed in finding a generic version, I forgot
what the whole issue was about.  The issue really was that on some (still an
issue?) 2.4 kernels, the getcwd syscall fails before it even gets to PATH_MAX,
but do not return an error.  The returned cwd is however not valid .. thus the
lstat I do.

So basically that test is just to get the getcwd() issue on 2.4 resolved, and
not to get cwd to handle that long paths.  If we anyhow do it differently than
normal, coreutils/whatever will any expect something not valid.  It would help
if we could handle it, but currently any implementation is just _too_ slow.
Comment 8 Chris C. Blockschmidt 2005-07-12 16:08:39 UTC
Sorry, I'm still not convinced. Isn't PATH_MAX 4096 on linux, always?

I checked on my ia64 box (64K page size) and amd64 (4K page size, sorry
only available option) but PATH_MAX is always 4096. I even compiled old 2.4
kernels on both - same result - coreutils test stops at 4100. I still think
the problem is in sandbox. 
Comment 9 bartron 2005-07-12 16:45:53 UTC
[Chris... I thought you said you studied bug #21766...]

  Even though coreutils is only marginally related, here's a short
summary, just this once...


  Once upon a time there used to be kernels whose `getcwd()' syscall 
would return a truncated path (elements missing at the beginning) if 
invoked from a path longer than PATH_MAX (or rather, PAGE_SIZE) without 
setting `errno'.  All kernels from 2.4.21 on set errno=ENAMETOOLONG in 
this case (note that we're talking about the buffer passed to `__d_path()' 
here, which is always PAGE_SIZE big, and how paths deeper than that are 
handled.  The `getcwd()' syscall itself fails, and always did, with 
errno=ERANGE in case it is called with an insufficiently sized buffer, 
/if/ pathlen itself is still within limits).


  Now, here's what coreutils does during `configure' (comments added by me)

    /* [snip] */
    fail=0; cwd_len=strlen(cwd);
    while (1)
    {
      /* [snip] */

      /* cwd_len internally keeps track how long cwd is... */
      cwd_len += 1 + strlen (DIR_NAME);

      /* create and change to DIR_NAME ("confdir3") */
      if (mkdir (DIR_NAME, 0700) < 0 || chdir (DIR_NAME) < 0)
      {
        /* failing to do so means the whole test failed... bad.  (*1) */
        fail = 1;
        break;
      }

    if ((c = getcwd (buf, PATH_MAX)) == NULL)
    {
      /* getcwd eventually returns NULL if path gets too long... good. */
      /* (although it really should check errno, like the original 
       * comment suggests...) */
      break;
    }

    if ((len = strlen (c)) != cwd_len)
    {
      /* getcwd returns non-NULL, but result is too short... means we're 
       * running on linux 2.4.20 or older... bad.
       */
      fail = 1;
      break;
    }
    ++n_chdirs;
    if (PATH_MAX < len)
      /*
       * non-NULL path and len > PATH_MAX means getcwd can actually 
       * return longer paths... good.
       *
       * (actually that can never happen because getcwd() is called with
       *  a buffer that is only PATH_MAX big)
       */
      break;
    }

    /*
     * The real problem in #21766 comes next, and is triggered regardless 
     * whether `getcwd()' is good or bad... we're still sitting in a 
     * path that is (PATH_MAX+n) long (n=1..9), but in case the mkdir()
     * succeeded and chdir() failed above (*1), we're trying to unlink()
     * DIR_NAME, just to make sure...
     */
     unlink (DIR_NAME);

    /* [snip] */

  Since sandbox hooks `unlink()' and its argument is relative, it tries 
to calculate 

    rp = `getcwd` + "/" + "confdir3".

  First case is obvious, in the olden days when `erealpath()' was using 
the kernel `getcwd', and `getcwd' returns a truncated path rp is not a 
place where sandbox allows writing, hence the access violation.  On 
arches where PAGE_SIZE is bigger than 4K, errno is ERANGE and rp became 
"<empty string>" + "/" + "confdir3", with good and bad kernels (actually 
I've seen that happen on x86 and good kernels too, but never got around 
to investigate).


  The easiest hack (what sandbox currently does) is provide a `getcwd()' 
that can return paths slightly longer than PATH_MAX (+9 should always 
be sufficient, see above).  I suggested the one from glibc a while back, 
now it's uclibc's (differenly limited and a bit slower, but since there's 
no need to do anything with the current directory, only its parent, it 
would still work as long as the parent fits into PATH_MAX).

  However... I absolutely agree with Martin that in the long run the 
coreutils test should be modified (may not be accepted upstream because 
it's sandbox specific, and thus means more work for Gentoo maintainers, 
but personally I like clean solutions...)


  My suggestion: 

split

    if (mkdir (DIR_NAME, 0700) < 0 || chdir (DIR_NAME) < 0)

into two parts and `rmdir(DIRNAME)' if `chdir()' fails, then it's 
safe to remove the `unlink()' call that is causing all the trouble.
Something like

    if (mkdir (DIR_NAME, 0700) < 0)
    {
        /* fail */
    }
    if( chdir (DIR_NAME) < 0)
    {
        rmdir (DIR_NAME);
        /* fail */
    }

and get rid of the unlink call.
Comment 10 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-13 01:01:24 UTC
Ok, I wanted to take a shot at explaining, but seems like Barton did nicely. 
Anyhow, just for the record, the getcwd test goes over 12000:

 $ ./getcwd-test
path size = 12312
Comment 11 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-13 01:02:54 UTC
Created attachment 63296 [details]
getcwd-test.c

Source for above test (basically the getcwd test for coreutils).
Comment 12 Chris C. Blockschmidt 2005-07-16 17:25:25 UTC
That is all nice and dandy, but coreutils doesn't get that deep, ever. Bartron,
could you please settle this for us one last time?
Comment 13 bartron 2005-07-16 18:10:26 UTC

    
Comment 14 bartron 2005-07-16 18:10:26 UTC
  
  That depends a bit what problem you're trying to argue... If it's 
`getcwd()' can't deal with such huge paths, yes, you have a point.  If 
it's whether a configure test can end up being stuck in there... well, 
comment #9 has an example where exactly that happens...

  Unlike the test in coreutils 5.2.1, this test just goes on making 
directories like mad, with 3 checkpoints and only 2 calls to getcwd() 
along the way...

(1) first iteration when `cwd_len' becomes >= PATH_MAX... 
- fails if c==NULL and ENOENT
- fails if c!=NULL or (c==NULL and errno is neither ERANGE nor 
  ENAMETOOLONG)

(2) first iteration when `cwd_len' becomes >= dotdot_max + initial_cwd_len
- fails if c==NULL, but not if ENAMETOOLONG
- fails if c!=NULL, but unexpected length of return value.

(3) next iteration from (2), quits loop with fail=0.


  At (1) I'm getting c=NULL, errno=ENAMETOOLONG, cwd_len=4098... passed.

  At (2), c=NULL, errno=ENAMETOOLONG, cwd_len=12333... passed.

  It breaks out at (3), with cwd_len=12342... test successful, getcwd() 
called only two times (both returned NULL), and here the test is hanging 
in a working directory that is 12342 bytes long...


  So I'd guess you're both right... (a) `getcwd()' usually can't deal 
with a path that long (and if it did, comment #9 would fail at point(1) 
above), and (b) the attachment in comment #9 is an example of a 
configure test that is not really sandbox friendly.
Comment 15 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-17 06:28:25 UTC
(In reply to comment #10)
> That is all nice and dandy, but coreutils doesn't get that deep, ever. Bartron,
> could you please settle this for us one last time?
> 

Yes, you are right (besides the configure test) ... but what exactly do you
expect by from argueing about this?  I am probably missing something, but I do
not really see what you think should happen?
Comment 16 Chris C. Blockschmidt 2005-07-18 16:54:06 UTC
My problem is, we have an ebuild doing some acrobatics with huge paths in its
configure stage (not easily rewritten) just like coreutils. Until sandbox
1.2.9, it just worked. Heavy access violations in 1.2.9 and 1.2.10 (sandbox
translates /very/huge/path/to/filename to /filename). With sandbox 1.2.11
(unstable) it seems to work again but it prints "LONG PATH" warnings (in yellow
on white) and proceeds with noncanonical arguments (doesn't that mean no
sandbox protection at all?)  The old variant did not have these problems and
our tests have shown it is also five times faster (we measured how many calls
to getcwd in 5 secs with old version, then did the same number of calls with 
new version - 25.3 secs). Why use inferior code?
Comment 17 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-20 00:52:52 UTC
Private ebuild ?  I guess I do not mind changing it back besides the fact that
its ugly code and more difficult to port :-)

Bartron .. you dont have something laying around that is a bit faster ? ;)  Or
if not, what was the suggested fix for the old code again ?
Comment 18 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-20 06:58:49 UTC
PS: are you guys using an older kernel?  As the generic version should only be
called if the getcwd() call do not fail, but return an invalid path (as Barton
explained in comment #7 ) ....

Also, sure, we do not 'protect' in that case, but if you actually use the
generic code for the test example above, it takes _minutes_ and not just a few
seconds longer.

Then for the warning .. it should only be one per version .. how do you create
the dirs, and why?
Comment 19 bartron 2005-07-23 20:00:35 UTC

    
Comment 20 bartron 2005-07-23 20:00:35 UTC
  A patch that allows a maximum of 1365 (= (PATH_MAX-1) / 3) directories 
is here

    http://bugs.gentoo.org/attachment.cgi?id=63296

(or at least, close to that, or assuming the root entry is no longer 
than 2, next one is no longer than 5, and so forth...)


  The problem I see is, this is not the only restriction.

1. The test in comment #9 goes even further than that (by at least 
two)... I'd guess it's not impossible to come up with an algorithm 
that can handle that too (once a limit is reached, chdir to farthest 
dir within reach and continue from there, with lots of precautions to 
restore current working dir under all sorts of error conditions)... but 
it really does get slow if it's called too often.
(tar-1.15 test would take almost 2 minutes even on moderately fast 
hardware)

2. Sandbox internally uses fixed size buffers of `SB_PATH_MAX' 
(=PATH_MAX*2) in size at times... what should the maximum be here?
In the example above, max depth 1365, avg. dir name length 8, the 
theoretical maximum path length is already PATH_MAX*3.  With an 
avg. dir name length of 17, it would be PATH_MAX*6, and so on.
Comment 21 bartron 2005-07-23 20:11:10 UTC
Err... wrong URL, should be 

    http://bugs.gentoo.org/attachment.cgi?id=63103

(sorry about that)
Comment 22 Martin Schlemmer (RETIRED) gentoo-dev 2005-07-24 06:12:47 UTC
Thanks.  I still would have liked to know what exactly they are doing, as the
getcwd() arobatics we do in sandbox have nothing really to do with the getcwd()
call the program does (or any other call for that matter), except if we restrict
the app when we should not have.
Comment 23 Martin Schlemmer (RETIRED) gentoo-dev 2005-08-31 08:44:55 UTC
Marking as reminder, as already pointed out, the one bug was fixed, the code is
only used on some older 2.4 kernels, and except if we can really get something
speedy, its not worth using a generic implementation if the regular getcwd()
fails with ENAMETOOLONG.