Bug 264007 - sys-apps/coreutils-7.1: /usr/bin/id -G <user> hangs
Status: RESOLVED FIXED
Alias: None
Product: Gentoo/Alt
Classification: Unclassified
Component: Prefix Support
Hardware: x86 OS X
Importance: High major
Assignee: Gentoo Prefix
 
Reported: 2009-03-27 21:48 UTC by Ian Rae
Modified: 2009-04-20 19:30 UTC



Attachments
Patch posted to bug-coreutils ( (patch-against-7.2.txt,998 bytes, patch)
2009-04-09 01:08 UTC, Steven Parkes

Description Ian Rae 2009-03-27 21:48:41 UTC
I'm using prefixed Gentoo on a Mac OS X 10.5 system with LDAP enabled for authentication and user search.  After upgrading coreutils to 7.1, I get really strange results with ${EPREFIX}/usr/bin/id:

ian@cumulonimbus ~ $ id -G
1463 8056 12869 5105 81 102 5124 79 13684 80 1440 1091
ian@cumulonimbus ~ $ id -G ian
(fails to exit)
ian@cumulonimbus ~ $ id -G root
(fails to exit)
ian@cumulonimbus ~ $ id -G rene
1453

The "rene" account has never logged into this machine (it's just in LDAP), and it does belong to more than one group.

coreutils-6.12-r2 works fine:

ian@cumulonimbus ~ $ id -G root
0 1 2 8 29 1148 3 9 5055 4
ian@cumulonimbus ~ $ id -G ian
1463 8056 12869 5105 81 102 5124 79 13684 80
ian@cumulonimbus ~ $ id -G rene
1453 1148 5055 1298 6061 32659 1091

This also has the unfortunate side effect of breaking portage, since it uses id -G to enumerate the portage user's groups.

Reproducible: Always

Steps to Reproduce:
1. Enable LDAP for user authentication
2. Emerge sys-apps/coreutils-7.1
3. id -G root

Actual Results:  
id hangs.

Expected Results:  
id should output a list of group IDs.

Mac OS X 10.5, LDAP enabled for user authentication and search.
Comment 1 Fabian Groffen gentoo-dev 2009-03-28 07:59:12 UTC
Hmmm, I recall someone else having a problem with this on some other platform.  Apparently the coreutils folks have changed something which is a lot more expensive (if it will ever finish at all!).
Comment 2 Markus Duft (RETIRED) gentoo-dev 2009-03-30 08:00:21 UTC
Interix has a similar problem regarding Windows domains. There are replacements for some functions in the system that don't query all members of all groups on a domain (which takes ages). I use those functions (by redefining them in the coreutils ebuild, FYI), but I guess they're not available anywhere else, and these two problems are unrelated.

However, darkside had a similar problem on Interix lately which seemed to be a little different from what I fixed by using the replacement functions. Maybe he's on LDAP too, instead of Windows domains?
Comment 3 Fabian Groffen gentoo-dev 2009-04-03 17:24:08 UTC
just a wishful check, does coreutils-7.2 exhibit the same problem?
Comment 4 Steven Parkes 2009-04-06 22:03:41 UTC
(In reply to comment #3)
> just a wishful check, does coreutils-7.2 exhibit the same problem?
> 

Yes, it does.

Trying to dig a little deeper ...
Comment 5 Steven Parkes 2009-04-06 23:18:45 UTC
It looks like a Darwin bug. getgrouplist (called from mgetgroups.c:mgetgroups) is not changing the value of max_n_groups, which all the docs seem to say it should do. The result is an infinite loop.

It worked in 6.12 because 6.12 never actually looked at the result from the getgrouplist call if the number of groups looked okay. So it would return bad results (it didn't include the whole list of groups), but it did return.

Darwin kinda gets around this by always making a buffer the size of NGROUPS+1. But they also now support more than NGROUPS (from what I understand).

One could patch this for Darwin by just increasing the buffer size (2X?) until getgrouplist didn't return an error.

Happy to help ...
Comment 6 Steven Parkes 2009-04-07 17:14:13 UTC
Follow up from gentoo-alt@lists.gentoo.org:

> I don't think it's a bug.

Well, there is some room for debate; I probably stated that too strongly. There is a question as to when the OS should update ngroups in getgrouplist. The 4.4BSD/Darwin docs are a little more ambiguous than the NetBSD 5.0/Linux docs. The code in Darwin is v1.2 of getgrouplist.c, which (apparently) comes from NetBSD. Darwin is still on v1.2 while the file has moved on quite a bit since then, so I'd say it's Darwin's problem.

> I think coreutils recently started to try
> harder to obtain the information it needs, thus for instance going into
> an ldap or NT domain lookup, which just takes an awful amount of time.

This isn't about taking a long time, it's about an infinite loop. I guarantee it.

> we should try to find
> the configure flag that reverts this behaviour and add it so

There's no config flag that will revert the behavior. The change between 6 and 7 in this case is in the same code path; it's just looking at the results slightly differently between the two versions. With a "working" version of getgrouplist, there's no change in behavior between 6 and 7, but given the behavior on Darwin, there is.

There is a flag for completely avoiding getgrouplist and I guess we could set that, but that wasn't used before and I, personally, don't like avoiding the OS call. Plus it doesn't give the same results as Darwin's /usr/bin/id.

It's easy enough to patch to either "do the right thing" or to just patch it to do what Darwin's native id does.
Comment 7 Fabian Groffen gentoo-dev 2009-04-07 17:49:55 UTC
ok, then we should bring this to upstream's attention
Comment 8 Steven Parkes 2009-04-07 17:55:40 UTC
Cool with me.

Is the idea that the coreutils people care about working around broken getgrouplist?
Comment 9 Fabian Groffen gentoo-dev 2009-04-07 17:56:41 UTC
I think they do care about their software going into an endless loop.
Comment 10 Steven Parkes 2009-04-07 18:00:06 UTC
Good point.

Is this something I can help with/do? Haven't exactly done it before, but I should be able to find the right people to submit to ...
Comment 11 Fabian Groffen gentoo-dev 2009-04-07 18:03:54 UTC
I think gnu coreutils people primarily work on the gnu coreutils mailing list.

First search if this issue hasn't been reported already, then submit the problem to them, including your in-depth analysis.

In the meantime I'm willing to:
a) mask
b) patch, provided the patch is acceptable ;)
Comment 12 Steven Parkes 2009-04-07 19:26:48 UTC
Submitted to <bug-coreutils@gnu.org>.

I don't have a strong feeling on masking vs. patching, though I guess I do figure it makes sense to do something fairly quickly since emerge calls id so this can really stop things dead in the water.

Reverting is easy enough. Or a one-line patch (at the end) will revert just this case to 6.12. Note that in either case, the results are wrong: it's hard-limiting to 10 groups, regardless of how many the user is in, so prefix id and /usr/bin/id are going to return different results.

A fix that would return correct results would be to look at the return value from getgrouplist and manage changing the buffer size manually and speculatively, that is, perhaps doubling it a few times in hope that you'll generate as much space as is required. I asked on bug-coreutils whether they would want to do this. Would you? A little more freestore load, but really not a big deal, I would think (just kinda annoying, since you just wish Darwin got rid of the bug).
Comment 13 Steven Parkes 2009-04-07 19:55:49 UTC
Here's the pseudo-revert, though I'd still rather do it "right":

diff --git a/lib/mgetgroups.c b/lib/mgetgroups.c
index e697013..77359f0 100644
--- a/lib/mgetgroups.c
+++ b/lib/mgetgroups.c
@@ -1,3 +1,4 @@
+#include <stdio.h>
 /* mgetgroups.c -- return a list of the groups a user is in
 
    Copyright (C) 2007-2009 Free Software Foundation, Inc.
@@ -94,6 +95,11 @@ mgetgroups (char const *username, gid_t gid, GETGROUPS_T **groups)
            }
          g = h;
 
+          if( ng < 0 && max_n_groups <= N_GROUPS_INIT)
+            {
+                ng = max_n_groups;
+            }
+
          if (0 <= ng)
            {
              *groups = g;
Comment 14 Steven Parkes 2009-04-07 20:19:32 UTC
Whoops. Sorry about the debugging printf include.

Anyway, here is a "correct" patch. It relies on the OS not forever returning -1.

It also includes something I didn't realize, which is that when Darwin getgrouplist does return a full list, it just sets the return value to 0, whereas it looks like the newer libs have it return the number of groups.

diff --git a/lib/mgetgroups.c b/lib/mgetgroups.c
index e697013..7282197 100644
--- a/lib/mgetgroups.c
+++ b/lib/mgetgroups.c
@@ -83,8 +83,18 @@ mgetgroups (char const *username, gid_t gid, GETGROUPS_T **groups)
          GETGROUPS_T *h;
 
          /* getgrouplist updates max_n_groups to num required.  */
+         /* Some old getgrouplist return -1 but don't update max_n_groups */
+         int previous_max_n_groups = max_n_groups;
          ng = getgrouplist (username, gid, g, &max_n_groups);
 
+         if (ng < 0 && max_n_groups == previous_max_n_groups)
+           {
+             max_n_groups <<= 2;
+           } else {
+             /* some old getgrouplist don't return max_n_groups on success */
+             ng = max_n_groups;
+           }
+
          if ((h = realloc_groupbuf (g, max_n_groups)) == NULL)
            {
              int saved_errno = errno;
Comment 15 Steven Parkes 2009-04-07 20:40:30 UTC
'course, I meant *= 2 or <<= 1; long day ...
Comment 16 Steven Parkes 2009-04-09 01:08:39 UTC
Created attachment 187745
Patch posted to bug-coreutils (

This is the patch that's been posted to bug-coreutils. Pretty much what I posted before. Tested against 10.5.6 and returns the right result, even when a user has many groups.

This is just the code change, and against 7.2; not the supporting stuff.

cf: http://lists.gnu.org/archive/html/bug-coreutils/2009-04/msg00089.html
Comment 17 Steven Parkes 2009-04-09 01:10:42 UTC
Hmmmm ... bugzilla ate my comments on the patch. This is what was posted to bug-coreutils, applied to 7.2. Just the code, not the supporting materials. 

The code is essentially what I posted before and returns the correct results for a user with dozens of groups. 
Comment 18 Fabian Groffen gentoo-dev 2009-04-20 19:30:12 UTC
I added the following patch:

http://article.gmane.org/gmane.comp.gnu.core-utils.bugs/16578
http://repo.or.cz/w/coreutils.git?a=commitdiff;h=bf87a2c8ea4487ca4448c9fe42a9c9858400acbd

with modifications to apply on the 7.2 branch.

Thanks all, sorry for the wait!