Bug 222693 - net-nds/openldap-2.3.41: slapd takes a long time to start when "group:" line in nsswitch.conf file includes "ldap"
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Server
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo LDAP project
Reported: 2008-05-18 17:34 UTC by Andrei Iordache
Modified: 2012-02-12 21:51 UTC

Attachments
Test case for bug 222693 (222693-1.txt.txt,16.03 KB, text/plain)
2008-10-14 19:39 UTC, Andrei Iordache

Description Andrei Iordache 2008-05-18 17:34:32 UTC
I use LDAP for user authentication with OpenLDAP, pam_ldap and nss_ldap, all on the same machine. Everything works well, and once slapd has started I can use the commands getent passwd, getent group and getent shadow. However, slapd takes a long time to start. It seems to take exactly 30 seconds, which I think is the default value for both the search time limit (timelimit) and the bind/connect time limit (bind_timelimit) in /etc/ldap.conf.

The slow start happens ONLY if I include ‘ldap’ on the ‘group:’ line in /etc/nsswitch.conf. Having ‘ldap’ on the ‘passwd:’ and ‘shadow:’ lines does not seem to slow down the start-up of the slapd service.

In /etc/nsswitch.conf I have:
passwd:     files ldap
shadow      files ldap
group:      files ldap

/etc/ldap.conf:
base <my.base>
uri ldap://127.0.0.1/
ldap_version 3
rootbinddn <my.root.bind.dn>
... (default values)
logdir /var/log/nss_ldap
debug 9

In order to see what’s going on, I start slapd from the command line:

# /usr/lib/openldap/slapd -d 256 -u ldap -g ldap -l daemon
@(#) $OpenLDAP: slapd 2.3.41 (May 18 2008 16:20:55) $
        root@gentoo.home:/var/tmp/portage/net-nds/openldap-2.3.41/work/openldap-2.3.41/servers/slapd
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_err2string
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_err2string
slapd starting <= At this point 30 seconds will have passed.

If I change nsswitch.conf as follows:
passwd:     files ldap
shadow      files ldap
group:      files

# /usr/lib/openldap/slapd -d 256 -u ldap -g ldap -l daemon
@(#) $OpenLDAP: slapd 2.3.41 (May 18 2008 16:20:55) $
        root@gentoo.home:/var/tmp/portage/net-nds/openldap-2.3.41/work/openldap-2.3.41/servers/slapd
slapd starting <= it takes 1 second or less

If I change nsswitch.conf as follows:
passwd:     files ldap
shadow      files ldap
group:      ldap

# /usr/lib/openldap/slapd -d 256 -u ldap -g ldap -l daemon
@(#) $OpenLDAP: slapd 2.3.41 (May 18 2008 16:20:55) $
        root@gentoo.home:/var/tmp/portage/net-nds/openldap-2.3.41/work/openldap-2.3.41/servers/slapd
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_create
ldap_url_parse_ext(ldap://127.0.0.1/)
ldap_simple_bind
ldap_sasl_bind
ldap_send_initial_request
ldap_new_connection 1 1 0
ldap_int_open_connection
ldap_connect_to_host: TCP 127.0.0.1:389
ldap_new_socket: 9
ldap_prepare_socket: 9
ldap_connect_to_host: Trying 127.0.0.1:389
ldap_connect_timeout: fd: 9 tm: 30 async: 0
ldap_ndelay_on: 9
ldap_is_sock_ready: 9
ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)
ldap_close_socket: 9
ldap_err2string
ldap_unbind
ldap_err2string
No group entry for group ldap <= This time it takes 15 seconds
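
The two traces above differ mainly in how many connection attempts they contain: counting the "Trying" lines gives 12 attempts in the 30-second trace and 6 in the 15-second one. A minimal sketch for making that count on a saved trace is shown here. The function name summarize_trace is hypothetical, and the sketch assumes the debug output has been captured as plain text in the format shown above.

```python
import re

# Matches the libldap debug line printed for each TCP connection attempt,
# in the format seen in the traces above.
ATTEMPT = re.compile(r"^ldap_connect_to_host: Trying (\S+)$")
REFUSED = "Connection refused"


def summarize_trace(trace: str) -> dict:
    """Count connection attempts and refusals in a `slapd -d` style trace."""
    attempts = 0
    refused = 0
    targets = set()
    for line in trace.splitlines():
        line = line.strip()
        match = ATTEMPT.match(line)
        if match:
            attempts += 1
            targets.add(match.group(1))
        elif REFUSED in line:
            refused += 1
    return {"attempts": attempts, "refused": refused, "targets": sorted(targets)}


if __name__ == "__main__":
    # Two attempts copied from the trace format above (hypothetical excerpt).
    sample = (
        "ldap_connect_to_host: TCP 127.0.0.1:389\n"
        "ldap_connect_to_host: Trying 127.0.0.1:389\n"
        "ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)\n"
        "ldap_connect_to_host: Trying 127.0.0.1:389\n"
        "ldap_is_socket_ready: error on socket 9: errno: 111 (Connection refused)\n"
    )
    print(summarize_trace(sample))
```

Dividing the observed delay by the number of attempts is only a rough comparison; the sketch says nothing about why each attempt is retried.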

At this point I checked if I have the ldap group in:
/etc/group:
ldap:x:439:

/etc/passwd:
ldap:x:439:439:added by portage for openldap:/usr/lib/openldap:/usr/sbin/nologin

/etc/shadow:
ldap:!:13363:0:99999:7:::

So it seems that slapd itself tries to look up the group ‘ldap’ in LDAP if the ‘group:’ line in /etc/nsswitch.conf contains ‘ldap’, and it does so before the server has even started. Only after the lookup times out does the server continue to start, and from then on everything works fine. If the line contains only ‘files’, the group ldap is found immediately in /etc/group and slapd starts right away.
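
To see which NSS lookup is slow without involving slapd at all, the lookups can be timed from a small script while slapd is stopped. The sketch below is only an illustration: it assumes that dropping privileges with -u/-g involves a lookup of the group by name and a lookup of the user's supplementary groups, which this report does not confirm. The helper names timed and probe are hypothetical; grp.getgrnam and os.getgrouplist are standard Python wrappers around the corresponding C library calls, so they go through /etc/nsswitch.conf like any other program.

```python
import grp
import os
import time


def timed(func, *args):
    """Run func(*args) and return (result_or_None, elapsed_seconds)."""
    start = time.monotonic()
    try:
        result = func(*args)
    except KeyError:
        # grp.getgrnam raises KeyError when no NSS source knows the name.
        result = None
    return result, time.monotonic() - start


def probe(user: str, group: str, gid: int) -> dict:
    """Time a group lookup by name and a supplementary-group lookup."""
    entry, t_name = timed(grp.getgrnam, group)
    memberships, t_list = timed(os.getgrouplist, user, gid)
    return {
        "group_found": entry is not None,
        "getgrnam_seconds": t_name,
        "memberships": memberships,
        "getgrouplist_seconds": t_list,
    }


if __name__ == "__main__":
    # User, group and gid as listed in /etc/passwd and /etc/group above.
    print(probe("ldap", "ldap", 439))
```

If only the second timing is large, the delay comes from enumerating group memberships rather than from resolving the group name itself.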



Reproducible: Always

Steps to Reproduce:
1. Install OpenLDAP, nss_ldap and pam_ldap
2. Populate the LDAP database with some users and groups
3. Configure the system to use LDAP for user authentication, in particular include ‘ldap’ on the ‘group:’ line in /etc/nsswitch.conf.
4. Start the slapd server. It takes 30 seconds or so to start.
5. At this point ‘getent passwd’, ‘getent group’ and ‘getent shadow’ should return users and groups in LDAP.


Actual Results:  
Slapd starts in 30 seconds or so.

Expected Results:  
Slapd should start much faster, within seconds or so.

It makes no difference if I put the following in /etc/nsswitch.conf:
passwd:     files ldap
shadow      files ldap
group:      files [SUCCESS=return] ldap

or:
passwd:     files ldap
shadow      files ldap
group:      files [SUCCESS=return] ldap [UNAVAIL=return]

It will behave exactly as in the first case (when I had group: files ldap)
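
For reference, the structure of these nsswitch.conf lines can be made explicit with a small parser. This is a rough sketch of the file syntax only (database name, colon, services, optional bracketed STATUS=action lists); it does not model how glibc evaluates the actions. The function name parse_nsswitch_line is hypothetical.

```python
import re

# A token is either a bracketed action list or a bare service name.
TOKEN = re.compile(r"\[[^\]]*\]|\S+")


def parse_nsswitch_line(line: str):
    """Return (database, [(service, {STATUS: action}), ...]) or None."""
    line = line.split("#", 1)[0].strip()
    if ":" not in line:
        # Not a "database: services" line as far as this sketch can tell.
        return None
    database, rest = line.split(":", 1)
    services = []
    for token in TOKEN.findall(rest):
        if token.startswith("["):
            if not services:
                continue  # action list without a preceding service: skipped here
            for item in token[1:-1].split():
                status, _, action = item.partition("=")
                services[-1][1][status.upper()] = action.lower()
        else:
            services.append((token, {}))
    return database.strip(), services


if __name__ == "__main__":
    print(parse_nsswitch_line("group:      files [SUCCESS=return] ldap [UNAVAIL=return]"))
    print(parse_nsswitch_line("shadow      files ldap"))
```

Note that this sketch returns None for a line without a colon, such as the "shadow      files ldap" line as pasted in this report; whether the real file also lacks the colon cannot be told from the report.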

# equery uses openldap
[ Searching for packages matching openldap... ]
[ Colour Code : set unset ]
[ Legend : Left column  (U) - USE flags from make.conf              ]
[        : Right column (I) - USE flags packages was installed with ]
[ Found these USE variables for net-nds/openldap-2.3.41 ]
 U I
 + + berkdb        : Adds support for sys-libs/db (Berkeley DB for MySQL)
 + + crypt         : Add support for encryption -- using mcrypt or gpg where applicable
 - - debug         : Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see http://www.gentoo.org/proj/en/qa/backtraces.xml
 - - gdbm          : Adds support for sys-libs/gdbm (GNU database libraries)
 - - ipv6          : Adds support for IP version 6
 - - kerberos      : Adds kerberos support
 - - minimal       : Install a very minimal build (disables, for example, plugins, fonts, most drivers, non-critical features)
 - - odbc          : Adds ODBC Support (Open DataBase Connectivity)
 - - overlays      : Enable contributed OpenLDAP overlays
 - - perl          : Adds support/bindings for the Perl language
 - - samba         : Adds support for SAMBA (Windows File and Printer sharing)
 - - sasl          : Adds support for the Simple Authentication and Security Layer
 - - selinux       : !!internal use only!! Security Enhanced Linux support, this must be set by the selinux profile or breakage will occur
 - - slp           : Adds Service Locator Protocol support
 - - smbkrb5passwd : Enable overlay for syncing ldap, unix and lanman passwords
 + + ssl           : Adds support for Secure Socket Layer connections
 + + tcpd          : Adds support for TCP wrappers

# equery uses nss_ldap
[ Searching for packages matching nss_ldap... ]
[ Colour Code : set unset ]
[ Legend : Left column  (U) - USE flags from make.conf              ]
[        : Right column (I) - USE flags packages was installed with ]
[ Found these USE variables for sys-auth/nss_ldap-258 ]
 U I
 - - debug    : Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see http://www.gentoo.org/proj/en/qa/backtraces.xml
 - - kerberos : Adds kerberos support
 - - sasl     : Adds support for the Simple Authentication and Security Layer
Comment 1 Jan Kundrát (RETIRED) gentoo-dev 2008-05-18 18:04:27 UTC
What does your /etc/ldap.conf file look like? Do you have any line that starts with "nss_initgroups_ignoreusers" in there? If you do, does it include the user that your slapd is supposed to run as?
Comment 2 Andrei Iordache 2008-05-18 21:22:38 UTC
(In reply to comment #1)
> What does your /etc/ldap.conf file look like? Do you have any line that starts
> with "nss_initgroups_ignoreusers" in there? If you do, does it include the user
> that your slapd is supposed to run as?
> 

Thank you for looking into this.

My /etc/ldap.conf file is pretty much the one that nss_ldap installs by default, with only a few options modified like base, uri, ldap_version 3, rootbinddn, pam_password exop.

There is an "nss_initgroups_ignoreusers" line (line 168), but it is commented out. I should also mention that I don't know what that option is for and I have never used it.

# Use backlinks for answering initgroups()
#nss_initgroups backlink

In light of this, I'm starting to think that maybe there's a problem with nss_ldap or pam_ldap, because even if I put

group:      files [SUCCESS=return] ldap [UNAVAIL=return]

in /etc/nsswitch.conf, it's as if everything except 'files' and 'ldap' is ignored. The above line is supposed to mean that if what is being looked for is found in the files, the search should stop. But it doesn't, and because slapd is not yet running, slapd (or the slapd init script) takes a long time to start while looking up data that it is itself supposed to serve. On the other hand, if I change /etc/nsswitch.conf as follows:

passwd:         files ldap
shadow          files ldap
group:          files

then slapd starts instantly without problems.

Comment 3 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2008-10-14 10:16:44 UTC
Is nscd running? It's an absolute requirement to do the fast startup.
Somewhere there is an NSS lookup that is going to LDAP because it's not found in the base passwd layer.

If your timeout is exactly 30 seconds, and you are using the EXACT timeout settings that the gentoo /etc/ldap.conf ships with, that means there are two lookups failing.

Here's the block to look for:
# For Gentoo's distribution of nss_ldap, as of 250-r1, we use these values
nss_reconnect_tries 4			# number of times to double the sleep time
nss_reconnect_sleeptime 1		# initial sleep value
nss_reconnect_maxsleeptime 16	# max sleep value to cap at
nss_reconnect_maxconntries 2	# how many tries before sleeping
# This leads to a delay of 15 seconds (1+2+4+8=15)

This specifically handles the case where the LDAP server is not reachable. timelimit and sizelimit only apply once the server is fully started.
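
The arithmetic in the comment block above can be reproduced with a few lines. This sketch only mirrors the stated sum (a sleep that starts at nss_reconnect_sleeptime, doubles nss_reconnect_tries times and is capped at nss_reconnect_maxsleeptime); it is not taken from the nss_ldap source, and the function name reconnect_sleep_schedule is hypothetical.

```python
def reconnect_sleep_schedule(tries: int, sleeptime: int, maxsleeptime: int) -> list:
    """Return the successive sleep durations, doubling each time up to a cap."""
    schedule = []
    delay = sleeptime
    for _ in range(tries):
        schedule.append(min(delay, maxsleeptime))
        delay *= 2
    return schedule


if __name__ == "__main__":
    # Values from the Gentoo ldap.conf block quoted above.
    schedule = reconnect_sleep_schedule(tries=4, sleeptime=1, maxsleeptime=16)
    per_lookup = sum(schedule)
    print(schedule, per_lookup)       # sleeps for one failing lookup
    print(2 * per_lookup)             # two failing lookups in a row
```

With the quoted values the schedule is 1, 2, 4, 8 seconds, so one failing lookup accounts for 15 seconds and two for 30, which matches the delays reported in the description.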

I suggest running nscd in debug mode to find which lookup isn't in /etc/passwd, and posting those details here.
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2008-10-14 10:26:42 UTC
Sorry, I missed this in your second comment:
> On the other hand if I change in /etc/nsswitch.conf as following:
> passwd:         files ldap
> shadow          files ldap
> group:          files
> then slapd starts instantly without problems.

This means that there are two groups that your system is trying to do a lookup of, and /etc/group doesn't contain them, so it goes to LDAP.

I have had reports of this problem before, but absolutely nobody reported back on the full tracing, they all just did various dumb hacks around it.

I'd really like to know what two groups are being looked up. Enable 'logfile' and 'debug-level' in /etc/nscd.conf and then sort through the chaff to find what the two lookups that are going to nss_ldap are. Alternatively, emerge nss_ldap with USE=debug, and maybe add more of your own debugging inside the C source. I strongly suggest using -DDEBUG_SYSLOG during the compile as well, and having some syslog rules to catch the debug entries - they can contain confidential data, so do take care.

Also, the 'debug' option in ldap.conf doesn't help at all. It's debugging for the LDAP libraries, not nss_ldap/pam_ldap. This is documented in the nss_ldap manpage.
Comment 5 Andrei Iordache 2008-10-14 19:39:48 UTC
Created attachment 168474 [details]
Test case for bug 222693
Comment 6 Andrei Iordache 2008-10-14 19:40:32 UTC
Comment on attachment 168474 [details]
Test case for bug 222693

Thanks for replying, Robin

I should start by saying that in the meantime I found an acceptable (for me) solution to this problem:

# Reconnect policy:
#  hard_open: reconnect to DSA with exponential backoff if
#             opening connection failed
#  hard_init: reconnect to DSA with exponential backoff if
#             initializing connection failed
#  hard:      alias for hard_open
#  soft:      return immediately on server failure
bind_policy soft

This goes in /etc/ldap.conf. If I set the bind_policy parameter to soft, the problem that I filed this bug for disappears. As far as I understand the parameter, it doesn't actually solve the problem but hides it, in the sense that if the first connection fails, no further connection attempts are made. I could be wrong though.

Now to come back to your suggestions.

First of all, I absolutely do not use nscd, because it caches the information and I often need it to propagate as soon as possible and nscd delays the changes I make to LDAP. I don't have it started so I cannot debug it as you suggested. Is there really no way to debug this problem without it? 

You say you need to know which groups are being looked up in LDAP. When I start slapd, the group being looked up in LDAP is 'ldap'. The user of the same name is looked up as well. Please look at the test case in the attachment to see how I know.
Comment 7 Andrei Iordache 2008-10-14 19:50:14 UTC
(In reply to comment #4)

Thanks for replying, Robin

I should start by saying that in the meantime I found an acceptable (for me) solution to this problem:

# Reconnect policy:
#  hard_open: reconnect to DSA with exponential backoff if
#             opening connection failed
#  hard_init: reconnect to DSA with exponential backoff if
#             initializing connection failed
#  hard:      alias for hard_open
#  soft:      return immediately on server failure
bind_policy soft

This goes in /etc/ldap.conf. If I set the bind_policy parameter to soft, the problem that I filed this bug for disappears. As far as I understand the parameter, it doesn't actually solve the problem but hides it, in the sense that if the first connection fails, no further connection attempts are made. I could be wrong though.

Now to come back to your suggestions.

First of all, I absolutely do not use nscd, because it caches the information and I often need it to propagate as soon as possible and nscd delays the changes I make to LDAP. I don't have it started so I cannot debug it as you suggested. Is there really no way to debug this problem without it? 

> Sorry, I missed this in your second comment:
> > On the other hand if I change in /etc/nsswitch.conf as following:
> > passwd:         files ldap
> > shadow          files ldap
> > group:          files
> > then slapd starts instantly without problems.
> 
> This means that there are two groups that your system is trying to do a lookup
> of, and /etc/group doesn't contain them, so it goes to LDAP.

How can it look for the group information in LDAP if the group line in nsswitch.conf contains only 'files'? Shouldn't it ONLY look in the passwd file in that case?

> I have had reports of this problem before, but absolutely nobody reported back
> on the full tracing, they all just did various dumb hacks around it.
> 
> I'd really like to know what two groups are being looked up. 

You say you need to know which groups are being looked up in LDAP. When I start slapd, the group being looked up in LDAP is 'ldap'. The user of the same name is looked up as well. Please look at the test case in the attachment to see how I know.

> Enable 'logfile'
> and 'debug-level' in /etc/nscd.conf and then sort through the chaff to find
> what the two lookups that are going to nss_ldap are. Alternatively, emerge
> nss_ldap with USE=debug, and maybe add more of your own debugging inside the C
> source. I strongly suggest using -DDEBUG_SYSLOG during the compile as well, and
> having some syslog rules to catch the debug entries - they can contain
> confidential data, so do take care.
> 
> Also, the 'debug' option in ldap.conf doesn't help at all. It's debugging for
> the LDAP libraries, not nss_ldap/pam_ldap. This is documented in the nss_ldap
> manpage.

Please let me know if this proof is not sufficient and I'll try to recompile nss_ldap with USE=debug to see what I can figure out that way, but I'm not that good at programming and debugging. I'll try my best though.

Comment 8 Till Korten 2009-06-07 00:41:59 UTC
I think I found a nice fix for this:
I added the following (very long) line to /etc/ldap.conf:

nss_initgroups_ignoreusers avahi,avahi-autoipd,backup,bin,daemon,dhcp,games,gdm,gnats,haldaemon,hplip,irc,klog,landscape,libuuid,list,lp,mail,man,messagebus,news,openldap,polkituser,proxy,pulse,root,sync,sys,syslog,uucp,www-data,mysql,ldap

I suggest rechecking this for sanity (I just copied it over from an Ubuntu ldap.conf and added mysql,ldap) and adding a sane line like this to the default ldap.conf in Gentoo.
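
A quick way to check whether a given account is covered by such a line is to parse the option out of the file. The sketch below is a minimal illustration, assuming the comma-separated format shown above and that anything after a '#' is a comment; the function name ignored_users is hypothetical.

```python
def ignored_users(ldap_conf_text: str) -> set:
    """Collect the users named on nss_initgroups_ignoreusers lines."""
    users = set()
    for raw in ldap_conf_text.splitlines():
        # Drop comments (assumed to start at '#') and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == "nss_initgroups_ignoreusers" and len(parts) == 2:
            users.update(u.strip() for u in parts[1].split(",") if u.strip())
    return users


if __name__ == "__main__":
    # Hypothetical excerpt; on a real system the text would be read
    # from /etc/ldap.conf instead.
    sample = (
        "#nss_initgroups_ignoreusers root,ldap\n"
        "nss_initgroups_ignoreusers root,mysql,ldap\n"
    )
    print("ldap" in ignored_users(sample))
```

A commented-out line, as in the default file described earlier in this report, contributes nothing to the result.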
Comment 9 Dennis Schridde 2010-04-03 20:56:30 UTC
I confirm that adding "nss_initgroups_ignoreusers ldap" to ldap.conf fixes the issue.
Comment 10 kikeitor 2011-06-08 17:58:25 UTC
I was investigating this problem and finally found the cause in my case. Here is the procedure I followed.

1. Stop the nscd service and run the following command:
 
 nscd -d

2. In another console, run:
 
bash -x /etc/init.d/slapd start

3. The script stalls when it tries to run "runuser". Look at the first console and you will see that some group could not be found.

4. Go to /etc/security/limits.d, search for that group and comment out the line that refers to it. In my case it was pulse-rt.

5. Try starting the slapd service again.

I hope this information is useful. Good luck.
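
The check in step 4 can also be done without watching nscd output, by comparing the @group entries in the limits files against the local group file. This is a rough sketch under a few assumptions: that limits entries name groups in the "@group" form in the first column, that '#' starts a comment, and that a group missing from /etc/group is the kind that falls through to LDAP. All function names here are hypothetical.

```python
def groups_in_limits(limits_text: str) -> set:
    """Return group names used as '@group' domains in limits-style text."""
    names = set()
    for raw in limits_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if fields and fields[0].startswith("@") and len(fields[0]) > 1:
            names.add(fields[0][1:])
    return names


def local_group_names(group_file_text: str) -> set:
    """Return the names defined in /etc/group-style text."""
    names = set()
    for line in group_file_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line.split(":", 1)[0])
    return names


def missing_locally(limits_text: str, group_file_text: str) -> list:
    """Groups referenced by the limits text but absent from the group file."""
    return sorted(groups_in_limits(limits_text) - local_group_names(group_file_text))


if __name__ == "__main__":
    # Hypothetical file contents; on a real system these would be read from
    # /etc/security/limits.d/* and /etc/group.
    limits = "@pulse-rt - rtprio 9\n@audio - nice -10\n"
    groups = "audio:x:18:\nldap:x:439:\n"
    print(missing_locally(limits, groups))
```

The group file is parsed directly rather than queried through the C library so that the check itself does not trigger the slow LDAP lookup.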
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-02-12 21:51:26 UTC
In CVS.