Bug 83879 - nss_ldap-234 doesn't query the ldap server
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High major
Assignee: Robin Johnson
Reported: 2005-03-02 13:40 UTC by Michael Hanselmann (hansmi) (RETIRED)
Modified: 2005-08-27 07:38 UTC


Attachments
emerge-info (emerge-info,16.10 KB, text/plain)
2005-03-28 20:39 UTC, Matt Taylor

Description Michael Hanselmann (hansmi) (RETIRED) gentoo-dev 2005-03-02 13:40:26 UTC
I've updated nss_ldap from 226 to 234 today and I wasn't able to log in anymore. Remerging nss_ldap-234 fixed it. tcpdump showed that the client no longer contacts the server. I propose that we package.mask this version until we are able to fix it.

Reproducible: Always
Comment 1 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-02 17:48:30 UTC
Huh?
"I've updated nss_ldap from 226 to 234 today and I wasn't able to log in anymore. Remerging nss_ldap-234 fixed it."

Did you mean 226 in the second case?

nss_ldap-234 does work fine for me.
Comment 2 Michael Hanselmann (hansmi) (RETIRED) gentoo-dev 2005-03-03 00:53:06 UTC
Uh, sorry, yes. nss_ldap-226 works great, while 234 doesn't.
Comment 3 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-03 01:41:46 UTC
I'm using 234 on production boxes, with no problems at all.
I did see the behavior you are noting on 233.
Just on a hunch, could you reboot your 234 machine, to make sure any copy of 226 in the cache gets flushed?
Comment 4 Michael Hanselmann (hansmi) (RETIRED) gentoo-dev 2005-03-03 14:47:21 UTC
I've rebooted after remerging nss_ldap-234 and it still gives me "illegal user" in the log. But I've noticed another problem with the ldap credentials, so I'll have to investigate more. Maybe the configuration has an error. Don't do anything on this bug until further notice. :-)
Comment 5 Matt Taylor 2005-03-28 19:03:00 UTC
I just hit this bug. I was setting up a new box using 2005.0 and couldn't log in.

htpc ~ # emerge -p nss_ldap pam_ldap

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild   R   ] net-libs/nss_ldap-234  
[ebuild   R   ] net-libs/pam_ldap-176  
htpc ~ # su htpc
Unknown id: htpc

I went back and checked my other boxes and this is what I get:

poweredge ~ # emerge -p nss_ldap pam_ldap

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild   R   ] net-libs/nss_ldap-234  
[ebuild   R   ] net-libs/pam_ldap-176  
poweredge ~ # su htpc
Unknown id: htpc

delltop ~ # emerge -p nss_ldap pam_ldap

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild     U ] net-libs/nss_ldap-234 [226] 
[ebuild   R   ] net-libs/pam_ldap-176  
delltop ~ # su htpc
Creating directory '/home/htpc'.
htpc@delltop root $ 

The configs on all the boxes are identical.
Comment 6 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-28 19:50:49 UTC
liverbugg: first of all, on the affected machine, narrow it down to nss or pam, and make sure nscd is running.

On all of your boxes, I want your 'emerge -v info' output, as well as your version of linux-headers.
Comment 7 Matt Taylor 2005-03-28 20:32:33 UTC
getent passwd also works/doesn't work the same as su on all the boxes. As that's provided by glibc, it should have nothing to do with PAM, so it's definitely nss_ldap that's not working.

nscd isn't running on any boxes and never was running on any boxes.  But starting it changed nothing.

On the ldap server, nss_ldap-234 works, using the same ldap.conf as the other boxes.

All clients have linux-headers-2.6.8.1-r4; the server has linux-headers-2.4.22-r1.

I'll attach the emerge -v info output from all the boxes in one file.
Comment 8 Matt Taylor 2005-03-28 20:39:32 UTC
Created attachment 54737 [details]
emerge-info

emerge -v info from 4 boxes
Comment 9 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-28 23:06:25 UTC
Do either of you use /etc/ldap.secret?
I see that upstream has been busy with versions and is up to 238 now, which fixes a glitch in the handling of /etc/ldap.secret (the last character was sometimes removed). It also makes some other undocumented changes, so I'll put it into the tree tomorrow, in case it fixes things for you.

I agree that there is something weird.
I shelled into my work box (that was using nss_ldap-234), and it worked fine.
I then assembled the ebuild for 238 and tried it, and found that the machine did NOT connect to the LDAP server at all (checked via tcpdump). I downgraded to 234 again, and found that it now didn't work either.
Then I told the box to reboot to check whether that helped, but my machine didn't come back on its own. I probably left a CD in the drive, so it'll have to wait until tomorrow to get checked further.
Comment 10 Matt Taylor 2005-03-29 00:00:28 UTC
I don't use /etc/ldap.secret. I added nss_ldap-238 to my overlay and it behaves exactly as 234 does for me.
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-29 00:37:59 UTC
liverbugg: if you have some time on your hands, could you please give the following nss_ldap versions a quick try, by copying the other ebuild into your overlay and seeing whether they compile and whether 'getent -s ldap passwd (some-ldap-only-user)' works?
227
228
229
230
232

I'd suggest rebooting between each version, just to be 100% certain.
This should help us narrow it down to a specific change in nss_ldap (or at least cut down the search field significantly).
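
As a hypothetical aid for the narrowing-down requested above, the version list could be bisected rather than tested in order, which may save a few reboots. The sketch below assumes 226 is known good and 234 is known bad, as reported earlier in this bug. next_to_test is an illustrative helper name, not part of Portage or nss_ldap.

```sh
#!/bin/sh
# Sketch only: pick the version halfway between the last known-good and the
# first known-bad entry of an ordered version list.
# next_to_test is a hypothetical helper, not a Portage or nss_ldap tool.
#
# Usage: next_to_test GOOD BAD VERSION...
next_to_test() {
    good=$1
    bad=$2
    shift 2
    lo=0
    hi=0
    i=0
    # Find the positions of the good and bad versions in the list.
    for v in "$@"; do
        i=$((i + 1))
        if [ "$v" = "$good" ]; then lo=$i; fi
        if [ "$v" = "$bad" ]; then hi=$i; fi
    done
    # Both endpoints must be listed, with at least one version between them.
    if [ "$lo" -eq 0 ] || [ "$hi" -eq 0 ] || [ $((hi - lo)) -le 1 ]; then
        return 1
    fi
    mid=$(( (lo + hi) / 2 ))
    i=0
    for v in "$@"; do
        i=$((i + 1))
        if [ "$i" -eq "$mid" ]; then
            echo "$v"
            return 0
        fi
    done
}

# Example call with the versions named in this bug:
#   next_to_test 226 234 226 227 228 229 230 232 233 234
```

The function returns a non-zero status once no untested version remains between the two endpoints, at which point the first bad version is the one right after the last good one.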
Comment 12 Matt Taylor 2005-03-29 13:40:29 UTC
Here's my test process:

rm /etc/ldap.conf
emerge "=nss_ldap-2xx"
reboot
cp /etc/ldap.conf2 /etc/ldap.conf
getent -s ldap passwd htpc

The reason for removing the conf is that many of the broken versions hang during emerge waiting for the timeout, and emerge causes lots of queries, so it takes forever for them all to time out.

226-227 - works fine
228-233 - hangs with "nss_ldap: reconnecting to LDAP server (sleeping xx seconds)..." in syslog, then times out.
234+ - doesn't hang but doesn't work.  no messages in syslog

Rebooting seemed to be unnecessary, although I still did it just to be sure. If I didn't rm ldap.conf, emerging a broken version would immediately hang during the unmerging of the old version. If I have nscd running and upgrade to a broken version, it still works, I assume because of the caching. If I then reboot, it's broken.
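
The three outcomes above (works, hangs, returns nothing) could be told apart without waiting on a hung lookup by running getent under a time limit. This is a rough sketch, assuming the `timeout` command from GNU coreutils is available; the 30-second limit and the helper name classify_lookup are illustrative assumptions, not part of the original test process.

```sh
#!/bin/sh
# Sketch only: map the exit status of a time-limited lookup onto the three
# outcomes described above. classify_lookup is a hypothetical helper name.
#
# Intended call, after restoring /etc/ldap.conf:
#   timeout 30 getent -s ldap passwd htpc >/dev/null 2>&1
#   classify_lookup $?
classify_lookup() {
    case "$1" in
        0)   echo "works" ;;
        2)   echo "no result" ;;           # getent: key not found
        124) echo "hang (timed out)" ;;    # timeout: time limit reached
        *)   echo "other failure ($1)" ;;
    esac
}
```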
Comment 13 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-03-30 01:38:56 UTC
toolchain/glibc folk: did something in glibc's nss code change? Please read this comment.

after compiling with debugging, the problem gets even more confusing...
It works under SOME combinations of getent/nsswitch, but not others! I think something in glibc may have changed and may be responsible for some of this...

If I have 'ldap' on the nsswitch line for passwd, then this works 100% under nss_ldap-234 and nss_ldap-238:
'getent passwd $LDAPUSER'
it returns the correct output.

I haven't tested with SSL at all, as my LDAP setup is non-SSL.

However, 'getent -s ldap passwd $LDAPUSER' hangs when nscd is not running. (227 is the last version that I actively tested this command against, and it worked there; it definitely doesn't work for me in 234/238.) If nscd is running, it just doesn't give any results.

From the point of view of the nss_ldap code, it should make NO difference whether I run 'getent -s ldap ...' or have the ldap entry in nsswitch instead, UNLESS there was a change in glibc.

Output as follows:
x29 tests # getent -s ldap passwd pat
nss_ldap: ==> _nss_ldap_enter
nss_ldap: <== _nss_ldap_enter
nss_ldap: ==> _nss_ldap_getbyname
nss_ldap: ==> _nss_ldap_search_s
nss_ldap: ==> do_init
nss_ldap: ==> do_close_no_unbind
nss_ldap: <== do_close_no_unbind (connection was not open)
nss_ldap: ==> ldap_init
nss_ldap: ==> _nss_ldap_enter
nss_ldap: <== _nss_ldap_enter
nss_ldap: ==> _nss_ldap_getbyname
nss_ldap: ==> _nss_ldap_search_s
nss_ldap: ==> do_init
nss_ldap: ==> ldap_init
(hang here)

My emerge info output from my testing box:
Portage 2.0.51.19 (default-linux/x86/2005.0, gcc-3.4.3, glibc-2.3.4.20050125-r1, 2.6.10-gentoo-r4 i686)
=================================================================
System uname: 2.6.10-gentoo-r4 i686 AMD Athlon(tm) XP 3000+
Gentoo Base System version 1.6.10
Python:              dev-lang/python-2.3.5 [2.3.5 (#1, Feb 20 2005, 02:21:17)]
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
dev-lang/python:     2.3.5
sys-devel/autoconf:  2.59-r6, 2.13
sys-devel/automake:  1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r7
sys-devel/libtool:   1.5.14
virtual/os-headers:  2.6.8.1-r4
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-O3 -march=athlon-xp -ggdb3 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=athlon-xp -ggdb3 -pipe"
DISTDIR="/usr/portage-distfiles"
FEATURES="autoaddcvs autoconfig buildpkg ccache collision-protect confcache cvs digest distlocks sandbox sfperms userpriv nostrip"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j16"
PKGDIR="/usr/portage-packages"
PORTAGE_TMPDIR="/dev/shm"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://yamato/gentoo-portage"
USE="x86 3dnow X Xaw3d aalib acl acpi alsa amd apache2 apm arts avi berkdb bitmap-fonts caps cdr cgi clearpasswd crypt cscope cups curl divx4linux dri dts dvd dvdr emboss encode erandom escreen esd ethereal expat f77 faac faad fam flac flash foomaticdb fortran gcj gd gdbm gif glx gnome gpm gstreamer ieee1394 imagemagick imap imlib innodb ipalias ipv6 jabber jack java javascript jikes jpeg junit kde ldap libwww lm_sensors mad maildir mcal md5sum mikmod mmx motif mozcalendar mozdevelop mozsvg mozxmlterm mp3 mpeg multitarget nas ncurses nls nptl oav objc offensive oggvorbis opengl pam pcap pda pdflib perl pic plotutils png pnp ppds python quicktime rdesktop readline rpc samba scanner sdl slang slp snmp socks5 speex spell sqlite sse ssl tcltk tcpd tetex theora tidy tiff truetype truetype-fonts type1 type1-fonts ungif usb userlocales v4l v4l2 wifi wmf wxwindows xinerama xml xml2 xmms xosd xrandr xscreensaver xv xvid zlib linguas_en"
Unset:  ASFLAGS, CBUILD, CTARGET, LANG, LC_ALL, LDFLAGS
Comment 14 SpanKY gentoo-dev 2005-03-30 14:57:23 UTC
i don't think any of us toolchain peeps pay attention to the nss code ... i'm pretty sure we've never patched it
Comment 15 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-04-29 13:24:16 UTC
I've put nss_ldap-238 into the tree now, please try it.

toolchain:
there is definitely a change in the getent/nss stuff.
Previously 'getent -s SOURCE TYPE [FOO]' would use SOURCE only for TYPE data, and use the other sources in nsswitch.conf for other data during the same call (e.g. nss_ldap might need to do a host lookup to find the server).
The newer versions of glibc apply SOURCE to ALL types of data during the getent call, so 'getent -s ldap passwd' fails when nss_ldap needs to use the file-based or DNS-based data to find the LDAP server.
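
If the explanation above is right, a setup is only exposed when the server named in ldap.conf has to be resolved by name. The following sketch lists the entries on `host` lines and flags those that are not IPv4 literals. It assumes the server is configured with the `host` keyword (it does not parse `uri` lines or IPv6 addresses), and the helper names ldap_hosts and needs_name_lookup are hypothetical.

```sh
#!/bin/sh
# Sketch only: flag ldap.conf server entries that would need a name lookup.
# ldap_hosts and needs_name_lookup are hypothetical helper names.

# Print every word that follows the `host` keyword, one per line.
ldap_hosts() {
    awk '$1 == "host" { for (i = 2; i <= NF; i++) print $i }' "$1"
}

# Succeed (status 0) if the argument does not look like a dotted IPv4
# literal, i.e. resolving it would go through the hosts database.
# This is a rough pattern check, not a full address validator.
needs_name_lookup() {
    case "$1" in
        ""|*[!0-9.]*) return 0 ;;   # empty, or has non-digit, non-dot characters
        *.*.*.*)      return 1 ;;   # digits and dots only, at least three dots
        *)            return 0 ;;
    esac
}

# Example use on a client:
#   for h in $(ldap_hosts /etc/ldap.conf); do
#       if needs_name_lookup "$h"; then echo "$h needs a hosts lookup"; fi
#   done
```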
Comment 16 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-04-29 13:26:27 UTC
This brokenness is definitely a result of glibc changes, not nss_ldap changes.
I've tried a few glibc versions now, and it seems to work in some of them but not others, and sometimes it works, sometimes it doesn't.
Comment 17 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2005-07-06 16:41:37 UTC
please test as requested 2 months ago.
Comment 18 Niels Laukens 2005-08-27 07:24:29 UTC
I just set up a machine with nss_ldap-239 and came across this problem. Downgrading to 226 fixed the problem.

Does this help? Or is your fix only in 238 and not in 239?
Comment 19 Michael Hanselmann (hansmi) (RETIRED) gentoo-dev 2005-08-27 07:38:20 UTC
nss_ldap-239-r1 works for me on ~ppc.