Bug 146672 - nss_ldap compiled with GCC 4.1.1 trigger segfaults
Summary: nss_ldap compiled with GCC 4.1.1 trigger segfaults
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Library
Hardware: x86 Linux
Importance: High blocker
Assignee: Robin Johnson
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-07 04:19 UTC by Lionel Bouton
Modified: 2006-09-29 13:32 UTC

See Also:
Package list:
Runtime testing required: ---


Description Lionel Bouton 2006-09-07 04:19:41 UTC
I upgraded my box with the shining new 2006.1 profile. It seems that the new Glibc 2.4 / GCC 4.1.1 combination breaks nss_ldap.
Before the upgrade (done by following the GCC Upgrading Guide to the letter, with a switch of profile beforehand) my LDAP authentication worked like a charm. Now, as soon as the box tries to look up something in the LDAP server, the process involved segfaults.

nsswitch.conf:
-- BEGIN --
passwd:    files ldap
shadow:    files ldap
group:     files ldap

hosts:       files dns
networks:    files dns

services:    db files
protocols:   db files
rpc:         db files
ethers:      db files
netmasks:    files
netgroup:    files
bootparams:  files

automount:   files
aliases:     files
-- END --
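
A quick way to confirm which databases go through nss_ldap is to scan the file for `ldap` sources. This is only a minimal sketch: the function name `ldap_databases` is hypothetical, and it assumes the usual `database: source ...` layout shown above.

```shell
# Hypothetical helper: print each NSS database whose source list includes "ldap".
# $1 is the path to an nsswitch.conf-style file.
ldap_databases() {
    awk '
        {
            sub(/#.*/, "")              # drop comments, whole-line or trailing
            db = $1
            sub(/:$/, "", db)           # "passwd:" -> "passwd"
            for (i = 2; i <= NF; i++) {
                if ($i == "ldap") { print db; break }
            }
        }
    ' "$1"
}

# Example: ldap_databases /etc/nsswitch.conf
```

For the file quoted above this prints passwd, shadow and group, which are the lookups that would have to be switched back to `files` alone to take nss_ldap out of the picture.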

On the server side, I see this for an outside connection to the ssh server, which is closed immediately (the forked sshd process is dead before the password prompt):

-- BEGIN --
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 fd=12 ACCEPT from IP=127.0.0.1:39203 (IP=0.0.0.0:389)
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=0 BIND dn="" method=128
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=0 RESULT tag=97 err=0 text=
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=1 SRCH base="dc=home,dc=bouton,dc=name" scope=2 deref=0 filter="(uid=root)"
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=1 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=2 BIND dn="uid=root,ou=Users,dc=home,dc=bouton,dc=name" method=128
Sep  7 12:23:39 quiet slapd[15484]: slap_global_control: unrecognized control: 1.3.6.1.4.1.42.2.27.8.5.1
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=2 RESULT tag=97 err=49 text=
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=3 BIND dn="" method=128
Sep  7 12:23:39 quiet slapd[15484]: conn=3325 op=3 RESULT tag=97 err=0 text=
Sep  7 12:23:42 quiet slapd[15484]: conn=3325 fd=12 closed (connection lost)
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 fd=12 ACCEPT from IP=127.0.0.1:39226 (IP=0.0.0.0:389)
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 op=0 BIND dn="" method=128
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 op=0 RESULT tag=97 err=0 text=
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 op=1 SRCH base="dc=home,dc=bouton,dc=name" scope=2 deref=0 filter="(uid=root)"
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 op=1 SEARCH RESULT tag=101 err=0 nentries=1 text=
Sep  7 12:23:44 quiet slapd[15484]: conn=3326 fd=12 closed (connection lost)
-- END --
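
To pick out the connections that ended without a clean close, the log can be filtered for the "connection lost" marker. This is a rough sketch under the assumption that the log keeps the syslog layout shown above; `lost_connections` is a hypothetical name.

```shell
# Hypothetical helper: list the conn= ids that slapd logged as "connection lost".
# $1 is the path to a slapd log in the syslog format shown above.
lost_connections() {
    awk '
        /closed \(connection lost\)/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^conn=/) { print substr($i, 6); break }
            }
        }
    ' "$1"
}

# Example: lost_connections /var/log/messages   (log path is an assumption)
```

On the excerpt above it reports 3325 and 3326, that is, both connections, which is what one would expect if the client process died rather than closing the session itself.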

I've another box with LDAP authentication on the same server working correctly (no GCC/Glibc upgrade for it) so the problem is clearly on the client side.

As soon as I remove any ldap reference from the nsswitch.conf file, everything becomes stable.

I've tried both with and without nscd running. When nscd is running there's a twist:
with ldap in nsswitch.conf, processes still segfault, but the nscd process itself seems able to do lookups: when I remove the ldap references from the nsswitch.conf file, getent calls still return users and groups that are only defined in the LDAP server.

I tried both the x86 nss_ldap-249 and ~x86 nss_ldap-252. Even a fresh reboot with 252 doesn't solve the problem (assuming an old library might have been loaded in memory).

emerge --info:

Portage 2.1-r2 (default-linux/x86/2006.1, gcc-4.1.1, glibc-2.4-r3, 2.6.17-gentoo-r7 i686)
=================================================================
System uname: 2.6.17-gentoo-r7 i686 Unknown CPU Typ
Gentoo Base System version 1.12.4
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
app-admin/eselect-compiler: [Not Present]
dev-lang/python:     2.4.3-r1
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.3
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.59-r7
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.13-r3
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.11-r5
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O3 -march=athlon-xp -fomit-frame-pointer -ffast-math -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/NX/etc /usr/NX/home /usr/share/X11/xkb /var/bind"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O3 -march=athlon-xp -fomit-frame-pointer -ffast-math -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict userpriv usersandbox"
GENTOO_MIRRORS="http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp.roedu.net/pub/mirrors/gentoo.org/"
LINGUAS="en fr"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="x86 3dnow 3dnowext X aac aalib aim alsa bash-completion bitmap-fonts bzip2 cairo cli crypt cups djbfft dlloader dri dvd dvdread emacs fam fastcgi firefox flac glut gtkhtml icq imagemagick iproute2 ipv6 isdnlog jabber java jikes jpeg lcms lzo makecheck matroska memcache mmx mmx2 mmxext mozilla mp4 msn ncurses nls nptl nptlonly nsplugin pam pcre png ppds pppd rdesktop readline real reflection rrdtool ruby scanner session speex spl sse ssl theora threads tiff truetype truetype-fonts type1-fonts udev unicode usb vnc xorg xvid yahoo zlib elibc_glibc input_devices_keyboard input_devices_mouse kernel_linux linguas_en linguas_fr userland_GNU video_cards_mga"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Lionel Bouton 2006-09-07 06:55:40 UTC
I just did the following:
- emerge gcc-3.4.6
- gcc-config i686-pc-linux-gnu-3.4.6
- source /etc/profile
- emerge nss_ldap (x86 249)
- activate ldap in nsswitch.conf
- opened a new ssh session successfully

It rules out a glibc-2.4 problem. This is definitely a nss_ldap/gcc-4.1.1 problem.
Comment 2 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-07 18:12:48 UTC
Build BOTH openldap and nss_ldap with the same version of GCC4, and test again.
Comment 3 Lionel Bouton 2006-09-08 02:04:27 UTC
Both were originally compiled with gcc-4.1.1 as I followed the GCC Upgrade Guide.

I believe that after:
- gcc-config i686-pc-linux-gnu-4.1.1
- source /etc/profile
- emerge -e system && emerge -e world
- a reboot.

there's no way to have openldap and nss_ldap compiled with anything other than gcc-4.1.1, either on disk or in memory, is there?

Anyway, the machine has time to spare. I'll make a tbz2 of the currently working, 3.4.6-compiled nss_ldap and do another

emerge openldap nss_ldap

with gcc-4.1.1 just to be sure. I'll post the result shortly (if I don't lose my ssh access in the process...).
Comment 4 Lionel Bouton 2006-09-08 02:29:10 UTC
After rebuilding both openldap and nss_ldap with 4.1.1 the segfaults are back as soon as ldap is in nsswitch.conf.

Reverting to a nss_ldap compiled with 3.4.6 solves the problem (again).
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-08 10:44:06 UTC
Hmm, I'll dig more.
The reason I asked for the rebuild is that I have seen some weirdness when nss_ldap was rebuilt before openldap during the GCC4 upgrade.
1. Please attach your /etc/ldap.conf file.
2. What version of openldap were you using?
Comment 6 Lionel Bouton 2006-09-08 15:00:04 UTC
Ok, I understand the concern for a mismatch between the openldap used for build and run.

I use openldap-2.3.24-r1 (and nss_ldap-249).

Here's the ldap.conf content, filtered with grep -v '^ *\(#\|$\)' /etc/ldap.conf to spare you the comments:

host 127.0.0.1
base dc=home,dc=bouton,dc=name
nss_reconnect_tries 1
nss_reconnect_sleeptime 1
nss_reconnect_maxsleeptime 1
nss_reconnect_maxconntries 3

Comment 7 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-27 18:02:31 UTC
could you please provide the output of:
ldd /lib/libnss_ldap-2.4.so

also, try the latest ~arch nss_ldap while you are at it.


I still haven't reproduced this problem at all, and my entire machine is built with GCC4.1.1.
Comment 8 Lionel Bouton 2006-09-27 18:42:25 UTC
x86 nss_ldap (249) compiled with gcc-3.4.6 (ok):
ldd /lib/libnss_ldap-2.4.so
        linux-gate.so.1 =>  (0xffffe000)
        libldap-2.3.so.0 => /usr/lib/libldap-2.3.so.0 (0xb7f9c000)
        liblber-2.3.so.0 => /usr/lib/liblber-2.3.so.0 (0xb7f90000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7f8c000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xb7f77000)
        libresolv.so.2 => /lib/libresolv.so.2 (0xb7f65000)
        libc.so.6 => /lib/libc.so.6 (0xb7e49000)
        libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0xb7e16000)
        libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0xb7d07000)
        /lib/ld-linux.so.2 (0x80000000)

with 4.1.1 (segfaults):
        linux-gate.so.1 =>  (0xffffe000)
        libldap-2.3.so.0 => /usr/lib/libldap-2.3.so.0 (0xb7ef1000)
        liblber-2.3.so.0 => /usr/lib/liblber-2.3.so.0 (0xb7ee5000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7ee1000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xb7ecc000)
        libresolv.so.2 => /lib/libresolv.so.2 (0xb7eba000)
        libc.so.6 => /lib/libc.so.6 (0xb7d9e000)
        libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0xb7d6b000)
        libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0xb7c5c000)
        /lib/ld-linux.so.2 (0x80000000)

~x86 nss_ldap (253) with 4.1.1 (segfaults):
        linux-gate.so.1 =>  (0xffffe000)
        libldap-2.3.so.0 => /usr/lib/libldap-2.3.so.0 (0xb7ea2000)
        liblber-2.3.so.0 => /usr/lib/liblber-2.3.so.0 (0xb7e96000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7e92000)
        libnsl.so.1 => /lib/libnsl.so.1 (0xb7e7d000)
        libresolv.so.2 => /lib/libresolv.so.2 (0xb7e6b000)
        libc.so.6 => /lib/libc.so.6 (0xb7d4f000)
        libssl.so.0.9.7 => /usr/lib/libssl.so.0.9.7 (0xb7d1c000)
        libcrypto.so.0.9.7 => /usr/lib/libcrypto.so.0.9.7 (0xb7c0d000)
        /lib/ld-linux.so.2 (0x80000000)
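
Since load addresses change from one listing to the next, comparing ldd output is easier once they are stripped. A small sketch of that idea follows; `strip_ldd_addresses` is a hypothetical name and the pattern assumes the `(0x...)` suffix format seen above.

```shell
# Hypothetical helper: remove the trailing "(0x...)" load address from ldd output
# read on stdin, so that two listings can be compared with diff.
strip_ldd_addresses() {
    sed 's/[[:space:]]*(0x[0-9a-fA-F]*)$//'
}

# Example: ldd /lib/libnss_ldap-2.4.so | strip_ldd_addresses > build-a.txt
```

Filtered this way, the three listings above come out identical: every build resolves the same libraries from the same paths, so the linkage itself does not distinguish the working build from the crashing ones.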

I've got another box (another Athlon-XP) broken since my last report. But for this one, I've no solution: although the first two were fixed by emerging nss_ldap with gcc 3.4.6, on this third box even this doesn't fix the problem.
All boxes have got the emerge -e system && emerge -e world treatment, even twice for some of them. Seems like a corner case linked to CFLAGS or USE-flags.

Here's the emerge --info of the third box:
Portage 2.1.1 (default-linux/x86/2006.1/desktop, gcc-4.1.1, glibc-2.4-r3, 2.6.16-gentoo-r6 i686)
=================================================================
System uname: 2.6.16-gentoo-r6 i686 AMD Athlon(tm) XP 1600+
Gentoo Base System version 1.12.5
Last Sync: Thu, 21 Sep 2006 17:50:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
app-admin/eselect-compiler: [Not Present]
dev-java/java-config: 1.3.6-r1, 2.0.28-r1
dev-lang/python:     2.3.5-r2, 2.4.3-r1
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.3
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.59-r7
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.13-r3
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer -ffast-math -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/NX/etc /usr/NX/home /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer -ffast-math -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distcc distlocks metadata-transfer parallel-fetch sandbox sfperms strict userpriv usersandbox"
GENTOO_MIRRORS="http://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ http://ftp.club-internet.fr/pub/mirrors/gentoo http://pandemonium.tiscali.de/pub/gentoo/"
LINGUAS="fr"
MAKEOPTS="-j13"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync-gentoo.inet6-interne.fr/gentoo-portage"
USE="x86 3dnow 3dnowext X aac aalib alsa arts bash-completion berkdb bindist bitmap-fonts browserplugin bzip2 cairo cdr cjk cli crypt cups dbus dga dlloader dri dvd dvdr eds elibc_glibc emacs emboss encode esd fam firefox flac gdbm gif gimpprint gpm gstreamer hal input_devices_evdev input_devices_keyboard input_devices_mouse isdnlog java jikes jpeg kernel_linux lcms ldap libcaca libg++ linguas_fr mad matroska mikmod mmx mmxext mng mp3 mpeg ncurses nls nptl nptlonly offensive ogg opengl oss pam pcre perl png ppds pppd qt3 qt4 quicktime readline real reflection ruby samba sdl session spell spl sse ssl threads tiff truetype truetype-fonts type1-fonts udev unicode userland_GNU video_cards_mga video_cards_s3 video_cards_s3virge vorbis win32codecs xinerama xml xorg xprint xv xvid xvmc zlib"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 9 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-27 18:46:52 UTC
Why are you using -ffast-math? It is a known cause of breakage.

Please rebuild without fast-math, and make sure objects cached by ccache from earlier fast-math builds are not reused, because we don't want them to break things now.
Rebuilding glibc, openssl, openldap, and then nss_ldap should be sufficient to test this.
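
The request above could be carried out roughly as follows. This is a hedged sketch, not a tested procedure: `strip_fast_math` is a hypothetical helper, and the make.conf location and the ccache directory used by Portage are assumptions to check on the actual system.

```shell
# Hypothetical helper: print a flag string with every -ffast-math token removed.
strip_fast_math() (
    set -f                      # no filename expansion while splitting the flags
    out=""
    for flag in $1; do
        [ "$flag" = "-ffast-math" ] || out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
)

strip_fast_math "-O3 -march=athlon-xp -fomit-frame-pointer -ffast-math -pipe"
# prints: -O3 -march=athlon-xp -fomit-frame-pointer -pipe

# After editing CFLAGS and CXXFLAGS in /etc/make.conf accordingly (assumed location),
# clear the compiler cache and rebuild the four packages:
#   CCACHE_DIR=/var/tmp/ccache ccache -C      # cache path is an assumption
#   emerge --oneshot glibc openssl openldap nss_ldap
```

Clearing the cache matters here because ccache could otherwise hand back objects that were compiled with the old flags.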
Comment 10 Lionel Bouton 2006-09-29 12:36:08 UTC
emerging without fast-math solves the case!
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2006-09-29 13:32:45 UTC
Closing as invalid since it was fast-math causing the brokenness.
Please heed the warnings in the handbook about fast-math in future.