I've got two boxes running Gentoo (desktop and server) with glibc 2.7-r2 installed. nscd always crashes just after 10 minutes with some kind of memory error:

    nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.
    Aborted

I get the same error on both boxes. There is nothing in syslog about it. The daemon just silently dies after that. I only saw the above error running with the --debug switch. This is causing issues as I use LDAP users/groups so the boxes are constantly making LDAP queries instead of using the cache. It used to work fine, not sure when it broke exactly.

Reproducible: Always

Steps to Reproduce:
1. time /usr/sbin/nscd -d
2. wait about 10 mins

Actual Results:
    nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.
    Aborted

Expected Results:
nscd to continue working

Portage 2.1.5.2 (default-linux/amd64/2007.0/desktop, gcc-4.2.3, glibc-2.7-r2, 2.6.25-gentoo-r4 x86_64)
=================================================================
System uname: 2.6.25-gentoo-r4 x86_64 Intel(R) Core(TM)2 Quad CPU @ 2.40GHz
Timestamp of tree: Thu, 22 May 2008 11:46:01 +0000
app-shells/bash: 3.2_p39
dev-java/java-config: 1.3.7, 2.1.6
dev-lang/python: 2.5.2-r4
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 2.0.0
sys-apps/openrc: 0.2.4-r1
sys-apps/sandbox: 1.2.18.1-r2
sys-devel/autoconf: 2.13, 2.62
sys-devel/automake: 1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1-r1
sys-devel/binutils: 2.18-r1
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool: 1.5.26
virtual/os-headers: 2.6.25-r3
ACCEPT_KEYWORDS="amd64 ~amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=nocona -O2 -pipe -ggdb"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/splash /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=nocona -O2 -pipe -ggdb"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks parallel-fetch sandbox sfperms splitdebug strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LDFLAGS=""
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.lorenb.net/gentoo-portage"
USE="X acl acpi alsa amd64 berkdb cairo cdr cli cracklib crypt cups dbus dri dvd dvdr dvdread eds emboss encode esd evo fam firefox fortran gdbm gif gnome gpm gstreamer gtk hal iconv ipv6 isdnlog jpeg ldap mad midi mikmod mmx mp3 mpeg mudflap ncurses nls nptl nptlonly ogg opengl openmp oss pam pcre pdf perl png pppd python qt3support quicktime readline reflection sdl session spell spl sse sse2 ssl svg tcpd tiff truetype unicode vorbis xml xorg xv zlib"
ALSA_CARDS="hda-intel"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"
ELIBC="glibc"
INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
USERLAND="GNU"
VIDEO_CARDS="vesa i810"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
please retest with glibc-2.8
I installed sys-libs/glibc-2.8_p20080602 but it still crashes. It no longer always happens at 10 minutes; sometimes it is faster or slightly longer (the longest run in my tests was 12 minutes, the shortest was 2 minutes). I still get the same error: nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.
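For anyone who wants to collect time-to-crash numbers like the ones above, here is a minimal sketch of a timing wrapper. The helper name time_until_exit is hypothetical and not part of nscd or glibc; it merges stdout and stderr on the assumption that it does not matter which stream the debug output arrives on.

```python
import subprocess
import time


def time_until_exit(cmd, tail_lines=5):
    """Run cmd in the foreground until it exits.

    Returns (elapsed_seconds, return_code, last_output_lines).
    On POSIX a negative return code means the process was killed
    by a signal, which is what an abort after a failed assertion
    would look like.
    """
    start = time.monotonic()
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: capture both streams together
        text=True,
    )
    elapsed = time.monotonic() - start
    tail = proc.stdout.splitlines()[-tail_lines:]
    return elapsed, proc.returncode, tail
```

Calling time_until_exit(["/usr/sbin/nscd", "-d"]) as root would block until the daemon dies and then hand back the run time together with the last few lines it printed, which should include the assertion message if one was emitted.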
This might be a glibc bug: I had this problem on Gentoo a few years ago, and since then I have moved to Arch Linux, where I have the same problem. I have reported a bug there: http://bugs.archlinux.org/task/11165
Sorry, I pressed enter when the cursor was on the button rather than in the input text. My version of glibc is 2.8, but searching for the bug on Google reveals that this happens on Debian and SUSE too, which makes me think that the glibc developers should look at their code more closely.
I've put unscd into my portage overlay which might be of interest: http://svn.hurikhan.ath.cx/gentoo/trunk/sys-libs/unscd/ It's a redesigned implementation immune to most issues causing the nscd bugs. I'm currently test-running it on my machines.
And unscd is in the Portage tree now.
I have installed sys-libs/glibc-2.8_p20080602-r1 on about 10 machines, and nscd crashes on all of them.
Two days ago I updated glibc on a number of machines, from version 2.6.1 to version 2.8_p20080602-r1. The next morning I found that many of the updated machines (but not quite all) had crashed nscd processes. Nothing in the log, so I restarted the crashed instances in the foreground (nscd -d). Since then, I've recorded four more crashes. The exact error has varied but one instance looked just like the debug output captured above; the error does always come from mem.c; and immediately before it there has always been a "remove" line referring to the host's own address, either BYADDR or BYNAME. I'll paste the last two lines of each case here:

    9191: remove GETHOSTBYNAME entry "filthy"
    nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.

    25961: remove GETHOSTBYNAME entry "problems-test"
    nscd: mem.c:392: gc: Assertion `off_alloc == off_allocend' failed.

    12740: remove GETHOSTBYNAME entry "time"
    nscd: mem.c:399: gc: Assertion `next_hash == &he[db->head->nentries]' failed.

    12592: remove GETHOSTBYADDR entry "10.135.119.10"
    nscd: mem.c:310: gc: Assertion `off_alloc <= db->head->first_free' failed.

I tried searching the glibc bugzilla for some sign that this has already been reported, but didn't turn anything up.
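To compare crashes across many machines, the pairing described in this comment (last "remove" line, then the assertion) can be pulled out of a saved debug log. The following is a rough sketch that assumes the two line formats look exactly like the excerpts quoted in this comment; the function name summarize_crash and both regular expressions are my own illustration, not anything provided by nscd.

```python
import re

# Patterns modelled on the log lines quoted in this report (an assumption;
# other nscd versions may format their debug output differently).
REMOVE_RE = re.compile(r'^(\d+): remove (\w+) entry "([^"]*)"')
ASSERT_RE = re.compile(r"^nscd: ([\w.]+):(\d+): (\w+): Assertion `(.*)' failed\.")


def summarize_crash(lines):
    """Return details of the first assertion failure and the last
    'remove' entry logged before it, or None if no assertion is found."""
    last_remove = None
    for raw in lines:
        line = raw.strip()
        removed = REMOVE_RE.match(line)
        if removed:
            # Remember the request type and key of the most recent removal.
            last_remove = (removed.group(2), removed.group(3))
            continue
        failed = ASSERT_RE.match(line)
        if failed:
            return {
                "file": failed.group(1),
                "line": int(failed.group(2)),
                "function": failed.group(3),
                "expression": failed.group(4),
                "last_remove": last_remove,
            }
    return None
```

Feeding it the output captured from nscd -d on each host gives one small record per crash, which makes it easier to check whether the removed entry really is always the host's own name or address.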
you might want to look into switching to unscd ...
Same problem here. Profile: hardened/linux/amd64/2008.0/server glibc-2.8_p20080602-r1
We're hitting the same issue at the OSL with glibc-2.8 + hardened. I did some digging around, and Fedora [1] and Ubuntu [2] have a patch that appears to fix it. I haven't had a chance to test it myself but I can do that in the next few days. Worst case we'll switch to unscd, but in the meantime I'd like to at least give this a shot. I'll post a patch if it does indeed fix the problem.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=430324
[2] https://bugs.launchpad.net/ubuntu/intrepid/+source/glibc/+bug/256157
Created attachment 193324: patch to fix memory usage in nscd
The previous patch incorporates the following memory fixes:

http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00147.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00148.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00149.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00150.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00151.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00311.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00313.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00314.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00316.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00318.html
http://sources.redhat.com/ml/glibc-cvs/2008-q2/msg00320.html

I have tested this patch against glibc-2.8_p20080602-r1 and it appears to fix the memory issues we were having with nscd.
I also used to experience frequent random nscd crashes with glibc-2.8_p20080602. The crashes used to happen less than one hour after (re)starting nscd. With the new glibc-2.10.1 the problem seems to be fixed for me. No crash has happened since the upgrade, which was more than a week ago.
For what it's worth, I'm seeing the crashes with glibc-2.10.1, on a rather fresh ~amd64 system, gcc-4.4. Every time I connect to a network, any network, through either ethernet or wifi, nscd crashes right away. If I restart it manually it sticks. It gets old though. Denis.
On my system I recently had an nscd process that had consumed nearly 300MB of memory, which seems like quite a bit for a laptop. I'll try unscd.
nscd seems to behave itself in 2.11.2