$ emerge --info
Portage 2.1_pre6-r2 (default-linux/x86/2005.1, gcc-3.4.5, glibc-2.3.6-r3, 2.6.15-gentoo-r5 i686)
=================================================================
System uname: 2.6.15-gentoo-r5 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz
Gentoo Base System version 1.12.0_pre16
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.4 [disabled]
dev-lang/python: 2.3.5-r2, 2.4.2-r1
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1-r2
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r3
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-Os -march=pentium4 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-Os -march=pentium4 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_GB"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/gentoo/overlay"
SYNC="rsync://apps.ramix-uk.cho.ge.com/portage"
USE="x86 X afs alsa apache2 apm avi berkdb bitmap-fonts browserplugin cairo cli crypt cscope ctype cups dba dbus dri dvd eds emboss encode expat fastbuild firefox foomaticdb force-cgi-redirect fortran ftp gd gdbm gif gnome gpm gstreamer gtk gtk2 hal imlib ipod jpeg kde kerberos krb4 ldap libg++ libwww mad memlimit mikmod mono motif mp3 mpeg ncurses nls nptl nsplugin ogg oggvorbis opengl oss pam pcre pda pdflib perl png posix python quicktime readline samba sasl sdl session simplexml soap sockets spell spl sse ssl tcpd tokenizer truetype truetype-fonts type1-fonts unicode vorbis xml xml2 xmms xsl xv zlib elibc_glibc kernel_linux userland_GNU"
Unset: ASFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, LC_ALL, LDFLAGS, LINGUAS

After emerging glibc-2.4, running aklog results in a segmentation fault. After compiling with debug enabled, I got the following backtrace:

#0 savecontext (ep=0, savearea=0x80988dc, sp=0xb7dd600c "üýþÿ") at process.c:197
#1 0x080720df in LWP_CreateProcess (ep=0, stacksize=196608, priority=0, parm=0x0, name=0x0, pid=0x0) at lwp.c:386
#2 0x08072e68 in IOMGR_Initialize () at iomgr.c:823
#3 0x08070cc4 in rxi_InitializeThreadSupport () at rx_lwp.c:117
#4 0x08064a65 in rx_InitHost (host=0, port=0) at rx.c:409
#5 0x08064cb9 in rx_Init (port=0) at rx.c:550
#6 0x0804c9fd in pr_Initialize (secLevel=0, confDir=0x8080bc0 "/etc/openafs", cell=0xbfda8dc0 "gefes.com") at ptuser.c:169
#7 0x0804b005 in auth_to_cell (context=0x8096058, cell=0x100 <Address 0x100 out of bounds>, realm=0x0) at aklog_main.c:715
#8 0x0804c2b5 in aklog (argc=1, argv=0xbfdb7484) at aklog_main.c:1412
#9 0x08049f2e in main (argc=0, argv=0x0) at aklog.c:20

Reverting this system to glibc-2.3.6-r3 resolved the problem.
Confirmed on x86: emerging glibc-2.4 crashed my whole fileserver. The simplest way to reproduce it was to run "bos" (I do not have the kerberos flag set, so I don't have aklog), which simply segfaults. It gives a stack trace comparable to the one in the original comment. Rebuilding openafs using glibc-2.4 doesn't solve anything.

I am not able to reproduce this on amd64 for the moment. (I don't have a fileserver on that machine, but at least "bos" doesn't segfault right away.)

Running valgrind on "bos" yields, among other things:

==18267== Warning: client switching stacks? SP change: 0xBEAAD4BC --> 0xE6E65CB0

which is consistent with the very strange readings gdb gave me. I suspect something nasty in that code. (Though on glibc-2.3.6 I get the same valgrind warning, but no segfault.)

I have no clue whatsoever at this moment, but I suspect errors in glibc-2.4, as there seem to be more complaints about it. I am adding a blocking dependency to openafs-1.4.0-r2, hoping this will minimize the number of people affected by this bug.
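For readers unfamiliar with the valgrind warning quoted above: a jump of the stack pointer to a distant address is what a user-level threading package produces when it starts running code on a stack it allocated itself, which is what LWP_CreateProcess (called with stacksize=196608 in the backtrace) sets up. The sketch below illustrates the general idea only. It uses the POSIX ucontext calls as a stand-in and is not the savecontext code from process.c; demo_task, run_demo_task and DEMO_STACK_SIZE are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024) /* hypothetical size, chosen for the demo */

static ucontext_t main_ctx; /* saved context of the caller */
static ucontext_t task_ctx; /* saved context of the user-level task */
static int task_steps;      /* bumped by the task so the caller can observe progress */

/* Body of the user-level task. It runs on its own heap-allocated stack. */
static void demo_task(void)
{
    task_steps++;
    swapcontext(&task_ctx, &main_ctx); /* yield back to the caller */
    task_steps++;
    /* Returning from here resumes uc_link, i.e. main_ctx. */
}

/* Runs the task in two slices; returns the number of steps it took, or -1 on failure. */
int run_demo_task(void)
{
    void *stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    task_steps = 0;
    if (getcontext(&task_ctx) != 0) {
        free(stack);
        return -1;
    }
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
    task_ctx.uc_link = &main_ctx;
    makecontext(&task_ctx, demo_task, 0);

    /* First slice: the stack pointer moves from the process stack to the heap block. */
    swapcontext(&main_ctx, &task_ctx);
    assert(task_steps == 1);

    /* Second slice: the task resumes after its yield and then finishes. */
    swapcontext(&main_ctx, &task_ctx);

    free(stack);
    return task_steps;
}
```

Calling run_demo_task() from main switches stacks four times (into the task and back, twice). A tool that tracks the stack pointer may flag such switches in the same way as the warning above, which is why the warning alone, as noted, also shows up on glibc-2.3.6 without a crash.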
OpenAFS 1.4.1rc10 works fine on AMD64 with glibc 2.4 - both server and client (I needed 1.4.1rc10 since 1.4.0 is incompatible with linux 2.6.16; took ebuild for 1.4.0 and fixed 001_all_compiler-settings.patch - the rest applies without problems)...
(In reply to comment #2)
> OpenAFS 1.4.1rc10 works fine on AMD64 with glibc 2.4 - both server and client
> (I needed 1.4.1rc10 since 1.4.0 is incompatible with linux 2.6.16; took ebuild
> for 1.4.0 and fixed 001_all_compiler-settings.patch - the rest applies without
> problems)...
>

Were you actually able to reproduce the problem with OpenAFS 1.4.0 on AMD64 with glibc 2.4? I have only seen this on x86 as of yet.
Hmm. Probably not. I did get a segfaulting "bos" command, but now I cannot reproduce it (even after downgrading from 1.4.1rc10 to 1.4.0). It segfaulted in the middle of the AFS server installation, but once the server was installed the problem disappeared... Perhaps the problem was not with glibc 2.4 but with openafs itself?

Question: glibc 2.4 currently blocks openafs not just on x86 but on AMD64 too, while openafs is happy with glibc 2.4 on AMD64. Why keep this block in place?
(In reply to comment #4)
> Hmm. Probably not. I've got segfaulting "bos" command, but now I can not
> reproduce it (even with 1.4.1rc10=>1.4.0 downgrade). It segfaulted in the
> middle of AFS server installation but once server was installed problem
> disappeared... Perhaps problem was not with glibc 2.4 but with openafs itself ?
>
> Question: glibc 2.4 blocks openafs right now not just on x86 but on AMD64 too
> while openafs is happy with glibc 2.4 on AMD64 - why keep this block in place ?

If you had a segfaulting "bos" command, that seems ample reason to me. I blocked glibc-2.4 on all platforms because I had no reason to assume the bug was in x86-specific code. Your getting a segfault on amd64 as well seems to support this. Of course, the question remains how to reproduce it and how to fix it.

In the meantime, bug reports about this problem seem to be appearing on the upstream mailing list, so I'm confident there will be a fix soon. I hope it will find its way into openafs-1.4.1.
I had the same problems with klog and aklog (segfaults), but I found a solution that works for me on x86 with glibc 2.4: fixing the savecontext function in process.c:
http://www.archivesat.com/OpenAFS_Developers/thread236068.htm
(In reply to comment #6)
> I had the same problems with klog and aklog (segfaults) but I found a solution
> that works for me on x86 with glibc 2.4 by fixing the savecontext function in
> process.c :
> http://www.archivesat.com/OpenAFS_Developers/thread236068.htm
>

Yes, but this is the x86-only fix, if I'm correct. I've heard a better solution is in the cvs tree. I feel it's better to wait for that to appear in a release.
Put openafs-1.4.1 in the tree. It fixes the incompatibility with glibc-2.4 (at least on my system). One note: emerging openafs first, and then glibc, still gave me the same error. Remerging openafs here fixed the problem. So I suppose the fix detects the installed glibc version and chooses an implementation according to that.
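If the supposition above is right and the choice is made at build time, that would explain why openafs has to be remerged after glibc: a binary compiled against the old headers keeps the old code path regardless of which glibc is installed later. The sketch below shows what such a build-time check can look like in general. It is not taken from the openafs sources; built_against_glibc_2_4 and glibc_versions_match are hypothetical helper names, while __GLIBC__, __GLIBC_MINOR__ and gnu_get_libc_version() are provided by glibc itself.

```c
#include <assert.h>
#include <stdio.h>
#include <features.h>
#include <gnu/libc-version.h>

/* Returns 1 if this code was COMPILED against glibc >= 2.4, else 0.
 * The answer is frozen into the binary at build time. */
int built_against_glibc_2_4(void)
{
#if defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 4))
    return 1; /* a build would select its glibc-2.4-safe implementation here */
#else
    return 0; /* a build would keep its older implementation here */
#endif
}

/* Returns 1 if the glibc found at RUN time has the same major.minor version as
 * the headers used at build time, 0 if it differs, -1 if parsing failed. */
int glibc_versions_match(void)
{
    const char *runtime = gnu_get_libc_version(); /* version string, e.g. "2.4" */
    int major = 0, minor = 0;

    assert(runtime != NULL);
    if (sscanf(runtime, "%d.%d", &major, &minor) != 2)
        return -1;
    return major == __GLIBC__ && minor == __GLIBC_MINOR__;
}
```

A program built before the glibc upgrade and run after it would see glibc_versions_match() return 0 while built_against_glibc_2_4() still returns 0, which matches the observation that emerging openafs first and glibc second still gave the error.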