Bug 126420 - Openafs and glibc-2.4 failures
Summary: Openafs and glibc-2.4 failures
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: x86 Linux
Importance: High normal
Assignee: Stefaan De Roeck (RETIRED)
Reported: 2006-03-16 07:00 UTC by Martin Donnelly
Modified: 2006-04-22 03:38 UTC (History)
Comment 1 Martin Donnelly 2006-03-16 07:00:10 UTC
$ emerge --info
Portage 2.1_pre6-r2 (default-linux/x86/2005.1, gcc-3.4.5, glibc-2.3.6-r3, 2.6.15-gentoo-r5 i686)
=================================================================
System uname: 2.6.15-gentoo-r5 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz
Gentoo Base System version 1.12.0_pre16
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.4 [disabled]
dev-lang/python:     2.3.5-r2, 2.4.2-r1
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.59-r7
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1-r2
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.11-r3
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-Os -march=pentium4 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-Os -march=pentium4 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="en_GB"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/gentoo/overlay"
SYNC="rsync://apps.ramix-uk.cho.ge.com/portage"
USE="x86 X afs alsa apache2 apm avi berkdb bitmap-fonts browserplugin cairo cli crypt cscope ctype cups dba dbus dri dvd eds emboss encode expat fastbuild firefox foomaticdb force-cgi-redirect fortran ftp gd gdbm gif gnome gpm gstreamer gtk gtk2 hal imlib ipod jpeg kde kerberos krb4 ldap libg++ libwww mad memlimit mikmod mono motif mp3 mpeg ncurses nls nptl nsplugin ogg oggvorbis opengl oss pam pcre pda pdflib perl png posix python quicktime readline samba sasl sdl session simplexml soap sockets spell spl sse ssl tcpd tokenizer truetype truetype-fonts type1-fonts unicode vorbis xml xml2 xmms xsl xv zlib elibc_glibc kernel_linux userland_GNU"
Unset:  ASFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, LC_ALL, LDFLAGS, LINGUAS


After emerging glibc-2.4, running aklog results in a segmentation fault. After compiling with debug enabled, I got the following backtrace:

#0  savecontext (ep=0, savearea=0x80988dc, sp=0xb7dd600c "üýþÿ")
    at process.c:197
#1  0x080720df in LWP_CreateProcess (ep=0, stacksize=196608, priority=0,
    parm=0x0, name=0x0, pid=0x0) at lwp.c:386
#2  0x08072e68 in IOMGR_Initialize () at iomgr.c:823
#3  0x08070cc4 in rxi_InitializeThreadSupport () at rx_lwp.c:117
#4  0x08064a65 in rx_InitHost (host=0, port=0) at rx.c:409
#5  0x08064cb9 in rx_Init (port=0) at rx.c:550
#6  0x0804c9fd in pr_Initialize (secLevel=0, confDir=0x8080bc0 "/etc/openafs",
    cell=0xbfda8dc0 "gefes.com") at ptuser.c:169
#7  0x0804b005 in auth_to_cell (context=0x8096058,
    cell=0x100 <Address 0x100 out of bounds>, realm=0x0) at aklog_main.c:715
#8  0x0804c2b5 in aklog (argc=1, argv=0xbfdb7484) at aklog_main.c:1412
#9  0x08049f2e in main (argc=0, argv=0x0) at aklog.c:20

Reverting this system to glibc-2.3.6-r3 resolved the problem.
Comment 2 Stefaan De Roeck (RETIRED) gentoo-dev 2006-03-17 07:18:31 UTC
Confirmed on x86: emerging glibc-2.4 crashed my whole fileserver. Simplest way to reproduce was trying to run "bos" (I do not have the kerberos flag set, so I don't have aklog), which simply segfaults. It gives a stack trace comparable to the one in the original comment. Rebuilding openafs using glibc-2.4 doesn't solve anything. 
I am not able to reproduce this on amd64 for the moment. (I don't have a fileserver on that machine, but at least "bos" doesn't segfault right away.)

Running valgrind on "bos" yields, among other things:
==18267== Warning: client switching stacks?  SP change: 0xBEAAD4BC --> 0xE6E65CB0
which is consistent with the very strange readings gdb gave me. I suspect something nasty in that code. (On glibc-2.3.6 I get the same valgrind warning, but no segfault.)
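
For readers unfamiliar with that valgrind warning: it is printed when the stack pointer jumps to a different memory region, which is what a user-level thread package does when it starts running a thread on its own stack. The LWP_CreateProcess and savecontext frames in the backtrace suggest that kind of code. Below is a minimal sketch of such a stack switch using the POSIX ucontext calls. It is not OpenAFS code, and every name in it (run_stack_switch_demo, coroutine_body, DEMO_STACK_SIZE) is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define DEMO_STACK_SIZE (64 * 1024)   /* hypothetical size, chosen for the demo */

static ucontext_t main_ctx;   /* saved context of the caller */
static ucontext_t co_ctx;     /* saved context of the coroutine */
static int steps;             /* how many slices the coroutine has run */

/* Runs on its own heap-allocated stack, not on the caller's stack. */
static void coroutine_body(void)
{
    steps++;                           /* first slice */
    swapcontext(&co_ctx, &main_ctx);   /* yield back to the caller */
    steps++;                           /* second slice, after being resumed */
    /* Returning here switches to uc_link, i.e. main_ctx. */
}

/* Returns the number of slices executed (2 on success), or -1 on error. */
int run_stack_switch_demo(void)
{
    char *stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    steps = 0;
    if (getcontext(&co_ctx) == -1) {
        free(stack);
        return -1;
    }
    co_ctx.uc_stack.ss_sp = stack;            /* the new stack lives here */
    co_ctx.uc_stack.ss_size = DEMO_STACK_SIZE;
    co_ctx.uc_link = &main_ctx;               /* where to go when the body returns */
    makecontext(&co_ctx, coroutine_body, 0);

    swapcontext(&main_ctx, &co_ctx);   /* SP moves onto the heap stack */
    assert(steps == 1);
    swapcontext(&main_ctx, &co_ctx);   /* resume; the body finishes and returns */

    free(stack);
    return steps;
}
```

Calling run_stack_switch_demo() from main() under valgrind should be enough to see a similar "client switching stacks?" message, since the stack pointer moves between the process stack and the malloc'ed block.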

I have no clue at this moment, but I suspect glibc-2.4 is at fault, as there seem to be more complaints about it. 
Adding blocking dependency to openafs-1.4.0-r2, hoping this will minimize the number of people affected by this bug. 
Comment 3 Canal Vorfeed 2006-03-27 13:50:08 UTC
OpenAFS 1.4.1rc10 works fine on AMD64 with glibc 2.4, both server and client. (I needed 1.4.1rc10 since 1.4.0 is incompatible with linux 2.6.16; I took the ebuild for 1.4.0 and fixed 001_all_compiler-settings.patch, and the rest applies without problems.)
Comment 4 Stefaan De Roeck (RETIRED) gentoo-dev 2006-03-27 14:01:27 UTC
(In reply to comment #2)
> OpenAFS 1.4.1rc10 works fine on AMD64 with glibc 2.4 - both server and client
> (I needed 1.4.1rc10 since 1.4.0 is incompatible with linux 2.6.16; took ebuild
> for 1.4.0 and fixed 001_all_compiler-settings.patch - the rest applies without
> problems)...
> 

Were you actually able to reproduce the problem with OpenAFS 1.4.0 on AMD64 with glibc 2.4?  I have only seen this on x86 as of yet.  
Comment 5 Canal Vorfeed 2006-03-27 14:29:45 UTC
Hmm. Probably not. I did get a segfaulting "bos" command, but now I cannot reproduce it (even with a 1.4.1rc10=>1.4.0 downgrade). It segfaulted in the middle of the AFS server installation, but once the server was installed the problem disappeared... Perhaps the problem was not with glibc 2.4 but with openafs itself?

Question: glibc 2.4 currently blocks openafs not just on x86 but on AMD64 too, while openafs is happy with glibc 2.4 on AMD64. Why keep this block in place?
Comment 6 Stefaan De Roeck (RETIRED) gentoo-dev 2006-03-28 00:31:44 UTC
(In reply to comment #4)
> Hmm. Probably not. I've got segfaulting "bos" command, but now I can not
> reproduce it (even with 1.4.1rc10=>1.4.0 downgrade). It segfaulted in the
> middle of AFS server installation but once server was installed problem
> disappeared... Perhaps problem was not with glibc 2.4 but with openafs itself ?
> 
> Question: glibc 2.4 blocks openafs right now not just on x86 but on AMD64 too
> while openafs is happy with glibc 2.4 on AMD64 - why keep this block in place ?

If you had a segfaulting "bos" command, that seems ample reason to me. I blocked glibc-2.4 on all platforms because I had no reason to assume the bug was in x86-specific code. Your getting a segfault on amd64 as well seems to support this. Of course the question remains how to reproduce that, and how to fix it. 

In the meantime, bug reports about this problem seem to be appearing on the upstream mailing list, so I'm confident there will be a fix soon. I hope it'll find its way into openafs-1.4.1. 
Comment 7 borut.kersevan 2006-04-07 04:53:13 UTC
I had the same problems with klog and aklog (segfaults), but I found a solution
that works for me on x86 with glibc 2.4 by fixing the savecontext function in
process.c:
http://www.archivesat.com/OpenAFS_Developers/thread236068.htm
Comment 8 Stefaan De Roeck (RETIRED) gentoo-dev 2006-04-07 05:40:34 UTC
(In reply to comment #6)
> I had the same problems with klog and aklog (segfaults), but I found a solution
> that works for me on x86 with glibc 2.4 by fixing the savecontext function in
> process.c:
> http://www.archivesat.com/OpenAFS_Developers/thread236068.htm
> 

Yes, but this is the x86-only fix, if I'm correct. I've heard a better solution is in the CVS tree. I feel it's better to wait for that to appear in a release. 
Comment 9 Stefaan De Roeck (RETIRED) gentoo-dev 2006-04-22 03:38:44 UTC
Put openafs-1.4.1 in the tree. It fixes the incompatibility with glibc-2.4 (at least on my system). 
One note: emerging openafs first, and then glibc, still gave me the same error. Remerging openafs here fixed the problem. So I suppose the fix detects the installed glibc version and chooses an implementation according to that.
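
If the fix does choose its implementation from the glibc headers present at compile time (an assumption on my part; the thread only says "I suppose"), that would explain why openafs has to be remerged after glibc is upgraded. A small sketch of the idea, for a glibc-based system only. The DEMO_CONTEXT_IMPL macro and the three helper functions are hypothetical; __GLIBC__, __GLIBC_MINOR__ and gnu_get_libc_version() are the real glibc interfaces.

```c
#include <stdio.h>
#include <string.h>
#include <gnu/libc-version.h>   /* gnu_get_libc_version(), glibc-specific */

/* Hypothetical compile-time switch: the choice is frozen into the binary
 * according to the glibc headers installed when this file is built. */
#if defined(__GLIBC__) && \
    (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 4))
#define DEMO_CONTEXT_IMPL "new-glibc code path"
#else
#define DEMO_CONTEXT_IMPL "legacy code path"
#endif

/* Which implementation was selected at build time. */
const char *compiled_impl(void)
{
    return DEMO_CONTEXT_IMPL;
}

/* Writes the glibc version seen at build time, e.g. "2.4", into buf. */
int format_compiled_version(char *buf, size_t len)
{
    return snprintf(buf, len, "%d.%d", __GLIBC__, __GLIBC_MINOR__);
}

/* Returns 1 if the running glibc matches the one compiled against, else 0. */
int built_against_running_glibc(void)
{
    char built[32];
    format_compiled_version(built, sizeof built);
    /* The runtime string may carry a patch level, so compare the prefix. */
    return strncmp(built, gnu_get_libc_version(), strlen(built)) == 0;
}
```

A binary built against glibc-2.3.6 would keep the legacy path even after glibc-2.4 is installed, and built_against_running_glibc() would return 0 until the package is rebuilt.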