If MySQL is not run with the "one-thread" clause in /etc/mysql/my.cnf, it goes defunct on alpha when accessed from a web page (CFLAGS="-mieee -O2 -mcpu=ev6" - Linux alpha 2.6.12.2 alpha EV6 GNU/Linux). Both version 4.0.24 and 4.1.13-r1 show this behaviour.

The problem was produced on an Alpha DS10L server running apache 2.0.54 and PHP 4.4. Earlier versions of php/apache have the same problem, and it can also be reproduced from web pages written in perl, so I am quite sure that it is not a php problem.

MySQL version 4.1.13-r1 is compiled with the following options:

+berkdb -big-tables -cluster -debug* -doc -extraengine -geometry -minimal +perl +readline (-selinux) +ssl -static +tcpd -utf8

Version 4.0.24 is compiled with these:

+berkdb -debug* +innodb +perl +readline (-selinux) +ssl -static +tcpd

I have recompiled mysql with debug support. The links below are a trace of accessing a web page which resulted in two defunct instances of mysql (the trace file is very big and is therefore compressed).

http://randompage.org/static/mysqld.sql.bz2
http://randompage.org/static/mysqld.trace.bz2

I have not been able to reproduce this problem when accessing mysql directly on the local machine, but I am guessing that the problem is related to parallel access, since it disappears when running in single-threaded mode.

Reproducible: Always

Steps to Reproduce:
1. emerge -k mysql

Actual Results:
defuncts when accessed.

Expected Results:
run the query without creating zombie processes.
I also have the same problem. mysql-4.0.25-r2: +berkdb -big-tables -debug -doc -minimal +perl +readline (-selinux) +ssl -static +tcpd emerge info: Portage 2.0.51.22-r2 (default-linux/alpha/2005.0/2.4, gcc-3.3.2, glibc-2.3.4.20041102-r1, 2.4.30 alpha) ================================================================= System uname: 2.4.30 alpha EV56 Gentoo Base System version 1.6.13 dev-lang/python: 2.3.5 sys-apps/sandbox: 1.2.11 sys-devel/autoconf: 2.13, 2.59-r6 sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5 sys-devel/binutils: 2.15.92.0.2-r10 sys-devel/libtool: 1.5.18-r1 virtual/os-headers: 2.4.23 ACCEPT_KEYWORDS="alpha" AUTOCLEAN="yes" CBUILD="alpha-unknown-linux-gnu" CFLAGS="-mieee -O2 -mcpu=ev4" CHOST="alpha-unknown-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control" CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d" CXXFLAGS="-mieee -O2 -mcpu=ev4" DISTDIR="/usr/portage/distfiles" FEATURES="autoconfig distlocks sandbox sfperms strict" GENTOO_MIRRORS="ftp://cs.ubishops.ca:2121/pub/gentoo ftp://cs.ubishops.ca/pub/gentoo" LDFLAGS="-Wl,-O1" LINGUAS="en" MAKEOPTS="-j2" PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage" SYNC="rsync://rsync.ca.gentoo.org/gentoo-portage" USE="alpha apache2 arts berkdb bitmap-fonts crypt cups curl encode font-server foomatic foomaticdb fortran gd gdbm gif gtk2 imlib jabber jpeg libg++ libwww mad mp3 mpeg mysql ncurses nls pam pdflib perl png postgres python readline spell ssl tcpd tiff truetype truetype-fonts type1-fonts xml2 zlib linguas_en userland_GNU kernel_linux elibc_glibc" Unset: ASFLAGS, CTARGET, LANG, LC_ALL
Having just read http://dev.mysql.com/doc/mysql/en/alpha-dec-osf1.html, could you please try the following sequence?

#export CFLAGS="-D_PTHREAD_USE_D4 -DDONT_USE_THR_ALARM"
#export CXXFLAGS="-DDONT_USE_THR_ALARM"
#emerge dev-db/mysql

If that solves the problem I'll modify the ebuilds accordingly (unless it breaks things for others).
Compiling with CFLAGS="-D_PTHREAD_USE_D4 -DDONT_USE_THR_ALARM" CXXFLAGS="-DDONT_USE_THR_ALARM" does not solve the problem for mysql 4.1.13-r1.
I'm running: [ebuild R ] dev-db/mysql-4.0.24 +berkdb* -debug -innodb +perl +readline* (-selinux) +ssl -static +tcpd 0 kB on Portage 2.0.51.22-r2 (default-linux/alpha/2005.0/2.4, gcc-3.3.2, glibc-2.3.2-r12, 2.4.29-grsec-2.1.3 alpha) ================================================================= System uname: 2.4.29-grsec-2.1.3 alpha EV45 Gentoo Base System version 1.6.13 dev-lang/python: 2.3.5 sys-apps/sandbox: 1.2.11 sys-devel/autoconf: 2.13, 2.59-r6 sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5 sys-devel/binutils: 2.15.92.0.2-r10 sys-devel/libtool: 1.5.18-r1 virtual/os-headers: 2.4.23 ACCEPT_KEYWORDS="alpha" AUTOCLEAN="yes" CBUILD="alpha-unknown-linux-gnu" CFLAGS="-mieee -O2 -mcpu=ev45" CHOST="alpha-unknown-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control" CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d" CXXFLAGS="-mieee -O2 -mcpu=ev45" DISTDIR="/usr/portage/distfiles" FEATURES="autoconfig distlocks sandbox sfperms strict" GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/distributions/gentoo" MAKEOPTS="-j1" PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="alpha berkdb bitmap-fonts crypt cups encode font-server foomaticdb fortran gd gdbm gif gpm gtk2 imap imlib jpeg libg++ libwww mad mikmod motif mp3 mpeg mysql ncurses nls oggvorbis opengl oss pam pdflib perl png python quicktime readline sdl spell ssl tcpd truetype truetype-fonts type1-fonts xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc" Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS in a production server (gentoo-es) with no problems. Cheers, Ferdy
mysql-4.1.13-r1: ---8<--- gendcc02 mysql # cd /var/tmp/portage/mysql-4.1.13-r1/work/mysql/ gendcc02 mysql # make test cd mysql-test; perl mysql-test-run.pl && perl mysql-test-run.pl --ps-protocol Killing Possible Leftover Processes Removing Stale Files Installing Master Databases Installing Master Databases Installing Slave Databases Installing Slave Databases Installing Slave Databases ======================================================= Finding Tests in the 'main' suite Starting Tests in the 'main' suite TEST RESULT ------------------------------------------------------- alias [ pass ] alter_table [ pass ] analyse [ pass ] analyze [ pass ] ansi [ pass ] archive [ skipped ] auto_increment [ pass ] mysql-test-run: WARNING: can't kill process 1 mysql-test-run: *** ERROR: we could not kill or clean up all processes make: *** [test] Error 1 ---8<--- emerge info: ---8<--- Portage 2.0.51.22-r1 (default-linux/alpha/2005.0/2.4, gcc-3.3.2, glibc-2.3.4.20041102-r1, 2.4.28 alpha) ================================================================= System uname: 2.4.28 alpha EV56 Gentoo Base System version 1.6.12 distcc 2.18.3 alpha-unknown-linux-gnu (protocols 1 and 2) (default port 3632) [disabled] ccache version 2.3 [disabled] dev-lang/python: 2.3.5 sys-apps/sandbox: 1.2.10 sys-devel/autoconf: 2.13, 2.59-r6 sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5 sys-devel/binutils: 2.15.92.0.2-r10 sys-devel/libtool: 1.5.18-r1 virtual/os-headers: 2.4.23 ACCEPT_KEYWORDS="alpha" AUTOCLEAN="yes" CBUILD="alpha-unknown-linux-gnu" CFLAGS="-mieee -O2 -mcpu=ev56" CHOST="alpha-unknown-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control" CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d" CXXFLAGS="-mieee -O2 -mcpu=ev56" DISTDIR="/usr/portage/distfiles" FEATURES="autoconfig distlocks sandbox sfperms strict" GENTOO_MIRRORS="ftp://ftp.ussg.iu.edu/pub/linux/gentoo/ ftp://gentoo.mirrors.pair.com" 
PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="alpha X arts berkdb bitmap-fonts crypt cups encode font-server foomaticdb fortran gdbm gif gnome gpm gtk gtk2 imlib jpeg kde libg++ libwww mad mikmod motif mp3 mpeg ncurses nls oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl tcpd truetype truetype-fonts type1-fonts xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc" Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS, PORTDIR_OVERLAY ---8<--- Cheers, Ferdy
Closing this because we can't reproduce the bug. May I humbly suggest verifying the cleanliness of the system? Something like:

#emerge --sync
#emerge -pv depclean
#emerge depclean
#emerge -uDav --newuse world

and maybe re-emerge the packages that depend upon the newly installed mysql?

Sorry not to give better support.
(In reply to comment #5)

This compile setup is different from the one we are using; however, I cannot get it working either. Can you try to compile mysql 4.0 with

+berkdb -debug* +innodb +perl +readline (-selinux) +ssl -static +tcpd

and see if it still works for you? (Note that the problem only seems to occur with multiple mysql threads.) I have also recreated this problem with mysql 4.0 on an Alpha DS20L with a freshly installed gentoo.
@ Comment #7 Lars Roland

try
# mysqlbinlog /var/lib/mysql/[host_name]-bin.[last_log_id]
# tail /var/log/mysql/mysql*.{err,log}
to see if there is something strange in the logs.

From two different shells:
#mysql -uroot -p
#mysql -uroot -p
to check whether the native mysql client has problems.

#export LD_ASSUME_KERNEL=2.4.19
#/usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
to see if it's an NPTL problem.

try
#emerge =dev-db/mysql-4.0.25-r2 -berkdb -big-tables -debug -doc -minimal -perl -readline -ssl -static -tcpd
to see if the problem disappears when all extensions are removed.
(In reply to comment #8)
> @ Comment #7 Lars Roland
>
> try
> # mysqlbinlog /var/lib/mysql/[host_name]-bin.[last_log_id]
> # tail /var/log/mysql/mysql*.{err,log}
> to see if there is something strange in the logs

The logs are empty (only entries for starting and stopping mysql show up).

> from two different shell:
> #mysql -uroot -p
> #mysql -uroot -p
> to try if the native mysql client has problems

Making two logins results in two defunct mysqld processes, but only once I quit the login (\q). That is, logging in and executing SQL from within the different logins does not seem to cause any trouble, but each login creates a defunct mysql process when I leave it.

> #export LD_ASSUME_KERNEL=2.4.19
> #/usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr
> --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid
> --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
>
> to see if it's a nptl problem.

This has no effect (i.e. mysql still goes defunct). Also, is anyone out there even using nptl on alpha in gentoo? I have never recompiled glibc with the nptl keyword, so it should not be in use (or is it the default now?).

> try
> #emerge =dev-db/mysql-4.0.25-r2 -berkdb -big-tables -debug -doc -minimal -perl
> -readline -ssl -static -tcpd
>
> To see if the problem disappear removing all extensions.

The problem is still there. The only way to solve it is to recompile with debug turned on and then start mysql in single-threaded mode (as I wrote in my initial bug report).
Created attachment 67281 [details] Test case that demonstrates that this is not a MySQL bug Compile with "gcc -lpthread test.c" Run with "./a.out" In another terminal type "ps aux|grep a.out" On affected Alpha systems you should see "<defunct>" On all other systems you should not see "<defunct>"
Above, I posted a test case that demonstrates that this is not a MySQL bug. It appears to be a bug in the pthread library or kernel. The bug makes alpha systems not properly handle threads that finish executing or call pthread_exit.
Created attachment 67282 [details] Test case that demonstrates that this is not a MySQL bug This bug also affects fork()'d processes.
Hi!

Clearly this is an alpha-only bug, and it seems to be caused by the latest glibc. I'm trying those test cases with an older glibc and everything seems to work fine (reassigning to alpha and cc'ing SpanKY, our glibc guru).

The latest test case (the fork() one) is not appropriate: the defunct process also appears on my x86 machine. But the pthread test case is good.

Could any of you please try those test cases with glibc-2.3.2? I'm doing it right now on several machines.

Debian bug is: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=325600

SpanKY, any ideas?

Cheers,
Ferdy
a good idea to find a reference glibc which does not have this issue ...
Comment on attachment 67282 [details] Test case that demonstrates that this is not a MySQL bug you get zombies with this fork example because you don't ignore child signals, nor do you call any of the wait functions
(In reply to comment #15) > (From update of attachment 67282 [details] [edit]) > you get zombies with this fork example because you dont ignore child signals > nor do you execute any wait funcs > glibc-2.3.2-r12 is not affected by this bug. Cheers, Ferdy
(In reply to comment #10)
> Created an attachment (id=67281) [edit]
> Test case that demonstrates that this is not a MySQL bug
>
> Compile with "gcc -lpthread test.c"
> Run with "./a.out"
> In another terminal type "ps aux|grep a.out"
> On affected Alpha systems you should see "<defunct>"
> On all other systems you should not see "<defunct>"

Confirmed, this also produces the error at my end.
Created attachment 67377 [details] pthread-test.c
This bug appeared on my system too, following a move from the 2004.2 to the 2005.0 portage tree, and it survived a subsequent change from a 2.4.21-alpha-r12 to a 2.6.11.8 kernel. pthread-test.c runs and gives me a defunct process too.

This isn't adding much to the sum of knowledge on this bug; however, I will help test where I can (though the system is in production, so that's limited).

Cheers
Ian

>emerge --info
Portage 2.0.51.22-r2 (default-linux/alpha/2005.0, gcc-3.3.2, glibc-2.3.4.20041102-r1, 2.6.11.8 alpha)
=================================================================
System uname: 2.6.11.8 alpha EV67
Gentoo Base System version 1.6.13
distcc 2.18.3 alphaev67-unknown-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [disabled]
dev-lang/python: 2.3.5
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6
sys-devel/binutils: 2.15.92.0.2-r10
sys-devel/libtool: 1.5.18-r1
virtual/os-headers: 2.6.8.1-r4
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alphaev67-unknown-linux-gnu"
CFLAGS="-mieee -O3 -mcpu=ev67 -pipe -fomit-frame-pointer"
CHOST="alphaev67-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-mieee -O3 -mcpu=ev67 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.solnet.ch/mirror/Gentoo http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="alpha X adns arts berkdb bitmap-fonts crypt cups curl eds encode esd fam flac font-server foomaticdb fortran gd gdbm gif gnome gpm gstreamer gtk gtk2 imagemagick imlib jpeg kde
libg++ libwww mad mikmod motif mp3 mpeg mysql ncurses nls ogg oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl tcpd tiff truetype truetype-fonts type1-fonts vorbis xine xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc" Unset: ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTDIR_OVERLAY
No defunct processes are observed with the pthread-test.c testcase when using NPTL and sys-libs/glibc-2.3.4.20041102-r1 on my ev56.
(In reply to comment #20)
> No defunct processes are observed with the pthread-test.c testcase when using
> NPTL and sys-libs/glibc-2.3.4.20041102-r1 on my ev56.

I would very much like to recreate the behaviour you are reporting, but I am having some trouble. It is highly likely that I am just being blind here (too much coffee and too little sleep), but how did you manage to compile glibc-2.3.4.20041102-r1 with nptl support? I get this no matter what I try:

---------------------
$> USE="nptl" emerge -pkv glibc

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild R ] sys-libs/glibc-2.3.4.20041102-r1 -build -erandom (-hardened) (-multilib) +nls -nomalloccheck (-nptl) -nptlonly -pic (-selinux) +userlocales*
---------------------

As seen, the nptl keyword seems to be hard masked on alpha (if I am missing some portage trick to circumvent this, then please enlighten me).
Here are the steps I took to get it working:

- Edit /usr/portage/sys-libs/glibc/glibc-2.3.4.20041102-r1.ebuild, adding
  || use alpha
  to the end of line 270
- Take the md5sum of glibc-2.3.4.20041102-r1.ebuild and update that and the filesize in /usr/portage/sys-libs/glibc/Manifest
- Comment out nptl in /usr/portage/profiles/default-linux/alpha/use.mask
- emerge glibc

There might be a better way, and based on the package masking I'm pretty sure that NPTL isn't yet supported on the alpha, but this might be a temporary workaround.
Recompiled glibc twice today. I attempted the workaround from http://bugs.gentoo.org/show_bug.cgi?id=100259#c22 on the system described in http://bugs.gentoo.org/show_bug.cgi?id=100259#c19. Initially I thought there was no change and that I still had the problem, but remembering to include nptl in the USE flags as well fixed this for me. Thanks.
1) For those of you running 2.6/ profiles: switch over to nptl, since it *seems* stable enough.

2) For those of you that can switch over to 2.6/: do so, then see point 1).

3) For those of you that can't switch to 2.6/: stay tuned... we are still thinking about how to solve this.

Cheers,
Ferdy
(In reply to comment #24)
> 1) For those of you running 2.6/ profiles then switch over to nptl since it
> *seems* stable enough.
>
> 2) For those of you that can switch over 2.6/ then check point 1).
>
> 3) For those of you that can't switch to 2.6/ then stay tunned... we are still
> thinking on how to solve this.
>
> Cheers,
> Ferdy

I think there are two possibilities:

1. If someone is running a 2.4 kernel, force them to stay with glibc-2.3.2-r12. I think this should be doable with the current profiles. I know that means more work for the glibc devs, because they have to maintain an older glibc with security and other patches...

2. If someone is running a 2.6 kernel, force them to use nptl. AFAIK every other distro is running with nptl enabled, and it seems that nptl on alpha is working fine too.

I read somewhere that future glibc versions will be nptl only (don't ask me where I read that; was it in the glibc changelog?), so point 2 is the only possibility in the long term.

Just a note: last week I reported this problem on the glibc bugzilla, so they are informed.

Oh, if you need a tester for patches, I have set up a chroot environment and can do tests relatively easily...

Greets
Marc
Ok, so I thought it over, unmasked gcc 3.4.4, and recompiled my entire toolchain and system using it along with nptl:

---------------------------------
emerge glibc binutils libstdc++-v3 gcc
# update gcc with gcc-config, set new gcc as default compiler
emerge glibc binutils libstdc++-v3 gcc portage
source /etc/profile && env-update
emerge -e system && emerge -e system
---------------------------------

I must say that I am amazed. It is like having a new computer. Before, there were always defunct processes lying around (mysql, sh...), my shell was horribly slow, and apache could only handle a few connections. With the updated glibc these problems are gone and my system is in much better shape than ever. It feels like the days when I had Tru64 on it.

glibc with nptl and gcc 3.4.4 with -march=ev6 is pure medicine.
That's nice to hear, but please don't resolve the bug, since it is not resolved. This still exists for 2.4/ profiles. toolchain: any idea? Cheers, Ferdy
(In reply to comment #27) > Thats nice to hear, but please dont resolve the bug since it is not resolved. > This still exists for 2.4/ profiles. Sorry forgot about that - > > toolchain: any idea ? Brain error - should be: toolkit
> If someone is running a 2.4 kernel force it to stay with glibc-2.3.2-r12.

This isn't always possible: if someone (like me) installed their system with a 2005.1 stage tarball, then their system would already be running glibc-2.3.4.20041102-r1. Downgrading glibc by doing an emerge =sys-libs/glibc-2.3.2-r12 after all of your packages are compiled against glibc-2.3.4.20041102-r1 doesn't work; it just screws up your system (I know this from personal experience). Maybe someone knows how to safely downgrade glibc?

Anyway, I have fixed the problem on my machine, and now there are no more defunct threads from the test case or mysql. Here is what I did:

1) emerge udev
2) Emerged, compiled and installed a 2.6 kernel and rebooted
3) Changed to the 2.6 profile
4) Unmasked nptl and edited the ebuild (see Comment #22 from Mike Hlavac)
5) Added the nptl and nptlonly USE flags
6) emerge glibc
7) emerge --newuse world
enabling nptl has stopped the zombies in mysql when accessing via apache/php however all is not well, The mysql process appears to crash ------------------------------------------- /usr/sbin/mysqld: ready for connections. Version: '4.0.24' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Gentoo Linux mysql-4.0.24 mysqld got signal 11; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=16777216 read_buffer_size=131072 max_used_connections=0 max_connections=100 threads_connected=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 233983 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=0x1202ea420 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0x200016ee0a0, backtrace may not be correct. Stack range sanity check OK, backtrace follows: Warning: Alpha stacks are difficult - will be taking some wild guesses, stack trace may be incorrect or terminate abruptly 0x1200e15b4 New value of fp=0x200016ee040 failed sanity check, terminating stack trace! Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow instructions on how to resolve the stack trace. Resolved stack trace is much more helpful in diagnosing the problem, so please do resolve it Trying to get some variables. Some pointers may be invalid and cause the dump to abort... 
thd->query at (nil) is invalid pointer thd->thread_id=1 The manual page at http://www.mysql.com/doc/en/Crashing.html contains information that should help you find out what is causing the crash.
Ian - Try and re-emerge mysql. It's possible that it might need to be recompiled now that NPTL is enabled. What are the steps that you use to get MySQL to crash?
The Debian folks have sent a bug report upstream: http://sources.redhat.com/bugzilla/show_bug.cgi?id=1297
Upstream support for LinuxThreads is dead: "LinuxThreads support is gone. Every remaining problem is a feature." The bug mentioned in the previous comment was closed as WONTFIX.
This was the trigger, I am sorry to say: http://sources.redhat.com/ml/libc-alpha/2005-09/msg00037.html

As I mentioned there, under gdb the thread exits normally. I quickly tried to build glibc-2.3.5 with linuxthreads-2.3.2, but it will need some changes just to build properly, and I am not sure it would work even then.
News from Debian's bug about this problem:

--------------------
I finally tracked this down to the "pthread_reap_children" call in the "manager.c" file in linuxthreads. For some reason, the waitpid_not_cancel in the following "while" always returns 0 and no children are "reaped":

(Line 947 or so)
while ((pid = waitpid_not_cancel(-1, &status, WNOHANG | __WCLONE)) > 0) {
    pthread_exited(pid);

Children are then properly "reaped" if I change it to:

while ((pid = wait3( &status, WNOHANG | __WCLONE, NULL )) > 0 ) {
    pthread_exited(pid);
--------------------

More info: http://bugs.debian.org/325600

I've prepared a patch and a test ebuild to give this solution a try:

[*] Patch (drop it inside files/2.3.5/ in the glibc portage dir)
http://dev.gentoo.org/~yoswink/tmp/glibc-2.3.5-alpha-linuxthreads.patch

[*] Ebuild (just a copy of -r2 that applies the patch; it is also keyworded "alpha")
http://dev.gentoo.org/~yoswink/tmp/glibc-2.3.5-r3.ebuild

I'm compiling it in a chroot and kloeri will test it too. Post your feedback here, please.

Cross your fingers ...
added workaround to cvs and will be in glibc-2.3.5-r3
(In reply to comment #36)
> added workaround to cvs and will be in glibc-2.3.5-r3

I would like to share some experience:

I've installed and tested glibc-2.3.5-r3. It seems a lot better than glibc-2.3.4.20041102-r1, but it is still not perfect. I see (saw) the same problem as Ian, with segfaulting processes, on glibc 2.3.4. Therefore I tried glibc-2.3.5-r3.

My test case is tomcat running under sablevm. With glibc-2.3.5-r3 it seems to run fine with AND without nptl. The only thing I've noticed is that without nptl, if I shut down tomcat, some processes are not stopped, but they do not appear as zombies either. I can kill them with killall sablevm. I don't know if it's a sablevm problem or not. The same procedure works with nptl.

However, there is still a problem with glibc-2.3.5-r3: I can't compile anything through portage. If I specify LD_ASSUME_KERNEL=2.4.1 emerge -b <something> it works again. I can also compile things as usual if I do it manually. So it seems it does not work under portage's sandbox.

Additionally, I ran some tests against the two glibc versions with the posixtestsuite. With glibc-2.3.5-r3 a lot more tests pass than with glibc-2.3.4.20041102-r1, but it is still far from perfect. (The x86 version is no better.) Do the gentoo devs use the posixtestsuite?

I leave this bug resolved / fixed but it's still not...