Bug 100259 - Threads Defuncts On Alpha with glibc-2.3.4.20041102-r1
Summary: Threads Defuncts On Alpha with glibc-2.3.4.20041102-r1
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: Alpha Linux
Importance: High major
Assignee: Alpha Porters
URL: http://bugs.debian.org/cgi-bin/bugrep...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-07-25 11:10 UTC by Lars Roland
Modified: 2005-11-14 13:30 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Test case that demonstrates that this is not a MySQL bug (test.c,289 bytes, text/plain)
2005-08-30 09:05 UTC, Thomas Cort (RETIRED)
Test case that demonstrates that this is not a MySQL bug (test2.c,47 bytes, text/plain)
2005-08-30 09:12 UTC, Thomas Cort (RETIRED)
pthread-test.c (pthread-test.c,324 bytes, text/plain)
2005-08-31 19:44 UTC, SpanKY

Description Lars Roland 2005-07-25 11:10:07 UTC
If MySQL is not run with the "one-thread" option in /etc/mysql/my.cnf, mysqld
leaves defunct processes on alpha when accessed from a web page
(CFLAGS="-mieee -O2 -mcpu=ev6", Linux alpha 2.6.12.2 alpha EV6 GNU/Linux). Both
version 4.0.24 and 4.1.13-r1 show this behaviour.

The problem has been reproduced on an alpha DS10L server running apache 2.0.54 and
PHP 4.4. Earlier versions of php/apache have the same problem, and it can also
be reproduced from web pages written in perl, so I am quite sure that it is not
a php problem.

MySQL is compiled with the following options in version 4.1.13-r1:

+berkdb -big-tables -cluster -debug* -doc -extraengine -geometry -minimal +perl
+readline (-selinux) +ssl -static +tcpd -utf8

version 4.0.24 with these:

+berkdb -debug* +innodb +perl +readline (-selinux) +ssl -static +tcpd

I have recompiled mysql with debug support. The links below are a trace of
accessing a web page, which resulted in two defunct instances of mysql (the trace
file is very big and is therefore compressed).

http://randompage.org/static/mysqld.sql.bz2
http://randompage.org/static/mysqld.trace.bz2

I have not been able to reproduce this problem when accessing mysql directly
on the local machine, but I am guessing that the problem is related to parallel
access since it disappears when running in single-threaded mode.


Reproducible: Always
Steps to Reproduce:
1. emerge -k mysql


Actual Results:  
defuncts when accessed.

Expected Results:  
run the query without creating zombie processes.
Comment 1 Thomas Cort (RETIRED) gentoo-dev 2005-07-26 13:17:09 UTC
I also have the same problem.

mysql-4.0.25-r2:  +berkdb -big-tables -debug -doc -minimal +perl +readline
(-selinux) +ssl -static +tcpd


emerge info:

Portage 2.0.51.22-r2 (default-linux/alpha/2005.0/2.4, gcc-3.3.2,
glibc-2.3.4.20041102-r1, 2.4.30 alpha)
=================================================================
System uname: 2.4.30 alpha EV56
Gentoo Base System version 1.6.13
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.11
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.4.23
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alpha-unknown-linux-gnu"
CFLAGS="-mieee -O2 -mcpu=ev4"
CHOST="alpha-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-mieee -O2 -mcpu=ev4"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://cs.ubishops.ca:2121/pub/gentoo
ftp://cs.ubishops.ca/pub/gentoo"
LDFLAGS="-Wl,-O1"
LINGUAS="en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.ca.gentoo.org/gentoo-portage"
USE="alpha apache2 arts berkdb bitmap-fonts crypt cups curl encode font-server
foomatic foomaticdb fortran gd gdbm gif gtk2 imlib jabber jpeg libg++ libwww mad
mp3 mpeg mysql ncurses nls pam pdflib perl png postgres python readline spell
ssl tcpd tiff truetype truetype-fonts type1-fonts xml2 zlib linguas_en
userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL
Comment 2 Francesco R. (RETIRED) gentoo-dev 2005-07-26 13:50:26 UTC
I have just read "http://dev.mysql.com/doc/mysql/en/alpha-dec-osf1.html".

Could you please try the following sequence?

#export CFLAGS="-D_PTHREAD_USE_D4 -DDONT_USE_THR_ALARM"
#export CXXFLAGS="-DDONT_USE_THR_ALARM"
#emerge dev-db/mysql

If that solves the problem, I'll modify the ebuilds accordingly (unless it
breaks things for others).
Comment 3 Lars Roland 2005-07-27 01:56:45 UTC
Compiling with 

CFLAGS="-D_PTHREAD_USE_D4 -DDONT_USE_THR_ALARM"
CXXFLAGS="-DDONT_USE_THR_ALARM"

Does not solve it for mysql 4.1.13-r1.
Comment 4 Fernando J. Pereda (RETIRED) gentoo-dev 2005-07-31 06:06:25 UTC
I'm running:

[ebuild   R   ] dev-db/mysql-4.0.24  +berkdb* -debug -innodb +perl +readline*
(-selinux) +ssl -static +tcpd 0 kB

on

Portage 2.0.51.22-r2 (default-linux/alpha/2005.0/2.4, gcc-3.3.2,
glibc-2.3.2-r12, 2.4.29-grsec-2.1.3 alpha)
=================================================================
System uname: 2.4.29-grsec-2.1.3 alpha EV45
Gentoo Base System version 1.6.13
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.11
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.4.23
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alpha-unknown-linux-gnu"
CFLAGS="-mieee -O2 -mcpu=ev45"
CHOST="alpha-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-mieee -O2 -mcpu=ev45"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org
http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j1"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="alpha berkdb bitmap-fonts crypt cups encode font-server foomaticdb fortran
gd gdbm gif gpm gtk2 imap imlib jpeg libg++ libwww mad mikmod motif mp3 mpeg
mysql ncurses nls oggvorbis opengl oss pam pdflib perl png python quicktime
readline sdl spell ssl tcpd truetype truetype-fonts type1-fonts xml2 xmms xv
zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS

in a production server (gentoo-es) with no problems.

Cheers,
Ferdy
Comment 5 Fernando J. Pereda (RETIRED) gentoo-dev 2005-07-31 12:16:24 UTC
mysql-4.1.13-r1:

---8<---
gendcc02 mysql # cd /var/tmp/portage/mysql-4.1.13-r1/work/mysql/
gendcc02 mysql # make test
cd mysql-test; perl mysql-test-run.pl && perl mysql-test-run.pl --ps-protocol
Killing Possible Leftover Processes
Removing Stale Files
Installing Master Databases
Installing Master Databases
Installing Slave Databases
Installing Slave Databases
Installing Slave Databases
=======================================================
Finding  Tests in the 'main' suite
Starting Tests in the 'main' suite

TEST                            RESULT
-------------------------------------------------------

alias                           [ pass ]
alter_table                     [ pass ]
analyse                         [ pass ]
analyze                         [ pass ]
ansi                            [ pass ]
archive                         [ skipped ]
auto_increment                  [ pass ]
mysql-test-run: WARNING: can't kill process 1
mysql-test-run: *** ERROR: we could not kill or clean up all processes
make: *** [test] Error 1
---8<---

emerge info:

---8<---
Portage 2.0.51.22-r1 (default-linux/alpha/2005.0/2.4, gcc-3.3.2,
glibc-2.3.4.20041102-r1, 2.4.28 alpha)
=================================================================
System uname: 2.4.28 alpha EV56
Gentoo Base System version 1.6.12
distcc 2.18.3 alpha-unknown-linux-gnu (protocols 1 and 2) (default port 3632)
[disabled]
ccache version 2.3 [disabled]
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.10
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.5
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.4.23
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alpha-unknown-linux-gnu"
CFLAGS="-mieee -O2 -mcpu=ev56"
CHOST="alpha-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config
/usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-mieee -O2 -mcpu=ev56"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.ussg.iu.edu/pub/linux/gentoo/
ftp://gentoo.mirrors.pair.com"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="alpha X arts berkdb bitmap-fonts crypt cups encode font-server foomaticdb
fortran gdbm gif gnome gpm gtk gtk2 imlib jpeg kde libg++ libwww mad mikmod
motif mp3 mpeg ncurses nls oggvorbis opengl oss pam pdflib perl png python qt
quicktime readline sdl slang spell ssl tcpd truetype truetype-fonts type1-fonts
xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS, PORTDIR_OVERLAY
---8<---

Cheers,
Ferdy
Comment 6 Francesco R. (RETIRED) gentoo-dev 2005-08-01 05:15:53 UTC
Closing this because we can't reproduce the bug.

May I humbly suggest verifying the cleanness of the system? Something like:
#emerge --sync
#emerge -pv depclean
#emerge depclean
#emerge -uDav --newuse world

and maybe re-emerge the packages that depend on the newly installed mysql?

Sorry for not giving better support.
Comment 7 Lars Roland 2005-08-01 05:47:46 UTC
(In reply to comment #5)

This compile setup is different from the one we are using. I cannot, however, get
it working either. Can you try to compile mysql 4.0 with

+berkdb -debug* +innodb +perl +readline (-selinux) +ssl -static +tcpd

and see if it still works for you? (Note that the problem only seems to occur with
multiple mysql threads.) I have also recreated this problem with mysql 4.0 on an
alpha DS20L with a freshly installed gentoo.
Comment 8 Francesco R. (RETIRED) gentoo-dev 2005-08-01 06:35:23 UTC
@ Comment #7  Lars Roland  

try 
# mysqlbinlog /var/lib/mysql/[host_name]-bin.[last_log_id]
# tail /var/log/mysql/mysql*.{err,log}
to see if there is something strange in the logs

from two different shells:
#mysql -uroot -p
#mysql -uroot -p
to check whether the native mysql client has problems

#export LD_ASSUME_KERNEL=2.4.19
#/usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr
--datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid
--skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock

to see if it's an NPTL problem.

try 
#emerge =dev-db/mysql-4.0.25-r2  -berkdb -big-tables -debug -doc -minimal -perl
-readline -ssl -static -tcpd

To see if the problem disappears when all extensions are removed.
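
For the NPTL check suggested above, it may help to first confirm which thread library the running glibc actually provides. The following is only a sketch: the helper name thread_library_version is made up for illustration, and it assumes a glibc that supports the _CS_GNU_LIBPTHREAD_VERSION extension to confstr().

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: copies the thread library identification string
 * into buf. Returns 0 on success, -1 if the C library does not report one. */
int thread_library_version(char *buf, size_t len)
{
    if (len == 0)
        return -1;
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* glibc extension: the string starts with "NPTL" or "linuxthreads". */
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    if (n == 0 || buf[0] == '\0')
        return -1;
    return 0;
#else
    /* Not a glibc-style C library: nothing to report. */
    return -1;
#endif
}
```

Printing the buffer from a small main() is enough for a quick check. On glibc systems "getconf GNU_LIBPTHREAD_VERSION" reports the same string from the shell.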
Comment 9 Lars Roland 2005-08-11 15:07:23 UTC
(In reply to comment #8)
> @ Comment #7  Lars Roland  
> 
> try 
> # mysqlbinlog /var/lib/mysql/[host_name]-bin.[last_log_id]
> # tail /var/log/mysql/mysql*.{err,log}
> to see if there is something strange in the logs

The logs are empty (only entries for starting and stopping mysql show up).

> 
> from two different shell:
> #mysql -uroot -p 
> #mysql -uroot -p 
> to try if the native mysql client has problems 

Making two logins results in two defunct mysqld processes, but only when I quit
the login (\q). That is, logging in and executing SQL from within the different
logins does not seem to cause any trouble, but each login creates a defunct mysql
process when I leave it.


> 
> #export LD_ASSUME_KERNEL=2.4.19
> #/usr/sbin/mysqld --defaults-file=/etc/mysql/my.cnf --basedir=/usr
> --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid
> --skip-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
> 
> to see if it's a nptl problem.

This has no effect (i.e. mysql still leaves defunct processes). Also, is anyone out
there even using nptl on alpha in gentoo? I have never recompiled glibc
with the nptl USE flag, so it should not be in use (or is it the default now?).

> 
> try 
> #emerge =dev-db/mysql-4.0.25-r2  -berkdb -big-tables -debug -doc -minimal -perl
> -readline -ssl -static -tcpd
> 
> To see if the problem disappear removing all extensions.

the problem is still there - only way to solve it is recompiling with debug
turned on and then start mysql in single threaded mode (as I wrote in my initial
bug report).



Comment 10 Thomas Cort (RETIRED) gentoo-dev 2005-08-30 09:05:59 UTC
Created attachment 67281 [details]
Test case that demonstrates that this is not a MySQL bug

Compile with "gcc -lpthread test.c"
Run with "./a.out"
In another terminal type "ps aux|grep a.out"
On affected Alpha systems you should see "<defunct>"
On all other systems you should not see "<defunct>"
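
The attachment itself is not reproduced in this report. As a rough idea of what such a test exercises, here is a minimal sketch (not the attached test.c; the names worker and run_thread_once are made up for illustration) that starts one thread, lets it finish, and joins it.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: returning from the start routine ends the thread,
 * with the same effect as calling pthread_exit(arg). */
static void *worker(void *arg)
{
    return arg;
}

/* Starts one thread and waits for it to finish.
 * Returns 0 on success, or the pthread error code on failure. */
int run_thread_once(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker, NULL);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

To look for "<defunct>" entries, call run_thread_once() from main() and then sleep for a minute or so before exiting, so that there is time to run ps in another terminal.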
Comment 11 Thomas Cort (RETIRED) gentoo-dev 2005-08-30 09:10:25 UTC
Above, I posted a test case that demonstrates that this is not a MySQL bug. It
appears to be a bug in the pthread library or kernel. The bug makes alpha
systems not properly handle threads that finish executing or call pthread_exit.
Comment 12 Thomas Cort (RETIRED) gentoo-dev 2005-08-30 09:12:48 UTC
Created attachment 67282 [details]
Test case that demonstrates that this is not a MySQL bug

This bug also affects fork()'d processes.
Comment 13 Fernando J. Pereda (RETIRED) gentoo-dev 2005-08-31 02:47:22 UTC
Hi !

Clearly this is an alpha-only bug and seems to be caused by the latest glibc. I'm
trying those test cases with an older glibc and everything seems to work fine.
(Reassigning to alpha and cc'ing SpanKY, our glibc guru.)

The latest test case (the fork() one) is not appropriate; the defunct process
also appears on my x86 machine. But the pthread test case is good.

Could any of you please try those test cases with glibc-2.3.2? I'm doing it
right now on several machines.

Debian bug is: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=325600

SpanKY, any ideas ?

Cheers,
Ferdy
Comment 14 SpanKY gentoo-dev 2005-08-31 10:33:14 UTC
a good idea to find a reference glibc which does not have this issue ...
Comment 15 SpanKY gentoo-dev 2005-08-31 10:34:08 UTC
Comment on attachment 67282 [details]
Test case that demonstrates that this is not a MySQL bug

You get zombies with this fork example because you don't ignore child signals,
nor do you call any wait functions.
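
To illustrate the point, a fork() example that does not leave a zombie could look roughly like this. It is a sketch, not a corrected version of the attachment, and the name fork_and_reap is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code`, then reaps it so that no zombie
 * remains. Returns the child's exit status, or -1 on error. */
int fork_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                      /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)  /* parent: collect the exit status */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The other option mentioned above is to ignore child signals, e.g. signal(SIGCHLD, SIG_IGN), in which case Linux discards the exit status and no zombie is created.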
Comment 16 Fernando J. Pereda (RETIRED) gentoo-dev 2005-08-31 10:48:20 UTC
(In reply to comment #15)
> (From update of attachment 67282 [details] [edit])
> you get zombies with this fork example because you dont ignore child signals
> nor do you execute any wait funcs
> 

glibc-2.3.2-r12 is not affected by this bug.

Cheers,
Ferdy
Comment 17 Lars Roland 2005-08-31 14:08:04 UTC
(In reply to comment #10)
> Created an attachment (id=67281) [edit]
> Test case that demonstrates that this is not a MySQL bug
> 
> Compile with "gcc -lpthread test.c"
> Run with "./a.out"
> In another terminal type "ps aux|grep a.out"
> On affected Alpha systems you should see "<defunct>"
> On all other systems you should not see "<defunct>"

Confirmed, this also produces the error on my end.
Comment 18 SpanKY gentoo-dev 2005-08-31 19:44:51 UTC
Created attachment 67377 [details]
pthread-test.c
Comment 19 Ian Hayhurst 2005-09-02 01:20:30 UTC
This bug appeared on my system too, following a move from the 2004.2 to the
2005.0 portage tree, and survived a subsequent change from a 2.4.21-alpha-r12 to
a 2.6.11.8 kernel.
pthread-test.c runs and gives me a defunct process too.
This isn't adding much to the sum of knowledge on this bug; however, I will help
test where I can (though the system is in production, so that's limited).
Cheers
Ian

>emerge --info
Portage 2.0.51.22-r2 (default-linux/alpha/2005.0, gcc-3.3.2,
glibc-2.3.4.20041102-r1, 2.6.11.8 alpha)
=================================================================
System uname: 2.6.11.8 alpha EV67
Gentoo Base System version 1.6.13
distcc 2.18.3 alphaev67-unknown-linux-gnu (protocols 1 and 2) (default port
3632) [disabled]
ccache version 2.3 [disabled]
dev-lang/python:     2.3.5
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.18-r1
virtual/os-headers:  2.6.8.1-r4
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alphaev67-unknown-linux-gnu"
CFLAGS="-mieee -O3 -mcpu=ev67 -pipe -fomit-frame-pointer"
CHOST="alphaev67-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.2/share/config
/usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-mieee -O3 -mcpu=ev67 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.solnet.ch/mirror/Gentoo  http://distfiles.gentoo.org 
http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="alpha X adns arts berkdb bitmap-fonts crypt cups curl eds encode esd fam
flac font-server foomaticdb fortran gd gdbm gif gnome gpm gstreamer gtk gtk2
imagemagick imlib jpeg kde libg++ libwww mad mikmod motif mp3 mpeg mysql ncurses
nls ogg oggvorbis opengl oss pam pdflib perl png python qt quicktime readline
sdl slang spell ssl tcpd tiff truetype truetype-fonts type1-fonts vorbis xine
xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTDIR_OVERLAY
Comment 20 Mike Hlavac 2005-09-11 00:51:42 UTC
No defunct processes are observed with the pthread-test.c testcase when using
NPTL and sys-libs/glibc-2.3.4.20041102-r1 on my ev56.
Comment 21 Lars Roland 2005-09-11 17:09:26 UTC
(In reply to comment #20)
> No defunct processes are observed with the pthread-test.c testcase when using
> NPTL and sys-libs/glibc-2.3.4.20041102-r1 on my ev56.

I would very much like to recreate the behaviour you are reporting, but I am having
some trouble. It is highly likely that it is just me being blind here
(too much coffee and too little sleep), but how did you manage to compile
glibc-2.3.4.20041102-r1 with nptl support? I get this no matter what I try:

---------------------
$> USE="nptl" emerge -pkv glibc

These are the packages that I would merge, in order:

Calculating dependencies ...done!
[ebuild   R   ] sys-libs/glibc-2.3.4.20041102-r1  -build -erandom (-hardened)
(-multilib) +nls -nomalloccheck (-nptl) -nptlonly -pic (-selinux) +userlocales*
---------------------

As seen above, the nptl USE flag seems to be masked on alpha (if I am missing
some portage trick to circumvent this, then please enlighten
me).
Comment 22 Mike Hlavac 2005-09-11 22:26:26 UTC
Here are the steps I took to get it working:

- Edit /usr/portage/sys-libs/glibc/glibc-2.3.4.20041102-r1.ebuild adding || use
alpha to the end of line 270
- Take the md5sum of glibc-2.3.4.20041102-r1.ebuild and update that and the
filesize in /usr/portage/sys-libs/glibc/Manifest
- Comment out nptl in /usr/portage/profiles/default-linux/alpha/use.mask
- emerge glibc

There might be a better way and I'm pretty sure based on the package masking
that NPTL isn't yet supported on the alpha, but this might be a temporary
workaround.
Comment 23 Ian Hayhurst 2005-09-12 08:28:18 UTC
Recompiled glibc twice today...
I had attempted the workaround from
http://bugs.gentoo.org/show_bug.cgi?id=100259#c22
on the system described in http://bugs.gentoo.org/show_bug.cgi?id=100259#c19

Initially I thought there was no change, but remembering to include nptl in the
USE flags as well fixed this for me, thanks.
Comment 24 Fernando J. Pereda (RETIRED) gentoo-dev 2005-09-12 09:21:50 UTC
1) Those of you running 2.6/ profiles should switch over to nptl, since it
*seems* stable enough.

2) Those of you who can switch over to 2.6/ should do so and then see point 1).

3) Those of you who can't switch to 2.6/, stay tuned... we are still
thinking about how to solve this.

Cheers,
Ferdy
Comment 25 Marc 2005-09-12 11:38:32 UTC
(In reply to comment #24)
> 1) For those of you running 2.6/ profiles then switch over to nptl since it
> *seems* stable enough.
> 
> 2) For those of you that can switch over 2.6/ then check point 1).
> 
> 3) For those of you that can't switch to 2.6/ then stay tunned... we are still
> thinking on how to solve this.
> 
> Cheers,
> Ferdy


I think there are two possibilities:
1. If someone is running a 2.4 kernel, force them to stay with glibc-2.3.2-r12. I
think this should be doable with the current profiles. I know that means there
will be more work for the glibc devs, because they have to maintain an older
glibc with security and other patches...
2. If someone is running a 2.6 kernel, force them to use nptl. AFAIK every other
distro is running with nptl enabled. It seems that nptl on alpha is working
fine too.
I read somewhere that future glibc versions will be nptl only (don't ask me where I
read that; was it in the glibc changelog?). So point 2 is the only possibility
in the long term.

Just as a note: last week I reported this problem on the glibc bugzilla, so
they are informed.

Oh, if you need a tester for patches, I have set up a chroot environment and can
run tests relatively easily...

Greets

Marc
Comment 26 Lars Roland 2005-09-13 05:25:30 UTC
Ok so I thought it over and have unmasked gcc 3.4.4 and recompiled my entire
toolchain and system using it along with nptl:

---------------------------------
emerge glibc binutils libstdc++-v3 gcc

# update gcc with gcc-config, set new gcc as default compiler

emerge glibc binutils libstdc++-v3 gcc portage
source /etc/profile && env-update
emerge -e system && emerge -e system
---------------------------------

I must say that I am amazed. It is like having a new computer. Before, there
were always defunct processes lying around (mysql, sh...), my shell was
horribly slow and apache could only handle a few connections. Thanks to the
updated glibc these problems are now gone and my system is in much better
shape than ever; it feels like the days when I had Tru64 on it.

glibc with nptl and gcc 3.4.4 with -march=ev6 is pure medicine.
Comment 27 Fernando J. Pereda (RETIRED) gentoo-dev 2005-09-13 05:34:16 UTC
That's nice to hear, but please don't resolve the bug, since it is not resolved.
It still exists for 2.4/ profiles.

toolchain: any idea ?

Cheers,
Ferdy
Comment 28 Lars Roland 2005-09-13 05:56:37 UTC
(In reply to comment #27)
> Thats nice to hear, but please dont resolve the bug since it is not resolved.
> This still exists for 2.4/ profiles.

Sorry forgot about that - 

> 
> toolchain: any idea ?

Brain error - should be: toolkit


Comment 29 Thomas Cort (RETIRED) gentoo-dev 2005-09-14 05:23:57 UTC
> If someone is running a 2.4 kernel force it to stay with glibc-2.3.2-r12.
This isn't always possible because if someone (like me) installed their system
with a 2005.1 stage tarball, then their system would already be running
glibc-2.3.4.20041102-r1. Downgrading glibc by doing an emerge
=sys-libs/glibc-2.3.2-r12 after all of your packages are compiled against
glibc-2.3.4.20041102-r1 doesn't work, it just screws up your system (I know this
from personal experience). Maybe someone knows how to safely downgrade glibc?

Anyway, I have fixed the problem on my machine and now there are no more defunct
threads from the test case or mysql. Here is what I did....

1) emerge udev
2) Emerged, Compiled and Installed 2.6 kernel and rebooted
3) Changed to the 2.6 profile
4) Unmasked nptl and edited ebuild (see Comment #22 From  Mike Hlavac)
5) Added nptl and nptlonly use flags
6) emerge glibc
7) emerge --newuse world
Comment 30 Ian Hayhurst 2005-09-15 01:21:06 UTC
Enabling nptl has stopped the zombies in mysql when accessing via apache/php.
However, all is not well: the mysql process appears to crash.

-------------------------------------------
/usr/sbin/mysqld: ready for connections.
Version: '4.0.24'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Gentoo
Linux mysql-4.0.24
mysqld got signal 11;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=0
max_connections=100
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 233983 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd=0x1202ea420
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
Cannot determine thread, fp=0x200016ee0a0, backtrace may not be correct.
Stack range sanity check OK, backtrace follows:
Warning: Alpha stacks are difficult - will be taking some wild guesses, stack
trace may be incorrect or  terminate abruptly
0x1200e15b4
New value of fp=0x200016ee040 failed sanity check, terminating stack trace!
Please read http://dev.mysql.com/doc/mysql/en/Using_stack_trace.html and follow
instructions on how to resolve the stack trace. Resolved
stack trace is much more helpful in diagnosing the problem, so please do
resolve it
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil)  is invalid pointer
thd->thread_id=1
The manual page at http://www.mysql.com/doc/en/Crashing.html contains
information that should help you find out what is causing the crash.
Comment 31 Mike Hlavac 2005-09-15 21:29:17 UTC
Ian -

Try and re-emerge mysql.  It's possible that it might need to be recompiled now
that NPTL is enabled.  What are the steps that you use to get MySQL to crash?
Comment 32 Jose Luis Rivero (yoswink) (RETIRED) gentoo-dev 2005-09-16 03:21:03 UTC
The Debian folks have filed a bug upstream:

http://sources.redhat.com/bugzilla/show_bug.cgi?id=1297
Comment 33 Jose Luis Rivero (yoswink) (RETIRED) gentoo-dev 2005-09-16 08:48:18 UTC
Upstream support for LinuxThreads is dead:

"LinuxThreads support is gone.  Every remaining problem is a feature."
The bug from the previous comment was closed as WONTFIX.
Comment 34 Martin Schlemmer (RETIRED) gentoo-dev 2005-09-18 05:11:21 UTC
This was the trigger I am sorry to say:

  http://sources.redhat.com/ml/libc-alpha/2005-09/msg00037.html

As I mentioned there, under gdb the thread exits normally.  I quickly tried to
build glibc-2.3.5 with linuxthreads-2.3.2, but it will need some changes just to
build properly, and I am not sure it would work.
Comment 35 Jose Luis Rivero (yoswink) (RETIRED) gentoo-dev 2005-10-25 07:12:56 UTC
News from Debian's bug about this problem:

--------------------
I finally tracked this down to the "pthread_reap_children" call in the
"manager.c" file in linuxthreads.

For some reason, the waitpid_not_cancel in the following "while" always
returns 0 and no children are "reaped" (line 947 or so):
  while ((pid = waitpid_not_cancel(-1, &status, WNOHANG | __WCLONE)) > 0) {
    pthread_exited(pid);

Children are then properly "reaped" if I change it to:
  while ((pid = wait3( &status,  WNOHANG | __WCLONE, NULL )) > 0 ) {
    pthread_exited(pid);
--------------------
More info: http://bugs.debian.org/325600
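
For readers unfamiliar with this kind of loop, here is a simplified sketch of a non-blocking reap loop. It is not the linuxthreads code: it assumes ordinary fork()'d children and plain waitpid(), so it leaves out __WCLONE (which restricts the wait to clone()'d children, as LinuxThreads threads are) and pthread_exited(). The name reap_exited_children is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reaps every child that has already exited, without blocking.
 * Returns the number of children reaped. */
int reap_exited_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    /* With WNOHANG, waitpid returns 0 when children exist but none has
     * exited yet, and -1 when there are no children left at all. Either
     * value ends the loop. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        reaped++;
    return reaped;
}
```

This also shows why a wait call that wrongly keeps returning 0 is fatal here: the loop body never runs, so exited children are never collected and stay around as defunct entries.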

I've prepared the patch and a test ebuild to give this solution a try:

[*] Patch (drop it inside files/2.3.5/ at glibc portage dir)
http://dev.gentoo.org/~yoswink/tmp/glibc-2.3.5-alpha-linuxthreads.patch

[*] Ebuild (just a copy of -r2 that applies the patch. It is also keyworded "alpha")
http://dev.gentoo.org/~yoswink/tmp/glibc-2.3.5-r3.ebuild 

I'm compiling it in a chroot and kloeri will test it too. Post your feedback
here, please. 

Cross your fingers ...
Comment 36 SpanKY gentoo-dev 2005-10-26 17:35:33 UTC
added workaround to cvs and will be in glibc-2.3.5-r3
Comment 37 Marc 2005-11-14 13:30:27 UTC
(In reply to comment #36)
> added workaround to cvs and will be in glibc-2.3.5-r3

I would like to share some of my experience:

I've installed and tested glibc-2.3.5-r3. It seems a lot better than
glibc-2.3.4.20041102-r1, but still not perfect.

I see (saw) the same problem as Jan with segfaulting processes with the 2.3.4 glibc.
Therefore I tried glibc-2.3.5-r3. My test case is tomcat running under sablevm.
With glibc-2.3.5-r3 it seems to run fine with AND without nptl.
The only thing I've noticed is that without nptl, if I shut down tomcat, some processes
are not stopped, but they do not appear as zombies either. I can kill them with
killall sablevm. I don't know if it's a sablevm problem or not. The same
procedure works with nptl.

There is still one problem with glibc-2.3.5-r3, though: I can't compile anything
with portage. If I specify LD_ASSUME_KERNEL=2.4.1 emerge -b
<something> it works again. I can also compile things as usual if I do it manually.
So it seems it does not work under portage's sandbox.

Additionally I ran some tests against the two glibc versions with the
posixtestsuite. With glibc-2.3.5-r3 a lot more tests passed than with
glibc-2.3.4.20041102-r1, but it is still far from perfect. (The x86 version is
no better.)

Do the gentoo devs use the posixtestsuite?

I leave this bug resolved / fixed but it's still not...