Bug 174790

Summary: dev-db/mysql-5.0.38 hangs at shutdown with a linuxthreads (non-NPTL) glibc
Product: Gentoo Linux
Reporter: Stefan Kienzl <sharky2>
Component: Current packages
Assignee: Gentoo Linux MySQL bugs team <mysql-bugs>
Status: VERIFIED TEST-REQUEST
Severity: major
CC: aross, clemente.aguiar, gentoo, hardened, mike, sparc, wschlich, yannlehir
Priority: High
Version: unspecified
Hardware: x86
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---

Description Stefan Kienzl 2007-04-16 12:01:50 UTC
On a hardened x86 system with mysql 5.0.38 and berkdb.mysql USE flags, mysql cannot be restarted or stopped: the process stays active. killall mysqld or killall "pid-number" doesn't work; only "kill -s 9 pid" stops mysql.
Downgrading to 5.0.26-r2 resolves the problem.

Reproducible: Always

Steps to Reproduce:
1. emerge mysql
2. emerge --config mysql
or
1. /etc/init.d/mysql restart
or 
1. /etc/init.d/mysql stop

Actual Results:  
"stopping mysql" .. *hang*

Expected Results:  
process(es) should have been stopped

 emerge --info
Portage 2.1.2.2 (hardened/x86/2.6, gcc-3.4.6, glibc-2.3.6-r5, 2.6.19-gentoo-r5 i686)
=================================================================
System uname: 2.6.19-gentoo-r5 i686 Intel(R) Pentium(R) 4 CPU 3.20GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Mon, 16 Apr 2007 05:20:01 +0000
dev-lang/python:     2.3.5-r3, 2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.15-r1
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i386-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4"
CHOST="i386-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=pentium4"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="ftp://gentoo.inode.at/source/ ftp://gd.tuwien.ac.at/opsys/linux/gentoo/ ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ "
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="apache2 berkdb crypt hardened midi nls no-old-linux pam pic readline ssl tcpd x86 xorg zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="mouse keyboard" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Comment 1 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2007-04-16 12:18:48 UTC
sharky2: please killall -9 mysqld at that point, and get a backtrace showing where it was while it was running. Also advise if newer GCC/glibc still cause the problem to be present - I don't have any machines that are still on GCC3/glibc-2.3, I'm on 4.1/2.5 everywhere.
Comment 2 Jakub Moc (RETIRED) gentoo-dev 2007-04-16 12:30:57 UTC
(In reply to comment #1)
> Also advise if newer GCC/glibc still cause the
> problem to be present - I don't have any machines that are still on
> GCC3/glibc-2.3, I'm on 4.1/2.5 everywhere.

gcc-4* and >=glibc-2.4 are package.masked on hardened (yeah, sucks).
Comment 3 Stefan Kienzl 2007-04-16 16:16:38 UTC
I will set up another machine to try to reproduce the problem, because I ran out of time for testing on this production server.

How do I get a backtrace? Is there a short howto?
Comment 4 Andreas Westin 2007-04-17 10:33:45 UTC
I can confirm this problem on hardened x86.
Comment 5 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 12:01:13 UTC
Same here :-(((
Strangely, my /var/run/mysqld/mysqld.pid does not contain the PID of
the mysqld parent (6748), but the PID of some mysqld child process (6764).

strace of the master process shows no activity at all when
I run "kill 6748":
--8<--
keel mysqld # strace -tt -p 6748
Process 6748 attached - interrupt to quit
13:57:26.283736 select(17, [15 16], NULL, NULL, NULL
--8<--
Same goes for SIGHUP for example.

When I run "kill 6764" (the PID from the pidfile), then this is being
logged:
--8<--
070417 13:59:24 [Note] /usr/sbin/mysqld: Normal shutdown
--8<--

But nothing more happens.
Comment 6 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 13:31:58 UTC
I tried to get a backtrace, but I failed :(

1. env USE="debug" FEATURES="nostrip" emerge mysql
--8<--
/usr/sbin/mysqld: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), for GNU/Linux 2.4.1, not stripped
--8<--

2. gdb:
--8<--
# gdb --args /usr/sbin/mysqld --console --core-file --debug='d:t:i:o,/tmp/mysqld.trace' --gdb
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-pc-linux-gnu"...
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) run
Starting program: /usr/sbin/mysqld --console --core-file --debug=d:t:i:o,/tmp/mysqld.trace --gdb
070417 15:24:12 [Warning] The syntax for replication startup options is deprecated and will be removed in MySQL 5.2. Please use 'CHANGE MASTER' instead.

Program received signal SIG32, Real-time event 32.
0x4fb63c74 in ?? ()
(gdb) bt
#0  0x4fb63c74 in ?? ()
#1  0x5a2e27e8 in ?? ()
#2  0x4fb63458 in ?? ()
#3  0x5a2e2750 in ?? ()
#4  0x00000020 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
--8<--

3. I added this to /etc/conf.d/mysql:
--8<--
mysql_slot_0=(
        "core-file"
        "debug=d:t:i:o,/tmp/mysqld.trace"
        "gdb"
)
--8<--
Then I started mysql and looked at /tmp/mysqld.trace.
After running "/etc/init.d/mysql stop", lines like this
keep getting written to the trace file:
--8<--
T@180236: | | info: Waiting for select thread
--8<--

Any hints?
Comment 7 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 14:38:42 UTC
I just freshly emerged mysql-5.0.38 on another hardened x86 system
that I set up two weeks after the first one, following the same scheme;
mysqld on this 2nd machine stops just fine... argh :(
Comment 8 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 15:15:39 UTC
Ok I found something out:
- machine #1 where the problem occurs has CHOST=i386-[...]
- machine #2 where the problem does NOT occur has CHOST=i686-[...]

On machine #1, I can see many mysqld processes in the default process listing
(they do not run as threads there).
On machine #2, I only see one mysqld process in the default process listing,
but using 'ps -efL' I can see many mysqld threads.

Both gcc and glibc were merged with identical USE flags (nptl nptlonly etc.);
the only difference I can see is the CHOST.

Ok, another finding:

Looked at /usr/portage/sys-libs/glibc/glibc-2.3.6-r5.ebuild and found out
that TLS (Thread Local Storage) is only supported from i486 upwards
(see function want_tls(), line #799 in the aforementioned ebuild).
I guess that's the problem?

I installed the machine from a hardened stage3 which uses CHOST=i386-[...].
Will rebuild everything now (fun fun fun).
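
The NPTL-versus-linuxthreads question can also be checked directly, without comparing process listings. A minimal sketch, assuming a glibc system where 'getconf GNU_LIBPTHREAD_VERSION' reports the thread library in use; the function name classify_threadlib is hypothetical and only classifies that string:

```shell
# classify_threadlib: hypothetical helper, not part of any Gentoo tool.
# Takes the string printed by `getconf GNU_LIBPTHREAD_VERSION` and
# prints "nptl", "linuxthreads" or "unknown".
classify_threadlib() {
    case "$1" in
        NPTL*)          echo "nptl" ;;
        linuxthreads*)  echo "linuxthreads" ;;
        *)              echo "unknown" ;;
    esac
}
```

It could be called as classify_threadlib "$(getconf GNU_LIBPTHREAD_VERSION 2>/dev/null)"; a result of "linuxthreads" would correspond to the setup on machine #1.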
Comment 9 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 15:51:41 UTC
See bug #106556 (i386 -> no nptl).
Still recompiling though...
Comment 10 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 15:52:19 UTC
Argh, wrong radio button... sorry!
Comment 11 Andreas Westin 2007-04-17 17:20:55 UTC
I have CHOST="i686-pc-linux-gnu".
mysql 5.0.34 works fine for me.

emerge --info
Portage 2.1.2.4 (hardened/x86/2.6, gcc-3.4.6, glibc-2.3.6-r5, 2.6.20.4-grsec i686)
=================================================================
System uname: 2.6.20.4-grsec i686 AMD Athlon(tm) MP 1800+
Gentoo Base System release 1.12.10
Timestamp of tree: Tue, 17 Apr 2007 10:00:01 +0000
ccache version 2.4 [enabled]
dev-lang/python:     2.4.4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.4-r6
sys-apps/sandbox:    1.2.18.1
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils:  2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool:   1.5.23b
virtual/os-headers:  2.6.20-r2
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-xp -mtune=athlon-xp -mmmx -msse -m3dnow -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb /var/qmail/alias /var/qmail/control /var/vpopmail/domains /var/vpopmail/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/php/apache1-php5/ext-active/ /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=athlon-xp -mtune=athlon-xp -mmmx -msse -m3dnow -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps=y"
FEATURES="candy ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict userfetch userpriv usersandbox"
GENTOO_MIRRORS="http://ds.thn.htu.se/linux/gentoo ftp://ftp.du.se/pub/os/gentoo http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS=""
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="apache2 berkdb crypt hardened mailwrapper midi mysql nls pam pcre pic readline session ssl x86 xorg zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="mouse keyboard" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="fbdev dummy"
Unset:  CTARGET, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 12 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 18:39:22 UTC
Ok, it was fixed after recompiling things with a new CHOST.
See http://rafb.net/p/UDgt8681.html for an excerpt of
my Gentoo setup script that takes care of it on a new install.

So I guess upstream has changed something regarding non-NPTL
installations between 5.0.26 and 5.0.38?
Comment 13 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-17 19:11:50 UTC
Maybe this is related?
Quoting http://dev.mysql.com/doc/refman/5.0/en/releasenotes-cs-5-0-37.html

"A workaround was implemented to avoid a race condition in the NPTL pthread_exit() implementation. (Bug#24507)"

-> http://bugs.mysql.com/bug.php?id=24507
Comment 14 Andreas Westin 2007-04-19 04:14:56 UTC
Sorry, I was a bit unclear. 5.0.38 does not work for me and I have CHOST=i686.

Comment 15 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-19 10:02:22 UTC
(In reply to comment #14)
> Sorry, I was a bit unclear. 5.0.38 does not work for me and I have CHOST=i686.
> 

When you start mysql, do you see many mysqld processes with 'ps ax' or just 1?
If you see more than 1 it doesn't use threads -- remerge glibc and try again.
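
The "many processes or just 1" check can be scripted. A minimal sketch, assuming a procps-style 'ps' and that the server's command name is exactly mysqld; count_named is a hypothetical helper that counts exact matches among the lines it reads on stdin:

```shell
# count_named: hypothetical helper. Reads one command name per line on
# stdin and prints how many lines are exactly equal to "$1".
count_named() {
    awk -v name="$1" '$0 == name { n++ } END { print n + 0 }'
}
```

It could be fed with 'ps -e -o comm= | count_named mysqld'. A count above 1 suggests each thread shows up as its own process (linuxthreads); a count of 1 suggests NPTL threads, which 'ps -efL' would list separately.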
Comment 16 Andreas Westin 2007-04-19 11:31:22 UTC

> When you start mysql, do you see many mysqld processes with 'ps ax' or just 1?
> If you see more than 1 it doesn't use threads -- remerge glibc and try again.
> 

It works now, a recompile of glibc with nptl did the trick. Thanks.
Comment 17 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2007-04-19 19:57:04 UTC
Ok, so closing as invalid.
Anybody on hardened needs to ensure they are using NPTL in their glibc.
Comment 18 solar (RETIRED) gentoo-dev 2007-04-19 19:59:12 UTC
I would hardly call this invalid. This is clearly a bug in UPSTREAM.
NPTL cannot and should not be required.
Comment 19 solar (RETIRED) gentoo-dev 2007-04-19 21:44:23 UTC
(In reply to comment #13)
> Maybe this is related?
> Quoting http://dev.mysql.com/doc/refman/5.0/en/releasenotes-cs-5-0-37.html
> 
> "A workaround was implemented to avoid a race condition in the NPTL
> pthread_exit() implementation. (Bug#24507)"
> 
> -> http://bugs.mysql.com/bug.php?id=24507

Wolfram,
Can you please contact upstream and inform them that the fix in fact broke non-NPTL installs?
Comment 20 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 10:14:45 UTC
(In reply to comment #19)
> Wolfram,
> Can you please contact upstream and inform them that the fix in fact broke
> non-NPTL installs?

I'll try :)
Comment 21 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 11:39:48 UTC
Has anyone tried mysql-5.0.38 with glibc and USE="-nptl" on
a non-hardened system?!
Comment 22 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 11:42:14 UTC
Ok, MySQL now has a bug report at http://bugs.mysql.com/bug.php?id=27977
Looking forward to replies...
Comment 23 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 12:27:09 UTC
Ok, duplicate of http://bugs.mysql.com/bug.php?id=27310 *sigh*
Comment 24 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 12:37:35 UTC
Quoting upstream:
--8<--
fixed in 5.0.42 and 5.1.18
--8<--
Latest 5.0 in portage is 5.0.38 -- we need a bump then :)
Comment 25 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 12:40:20 UTC
Ok, upstream correction regarding the fixed version: it was already
fixed in 5.0.40.
Comment 26 Wolfram Schlich (RETIRED) gentoo-dev 2007-04-20 12:58:08 UTC
Argh, first they say "fixed in $version", then they say
"$version is not yet released".
Stand by for now :-/
Comment 27 Wolfram Schlich (RETIRED) gentoo-dev 2007-05-04 23:33:31 UTC
Ok, 5.0.40 is available now:

ftp://ftp.mysql.com/pub/mysql/src/mysql-5.0.40.tar.gz
Comment 28 Clemente Aguiar 2007-05-10 17:32:15 UTC
My hardened servers are also suffering from this problem.

When is the version bump scheduled?
Comment 29 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2007-05-11 08:43:06 UTC
5.0.40 in the tree now, resolved.
Comment 30 Ferris McCormick (RETIRED) gentoo-dev 2007-05-11 12:34:51 UTC
Test Result:  SUCCESS

Fixed for me on previously failing system:
With 5.0.38, this would hang forever
=======================
lacewing init.d # !!
./mysql restart
 * Stopping mysql ...
 * Stopping mysqld (0)                                                                                            [ ok ]
 * Starting mysql ...
 * Starting mysql (/etc/mysql/my.cnf)                                                                             [ ok ]
=======================
In fact, to get to this point, I had to stop 5.0.38 using a lot of manual intervention, but when it was finally completely gone, 5.0.40 came up cleanly and seems to be fine.
Comment 31 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2007-05-11 22:25:21 UTC
Thanks, closing fully.