Bug 54603 - threading issues on a nptl-enabled system
Summary: threading issues on a nptl-enabled system
Status: RESOLVED DUPLICATE of bug 63734
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: dotnet (DISBANDED)
URL: http://bugs.ximian.com/show_bug.cgi?i...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-06-21 00:19 UTC by Gábor Farkas
Modified: 2005-07-17 13:06 UTC (History)
8 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
my-installed-packages.txt (my-installed-packages.txt,7.75 KB, text/plain)
2004-07-12 14:42 UTC, Gábor Farkas
Emerge info and log for a (probably) working system (emerge-info.txt,703.61 KB, text/plain)
2004-07-19 12:11 UTC, Joe Geldart

Description Gábor Farkas 2004-06-21 00:19:53 UTC
i have gentoo ~x86, with nptl enabled.

when using muine, it frequently freezes (like when i import my whole music library (around 5-10gig)).

i recompiled my glibc + mono with nptl disabled, and then it worked fine.

this happens with the newest packages (mono-beta-3 (mono-0.96)).


Reproducible: Always
Steps to Reproduce:
1. rm -r ~/.gconf/apps/muine ~/.gnome2/muine
2. start muine
3. import a BIG music directory (5-10gigs)

Actual Results:  
muine starts to import the music, but then freezes somewhere...

Expected Results:  
it should import it

the interesting thing is that i had an nptl-enabled system for a long time,
and when i used mono-beta-1 with muine, it worked fine, and i had nptl enabled
at that time.

it went wrong when i installed mono-beta-2, and it is still bad with mono-beta-3.
Comment 1 Gábor Farkas 2004-06-21 00:22:46 UTC
there's a discussion on the mono-devel list about this problem, maybe it's of some help...

the link to the beginning of the thread:
http://lists.ximian.com/archives/public/mono-devel-list/2004-June/006183.html
Comment 2 Richard Torkar 2004-06-22 06:27:51 UTC
Hi Paolo,

I tried what you described in the mailinglist.

(gdb) thread apply all bt
(gdb) bt
#0  0xffffe410 in ?? ()
#1  0xbfffeca4 in ?? ()
#2  0x40d08100 in ?? () from /lib/libc.so.6
#3  0x00000008 in ?? ()
#4  0x40c28eff in sigsuspend () from /lib/libc.so.6
#5  0x40138108 in GC_end_blocking () from /usr/lib/libmono.so.0
#6  <signal handler called>
#7  0xffffe410 in ?? ()
#8  0xbffff018 in ?? ()
#9  0x0000004d in ?? ()
#10 0x00000000 in ?? ()
#11 0x40bce810 in pthread_cond_timedwait () from /lib/libpthread.so.0
#12 0x40111506 in mono_method_full_name () from /usr/lib/libmono.so.0
#13 0x4011f704 in mono_once () from /usr/lib/libmono.so.0
#14 0x4011ff65 in mono_once () from /usr/lib/libmono.so.0
#15 0x400dc206 in mono_install_thread_callbacks () from /usr/lib/libmono.so.0
#16 0x08120d90 in ?? ()
#17 0x00000001 in ?? ()
#18 0xffffffff in ?? ()
#19 0x00000000 in ?? ()
#20 0x4000ae60 in _dl_rtld_di_serinfo () from /lib/ld-linux.so.2
#21 0x400dc45a in mono_thread_manage () from /usr/lib/libmono.so.0
Previous frame inner to this frame (corrupt stack?)

Comment 3 Richard Torkar 2004-06-22 06:29:27 UTC
This is the second thread just-in-case ;)

#0  0xffffe410 in ?? ()
#1  0xbfffeb88 in ?? ()
#2  0xffffffff in ?? ()
#3  0x00000003 in ?? ()
#4  0x40ca7d89 in poll () from /lib/libc.so.6
#5  0x401d3022 in g_main_loop_get_context () from /usr/lib/libglib-2.0.so.0
#6  0x0805c620 in ?? ()
#7  0x00000003 in ?? ()
#8  0xffffffff in ?? ()
#9  0x401d1e6f in g_main_context_query () from /usr/lib/libglib-2.0.so.0
#10 0x00000003 in ?? ()
#11 0x00000003 in ?? ()
#12 0x0805c620 in ?? ()
#13 0x401d252f in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#14 0x080528d8 in ?? ()
#15 0xffffffff in ?? ()
#16 0x7fffffff in ?? ()
#17 0x0805c620 in ?? ()
#18 0x00000003 in ?? ()
#19 0x080528d8 in ?? ()
#20 0xbfffecd8 in ?? ()
#21 0x401a59bf in ?? () from /usr/lib/libgthread-2.0.so.0
#22 0x00000000 in ?? ()
#23 0xbfffebf4 in ?? ()
#24 0x402271c0 in g_thread_use_default_impl () from /usr/lib/libglib-2.0.so.0
#25 0x402271b8 in g_ascii_table () from /usr/lib/libglib-2.0.so.0
#26 0x40227da0 in _g_debug_flags () from /usr/lib/libglib-2.0.so.0
#27 0xffffffff in ?? ()
#28 0x7fffffff in ?? ()
#29 0x4022733c in ?? () from /usr/lib/libglib-2.0.so.0
#30 0x080528d8 in ?? ()
#31 0x401963e0 in mono_debugger_class_init_func () from /usr/lib/libmono.so.0
#32 0xbfffecd8 in ?? ()
#33 0x401d2826 in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#34 0x080528d8 in ?? ()
#35 0x00000001 in ?? ()
#36 0x00000001 in ?? ()
#37 0x0804f050 in ?? ()
#38 0x40194818 in ?? () from /usr/lib/libmono.so.0
#39 0x080528d8 in ?? ()
#40 0x401963e0 in mono_debugger_class_init_func () from /usr/lib/libmono.so.0
#41 0x4010d103 in mono_method_full_name () from /usr/lib/libmono.so.0
Previous frame inner to this frame (corrupt stack?)
Comment 4 foser (RETIRED) gentoo-dev 2004-06-22 06:34:25 UTC
you should probably rebuild with 'inherit debug' added, so the bt's get a bit more useful.
Comment 5 Gábor Farkas 2004-06-22 06:42:20 UTC
inherit debug...is it a USE flag? or a ./configure switch? for mono? for muine?
Comment 6 Richard Torkar 2004-06-22 07:02:15 UTC
I'll do that foser.
Comment 7 Richard Torkar 2004-06-22 07:18:31 UTC
foser - added debug to the inherit clause on the first line in the mono ebuild. Should it be placed in any particular order? I placed it last now and I didn't get any more output when using gdb.
Comment 8 foser (RETIRED) gentoo-dev 2004-06-22 07:51:46 UTC
in theory it shouldn't matter, not sure.. might be that you need to compile several packages to get relevant output. you probably use omit-frame-pointer ? That's not helpful
Comment 9 Richard Torkar 2004-06-22 08:55:08 UTC
Nope, I don't use omit-frame-pointer.

I did a FEATURES=nostrip USE=debug.

Something is seriously borked...
(gdb) run
Starting program: /usr/bin/mono /usr/lib/muine/muine.exe
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
Detaching after fork from child process 7563.
Comment 10 Gábor Farkas 2004-06-22 09:26:22 UTC
i got these traces:

#0  0xffffe410 in ?? ()
#1  0xbffff044 in ?? ()
#2  0x40d38860 in ?? () from /lib/libc.so.6
#3  0x00000008 in ?? ()
#4  0x40c53667 in sigsuspend () from /lib/libc.so.6
#5  0x4014c48f in GC_end_blocking () from /usr/lib/libmono.so.0
#6  <signal handler called>
#7  0xffffe410 in ?? ()
#8  0xbffff3b8 in ?? ()
#9  0x000000c8 in ?? ()
#10 0x00000000 in ?? ()
#11 0x40bf7ae0 in pthread_cond_timedwait () from /lib/libpthread.so.0
#12 0x40122798 in mono_method_full_name () from /usr/lib/libmono.so.0
#13 0x40131920 in mono_once () from /usr/lib/libmono.so.0
#14 0x401320b3 in mono_once () from /usr/lib/libmono.so.0
#15 0x400e75d1 in mono_thread_manage () from /usr/lib/libmono.so.0
#16 0x400c3b02 in mono_runtime_exec_managed_code () from /usr/lib/libmono.so.0
#17 0x4007bd70 in mono_main () from /usr/lib/libmono.so.0
#18 0x08048f2b in main ()


and
#0  0xffffe410 in ?? ()
#1  0xbfffea48 in ?? ()
#2  0xffffffff in ?? ()
#3  0x00000003 in ?? ()
#4  0x40cd5b8d in poll () from /lib/libc.so.6
#5  0x401ea886 in g_main_loop_get_context () from /usr/lib/libglib-2.0.so.0
#6  0x401ec3fc in g_idle_remove_by_data () from /usr/lib/libglib-2.0.so.0
#7  0x401e9fad in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#8  0x4011cc2a in mono_method_full_name () from /usr/lib/libmono.so.0
#9  0x4012c470 in mono_once () from /usr/lib/libmono.so.0
#10 0x401235d2 in mono_method_full_name () from /usr/lib/libmono.so.0
#11 0x40129b61 in mono_once () from /usr/lib/libmono.so.0
#12 0x40120906 in mono_method_full_name () from /usr/lib/libmono.so.0
#13 0x4012ff33 in mono_once () from /usr/lib/libmono.so.0
#14 0x4012f00d in mono_once () from /usr/lib/libmono.so.0
#15 0x40101643 in mono_assembly_loaded () from /usr/lib/libmono.so.0
#16 0x4010138e in mono_assembly_load () from /usr/lib/libmono.so.0
#17 0x4010213a in mono_init () from /usr/lib/libmono.so.0
#18 0x40069352 in mono_codegen () from /usr/lib/libmono.so.0
#19 0x4007b6bd in mono_main () from /usr/lib/libmono.so.0
#20 0x08048f2b in main ()

is this of any help?
Comment 11 Gábor Farkas 2004-06-22 09:27:52 UTC
hmm...errr..these stacktraces were from a from-cvs compiled muine..i'll check the emerged muine, if the traces differ
Comment 12 Gábor Farkas 2004-06-22 10:07:20 UTC
if i use export GC_DONT_GC=1 before starting muine,
then it works ok...

so the problem seems to be somewhere with the gc.

(yes, this is also mentioned in the mono-devel-list thread)
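
The workaround above can be applied to a single process instead of being exported into the whole session. A minimal sketch, assuming a POSIX shell; `without_gc` is a hypothetical helper name, and `GC_DONT_GC` is the libgc variable named in this comment:

```shell
# without_gc: run one command with the collector disabled for that process only.
# Hypothetical helper; GC_DONT_GC is the variable from the comment above.
without_gc() {
    env GC_DONT_GC=1 "$@"
}

# Usage (not run here): without_gc muine
```

Since nothing is ever collected with this setting, memory use grows for as long as the process runs, so this is more of a diagnostic than a fix.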
Comment 13 Gábor Farkas 2004-06-22 10:15:50 UTC
ok, i also tried with the emerged muine (0.6.3). the stack traces are the same as what i reported (with muine-cvs)
Comment 14 Richard Torkar 2004-06-22 11:39:11 UTC
I opened a bug at bugs.ximian.com.
http://bugs.ximian.com/show_bug.cgi?id=60576
Comment 15 Peter Johanson (RETIRED) gentoo-dev 2004-06-23 09:09:31 UTC
To add to this, I've *not* been having any problems with muine and NPTL enabled glibc. I'm using x86 base, with only the mono items as ~x86. If someone can/has the time, can they please also test this combination and see if things work? If so, we can at least have a starting point to see where these problems might have started appearing in our toolchain. thanks
Comment 16 Peter Johanson (RETIRED) gentoo-dev 2004-06-28 17:56:15 UTC
Okay, CCing the toolchain folks as this seems to be related to glibc versions and NPTL. you guys got any input?
Comment 17 Peter Johanson (RETIRED) gentoo-dev 2004-06-30 17:28:47 UTC
Scratch that again. i *am* having these problems. I've added more info to the ximian bug, but their motivation to look into it is obviously low.
Comment 18 Peter Johanson (RETIRED) gentoo-dev 2004-07-01 18:06:46 UTC
This bug is not going to solve itself any time soon. I recommend people either don't bother with mono, or disable NPTL from glibc.
Comment 19 Gábor Farkas 2004-07-12 04:06:18 UTC
i updated to newest ~x86:
glibc-2.3.4.20040619
linux26-headers-2.6.6-r1
ck-sources-2.6.7-r1

mono-1.0

(muine from cvs, but should be pretty much like muine-0.6.3)

and now it imported my music directory :)))

this was always my test for muine...

i will test it more today to see whether it's freezing or not.

could other people also make the upgrade and see if it helps?
Comment 20 Peter Johanson (RETIRED) gentoo-dev 2004-07-12 11:39:10 UTC
Hey Gabor,

great news, last time i had tried the ~x86 glibc it still had issues. Does further testing still show this seeming to be resolved?

I've just upgraded to those 2.6 headers, i'll hopefully be upgrading and testing the new glibc this afternoon/tonight. Please report back if you still have it working or if it's freaking out. Did you change any CFLAGS/stripping/etc ?
Comment 21 Gábor Farkas 2004-07-12 12:23:46 UTC
i've listened to music for some hours, and it did not deadlock yet ;)


my config:

bash-2.05b# emerge info
Portage 2.0.50-r9 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.4.20040619-r0, 2.6.7-ck1)
=================================================================
System uname: 2.6.7-ck1 i686 Intel(R) Pentium(R) M processor 1400MHz
Gentoo Base System version 1.5.1
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-O3 -march=pentium3 -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=pentium3 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://gentoo.inode.at"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/home/portage"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage /usr/local/bmg-main /usr/local/gnome-2.7.1"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X acpi alsa apache2 avi berkdb canna cdr cjk composite crypt cups divx dri dvd dvdr encode faad flac foomaticdb gdbm gif gnome gnutls gpm gstreamer gtk gtk2 imlib jpeg ldap libg++ libwww mad mikmod mmx mono mozilla moznocompose moznoirc moznomail mozsvg mpeg ncurses nls nptl oggvorbis opengl pam pdflib png python qt quicktime readline samba sdl slang sse ssl tcpd truetype unicode usb x86 xml2 xmms xprint xv xvid zlib"
Comment 22 Peter Johanson (RETIRED) gentoo-dev 2004-07-12 13:28:34 UTC
Okay, things are definitely not fixed for me using that version of glibc and linux26-headers.

gabor, can you please double check and stress test the import on some big*ss media directories and see if it is indeed fixed for you? If so we need to start looking at other places this problem might be originating.
Comment 23 Gábor Farkas 2004-07-12 14:40:49 UTC
it still works :)

i tried to import my music folder (7.5 gig: around 800megs of ogg, 200megs of mpc (well, they are ignored imho :-), and the rest are mp3)

first i tried with my very-little-modified from-cvs muine. it worked ok.
then i tried with a clean from-cvs muine. it worked ok.

then i emerged muine-0.6.3.
and tried to import the music folder 3 times. it worked ok.

between the imports i stopped muine, killed gconfd, and removed ~/.gconf/apps/muine and ~/.gnome2/muine .

the cvs versions used the xine backend, and the emerged version used the gstreamer backend, but it should not matter for the indexing part.

muine USE flags: +flac +mad +oggvorbis -xine

i'll attach my list of installed programs.
Comment 24 Gábor Farkas 2004-07-12 14:42:48 UTC
Created attachment 35265 [details]
my-installed-packages.txt

a list of my installed packages (got it with "epm -qa")
Comment 25 Peter Johanson (RETIRED) gentoo-dev 2004-07-12 19:19:00 UTC
Can you also post the output of "emerge info" from your working configuration? Thanks.
Comment 26 Peter Johanson (RETIRED) gentoo-dev 2004-07-12 19:20:25 UTC
nm, you already did. d'oh.
Comment 27 Kevin O'Shea 2004-07-14 17:40:06 UTC
This seems to be a gentoo only bug, are we the only one of the NPTL distros using 2.3.4 glibc?  Does anyone know if NPTL 2.3.3 is ok?  I looked at the patches applied to glibc, and it's fairly generic so I guess it's possible that it could be somewhere else.

I hope this gets fixed soon, I want to play with mono but I don't have the time to recompile my system w/o NPTL.
Comment 28 Gábor Farkas 2004-07-15 01:19:17 UTC
as you can see nptl with glibc-2.3.4 works for me.

try to update to latest ~x86 (if it's possible for you),
and try it out (import a BIG (several gigs) music repository into muine).
Comment 29 Kevin O'Shea 2004-07-15 11:36:29 UTC
I've got 2.3.4, but still no go.

I can't get it to work without disabling GC.
Comment 30 Gábor Farkas 2004-07-18 11:59:32 UTC
ok guys,
it is not working again ;)

i don't really know what happened, but today i reorganized my mp3's,
and deleted muine's config files, and reimported the music,
and it froze again ;((((


i don't really know why because last time it worked, and i tested it a lot ;((
Comment 31 Gábor Farkas 2004-07-19 06:55:19 UTC
to complicate things more:

today i imported my music dir 2x (deleting the config files between the imports),
and it worked fine.

so it seems that:
either

a. something changed on my computer between today and yesterday
or

b. the muine-import-my-whole-music-dir test i used to test muine stability is not a good test anymore.

in the past muine froze reliably when doing this test:( now it works fine, but yesterday it froze once.
Comment 32 Joe Geldart 2004-07-19 12:07:09 UTC
Ok, my problem was slightly different to most people's here (in that I couldn't even start muine; it just returned to the command line with no error or comment) but I have found a way to get it to start ok which may help.

Attached are emerge-info.txt, which shows the results of calling `emerge info`, and /var/log/emerge.log.

To get NPTL enabled (and checking /lib/libc.so.6 confirms that NPTL is enabled) I put 'nptl' in my USE flags and ran /usr/portage/scripts/bootstrap-2.6.sh. The script fails part way along due to virtual OS headers (on my box at least) but I then re-emerged glibc and mono and everything worked fine. This trick was pointed out to me by a friend, so I'm not sure how well it works long term. I have to add, this was on a *running* system.

I hope this will help.
Comment 33 Joe Geldart 2004-07-19 12:11:57 UTC
Created attachment 35766 [details]
Emerge info and log for a (probably) working system
Comment 34 Peter Johanson (RETIRED) gentoo-dev 2004-08-05 18:15:26 UTC
Okay, i'm just too d*mn stubborn to let this go.

Starting this evening, i'm recompiling glibc with NPTL support, and compiling mono from CVS from the date -D "20 May 2004" (a few days after the newer libgc was merged to mono).

Every day/evening possible, i will be trying a new date(s) from mono's CVS, to track down exactly when this problem occurred, to at least give a patchset that is responsible for the breakage.

Yes, this is ugly and brute force, but as my knowledge of NPTL/thread/GC issues is minimal, and nobody who knows this stuff seems to care about this bug, i'm going with what i've got. If i can find where the problem is introduced, i can at least focus my efforts on something (and maybe learn a little in the process).

PROCESS: Anybody who wants this fixed can help me do this:

1) emerge glibc with USE="nptl"

2) Grab a CVS snapshot from some date that has not been reported here yet, preferably sometime around the libgc update which was on 2004-05-18.

3) Compile it into /opt/mono or somewhere so you don't zap your real install (not that it matters since your installed mono will have the NPTL problem anyway) by doing

./configure --prefix=/opt/mono --enable-nptl=yes
make
make install

4) grab the simple thread test from Chris Hayden's comment on http://bugzilla.ximian.com/show_bug.cgi?id=60576 (reproduced here to make life easy):

using System;
using System.Threading;

class Test
{
        public static void Main( String[] args )
        {
                int i = 0;
                while( true ) {
                        Thread t = new Thread( new ThreadStart(Blah) );
                        t.Start();
                        i++;
                        Console.WriteLine( i+" threads" );
                }
        }

        private static void Blah() {
                Console.WriteLine( "starting thread" );
        }
}

change your PATH to reference /opt/mono (export PATH="/opt/mono/bin:$PATH") and compile the above with "mcs Foo.cs". Then run it by doing "mono Foo.exe". If you have problems, the program should just hang at some point. If it doesn't hang, and just keeps happily spitting out thread started messages, you probably have a working NPTL enabled mono from some date.

5) Post to this bug stating the exact date of the pull you used, and whether it worked or not. *IMPORTANT* Once we find a limit date where at one date it works and some other date it doesn't, we obviously will be focusing on dates *between* those two. When we get to that point, start testing those dates, and don't bother reporting any other dates.

When we manage to track down the exact day/patchset where things broke, we'll go from there.

BONUS: To whomever finds the exact date on which things went from working to not working with regards to this NPTL/GC bug, i will personally buy you a beer (if not many, many beers) the next time you are in new york city.

POSSIBILITY: If i find time and it seems worth it, I may script this up to automate the cvs-update/build/test/lather/rinse/repeat stuff.
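
One way the cvs-update/build/test loop above could be scripted is sketched below. This is only an outline under stated assumptions: the helper names, the 30-second limit and the 500-thread threshold are hypothetical, `timeout` is the coreutils tool, and the build commands simply mirror steps 3 and 4 (a CVS checkout may need extra bootstrap steps not shown here). Because the thread test never exits on a working system, the sketch judges a run by how many "N threads" lines it printed before the time limit rather than by its exit status.

```shell
#!/bin/sh
# Sketch of automating one bisection step. Helper names, the 30-second limit
# and the 500-thread threshold are assumptions, not values from this bug.

# threads_reached LIMIT CMD...: run CMD for at most LIMIT seconds and print
# the highest "N threads" count it got to (0 if it printed none).
threads_reached() {
    limit=$1; shift
    timeout -s KILL "$limit" "$@" 2>/dev/null |
        awk '/ threads$/ { n = $1 } END { print n + 0 }'
}

# verdict MIN LIMIT CMD...: "ok" if CMD reached MIN threads in time, else "hang".
verdict() {
    min=$1; limit=$2; shift 2
    if [ "$(threads_reached "$limit" "$@")" -ge "$min" ]; then
        echo ok
    else
        echo hang
    fi
}

# test_snapshot DATE: run from inside a mono CVS checkout that also contains
# Foo.cs. Updates to DATE, builds into /opt/mono as in step 3, then runs the
# thread test from step 4. The body is a subshell so PATH changes stay local.
test_snapshot() (
    PATH="/opt/mono/bin:$PATH"; export PATH
    cvs -q update -d -P -D "$1" &&
    ./configure --prefix=/opt/mono --enable-nptl=yes &&
    make && make install &&
    mcs Foo.cs &&
    echo "$1: $(verdict 500 30 mono Foo.exe)"
)
```

A run for one date would then look like `test_snapshot "20 May 2004"`, and the printed verdict is what gets reported in step 5.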
Comment 35 Peter Johanson (RETIRED) gentoo-dev 2004-08-17 17:01:30 UTC
Okay, on a side note, i've just committed 1.0.1-r1 and 1.0.1-r2 versions of mono. -r1 has the NPTL support removed, and dies nastily if you try to compile it with a NPTL glibc. -r2 is the same as 1.0.1, but package.masked. It includes the NPTL support.

In order not to let this bug hold up stabling mono for the linuxthreads folks, i've done the above setup. People can still test/work on this bug using -r2, but we have a valid candidate in the -r1 for a stable mono and friends.
Comment 36 Scott Marks 2004-08-21 13:31:40 UTC
Peter - please add some keywords so this shows up on a mono+nptl search
Comment 37 Paul de Vrieze (RETIRED) gentoo-dev 2004-08-24 12:57:40 UTC
Scott: The allowed keywords are fixed, neither nptl nor mono is in it. I'll add mono to the subject though.
Comment 38 SpanKY gentoo-dev 2004-08-31 18:42:12 UTC
if you guys want to check to see if a system is using nptl or linuxthreads all you have to do is run `/lib/libc.so.6` ... but like one guy said on the ximian bugzilla, he's seen this on non-gentoo nptl-enabled systems
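
As an alternative to reading the banner printed by `/lib/libc.so.6`, the same information can usually be queried directly. A small sketch, assuming a glibc system whose `getconf` knows the `GNU_LIBPTHREAD_VERSION` variable; `threading_impl` is a hypothetical helper name, and the fallback string is an arbitrary choice:

```shell
# threading_impl: print the thread library the C library reports.
# Hypothetical helper name; relies on glibc's getconf knowing the
# GNU_LIBPTHREAD_VERSION variable, and prints "unknown" otherwise.
threading_impl() {
    getconf GNU_LIBPTHREAD_VERSION 2>/dev/null || echo unknown
}

threading_impl
```

On glibc the result should name either NPTL or linuxthreads together with a version.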
Comment 39 Canal Vorfeed 2004-09-11 15:51:25 UTC
Ok, I've spent a few hours on this issue and found where the REAL problem lies.

Good news: it's NOT Boehm's GC and it's NOT mono.

Bad news: it's a problem with glibc itself :-(

I started with the "GC is broken with nptl" sample by Peter Johanson and played with it for a few hours. In the end I just removed Boehm's GC completely (just plain old malloc) and... it still deadlocks somewhere.

So we should stop playing with mono and try to address the REAL issue: deadlocks somewhere in the nptl library itself :-( Unfortunately I'm not a glibc guru.

Program:

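#define _GNU_SOURCE             /* pthread_yield() is a GNU extension */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>             /* malloc */
#include <string.h>             /* strerror */
#include <unistd.h>             /* sleep */

/* Each thread makes 24 small allocations and prints one line per allocation. */
void *thread_function (void *args)
{
        int j;
        char *str;
        printf("starting thread!\n");
        for (j = 0; j < 24; j++)
        {
                str = (char *)malloc(240);      /* never freed, as in the original test */
                printf("malloc in thread !\n");
        }
        pthread_yield();
        return NULL;
}

/* Starts 1000 threads without joining them: 1000 * 24 = 24000 lines expected. */
int main (void)
{
        int i, rc;
        pthread_t thread;
        char *str;
        for (i=0;i<1000;i++)
        {
                rc = pthread_create( &thread, NULL, thread_function, (void *)(long)i);
                if (rc != 0)    /* report failures on stderr so the stdout counts are unchanged */
                        fprintf(stderr, "pthread_create: %s\n", strerror(rc));
                pthread_yield();
                str = (char *)malloc(240);
                printf("%d threads\n", i);
        }
        sleep(10);
        return 0;
}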

$ ./testpgm | grep 'malloc in thread' | wc -l
9168

Without nptl I get the expected 24000 ...
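
The expected figure follows from the loops in the program: 1000 threads times 24 `malloc` calls each gives 24000 lines. A small sketch for comparing a run against that figure; `missing_lines` is a hypothetical helper, and the build command in the usage comment assumes the source was saved as testpgm.c:

```shell
# missing_lines: read the test program's output on stdin and print how many
# "malloc in thread" lines are missing from the expected 1000 * 24.
missing_lines() {
    expected=$((1000 * 24))
    actual=$(grep -c 'malloc in thread' || true)
    echo $((expected - actual))
}

# Usage (not run here):
#   gcc -pthread -o testpgm testpgm.c && ./testpgm | missing_lines
```

For the run reported above, that would be 24000 - 9168 = 14832 missing lines.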

Portage 2.0.50-r11 (default-x86-2004.2, gcc-3.3.4, glibc-2.3.4.20040808-r0, 2.6.9-rc1-mm4)
=================================================================
System uname: 2.6.9-rc1-mm4 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz
Gentoo Base System version 1.5.3
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -pipe -march=pentium4 -funroll-loops -ffast-math -fomit-frame-pointer -ffloat-store -fforce-addr -ftracer -mmmx -msse -msse2 -mfpmath=sse"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -pipe -march=pentium4 -funroll-loops -ffast-math -fomit-frame-pointer -ffloat-store -fforce-addr -ftracer -mmmx -msse -msse2 -mfpmath=sse"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="ftp:///ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ ftp://vlaai.snt.ipv6.utwente.nl/pub/os/linux/gentoo/ ftp://ftp6.uni-erlangen.de/pub/mirrors/gentoo http://gentoo.spb.ru/rsync"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X apm arts avi berkdb bitmap-fonts crypt cups encode foomaticdb gdbm gif gnome gpm gtk gtk2 imlib jpeg kde libg++ libwww mad mikmod motif mpeg ncurses nls nptl oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl svga tcpd truetype x86 xml2 xmms xprint xv zlib"
Comment 40 Brandon Hale (RETIRED) gentoo-dev 2004-09-11 20:12:37 UTC

*** This bug has been marked as a duplicate of 63734 ***
Comment 41 Brandon Hale (RETIRED) gentoo-dev 2004-09-11 20:13:29 UTC
Moved this to a new bug which focuses on the "real" issue here.
Thanks to Canal for pointing us in the right directions.
Comment 42 Ed Catmur 2004-09-27 17:55:27 UTC
I've attached a patch to upstream (http://bugs.ximian.com/show_bug.cgi?id=60576 for the lazy) which should fix it.

I think what is happening is this: libgc is being built with -fexceptions (C++-compatible exception handling). This clobbers the stack unwinding in libpthread used by pthread_cleanup_{push,pop}. As a result the thread cleanup handler in libgc does not remove dead threads from the list of active threads. As a result dead threads are signalled (during global stop for garbage collection) and expected to signal back, so the garbage collector deadlocks waiting for non-existent threads to signal back. (Actually it's a little more involved, but you get the picture.)

As you can probably expect, that took quite a lot of debugging. gdb is not good at debugging zombie threads.
Comment 43 Ed Catmur 2004-09-27 18:33:59 UTC
Additional note: from upstream it appears that there is a different bug that affects gcc 3.3. My system is gcc-3.4.2-r2, glibc-2.3.4.20040808.