OK, so lately Mozilla (both 1.5b and 1.4-r4) has frequently been falling into an "uninterruptible sleep" state (state D in the output from 'ps'):

    please root # ps auxw |grep moz
    ....
    bryan 3252 1.7 3.6 58312 38128 ? D 18:09 0:26 /usr/lib/mozilla/mozilla-bin -splash

Processes in this state are utterly un-kill-able, both from my experience and from what I could glean off a web search. They won't die when quitting X. They won't die if their parent process is killed.

    please root # killall -9 mozilla-bin
    please root # ps auxw |grep moz
    bryan 3271 0.0 0.0 0 0 ? Z 18:09 0:00 [mozilla-bin] <defunct>
    bryan 3252 1.7 3.6 58312 38128 ? D 18:09 0:26 /usr/lib/mozilla/mozilla-bin -splash

The only way to get rid of them (and so to be able to run Mozilla again under my profile) is to reboot. Very annoying, to say the least.

It seems that this problem has cropped up before, with various apps (including Mozilla, but others as well), in 2.4.3 and various 2.4.22_pre* kernels at least. I am filing this report here on the long shot that someone here knows more about this than I could find online, or at the least, might be more closely connected with either Mozilla or kernel development than I am (which is to say, not at all).

I ran strace on Mozilla and waited until it hung. The output was:

    lseek(20, 9216, SEEK_SET) = 9216
    read(20, "\0\1\0\5\221\0\0\24\0\0\0\4?\206\7\1?\206\6\364?\300+\250"..., 512) = 512
    gettimeofday({1065748572, 220063}, NULL) = 0
    gettimeofday({1065748572, 220087}, NULL) = 0
    gettimeofday({1065748572, 220171}, NULL) = 0
    kill(3318, SIGRTMIN) = 0
    rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
    rt_sigsuspend([] <unfinished ...>
    --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
    Process 3252 detached

Interestingly enough, the following program:

    #include <signal.h>
    #include <stdio.h>

    int main()
    {
        printf("SIGRTMIN=%d\n", SIGRTMIN);
        printf("SIGRTMAX=%d\n", SIGRTMAX);
    }

outputs:

    please root # ./a.out
    SIGRTMIN=35
    SIGRTMAX=64

which seems to contradict the strace output claiming SIGRTMIN was signal 32.
I also stumbled across various newsgroup postings about glibc problems with SIGRTMIN. Could this have something to do with this problem?

If anyone has *any* information at all about this, it would be very greatly appreciated.

Reproducible: Sometimes

    please root # emerge info
    Portage 2.0.49-r10 (default-x86-1.4, gcc-3.3.1, glibc-2.3.2-r3, 2.4.22)
    =================================================================
    System uname: 2.4.22 i686 AMD Athlon(tm) MP 2000+
    Gentoo Base System version 1.4.3.10p1
    ccache version 2.3 [disabled]
    ACCEPT_KEYWORDS="x86"
    AUTOCLEAN="yes"
    CFLAGS="-march=athlon-mp -O2 -ftracer -ffast-math -pipe -fomit-frame-pointer"
    CHOST="i686-pc-linux-gnu"
    COMPILER="gcc3"
    CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
    CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
    CXXFLAGS="-O2 -mcpu=i686 -pipe"
    DISTDIR="/usr/local/portage/distfiles"
    FEATURES="sandbox autoaddcvs fixpackages"
    GENTOO_MIRRORS="http://gentoo.oregonstate.edu http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
    MAKEOPTS="-j4"
    PKGDIR="/usr/portage/packages"
    PORTAGE_TMPDIR="/usr/local/portage"
    PORTDIR="/usr/portage"
    PORTDIR_OVERLAY="/usr/local/portage/ebuilds"
    SYNC="rsync://rsync.gentoo.org/gentoo-portage"
    USE="x86 oss apm cups foomaticdb mad mikmod nls xml2 gdbm slang bonobo ruby libwww motif cdr X 3dnow mmx directfb sse dga opengl xv fbcon kde qt qtmt arts tcltk aalib imlib ncurses readline sdl svga lcms gif jpeg png tiff gd avi mpeg quicktime esd gtk gtk2 -gnome alsa ggi dvd xmms oggvorbis encode pam ssl crypt tcpd mozilla spell truetype xml pdflib plotutils tetex guile perl python libg++ atlas pic berkdb mysql postgres odbc samba gpm zlib java ppds threads"
I'm not an strace guru, but I don't see anything in that snippet that looks like a block or an I/O problem. Someone correct me if I'm wrong. Did you run strace with the -f option so that child processes were monitored? I went through this exercise several weeks ago and found that mozilla was blocking while trying to talk to the esound daemon (esd), most likely because of the flash plug-in. I've also seen this type of behavior when the client was first run with root privileges and then by the regular user. The restrictive modes on some of the chrome directories prevent mozilla from accessing chrome files, leaving the client pretty much useless until the ownership issue is corrected.
Very interesting. I did find other apps that will do this, including esd (and sometimes artsd) when they were writing to a socket. I uninstalled flash and have not experienced the problem with mozilla since then. On the other hand, I had used flash without incident for quite some time before this. Do you have any insights into what could trigger this problem so suddenly, or where the root of the problem is (kernel, glibc, hardware)?
Maybe another process has control of the sound device? If you're running KDE, try killing artsd and see if the problem goes away. If artsd is the problem, try starting mozilla with artsdsp (I think this is the solution, but I'm not certain).
This may sound strange, but I think it's related to using newer versions of Gaim, 70-r2 and later (I tried 0.71 too). That thought had crossed my mind, since the start of this problem corresponded roughly to when I upgraded to 70-r2, and it always seemed to happen while gaim was running or shortly after gaim started. So, I switched to 2.4.22-ck last night in hopes that it might have a positive effect. Things actually seemed to be going well, but then I ran Gaim and shortly thereafter Moz hung in the D state again, even with the new kernel. I downgraded to Gaim 0.68 after that, and so far I have not seen the problem recur once (it had been happening several times a day for at least a week). I realize none of this establishes any definite causal link, but I have also seen the newer versions of Gaim do other strange things, like peg one of the CPUs at 100% constantly for no reason.
I've seen gaim make the cpu hit 100% only because of buddy animations ... I have a friend whose animation cycles many times a second, and when his window is open with the icon showing, cpu stays at 100% ...
This problem has fixed itself for some time now.