This morning I was doing some work on my alpha box and noticed slowness for some applications (screen, uptime, /etc/init.d/klive) and no response from others (/etc/init.d/apache2, /etc/init.d/jabber, /etc/init.d/mysql). I looked at the load averages and they kept climbing very rapidly.

$ uptime
08:07:40 up 17 days, 22:34, 3 users, load average: 55.45, 46.99, 42.09

The system was so bogged down that I couldn't run many commands, so I just rebooted. I checked the logs and found nothing useful in the system log or the mysql logs. However, in the kernel log I did find this:

Nov 1 05:04:53 [kernel] [5733083.383995] scheduling while atomic: mysqld/0x00000001/6720

I'm not sure how or why it happened or if I can expect it to happen again. I just upgraded to mysql-4.1.14 about 3 days ago and I'm not sure if that had something to do with it. Can someone tell me if the cause of this is a kernel bug, a mysql bug, or both?

Reproducible: Didn't try

Steps to Reproduce:
1. Unknown

Actual Results:
System became unresponsive, forcing me to reboot.

Expected Results:
It should have kept running smoothly without bringing my load average up to 55.45.
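For anyone scanning their own kernel log for the same message: as far as I can tell, 2.6-era kernels print it as task name, preempt count in hex, then PID, so the line above would mean mysqld (PID 6720) with a preempt count of 1. Here is a minimal Python sketch that pulls those fields out of a log line, assuming that layout. `parse_atomic_line` is a name I made up for illustration; it is not part of any tool mentioned in this report.

```python
import re

# Assumed layout: "scheduling while atomic: <task name>/0x<preempt count>/<pid>"
ATOMIC_RE = re.compile(
    r"scheduling while atomic: (?P<comm>.+)/0x(?P<count>[0-9a-fA-F]+)/(?P<pid>\d+)"
)


def parse_atomic_line(line):
    """Return (task_name, preempt_count, pid), or None if the line does not match."""
    match = ATOMIC_RE.search(line)
    if match is None:
        return None
    # The preempt count is printed in hexadecimal, the PID in decimal.
    return match.group("comm"), int(match.group("count"), 16), int(match.group("pid"))


if __name__ == "__main__":
    sample = (
        "Nov 1 05:04:53 [kernel] [5733083.383995] "
        "scheduling while atomic: mysqld/0x00000001/6720"
    )
    print(parse_atomic_line(sample))
```

Running it over each line of the kernel log and counting the non-None results would show how often, and for which process, the message fires.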
Portage 2.0.51.22-r3 (default-linux/alpha/2005.0, gcc-3.3.2, glibc-2.3.4.20041102-r1, 2.6.13.4 alpha)
=================================================================
System uname: 2.6.13.4 alpha EV56
Gentoo Base System version 1.6.13
dev-lang/python:     2.3.5-r2, 2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.15.92.0.2-r10
sys-devel/libtool:   1.5.20
virtual/os-headers:  2.6.8.1-r4
ACCEPT_KEYWORDS="alpha"
AUTOCLEAN="yes"
CBUILD="alpha-unknown-linux-gnu"
CFLAGS="-mieee -O3 -mcpu=ev4"
CHOST="alpha-unknown-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-mieee -O3 -mcpu=ev4"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://gentoo.risq.qc.ca/ http://mirror.arcticnetwork.ca/pub/gentoo/ http://adelie.polymtl.ca/ http://gentoo.cites.uiuc.edu/pub/gentoo/ ftp://gentoo.arcticnetwork.ca/pub/gentoo/"
LDFLAGS="-Wl,-O1"
LINGUAS="en"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.ca.gentoo.org/gentoo-portage"
USE="alpha X apache2 arts berkdb bitmap-fonts crypt cups curl eds encode esd fam font-server foomatic foomaticdb fortran gd gdbm gif gnome gpm gstreamer gtk gtk2 imlib jabber jpeg kde libg++ libwww mad mikmod motif mozilla mp3 mpeg mysql ncurses nls nptl nptlonly ogg oggvorbis opengl oss pam pdflib perl png postgres python qt quicktime readline sdl spell ssl tcpd tiff truetype truetype-fonts type1-fonts udev vorbis xml2 xmms xv zlib linguas_en userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LANG, LC_ALL
Created attachment 71870 [details] Kernel Config 2.6.13.4
Created attachment 71871 [details] MySQL Config File
I think there are known problems with 2.6.13 on alpha. You should try 2.6.14 if possible, but I'm not sure whether those issues are resolved there.
I just re-read the 2.6.14 changelog[1] and there are some atomic fixes for alpha in 2.6.14, but unfortunately 2.6.14 doesn't compile on alpha[2]. However, 2.6.14-git4[3] has a 2.6.14 compile fix[4]. I'll try the fix[4]; unfortunately it will be hard to tell if it worked because I ran 2.6.13.4 for 17 days without a problem, and I don't know how to reproduce the error. Thanks for the advice!

[1] http://kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.14
[2] http://bugzilla.kernel.org/show_bug.cgi?id=5512
[3] http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.14-git4.log
[4] http://lkml.org/lkml/2005/10/26/43
I applied the patch[1] and got 2.6.14 to compile. However, the "scheduling while atomic" thing is happening again. See output below:

pan ~ # uptime
07:49:27 up 2 days, 18:54, 1 user, load average: 42.17, 42.05, 42.01
pan ~ # tail /var/log/kernel/current
Nov 4 05:25:13 [kernel] [4426177.025788] scheduling while atomic: mysqld/0x00000001/10489
pan ~ # cat /proc/version
Linux version 2.6.14 (root@pan) (gcc version 3.3.2 20040119 (Gentoo Linux 3.3.2-r7, propolice-3.3-7)) #1 Tue Nov 1 12:05:18 EST 2005

I read the ChangeLog for 2.6.14-git7[2] and there are a lot of atomicity fixes. I'll try that kernel[3] and report back if this happens again.

[1] http://lkml.org/lkml/2005/10/26/43
[2] http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.14-git7.log
[3] http://kernel.org/pub/linux/kernel/v2.6/snapshots/patch-2.6.14-git7.gz
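One thing that stands out in the uptime output above is that the 1, 5 and 15 minute averages are all about 42, which may suggest a fixed set of tasks has been stuck for a while rather than a short spike (Linux counts tasks in uninterruptible sleep toward the load average). Below is a rough Python sketch of a check for that pattern. The function names and the `threshold` and `spread` values are arbitrary choices of mine for illustration, not anything taken from the kernel or from this report.

```python
import re


def load_averages(uptime_output):
    """Extract the 1, 5 and 15 minute load averages from `uptime` output."""
    match = re.search(r"load average: ([\d.]+), ([\d.]+), ([\d.]+)", uptime_output)
    if match is None:
        raise ValueError("no load average found in input")
    return tuple(float(value) for value in match.groups())


def looks_pinned(averages, threshold=10.0, spread=1.0):
    """Heuristic: all three averages are high and nearly equal.

    threshold and spread are illustrative guesses; tune them per machine.
    """
    return min(averages) >= threshold and max(averages) - min(averages) <= spread


if __name__ == "__main__":
    line = "07:49:27 up 2 days, 18:54, 1 user, load average: 42.17, 42.05, 42.01"
    averages = load_averages(line)
    print(averages, looks_pinned(averages))
```

By this heuristic the output above counts as pinned, while the original report's 55.45, 46.99, 42.09 does not, since the load was still climbing at that point.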
I haven't gotten the 'scheduling while atomic' error in over a week. Upgrading to the latest kernel solved the problem.