I hope to provide more details later, but I wanted to get this in before someone else wastes three days trying to track down the bug.

I upgraded from 2.4.xx to gentoo-sources-2.6.11-r4 over the weekend, and while everything else went surprisingly well, the one problem I ran into was that the sync call wouldn't return. This caused utilities such as umount to hang as well, and an strace showed that the programs always got stuck at "sync(". Because umount wouldn't return, other important commands like reboot and telinit also failed to work properly, and there was no way to shut the computer down cleanly.

I then compiled vanilla-sources-2.6.11.5 with precisely the same .config file, and everything works just fine. I'm marking this as critical simply because of the inability to cleanly unmount the drives.

Reproducible: Always

Steps to Reproduce:
1. Using my config file, compile and boot gentoo-sources-2.6.11-r4.
2. Run sync.
3. Using the same .config, compile and boot vanilla-sources-2.6.11.5.
4. Run sync again.

Actual Results:
Under gentoo-sources, sync simply doesn't return. There are no errors in dmesg or anywhere else I could think to look. Under vanilla-sources there is no problem.

Expected Results:
I would expect both kernels to behave the same if the bug weren't in the kernel, and if there were no bug, sync would return pretty much immediately.
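For anyone who wants to check whether a given kernel is affected without losing a terminal to a hung sync, the following is a minimal sketch, not part of the original report. It assumes Python 3.3 or later (newer than the Python versions listed in the emerge info), and the helper name command_returns is hypothetical. It starts the command as a child process and gives up waiting after a timeout instead of blocking forever.

```python
import subprocess


def command_returns(argv, timeout_s=30.0):
    """Return True if argv exits within timeout_s seconds, else False.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    proc = subprocess.Popen(argv)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Ask the kernel to kill the child, but do not wait for it again:
        # a process blocked inside a system call may not die promptly,
        # and waiting here would hang this script too.
        proc.kill()
        return False
```

Calling command_returns(["sync"]) on the affected kernel would be expected to return False after the timeout, and True almost immediately on a kernel where sync works. The child may linger after the script exits if it is stuck inside the system call.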
Portage 2.0.51.19 (default-linux/x86/2004.0, gcc-3.3.5, glibc-2.3.4.20041102-r1,glibc-2.3.2-r3, 2.6.11.5 i586)
=================================================================
System uname: 2.6.11.5 i586 AMD-K6(tm) 3D processor
Gentoo Base System version 1.4.16
Python: dev-lang/python-2.2.3-r5,dev-lang/python-2.3.4-r1 [2.3.4 (#1, Feb 14 2005, 17:49:24)]
distcc 2.16 i586-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
dev-lang/python: 2.2.3-r5, 2.3.4-r1
sys-devel/autoconf: 2.59-r6, 2.13
sys-devel/automake: 1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.4
sys-devel/binutils: 2.15.92.0.2-r1
sys-devel/libtool: 1.5.10-r4
virtual/os-headers: 2.6.8.1-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=k6-2 -Os -pipe"
CHOST="i586-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=k6-2 -Os -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig ccache distcc distlocks sandbox sfperms"
GENTOO_MIRRORS="http://127.0.0.1:9115/http:/gentoo.osuosl.org/ http://gentoo.oregonstate.edu"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://debian.tamu.edu/gentoo-portage"
USE="x86 3dnow 3dnowex X X509 aalib alsa apache2 avi berkdb bitmap-fonts cdparanoia crypt cups curl dga directfb divx4linux dvd emboss encode fam fbcon flac font-server foomaticdb fortran gd gdbm ggi gif gmttria gtk gtk2 gtkhtml imagemagick imap imlib ipv6 java jikes jpeg junit ldap libg++ libwww maildir matroska matrox mbox md5sum mikmod mmx mmxext motif mozilla mp3 mpeg mpi mysql ncurses network nls nocardbus nptl nptlonly odbc offensive ogg oggvorbis opengl oss pam parse-clocks pdflib perl php pic plotutils png pnp python qt qtmt quicktime readline real rtc ruby samba sdl slp snmp spell ssl svga tcltk tcpd theora tiff transcode truetype truetype-fonts type1 type1-fonts usb v4l2 wmf wxwindows xface xml xml2 xv xvid zlib"
Unset: ASFLAGS, CBUILD, CTARGET, LANG, LC_ALL, LDFLAGS
Is this easily reproducible? (i.e. does it happen on *every* sync call, even after a reboot?)
I believe it happens every single time, and definitely across reboots. Once I thought I saw a sync go through, but I never saw it happen again. Even assuming that sync did go through, that would be about one success out of hundreds of tries. So yes, to answer simply, the bug is very reproducible.
Please could you test 2.6.11-r6? We removed some patches which might have been interfering here.
see comment #3