Bug 284611 - gentoo-sources-2.6.31 crashes localmount after fsck.xfs
Summary: gentoo-sources-2.6.31 crashes localmount after fsck.xfs
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard: linux-2.6.31
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-11 15:06 UTC by Harris Landgarten
Modified: 2009-11-13 23:18 UTC

See Also:
Package list:
Runtime testing required: ---


Description Harris Landgarten 2009-09-11 15:06:14 UTC
I store VM images on an LVM partition formatted as XFS. After an incomplete shutdown of VMware I rebooted. 2.6.31 started properly and fscked the XFS volume, but immediately after that the machine hung on starting localmount. Another reboot hung in the same place, this time with a message, shown just before the hang, that localmount had crashed.

I rebooted into 2.6.30-r6 and all came up properly. I then rebooted into 2.6.31 and all was fine.

I have not tested further, but there seems to be a major problem.

Reproducible: Always




Portage 2.1.6.13 (default/linux/amd64/10.0/desktop, gcc-4.4.1, glibc-2.10.1-r0, 2.6.31-gentoo x86_64)
=================================================================
System uname: Linux-2.6.31-gentoo-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q9550_@_2.83GHz-with-gentoo-2.0.1
Timestamp of tree: Fri, 11 Sep 2009 14:30:01 +0000
app-shells/bash:     4.0_p33
dev-java/java-config: 2.1.9
dev-lang/python:     2.5.4-r3, 2.6.2-r1, 3.1.1
dev-util/cmake:      2.6.4-r2
sys-apps/baselayout: 2.0.1
sys-apps/openrc:     0.4.3-r3
sys-apps/sandbox:    2.1
sys-devel/autoconf:  2.13, 2.63-r1
sys-devel/automake:  1.7.9-r1, 1.9.6-r2, 1.10.2, 1.11
sys-devel/binutils:  2.19.1-r1
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6a
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="C"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa amd64 berkdb bluetooth branding bzip2 cairo cdr cli cracklib crypt cups dbus dri dts dvd dvdr eds emboss encode evo fam firefox flac fortran gdbm gif gnome gpm gstreamer gtk hal iconv ipv6 isdnlog jpeg kde ldap libnotify mad mikmod mmx mp3 mp4 mpeg mudflap multilib ncurses nls nptl nptlonly ogg opengl openmp pam pcre pdf perl png ppds pppd python qt3support qt4 quicktime readline reflection sdl session spell spl sse sse2 ssl startup-notification svg sysfs tcpd thunar tiff truetype unicode usb vorbis x264 xml xorg xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nv r128 radeon savage sis tdfx trident vesa vga via vmware voodoo"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Comment 1 Mike Pagano gentoo-dev 2009-09-11 15:25:14 UTC
Can you transcribe the exact error?
Comment 2 Harris Landgarten 2009-09-11 15:39:52 UTC
Nothing from the failed boots was logged. This is a section of the 2.6.30-r6 boot which repaired the XFS partition. It seemed to me that the repair of the XFS partition was not finishing, and could not finish, under 2.6.31, which caused localmount to crash and hang. I am reluctant to try to reproduce this error because of the possibility of data loss.


Sep 10 21:00:22 harrisl-desktop kernel: REISERFS (device dm-3): checking transaction log (dm-3)
Sep 10 21:00:22 harrisl-desktop kernel: REISERFS (device dm-3): Using r5 hash to sort names
Sep 10 21:00:22 harrisl-desktop kernel: SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
Sep 10 21:00:22 harrisl-desktop kernel: SGI XFS Quota Management subsystem
Sep 10 21:00:22 harrisl-desktop kernel: XFS mounting filesystem dm-5
Sep 10 21:00:22 harrisl-desktop kernel: Starting XFS recovery on filesystem: dm-5 (logdev: internal)
Sep 10 21:00:22 harrisl-desktop kernel: Ending XFS recovery on filesystem: dm-5 (logdev: internal)
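
One way to test the "recovery never finished" theory, if kernel messages from a failed boot do reach a log, is to pair each "Starting XFS recovery" line with its "Ending XFS recovery" line. The following is only a sketch: the message format is copied from the excerpt above, and the function name unfinished_recoveries is made up for illustration.

```python
import re

# Message format taken from the log excerpt in this report.
_RECOVERY = re.compile(r"(Starting|Ending) XFS recovery on filesystem: (\S+)")


def unfinished_recoveries(log_text):
    """Return the set of devices whose XFS recovery started but never ended."""
    open_devs = set()
    for line in log_text.splitlines():
        match = _RECOVERY.search(line)
        if not match:
            continue
        event, dev = match.groups()
        if event == "Starting":
            open_devs.add(dev)
        else:
            # An "Ending" line closes the recovery for that device.
            open_devs.discard(dev)
    return open_devs
```

Run against the three XFS lines above it returns an empty set, which matches the successful 2.6.30-r6 boot; a log that stops after the "Starting" line would report dm-5.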
Comment 3 Mike Pagano gentoo-dev 2009-09-12 14:47:04 UTC
Not sure how much we can do here. I don't want you to experience any data loss.

If you want to investigate this further and reproduce the error, reopen this bug.
Comment 4 Harris Landgarten 2009-09-16 02:46:09 UTC
The problem turned out to be fastboot in the grub command line. It worked in 2.6.29 and 2.6.30 but now seems to cause a race condition which causes a hang at mounting local filesystems. I don't know for sure whether the change came from an update of openrc or from 2.6.31.
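
For anyone comparing setups, a quick check of whether the running kernel was booted with that flag might look like the sketch below. It assumes a Linux system exposing /proc/cmdline; the helper names are hypothetical, and which component actually acts on the flag is not established in this report.

```python
def has_boot_flag(cmdline, flag="fastboot"):
    """Return True if `flag` appears as a standalone word on the command line."""
    # Splitting on whitespace avoids matching substrings such as "nofastboot".
    return flag in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the kernel command line of the running system (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as fh:
        return fh.read().strip()
```

On the affected machine, has_boot_flag(read_cmdline()) would show whether a given boot actually had the flag set.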
Comment 5 George Kadianakis (RETIRED) gentoo-dev 2009-09-16 13:41:28 UTC
(In reply to comment #4)
> The problem turned out to be fastboot in the grub command line. It worked in
> 2.6.29 and 2.6.30 but now seems to cause a race condition which causes a hang
> at mounting local filesystems. I don't know for sure if the change was an
> update of openrc or 2.6.31
> 

I guess that 0.4.3-r3 is the openrc version with which the issue occurs, right?
Could you try booting a 2.6.29/2.6.30 kernel with the defective openrc version and see if it crashes or not?

If it boots alright, then 2.6.31 is, probably, to blame. If it crashes we will have to hand this bug to the openrc people.
Comment 6 Harris Landgarten 2009-09-16 16:09:37 UTC
I saw the hang with 2.6.30. It took a long time but it eventually got through it, which is why I immediately suspected fastboot. I did not try it with 2.6.29, but it doesn't happen every time in any case. It is some sort of race condition that has to do with whether or not fsck has to be run. I have 8 partitions: 1 ext, 1 xfs, the rest reiserfs.

I think the bug is much more likely in openrc-0.4.3-r3.
Comment 7 Harris Landgarten 2009-09-27 01:08:37 UTC
More info. Every time I close a virtual Windows XP that has been running for a day or two in vmware-workstation where the storage is on an xfs partition, the vmware screen turns black and after about 10 minutes appears to shut down without returning to the vmware console. I notice that even after this the virtual machine still shows up in ps, and iotop shows [kdmflush] taking 100% of io. This will continue for another 10 minutes or so, after which all returns to normal.

If the machine is rebooted before [kdmflush] completes, the boot stalls at mounting local filesystems for 5 - 10 minutes and then completes. I believe this stall is fsck.xfs running. A reboot at this point starts normally.

There seems to be an issue in XFS which is causing these lengthy delays.
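
A rough way to tell whether it is safe to reboot is to look at how much data is still waiting to be written. The sketch below sums the Dirty and Writeback fields of /proc/meminfo. Those figures are system-wide, not specific to the XFS volume or to [kdmflush], so this is only an approximation, and the function name is hypothetical.

```python
def pending_writeback_kb(meminfo_text):
    """Sum the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    total = 0
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        # Exact key match, so WritebackTmp and similar fields are ignored.
        if key in ("Dirty", "Writeback"):
            total += int(rest.split()[0])
    return total
```

Reading /proc/meminfo and passing its contents to this function a few times in a row should show the number falling as the flush proceeds; a value near zero suggests the writeback has finished.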
Comment 8 Alexandre Rostovtsev (RETIRED) gentoo-dev 2009-10-18 04:05:07 UTC
(In reply to comment #7)
> iotop shows [kdmflush] taking 100% of io

Same here. xfs on lvm, gentoo-sources-2.6.31-r2. The behavior seems to be triggered by large filesystem operations (deleting a large directory tree, unpacking gcc source code package, etc.).
Comment 9 Dragos Delcea 2009-10-26 10:16:10 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > iotop shows [kdmflush] taking 100% of io
> 
> Same here. xfs on lvm, gentoo-sources-2.6.31-r2. The behavior seems to be
> triggered by large filesystem operations (deleting a large directory tree,
> unpacking gcc source code package, etc.).
> 
I think I've seen it, too. I'm on gentoo-2.6.30 (rSomething), lvm, all xfs (except /boot which is ext2); I'm not using openrc, so I'm on baselayout-1.x and sysvinit.
After an unclean poweroff (I was starting with the wrong profile, so I kept the laptop power button pressed for a few seconds), it did take a while to mount the filesystems.
Note that as far as I know xfs doesn't have an fsck command, so I have "0" as my last column in fstab for the xfs fs; the xfs kernel module is supposed to replay the journal at the next mount.
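
To compare fstab settings between affected systems, the pass number of every xfs entry can be listed with something like the sketch below. It assumes the standard six-field fstab layout, where a missing sixth field is treated as 0 per fstab(5); the function name and the sample paths in the usage note are made up for illustration.

```python
def xfs_passno(fstab_text):
    """Map each xfs mount point in fstab text to its fs_passno (sixth field)."""
    result = {}
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3 or fields[2] != "xfs":
            continue
        # fstab(5): a missing sixth field is treated as 0 (no boot-time check).
        result[fields[1]] = int(fields[5]) if len(fields) >= 6 else 0
    return result
```

For example, a line such as "/dev/vg/vm /vm xfs noatime 0 0" yields {"/vm": 0}, meaning fsck is not asked to check that filesystem at boot.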

So, I'd say it is a kernel issue. Not necessarily a 2.6.31 one, but one that may have become worse with 2.6.31. I think I've seen some reports (the regression list 2.6.30->2.6.31 emails) about xfs having some trouble.

dragos
Comment 10 Mike Pagano gentoo-dev 2009-11-04 16:38:47 UTC
Does anyone have the issue with later versions of openrc?

openrc >= 5.2