| Summary: | gentoo-sources-2.6.31 crashes localmount after fsck.xfs | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Harris Landgarten <harrisl> |
| Component: | [OLD] Core system | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
| Status: | RESOLVED NEEDINFO | ||
| Severity: | critical | CC: | tetromino |
| Priority: | High | ||
| Version: | unspecified | ||
| Hardware: | AMD64 | ||
| OS: | Linux | ||
| Whiteboard: | linux-2.6.31 | ||
| Package list: | Runtime testing required: | --- | |
|
Description
Harris Landgarten
2009-09-11 15:06:14 UTC
Can you transcribe the exact error?

Nothing from the failed boots was logged. This is a section of the 2.6.30-r6 boot log, from the boot that repaired the xfs partition. It seemed to me that the repair of the xfs partition was not finishing, and could not finish, under 2.6.31, which caused localmount to crash and hang. I am reluctant to try to reproduce this error because of the possibility of data loss.

Sep 10 21:00:22 harrisl-desktop kernel: REISERFS (device dm-3): checking transaction log (dm-3)
Sep 10 21:00:22 harrisl-desktop kernel: REISERFS (device dm-3): Using r5 hash to sort names
Sep 10 21:00:22 harrisl-desktop kernel: SGI XFS with ACLs, security attributes, realtime, large block/inode numbers, no debug enabled
Sep 10 21:00:22 harrisl-desktop kernel: SGI XFS Quota Management subsystem
Sep 10 21:00:22 harrisl-desktop kernel: XFS mounting filesystem dm-5
Sep 10 21:00:22 harrisl-desktop kernel: Starting XFS recovery on filesystem: dm-5 (logdev: internal)
Sep 10 21:00:22 harrisl-desktop kernel: Ending XFS recovery on filesystem: dm-5 (logdev: internal)

Not sure how much we can do here. I don't want you to experience any data loss. If you want to investigate this further and reproduce the error, reopen this bug.

The problem turned out to be fastboot in the grub command line. It worked in 2.6.29 and 2.6.30 but now seems to cause a race condition which causes a hang at mounting local filesystems. I don't know for sure if the change was an update of openrc or 2.6.31.

(In reply to comment #4)
> The problem turned out to be fastboot in the grub command line. It worked in
> 2.6.29 and 2.6.30 but now seems to cause a race condition which causes a hang
> at mounting local filesystems. I don't know for sure if the change was an
> update of openrc or 2.6.31
>
I guess that 0.4.3-r3 is the openrc version with which the issue occurs, right? Could you try booting a 2.6.29/2.6.30 kernel with the defective openrc version and see if it crashes or not?
If it boots alright, then 2.6.31 is probably to blame. If it crashes, we will have to hand this bug to the openrc people.

I saw the hang with 2.6.30; it took a long time but it eventually got through it, which is why I immediately suspected fastboot. I did not try it with 2.6.29, but it doesn't happen every time in any case. It is some sort of race condition that has to do with whether or not fsck has to be run. I have 8 partitions: 1 ext, 1 xfs, the rest reiserfs. I think the bug is much more likely in openrc-0.4.3-r3.

More info: every time I close a virtual Windows XP that has been running for a day or two in vmware-workstation, where the storage is on an xfs partition, the vmware screen turns black and after about 10 minutes appears to shut down without returning to the vmware console. I notice that even after this the virtual machine still shows up in ps, and iotop shows [kdmflush] taking 100% of io. This will continue for another 10 minutes or so, after which all returns to normal. If the machine is rebooted before [kdmflush] completes, the boot stalls at mounting local filesystems for 5 - 10 minutes and then completes. I believe this stall is fsck.xfs running. A reboot at this point starts normally. There seems to be an issue in XFS which is causing these lengthy delays.

(In reply to comment #7)
> iotop shows [kdmflush] taking 100% of io

Same here. xfs on lvm, gentoo-sources-2.6.31-r2. The behavior seems to be triggered by large filesystem operations (deleting a large directory tree, unpacking gcc source code package, etc.).

(In reply to comment #8)
> (In reply to comment #7)
> > iotop shows [kdmflush] taking 100% of io
>
> Same here. xfs on lvm, gentoo-sources-2.6.31-r2. The behavior seems to be
> triggered by large filesystem operations (deleting a large directory tree,
> unpacking gcc source code package, etc.).
>
I think I've seen it, too.
I'm on gentoo-2.6.30 (rSomething), lvm, all xfs (except /boot, which is ext2); I'm not using openrc, so I'm on baselayout-1.x and sysvinit. After an unclean poweroff (I was starting with the wrong profile, so I kept the laptop power button pressed for a few seconds), it did take a while to mount the filesystems. Note that as far as I know xfs doesn't have an fsck command, so I have "0" as my last column in fstab for the xfs fs; the xfs kernel module is supposed to replay the journal at the next mount. So, I'd say it is a kernel issue. Not necessarily a 2.6.31 one, but one that may have become worse with 2.6.31. I think I've seen some reports (the regression list 2.6.30->2.6.31 emails) about xfs having some trouble.

dragos

Does anyone have the issue with later versions of openrc? openrc >= 5.2 |
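
For anyone comparing setups against the reports above, here is a minimal Python sketch of the two things the thread keeps coming back to: the last fstab column (fs_passno) on xfs entries, and whether `fastboot` is on the kernel command line. The function names, device paths, and sample command line are hypothetical and only for illustration; on a real system the inputs would come from `/etc/fstab` and `/proc/cmdline`. It only reports values and makes no claim about what openrc or the kernel does with them.

```python
def xfs_fstab_entries(fstab_text):
    """Return (device, mountpoint, fs_passno) for every xfs line in fstab text."""
    entries = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3 or fields[2] != "xfs":
            continue  # not an xfs entry
        # fs_passno is the sixth field; a missing value is treated as 0
        passno = int(fields[5]) if len(fields) > 5 else 0
        entries.append((fields[0], fields[1], passno))
    return entries


def has_boot_flag(cmdline_text, flag="fastboot"):
    """True if `flag` appears as a standalone word on the kernel command line."""
    return flag in cmdline_text.split()


# Hypothetical sample data, not taken from any reporter's machine.
SAMPLE_FSTAB = """\
# <fs>        <mountpoint> <type> <opts>         <dump> <pass>
/dev/sda1     /boot        ext2   noauto,noatime 1      2
/dev/vg/data  /data        xfs    noatime        0      0
/dev/vg/home  /home        xfs    defaults       0      2
"""
SAMPLE_CMDLINE = "root=/dev/sda3 fastboot quiet"

for device, mountpoint, passno in xfs_fstab_entries(SAMPLE_FSTAB):
    note = "boot-time fsck pass skipped" if passno == 0 else "boot-time fsck pass requested"
    print(f"{device} on {mountpoint}: fs_passno={passno} ({note})")

print("fastboot on kernel command line:", has_boot_flag(SAMPLE_CMDLINE))
```

To run it against a live system, pass `open("/etc/fstab").read()` and `open("/proc/cmdline").read()` instead of the sample strings.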