Summary: | XFS lockup on no free space | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Martin Decky <martin> |
Component: | [OLD] Core system | Assignee: | Bob Johnson (RETIRED) <livewire> |
Status: | RESOLVED FIXED | ||
Severity: | critical | CC: | lostlogic |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | x86 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | | Runtime testing required: | --- |
Description
Martin Decky
2003-01-24 07:08:41 UTC
The bug was confirmed in a VMware 3.2 virtual environment, in a clean new installation of Gentoo 1.4_rc2. Conclusion: this is definitely an XFS bug.

I just got done talking with some XFS devs; they have one bug report on this and can't reproduce it. Neither can I. In a VMware session the partition fills up and errors out saying the device is full. No lockups, kernel oopses, etc. I will keep an eye on this.

That's really strange. I can reproduce the bug on two different machines and in VMware. I would also try it on some of the servers I administer, but I can't afford to lock them up. While you were testing, were you filling up the filesystem under the following conditions?

1. CONFIG_XFS_DEBUG was not set in the kernel configuration
2. the filesystem was on a real disk partition (not in a file or ramdisk) and was about 1 GB in size (or larger)
3. the disk device was an IDE hard drive (this could actually be the source of the bug; I'm going to run some additional tests)

Give me some time to go through it again. If I am able to reproduce the bug again, I will try to create a small bootable ISO image with just a minimal kernel, bash and a script that triggers the bug (but again, you would need an XFS filesystem on a hard drive partition). Or I can even give you root access to the VMware session and let you watch the bug in an environment where it really happens.

After two days of testing the new (unstable) xfs-sources-2.4.20-r3, I have successfully compiled a kernel with my usual configuration that crashes neither on my computer nor in VMware. However, this unstable release doesn't contain the "gcc31-compile-optimalizations" patch, so I set the Processor family option to "Pentium-III/Celeron(Coppermine)". After applying the gcc31 patch myself and compiling the kernel with my original option "Athlon-XP(gcc>31)", the bug appeared again. This could indicate a bug in GCC.
I will do additional tests in the following days, but for now I can describe in more detail what happens during the crash: the kernel processes kupdated and bdflush, together with the process creating the file that fills the filesystem, take 100% of the CPU time (all of it shown as "system"). The system doesn't freeze completely: processes that don't do any disk I/O (e.g. bash, top) keep running, but any process that tries to access any filesystem freezes in kernel space.
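
For anyone trying to reproduce this, the "process creating the file which fills the filesystem" could look roughly like the sketch below. It is not the script used in this report: the function name `fill_directory`, the scratch file name `fill.tmp` and the `max_bytes` safety cap are all assumptions made for illustration. It only writes zeros until the kernel reports ENOSPC; whether that triggers the lockup presumably still depends on the conditions described above (XFS on a real IDE partition, kernel built without CONFIG_XFS_DEBUG).

```python
import errno
import os


def fill_directory(target_dir, chunk_size=1 << 20, max_bytes=None):
    """Write zero-filled chunks to a scratch file until the filesystem is full.

    Returns (bytes_written, reason), where reason is "enospc" if the kernel
    reported a full device, or "limit" if the optional max_bytes cap was hit.
    The scratch file is removed before returning.
    """
    path = os.path.join(target_dir, "fill.tmp")  # hypothetical scratch file name
    chunk = b"\0" * chunk_size
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while True:
            if max_bytes is not None and written >= max_bytes:
                return written, "limit"
            size = chunk_size
            if max_bytes is not None:
                # Do not overshoot the safety cap on the last chunk.
                size = min(size, max_bytes - written)
            try:
                # os.write may write fewer bytes than requested; count what
                # was actually written and let the next call report ENOSPC.
                written += os.write(fd, chunk[:size])
            except OSError as exc:
                if exc.errno == errno.ENOSPC:
                    return written, "enospc"
                raise
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling `fill_directory("/mnt/xfs-test")` (the mount point is a placeholder) with `max_bytes` left at `None` keeps writing until the device is full. On an affected kernel the call may never return, so this should only be run on a scratch partition on a machine that can be reset.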