sys-kernel/xfs-sources 2.4.19-r2
sys-kernel/xfs-sources 2.4.20_pre5
(other versions may be affected too)

When I try to create a file on an XFS filesystem and there is not enough free space, the whole system locks up. The kernel still replies to pings over Ethernet and Num Lock still works, but that's all. There is no debugging output and no kernel oops or panic.

The bug can be reproduced at any time by running

cat /dev/zero > /some/xfs/filesystem/foo

The only XFS filesystems that are not affected are those mounted with the -o loop option.

Note: When I compile the kernel with "XFS debugging" on, the bug is gone.

Software configuration:
gcc version 3.2.1 20021207 (Gentoo Linux 3.2.1-20021207)
glibc 2.3.1-r2
xfs-sources kernel compiled with the Athlon-XP processor family option
XFS compiled into the kernel

Hardware configuration:
Athlon XP 2000+
Soltek 75DRV (VIA KT266)
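For anyone trying to reproduce this, the cat command above can be wrapped in a small script that reports how far it got. This is only a sketch: the function name fill_until_full and its optional size cap are made up for illustration and are not part of any existing tool. On an unaffected kernel the loop should end with a failed write once the filesystem is full; on an affected kernel it is expected to hang instead, so do not run it without a cap on a machine you cannot afford to lock up.

```shell
#!/bin/bash
# fill_until_full TARGET_FILE [MAX_MIB]   (hypothetical helper, illustration only)
# Appends 1 MiB blocks of zeros to TARGET_FILE until a write fails
# (normally "No space left on device") or MAX_MIB blocks have been written.
fill_until_full() {
    local target=$1
    local max_mib=${2:-0}       # 0 means no cap: keep going until a write fails
    local written=0

    : > "$target" || return 2   # create or truncate the target file

    while [ "$max_mib" -eq 0 ] || [ "$written" -lt "$max_mib" ]; do
        # dd exits non-zero when it cannot write the full block
        if ! dd if=/dev/zero bs=1048576 count=1 2>/dev/null >> "$target"; then
            echo "write failed after ${written} MiB (filesystem full?)"
            return 1
        fi
        written=$((written + 1))
    done

    echo "stopped at cap after ${written} MiB"
    return 0
}

# Example (path taken from the report above):
#   fill_until_full /some/xfs/filesystem/foo
```

Writing in fixed-size blocks instead of a single cat makes it possible to see roughly how much data was written before the write failed or the machine stopped responding.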
The bug was confirmed in a VMware 3.2 virtual environment, in a clean new installation of Gentoo 1.4_rc2. Conclusion: This is definitely an XFS bug.
Just got done talking with some XFS devs. They have one bug report on this and can't reproduce it. Neither can I. In a VMware session the partition fills up and the write fails with an error saying the device is full. No lockups, no kernel oops, etc. I will keep an eye on this.
That's really strange - I can reproduce the bug on two different machines and in VMware. I would also try it on some of the servers I administer, but I can't afford to lock them up.

While you were testing, did you fill up the filesystem under the following conditions?

1. CONFIG_XFS_DEBUG was not set in the kernel configuration
2. the filesystem was on a real disk partition (not in a file or ramdisk) and was about 1 GB in size (or larger)
3. the disk device was an IDE hard drive (this could actually be the source of the bug, I'm going to run some additional tests)

Give me some time to go through it again. If I am able to reproduce the bug again, I will try to create a small bootable ISO image with just a minimal kernel, bash and a script that triggers the bug (but again, you would need an XFS filesystem on a hard drive partition). Or I can even give you root access to the VMware session and let you watch the bug in an environment where it really happens.
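Conditions 1 and 2 can be checked quickly from a shell before running the test. The two helpers below are a rough sketch with made-up names (xfs_debug_state, backing_device); they only read a kernel .config file and a mounts table in the /proc/mounts format, and the paths in the example are assumptions.

```shell
#!/bin/bash
# xfs_debug_state CONFIG_FILE   (hypothetical helper)
# Prints "enabled" if the kernel config contains CONFIG_XFS_DEBUG=y,
# otherwise "disabled" (the option is absent or marked "is not set").
xfs_debug_state() {
    local config=$1
    if grep -q '^CONFIG_XFS_DEBUG=y' "$config"; then
        echo enabled
    else
        echo disabled
    fi
}

# backing_device MOUNT_POINT [MOUNTS_FILE]   (hypothetical helper)
# Prints the device mounted on MOUNT_POINT, read from /proc/mounts by default.
# A name such as /dev/loop0 means the filesystem is loop-mounted.
backing_device() {
    local mnt=$1
    local mounts=${2:-/proc/mounts}
    # If the mount point is listed more than once, the last entry wins.
    awk -v m="$mnt" '$2 == m { dev = $1 } END { if (dev != "") print dev }' "$mounts"
}

# Example (both paths are assumptions):
#   xfs_debug_state /usr/src/linux/.config
#   backing_device /mnt/xfs
```

If xfs_debug_state prints "enabled" or backing_device prints a loop device, the test setup differs from the one in which the lockup was observed.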
db fix
After two days of testing the new (unstable) xfs-sources-2.4.20-r3, I have successfully compiled a kernel with my usual configuration that doesn't crash either on my computer or in VMware. However, this unstable release doesn't contain the "gcc31-compile-optimalizations" patch, so I set the Processor family option to "Pentium-III/Celeron(Coppermine)". After applying the gcc31 patch myself and compiling the kernel with my original option "Athlon-XP(gcc>31)", the bug appeared again. This could indicate a bug in GCC.

I will do additional tests in the following days, but for now I can describe in more detail what happens during the crash: the kernel processes kupdated and bdflush, together with the process creating the file that fills the filesystem, take 100% of the CPU time (the 100% is shown under "system"). The system doesn't freeze completely. Processes that don't do any disk I/O keep running (e.g. bash, top), but every process freezes in kernel space as soon as it tries to access any filesystem.
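The "100% in system" observation can also be captured without top by comparing two samples of the aggregate cpu line from /proc/stat. The function below is a sketch only; the name system_share is made up, and it assumes the usual field order (user, nice, system, idle, followed by any further fields on newer kernels).

```shell
#!/bin/bash
# system_share FIRST_CPU_LINE SECOND_CPU_LINE   (hypothetical helper)
# Takes two samples of the "cpu" line from /proc/stat and prints the
# percentage of elapsed CPU time that was spent in system (kernel) mode.
system_share() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
            # Older kernels only print the first four; later fields are skipped
            # here because guest time is already included in user and nice.
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            sys[NR] = $4
            tot[NR] = total
        }
        END {
            d = tot[2] - tot[1]
            if (d > 0) printf "%d\n", 100 * (sys[2] - sys[1]) / d
            else print 0
        }'
}

# Example: sample five seconds apart while the filesystem is being filled
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   system_share "$a" "$b"
```

A value close to 100 while the file is being written would match the behaviour described above.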