Bug 14487 - XFS lockup on no free space
Summary: XFS lockup on no free space
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High critical
Assignee: Bob Johnson (RETIRED)
Reported: 2003-01-24 07:08 UTC by Martin Decky
Modified: 2003-04-15 16:51 UTC
Description Martin Decky 2003-01-24 07:08:41 UTC
sys-kernel/xfs-sources 2.4.19-r2
sys-kernel/xfs-sources 2.4.20_pre5
(other versions can be affected too)

When I try to create a file on an XFS filesystem and there is not enough free
space, the whole system locks up: the kernel still replies to pings over
Ethernet and Num Lock works, but that's all. There is no debugging output and no
kernel oops/panic.

The bug can be reproduced any time by running

cat /dev/zero > /some/xfs/filesystem/foo

The only XFS filesystems that are not affected are those mounted with the
-o loop option.
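
A bounded variant of the reproduction above can be sketched as follows. This is only an illustration and not part of the original report: the function name fill_fs, the file name fill.tmp and the size cap are assumptions. It writes zeros through dd up to a cap, so a run on a healthy kernel ends on its own. The cap has to exceed the free space for the filesystem to actually fill up, and on an affected kernel the call would presumably never return.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): fill_fs DIR CAP_MB
# Writes at most CAP_MB megabytes of zeros into DIR/fill.tmp and reports
# whether dd finished normally or stopped early with an error
# (for example "No space left on device").
fill_fs() {
    dir=$1
    cap_mb=$2
    target="$dir/fill.tmp"
    # dd exits non-zero when a write fails.
    if dd if=/dev/zero of="$target" bs=1048576 count="$cap_mb" 2>/dev/null; then
        echo "cap reached"
    else
        echo "write failed"
    fi
    # Remove the fill file so the filesystem does not stay full.
    rm -f "$target"
}

# Example call (path taken from the report, cap chosen arbitrarily):
#   fill_fs /some/xfs/filesystem 2048
```

A result of "write failed" on a full filesystem is the behaviour expected from a healthy kernel. No output at all, together with an unresponsive system, would match the lockup described here.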


Note:
When I compile the kernel with "XFS debugging" on, the bug is gone.

Software configuration:
gcc version 3.2.1 20021207 (Gentoo Linux 3.2.1-20021207)
glibc 2.3.1-r2
xfs-sources kernel compiled with Athlon-XP processor family option
XFS compiled in kernel

Hardware configuration:
Athlon XP 2000+
Soltek 75DRV (VIA KT266)
Comment 1 Martin Decky 2003-01-24 09:41:50 UTC
The bug was confirmed in a VMware 3.2 virtual environment, in a clean new
installation of Gentoo 1.4_rc2.

Conclusion:
This is definitely an XFS bug.
Comment 2 Bob Johnson (RETIRED) gentoo-dev 2003-02-05 22:08:11 UTC
I just finished talking with some XFS devs; they have one
bug report on this and can't reproduce it.
Neither can I.
In a VMware session the partition fills up and the write fails with an error
saying the device is full. No lockups, kernel oopses, etc.
I will keep an eye on this.
Comment 3 Martin Decky 2003-02-06 01:45:12 UTC
That's really strange: I can reproduce the bug on two different machines and in VMware. I would also try it on some of the servers I administer, but I can't afford to lock them up.

During your testing, were you filling up the filesystem under the following conditions?

1. CONFIG_XFS_DEBUG was not set in the kernel configuration
2. the filesystem was on a real disk partition (not in a file or on a ramdisk) and was about 1 GB in size (or larger)
3. the disk device was an IDE hard drive (this could actually be the source of the bug; I'm going to run some additional tests)
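
The first and third conditions above could be checked with something like the following sketch. It is not from the original discussion: the function names xfs_debug_state and backing_device are hypothetical, and the location of the kernel configuration file (for example /usr/src/linux/.config) is an assumption. Device naming also varies (devfs uses different paths), so the device name is only printed for the reader to judge.

```shell
#!/bin/sh
# Hypothetical helper: xfs_debug_state CONFIG_FILE
# Prints "set" if the kernel config enables CONFIG_XFS_DEBUG, else "not set".
# Disabled options appear in the config as "# CONFIG_XFS_DEBUG is not set".
xfs_debug_state() {
    if grep -q '^CONFIG_XFS_DEBUG=y' "$1"; then
        echo "set"
    else
        echo "not set"
    fi
}

# Hypothetical helper: backing_device MOUNTPOINT [MOUNTS_FILE]
# Prints the device field of the last entry for MOUNTPOINT,
# reading /proc/mounts unless another file is given.
backing_device() {
    awk -v mp="$1" '$2 == mp { dev = $1 } END { if (dev != "") print dev }' \
        "${2:-/proc/mounts}"
}

# Example calls (paths are assumptions):
#   xfs_debug_state /usr/src/linux/.config
#   backing_device /some/xfs/filesystem
```

A device name such as /dev/loop0 would point to a loop mount, which the report says is not affected.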


Give me some time to go through it again.

If I am able to reproduce the bug again, I will try to create a small bootable ISO image with just a minimal kernel, bash and a script that triggers the bug (but again, you would need an XFS filesystem on a hard drive partition).

Or I can even give you root access to the VMware session and let you watch the bug in an environment where it really happens.
Comment 4 John Davis (zhen) (RETIRED) gentoo-dev 2003-04-04 01:19:43 UTC
db fix
Comment 5 John Davis (zhen) (RETIRED) gentoo-dev 2003-04-04 01:24:43 UTC
db fix
Comment 6 Martin Decky 2003-04-15 16:51:54 UTC
After two days of testing the new (unstable) xfs-sources-2.4.20-r3, I have successfully compiled a kernel with my usual configuration that crashes neither on my computer nor in VMware.

However, this unstable release doesn't contain the "gcc31-compile-optimalizations" patch, so I set the Processor family option to "Pentium-III/Celeron(Coppermine)".

After applying the gcc31 patch myself and compiling the kernel with my original option "Athlon-XP(gcc>31)", the bug appeared again. This could indicate a bug in GCC.

I will do additional tests in the following days, but for now I can describe in more detail what happens during the crash:

The kernel processes kupdated and bdflush, together with the process creating the file that fills the filesystem, take 100% of the CPU time (shown as 100% "system"). The system doesn't freeze completely: processes that don't do any disk I/O (e.g. bash, top) keep running, but every process freezes in kernel space as soon as it tries to access any filesystem.
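
The "100% system" observation could be measured from two samples of the aggregate cpu line in /proc/stat, roughly as sketched below. This is an illustration under assumptions, not something used in the report: the function name sys_share is hypothetical, and the fields after the "cpu" label are taken to be user, nice, system and idle, in that order. Whether /proc can still be read during the lockup is unknown.

```shell
#!/bin/sh
# Hypothetical helper: sys_share "CPU_LINE_BEFORE" "CPU_LINE_AFTER"
# Each argument is an aggregate "cpu ..." line from /proc/stat.
# Prints the share of elapsed CPU time spent in system mode as a
# whole-number percentage (truncated).
sys_share() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) a[i] = $i; sys0 = $4 }
        NR == 2 {
            total = 0
            for (i = 2; i <= NF; i++) total += $i - a[i]
            if (total > 0) printf "%d\n", 100 * ($4 - sys0) / total
            else print 0
        }'
}

# Example: sample twice, five seconds apart. The shell builtin "read"
# takes the first line of /proc/stat, which is the aggregate cpu line.
#   read -r before < /proc/stat
#   sleep 5
#   read -r after < /proc/stat
#   sys_share "$before" "$after"
```

A value close to 100 while the filesystem is being filled would match the behaviour described above.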