While manipulating devices on a live btrfs filesystem (btrfs device add/remove), I got a couple of errors and then a kernel BUG(), followed by a complete failure to mount at reboot.

Reproducible: Always

Steps to Reproduce:
1. Shrink the filesystem enough to squeeze another partition onto the drive.
2. Fdisk the drive and split it in two.
3. Reboot.
4. Add the new partition to the filesystem.
5. Remove the old partition from the filesystem (and get coffee).
6. Reboot.
7. Pacify the initrd by telling it to mount the second partition instead of the first.
8. Add the first partition back into the filesystem.
9. Remove the second partition from the filesystem.

Actual Results:
1. As soon as the second device was added, "btrfs filesystem resize" became impossible.
2. After booting into the second partition, deleting it from the filesystem failed; most of the data was moved, except for about 1G.
3. The kernel oopsed with a BUG() in disk-io.c, after which the filesystem was paralyzed and refused to read or write any further data.
4. The entire filesystem became unmountable when I tried to mount it from the LiveDVD: mounting the first partition produced a "bad fs type" error, and mounting the second triggered "unable to read superblock".

Expected Results:
1. Resizing the filesystem should have remained possible while it had multiple devices.
2. Deleting the second partition from the filesystem after booting into it should have worked; after all, deleting the first partition worked after migrating onto the second.

Regrettably, I do not have an emerge --info or any other diagnostic information handy, as the filesystem locked up at runtime and refused to mount afterwards. Version information I was able to remember: kernel 2.6.38, and the latest btrfsprogs as of June 10, 2011.

My only suggestion for diagnosis is stress testing the adding and removing of devices on a live filesystem, interspersed with reboots.
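For anyone trying to reproduce this, the steps above can be sketched as a shell session. The device names (/dev/sda1, /dev/sda2), the mount point /mnt, and the resize amount are placeholders I am assuming for illustration; the actual partitions will differ, and these commands are destructive to real data, so only run them on a throwaway filesystem:

```shell
# Assumed starting layout: btrfs on /dev/sda1, mounted at /mnt.
# Shrink the filesystem to make room for a second partition (amount is illustrative).
btrfs filesystem resize -10g /mnt

# ... repartition with fdisk, creating /dev/sda2, then reboot ...

# Add the new partition, then remove the old one (data migrates off sda1).
btrfs device add /dev/sda2 /mnt
btrfs device delete /dev/sda1 /mnt

# ... reboot, with the initrd told to mount /dev/sda2 instead ...

# Migrate back the other way; this is where the failures appeared:
btrfs device add /dev/sda1 /mnt
btrfs device delete /dev/sda2 /mnt   # reportedly failed with ~1G unmigrated, then BUG() in disk-io.c
```

Note that per the report, "btrfs filesystem resize" stopped working as soon as the filesystem had a second device, so the resize step is only possible at the start.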
The btrfs-progs package was the latest available; I have the ~x86 keyword enabled.
Perhaps btrfs upstream would be interested in a bug report too. Even better would be if your instructions trigger this bug in a virtual environment (e.g. qemu), so you could provide a qcow image of the corrupted filesystem.
I'm afraid someone else will have to be the guinea pig for this one. My system is too low-powered to support much in the way of virtualization, since I have a low-power Atom processor and a measly 2 GB of RAM. Adding and removing devices does sound like the sort of "out of band" stress testing that exercises seldom-used code paths, though.
Please submit this upstream at bugzilla.kernel.org and post the URL back here. Good luck.
http://bugzilla.kernel.org/show_bug.cgi?id=37492
Thanks for reporting upstream. We'll follow the upstream bug and update this one when upstream provides a resolution.
Reflecting upstream status.