While manipulating devices on a live btrfs filesystem (btrfs device add/remove), I got a couple of errors and then a kernel BUG(), followed by a complete failure to mount at reboot.

Reproducible: Always

Steps to Reproduce:
1. Shrink the filesystem enough to squeeze another partition onto the drive.
2. Fdisk the drive and split it in two.
3. Reboot.
4. Add the new partition to the filesystem.
5. Remove the old partition from the filesystem (and get coffee).
6. Reboot.
7. Pacify the initrd by telling it to mount the second partition instead of the first.
8. Add the first partition back into the filesystem.
9. Remove the second partition from the filesystem.

Actual Results:
1. As soon as the second device was added, "btrfs filesystem resize" became impossible.
2. After booting into the second partition, deleting it from the filesystem failed; most of the data was moved, except for about 1G.
3. The kernel oopsed with a BUG() in disk-io.c, after which the filesystem was paralyzed and refused to read or write any further data.
4. The entire filesystem became unmountable when I tried to mount it from the LiveDVD: mounting the first partition produced a "bad fs type" error, and mounting the second triggered "unable to read superblock".

Expected Results:
1. Resizing the filesystem should have remained possible while it had multiple devices.
2. Deleting the second partition from the filesystem after booting into it should have worked; after all, deleting the first partition worked after migrating onto the second.

Regrettably, I do not have an emerge --info or any other diagnostic information handy, as the filesystem locked up at runtime and refused to mount afterwards. Version information I was able to remember: kernel 2.6.38, and the latest btrfsprogs as of June 10, 2011.

My only suggestion for diagnosis is stress testing the adding and removing of devices on a live filesystem, interspersed with reboots.
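For anyone trying to reproduce this, the steps above can be sketched as a shell session. The device names (/dev/sda1, /dev/sda2), the mount point /mnt, and the resize amount are placeholders I am assuming for illustration; the actual partitions will differ, and these commands are destructive to real data, so only run them on a throwaway filesystem:

```shell
# Assumed starting layout: btrfs on /dev/sda1, mounted at /mnt.
# Shrink the filesystem to make room for a second partition (amount is illustrative).
btrfs filesystem resize -10g /mnt

# ... repartition with fdisk, creating /dev/sda2, then reboot ...

# Add the new partition, then remove the old one (data migrates off sda1).
btrfs device add /dev/sda2 /mnt
btrfs device delete /dev/sda1 /mnt

# ... reboot, with the initrd told to mount /dev/sda2 instead ...

# Migrate back the other way; this is where the failures appeared:
btrfs device add /dev/sda1 /mnt
btrfs device delete /dev/sda2 /mnt   # reportedly failed with ~1G unmigrated, then BUG() in disk-io.c
```

Note that per the report, "btrfs filesystem resize" stopped working as soon as the filesystem had a second device, so the resize step is only possible at the start.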
The btrfs-progs package was the latest available; I have the ~x86 keyword enabled.
Perhaps btrfs upstream would be interested in a bug report too. Even better would be if your instructions trigger this bug in a virtual environment (e.g. qemu), so you could provide a qcow image of the corrupted filesystem.
I'm afraid someone else will have to be the guinea pig for this one. My system is too low-powered to support much in the way of virtualization, since I have a low-power Atom processor and a measly 2 GB of RAM. Adding and removing devices does sound like the sort of "out of band" stress testing that exercises seldom-used code paths, though.
Please submit this upstream at bugzilla.kernel.org and post the URL back here. Good luck.
http://bugzilla.kernel.org/show_bug.cgi?id=37492
Thanks for reporting upstream. We'll follow the upstream bug and update this one when upstream provides a resolution.
Reflecting upstream status.