With coreutils-8.10, the internal copy() function, which is used for critical tasks such as the 'install' command, fails silently on btrfs on amd64. This issue was previously mentioned in bug 353783, comment #5. It results in broken packages being installed, which is an extremely serious problem. The tests/cp/fiemap-2 test included with coreutils-8.10 may be useful for detecting this issue: on the same system it fails when run on btrfs, but succeeds when run on tmpfs.

Portage 2.2.0_alpha20 (default/linux/amd64/10.0/desktop, gcc-4.5.2, glibc-2.13-r0, 2.6.37 x86_64)
=================================================================
System uname: Linux-2.6.37-x86_64-Intel-R-_Core-TM-2_Duo_CPU_T9300_@_2.50GHz-with-gentoo-2.0.1
Timestamp of tree: Sun, 06 Feb 2011 01:45:01 +0000
ccache version 3.1.4 [disabled]
app-shells/bash: 4.1_p9
dev-java/java-config: 2.1.11-r3
dev-lang/python: 2.6.6-r1, 3.1.3
dev-util/ccache: 3.1.4
dev-util/cmake: 2.8.3-r1
sys-apps/baselayout: 2.0.1-r1
sys-apps/openrc: 0.7.0
sys-apps/sandbox: 2.4
sys-devel/autoconf: 2.68
sys-devel/automake: 1.9.6-r3, 1.11.1
sys-devel/binutils: 2.20.1-r1
sys-devel/gcc: 4.4.5, 4.5.2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82
virtual/os-headers: 2.6.36.1 (sys-kernel/linux-headers)
Created attachment 261685 [details] log of tests/cp/fiemap-2 failure on btrfs with amd64 linux-2.6.37
Also, tests/cp/fiemap-2 succeeds with btrfs on the same system/kernel when built and executed in a 32-bit i686 chroot.
try using --sparse=never when running `cp` ...
If I modify the test like "for i in never; do", it still fails like this:

+ printf x
+ dd bs=1k seek=128 of=k
0+0 records in
0+0 records out
0 bytes (0 B) copied, 1.5923e-05 s, 0.0 kB/s
+ for append in no yes
+ test no = yes
+ for i in never
+ cp --sparse=never k k2
+ cmp k k2
k k2 differ: byte 1, line 1
+ fail=1
+ for append in no yes
+ test yes = yes
+ printf y
+ for i in never
+ cp --sparse=never k k2
+ cmp k k2
+ rm -f k
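For anyone who wants to try this outside the test suite, here is a minimal sketch of the same check, reconstructed from the trace above. The function name check_copy is made up for illustration, and the redirections into k are an assumption, since the trace does not show them. It only covers the --sparse=never case.

```shell
# check_copy DIR: hypothetical helper reconstructed from the trace above.
# Creates a file with one data byte followed by a hole up to 128 KiB,
# copies it with cp --sparse=never, and compares source and copy.
# Returns 0 if the copy matches, non-zero otherwise.
check_copy() {
    (
        cd "$1" || exit 2
        rm -f k k2
        # Assumed redirection: the trace only shows "printf x".
        printf x > k || exit 2
        # With no input, dd writes nothing and just extends k to 128 KiB.
        dd bs=1k seek=128 of=k < /dev/null 2> /dev/null || exit 2
        cp --sparse=never k k2 || exit 1
        cmp -s k k2
    )
}

# Usage (the path is a placeholder for a directory on the filesystem under test):
#   check_copy /path/on/btrfs && echo "copy OK" || echo "copy differs"
```

Since the failure is intermittent, a single passing run does not prove much.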
seems to work fine for me on a small btrfs mount of mine. i only have ext4 fs's everywhere, so i had to create a small btrfs to test with.

dd if=/dev/zero of=f count=1 seek=1000000
losetup /dev/loop7 f
mkfs.btrfs /dev/loop7
mount /dev/loop7 /mnt/tmp/
cd /mnt/tmp
<copy over fiemap-2>
while ./fiemap-2 ; do :; done

doesn't fail for me
Now I've experimented with a variety of btrfs filesystems, and it turns out that I can only reproduce this for filesystems that are on logical volumes created by lvm2, and it only happens with particular volume groups on particular disks. I'm going to try recreating the physical volumes and volume groups on these disks, in order to see if it resolves the issue.
Actually, it's not just logical volumes. It happens with normal partitions too. However, the test can succeed in one run and fail in the next, so it's important to test multiple times. I've only seen it happen with the "compress" mount option enabled. When I've remounted the same partition with the "compress" option disabled, the test doesn't fail anymore.
I can reproduce it using a 1G btrfs filesystem created on a loopback device, when mounted with the "compress" option. The steps I use are like this:

dd if=/dev/zero of=/dev/shm/btrfs.img bs=1M count=0 seek=1024
mkfs.btrfs /dev/shm/btrfs.img
mount -t btrfs -o compress /dev/shm/btrfs.img /mnt/btrfs_1g
cp -a /var/tmp/portage/sys-apps/work/coreutils-8.10 /mnt/btrfs_1g
cd /mnt/btrfs_1g/coreutils-8.10/tests
cp/fiemap-2
cp/fiemap-2
cp/fiemap-2

Make sure to run cp/fiemap-2 multiple times, because failure is intermittent (though it seems to fail most of the time).
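Because the failure is intermittent, it may be easier to count failures over a fixed number of runs than to rerun by hand. A minimal sketch follows; count_failures is a hypothetical helper, not part of coreutils or its test suite.

```shell
# count_failures RUNS CMD [ARGS...]: hypothetical helper.
# Runs CMD the given number of times and prints how many runs
# exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only the exit status matters here.
        "$@" > /dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage, from the coreutils tests directory on the filesystem under test:
#   count_failures 20 cp/fiemap-2
```

Any non-zero count on a compress-mounted btrfs, compared with zero on the same filesystem mounted without compress, would match what is described above.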
thanks, i'll try that. and for the record, i'm running fiemap-2 in a while loop, so it runs many many times before i ctrl+c to kill it.
If it helps at all my system is md raided and then that raid device shoved into an lvm.
just a "me too"

/ is on ext4
PORTAGE_TMPDIR is on
/dev/md3 on /srv type btrfs (rw,noatime,compress,nodatasum)
md3 : active raid10 sdd7[5] sdc7[7] sdb7[6] sda7[4] sde7[8](S)

kernel is: 2.6.37-vs2.3.0.37-rc2
Hi,

shouldn't we mask coreutils-8.10, since it can definitely harm a system when e.g. Portage's compile space is on btrfs? I just had some of Portage's own files filled with \x00 bytes after an `emerge portage`, compiled on an md-RAIDed, LVM2-backed btrfs on an amd64 box.

greetings, markus
Not really, btrfs doesn't look like the kind of stuff we should mask packages for on a global level.
what Diego said. a bit ironic coming from someone who is using a fs clearly labeled "unstable disk format". simply avoid the compress option for now.
so they've found at least one bug in btrfs which causes this problem ...
http://lwn.net/Articles/429345/ ?
Quick comment -- should this bug summary also be updated to note that LWN claims ext4 is also impacted, not just btrfs? And, are we certain that 8.10 is the first coreutils that triggers it?
(In reply to comment #17)
> [...] And, are we certain that 8.10
> is the first coreutils that triggers it?
>

1. I use btrfs with -ocompress and identified such behaviour on 8.10. So I reverted to 8.9, which worked and still works right.

2. From NEWS of coreutils-8.10:

** New features

  cp now copies sparse files efficiently on file systems with FIEMAP
  support (ext4, btrfs, xfs, ocfs2) [...]
i just mentioned this in the btrfs irc channel and they say it should be fixed in the 2.6.38 kernel. maybe someone can test and confirm this
In #btrfs they say this should be fixed with the 2.6.38 kernel; maybe someone can confirm this?
It's in 2.6.38-rc7.

commit 4660ba63f1c4e07c20a435e084f12ba48a82bd2b
Merge: 958ede7 ec29ed5
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Feb 25 14:03:39 2011 -0800

    Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

    * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
      Btrfs: fix fiemap bugs with delalloc
      Btrfs: set FMODE_EXCL in btrfs_device->mode
      Btrfs: make btrfs_rm_device() fail gracefully
      Btrfs: Avoid accessing unmapped kernel address
      Btrfs: Fix BTRFS_IOC_SUBVOL_SETFLAGS ioctl
      Btrfs: allow balance to explicitly allocate chunks as it relocates
      Btrfs: put ENOSPC debugging under a mount option
ext4 patch is here, hasn't been pulled yet: http://www.spinics.net/lists/linux-ext4/msg23430.html
people should upgrade to 2.6.38 and see if the issue is fixed for them, because it should be. i could add an `elog` when the active kernel version is before 2.6.38, but otherwise upstream doesn't seem too keen on trying to handle this in cp. any other suggestions ?
That seems reasonable; might even couple the .38 check with one for mounted btrfs/ext4 filesystems. Ever since the patch went in during a .38 rc, things have been fine for me.
i haven't had any problems with ext4, and that's what i run on my systems now.

i guess i could do `grep -qs btrfs /etc/fstab /proc/mounts`.

http://sources.gentoo.org/sys-apps/coreutils/coreutils-8.10.ebuild?r1=1.2&r2=1.3
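For illustration only, here is a rough sketch of how the kernel version check and the grep could be combined in plain shell. This is not the code from the ebuild revision linked above; version_lt, uses_btrfs and needs_warning are hypothetical names, and the comparison relies on GNU `sort -V`. Note that it treats any 2.6.38-rcN as 2.6.38, although the btrfs fix only landed in rc7.

```shell
# version_lt A B: hypothetical helper, succeeds if version A is strictly older than B.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# uses_btrfs FILE...: succeeds if any of the given files mentions btrfs.
# Missing files are ignored (grep -s), as in the one-liner above.
uses_btrfs() {
    grep -qs btrfs "$@"
}

# needs_warning KERNEL_VERSION FILE...: succeeds if the kernel is older than
# 2.6.38 and btrfs shows up in one of the files.
needs_warning() {
    kv=${1%%-*}    # drop local suffixes such as "-rc7" or "-gentoo"
    shift
    version_lt "$kv" 2.6.38 && uses_btrfs "$@"
}

if needs_warning "$(uname -r)" /etc/fstab /proc/mounts; then
    echo "kernel older than 2.6.38 with btrfs in use: cp may corrupt copies"
fi
```

In an ebuild the echo would presumably become an `elog` or `ewarn`, and the grep pattern could be extended if ext4 should be covered too.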
I'd be wary of unmasking anything that could cause corruption on ext4 filesystems if the latest gentoo-hardened and gentoo-sources kernels aren't patched to fix it. Is there an urgent need to unmask this version of coreutils? I assume 2.6.38 didn't include the ext4 patch? Perhaps we can also look at having the gentoo-sources/gentoo-hardened kernels include the ext4 patch. Hopefully the above makes sense.
coreutils isn't masked, nor are there plans to mask it
(In reply to comment #25)
> i haven't had any problems with ext4, and that's what i run on my systems now.
>
> i guess i could do `grep -qs btrfs /etc/fstab /proc/mounts`.
>
> http://sources.gentoo.org/sys-apps/coreutils/coreutils-8.10.ebuild?r1=1.2&r2=1.3

I like it, but I would prefer the ebuild to die. Just my 0.02€
Created attachment 268535 [details, diff]
100_all_coreutils-no-linux-sparse.patch

could people see if this makes things work for them with <2.6.39 ?
(In reply to comment #29)
> Created attachment 268535 [details, diff]
> 100_all_coreutils-no-linux-sparse.patch
>
> could people see if this makes things work for them with <2.6.39 ?

I've tested this patch with 2.6.37.4 and the cp/fiemap-2 test still fails. With 2.6.38.2, cp/fiemap-2 succeeds even without this patch.
Comment on attachment 268535 [details, diff]
100_all_coreutils-no-linux-sparse.patch

you could give 8.11 a try ... it's in the tree
coreutils-8.12 is in the tree now with even more changes related to this
everything should have shaken itself out at this point