Host system: Gentoo AMD64 (64-bit), with sys-kernel/gentoo-sources-2.6.33 and app-emulation/qemu-kvm-0.12.3
Guest system: Gentoo/Sabayon (32-bit), with sys-kernel/gentoo-sources-2.6.33

What I wanted to do was create a third "virtual" hard disk for the guest system, so I created it from the host with "qemu-img create -f raw 15GBFile.raw 15G". The 15GB file is located on a Linux software RAID (raid10) device (/dev/md1), which carries an XFS file system. I then started the qemu/kvm guest with the 15GB file as an additional hard disk. Partitioning the disk succeeded, but creating the file system did not. Here is the output from the guest (no errors, and nothing in /var/log/messages):

[32bit]guest:/ # mkfs.ext4 -LSAB32 /dev/sdc2
mke2fs 1.41.11 (14-Mar-2010)
Filesystem label=SAB32
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
950272 inodes, 3799372 blocks
189968 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=3892314112
116 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 26 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.

And here are parts of the host's /var/log/messages:
...
Apr 3 04:23:54 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4985844 8
Apr 3 04:23:54 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4986868 8
Apr 3 04:23:54 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4987892 8
Apr 3 04:23:55 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4988916 8
Apr 3 04:23:55 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4989940 8
Apr 3 04:23:55 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4990964 8
Apr 3 04:23:55 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4991988 8
...
Apr 3 04:23:58 treviso kernel: raid10_make_request bug: can't convert block across chunks or bigger than 512k 4999156 8
Apr 3 04:23:58 treviso kernel: quiet_error: 74 callbacks suppressed
Apr 3 04:23:58 treviso kernel: Buffer I/O error on device md1, logical block 1249789
Apr 3 04:23:58 treviso kernel: lost page write due to I/O error on md1
Apr 3 04:23:58 treviso kernel: Buffer I/O error on device md1, logical block 1249790
Apr 3 04:23:58 treviso kernel: lost page write due to I/O error on md1
Apr 3 04:23:58 treviso kernel: Buffer I/O error on device md1, logical block 1249791
Apr 3 04:23:58 treviso kernel: lost page write due to I/O error on md1
Apr 3 04:23:58 treviso kernel: Buffer I/O error on device md1, logical block 1249792
Apr 3 04:23:58 treviso kernel: lost page write due to I/O error on md1
...

The created ext4 file system is corrupt. The real hard disks used for the software RAID have no problems, at least SMART shows no errors, and the XFS file system itself is also fine.
Reproducible: Always

Parameters of the underlying /dev/md1:

mdadm -D /dev/md1
/dev/md1:
        Version : 1.01
  Creation Time : Tue Jan 26 23:16:11 2010
     Raid Level : raid10
     Array Size : 209720320 (200.00 GiB 214.75 GB)
  Used Dev Size : 209720320 (200.00 GiB 214.75 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Sat Apr 3 04:54:42 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

         Layout : far=2
     Chunk Size : 512K

           Name : treviso:1  (local to host treviso)
           UUID : 726bee27:61c8e8f2:7c6e7838:c108a851
         Events : 16965

    Number   Major   Minor   RaidDevice State
       0       8       10        0      active sync   /dev/sda10
       1       8       18        1      active sync   /dev/sdb2
*** Bug 312931 has been marked as a duplicate of this bug. ***
The result of further investigation is that everything works as expected when I create the virtual hard disk file (15GBFile.raw) on an XFS file system that is not on a Linux RAID device (not on /dev/md1), but on the same physical hard disks that are also used for the raid10 device /dev/md1.
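For reference, a quick way to confirm which block device and filesystem type actually back a given image file (the path below is just an example, not the real location on this system):

```shell
# df -T reports the backing device and filesystem type for the
# filesystem containing the given path (example path; adjust it):
df -T /path/to/15GBFile.raw
# On the failing setup this would show /dev/md1 with type xfs;
# on the working setup, a plain disk partition instead.
```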
Please attach your kernel .config and the output of emerge --info.
Created attachment 226539 [details] emerge --info
Created attachment 226545 [details] 2.6.33 gentoo sources - kernel config
I hope this helps a little.

PS: I have copied the 15GBFile.raw file to the raid10 device and am using it as an additional hard disk for the qemu guest; no errors so far. Perhaps there is a problem when sparse files (the qemu raw file format) get expanded by the guest (via mkfs.ext4) while the raw file itself sits on raid10/XFS.
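To illustrate the sparse-file theory: a raw image created with "qemu-img create -f raw" starts out sparse, and the host filesystem only allocates blocks when the guest actually writes to them. A small sketch of the difference (file names and sizes are just examples):

```shell
# A sparse file: large apparent size, almost no blocks allocated yet.
truncate -s 100M sparse.img
du -k --apparent-size sparse.img   # ~102400 KB apparent size
du -k sparse.img                   # ~0 KB actually allocated

# A fully written file of the same size allocates everything up front,
# so later guest writes never force new block allocations on the host:
dd if=/dev/zero of=full.img bs=1M count=100
du -k full.img                     # ~102400 KB allocated
```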
Update: it happens with 2.6.33-gentoo-r1 too, but I found this:

http://marc.info/?l=linux-raid&m=126802743419044&w=2
https://patchwork.kernel.org/patch/83932/

I think I'll test the mentioned patches.
I can't try it out because patch https://patchwork.kernel.org/patch/83932/ doesn't build against this kernel:

drivers/md/dm-table.c: In function 'dm_set_device_limits':
drivers/md/dm-table.c:532: error: 'struct queue_limits' has no member named 'max_segments'
make[2]: *** [drivers/md/dm-table.o] Error 1
make[1]: *** [drivers/md] Error 2
make: *** [drivers] Error 2
Here are the results of my further tests. It does NOT depend on qemu/kvm and the virtual machine stack (as I initially thought): it's a problem with raid10 and the XFS file system. Copying 100GB of data from a non-RAID partition to the XFS/raid10 file system triggers this buffer I/O error more than 100 times, and when I compare the files, they really do differ. The combination of XFS and raid10 is unusable with a 2.6.33 kernel. Let's see what happens when I replace XFS with ext4 on the raid10 device...
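For anyone who wants to reproduce the copy-and-compare step, here is a sketch of the procedure (paths are examples; point DEST at the raid10/XFS mount):

```shell
SRC=/data/testset          # source on a non-RAID partition (example path)
DEST=/mnt/raid10/testset   # destination on the raid10/XFS filesystem (example path)

cp -a "$SRC"/. "$DEST"/

# Checksum every file on the source, then verify the copies against it;
# any "FAILED" line means the data was silently corrupted on the way.
( cd "$SRC"  && find . -type f -exec md5sum {} + | sort ) > /tmp/src.md5
( cd "$DEST" && md5sum -c --quiet /tmp/src.md5 )
```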
It must be a problem with the combination of XFS and the underlying Linux raid10 and nothing else. I replaced XFS with an ext4 file system and, what a surprise, no errors at all; ext4 does not trigger this bug. I copied several hundred gigabytes between the non-RAID and raid10 partitions and got no errors in the logs and no differences between the files.
After two weeks of testing with various vanilla 2.6.34-rcX kernels and the current 2.6.34 kernel, I can say that the problem (XFS on raid10) seems to be fixed in the 2.6.34 kernels. At least it is no longer reproducible.
Cool, Jochen! I'll go ahead and close the bug, but if you feel that it should be reopened do go ahead and do it.