Bug 248674 - xfs with blocksize 1024 suggest corrupted disk
Summary: xfs with blocksize 1024 suggest corrupted disk
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Release Media
Classification: Unclassified
Component: InstallCD
Hardware: x86 Linux
Importance: High major
Assignee: Gentoo Release Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-11-24 19:59 UTC by Marcin Rybak
Modified: 2008-12-29 13:51 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Description Marcin Rybak 2008-11-24 19:59:35 UTC
I tried to install Gentoo on software RAID10 with LVM, using XFS for the portage filesystem. While unpacking portage, tar hung and dmesg filled up with messages like these:
Device dm-5, XFS metadata write error block 0x20927a in dm-5
raid10_make_request bug: can't convert block across chunks or bigger than 256k 37785086 4

I reproduced this bug on a different machine, and the error occurred only with an XFS block size of 1024. The disks have no bad sectors or anything like that.

Reproducible: Always

Steps to Reproduce:
1. Create an mdadm RAID10 array with the far=2 layout, as shown below:
mdadm -C -v /dev/md1 --level=10 -n4 --layout=f2 -c256 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
2. Create an LVM logical volume on top of it.
3. Create an XFS filesystem on the logical volume, as shown below:
mkfs.xfs -b size=1024 /dev/lvgroup/logicalvolume
4. Try to unpack portage.
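
The steps above can be sketched as one script. This is only an illustration: the report does not give the LVM commands, so the pvcreate/vgcreate/lvcreate lines, the 4G volume size, and the /mnt/portage-test mount point are assumptions. The commands destroy data on the listed partitions, so the sketch only prints them unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps from the report.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro_steps() {
    # 1. RAID10 with far=2 layout and 256 KiB chunks (taken from the report)
    run mdadm -C -v /dev/md1 --level=10 -n4 --layout=f2 -c256 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
    # 2. LVM on top of the array (commands and the 4G size are assumptions)
    run pvcreate /dev/md1
    run vgcreate lvgroup /dev/md1
    run lvcreate -L 4G -n logicalvolume lvgroup
    # 3. XFS with 1024-byte blocks (taken from the report)
    run mkfs.xfs -b size=1024 /dev/lvgroup/logicalvolume
    # 4. Mount and unpack portage (mount point is an assumption)
    run mkdir -p /mnt/portage-test
    run mount /dev/lvgroup/logicalvolume /mnt/portage-test
    run tar xjf portage-latest.tar.bz2 -C /mnt/portage-test
}

repro_steps
```

Printing first makes it easy to review the device names before anything is written to disk.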

Actual Results:  
tar stays hung and the process cannot be killed:
root     16199  0.3  0.0   2120   872 pts/0    D+   19:31   0:05 tar xjf portage-latest.tar.bz2

# cat /proc/16199/status
Name:   tar
State:  D (disk sleep)
Tgid:   16199
Pid:    16199
PPid:   16018
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 256
Groups: 0 1 2 3 4 6 10 11 20 26 27
VmPeak:     2136 kB
VmSize:     2120 kB
VmLck:         0 kB
VmHWM:       872 kB
VmRSS:       872 kB
VmData:      172 kB
VmStk:        84 kB
VmExe:       224 kB
VmLib:      1560 kB
VmPTE:        12 kB
Threads:        1
SigQ:   0/32768
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 00000000ffffffff
CapEff: 00000000ffffffff
voluntary_ctxt_switches:        6927
nonvoluntary_ctxt_switches:     91

dmesg:
Device dm-5, XFS metadata write error block 0x20927a in dm-5
raid10_make_request bug: can't convert block across chunks or bigger than 256k 37785086 4
raid10_make_request bug: can't convert block across chunks or bigger than 256k 37786618 4
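
A quick arithmetic check of the two raid10 messages, as a sketch. It assumes the trailing numbers are the starting sector (in 512-byte units) and the request size in KiB; the report itself does not say what they mean. The function name crosses_chunk is hypothetical.

```shell
#!/bin/sh
# crosses_chunk SECTOR SIZE_KIB CHUNK_KIB
# Prints the request's offset inside its chunk and whether it spills
# over the chunk boundary. Units: sectors are 512 bytes (assumption).
crosses_chunk() {
    sector=$1
    size_sectors=$(( $2 * 2 ))      # KiB -> 512-byte sectors
    chunk_sectors=$(( $3 * 2 ))     # KiB -> 512-byte sectors
    offset=$(( sector % chunk_sectors ))
    end=$(( offset + size_sectors ))
    if [ "$end" -gt "$chunk_sectors" ]; then
        echo "sector $sector: offset $offset + $size_sectors sectors = $end > $chunk_sectors (crosses)"
    else
        echo "sector $sector: offset $offset + $size_sectors sectors = $end <= $chunk_sectors (fits)"
    fi
}

crosses_chunk 37785086 4 256
crosses_chunk 37786618 4 256
```

Under that reading, both requests start 2 and 6 sectors before a 256 KiB chunk boundary and are 8 sectors long, so each one straddles two chunks. Start offsets that are 1 KiB aligned but not 4 KiB aligned would be consistent with the 1024-byte block size being involved.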


Expected Results:  
The portage tarball unpacks without errors.

livecd 2008.0
Comment 1 Andrew Gaffney (RETIRED) gentoo-dev 2008-11-24 20:29:51 UTC
Base-system, do you guys have any idea what's going on here?
Comment 2 SpanKY gentoo-dev 2008-11-27 20:43:43 UTC
have the kernel guys look at it ... sounds like a bug in xfs assuming the hardware stack is OK ...

also, you might want to make sure you aren't using 4k stacks in the kernel.  i seem to recall that stacking xfs on top of other layers tends to result in kernel stack overflows.
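
One way to check for 4k stacks, as a minimal sketch: on 32-bit x86 kernels of this era the option is CONFIG_4KSTACKS, so it is enough to grep a saved kernel config for it. The function name has_4k_stacks and the /usr/src/linux/.config path are assumptions; if the running kernel exposes /proc/config.gz, its unpacked contents can be checked the same way.

```shell
#!/bin/sh
# has_4k_stacks CONFIG_FILE
# Succeeds (exit status 0) if the given kernel config enables CONFIG_4KSTACKS.
has_4k_stacks() {
    grep -q '^CONFIG_4KSTACKS=y' "$1"
}

# Example use against the usual Gentoo kernel source location (assumption).
cfg=/usr/src/linux/.config
if [ -r "$cfg" ]; then
    if has_4k_stacks "$cfg"; then
        echo "4k stacks are enabled"
    else
        echo "4k stacks are not enabled"
    fi
fi
```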
Comment 3 Duane Griffin 2008-12-04 12:43:47 UTC
The suggestion to check 4k stacks is a good one. Could you attach your .config, please? Also, which kernel are you using (output from "uname -a" will do nicely)? Assuming you aren't using the latest vanilla kernel (2.6.27.7 as of writing), it would be helpful if you could also try that to check whether it has already been fixed.

Next, could you please attach a task dump after reproducing the problem? To do that, ensure Magic-SysRq is enabled in the kernel, reproduce the bug, then do an "echo t > /proc/sysrq-trigger". Do this with the latest vanilla kernel, if possible.
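
The task dump request can be wrapped in a small helper, sketched here. The function name trigger_task_dump and its optional path argument are hypothetical; the argument exists only so the helper can be tried against an ordinary file. On a real system it needs root and a kernel built with Magic-SysRq support.

```shell
#!/bin/sh
# trigger_task_dump [TRIGGER_PATH]
# Writes "t" to the SysRq trigger so the kernel logs the state of all tasks.
# TRIGGER_PATH defaults to /proc/sysrq-trigger.
trigger_task_dump() {
    trigger=${1:-/proc/sysrq-trigger}
    if [ ! -w "$trigger" ]; then
        echo "cannot write $trigger (need root and Magic-SysRq support)" >&2
        return 1
    fi
    echo t > "$trigger"
}
```

After calling it while tar is stuck, the dump appears in the kernel log, so saving the output of dmesg to a file gives something that can be attached to the bug.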
Comment 4 Marcin Rybak 2008-12-29 12:34:24 UTC
I've just tried to reproduce this bug on 2.6.27.7 and it does not appear. As I said, it appears on the Gentoo LiveCD 2008.0 and, for example, on kernel 2.6.22 too. I have no time for more tests, sorry.
Comment 5 Duane Griffin 2008-12-29 13:51:56 UTC
OK, thanks for testing.

If you see the problem recur with a current kernel, please reopen the bug (ideally with the info requested in comment 3).