Bug 489682

Summary: =sys-kernel/dracut-0.34-r1 - mdraid, mptsas - Assembly troubles, mysteriously reappearing superblock
Product: Gentoo Linux
Reporter: NiTr0 <nitr0>
Component: [OLD] Core system
Assignee: Amadeusz Żołnowski (RETIRED) <aidecoe>
Status: RESOLVED WORKSFORME
Severity: normal
CC: alexander, kernel, proxy-maint
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---

Description NiTr0 2013-10-28 20:31:16 UTC
I have an M2N-E (MCP55) motherboard and an LSI SATA/SAS HBA (LSI 3041E) with 4x1TB HDDs in an mdraid RAID5 array (superblock v0.90) with LVM on top, plus a CF card on /dev/sda holding / and /boot (since the crash, only /boot).

After a system crash (caused by hardware or software; the CF card on /dev/sda got trashed) and a kernel update from 3.8.13 to 3.10.7-rc1, which happened almost simultaneously (I had compiled the new kernel but not used it until the crash and reboot, or perhaps the system rebooted into the new kernel and that caused the crash; I didn't look at the machine for several days), I noticed a problem with mdraid assembly at boot. After every reboot the /dev/sdc1 component has a wrong superblock (dated 2011, with an invalid checksum and a wrong minor). Adding it back to the array fixes the superblock, but only until the next reboot, when the old one magically reappears.

What I tried: mdadm --zero-superblock, and dd if=/dev/zero of=/dev/sdc1 (the first 100G and the last 70G; from what I read, this should zero the superblock). Neither had any effect.
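
For context on where the superblock sits: the v0.90 format stores it near the end of the component, in the last 64 KiB-aligned 64 KiB block, so it always falls within the final 128 KiB of the partition and the 70G zeroed at the end would have covered it. A minimal sketch of that offset calculation follows. The helper name `sb090_offset` is hypothetical, and the commented usage assumes root access and the /dev/sdc1 name from the report.

```shell
#!/bin/sh
# Hypothetical helper: byte offset of an md v0.90 superblock inside a
# component device of the given size (in bytes).
# offset = (size rounded down to a 64 KiB boundary) - 64 KiB
sb090_offset() {
    size=$1
    reserved=65536                      # 64 KiB
    echo $(( size / reserved * reserved - reserved ))
}

# Example on a real device (needs root; device name taken from the report):
#   size=$(blockdev --getsize64 /dev/sdc1)
#   off=$(sb090_offset "$size")
#   dd if=/dev/sdc1 bs=512 skip=$(( off / 512 )) count=8 2>/dev/null | hexdump -C | head
```

Dumping that region before and after a reboot is one way to see whether the on-disk bytes really change or whether a different superblock is being picked up.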

Booting from a Gentoo minimal CD (fresh; I downloaded it on Saturday) shows the same problem: the superblock mysteriously reappears.

Of course, /dev/sdc1 isn't added to the array, so the array starts degraded or doesn't start at all.

When I boot from an old livecd with a 3.0 kernel, or even when I boot the 3.8.13 kernel, the array assembles perfectly at boot. So I think it's a kernel bug.

Has anybody seen mdraid behave like this? Where (and why) does it fetch the old superblock from?
Comment 1 NiTr0 2013-10-30 09:59:00 UTC
Hm, now I have the same situation even with the 3.8.13 kernel.

I fully erased the partition and reassembled the array, with no effect: after a reboot it has the same superblock.
Comment 2 NiTr0 2013-10-31 20:41:14 UTC
Hm, it seems the trouble is caused by dracut's initrd (which I tried to use with the new kernel). I use (tried to use) dracut-0.34-r1. Some initrd component sets the super-minor on the other 3 devices to 127 (I didn't check which other attributes were set) and also writes garbage to the /dev/sdc1 superblock.

Has anybody seen the same behavior?
Comment 3 Amadeusz Żołnowski (RETIRED) gentoo-dev 2013-12-06 07:54:52 UTC
(In reply to NiTr0 from comment #2)
> Hm, it seems the trouble is caused by dracut's initrd (which I tried to use
> with the new kernel). I use (tried to use) dracut-0.34-r1. Some initrd
> component sets the super-minor on the other 3 devices to 127 (I didn't check
> which other attributes were set) and also writes garbage to the /dev/sdc1
> superblock.
> 
> Has anybody seen the same behavior?

I asked the Dracut lead about that:

07:59 < haraldh> aidecoe, to wipe, they should use wipefs
Comment 4 Alexander Tsoy 2013-12-10 20:37:29 UTC
Are you sure that this is really a dracut bug? There is a very similar bug opened for mdadm-3.3: bug 491108
Comment 5 Amadeusz Żołnowski (RETIRED) gentoo-dev 2013-12-11 06:30:13 UTC
Reopen if it is really different from bug #491108, please.

*** This bug has been marked as a duplicate of bug 491108 ***
Comment 6 NiTr0 2014-01-02 14:10:35 UTC
This bug is slightly different from bug #491108. I use superblock v0.90, and everything works OK until I try to boot with the dracut initrd. After that I have superblock corruption (md0 becomes md127, and one of the 4 devices has an old superblock update date plus a broken checksum).
Comment 7 Alexander Tsoy 2014-01-29 17:49:49 UTC
(In reply to NiTr0 from comment #6)
> This bug is slightly different from bug #491108. I use superblock v0.90,
> and everything works OK until I try to boot with the dracut initrd. After
> that I have superblock corruption (md0 becomes md127

Do you have mdadm.conf in the initramfs? AFAIK, this is the only place where MD device names are mapped to UUIDs.

> and one of the 4 devices has an old superblock update date plus a broken checksum).

Please attach the output of the following command (when the array is in degraded state, of course):
mdadm --examine /dev/sd?1
Comment 8 Alexander Tsoy 2014-01-29 17:52:42 UTC
(In reply to Alexander Tsoy from comment #7)
> Please attach the output of the following command (when the array is in
> degraded state, of course):
> mdadm --examine /dev/sd?1

Or, better, specify only the components of the array instead of that pattern.
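
A minimal sketch of examining only named components, one at a time. The function name `examine_components` and the device names in the usage comment are assumptions (only /dev/sdc1 is named in this report); substitute the array's real members.

```shell
#!/bin/sh
# Print the md superblock of each listed component device.
# MDADM may be overridden (for example MDADM=echo) to do a dry run.
examine_components() {
    for dev in "$@"; do
        echo "=== $dev ==="
        "${MDADM:-mdadm}" --examine "$dev"
    done
}

# Example (needs root; device names other than /dev/sdc1 are assumptions):
#   examine_components /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 > examine.txt
```

Listing the members explicitly keeps unrelated partitions, such as the CF card on /dev/sda, out of the output.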
Comment 9 NiTr0 2014-02-09 23:29:43 UTC
I haven't specified the array in mdadm.conf; autodetection works OK without dracut.
As for md127: maybe I missed something when I reassembled the degraded array, or maybe the livecd rewrote the superblock earlier. In any case, when I rebooted with the md127 array in sync, after dracut one component had a bad superblock (invalid super-minor, old date and wrong checksum).
I don't want to experiment on a live array: it holds a lot of data, and I don't have a spare 3TB for a backup. But I can try to dump and send you the regions of the HDDs where the superblock (true or fake) was seen by dracut.
Comment 10 Alexander Tsoy 2014-02-10 11:21:08 UTC
(In reply to NiTr0 from comment #9)
> I haven't specified the array in mdadm.conf; autodetection works OK without dracut.

Do you have MD_AUTODETECT enabled in the kernel config and "md=.." on the kernel cmdline? If yes, then I guess kernel autodetection/autoassembly and incremental assembly performed via udev rules can interfere with each other.
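
One way to answer both questions on the affected machine is sketched here. The helper `md_cmdline_opts` is hypothetical; it only filters a command-line string for `md=` and `raid=` options. The commented `zgrep` line assumes the kernel was built with CONFIG_IKCONFIG_PROC so that /proc/config.gz exists.

```shell
#!/bin/sh
# Hypothetical helper: print the md= and raid= options found in a kernel
# command line passed as a single string.
md_cmdline_opts() (
    set -f                      # no globbing while splitting into words
    for word in $1; do
        case $word in
            md=*|raid=*) echo "$word" ;;
        esac
    done
)

# Example on the running system:
#   md_cmdline_opts "$(cat /proc/cmdline)"
#   zgrep CONFIG_MD_AUTODETECT /proc/config.gz   # requires CONFIG_IKCONFIG_PROC
```

Empty output from the helper means neither option is on the command line; the kernel config check is still needed to know whether in-kernel autodetection is compiled in.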

> As for md127: maybe I missed something when I reassembled the degraded array,
> or maybe the livecd rewrote the superblock earlier. In any case, when I
> rebooted with the md127 array in sync, after dracut one component had a bad
> superblock (invalid super-minor, old date and wrong checksum).

md127 is the default name for the first incrementally assembled array.

> I don't want to experiment on a live array: it holds a lot of data, and I
> don't have a spare 3TB for a backup. But I can try to dump and send you the
> regions of the HDDs where the superblock (true or fake) was seen by dracut.

If my assumptions are correct, then one of the following solutions should solve the problem for you:

1. Disable in-kernel autodetection of MD arrays (in-kernel autodetection is not recommended by upstream): recompile the kernel without CONFIG_MD_AUTODETECT, or append raid=noautodetect to the kernel cmdline, and remove any "md=" options from the kernel cmdline. The mdraid module should be included in the initramfs.
The array will be detected and assembled from within the initramfs (via udev rules).

2. Do not include the mdraid module in the initramfs (omit_dracutmodules+=" mdraid " in /etc/dracut.conf|/etc/dracut.conf.d/*.conf).
The array will be detected and assembled by the kernel.

3. Disable autoassembly of MD arrays with v0.90 metadata:
echo 'AUTO +1.x -all' > /etc/mdadm.conf
Make sure that mdadm.conf is included in the initramfs (mdadmconf="yes" in /etc/dracut.conf|/etc/dracut.conf.d/*.conf).
The array will be detected and assembled by the kernel. If you prefer in-kernel autodetection, then this option is better than the 2nd, imo.
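
For option 3, whether mdadm.conf actually ended up in the regenerated initramfs can be checked from its file listing. The sketch below assumes dracut's `lsinitrd` tool prints one entry per line with the path last; the helper name `listing_has_mdadm_conf` and the image path placeholder are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads an initramfs file listing on stdin and succeeds
# if one of the entries is etc/mdadm.conf.
listing_has_mdadm_conf() {
    grep -Eq '(^|[[:space:]])/?etc/mdadm\.conf$'
}

# Example (the image path is a placeholder for the real initramfs file):
#   lsinitrd /boot/<initramfs image> | listing_has_mdadm_conf \
#       && echo "mdadm.conf is included" \
#       || echo "mdadm.conf is missing"
```

The same listing can be searched for the mdraid module's files to confirm that option 2 took effect.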
Comment 11 Amadeusz Żołnowski (RETIRED) gentoo-dev 2014-04-24 19:35:14 UTC
NiTr0: Did Alexander's hints help you?

It doesn't look like a Dracut bug, but please reopen with additional info if you think it is.