Bug 282100 - [PATCH] add proper mdadm support to genkernel generated initramfs
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High enhancement with 3 votes
Assignee: Gentoo Genkernel Maintainers
URL:
Whiteboard:
Keywords: InVCS
Depends on: 351919
Blocks:
Reported: 2009-08-20 12:39 UTC by Matthias Dahl
Modified: 2011-05-31 01:25 UTC
CC List: 11 users

See Also:
Package list:
Runtime testing required: ---


Attachments
mdadm support for genkernel generated initial ramdisks (mdadm-support.patch,179.49 KB, patch)
2009-08-20 12:40 UTC, Matthias Dahl
ebuild support (ebuild.patch,1.58 KB, patch)
2009-08-20 12:44 UTC, Matthias Dahl
full mdadm support (v2) (0001-use-mdadm-instead-of-bundled-stripped-down-mdassembl.patch,5.32 KB, patch)
2010-08-27 15:39 UTC, Matthias Dahl
ebuild side of mdadm support (ebuild.patch,1.64 KB, patch)
2010-08-27 15:40 UTC, Matthias Dahl
Full mdadm support, with IMSM support (v3) (0001-use-mdadm-instead-of-bundled-stripped-down-mdassembl.patch,5.41 KB, patch)
2010-12-28 00:35 UTC, Laurent Pinchart
Changes from v2 to v3 of mdadm patch (v2-to-v3.diff,1.05 KB, patch)
2011-01-12 23:57 UTC, Sebastian Pipping

Description Matthias Dahl 2009-08-20 12:39:00 UTC
Currently genkernel patches busybox to include a stripped-down mdadm. Among other things, this lacks support for partitioned md arrays.

The attached patch replaces the patched busybox solution with full-blown mdadm support, like it's done with lvm2. I decided against using mdassemble, which is intended for use in initial ramdisks, because it causes its own share of problems and is not actually that much smaller.

Reproducible: Always
Comment 1 Matthias Dahl 2009-08-20 12:40:39 UTC
Created attachment 201781 [details, diff]
mdadm support for genkernel generated initial ramdisks

I have been using this for quite a while and it works just fine. If there are any questions, let me know. I tried to stay true to the style conventions in genkernel, so... :-)
Comment 2 Matthias Dahl 2009-08-20 12:44:50 UTC
Created attachment 201783 [details, diff]
ebuild support

This adds the necessary stuff to the genkernel ebuild.
Comment 3 Matthias Dahl 2009-10-04 17:54:36 UTC
Just wanted to ask if there is any intention of merging this, or if there is anything holding it back in general that I could fix?
Comment 4 Phattanon Duangdara 2010-06-29 14:46:01 UTC
Full mdadm support is required for md RAID with 1.x metadata to work.
Comment 5 Craig Andrews gentoo-dev 2010-08-10 03:21:46 UTC
(In reply to comment #4)
> full mdadm support required for md raid with metadata 1.x to work.
> 

Seconded. I just installed a new hard drive and (inadvertently) upgraded to the 1.2 superblock - it's really annoying that genkernel cannot boot it. This patch would be a great addition to genkernel.
Comment 6 Fabio Erculiani (RETIRED) gentoo-dev 2010-08-10 06:21:10 UTC
Could you clean up your patch first?
Comment 7 Matthias Dahl 2010-08-10 06:41:10 UTC
(In reply to comment #6)
> Could you clean up your patch first?
> 

Sure. Please just tell me what you want exactly and I'll take care of it. AFAICR I stayed true to the implicit style guidelines and mostly did it the way it was done with other things within genkernel.
Comment 8 Fabio Erculiani (RETIRED) gentoo-dev 2010-08-10 06:57:01 UTC
Just get rid of the garbage around it. It's a 179 KB patch for just a bunch of line changes. Don't let it pull in other stuff; I know what patches should be dropped, etc.
Comment 9 Gabe Peters 2010-08-22 01:58:18 UTC
Was attempting to get boot-time configuration for an imsm raid0 through mdadm, this patch fixed me right up. Now my gentoo and Win7 live in harmony side by side on the fakeraid. Still booting off another drive mounted to /boot - grub/grub2 won't boot directly to the fakeraid yet. When will we see this patch merged into genkernel? :)
Comment 10 Matthias Dahl 2010-08-23 18:15:28 UTC
Sorry. I'll attach a revised patch asap. I was very busy lately and unfortunately just haven't gotten around to it yet.
Comment 11 Matthias Dahl 2010-08-27 15:39:46 UTC
Created attachment 244967 [details, diff]
full mdadm support (v2)

Version 2.

Adds some fixes and basically cleans up the patch. Applies against current HEAD but also cleanly against current release.
Comment 12 Matthias Dahl 2010-08-27 15:40:48 UTC
Created attachment 244969 [details, diff]
ebuild side of mdadm support
Comment 13 Matthias Dahl 2010-08-27 15:42:27 UTC
Forgot to mention, I raised the mdadm version to 3.1.3 which works just fine for me and I don't see a problem with this. Nevertheless, just a word of caution for anyone trying this out.
Comment 14 Xake 2010-12-02 12:27:40 UTC
(In reply to comment #11)
> Created an attachment (id=244967) [details]
> full mdadm support (v2)
> 
> Version 2.
> 
> Adds some fixes and basically cleans up the patch. Applies against current HEAD
> but also cleanly against current release.
> 

This patch does not cleanly "git apply" on top of genkernel from git master (as found on git.overlays.gentoo.org) so please rebase the patch on top of the current genkernel git master.
Comment 15 Xake 2010-12-02 12:48:15 UTC
(In reply to comment #14)

Also, do you really need "echo "DEVICE /dev/sd[a-z]* /dev/hd[a-z]*" >/etc/mdadm.conf"? For me, mdadm --examine --scan can handle this on its own, and if that line really constrains what is scanned, it may break special cases.

Also, "mdadm --assemble --scan" looks better than "mdadm -A --scan".
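
For readers following the point above, here is a minimal sketch of an assembly step that spells out the long options and does not write a restrictive DEVICE line. It is not the code from the attached patch: `assemble_arrays` and the `MDADM` override are hypothetical names used only for illustration, and the sketch assumes mdadm's `--config=` option and its auto-assembly behaviour when no configuration file is present.

```shell
#!/bin/sh
# Sketch only, not genkernel's actual initramfs code.
# MDADM can be overridden (e.g. with "echo") to dry-run the function
# without touching real devices.
MDADM="${MDADM:-mdadm}"

assemble_arrays() {
    conf="$1"   # optional path to an mdadm.conf; may be empty or missing
    if [ -n "$conf" ] && [ -f "$conf" ]; then
        # A config file exists: let it decide which arrays are assembled.
        "$MDADM" --assemble --scan --config="$conf"
    else
        # No config file: leave device discovery to mdadm itself.
        "$MDADM" --assemble --scan
    fi
}
```

Running it as `MDADM=echo; assemble_arrays /etc/mdadm.conf` only prints the command line that would be used, which is a cheap way to check the logic outside an initramfs.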
Comment 16 Alexander Zubkov 2010-12-06 10:47:48 UTC
I've tested this patch too. It assembles my raid, but I have a problem: my md device contains partitions. It looks like the kernel doesn't scan it for partitions automatically, and mdadm doesn't send the specific ioctl to the kernel for this.
I found that calling blkid also causes this re-read. But it looks like blkid uses some sort of cache, and it doesn't show labels on the first run. So mounting by UUID does not work by itself either; I need to hit Enter at the prompt for the root filesystem for it to try one more time.
I have updated busybox in genkernel to the latest version (1.18.0) because they added blockdev there, and I wanted to use it for "pushing" the kernel in some way. And ta-da! They also added their own blkid to busybox, which acts a little differently and gives the label on the first try. So my system boots, at least by UUID or LABEL. Plain /dev/md0p2 still does not work, of course.
So my proposal is to update busybox and use its blkid with this patch too. Sorry if this is off-topic.
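
The "pushing" idea above could look roughly like the following sketch, which asks the kernel to re-read the partition table of each assembled array. It assumes a `blockdev` applet that supports `--rereadpt` (as util-linux's does); `rescan_partitions` and the `BLOCKDEV` override are hypothetical names, and this is not something the patch in this bug is known to do.

```shell
#!/bin/sh
# Sketch only. BLOCKDEV can be overridden (e.g. with "echo") for a dry run.
BLOCKDEV="${BLOCKDEV:-blockdev}"

rescan_partitions() {
    for dev in "$@"; do
        # Skip anything that is not a block device, e.g. an array
        # that has not been assembled yet.
        [ -b "$dev" ] || continue
        # Ask the kernel to re-read the partition table of this device.
        "$BLOCKDEV" --rereadpt "$dev"
    done
}
```

A call such as `rescan_partitions /dev/md0` would be placed after assembly and before the root device is looked up; whether /dev/md0p2 then appears still depends on the kernel and on device node handling in the ramdisk.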
Comment 17 Vladislav Poluhin 2010-12-09 05:47:28 UTC
Added the ebuild to my overlay:
http://bitbucket.org/nuklea/overlay/changeset/f736540ce94e

Comment 18 Laurent Pinchart 2010-12-28 00:35:45 UTC
Created attachment 258228 [details, diff]
Full mdadm support, with IMSM support (v3)

The proposed v2 patch is missing mdmon. mdmon is required to monitor IMSM containers and switch the associated arrays from read-only to read-write mode.
Comment 19 Sebastian Pipping gentoo-dev 2011-01-05 23:01:49 UTC
PS: Adding keyword "Inclusion" to better show this bug's nature in searches...
Comment 20 Sebastian Pipping gentoo-dev 2011-01-12 23:57:48 UTC
Created attachment 259663 [details, diff]
Changes from v2 to v3 of mdadm patch

Thanks to Matthias and Laurent for the patches.

We have just added GPG support to genkernel in Git (bug #217959), so I can imagine getting mdadm in soon, too.  As I don't use RAID in my setup here, I will need one (or more) of you as a tester.  Please drop me a mail so I can add you to <genkernel@g.o.> to get this going.
Comment 21 Sebastian Pipping gentoo-dev 2011-01-17 16:07:43 UTC
Patches have now been applied to the experimental branch exposed by genkernel-99999 (five nines).

http://git.overlays.gentoo.org/gitweb/?p=proj/genkernel.git;a=shortlog;h=refs/heads/experimental
Comment 22 Sebastian Pipping gentoo-dev 2011-01-20 15:03:28 UTC
Btw the sooner one of you gets a chance testing genkernel-99999 (five nines) the sooner I can turn this into a new release :-D
Comment 23 Xake 2011-01-20 15:37:36 UTC
(In reply to comment #22)
> Btw the sooner one of you gets a chance testing genkernel-99999 (five nines)
> the sooner I can turn this into a new release :-D
> 

Hehe, was about to but got tackled by the config-bug and got sidetracked.

But just a heads-up: here genkernel is able, with the help of mdadm, to pick up and enable level-0 fakeraids and boot from the LVM residing on a partition of one of them.
Comment 24 Sebastian Pipping gentoo-dev 2011-01-20 17:41:17 UTC
Thanks.

One other person promised testing during the next few days.  After that, I'll make a new release.
Comment 25 Matthias Dahl 2011-01-20 18:50:31 UTC
I'm _very_ sorry for my extended absence over the last couple of months, but as life threw a couple of obstacles my way, I had to prioritize. :-( I'm sorry I haven't responded earlier.

Back to the topic. :-) Thanks for applying the patch. I too have just tested the experimental branch and it works flawlessly on my machine as well (partitioned raid5 w/ root on one of those partitions). Thumbs up from my side.
Comment 26 Craig Andrews gentoo-dev 2011-01-23 17:38:29 UTC
(In reply to comment #24)
> Thanks.
> 
> One other person promised testing during the next fews days.  After that, I'll
> make a new release.
> 

I just completed testing, and it works great! I'm back to unpatched genkernel and very happy about it!
Comment 27 Sebastian Pipping gentoo-dev 2011-01-23 18:34:40 UTC
Thanks for the patch and for testing!

+*genkernel-3.4.12 (23 Jan 2011)
+
+  23 Jan 2011; Sebastian Pipping <sping@gentoo.org> +genkernel-3.4.12.ebuild,
+  genkernel-9999.ebuild:
+  Bump to 3.4.12, sync 9999 (four nines) live ebuild
+
Comment 28 the_mgt 2011-05-07 21:30:15 UTC
Ok, first, if you really fixed it, let me thank you.
Second, please explain why a bug reported over a year ago for the stable version (I assume so) of genkernel is now marked "resolved fixed" when it is only fixed in the ~arch version of the package?

I recently switched from an nvidia raid layout with an encrypted / partition to an mdraid with encrypted lvm containing the rootfs, and ripped my hair out because the sucker wouldn't assemble my raid properly. Although I included the mdadm.conf, I still needed to run mdadm --assemble --scan in the shell, and only after that would genkernel's initrd continue to boot. I found this bug by accident via a Google search and wondered why it is marked resolved fixed when it doesn't work in real life. Then I saw that it only works with ~arch.

Is my understanding of "resolved fixed" wrong? Does it not mean that all versions of the package are fixed? My intuition would say that packages are keyworded because they are still in testing condition, possibly doing weird stuff while the normal packages are the working ones, bugfree, not eating kittens, etc.

Why are Gentoo users forced to use ~arch when they want a working system? Sorry for my rantings, but I hit bugs like this regularly every three months.

Will test ~arch tomorrow.
Comment 29 Xake 2011-05-08 10:57:11 UTC
(In reply to comment #28)
> Ok, first, if you really fixed it, let me thank you.
> Second, please explain why a bug reported over a year ago for the stable
> version (I assume so) of genkernel is now marked "resolved fixed" when it is
> only fixed in the ~arch version of the package?
> 

> Is my understanding of "resolved fixed" wrong? Does it not mean that all
> versions of the package are fixed? My intuition would say that packages are
> keyworded because they are still in testing condition, possibly doing weird
> stuff while the normal packages are the working ones, bugfree, not eating
> kittens, etc.

When it comes to a bug against an ebuild, then yes.
However when it comes to genkernel we are the upstream, which makes stuff a bit different.
In this case it means that this bug is against the current upstream codebase of genkernel, instead of the ebuild.
Also, in upstream some subsystems have been reworked, bugs have been fixed and so on. Currently 3.4.15 has lost the last showstopping bug we found wrt mdadm (the handling of mdadm.conf was broken, so auto-detection never worked). However, that version also has some new features fixing things wrt the handling of /dev. These new features do not work that well with baselayout1, so we cannot stabilize a new version of genkernel that we know should work before OpenRC has gone stable (which should happen today). So maybe soon there will be a request to the arch team to test and mark genkernel 3.4.15 stable...

Also, when this bug started it was more of a feature request to add support for new things than something broken that needed fixing, which is what it has lately become.


> Why are gentoo users forced to use ~arch when they want a working system? Sorry
> for my rantings, but I hit bugs like this regularly every three month.
> 
> Will test ~arch tomorrow.

Genkernel has a past of lacking maintenance, and has just picked up speed again. 
3.4.15 should work well for you if you are using OpenRC. Please report back if that is the case, or open a new bug if you find any problems.
Comment 30 the_mgt 2011-05-08 18:00:42 UTC
(In reply to comment #29)
> (In reply to comment #28)
> When it comes to a bug against an ebuild, then yes.
> However when it comes to genkernel we are the upstream, which makes stuff a bit different.
Thanks for your friendly answer to my harsh post! I understand your point wrt upstream completely. My frustration comes from bugs where Gentoo is not upstream or which have been open for longer than a year and a half.
> These new features does not work that well with baselayout1, so we
> cannot stabilize a new version of genkernel that we know should work before
> OpenRC has gone stable (which should happen today).
Hurray for OpenRC going stable, it was about time. It is one of the few things I use from the ~arch part of Gentoo and I really love it. I also understand perfectly well why this would make backporting difficult.

> > Will test ~arch tomorrow.
> 3.4.15 should work well for you if you are using OpenRC. Please report back if
> that is the case or open a new bug if you found any problems.

I did test today with the ~arch version and the default genkernel.conf, and I have been using baselayout2 for a while on this system. Sadly, it didn't work out. Reading the genkernel.conf, mdadm.conf was not copied to the initrd on purpose; initially I was wondering about that. But autodetection didn't work either.

Some information about my setup:
I followed this guide http://en.gentoo-wiki.com/wiki/Root_filesystem_over_LVM2,_DM-Crypt_and_RAID almost to the letter: 2 disks with identical partitions, the first one for booting, the second for lvm with this layout: raid->cryptfs->lvm, in this order. sda2 and sdb2 were assembled as md2.
I call for crypt_root=/dev/md2 in my grub.conf, since that is what I called the md when I created it.
Unfortunately, the Gentoo livecd from which I set things up autodetects/assembles this raid as /dev/md127 and calls it name=livecd:2; at least I get that number after a reboot with the livecd.

Now this is the behaviour I get for the two different genkernel versions:
With arch-genkernel, after dropping to a shell because of failed autodetect, when I run "mdadm --assemble --scan" (with a correct copy of mdadm.conf present in the initrd), I get a /dev/md2, exit the shell and genkernel asks for crypt passphrase and the system boots.
With ~arch-genkernel, no autodetect either, dropping to a shell and running "mdadm --assemble --scan" gives me /dev/md127 (and /dev/md/livecd:2 iirc), so I exit the shell and have to point initrd to the proper md dev. Then I am asked the passphrase and booting works fine.

Is it possible that my md somehow remembers the stupid livecd settings (the number "127" and the "name") in the superblock? And does it fail now, because I still have "md2" as crypt_root? I can only continue testing tomorrow, and will do. I need dropbear support in my initrd...
Comment 31 Xake 2011-05-08 19:54:08 UTC
Can you please make sure you have "domdadm" in your grub.conf? If that does not work, please open a new bug against genkernel-3.4.15 and post your grub.conf, /etc/fstab, /etc/mdadm.conf (you may remove every line starting with #), the installed versions of mdadm and lvm2, and emerge --info.
Please also try to post the output from the genkernel ramdisk during boot (i.e. any output from mdadm, from lvm, from anything at all) before it tells you it cannot find /dev/md2.
CC me on that bug, and we can continue there, because your problem is out of scope for this bug.

And it is known and expected that, with a 1.x superblock, mdadm nowadays (and also in the ramdisk, now that we no longer have a stub) names the arrays /dev/md125 or higher. If you want consistent behaviour there are two ways I know should work: use /dev/md/*, or have an mdadm.conf that makes sure things are consistent. UUID may also work, but I have not tested it, so I cannot guarantee it.
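
As an illustration of the mdadm.conf route mentioned above, the following sketch prints an ARRAY line that ties a fixed device node to an array UUID. The helper name `array_line` is hypothetical, and the all-zero UUID in the usage note is a placeholder; the real value would come from `mdadm --detail` or `mdadm --examine` on the system in question.

```shell
#!/bin/sh
# Sketch only: print an mdadm.conf ARRAY line for one array.
array_line() {
    node="$1"   # device node the array should get, e.g. /dev/md2
    uuid="$2"   # array UUID as reported by mdadm
    # Refuse to print an incomplete line.
    [ -n "$node" ] && [ -n "$uuid" ] || return 1
    printf 'ARRAY %s UUID=%s\n' "$node" "$uuid"
}
```

For example, `array_line /dev/md2 00000000:00000000:00000000:00000000 >> /etc/mdadm.conf` appends one such line. The file only helps at boot if it also ends up inside the ramdisk, which according to comment 30 is not the default in genkernel.conf.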
Comment 32 the_mgt 2011-05-09 10:41:03 UTC
(In reply to comment #31)
> Can you please make sure you have "domdadm" in your grub.conf
I was missing "domdadm" in my grub.conf. After migrating from dmraid, I searched the manpage but found no option there. I tried "domdraid" (since it was "dodmraid" for the fakeraids) but it didn't help. So I thought there was some general autodetection routine if "md" was used for either real or crypt root...
Conclusion:
1. All is well! Genkernel works fine, if you know the proper option.
2. Please add "domdadm" to man page
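
Since the missing option was the whole problem here, a small check like the sketch below can confirm whether a given boot option is really present on the kernel command line. `has_boot_option` is a hypothetical helper written for this illustration, not a genkernel function; it matches whole words only, so "domdraid" does not count as "domdadm".

```shell
#!/bin/sh
# Sketch only: succeed if the exact option appears on the command line.
has_boot_option() {
    wanted="$1"    # option to look for, e.g. domdadm
    cmdline="$2"   # kernel command line as one string
    # Intentionally unquoted: split the command line on whitespace.
    for opt in $cmdline; do
        [ "$opt" = "$wanted" ] && return 0
    done
    return 1
}
```

On a running system it can be used as `has_boot_option domdadm "$(cat /proc/cmdline)" || echo "domdadm is missing"`.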

> If you want to have a consistent behaviour there are two ways I
> know should work: use the /dev/md/*, or have a mdadm.conf that makes sure
> things are consistent. uuid may work also, but I have not tested so I cannot
> guarantee it.
Using "/dev/md/livecd:2" worked fine; md127 might have worked too. "crypt_root=UUID=bla:blah:foo:blah" did not work, but I have never used UUIDs before since I find them annoying. But I am willing to test if you want; mail me directly or query me on freenode (my nick is the_mgt there, too).

Thanks for your patience and support!
Comment 33 Xake 2011-05-09 12:19:32 UTC
(In reply to comment #32)
> Conclusion:
> 1. All is well! Genkernel works fine, if you know the proper option.
Good to hear!:-)

> 2. Please add "domdadm" to man page
I will! That option is old, so I somehow had missed that it was not documented there, like it should be... So thanks for the heads up!

> Using "/dev/md/livecd:2" did work fine, md127 might have also been working.
> "crypt_root=UUID=bla:blah:foo:blah" did not work, but I never ever used uuid's
> before since I find them annoying. But I am willing to test if you want, mail
> me directly or query me on freenode (nick is the_mgt there, too)

Nah, that is OK. I can set up my own test system here if I find time to work on it.
Comment 34 Sebastian Pipping gentoo-dev 2011-05-31 01:25:29 UTC
(In reply to comment #33)
> > 2. Please add "domdadm" to man page
> I will! That option is old, so I somehow had missed that it was not documented
> there, like it should be... So thanks for the heads up!

I have opened a new bug for that: bug #369415