Bug 414259 - sys-fs/mdadm: handling root-on-raid in init.d on shutdown
Summary: sys-fs/mdadm: handling root-on-raid in init.d on shutdown
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal enhancement
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on: 119380
Blocks:
 
Reported: 2012-05-01 15:29 UTC by Duncan
Modified: 2014-05-05 07:03 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---



Description Duncan 2012-05-01 15:29:18 UTC
At the end of the stop function of my mdraid initscript, I've had to do this:

        ebegin "Shutting down RAID devices (mdadm)"
        output=$(mdadm -Ss 2>&1)
        eend $? "${output}"
+
+       # Always return 0 on stop. The kernel takes care of itself
+       # and with rootfs on mdraid, without this it won't stop going
+       # into single user mode and thus won't start coming out.
+       return 0
 }


From the perspective of the mdadm and kernel md/raid maintainer, md/raid arrays don't strictly need to be stopped: when the kernel shuts down it resets them to read-only and syncs them. Still, stopping them explicitly is nice to do when possible, just in case.

However, when root is on mdraid, that array of course cannot be stopped, as there are still active processes with open files. Due to the above, that in itself isn't a real problem.

The problem, as stated in the comment added by the patch, is not with shutdown but with going to single-user mode and trying to come back out. Because mdadm correctly returns an error exit status, and the initscript returns that status as well, switching to single-user mode (really, to any runlevel that doesn't include mdraid) leaves the mdraid initscript failing to stop.

Because the initscript fails to stop successfully, it won't be restarted when coming back out of single-user mode to a normal runlevel that includes mdraid.

The fix is simple enough: return a 0/success exit status from stop() regardless of mdadm's exit status. It makes no difference when going to runlevel 0 or 6 anyway, but it fixes going to single-user (runlevel 1) and coming back out!

(This is one of two changes I've made to that script; the other is unrelated and I'll file it as a separate bug.)
Comment 1 SpanKY gentoo-dev 2012-05-03 05:43:00 UTC
blindly ignoring the return value isn't good.  if we could tell mdadm to stop all devices except for ones listed on the command line, that might be a good extension.  and if you could do it by mount point, that'd be even better.
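
For illustration only, here is one possible sketch of that "stop everything except the root array" idea, not the shipped initscript: it skips whatever device findmnt reports as backing / and stops the remaining arrays one by one instead of via mdadm -Ss. The findmnt call and the /dev/md* glob are assumptions, and cases like root on a partition of an md, or on LVM over md, are not handled.

stop() {
        # Sketch: stop each md array individually, skipping the one that
        # backs / so its (expected) failure doesn't mark the service failed.
        local root_dev md ret=0
        root_dev=$(findmnt -n -o SOURCE / 2>/dev/null)

        ebegin "Shutting down RAID devices (mdadm)"
        for md in /dev/md[0-9]*; do
                [ -b "${md}" ] || continue
                [ "${md}" = "${root_dev}" ] && continue
                mdadm --stop "${md}" || ret=1
        done
        eend ${ret}
}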
Comment 2 Duncan 2012-11-12 04:39:09 UTC
Just a "real world" update.  I upgraded my system and am presently not running mdraid, tho I expect I will be again in the future.  So I can't do any tests, etc, ATM.

Here's a thought: With root on md/raid (at least without an initr* that starts it), the kernel must start at least the root md before launching userspace (init, etc), so the initscript (and mdadm itself) doesn't need to touch it.

What about leaving the rootmd out of mdadm.conf? Then the initscript won't try to start it (the kernel already starts it anyway) or stop it, eliminating the problem.
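
Purely as an illustration of that idea (device names and UUIDs below are placeholders, not taken from this bug), mdadm.conf would then list only the non-root arrays:

# /etc/mdadm.conf -- hypothetical example; the root array (say /dev/md0)
# is deliberately not listed here.
DEVICE partitions
ARRAY /dev/md1 UUID=<uuid-of-md1>
ARRAY /dev/md2 UUID=<uuid-of-md2>
MAILADDR root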

The problem with that is that mdadm in monitor mode won't see it either, so one would have to monitor the rootmd some other way if desired. Also, without that md listed in mdadm.conf, mdadm maintenance on it would be much more complicated, since all parameters would need to be supplied by hand instead of being pulled from mdadm.conf.

Another alternative: rewrite the initscript to take a list of mds to start and stop. It could then iterate that list, starting/stopping each array individually, instead of using mdadm's config to manage everything at once. The rootmd could simply be left out of that list, but it'd still be listed in mdadm.conf, so monitor mode would still see it, and the usual mdadm maintenance commands would "just work".
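
A rough sketch of that alternative, assuming the list lives in a conf.d variable (the variable name and the whole script below are hypothetical, not an existing Gentoo script): assembling by device name alone still pulls the members and parameters from mdadm.conf, so the config file stays complete.

# /etc/conf.d/mdraid (hypothetical): arrays the initscript should manage;
# the root array is simply not listed here.
#MDRAID_DEVICES="/dev/md1 /dev/md2"

start() {
        local md ret=0
        ebegin "Starting RAID devices (mdadm)"
        for md in ${MDRAID_DEVICES}; do
                # Assembling by name pulls members/parameters from mdadm.conf.
                mdadm --assemble "${md}" || ret=1
        done
        eend ${ret}
}

stop() {
        local md ret=0
        ebegin "Shutting down RAID devices (mdadm)"
        for md in ${MDRAID_DEVICES}; do
                mdadm --stop "${md}" || ret=1
        done
        eend ${ret}
}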

But that's a big change from the current script's approach of just letting udev and mdadm automate most of it. It's a lot of extra work, and a big enough change that it would quite likely introduce bugs and could break currently working systems if the admin wasn't paying attention during the upgrade and botched the config-file updates.

Additionally, the way the current setup works, if there are md/raid devices that aren't normally started at boot, but that might be started manually by the admin later, the mdraid script will still catch and shut them down as long as they're in mdadm.conf.  If we switched to specific md enumeration and only managed those in that list, it wouldn't shut down any others started manually during a session, as it does now.
Comment 3 Duncan 2014-05-05 07:03:23 UTC
Cleaning out old bugs from my bugs list, I came across this one.

Not only do I now use btrfs raid modes instead of mdraid these days, but I've also switched to systemd and am thus no longer tracking OpenRC initscripts.

As I have no idea whether this remains an issue, I'm closing it. If anyone on CC, or anyone coming across this bug, believes it should still be open, please comment with an update detailing the current status and I can reopen it.

Resolving as NEEDINFO for now.