Bug 395203 - sys-apps/mdadm-3.2.1 not shutting down imsm raid array properly
Summary: sys-apps/mdadm-3.2.1 not shutting down imsm raid array properly
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal with 8 votes
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-12-18 20:54 UTC by Daniel Frey
Modified: 2023-09-09 03:01 UTC
CC List: 16 users

See Also:
Package list:
Runtime testing required: ---


Attachments
killprocs options to exclude mdmon from kill process (killprocs,153 bytes, text/plain)
2011-12-18 20:55 UTC, Daniel Frey
initscript to add to shutdown level to stop mdmon gracefully (mdadm-shutdown,137 bytes, text/plain)
2011-12-18 20:56 UTC, Daniel Frey
killprocs.patch (killprocs.patch,408 bytes, patch)
2013-01-30 15:48 UTC, Alexander Tsoy
Patch to suppress warnings (mdadm-shutdown.patch,548 bytes, patch)
2013-02-14 23:05 UTC, Daniel Frey

Description Daniel Frey 2011-12-18 20:54:11 UTC
I'm using an Intel (imsm) raid array (raid10) in order to dual boot.

Intel is officially supporting mdadm as the tool to use to access these raid devices.

I've noticed that every time I reboot the array is in a degraded state, and mdadm proceeds to rebuild it.

I've marked this as critical as it could cause data loss on the raid array in some rare (weird?) circumstances.

Reproducible: Always

Steps to Reproduce:
1. Create an imsm raid array
2. Install gentoo with mdadm-3.2.1 (or any mdadm version, actually...)
3. cat /proc/mdstat, ensure array is in a normal state
4. Reboot, notice that the kernel marks the array as degraded
5. cat /proc/mdstat and watch your array rebuild
Actual Results:  
A reboot shouldn't cause the array to be marked as dirty/degraded.

Expected Results:  
It should boot without constantly rebuilding the array. (By the way, I lost a hard drive because of the constant rebuilding that I didn't initially notice, hence the critical severity.)

During shutdown, mdmon is terminated before root is mounted read-only. If mdmon is shut down gracefully rather than terminated, this error doesn't happen and the system reboots normally with no issues.

# equery list mdadm openrc
[IP-] [  ] sys-fs/mdadm-3.2.1:0
[IP-] [  ] sys-apps/openrc-0.9.4:0 

With a couple of us collaborating on the forums* (note below), we've come up with a working solution to this problem. It may actually help shutting down non-imsm arrays as well, as mdmon is running for any array it supports.

* http://forums.gentoo.org/viewtopic-t-888520.html


I don't see a way to attach files while creating a bug, so I'll post again with the scripts we are using that make this problem go away.
Comment 1 Daniel Frey 2011-12-18 20:55:24 UTC
Created attachment 296309 [details]
killprocs options to exclude mdmon from kill process
Comment 2 Daniel Frey 2011-12-18 20:56:17 UTC
Created attachment 296311 [details]
initscript to add to shutdown level to stop mdmon gracefully
Comment 3 Daniel Frey 2011-12-18 21:01:08 UTC
Basically, we need to exclude mdmon from being terminated and have it shut down as part of the shutdown runlevel.

If mdadm requires openrc-0.9.4 or higher, we can use /etc/conf.d/killprocs to exclude mdmon from the termination process (attachment included).
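
For illustration only, here is a minimal sketch of how mdmon could be excluded through /etc/conf.d/killprocs. It is not the attached file. It assumes that the file is sourced as shell, that the killprocs service passes `killall5_opts` on to killall5, and that killall5 accepts `-o PID` to omit a process; `build_omit_opts` is a hypothetical helper name.

```shell
# Hypothetical helper: turn a list of PIDs into killall5 "-o PID" options.
# Prints an empty line when no PIDs are given (e.g. mdmon is not running).
build_omit_opts() {
    opts=""
    for pid in "$@"; do
        # Append "-o PID", inserting a separating space after the first entry.
        opts="${opts:+$opts }-o $pid"
    done
    printf '%s\n' "$opts"
}

# Assumed usage inside /etc/conf.d/killprocs (killall5_opts is an assumption
# about what the killprocs service reads):
# killall5_opts="$(build_omit_opts $(pidof mdmon))"
```

Building the option list at shutdown time, instead of hard-coding a PID, keeps the setting valid across reboots and when more than one mdmon instance is running.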

Also attached is an initscript that will shut down mdmon gracefully after root is mounted read-only. It is added to the shutdown runlevel.
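
As a rough sketch (not the attached mdadm-shutdown script), an OpenRC service of this kind might look like the following. The `mdadm --wait-clean --scan --quiet` command is the one quoted later in this thread; the interpreter path, the description text, and the decision to pass mdadm's exit status to `eend` are assumptions.

```shell
#!/sbin/openrc-run
# Sketch of an mdadm-shutdown service; the layout is an assumption, not the
# attachment from this bug.

description="Wait for external-metadata md arrays to be marked clean"

depend() {
    # Order this service after root has been remounted read-only.
    after mount-ro
}

start() {
    ebegin "Waiting for md arrays to be marked clean"
    # Let mdmon finish updating the metadata before the system goes down.
    mdadm --wait-clean --scan --quiet
    eend $?
}
```

Saved as /etc/init.d/mdadm-shutdown and made executable, it would be enabled with `rc-update add mdadm-shutdown shutdown`.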

(Could this also stop the 'root partition is in use' problem during shutdown I've seen in other bugs?)

More info is in the forum link in the original bug report.

If these two files are included with mdadm it would be SO much easier getting imsm to shut down properly.

A notice on the ebuild might be nice for users too, just reminding them to add mdadm-shutdown to the shutdown runlevel.

I'll add myself to the CC list in case there are more questions.
Comment 4 blakawk 2012-01-02 03:48:32 UTC
I'm getting that bug too. I have a RAID 5 using IMSM metadata, fixed using the given workaround.

# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] 
md126 : active raid5 sdb[3] sdc[2] sdd[1] sde[0]
      937706496 blocks super external:/md127/0 level 5, 32k chunk, algorithm 0 [4/4] [UUUU]
      [==>..................]  resync = 11.4% (35938120/312568960) finish=65.3min speed=70543K/sec
      
md127 : inactive sdb[3](S) sdc[2](S) sdd[1](S) sde[0](S)
      9028 blocks super external:imsm
# mdadm -E /dev/md127
/dev/md127:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.2.02
    Orig Family : b64a2ed1
         Family : b64a2ed1
     Generation : 0000bd08
           UUID : 7dc67b55:d3b0e81f:0b2142e7:3b8e4133
       Checksum : e251871f correct
    MPB Sectors : 2
          Disks : 4
   RAID Devices : 1

  Disk01 Serial : WD-WMAV2P176453
          State : active
             Id : 00030000
    Usable Size : 625137934 (298.09 GiB 320.07 GB)

[Raid5]:
           UUID : c7d13342:ad60c20b:0cbc2b84:a6d0bd76
     RAID Level : 5
        Members : 4
          Slots : [UUUU]
      This Slot : 1
     Array Size : 1875412992 (894.27 GiB 960.21 GB)
   Per Dev Size : 625137928 (298.09 GiB 320.07 GB)
  Sector Offset : 0
    Num Stripes : 9767776
     Chunk Size : 32 KiB
       Reserved : 0
  Migrate State : repair
      Map State : normal <-- normal
     Checkpoint : 152621 (256)
    Dirty State : clean

  Disk00 Serial : WD-WMAV2P635457
          State : active
             Id : 00020000
    Usable Size : 625137934 (298.09 GiB 320.07 GB)

  Disk02 Serial : WD-WMAV2C166898
          State : active
             Id : 00040000
    Usable Size : 625137934 (298.09 GiB 320.07 GB)

  Disk03 Serial : WD-WMAV2C258315
          State : active
             Id : 00050000
    Usable Size : 625137934 (298.09 GiB 320.07 GB)
# emerge --info mdadm
Portage 2.1.10.44 (default/linux/amd64/10.0/desktop/gnome, gcc-4.5.3, glibc-2.14.1-r1, 3.1.6-gentoo x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-3.1.6-gentoo-x86_64-Intel-R-_Core-TM-_i7-2600K_CPU_@_3.40GHz-with-gentoo-2.1
Timestamp of tree: Sun, 01 Jan 2012 23:45:01 +0000
app-shells/bash:          4.2_p20
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.2-r3, 3.2.2
dev-util/cmake:           2.8.6-r4
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1
sys-apps/openrc:          0.9.4
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.10.3, 1.11.1-r1
sys-devel/binutils:       2.22
sys-devel/gcc:            4.5.3-r1
sys-devel/gcc-config:     1.5-r2
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 2.6.39 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r1
Repositories: gentoo
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=native -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--autounmask=n --with-bdeps=y --jobs=9"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles metadata-transfer news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://mirrors.linuxant.fr/distfiles.gentoo.org"
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="fr en"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync1.fr.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa amd64 bash-completion berkdb bluetooth branding bzip2 cairo cdda cdr cli colord consolekit cracklib crypt cups cxx dbus dri dts dvd dvdr eds emboss encode evo exif fam firefox flac fortran gdbm gdu gif gnome gnome-keyring gnome-online-accounts gpm gstreamer gtk iconv ipv6 jpeg lcms libnotify mad mmx mng modules mp3 mp4 mpeg mudflap multilib nautilus ncurses networkmanager nls nptl nptlonly ogg opengl openmp pam pango pcre pdf png policykit ppds pppd pulseaudio qt3support readline sdl session socialweb spell sse sse2 ssl startup-notification svg sysfs tcpd tiff truetype udev unicode usb vim-syntax vorbis x264 xcb xinerama xml xorg xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 
lb216 lcdm001 mtxorb ncurses text" LINGUAS="fr en" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="radeon" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
                        Package Settings
=================================================================

sys-fs/mdadm-3.1.5 was built with the following:
USE="(multilib) -static"
Comment 5 Artem Butusov 2012-03-08 16:28:41 UTC
Still unconfirmed... 3 months...
Comment 6 Alexander Tsoy 2013-01-30 15:17:27 UTC
+1

MDMON(8) says:

"At shutdown time, mdmon should not be killed along with other processes. Also  as it holds a file (socket actually) open in /dev (by default) it will not be possible to unmount /dev if it is a separate filesystem."
Comment 7 Alexander Tsoy 2013-01-30 15:48:47 UTC
Created attachment 337324 [details, diff]
killprocs.patch

(In reply to comment #1)
> Created attachment 296309 [details]
> killprocs options to exclude mdmon from kill process

Maybe it would be better to patch killprocs?
Comment 8 Daniel Frey 2013-02-14 23:05:59 UTC
Created attachment 338912 [details, diff]
Patch to suppress warnings

Newer versions of mdadm generate warnings about stopping the array when using the original version of the script.

The script is still needed; if it is removed mdadm will mark the array as dirty on the next reboot.

This patch suppresses the warnings/errors generated by the mdadm-shutdown script. It does suppress all of them though, so use at your own risk! (I was tired of the notices on shutdown.)
Comment 9 Daniel Frey 2014-11-29 18:13:55 UTC
Just thought I'd post a note on here - I decided to try systemd and it does not exhibit this behaviour; it shuts down the raid device properly.

This bug presumably only affects openrc.
Comment 10 Frank Sieber 2014-12-09 15:00:39 UTC
I also can confirm this bug.

$ equery list mdadm openrc
[IP-] [  ] sys-fs/mdadm-3.3.2:0
[IP-] [  ] sys-apps/openrc-0.12.4:0

I have a RAID-5 with 3 disks.

Adding the mdadm-shutdown script and the killall5 options solves the problem for me too.

The reason systemd users are not affected may be that mdadm (or its ebuild) delivers an mdadm shutdown script for systemd: /usr/lib/systemd/system-shutdown/mdadm.shutdown. I don't know since when this has been the case, though; I only noticed it while searching.

The manpage of mdmon says: At shutdown time, mdmon should not be killed along with other processes.  Also as it holds a file (socket actually) open in /dev (by default) it will not be possible to unmount /dev if it is a separate filesystem.

So maybe this scenario should be supported by the OpenRC framework. When emerging mdadm, it should automatically install an mdadm-shutdown script and the killall5 options, plus an elog notice about this.
Comment 11 Kris Hepler 2015-02-05 06:20:23 UTC
I am seeing this problem on systemd (and openrc as well, but that's old news).
Comment 12 lperkins 2020-03-27 16:11:15 UTC
I'd like to confirm that, 19 years later, this is still an issue with IMSM RAID arrays.

The killprocs.patch will no longer work as that's not where the .pid files are kept any more, but the killprocs options do work.

You also need to run:

mdadm --wait-clean --scan --quiet
mdadm -Ss 

Sometime after all filesystems are mounted readonly.  You can either add it to the end of the mount-ro script or put it in its own script depending on your personal maintenance requirements.

Is there any interest in maintaining this as part of the regular init scripts?  We should be able to use /proc/mdstat to determine the presence and type of mdraid and conditionally execute the extra steps only if there's an IMSM array running.  But I don't want to spend the effort sorting that out if it's just going to be left to rot and nobody but me has to use this junk anyway.
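
A minimal sketch of the conditional check described above, assuming that an IMSM container shows up in /proc/mdstat with the string `super external:imsm` (as in the mdstat output earlier in this bug). The function names are hypothetical, and the two mdadm commands are the ones listed above.

```shell
# Hypothetical helper: succeed if the given mdstat file (default /proc/mdstat)
# lists at least one IMSM container.
has_imsm_array() {
    mdstat_file="${1:-/proc/mdstat}"
    grep -q 'super external:imsm' "$mdstat_file" 2>/dev/null
}

# Hypothetical wrapper: run the extra shutdown steps only when an IMSM array
# is present; do nothing otherwise.
stop_imsm_arrays() {
    has_imsm_array || return 0
    mdadm --wait-clean --scan --quiet
    mdadm -Ss
}
```

`stop_imsm_arrays` could be called from the end of mount-ro or from a separate script, so systems without IMSM arrays skip the extra steps entirely.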
Comment 13 Konstantin Hartwich 2021-03-08 10:55:00 UTC
(In reply to lperkins from comment #12)
> I'd like to confirm that, 19 years later, this is still an issue with IMSM
> RAID arrays.
> 
> The killprocs.patch will no longer work as that's not where the .pid files
> are kept any more, but the killprocs options do work.
> 
> You also need to run:
> 
> mdadm --wait-clean --scan --quiet
> mdadm -Ss 
> 
> Sometime after all filesystems are mounted readonly.  You can either add it
> to the end of the mount-ro script or put it in its own script depending on
> your personal maintenance requirements.
> 
> Is there any interest in maintaining this as part of the regular init
> scripts?  We should be able to use /proc/mdstat to determine the presence
> and type of mdraid and conditionally execute the extra steps only if there's
> an IMSM array running.  But I don't want to spend the effort sorting that
> out if it's just going to be left to rot and nobody but me has to use this
> junk anyway.

I encountered the same problem with current gentoo baselayout. Too bad it is still not properly handled while MDMON clearly stated that it has to be handled differently.

Is it possible to provide current state working patches at least here? Don't want to wear out my HDs when rebooting gentoo.. and manually doing the stop commands is pretty error prone

other distros handle it, even systemd is better here.. sad to say that while preferring OpenRC
Comment 14 lperkins 2021-03-10 22:32:06 UTC
> I encountered the same problem with current gentoo baselayout. Too bad it is
> still not properly handled while MDMON clearly stated that it has to be
> handled differently.
> 
> Is it possible to provide current state working patches at least here?
> Don't want to wear out my HDs when rebooting gentoo.. and manually doing the
> stop commands is pretty error prone
> 
> other distros handle it, even systemd is better here.. sad to say that while
> preferring OpenRC

Use the /etc/conf.d/killprocs option discussed above, since you should definitely have a new enough OpenRC by now, and tack the shutdown commands onto the end of mount-ro.

It's not really a systemd or OpenRC issue directly so much as it's just that nobody using Gentoo actually cares about IMSM RAID and it's not checked for in the shutdown scripts.
Comment 15 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-09-09 03:00:53 UTC
Feel free to give a rebased patch with a description inline and I'll get it in.