Bug 416081 - sys-fs/mdadm-3.2.4+: incremental assembly of arrays via udev rules.d/64-md-raid.rules fails
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal with 1 vote
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Duplicates: 418469
Depends on:
Blocks:
 
Reported: 2012-05-15 14:18 UTC by Robin Bankhead
Modified: 2013-04-27 09:32 UTC
CC List: 23 users

See Also:
Package list:
Runtime testing required: ---


Attachments
dmesg booting with mdadm-3.2.4 (bad) (dmesg-mdadm-3.2.4.log,247.68 KB, text/plain)
2012-05-16 11:19 UTC, Robin Bankhead
Details
dmesg booting with mdadm-3.2.3-r1 (good) (dmesg-mdadm-3.2.3-r1.log,247.66 KB, text/plain)
2012-05-16 11:21 UTC, Robin Bankhead
Details
mdadm-3.2.5-r1.ebuild.patch (mdadm-3.2.5-r1.ebuild.patch,489 bytes, patch)
2012-12-16 11:53 UTC, Alexander Tsoy
Details | Diff
mdadm-3.2.6.ebuild.patch (mdadm-3.2.6.ebuild.patch,737 bytes, patch)
2012-12-17 10:02 UTC, Alexander Tsoy
Details | Diff
mdadm-3.2.6.ebuild.patch (mdadm-3.2.6.ebuild.patch,794 bytes, patch)
2012-12-26 09:41 UTC, Alexander Tsoy
Details | Diff
mdadm-3.2.6.ebuild.patch (mdadm-3.2.6.ebuild.patch,792 bytes, patch)
2012-12-26 10:51 UTC, Alexander Tsoy
Details | Diff

Description Robin Bankhead 2012-05-15 14:18:18 UTC
Description of the problem (and things I've tried) at

http://forums.gentoo.org/viewtopic-p-7038642.html

Lots of output in dmesg but not really sure what I'm looking for. Can post the whole thing if it helps.

I don't fully understand the order of things that should be happening, what udev's role should be etc., so apologies if I've overlooked something.

emerge --info:
Portage 2.1.10.59 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.15-r1, 3.3.5-gentoo x86_64)
=================================================================
System uname: Linux-3.3.5-gentoo-x86_64-AMD_Turion-tm-_II_Neo_N40L_Dual-Core_Processor-with-gentoo-2.1
Timestamp of tree: Mon, 14 May 2012 19:15:01 +0000
distcc 3.1 x86_64-pc-linux-gnu [disabled]
ccache version 3.1.7 [enabled]
app-shells/bash:          4.2_p28
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.3-r2, 3.2.3-r1
dev-util/ccache:          3.1.7
dev-util/cmake:           2.8.8-r2
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1
sys-apps/openrc:          0.9.9.3
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.11.5
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.5.3-r2
sys-devel/gcc-config:     1.7
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.3 (virtual/os-headers)
sys-libs/glibc:           2.15-r1
Repositories: gentoo belak x-portage
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /etc/make.conf /usr/bin/vncserver /usr/lib/X11/xdm/Xsetup_0 /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--nospinner --quiet-build=n"
FEATURES="assume-digests binpkg-logs ccache distlocks ebuild-locks fixlafiles news parse-eapi-ebuild-head protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://gentoo.blueyonder.co.uk/"
LANG="en_GB.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en_GB"
MAKEOPTS="-j20 -l5.2"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/belak /usr/local/portage"
SYNC="rsync://rsync.uk.gentoo.org/gentoo-portage"
USE="3dnow 3dnowext X509 a52 aac accessibility acl acpi amd64 apache2 bash-completion berkdb bzip2 clamav cli consolekit cracklib crypt cups curl curlwrappers cxx dbm dbus dbx dri dv enca encode exif fastcgi fat ffmpeg flac foomaticdb fortran ftp gd gd-external gdbm gnutls gpm hardened iconv imagemagick imap imlib innodb java javascript jpeg lcms ldap libwww lzo mad matroska mdadm minimal mmx mmxext modules mp3 mudflap multilib mysql ncurses nls nptl ntfs offensive ogg openmp optimized-qmake pam pcre php png policykit posix pppd qt4 raw readline rtmp samba scanner session sharedmem snmp soap sse sse2 sse4a ssl startup-notification tcpd theora threads tidy tiff tokenizer tordns truetype udev unicode usb vhosts x264 xinetd xml xmlrpc xorg xsl xvid xz zip zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgid dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" APACHE2_MPMS="worker" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb 
ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en_GB" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" SANE_BACKENDS="hp net" USERLAND="GNU" VIDEO_CARDS="radeon r600 vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 Robin Bankhead 2012-05-16 11:19:29 UTC
Created attachment 312001 [details]
dmesg booting with mdadm-3.2.4 (bad)

dmesg when booting with mdadm-3.2.4, kernel autodetect off, nothing in mdadm.conf, mdraid and mdadm in boot runlevel.

Result: 6 arrays in /proc/mdstat (md125-127, md0-2) with random combinations of partitions assigned to each.

Note that if I stop all these arrays and do mdadm -As, the arrays assemble correctly as md0-2.
Comment 2 Robin Bankhead 2012-05-16 11:21:46 UTC
Created attachment 312003 [details]
dmesg booting with mdadm-3.2.3-r1 (good)

dmesg with mdadm-3.2.3-r1, same kernel/mdadm/mdraid config as above.

Result: arrays correctly assembled at boot as md0-2.
Comment 3 Guillaume Rosaire 2012-05-20 22:46:28 UTC
Same thing for me: after upgrading to 3.2.4, my lone RAID 0 array (sdd2, sdc2) that I'm using for LVM doesn't start correctly at boot, with the following in dmesg:

md: Autodetecting RAID arrays.
md: invalid raid superblock magic on sdc2
md: sdc2 does not have a valid v0.90 superblock, not importing!
md: invalid raid superblock magic on sdd2
md: sdd2 does not have a valid v0.90 superblock, not importing!
md: Scanned 2 and added 0 devices.

After boot, I find /dev/md0 inactive, /dev/md126 and /dev/md127

If I do:
mdadm --stop /dev/md126 
mdadm --stop /dev/md127 
mdadm --assemble /dev/md0 /dev/sdc2 /dev/sdd2 

After that I can start my LVM partitions correctly with "vgchange -a y" until the next reboot.
Comment 4 Guillaume Rosaire 2012-05-20 23:06:53 UTC
Well, forget about the dmesg part for me: looking through my /var/log/messages history, it has always done that. The kernel RAID autodetect doesn't work with metadata versions >0.90.
Comment 5 Fabian Di Milia 2012-05-24 01:34:46 UTC
I've the same problem with 3.2.4 and RAID5 :(
Comment 6 SpanKY gentoo-dev 2012-05-24 04:27:23 UTC
try 3.2.5
Comment 7 Guillaume Rosaire 2012-05-24 06:54:05 UTC
Same thing with 3.2.5
Comment 8 Gilles Dartiguelongue gentoo-dev 2012-05-24 21:36:31 UTC
I have the same problem here. It looks like there were some race-condition problems between udev and the mdadm init scripts that had upstream patch some things up [1], but it seems that made things stop working for my RAID with 1.2 metadata as well (with both mdadm 3.1.4 and 3.1.5). Looking at the init script, I see this happen when it tries to start:

# mdadm -As
mdadm: /dev/md0 is already in use.
mdadm: /dev/md1 is already in use.

Restarting mdraid solves the problem because it shuts down the RAID arrays, which for me have the proper names, just not enough devices.

For example, here is my mdadm.conf:
# cat /etc/mdadm.conf  |egrep -v "^#"
DEVICE partitions
ARRAY /dev/md0 UUID=966f16eb:ba91c926:55c047ab:fe7c86c9
ARRAY /dev/md1 UUID=d2929962:d2eba15b:54706133:d94b31cc

and mdstat after boot:
# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md1 : inactive sdb1[0](S)
      19528704 blocks super 1.2
       
md0 : inactive sde2[5](S)
      1933982720 blocks super 1.2
       
unused devices: <none>

and after mdraid restart:
# cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] 
md1 : active raid1 sdb1[0] sdc1[1]
      19528632 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md0 : active raid5 sda2[0] sde2[5] sdd2[3] sdc2[2] sdb2[1]
      7735908352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 0/15 pages [0KB], 65536KB chunk

unused devices: <none>

I'll downgrade for now but feel free to ask more things to test.

[1] http://www.digipedia.pl/usenet/thread/19071/35509/
Comment 9 Robin Bankhead 2012-05-29 17:03:05 UTC
Is it possible that, in fixing one race condition, they've created another?

If you look at the mdstat outputs in my forum thread (link in the OP), there are twice as many arrays as there should be (one set with my intended md* numbering, one with the md12* pattern). To my untrained eye, that certainly looks like the result of two things racing to do the same thing.

Kludging somehow to restart mdraid isn't a very elegant approach for me, anyway -- too many other deps on top of it. How to proceed? Which upstream do we bug, if someone hasn't already?
Comment 10 SpanKY gentoo-dev 2012-06-02 06:01:56 UTC
*** Bug 418469 has been marked as a duplicate of this bug. ***
Comment 11 Hans Nieser 2012-06-03 23:24:30 UTC
Same problem here with mdadm 3.2.5 after updating world: after the first reboot my RAID1 array refused to auto-assemble. It took me a few hours to figure out what was going on, because for some reason GRUB decided to break simultaneously, leading me to think something was seriously hosed on my system. Simply stopping and re-scanning made it work again (until the next reboot).
Comment 12 Michael Weber (RETIRED) gentoo-dev 2012-06-03 23:31:01 UTC
(In reply to comment #11)
> Same problem here with mdadm 3.2.5 after updating world, after the first
> reboot my raid1 array refused to auto-assemble. Took me a few hours to

Raid1, too? Is that correct? That involves a lot more users.
Comment 13 Michael Palimaka (kensington) gentoo-dev 2012-06-04 08:59:40 UTC
(In reply to comment #12)
> Raid1, too? Is that correct? That involves a lot more users.
I can confirm that too.
Comment 14 Alex Alexander (RETIRED) gentoo-dev 2012-06-04 09:13:49 UTC
Broken 4-disk raid5 array here as well. 

Stopping the invalid 1-disk array that shows up on boot and doing a manual --assemble --scan creates the correct array and forces a rebuild of the 4th drive.
Comment 15 alexander haensch 2012-06-09 09:45:07 UTC
All my RAID1 and RAID0 arrays are broken. Reverted to mdadm-3.2.3-r1.
Now everything works again.
Comment 16 Michael Weber (RETIRED) gentoo-dev 2012-06-09 11:03:36 UTC
To be precise, it only affects arrays assembled by mdadm during the OpenRC boot process. Arrays with metadata version <=0.9 and kernel-level RAID assembly are not affected and still work.

I just reproduced this bug after upgrading to
[ebuild   R   ~] sys-fs/mdadm-3.2.5  USE="-static" 0 kB
[ebuild   R    ] sys-fs/udev-171-r6  USE="extras gudev hwdb keymap rule_generator -action_modeswitch -build -debug -edd -floppy -introspection (-selinux) -test" 0 kB


and rebooting, my two raids assembled into this
Personalities : [raid1] 
md124 : active raid1 sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
      
md125 : active raid1 sdf[2]
      1465137424 blocks super 1.2 [2/1] [_U]
      
md126 : inactive sdd[1](S)
      1953513560 blocks super 1.2
       
md127 : inactive sde[0](S)
      1465137560 blocks super 1.2

The really bad thing about it is the active raid md124/sdc, which still has the same /dev/disk/by-uuid value and got auto-mounted via fstab, invalidating the data on the second raid disk sdd (inactive raid md126).

After stopping all four
umount /dev/md124
mdadm --misc --stop /dev/md124
mdadm --misc --stop /dev/md125
mdadm --misc --stop /dev/md126
mdadm --misc --stop /dev/md127

and running `mdadm --assemble --scan`, I ended up with:
md125 : active raid1 sdd[1]
      1953513424 blocks super 1.2 [2/1] [_U]
      
md126 : active raid1 sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
      
md127 : active raid1 sde[0] sdf[2]
      1465137424 blocks super 1.2 [2/2] [UU]

Raid md127 (sde/sdf) got assembled correctly (they carry a dm_crypt volume and have not been modified by automount):

[  811.664262] md: bind<sdf>
[  811.668763] md: bind<sde>
[  811.673110] md/raid1:md127: active with 2 out of 2 mirrors
[  811.677283] md127: detected capacity change from 0 to 1500300722176
[  811.682266]  md127: unknown partition table

but the other raid is broken

[  811.894716] md: md126 stopped.
[  811.900623] md: bind<sdd>
[  811.908878] md: bind<sdc>
[  811.912826] md: kicking non-fresh sdd from array!
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[  811.916810] md: unbind<sdd>
[  811.923441] md: export_rdev(sdd)
[  811.928730] md/raid1:md126: active with 1 out of 2 mirrors
[  811.930877] md126: detected capacity change from 0 to 2000397746176
[  811.935688]  md126: unknown partition table
[  812.471740] md: md125 stopped.
[  812.476843] md: bind<sdd>
[  812.483281] md/raid1:md125: active with 1 out of 2 mirrors
[  812.487553] md125: detected capacity change from 0 to 2000397746176
[  812.548118]  md125: unknown partition table

(don't mind the unknown partition table, there is no partition to find).
Comment 17 Michael Weber (RETIRED) gentoo-dev 2012-06-09 11:59:56 UTC
After downgrading and rebooting, it mounted the other disk by-uuid ;-) Now I have to recover:

umount /...
mdadm --misc --stop /dev/md125
mdadm --misc --stop /dev/md126


mdadm --assemble /dev/md3 /dev/sdc /dev/sdd
mdadm: /dev/md3 has been started with 1 drive (out of 2).
(fails with "md: kicking non-fresh sdd from array!" in dmesg)

## flip a coin or make an educated guess about which disk you want to overwrite

mdadm --zero-superblock /dev/sdd
mdadm /dev/md3 --add /dev/sdd

watch recovery in `cat /proc/mdstat`

md3 : active raid1 sdd[2] sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.9% (19414272/1953513424) finish=229.7min speed=140298K/sec
Comment 18 jannis 2012-06-09 23:07:01 UTC
A contrary report: I've just (in the last two days) converted a non-raid partition to a RAID1 (same size but now mirrored on two disks). I think I've been using mdadm-3.2.5 for that task, and it worked like a charm. I only came to this bug since I saw the mask today. I'm using v1.2 superblocks, but kernel messages and /proc/mdstat look just fine:
md0 : active raid1 sdb1[0] sdc1[1]
      976427968 blocks super 1.2 [2/2] [UU]
Comment 19 Marko Weber Bürgermeister 2012-06-13 15:17:32 UTC
hello,

I also set up a Gentoo server and created 4 RAID arrays:

md0 metadata 0.9
md1 metadata 0.9
md2 metadata 0.9
md127 metadata 1.2 (2.7 TB partition; I read that 0.9 has a 2 TB limit)

The server runs mdadm 3.1.4.

mdraid is in the boot runlevel, as gentoo-wiki.com told me to do.
mdadm I don't have in the boot runlevel.

Nothing in mdadm.conf.

Kernel autodetect is off (but all RAID options are built into the kernel as [*], not as modules [m]).

On md127 I have an LVM physical volume with 2 logical volumes.

Every 8th or 9th boot I see the message "could not mount all local filesystems" on the monitor. I also see "could not mount ... because it does not exist".

It seems fstab tries to mount the logical volumes BEFORE mdadm/mdraid is ready. It looks like a race between lvm and mdraid.

I then set RC_NEED="mdraid" and RC_AFTER="mdraid" in /etc/conf.d/lvm, but the phenomenon stays.

Every xth boot the race prevents the logical volumes from mounting. This doesn't happen with the arrays that have metadata 0.9; it only happens with the array that uses metadata 1.2.

When I set kernel autodetect on (nothing in mdadm.conf), the phenomenon gets worse: every xth boot md127 appears multiple times, showing up as md127 + md126, and of course my logical volumes don't get mounted.

On the next reboot all is fine, on the next not. It is like Las Vegas.

Is metadata 1.2 not fully supported? I also noticed that starting from mdadm 3.1.4, metadata 0.9 is no longer the default.

This is very annoying. Is there any solution?

marko
Comment 20 Marko Weber Bürgermeister 2012-06-13 15:19:12 UTC
(In reply to comment #19)

Edit: my "emerge --info"

# emerge --info
--- Invalid atom in /etc/portage/package.keywords: =sys-kernel/vanilla-sources3.4.2                                                                          
Portage 2.1.10.49 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.14.1-r3, 3.4.2_weber3.4.2 x86_64)                                                     
=================================================================                                                                                            
System uname: Linux-3.4.2_weber3.4.2-x86_64-Intel-R-_Core-TM-_i7-2600_CPU_@_3.40GHz-with-gentoo-2.1                                                          
Timestamp of tree: Tue, 12 Jun 2012 08:45:01 +0000                                                                                                           
app-shells/bash:          4.2_p20                                                                                                                            
dev-lang/python:          2.7.3-r2, 3.1.5, 3.2.3                                                                                                             
dev-util/cmake:           2.8.7-r5                                                                                                                           
dev-util/pkgconfig:       0.26                                                                                                                               
sys-apps/baselayout:      2.1-r1                                                                                                                             
sys-apps/openrc:          0.9.9                                                                                                                              
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.4_p6-r1, 1.11.1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.5.3-r2
sys-devel/gcc-config:     1.6
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r1
sys-kernel/linux-headers: 3.1 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r3
Repositories: gentoo x-mailserver
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=core2 -mtune=generic -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=core2 -mtune=generic -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="rsync://de-mirror.org/gentoo/ ftp://ftp.halifax.rwth-aachen.de/gentoo/ rsync://ftp.halifax.rwth-aachen.de/gentoo/"
LANG="en_GB.utf8"
LDFLAGS="-Wl,--as-needed"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/mailserver"
SYNC="rsync://rsync11.de.gentoo.org/gentoo-portage"
USE="acl amd64 berkdb bzip2 cli cracklib crypt cups cxx dri fortran gdbm gpm iconv mmx modules mudflap multilib ncurses nls nptl openmp pam pcre pppd readline session snmp sse sse2 ssl tcpd truetype unicode xml xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" APACHE2_MPMS="prefork" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" PHP_TARGETS="php5-3" PYTHON_TARGETS="python3_2 python2_7" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal 
rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 21 Robin Bankhead 2012-06-14 16:56:05 UTC
Marko, I'm not so sure about that as I for one don't use lvm at all. I have the relevant kernel code built-in and the utils are present, but I've never used it.

I think you might have a different issue.
Comment 22 Piotr Łyczba 2012-07-07 16:33:14 UTC
I've noticed similar behaviour on mdadm-3.2.5 and hardware RAID0 with imsm metadata. Running "mdadm -Asv" manually gives me a "failed to get exclusive lock on mapfile" error.

I noticed that in interactive boot, when I skip 'mdraid' and 'lvm', stop the boot process AFTER the 'root' service has started, and then exit to a console, I can successfully assemble the devices by running "mdadm -Ss" and manually starting the 'mdraid' service. Is it possible that 'mdraid' needs the root filesystem remounted rw to work?
Comment 23 Tom Li 2012-10-28 07:29:19 UTC
(In reply to comment #22)
> I've noticed similar behaviour on mdadm-3.2.5 and hardware RAID0 through
> imsm metadata. Running "mdadm -Asv" manualy give me "failed to get exclusive
> lock on mapfile" error.
> 
> I noticed that in interactive boot when I skip 'mdraid' and 'lvm' and stop
> boot process AFTER starting 'root' service and then exit to console I can
> successfully assembly devices by running: "mdadm -Ss" and manualy starting
> 'mdraid' service. Is it possible that 'mdraid' need rw remounted root
> filesystem to work?

I still have the same issue in mdadm-3.2.6.

> Is it possible that 'mdraid' need rw remounted root filesystem to work?
You are right. If I put "mount -o remount,rw /" in /etc/init.d/mdraid, the error goes away and RAID works correctly.
Comment 24 Vlastimil Babka (Caster) (RETIRED) gentoo-dev 2012-10-29 20:57:09 UTC
I recently upgraded to the latest ~arch kernel-3.6.2, udev-195, and mdadm-3.2.6, and mdraid stopped working properly. I had only 1 of 3 drives assembled in each array. Restarting the mdraid service fixes it. It took me a while to figure out that assembly was not done by the init script, but in fact by /usr/lib/udev/rules.d/64-md-raid.rules. I removed the file and it works fine again. Something is wrong with the incremental assembly...
Comment 25 Vlastimil Babka (Caster) (RETIRED) gentoo-dev 2012-10-29 21:35:14 UTC
So, downgrading to mdadm-3.2.3-r2 also helps.
Comment 26 Florian D. 2012-11-03 20:09:42 UTC
I can confirm that it affects mdadm-3.2.6 as well. I have a RAID5 system. It took me hours to find this bug. Could you please *mask* the affected versions? This is a showstopper.
Comment 27 SpanKY gentoo-dev 2012-11-11 07:43:45 UTC
If you comment out this line in the udev rules.d file, does it work?

ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
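For anyone wanting to try this without editing the packaged file, a sketch using the standard udev override mechanism (not something prescribed in this bug; the packaged rules path may be /lib/udev/rules.d or /usr/lib/udev/rules.d depending on the udev version) would be to copy the file into /etc/udev/rules.d, which takes precedence, and comment the line out there:

```shell
# Copy the packaged rules file to /etc/udev/rules.d, where it overrides
# the packaged copy; adjust the source path for your udev installation.
cp /usr/lib/udev/rules.d/64-md-raid.rules /etc/udev/rules.d/

# Comment out only the line that jumps to the incremental-assembly section.
sed -i 's|^ENV{ID_FS_TYPE}==.*GOTO="md_inc"|#&|' \
    /etc/udev/rules.d/64-md-raid.rules
```

Removing the override file afterwards restores the original behaviour.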
Comment 28 Michael Palimaka (kensington) gentoo-dev 2012-11-11 13:06:21 UTC
(In reply to comment #27)
> if you comment out this line in the udev rules.d file, does it work ?
> 
> ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member",
> GOTO="md_inc"

Why is this unmasked? I have not seen any evidence that this serious bug is fixed.
Comment 29 SpanKY gentoo-dev 2012-11-11 21:01:10 UTC
(In reply to comment #28)

you didn't answer the question
Comment 30 Michael Palimaka (kensington) gentoo-dev 2012-11-12 09:20:06 UTC
> (In reply to comment #28)
> 
> you didn't answer the question

Because I don't have time to break my entire system right now...precisely why the package was masked in the first place.
Comment 31 Didier Link 2012-11-12 09:32:44 UTC
Hi,

I've had the same problem for months now, and removing the mentioned *.rules file solved the issue, because the array is then correctly built by the /etc/init.d/mdraid script at boot time!

Before that, the RAID5 array was constructed with only 1 drive and never started at all.

It's not an mdadm problem (the utility works as expected in all versions); it's a problem with the udev rules' "incremental build".
Comment 32 SpanKY gentoo-dev 2012-11-12 18:16:18 UTC
(In reply to comment #30)

then don't spam noise here.  i'm interested in fixing the problem.

(In reply to comment #31)

Could you try removing just the one line from the rules.d file, rather than deleting the entire thing?
Comment 33 Andrei Slavoiu 2012-11-14 22:20:48 UTC
(In reply to comment #27)
> if you comment out this line in the udev rules.d file, does it work ?
> 
> ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member",
> GOTO="md_inc"

No change here.
Comment 34 Didier Link 2012-12-15 12:29:38 UTC
For me, commenting out this single line works; I have only the mdraid init script doing the assembly, working as expected.
Comment 35 Alexander Tsoy 2012-12-16 11:53:40 UTC
Created attachment 332464 [details, diff]
mdadm-3.2.5-r1.ebuild.patch

This patch fixes the issue for me. I think that only users with a separate /var partition are affected by this bug.

I don't have the mdraid init.d script enabled. All my arrays are now incrementally assembled by udev.
Comment 36 Alexander Tsoy 2012-12-16 12:08:21 UTC
(In reply to comment #35)
> I think that only users with separate /var partition are affected by this bug.
> 

In other words, incremental assembly is broken without a map file. :)
Comment 37 Alexander Tsoy 2012-12-16 17:54:45 UTC
3.2.6 also works fine after changing MAP_DIR.
Comment 38 Nikolay S. Rybaloff 2012-12-16 18:24:24 UTC
(In reply to comment #35)
> Created attachment 332464 [details, diff] [details, diff]
> mdadm-3.2.5-r1.ebuild.patch
> 
> This patch fixes this issue for me. I think that only users with separate
> /var partition are affected by this bug.
> 
> I haven't mdraid init.d script enabled. All my arrays incrementaly assembled
> by udev now.

That has been the default since 3.2.4.
Something else must have fixed the problem for you...
Comment 39 Nikolay S. Rybaloff 2012-12-16 18:27:53 UTC
BTW, is there anybody affected by this issue who does not use an initramfs, or who does not use genkernel to generate the initramfs image?
Comment 40 Alexander Tsoy 2012-12-16 19:28:19 UTC
(In reply to comment #38)
> (In reply to comment #35)
> > Created attachment 332464 [details, diff] [details, diff] [details, diff]
> > mdadm-3.2.5-r1.ebuild.patch
> > 
> > This patch fixes this issue for me. I think that only users with separate
> > /var partition are affected by this bug.
> > 
> > I haven't mdraid init.d script enabled. All my arrays incrementaly assembled
> > by udev now.
> 
> That is the default since 3.2.4
> There is something else that fixed the problem for you...

Heh.. I just discovered this:

src_prepare() {
    [...]
    sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
    [...]
}

Before that, I looked at the sources after "ebuild ... unpack" and "ebuild ... prepare" and saw this string in the Makefile :)
MAP_DIR=/var/run/mdadm
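To make the effect of that src_prepare() substitution concrete, here is a minimal sketch using a throwaway stand-in file (Makefile.stub is hypothetical, not the real mdadm sources):

```shell
# Stand-in for the upstream Makefile line that sets the map directory.
printf 'MAP_DIR=/run/mdadm\n' > Makefile.stub

# The same substitution the ebuild's src_prepare() applies to *.[ch] and Makefile:
sed -i 's:/run/mdadm:/var/run/mdadm:g' Makefile.stub

# The built mdadm would now embed the old pre-/run path:
cat Makefile.stub   # MAP_DIR=/var/run/mdadm
```

So every binary built from the patched sources looks for its map file under /var/run/mdadm instead of the upstream default /run/mdadm.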
Comment 41 Alexander Tsoy 2012-12-16 21:12:54 UTC
(In reply to comment #40)
> src_prepare() {
>     [...]
>     sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
>     [...]
> }

Can anybody try removing this stupid sed from the ebuild and check whether it helps?
Comment 42 Alexander Tsoy 2012-12-17 10:02:17 UTC
Created attachment 332562 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch. I think that passing MAP_DIR=/run/mdadm to make, even though it is the default, is a good approach anyway.
Comment 43 Alexander Tsoy 2012-12-21 20:56:51 UTC
Since nobody has replied, I want to explain the problem a bit more. Incremental assembly needs a map file. The default location for the map file in mdadm-3.2.3 is /dev/.mdadm/map. In version 3.2.4 the default location was changed to /run/mdadm/map, but the ebuild changes this path to /var/run/mdadm/map. On system boot, udev starts before localmount. So when udev starts, /var is not mounted yet and mdadm can't create the /var/run/mdadm directory and files in it. That is why incremental assembly fails.
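For readers unfamiliar with the mechanism: incremental assembly is driven by a udev rule roughly of this shape (a hypothetical excerpt in the spirit of 64-md-raid.rules, not the exact upstream text). Each invocation must record state in the map file, which is why an uncreatable map directory breaks assembly at udev time:

```
# Hypothetical sketch of the rule that hands RAID members to mdadm as they appear.
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
    RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
```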
Comment 44 Paul Hartman 2012-12-25 01:34:48 UTC
(In reply to comment #41)
> (In reply to comment #40)
> > src_prepare() {
> >     [...]
> >     sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
> >     [...]
> > }
> 
> Can anybody try removing this stupid sed from the ebuild and check whether it helps?

I tried removing this line and rebooted, raid was assembled properly. Thanks.
Comment 45 Alexander Tsoy 2012-12-25 18:31:08 UTC
(In reply to comment #43)
> On system boot, udev starts before localmount. So when udev starts, /var is
> not mounted yet and mdadm can't create the /var/run/mdadm directory and
> files in it. That is why incremental assembly fails.

This is not entirely true. As mentioned in comment 22, the problem also occurs because "/" is mounted read-only when udev starts. So a separate "/var" is not a necessary condition.
Comment 46 Tom Li 2012-12-26 00:55:04 UTC
(In reply to comment #45)
> (In reply to comment #43)
> > On system boot, udev starts before localmount. So when udev starts, /var is
> > not mounted yet and mdadm can't create the /var/run/mdadm directory and
> > files in it. That is why incremental assembly fails.
> 
> This is not entirely true. As mentioned in comment 22, the problem also
> occurs because "/" is mounted read-only when udev starts. So a separate
> "/var" is not a necessary condition.

That's right. 

But when I added "localmount" to the dependencies of the mdadm init script, I got a dependency loop: some filesystems on the RAID need mdadm, but mdadm also needs "/" remounted "rw" :(
Comment 47 Alexander Tsoy 2012-12-26 09:41:19 UTC
Created attachment 333372 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch. Blocks openrc versions that do not mount /run.

(In reply to comment #46)
> (In reply to comment #45)
> > (In reply to comment #43)
> > > On system boot, udev starts before localmount. So when udev starts, /var is
> > > not mounted yet and mdadm can't create the /var/run/mdadm directory and
> > > files in it. That is why incremental assembly fails.
> > 
> > This is not entirely true. As mentioned in comment 22, the problem also
> > occurs because "/" is mounted read-only when udev starts. So a separate
> > "/var" is not a necessary condition.
> 
> That's right. 
> 
> But when I added "localmount" to the dependencies of the mdadm init script,
> I got a dependency loop: some filesystems on the RAID need mdadm, but mdadm
> also needs "/" remounted "rw" :(

Simply put the patched mdadm ebuild into your local overlay. This should solve the issue.
Comment 48 Tom Li 2012-12-26 10:10:33 UTC
(In reply to comment #47)
> Created attachment 333372 [details, diff]
> mdadm-3.2.6.ebuild.patch
> 
> Updated patch. Blocks openrc versions that do not mount /run.
> 
> (In reply to comment #46)
> > (In reply to comment #45)
> > > (In reply to comment #43)
> > > > On system boot, udev starts before localmount. So when udev starts, /var is
> > > > not mounted yet and mdadm can't create the /var/run/mdadm directory and
> > > > files in it. That is why incremental assembly fails.
> > > 
> > > This is not entirely true. As mentioned in comment 22, the problem also
> > > occurs because "/" is mounted read-only when udev starts. So a separate
> > > "/var" is not a necessary condition.
> > 
> > That's right. 
> > 
> > But when I added "localmount" to the dependencies of the mdadm init script,
> > I got a dependency loop: some filesystems on the RAID need mdadm, but mdadm
> > also needs "/" remounted "rw" :(
> 
> Simply put the patched mdadm ebuild into your local overlay. This should
> solve the issue.

Funtoo still uses "openrc-0.10.2"; does that mean I can't update mdadm anymore? :(
Comment 49 Alexander Tsoy 2012-12-26 10:23:01 UTC
(In reply to comment #48)
> Funtoo still uses "openrc-0.10.2"; does that mean I can't update mdadm
> anymore? :(

openrc-0.10.5 is the only available version of the 0.10 branch in portage; that's why I chose it. You can change it to something like this:

!<sys-apps/openrc-0.10
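In ebuild terms, a blocker of that shape lives in the dependency string; a minimal sketch (not the exact patch text, using the relaxed atom suggested above):

```
# Hypothetical RDEPEND fragment: refuse to install alongside any openrc
# older than the 0.10 branch, which lacks a tmpfs-backed /run at boot.
RDEPEND="!<sys-apps/openrc-0.10"
```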
Comment 50 Alexander Tsoy 2012-12-26 10:51:56 UTC
Created attachment 333376 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch with fixed openrc dependency.
Comment 51 Tom Li 2012-12-30 12:11:02 UTC
(In reply to comment #50)
> Created attachment 333376 [details, diff]
> mdadm-3.2.6.ebuild.patch
> 
> Updated patch with fixed openrc dependency.

Great patch! It works for me. Now my mdadm works correctly.
Comment 52 Alexander Tsoy 2013-01-04 11:53:25 UTC
@base-system PING!
Comment 53 Tom Li 2013-01-26 14:02:16 UTC
This bug is resolved, but has nobody added the patch to the tree yet?
Comment 54 Robin Bankhead 2013-02-08 13:37:05 UTC
Patch efficacy confirmed here with 3.2.6 as well, FWIW.
Comment 55 Samuli Suominen gentoo-dev 2013-02-08 13:45:59 UTC
(In reply to comment #50)
> Created attachment 333376 [details, diff]
> mdadm-3.2.6.ebuild.patch
> 
> Updated patch with fixed openrc dependency.

Looks okay to me. I don't use mdadm but this makes sense.

+*mdadm-3.2.6-r1 (08 Feb 2013)
+
+  08 Feb 2013; Samuli Suominen <ssuominen@gentoo.org> +mdadm-3.2.6-r1.ebuild:
+  Use /run/mdadm instead of /var/run/mdadm with baselayout that has the /run
+  directory wrt #416081 by Alexander Tsoy and Robin Bankhead
Comment 56 SpanKY gentoo-dev 2013-04-27 09:32:28 UTC
(In reply to comment #35)

thanks for tracking this down

(In reply to comment #55)

thanks for the commit