Description of the problem (and things I've tried) at http://forums.gentoo.org/viewtopic-p-7038642.html Lots of output in dmesg but not really sure what I'm looking for. Can post the whole thing if it helps. I don't fully understand the order of things that should be happening, what udev's role should be etc., so apologies if I've overlooked something. emerge --info: Portage 2.1.10.59 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.15-r1, 3.3.5-gentoo x86_64) ================================================================= System uname: Linux-3.3.5-gentoo-x86_64-AMD_Turion-tm-_II_Neo_N40L_Dual-Core_Processor-with-gentoo-2.1 Timestamp of tree: Mon, 14 May 2012 19:15:01 +0000 distcc 3.1 x86_64-pc-linux-gnu [disabled] ccache version 3.1.7 [enabled] app-shells/bash: 4.2_p28 dev-java/java-config: 2.1.11-r3 dev-lang/python: 2.7.3-r2, 3.2.3-r1 dev-util/ccache: 3.1.7 dev-util/cmake: 2.8.8-r2 dev-util/pkgconfig: 0.26 sys-apps/baselayout: 2.1 sys-apps/openrc: 0.9.9.3 sys-apps/sandbox: 2.5 sys-devel/autoconf: 2.13, 2.69 sys-devel/automake: 1.11.5 sys-devel/binutils: 2.22-r1 sys-devel/gcc: 4.5.3-r2 sys-devel/gcc-config: 1.7 sys-devel/libtool: 2.4.2 sys-devel/make: 3.82-r3 sys-kernel/linux-headers: 3.3 (virtual/os-headers) sys-libs/glibc: 2.15-r1 Repositories: gentoo belak x-portage ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="*" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=native -O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /etc/make.conf /usr/bin/vncserver /usr/lib/X11/xdm/Xsetup_0 /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" CXXFLAGS="-march=native -O2 -pipe" DISTDIR="/usr/portage/distfiles" EMERGE_DEFAULT_OPTS="--nospinner --quiet-build=n" FEATURES="assume-digests binpkg-logs ccache distlocks 
ebuild-locks fixlafiles news parse-eapi-ebuild-head protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch" FFLAGS="" GENTOO_MIRRORS="http://gentoo.blueyonder.co.uk/" LANG="en_GB.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" LINGUAS="en_GB" MAKEOPTS="-j20 -l5.2" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/var/lib/layman/belak /usr/local/portage" SYNC="rsync://rsync.uk.gentoo.org/gentoo-portage" USE="3dnow 3dnowext X509 a52 aac accessibility acl acpi amd64 apache2 bash-completion berkdb bzip2 clamav cli consolekit cracklib crypt cups curl curlwrappers cxx dbm dbus dbx dri dv enca encode exif fastcgi fat ffmpeg flac foomaticdb fortran ftp gd gd-external gdbm gnutls gpm hardened iconv imagemagick imap imlib innodb java javascript jpeg lcms ldap libwww lzo mad matroska mdadm minimal mmx mmxext modules mp3 mudflap multilib mysql ncurses nls nptl ntfs offensive ogg openmp optimized-qmake pam pcre php png policykit posix pppd qt4 raw readline rtmp samba scanner session sharedmem snmp soap sse sse2 sse4a ssl startup-notification tcpd theora threads tidy tiff tokenizer tordns truetype udev unicode usb vhosts x264 xinetd xml xmlrpc xorg xsl xvid xz zip zlib" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgid dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info 
log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias" APACHE2_MPMS="worker" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en_GB" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" SANE_BACKENDS="hp net" USERLAND="GNU" VIDEO_CARDS="radeon r600 vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Created attachment 312001 [details]
dmesg booting with mdadm-3.2.4 (bad)

dmesg when booting with mdadm-3.2.4: kernel autodetect off, nothing in mdadm.conf, mdraid and mdadm in the boot runlevel. Result: 6 arrays in /proc/mdstat (md125-127, md0-2) with random combinations of partitions assigned to each. Note that if I stop all these arrays and run mdadm -As, the arrays assemble correctly as md0-2.
Created attachment 312003 [details]
dmesg booting with mdadm-3.2.3-r1 (good)

dmesg with mdadm-3.2.3-r1, same kernel/mdadm/mdraid config as above. Result: arrays correctly assembled at boot as md0-2.
Same thing for me: after upgrading to 3.2.4, my lone RAID 0 array (sdd2, sdc2) that I'm using for LVM doesn't start correctly at boot, with the following in dmesg:

md: Autodetecting RAID arrays.
md: invalid raid superblock magic on sdc2
md: sdc2 does not have a valid v0.90 superblock, not importing!
md: invalid raid superblock magic on sdd2
md: sdd2 does not have a valid v0.90 superblock, not importing!
md: Scanned 2 and added 0 devices.

After boot, I find /dev/md0 inactive, plus /dev/md126 and /dev/md127. If I do:

mdadm --stop /dev/md126
mdadm --stop /dev/md127
mdadm --assemble /dev/md0 /dev/sdc2 /dev/sdd2

then I can start my LVM volumes correctly with "vgchange -a y" until the next reboot.
Well, forget about the dmesg part for me; it has always done that. Looking at my /var/log/messages history, kernel RAID autodetect doesn't work with metadata versions >0.90.
I have the same problem with 3.2.4 and RAID5 :(
try 3.2.5
Same thing with 3.2.5
I have the same problem here. It looks like there were some race-condition problems between udev and the mdadm init scripts that upstream patched up [1], but it seems to have broken things for my RAID with 1.2 metadata as well (with both mdadm 3.1.4 and 3.1.5). Looking at the init script, I see this when it tries to start:

# mdadm -As
mdadm: /dev/md0 is already in use.
mdadm: /dev/md1 is already in use.

Restarting mdraid solves the problem, because it shuts down the RAID arrays, which for me have the proper names, just not enough devices. For example, here is my mdadm.conf:

# cat /etc/mdadm.conf | egrep -v "^#"
DEVICE partitions
ARRAY /dev/md0 UUID=966f16eb:ba91c926:55c047ab:fe7c86c9
ARRAY /dev/md1 UUID=d2929962:d2eba15b:54706133:d94b31cc

and mdstat after boot:

# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : inactive sdb1[0](S)
      19528704 blocks super 1.2
md0 : inactive sde2[5](S)
      1933982720 blocks super 1.2
unused devices: <none>

and after an mdraid restart:

# cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sdb1[0] sdc1[1]
      19528632 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk
md0 : active raid5 sda2[0] sde2[5] sdd2[3] sdc2[2] sdb2[1]
      7735908352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 0/15 pages [0KB], 65536KB chunk
unused devices: <none>

I'll downgrade for now, but feel free to ask for more things to test.

[1] http://www.digipedia.pl/usenet/thread/19071/35509/
Is it possible that, in fixing one race condition, they've created another? If you look at the mdstat outputs in my forum thread (link in OP), twice as many arrays show up as there should be: one set with my intended numbering (md*) and one set with the md12* pattern. To my untrained eye, that certainly looks like the result of two things racing to do the same thing.

Kludging something to restart mdraid isn't a very elegant approach for me anyway; too many other deps sit on top of it. How do we proceed? Which upstream do we bug, if someone hasn't already?
*** Bug 418469 has been marked as a duplicate of this bug. ***
Same problem here with mdadm 3.2.5 after updating world: after the first reboot my RAID 1 array refused to auto-assemble. It took me a few hours to figure out what was going on, because for some reason grub decided to break simultaneously, leading me to think something had got seriously hosed on my system. Simply stopping and re-scanning made it work again (until the next reboot).
(In reply to comment #11)
> Same problem here with mdadm 3.2.5 after updating world, after the first
> reboot my raid1 array refused to auto-assemble. Took me a few hours to

Raid1, too? Is that correct? That involves a lot more users.
(In reply to comment #12)
> Raid1, too? Is that correct? That involves a lot more users.

I can confirm that too.
Broken 4-disk raid5 array here as well. Stopping the invalid 1-disk array that shows up on boot and doing a manual --assemble --scan creates the correct array and forces a rebuild of the 4th drive.
All my RAID1 and RAID0 arrays are broken. Reverted to mdadm-3.2.3-r1; now everything works again.
To be precise, it only affects arrays assembled by mdadm during the OpenRC boot process. Arrays with metadata version <=0.90 and kernel-level RAID assembly are not affected and still work.

I just reproduced this bug after upgrading to

[ebuild   R   ~] sys-fs/mdadm-3.2.5  USE="-static" 0 kB
[ebuild   R    ] sys-fs/udev-171-r6  USE="extras gudev hwdb keymap rule_generator -action_modeswitch -build -debug -edd -floppy -introspection (-selinux) -test" 0 kB

and rebooting. My two arrays assembled into this:

Personalities : [raid1]
md124 : active raid1 sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
md125 : active raid1 sdf[2]
      1465137424 blocks super 1.2 [2/1] [_U]
md126 : inactive sdd[1](S)
      1953513560 blocks super 1.2
md127 : inactive sde[0](S)
      1465137560 blocks super 1.2

The really bad thing about it is the active array md124/sdc, which still has the same /dev/disk/by-uuid value and so got auto-mounted from fstab, invalidating the data on the second array disk sdd (inactive array md126). After stopping all four

umount /dev/md124
mdadm --misc --stop /dev/md124
mdadm --misc --stop /dev/md125
mdadm --misc --stop /dev/md126
mdadm --misc --stop /dev/md127

and running `mdadm --assemble --scan` I ended up with

md125 : active raid1 sdd[1]
      1953513424 blocks super 1.2 [2/1] [_U]
md126 : active raid1 sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
md127 : active raid1 sde[0] sdf[2]
      1465137424 blocks super 1.2 [2/2] [UU]

Array md127 (sde/sdf) got assembled correctly (they carry a dm_crypt volume and had not been modified by the automount):

[  811.664262] md: bind<sdf>
[  811.668763] md: bind<sde>
[  811.673110] md/raid1:md127: active with 2 out of 2 mirrors
[  811.677283] md127: detected capacity change from 0 to 1500300722176
[  811.682266] md127: unknown partition table

but the other array is broken:

[  811.894716] md: md126 stopped.
[  811.900623] md: bind<sdd>
[  811.908878] md: bind<sdc>
[  811.912826] md: kicking non-fresh sdd from array!
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[  811.916810] md: unbind<sdd>
[  811.923441] md: export_rdev(sdd)
[  811.928730] md/raid1:md126: active with 1 out of 2 mirrors
[  811.930877] md126: detected capacity change from 0 to 2000397746176
[  811.935688] md126: unknown partition table
[  812.471740] md: md125 stopped.
[  812.476843] md: bind<sdd>
[  812.483281] md/raid1:md125: active with 1 out of 2 mirrors
[  812.487553] md125: detected capacity change from 0 to 2000397746176
[  812.548118] md125: unknown partition table

(Don't mind the "unknown partition table"; there is no partition to find.)
After the downgrade and reboot, it mounted the other disk by-uuid ;-) Now I have to recover:

umount /...
mdadm --misc --stop /dev/md125
mdadm --misc --stop /dev/md126
mdadm --assemble /dev/md3 /dev/sdc /dev/sdd
mdadm: /dev/md3 has been started with 1 drive (out of 2).

(This fails with "md: kicking non-fresh sdd from array!" in dmesg.)

## flip a coin or make an educated guess about which disk you want to overwrite
mdadm --zero-superblock /dev/sdd
mdadm /dev/md3 --add /dev/sdd

Then watch the recovery in `cat /proc/mdstat`:

md3 : active raid1 sdd[2] sdc[0]
      1953513424 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  0.9% (19414272/1953513424) finish=229.7min speed=140298K/sec
A contrary report: I've just (in the last two days) converted a non-RAID partition to a RAID1 (same size, but now mirrored on two disks). I think I used mdadm-3.2.5 for that task and it worked like a charm. I only came to this bug since I saw the mask today. I'm using v1.2 superblocks, but the kernel messages and /proc/mdstat look just fine:

md0 : active raid1 sdb1[0] sdc1[1]
      976427968 blocks super 1.2 [2/2] [UU]
Hello,

I also set up a Gentoo server and created 4 RAID arrays:

md0   metadata 0.90
md1   metadata 0.90
md2   metadata 0.90
md127 metadata 1.2 (2.7 TB partition; I read that 0.90 has a 2 TB limit)

The server runs mdadm 3.1.4. mdraid is in the boot runlevel (gentoo-wiki.com told me to do so); mdadm is not. Nothing is in mdadm.conf, and kernel autodetect is off (but all the RAID options are built into the kernel [*], not modules [m]).

On md127 I have an LVM physical volume with 2 logical volumes. Every 8th or 9th boot I see on the monitor the message "could not mount all local filesystems", and also "could not mount .... cause it does not exist". It seems fstab(?) tries to mount the logical volumes BEFORE mdadm/mdraid(?) is ready; it looks like a race between lvm and mdraid. I then set RC_NEED="mdraid" and RC_AFTER="mdraid" in /etc/conf.d/lvm, but the phenomenon stays: every xth boot the race prevents the logical volumes from being mounted.

This doesn't happen with the arrays that have metadata 0.90; it only happens with the array that uses metadata 1.2. When I turn kernel autodetect on (still nothing in mdadm.conf), the phenomenon gets worse: every xth boot md127 is available multiple times and shows up as md127 + md126, and of course my logical volumes don't get mounted. The next reboot all is fine, the next not. It is like Las Vegas.

Is metadata 1.2 not fully supported? I also noticed that from mdadm 3.1.4 on, metadata 0.90 is no longer the default. This is very annoying. Is there any solution?

marko
(In reply to comment #19) > hello, > > i also setup a gentoo server and did 4 raid arrays. > > md0 metadata 0.9 > md1 metadata 0.9 > md2 metadata 0.9 > md127 metadata 1.2 (2.7 TB Partition. 0.9 i read has limit 2 TB) > > on gentoo server mdadm 3.1.4 > > mdraid is in bootlevel, gentoo-wiki.com told me to do so. > mdadm i dont have in bootlevel. > > noting in mdadm.conf > > kernel autodetect off (but all Raid things * in Kernel, not [m] ) > > on the md127 i have a lvm physical volume with 2 logical volumes. > > Every 8th or 9th boot, i can see on monitor the message > "could not mount all local filesystems". > I also can see "could not mount .... cause it does not exist" > > it seems "fstab"? tries to mount the logical volumes BEFORE mdadm/mdraid? is > ready. it looks like a race between lvm/mdraid? > > i then did /etc/conf.d/lvm = RC_NEED="mdraid" RC_AFTER="mdraid" > but the phenomen stays. > > Every xth boot the race prevents the mounting of the logical drives. > This dont happen with the drives that have metadata 0.9 > This only happens with the raid array that use metadata 1.2 > > When i set kernel autodetect on (nothing in mdadm.conf) the phenom extends. > Then every xth boot md127 is multiple available and shows up as > md127 + md126 , ofcourse my logical volumes dont get mounted. > > next reboot all is fine, next not. It is like Las Vegas. > > Is metadata 1.2 not fully supported? I also noticed that from mdadm 3.1.4 > metadata 0.9 is not used as default anymore. > > This is very annoying. Is there any solution? 
> > marko Edit: my "emerge --info" # emerge --info --- Invalid atom in /etc/portage/package.keywords: =sys-kernel/vanilla-sources3.4.2 Portage 2.1.10.49 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.14.1-r3, 3.4.2_weber3.4.2 x86_64) ================================================================= System uname: Linux-3.4.2_weber3.4.2-x86_64-Intel-R-_Core-TM-_i7-2600_CPU_@_3.40GHz-with-gentoo-2.1 Timestamp of tree: Tue, 12 Jun 2012 08:45:01 +0000 app-shells/bash: 4.2_p20 dev-lang/python: 2.7.3-r2, 3.1.5, 3.2.3 dev-util/cmake: 2.8.7-r5 dev-util/pkgconfig: 0.26 sys-apps/baselayout: 2.1-r1 sys-apps/openrc: 0.9.9 sys-apps/sandbox: 2.5 sys-devel/autoconf: 2.68 sys-devel/automake: 1.4_p6-r1, 1.11.1 sys-devel/binutils: 2.21.1-r1 sys-devel/gcc: 4.5.3-r2 sys-devel/gcc-config: 1.6 sys-devel/libtool: 2.4-r1 sys-devel/make: 3.82-r1 sys-kernel/linux-headers: 3.1 (virtual/os-headers) sys-libs/glibc: 2.14.1-r3 Repositories: gentoo x-mailserver ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="* -@EULA" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=core2 -mtune=generic -O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" CXXFLAGS="-march=core2 -mtune=generic -O2 -pipe" DISTDIR="/usr/portage/distfiles" FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news parallel-fetch protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch" FFLAGS="" GENTOO_MIRRORS="rsync://de-mirror.org/gentoo/ ftp://ftp.halifax.rwth-aachen.de/gentoo/ rsync://ftp.halifax.rwth-aachen.de/gentoo/" LANG="en_GB.utf8" LDFLAGS="-Wl,--as-needed" MAKEOPTS="-j9" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force 
--whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/portage/local/mailserver" SYNC="rsync://rsync11.de.gentoo.org/gentoo-portage" USE="acl amd64 berkdb bzip2 cli cracklib crypt cups cxx dri fortran gdbm gpm iconv mmx modules mudflap multilib ncurses nls nptl openmp pam pcre pppd readline session snmp sse sse2 ssl tcpd truetype unicode xml xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" APACHE2_MPMS="prefork" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" PHP_TARGETS="php5-3" PYTHON_TARGETS="python3_2 python2_7" RUBY_TARGETS="ruby18 
ruby19" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Marko, I'm not so sure about that as I for one don't use lvm at all. I have the relevant kernel code built-in and the utils are present, but I've never used it. I think you might have a different issue.
I've noticed similar behaviour with mdadm-3.2.5 and hardware RAID0 through imsm metadata. Running "mdadm -Asv" manually gives me a "failed to get exclusive lock on mapfile" error.

I noticed that in an interactive boot, when I skip 'mdraid' and 'lvm', stop the boot process AFTER starting the 'root' service, and then exit to a console, I can successfully assemble the devices by running "mdadm -Ss" and manually starting the 'mdraid' service. Is it possible that 'mdraid' needs the root filesystem remounted rw to work?
(In reply to comment #22)
> I've noticed similar behaviour on mdadm-3.2.5 and hardware RAID0 through
> imsm metadata. Running "mdadm -Asv" manualy give me "failed to get exclusive
> lock on mapfile" error.

I still have the same issue with mdadm-3.2.6.

> Is it possible that 'mdraid' need rw remounted root filesystem to work?

You are right. If I put "mount -o remount,rw /" in /etc/init.d/mdraid, the error goes away and the RAID works correctly.
I recently upgraded to the latest ~arch kernel-3.6.2, udev-195, and mdadm-3.2.6, and mdraid stopped working properly: I had only 1 of 3 drives assembled in each array. Restarting the mdraid service fixes it. It took me a while to figure out that assembly was not being done by the init script, but in fact by /usr/lib/udev/rules.d/64-md-raid.rules. I removed that file and it works fine again. Something is wrong with the incremental assembly...
So, downgrading to mdadm-3.2.3-r2 also helps.
I can confirm that it affects mdadm-3.2.6 as well. I have a RAID5 system. It took me hours to find this bug. Could you please *mask* the affected versions? This is a show stopper.
If you comment out this line in the udev rules.d file, does it work?

ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
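For anyone scripting that test, here is a sketch run against a throwaway stand-in file. The path reported earlier in this thread for the real file is /usr/lib/udev/rules.d/64-md-raid.rules; the surrounding rule lines in the stand-in below are made up for illustration, and only the quoted trigger line comes from the actual rules file.

```shell
# Build a throwaway stand-in for the rules file (only the ENV{ID_FS_TYPE}
# line is from the real file; the others are illustrative filler).
rules=$(mktemp)
cat > "$rules" <<'EOF'
SUBSYSTEM!="block", GOTO="md_end"
ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member", GOTO="md_inc"
GOTO="md_end"
EOF

# Prefix the incremental-assembly trigger line with '#' so udev would
# never jump to the md_inc label.
sed -i 's|^ENV{ID_FS_TYPE}=="ddf_raid_member|#&|' "$rules"

# Count how many lines ended up commented out (expect exactly one).
disabled=$(grep -c '^#ENV' "$rules")
rm -f "$rules"
echo "$disabled"
```

To apply this for real you would point `rules` at the installed file instead of a temp copy (and keep a backup, since the package manager owns that file).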
(In reply to comment #27)
> if you comment out this line in the udev rules.d file, does it work ?
>
> ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member",
> GOTO="md_inc"

Why is this unmasked? I have not seen any evidence that this serious bug is fixed.
(In reply to comment #28) you didn't answer the question
(In reply to comment #28)
> you didn't answer the question

Because I don't have time to break my entire system right now... precisely why the package was masked in the first place.
Hi, I've had the same problem for months now, and removing the mentioned *.rules file solved the issue, because the array is then correctly built by the /etc/init.d/mdraid script at boot time! Before that, the raid5 array was assembled with only 1 drive and never started at all. It's not an mdadm problem; the utility works as expected in all versions. It's a problem with the udev "incremental build" rule.
(In reply to comment #30)

then don't spam noise here. i'm interested in fixing the problem.

(In reply to comment #31)

could you try removing just the one line from the rules.d file and not just deleting the entire thing ?
(In reply to comment #27)
> if you comment out this line in the udev rules.d file, does it work ?
>
> ENV{ID_FS_TYPE}=="ddf_raid_member|isw_raid_member|linux_raid_member",
> GOTO="md_inc"

No change here.
For me, commenting out that single line works; I have only the mdraid init script doing assembly, and it works as expected.
Created attachment 332464 [details, diff]
mdadm-3.2.5-r1.ebuild.patch

This patch fixes the issue for me. I think that only users with a separate /var partition are affected by this bug. I don't have the mdraid init script enabled; all my arrays are now incrementally assembled by udev.
(In reply to comment #35)
> I think that only users with separate /var partition are affected by this bug.

In other words, incremental assembly is broken without the map file. :)
3.2.6 also works fine after changing MAP_DIR.
(In reply to comment #35)
> Created attachment 332464 [details, diff]
> mdadm-3.2.5-r1.ebuild.patch
>
> This patch fixes this issue for me. I think that only users with separate
> /var partition are affected by this bug.
>
> I haven't mdraid init.d script enabled. All my arrays incrementaly assembled
> by udev now.

That has been the default since 3.2.4; there is something else that fixed the problem for you...
BTW, is there anybody affected by this issue who does not use an initramfs, or who does not use genkernel to generate the initramfs image?
(In reply to comment #38)
> That is the default since 3.2.4
> There is something else that fixed the problem for you...

Heh.. I just discovered this:

src_prepare() {
	[...]
	sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
	[...]
}

Before that, I looked at the sources after "ebuild ... unpack" and "ebuild ... prepare" and saw this string in the Makefile :)

MAP_DIR=/var/run/mdadm
(In reply to comment #40)
> src_prepare() {
> [...]
> sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
> [...]
> }

Can anybody try removing this stupid sed from the ebuild and check if that helps?
Created attachment 332562 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch. I think that passing MAP_DIR=/run/mdadm to make, even though it is the default, is a good approach anyway.
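For reference, the core of the change can be sketched as an ebuild fragment like the following. This is a sketch under the assumption that the old sed rewrite is simply dropped, not the attached patch verbatim; MAP_DIR is a real variable in mdadm's Makefile, as shown a few comments up.

```shell
# Sketch only -- not the attached patch verbatim.
# In src_prepare(), drop the old line that relocated the map file:
#   sed -i 's:/run/mdadm:/var/run/mdadm:g' *.[ch] Makefile || die
# and instead state the upstream default explicitly at build time:
src_compile() {
	emake CC="$(tc-getCC)" MAP_DIR=/run/mdadm
}
```

With /run mounted as a tmpfs very early in boot, the map file is always writable by the time udev runs the incremental-assembly rule.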
Since nobody replied, I want to explain the problem a bit more.

Incremental assembly needs a map file. The default location for the map file in mdadm-3.2.3 is /dev/.mdadm/map. In version 3.2.4 the default location was changed to /run/mdadm/map, but the ebuild changes this path to /var/run/mdadm/map.

On system boot, udev starts before localmount. So when udev starts, /var is not mounted yet and mdadm can't create the /var/run/mdadm directory or the files in it. That is why incremental assembly fails.
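The failure mode can be demonstrated without touching a real array. This is a throwaway simulation with temp directories standing in for /var and /run; the try_map helper is illustrative, not mdadm code.

```shell
# Temp stand-ins for the boot-time state: "/var" is unavailable when
# udev runs (here: not even a directory), "/run" is a writable tmpfs.
tmp=$(mktemp -d)
: > "$tmp/var"          # /var not usable yet
mkdir -p "$tmp/run"     # /run already exists

# Mimic mdadm trying to create its map file under a given MAP_DIR.
try_map() {
    if mkdir -p "$1" 2>/dev/null && touch "$1/map" 2>/dev/null; then
        echo "map created in $1"
    else
        echo "map FAILED in $1"
    fi
}

res_var=$(try_map "$tmp/var/run/mdadm")   # ebuild-patched MAP_DIR path
res_run=$(try_map "$tmp/run/mdadm")       # upstream default since 3.2.4
echo "$res_var"
echo "$res_run"
rm -rf "$tmp"
```

The first attempt fails and the second succeeds, which is exactly why the udev-driven incremental assembly dies with MAP_DIR under /var but works with MAP_DIR under /run.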
(In reply to comment #41)
> Can anybody try to remove this stupid sed from ebuild and check if this help?

I tried removing this line and rebooted; the RAID was assembled properly. Thanks.
(In reply to comment #43)
> On system boot udev starts before localmount. So when udev starts /var is not
> mounted yet and mdadm can't create /var/run/mdadm directory and files in it.
> That is why incremental assembly fails.

This is not entirely true. As mentioned in comment 22, the problem also occurs because "/" is mounted ro when udev starts. So a separate "/var" is not a necessary condition.
(In reply to comment #45)
> This is not entirely true. As mentioned in comment 22 the problem also
> occurs because "/" is mounted ro when udev starts. So separate "/var" is not
> a necessary condition.

That's right. But when I added "localmount" to the dependencies of the mdadm init script, I got a dependency loop: some filesystems on the RAID need mdadm, but mdadm needs "/" remounted rw too :(
Created attachment 333372 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch; it blocks openrc versions that do not mount /run.

(In reply to comment #46)
> But if I added "localmount" into the dependencies of mdadm init script, then
> I got a dependency loop: some filesystem on the RAID need mdadm, but mdadm
> needs to mount "/" to "rw" too :(

Simply put the patched mdadm ebuild into your local overlay. This should solve the issue.
(In reply to comment #47)
> Updated patch. Block openrc versions that does not mount /run.

Funtoo still uses "openrc-0.10.2"; does that mean I cannot update mdadm anymore? :(
(In reply to comment #48)
> Funtoo still uses "openrc-0.10.2", that means I can not update mdadm
> anymore? :(

openrc-0.10.5 is the only available version of the 0.10 branch in portage; that's why I chose it. You can change the dependency to something like this:

!<sys-apps/openrc-0.10
Created attachment 333376 [details, diff]
mdadm-3.2.6.ebuild.patch

Updated patch with fixed openrc dependency.
(In reply to comment #50)
> Created attachment 333376 [details, diff]
> mdadm-3.2.6.ebuild.patch
>
> Updated patch with fixed openrc dependency.

Great patch! It works for me; now my mdadm works correctly.
@base-system PING!
This bug is resolved, but has nobody added the patch to the tree?
Patch efficacy confirmed here with 3.2.6 as well, FWIW.
(In reply to comment #50)
> Created attachment 333376 [details, diff]
> mdadm-3.2.6.ebuild.patch
>
> Updated patch with fixed openrc dependency.

Looks okay to me. I don't use mdadm but this makes sense.

+*mdadm-3.2.6-r1 (08 Feb 2013)
+
+  08 Feb 2013; Samuli Suominen <ssuominen@gentoo.org> +mdadm-3.2.6-r1.ebuild:
+  Use /run/mdadm instead of /var/run/mdadm with baselayout that has the /run
+  directory wrt #416081 by Alexander Tsoy and Robin Bankhead
(In reply to comment #35)

thanks for tracking this down

(In reply to comment #55)

thanks for the commit