Bug 338034

Summary: /dev/md3 gets removed on shutdown and is missing during boot
Product: Gentoo Linux
Component: [OLD] Core system
Status: RESOLVED NEEDINFO
Severity: major
Priority: High
Version: unspecified
Hardware: x86
OS: Linux
Reporter: Mario Klebsch <mario>
Assignee: Gentoo Linux bug wranglers <bug-wranglers>
CC: mario
Whiteboard:
Package list:
Runtime testing required: ---

Description Mario Klebsch 2010-09-19 11:23:04 UTC
I just updated my long-running Gentoo system and had some bad surprises. :-(

After emerge --update --deep --newuse world, my system was totally screwed up and unable to boot.

Here are some words about my setup:

I have one flash module on IDE (hda1) containing /boot with GRUB and the kernel. The system boots from this disk.

I have two SATA disks (sda and sdb), but the BIOS is unable to boot from them. This is the reason for the flash disk on IDE. GRUB loads the Linux kernel, and the Linux kernel can use the SATA drives.


On each SATA drive, I have three partitions: the first one is for the root fs, the second one is for swap, and the third one is for everything else, using LVM2.

I have created two mirror sets:


md1 : active raid1 sdb1[1] sda1[0]
      256896 blocks [2/2] [UU]
      
md3 : active raid1 sdb3[1] sda3[0]
      486166976 blocks [2/2] [UU]

Root is on /dev/md1; the LVM physical volume is on /dev/md3:

# pvs
  PV         VG   Fmt  Attr PSize   PFree 
  /dev/md3   vg   lvm2 a-   463.64g 63.64g


After updating my system, booting failed and I had to enter the root password to enter single user mode.

I discovered that the nodes in /dev were missing. The cause seems to be udev, which did not populate the /dev directory with the md*-nodes. Since the box is my router, too, I had no internet and had to get it working by myself.

I discovered that I could set RC_DEVICES="static" in /etc/conf.d/rc, and the system booted again.

But from that moment on, every reboot screws up my system. I found out that /dev/md3 gets removed by the mdadm -Ss call in /lib/rcscripts/addons/raid-stop.sh, but is not re-created by the matching mdadm -As call in /lib/rcscripts/addons/raid-start.sh. :-(

mdadm -As fails with exit code 2, giving no error message. :-(

When I added some -v parameters to the mdadm call, I saw that mdadm tries to re-assemble the RAIDs, but the devices are all busy. The cause seems to be that the kernel has already re-assembled the RAID devices.
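
One way to check the suspicion that the kernel has already assembled the arrays is to look at /proc/mdstat before mdadm -As runs. The following is only a sketch: the function name active_md_arrays is made up for illustration, and it assumes the mdstat layout shown earlier in this report (array name, a colon, then "active").

```shell
#!/bin/sh
# List the md arrays the kernel currently reports as active, one per line.
# Reads /proc/mdstat by default; another file may be given as the first
# argument (handy for trying it on a saved copy).
# NOTE: active_md_arrays is a hypothetical helper, not part of mdadm.
active_md_arrays() {
    awk '$2 == ":" && $3 == "active" { print $1 }' "${1:-/proc/mdstat}"
}
```

Fed the mdstat excerpt quoted above, this prints md1 and md3. If an array shows up here while its node is missing from /dev, the array itself is running and only the device node is absent.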

So I end up having lost my /dev/md3 after every reboot. As a workaround, I added a mknod call to /lib/rcscripts/addons/raid-start.sh.
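
The exact mknod call is not given here, so the following is only a sketch of what such a workaround might look like. It relies on the Linux convention that md devices use block major 9, with the minor equal to the array number. The function name ensure_md_node and the DEV_DIR and MKNOD variables are assumptions introduced for illustration; they exist only so the logic can be tried without touching the real /dev.

```shell
#!/bin/sh
# Recreate a missing md block device node in a static /dev.
# Linux assigns block major 9 to md; the minor is the array number,
# so /dev/md3 is "b 9 3".
# NOTE: ensure_md_node, DEV_DIR and MKNOD are hypothetical names.
ensure_md_node() {
    n=$1
    node="${DEV_DIR:-/dev}/md$n"
    # Nothing to do if the block device node already exists.
    [ -b "$node" ] && return 0
    ${MKNOD:-mknod} "$node" b 9 "$n"
}
```

Calling ensure_md_node 3 as root would then create /dev/md3 only when it is absent. Setting MKNOD=echo beforehand makes it print the mknod arguments instead of creating the node.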

I read tons of bug reports about udev, LVM and RAID on the internet, but was unable to find anything to cure my problem. Almost everyone used an initial ram disk to load some kernel modules and do some magic to switch the root fs. But my setup does not use an initial ram disk. I have all required modules in my kernel, which assembles the RAID partitions and mounts them as / by itself. I found various solutions that involve modifying genkernel or the initial ram disk's content. None of this was of any help to me. :-(

I tried to remove CONFIG_SYSFS_DEPRECATED* from my kernel's .config file, because one bug report stated that CONFIG_SYSFS_DEPRECATED* must not be used with current udev. Other bug reports said that CONFIG_SYSFS_DEPRECATED* is needed to get the RAID working. :-(

So what can I do?

- Is there any chance to get udev working with my kernel, or my kernel working with udev? An initial ram disk is not an option.

- Why the f**k is mdadm removing nodes in a static /dev?


Reproducible: Always

Steps to Reproduce:
init 6
Actual Results:  
After the reboot, only / is mounted, /dev/md3 is missing and all file systems on it are missing, too.

Expected Results:  
All file systems mounted
Comment 1 Mario Klebsch 2010-09-19 11:24:02 UTC
emerge --info
Portage 2.1.8.3 (default/linux/x86/10.0, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-gentoo-r6 i686)
=================================================================
System uname: Linux-2.6.34-gentoo-r6-i686-Intel-R-_Celeron-R-_M_processor_1400MHz-with-gentoo-1.12.13
Timestamp of tree: Sat, 18 Sep 2010 01:15:01 +0000
app-shells/bash:     4.1_p7
dev-lang/python:     2.6.5-r3, 3.1.2-r4
dev-util/cmake:      2.8.1-r2
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.65-r1
sys-devel/automake:  1.7.9-r1, 1.9.6-r2, 1.10.2, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.3.2-r3, 4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
sys-devel/make:      3.81-r2
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="* -@EULA"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=i686 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="assume-digests distlocks fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://ftp.join.uni-muenster.de/pub/linux/distributions/gentoo "
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="de en"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="-6"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://ftp.ipv6.uni-muenster.de/gentoo-portage"
USE="X accessibility acl berkdb bzip2 cli cracklib crypt cups cxx dri fortran gdbm gif gpm iconv ipv6 mbox modules mudflap ncurses nls nptl nptlonly opengl openmp pam pcre perl pppd python qt3 readline reflection sasl session ssl sysfs tcpd unicode x86 xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="de en" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="i810 vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
Comment 2 Wormo (RETIRED) gentoo-dev 2010-10-08 19:04:57 UTC
Sorry to take so long, September was a very busy month...

Are you still having problems with mdadm and udev?

If so, I suggest turning off CONFIG_SYSFS_DEPRECATED because modern udev will certainly not work properly with it on. Plenty of long-running boxes have been broken on updates in recent times because udev finally dropped backwards compatibility with the old sysfs layout, and the deprecated option had been propagated via 'make oldconfig' to the current kernel.
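
A quick way to see whether the deprecated options are still enabled is to grep the kernel configuration. This is a minimal sketch, assuming the config lives at /usr/src/linux/.config (adjust the path if yours differs); the function name sysfs_deprecated_enabled is made up for illustration.

```shell
#!/bin/sh
# Print any CONFIG_SYSFS_DEPRECATED* options that are enabled in a kernel
# .config. Exit status 0 means at least one is enabled, 1 means none are.
# Disabled options appear as "# CONFIG_... is not set" and are not matched.
# NOTE: sysfs_deprecated_enabled is a hypothetical helper name, and the
# default path is an assumption.
sysfs_deprecated_enabled() {
    grep -E '^CONFIG_SYSFS_DEPRECATED(_V2)?=y' "${1:-/usr/src/linux/.config}"
}
```

If this prints anything, the options are still on. After disabling them, the kernel has to be rebuilt and reinstalled before the change takes effect.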

Also I think mdadm is supposed to create the md* nodes, so the other thing to look at is your mdadm.conf file.
Comment 3 Mike Auty (RETIRED) gentoo-dev 2010-11-04 18:42:23 UTC
If this is still an issue, please reopen this bug mentioning whether you're using CONFIG_SYSFS_DEPRECATED, and whether turning it off solved your problems...