
Bug 80702

Summary: checkfs fails on sw-raid using mdadm/udev if devicenodes do not exist
Product: Gentoo Linux Reporter: Andre Lammel <andre>
Component: [OLD] baselayout Assignee: Gentoo's Team for Core System packages <base-system>
Status: RESOLVED FIXED    
Severity: critical CC: ginsu.squirrel, tronic2
Priority: High    
Version: 2004.3   
Hardware: All   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: Patch to /etc/init.d/checkfs

Description Andre Lammel 2005-02-04 04:35:46 UTC
We use SW-RAID1 and udev on our machine with the following options
in /etc/conf.d/rc:

RC_DEVICES="udev"
RC_DEVICE_TARBALL="no"

This leads to an error on boot: the device nodes for /dev/md{n}
do not exist in /dev when /etc/init.d/checkfs is invoked, and we are
dropped to a root shell. It turned out that

/sbin/mdadm -As ${DEVICENAME}

in /etc/init.d/checkfs expects the device nodes /dev/md{n} to exist
at startup time, which they obviously do not in our setup. We propose
the following modification to /etc/init.d/checkfs, which enabled us
to boot cleanly without changing our setup and (hopefully) does not
break any other setup:

<snip>

if [ -e "${DEVICENAME}" ]
then
  /sbin/mdadm -As "${DEVICENAME}" &>/dev/null
  retval=$?
else
  /sbin/mdadm -Aa "${DEVICENAME}" &>/dev/null
  retval=$?
fi

<snip>

Using mdadm -Aa creates the device nodes that do not exist.
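
To see in advance which arrays would hit this, one could compare the arrays named in the mdadm configuration against the nodes present in /dev. The following is only a minimal sketch: it assumes the arrays are declared on ARRAY lines with the device path as the second field, that the file lives at /etc/mdadm.conf, and the function name missing_md_nodes is hypothetical (none of this is part of checkfs).

```shell
#!/bin/sh
# Print every device named on an ARRAY line of the given mdadm config
# file whose node does not currently exist.
# Hypothetical helper; the default config path is an assumption.
missing_md_nodes() {
    conf="${1:-/etc/mdadm.conf}"
    [ -r "${conf}" ] || return 1
    # Field 1 is the keyword ARRAY, field 2 the device path.
    awk '$1 == "ARRAY" { print $2 }' "${conf}" |
    while read -r dev
    do
        [ -e "${dev}" ] || echo "${dev}"
    done
}
```

Calling `missing_md_nodes /etc/mdadm.conf` early in boot would list the nodes that mdadm -As would not find.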

A patch to checkfs is attached to this bug. We would be glad
if this could be included with the next version of baselayout.

Yours

Andre Lammel
Comment 1 Andre Lammel 2005-02-04 04:37:41 UTC
Created attachment 50321 [details, diff]
Patch to /etc/init.d/checkfs

Patch to /etc/init.d/checkfs for being able to boot on SW-RAID even if the
device nodes do not exist at invocation of /etc/init.d/checkfs
Comment 2 SpanKY gentoo-dev 2005-02-04 05:52:30 UTC
*** Bug 62749 has been marked as a duplicate of this bug. ***
Comment 3 Kenton Groombridge 2005-02-05 12:52:05 UTC
I am glad that a lot of people other than me are having this problem.  Hopefully there will be a fix soon.

The patch below doesn't work for me since it is creating device nodes all with the same minor number.
Comment 5 Andre Lammel 2005-02-05 13:34:39 UTC
i think the problem you describe (device nodes all with the same minor number)
should be discussed with the mdadm people... btw. why is it a problem for you?
which version of mdadm do you use?

there is another way of creating devnodes using mknod - something like:

for i in <DEVLIST>
do
  if [ ! -e ${i} ]
  then
    mknod ${i} b <minor> <major>
  fi
done

but i think it would be cleaner to use mdadm for that and fix the
problem you have in mdadm... let's hear more voices on this...
Comment 6 Andre Lammel 2005-02-07 06:26:25 UTC
btw, in our case mdadm creates nodes with ascending minors:

brw-rw----  1 root disk 9, 0 Feb  6 13:13 /dev/md/0
brw-rw----  1 root disk 9, 1 Feb  6 13:13 /dev/md/1
brw-rw----  1 root disk 9, 2 Feb  6 13:13 /dev/md/2
brw-rw----  1 root disk 9, 3 Feb  6 13:13 /dev/md/3
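
A listing like the one above can also be checked mechanically for the "same minor number" problem reported earlier. This is a rough sketch under assumptions: it relies on GNU stat, whose %t and %T formats print the device major and minor in hexadecimal, and the function names devnums and duplicate_devnums are made up for illustration.

```shell
#!/bin/sh
# Read "major minor path" lines on stdin and print each major:minor
# pair that is shared by more than one path.
duplicate_devnums() {
    awk '{ key = $1 ":" $2; count[key]++ }
         END { for (k in count) if (count[k] > 1) print k }' | sort
}

# Print "major minor path" for each given block device node.
# GNU stat reports %t (major) and %T (minor) in hexadecimal.
devnums() {
    for node in "$@"
    do
        [ -b "${node}" ] || continue
        echo "$(stat -c '%t %T' "${node}") ${node}"
    done
}
```

Running `devnums /dev/md/* | duplicate_devnums` prints nothing when every node has its own device number.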

we are using mdadm-1.7.0 because mdadm-1.8.1 does not start
our RAID but segfaults without any obvious reason.

yours

andre lammel
Comment 7 Andre Lammel 2005-02-09 05:59:50 UTC
hmmm, 

# mknod ${i} b <minor> <major>

should be

# mknod ${i} b <major> <minor>

my fault...

Andre Lammel
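
With the corrected argument order, the mknod approach could be sketched roughly as follows. This is an illustration, not the fix that went into baselayout: major number 9 and ascending minors are taken from the listing earlier in this bug, while the node naming /dev/md<n>, the function name make_md_nodes, and the MKNOD override for dry runs are assumptions.

```shell
#!/bin/sh
# Create block device nodes <dir>/md0 .. <dir>/md<count-1> if missing.
# Major number 9 for md devices matches the listing in this report.
# Set MKNOD="echo mknod" to print the commands instead of running them.
MD_MAJOR=9
make_md_nodes() {
    count="$1"
    dir="${2:-/dev}"
    minor=0
    while [ "${minor}" -lt "${count}" ]
    do
        node="${dir}/md${minor}"
        if [ ! -e "${node}" ]
        then
            # Argument order: name, type (b = block), major, minor.
            ${MKNOD:-mknod} "${node}" b "${MD_MAJOR}" "${minor}"
        fi
        minor=$((minor + 1))
    done
}
```

For example, `make_md_nodes 4` run as root would create any of /dev/md0 to /dev/md3 that are missing; with MKNOD="echo mknod" set it only prints the commands.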
Comment 8 SpanKY gentoo-dev 2005-03-01 20:38:47 UTC
will be fixed properly with baselayout-1.11.10+ and mdadm-1.9.0-r1+
Comment 9 Radek "rush" Senfeld 2006-03-22 00:51:22 UTC
Sorry for reopening this bug, but it still happens to me. I found that the problem is the new way of creating device nodes using uevents - kernel 2.6.14+. I changed /lib/rcscripts/addons/udev-start.sh from this..

populate_udev() {
        # populate /dev with devices already found by the kernel
        if [ "$(get_KV)" -gt "$(KV_to_int '2.6.14')" ] ; then
                ebegin "Populating /dev with existing devices through uevents"
                trigger_events
                eend 0
        else
                ebegin "Populating /dev with existing devices with udevstart"
                /sbin/udevstart
                eend 0
        fi

to this..

populate_udev() {
        # populate /dev with devices already found by the kernel
        if [ "$(get_KV)" -gt "$(KV_to_int '2.6.14')" ] ; then
                ebegin "Populating /dev with existing devices through uevents"
                trigger_events
                eend 0
        fi
        ebegin "Populating /dev with existing devices with udevstart"
        /sbin/udevstart
        eend 0

And everything works now. I know it's an ugly temporary hack, but I can't figure out how to patch it in a better way. It occurs on every new installation with software RAID.
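
For readers unfamiliar with the version test in populate_udev, here is a hedged sketch of how such a comparison can work. The encoding major*65536 + minor*256 + micro is an assumption made for this example, not something stated in this report, and kv_to_int_sketch and uses_uevents are hypothetical names. Note that a strict -gt test, as in the quoted script, sends 2.6.14 itself down the udevstart branch.

```shell
#!/bin/sh
# Convert "major.minor.micro[-extra]" to a single comparable integer.
# The encoding is an assumption for illustration only.
# The body runs in a subshell so the IFS change stays local.
kv_to_int_sketch() (
    IFS=".-"
    # Intentionally unquoted: split the version string into fields.
    set -- $1
    echo $(( $1 * 65536 + $2 * 256 + ${3:-0} ))
)

# True when the given kernel version is strictly newer than 2.6.14,
# mirroring the -gt comparison in the quoted populate_udev.
uses_uevents() {
    [ "$(kv_to_int_sketch "$1")" -gt "$(kv_to_int_sketch '2.6.14')" ]
}
```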
Comment 10 Greg Kroah-Hartman (RETIRED) gentoo-dev 2006-03-22 13:16:31 UTC
eeek, no, don't change the udev startup logic like that, lots of bad things
can possibly happen.

Time to just go delete udevstart entirely, as we don't use it anymore, and it
seems to be getting abused like this lately...