| Summary: | Refactored LVM & RAID initialization rc-scripts | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | cDlm <dpollet> |
| Component: | [OLD] Core system | Assignee: | Daniel Ahlberg (RETIRED) <aliz> |
| Status: | RESOLVED FIXED | | |
| Severity: | enhancement | CC: | azarah, Bart.Theunissen, chris, dpollet, ftobin+gentoo-bugzilla, greg, kevin, quintino, Take.Vos, vapier, w.k.havinga |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |
| Bug Depends on: | | | |
| Bug Blocks: | 7405 | | |

Attachments:

- RAID initialization rc-script
- checkfs without lvm stuff
- LVM initialization rc-script
Description
cDlm 2002-07-20 11:29:16 UTC
Created attachment 2417 [details]
RAID initialization rc-script
Created attachment 2418 [details]
checkfs without lvm stuff
checkfs depends on raid and lvm. I don't really know which dependency type is best (use or need). I put use, so one has to add the raid/lvm service to the appropriate (boot?) runlevel.
Created attachment 2419 [details]
LVM initialization rc-script
this is the stuff extracted from checkfs
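The dependency scheme described in this report (checkfs declaring a soft dependency on a raid service that has been added to the boot runlevel) would look roughly like the following as a baselayout rc-script. This is only an illustrative sketch, not one of the actual attachments; the depend/start/stop functions and the ebegin/eend helpers are the standard runscript conventions of baselayout at the time.

```shell
#!/sbin/runscript
# Hypothetical /etc/init.d/raid sketch; illustrative only.

depend() {
	# "before checkfs" lets checkfs declare "use raid" and still
	# guarantees ordering when both services are in the boot runlevel
	before checkfs
}

start() {
	ebegin "Starting software RAID"
	/sbin/raidstart --all >/dev/null
	eend $? "Failed to start software RAID"
}

stop() {
	ebegin "Stopping software RAID"
	# the device list here is illustrative; a real script would
	# derive it from /etc/raidtab or /proc/mdstat
	for dev in /dev/md0 /dev/md1 ; do
		raidstop ${dev}
	done
	eend $? "Failed to stop software RAID"
}
```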
ok, so i checked out your scripts since i run 2 raids myself ... here is what i suggest: make the first attachment /etc/init.d/raid, run `rc-update add raid boot`, then edit /etc/init.d/checkfs and add 'use raid'. this setup works for me ... it starts the raids before checking the filesystems and then mounting them. i dont use LVM so i cant comment on that part.

SpanKY: thats what I meant by rc-scripts (scripts in /etc/init.d). BTW I'm not an expert in linux init things, and don't have much time to fix that now... anyway I noticed on my machine raid failed to stop at shutdown, and that there is LVM stopping stuff in /etc/init.d/halt.sh (but nothing similar for raid), so my scripts may obviously not do The Right Thing...

i know what you meant when you said 'rc-script', but you didnt specify filenames for the scripts. as for the raid failing to stop on shutdown, i noticed that too ... when i do `raidstop --all`, mine just spits back 'nothing to do', yet `raidstop /dev/md0` and `raidstop /dev/md1` work fine. `raidstop -a -c /etc/raidtab` and `raidstop /dev/md*` also fail ... i guess one hack would be to grep 'raiddev' out of /etc/raidtab and run raidstop on each device:

    for dev in `grep raiddev /etc/raidtab` ; do
        [ "${dev:0:5}" == "/dev/" ] && raidstop $dev
    done

hey azarah, please take a look at this? i know all the raid users out there would love you for this support :)

Check out masked baselayout-1.8.1 ... it already has raid support in checkfs ..

coolio, the raid works for me ... as for the lvm, i dont use it so i cant say anything about it ;)

cDlm: how does baselayout 1.8.1 work for you?

ok, well the raid support doesnt work as it should ;) your script works if there is only one raid device defined in /etc/raidtab. i have 2 ;)

    root@rux0r root # [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]
    [: too many arguments

also, why do you check for this in /etc/raidtab? i have my values set to 1, which means your script (in theory) wouldnt start the raid(s).
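The grep hack above pushes every whitespace-separated token of the matching lines through the loop, which is why it needs the "/dev/" prefix test (and `${dev:0:5}` is a bashism). A sketch of the same idea with awk printing only the device field; the raidtab contents are inlined here purely so the example is self-contained, where a real script would read /etc/raidtab.

```shell
#!/bin/sh
# Inline sample of /etc/raidtab (illustrative); a real script
# would read the file directly.
raidtab='raiddev /dev/md0
raid-level 0
nr-raid-disks 2
raiddev /dev/md1
raid-level 0'

# Print only the second field of each "raiddev" line: the md device path.
devices=$(printf '%s\n' "$raidtab" | awk '$1 == "raiddev" { print $2 }')

for dev in $devices ; do
    echo "would run: raidstop $dev"
done
```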
but i want them started heh. basically, i would suggest you remove this line in /etc/init.d/checkfs:

    if [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]

Ok, what about changing that to:

---------------------------snip------------------------------------------------
# Start software raid
if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
then
	ebegin "Starting software RAID"
	# still echo stderr for debugging
	/sbin/raidstart --all >/dev/null
	eend $? "Failed to start software RAID"
fi
--------------------------------------------------------------------------------

BTW .. check the ChangeLog .. its not *my* piece of code :P

Also, I guess we need some shutdown stuff in halt.sh:

#stop RAID
if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
then
	ebegin "Stopping software RAID"
	raidstop -a
	eend $? "Failed to stop software RAID"
fi

Should do it (just before LVM in /etc/init.d/halt.sh)?

thats what i did to my code and it worked ... you talked about raid so i thought it was yours :)

aliz: check this stuff out? :)

Committed to CVS. Add to this bug if anything major is wrong.

Im getting boot errors from the checkfs raid startup code. My raid array autodetects at boot, so I don't need to start the raid in checkfs. The current code causes raidstart to run again, which fails since /dev/md0 already exists. The easiest solution I see is to re-add the

    if [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]

line so that it stops, or I could change persistent-superblock in /etc/raidtab to "0", but then why bother autodetecting at all. My solution for the time being is that I just commented out all the raid items in checkfs and halt.sh. Any other ideas? Id hate to have to edit the init.d scripts every time baselayout upgrades.
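Incidentally, the quoted one-liner has the same failure mode as the rux0r transcript earlier in the thread: with more than one raiddev entry the command substitution expands to several words and `[` sees too many arguments. A per-device sketch (illustrative only, run against an inline sample rather than the real /etc/raidtab) that lists just the arrays autodetection would skip:

```shell
#!/bin/sh
# Inline sample; a real script would read /etc/raidtab.
raidtab='raiddev /dev/md0
persistent-superblock 1
raiddev /dev/md1
persistent-superblock 0'

# Remember the current raiddev; print it when its
# persistent-superblock setting is 0 (i.e. no autodetection).
manual=$(printf '%s\n' "$raidtab" | awk '
    $1 == "raiddev"                           { dev = $2 }
    $1 == "persistent-superblock" && $2 == 0  { print dev }')

echo "needs raidstart: $manual"
```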
reply to #12

From Martin Schlemmer: raidstop -a does not work, check out the mail http://web.gnu.walfield.org/mail-archive/linux-raid/1999-November/0074.html . You have to specify the raid device if you want to stop it.

I'm with the sentiment that we don't need this raidstart in the startup scripts; that's what we have autodetection for. I'm also not quite seeing the benefit of having raidstop in the shutdown scripts.

Here are my proposed changes to checkfs and halt.sh. These can now handle auto-started raids, manually started raids, and combinations of the two.

=== begin halt.sh ===
--- /etc/init.d/halt.sh	Mon Aug 26 16:58:22 2002
+++ /tmp/halt.sh	Thu Sep 5 08:58:11 2002
@@ -85,10 +85,13 @@
 eend 0

 #stop RAID
-if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
+if [ -x /sbin/raidstop -a -f /etc/raidtab -a -f /proc/mdstat ]
 then
 	ebegin "Stopping software RAID"
-	raidstop -a
+	for a in $(grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print $1}')
+	do
+		raidstop /dev/$a
+	done
 	eend $? "Failed to stop software RAID"
 fi
=== end halt.sh ===

=== begin checkfs ===
--- /etc/init.d/checkfs	Mon Aug 26 16:58:22 2002
+++ /tmp/checkfs	Thu Sep 5 09:34:58 2002
@@ -28,8 +28,16 @@
 if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
 then
 	ebegin "Starting software RAID"
-	# still echo stderr for debugging
-	/sbin/raidstart --all >/dev/null
+	ACTIVE_RAID=`grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print $1}'`
+	if [ ! -z $ACTIVE_RAID ];
+	then
+		for a in $(grep -E "raiddev /dev/md[0-9]+" /etc/raidtab | awk '{print $2}' | grep -v "`grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print "/dev/"$1}' | sed s'/[[:space:]]//g'`")
+		do
+			/sbin/raidstart $a
+		done
+	else
+		/sbin/raidstart --all
+	fi
 	eend $? "Failed to start software RAID"
 fi
=== end checkfs ===

Thoughts, comments?

Aliz .. tested this? Also, what is the feeling from the other guys who are RAID clued-up?
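The active-array detection in these diffs can be exercised on its own. This sketch runs the same grep/awk pipeline against an inline /proc/mdstat sample (the live file would be used in the real script); the device names in the sample are illustrative.

```shell
#!/bin/sh
# Inline sample of /proc/mdstat with two active arrays.
mdstat='Personalities : [raid0]
read_ahead 1024 sectors
md0 : active raid0 sdb1[1] sdc1[0]
      15952256 blocks 32k chunks
md2 : active raid0 sdd2[1] sde2[0]
      1959680 blocks 32k chunks
unused devices: <none>'

# Same pattern as the checkfs diff: take the md name from each
# "mdN : active raid..." line and strip stray whitespace.
active=$(printf '%s\n' "$mdstat" \
    | grep -E "md[0-9]+[[:space:]]?: active raid" \
    | awk -F ':' '{print $1}' \
    | sed 's/[[:space:]]//g')

for a in $active ; do
    echo "already active: /dev/$a"
done
```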
Like I said, I do not use it myself.

That shell code for checkfs worries the heck out of me. It's not that complicated, but it adds too much 'smarts' to the system startup. I'm especially against the fact that it's included in a generic file called "checkfs". It should be in a separate script, e.g. "raidfs", not enabled by default.

I'm certainly worried about fragile 'smarts' in my startup scripts. For me, because I use autodetection (which is highly recommended and simple to activate), this introduction of raid-knowledgeable stuff into the startup/shutdown scripts has caused nothing but trouble. I've had to remove checkfs from the boot scripts as a result. Are there really that many people not using autodetection that this stuff should be in checkfs at all?

Aliz?

it has actually made my life easier ... i was always adding 'raidstart' commands to checkfs so that both my raids would come up and be checked properly. but i do agree that a separate raid script would be better (disabled by default), with checkfs having a 'use raid' line ...

I can confirm that both scripts are working as expected, both on a system with autostarting raid and one without. I don't have any real opinion on whether or not to separate raid (and maybe lvm) from checkfs. My suggestion would be to include the fixes in checkfs and halt.sh; if it's still broken we could move them to a different script. Also, if defining "use raid" in checkfs, wouldn't it run raid even if raid wasn't added to the boot runlevel?

*** Bug 7405 has been marked as a duplicate of this bug. ***

The problem I have with a separate raid script is the following: will shutting down the raid any place other than halt.sh be correct? Won't it then be too soon? NB: Check the Mandrake/Redhat raid stuff in their rc.sysinit ... much more complicated, and also in *the* main rc script. I am guessing that otherwise controlling when shutdown happens (usually at the very end, I would assume), and to some extent startup, is difficult.
Martin, the checkfs script stopped working for me all of a sudden. I tracked the error down to the `-n` test in the if. Changing "-n" to "! -z" made the script start my raid again.

Yes, im an idiot. Please try the one in cvs.

*** Bug 7954 has been marked as a duplicate of this bug. ***

*** Bug 8052 has been marked as a duplicate of this bug. ***

I get this error when I boot now:

    Starting Software RAID: /sbin/runscript.sh [: too many arguments

It makes it to `raidstart --all`, because after this I get "file exists" for all 3 raid devices, but for some reason the grep -E from the script is failing. I have re-emerged baselayout to make sure that my file was right, and it is. There are no extra "[" in the file. The only place a "[" comes in is from line 1 of /proc/mdstat. Could that have anything to do with it? The script runs fine when I replace ewarn etc. with echo and run it through /bin/sh.

-Greg

Martin, no, the update didn't fix the problem. It seems you must use "! -z" instead of "-n".

Greg, can you post or mail your /etc/raidtab and /proc/mdstat? Also, do you have any autostarting raids?

All of them are autostarting, so I really do not need the script; I'm just pointing out that it is broken on my machine. Running the script from a Bourne shell works fine though.
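One plausible explanation for the "-n" vs "! -z" confusion, sketched below: the two tests are equivalent, but with the variable left unquoted and holding several device names, `[` receives too many arguments and the test errors out. Quoting the expansion is the actual fix. The variable name and contents here are illustrative, matching the multi-array case reported in this bug.

```shell
#!/bin/sh
# Multi-word value, as when several raids are active in /proc/mdstat.
ACTIVE_RAID="md0 md1 md2"

# Unquoted: the test sees extra words -> "[: too many arguments"
# (exit status 2), so the if takes the else branch.
if [ ! -z $ACTIVE_RAID ] 2>/dev/null ; then
    unquoted=ok
else
    unquoted=failed
fi

# Quoted: the whole string is one argument, so -n (or ! -z)
# behaves correctly.
if [ -n "$ACTIVE_RAID" ] ; then
    quoted=ok
else
    quoted=failed
fi

echo "unquoted=$unquoted quoted=$quoted"
# prints: unquoted=failed quoted=ok
```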
-Greg

----- /etc/raidtab -----

##### /USR - (Raid 0)
raiddev /dev/md0
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdb1
raid-disk 0
device /dev/sdc1
raid-disk 1

##### /HOME - (Raid 0)
raiddev /dev/md1
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdd1
raid-disk 0
device /dev/sde1
raid-disk 1

##### /VAR - (Raid 0)
raiddev /dev/md2
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdd2
raid-disk 0
device /dev/sde2
raid-disk 1

----- /proc/mdstat -----

Personalities : [raid0]
read_ahead 1024 sectors
md0 : active raid0 scsi/host0/bus0/target2/lun0/part1[1] scsi/host0/bus0/target1/lun0/part1[0]
      15952256 blocks 32k chunks
md1 : active raid0 scsi/host0/bus0/target4/lun0/part1[1] scsi/host0/bus0/target3/lun0/part1[0]
      6522112 blocks 32k chunks
md2 : active raid0 scsi/host0/bus0/target4/lun0/part2[1] scsi/host0/bus0/target3/lun0/part2[0]
      1959680 blocks 32k chunks
unused devices: <none>

That doesnt really make sense: '! -z' is the same as '-n'. But whatever. Aliz, please just fix it and update the ChangeLog. Just make sure whatever you commit works on all kinds of setups, and try to keep it tidy. Thanks.

Committed a fix for my problem and hopefully Greg's problem. Greg, please try replacing your checkfs script with http://www.gentoo.org/cgi-bin/viewcvs.cgi/gentoo-src/rc-scripts/init.d/checkfs?rev=1.16&content-type=text/vnd.viewcvs-markup

this doesnt make sense but ive seen it happen before ...

*** Bug 8679 has been marked as a duplicate of this bug. ***