| Summary: | Refactored LVM & RAID initialization rc-scripts | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | cDlm <dpollet> |
| Component: | [OLD] Core system | Assignee: | Daniel Ahlberg (RETIRED) <aliz> |
| Status: | RESOLVED FIXED | | |
| Severity: | enhancement | CC: | azarah, Bart.Theunissen, chris, dpollet, ftobin+gentoo-bugzilla, greg, kevin, quintino, Take.Vos, vapier, w.k.havinga |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |
| Bug Depends on: | | | |
| Bug Blocks: | 7405 | | |

Attachments:

- RAID initialization rc-script
- checkfs without lvm stuff
- LVM initialization rc-script
Description
cDlm 2002-07-20 11:29:16 UTC
Created attachment 2417 [details]
RAID initialization rc-script
Created attachment 2418 [details]
checkfs without lvm stuff
checkfs depends on raid and lvm. I don't really know which dependency type is best (use or need). I put use, so one has to add the raid/lvm service to the appropriate (boot?) runlevel.
Created attachment 2419 [details]
LVM initialization rc-script
this is the stuff extracted from checkfs
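The dependency scheme described in this report (checkfs declaring a soft dependency on a raid service that has been added to the boot runlevel) would look roughly like the following as a baselayout rc-script. This is only an illustrative sketch, not one of the actual attachments; the depend/start/stop functions and the ebegin/eend helpers are the standard runscript conventions of baselayout at the time.

```shell
#!/sbin/runscript
# Hypothetical /etc/init.d/raid sketch; illustrative only.

depend() {
	# "before checkfs" lets checkfs declare "use raid" and still
	# guarantees ordering when both services are in the boot runlevel
	before checkfs
}

start() {
	ebegin "Starting software RAID"
	/sbin/raidstart --all >/dev/null
	eend $? "Failed to start software RAID"
}

stop() {
	ebegin "Stopping software RAID"
	# the device list here is illustrative; a real script would
	# derive it from /etc/raidtab or /proc/mdstat
	for dev in /dev/md0 /dev/md1 ; do
		raidstop ${dev}
	done
	eend $? "Failed to stop software RAID"
}
```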
ok, so i checked out your scripts since i run 2 raids myself ... here is what i suggest: make the first attachment /etc/init.d/raid, run `rc-update add raid boot`, then edit /etc/init.d/checkfs and add 'use raid'. this setup works for me ... it starts the raids before checking the filesystems and then mounting them. i dont use LVM so i cant comment on that part.

SpanKY: thats what I meant by rc-scripts (scripts in /etc/init.d). BTW I'm not an expert in linux init things, and don't have much time to fix that now... anyway I noticed on my machine raid failed to stop at shutdown, and that there is LVM stopping stuff in /etc/init.d/halt.sh (but nothing similar for raid), so my scripts may obviously not do The Right Thing...

i know what you meant when you said 'rc-script', but you didnt specify filenames for the scripts. as for the raid failing to stop on shutdown, i noticed that too ... when i do `raidstop --all`, mine just spits back 'nothing to do', yet `raidstop /dev/md0` and `raidstop /dev/md1` work fine. `raidstop -a -c /etc/raidtab` and `raidstop /dev/md*` also fail ... i guess one hack would be to grep 'raiddev' out of /etc/raidtab and run raidstop on each device:

    for dev in `grep raiddev /etc/raidtab` ; do
        [ "${dev:0:5}" == "/dev/" ] && raidstop $dev
    done

hey azarah, please take a look at this? i know all the raid users out there would love you for this support :)

Check out masked baselayout-1.8.1 ... it already has raid support in checkfs ..

coolio, the raid works for me ... as for the lvm, i dont use it so i cant say anything about it ;)

cDlm: how does baselayout 1.8.1 work for you?

ok, well the raid support doesnt work as it should ;) your script works if there is only one raid device defined in /etc/raidtab. i have 2 ;)

    root@rux0r root # [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]
    [: too many arguments

also, why do you check for this in /etc/raidtab? i have my values set to 1, which means your script (in theory) wouldnt start the raid(s).
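The grep hack above pushes every whitespace-separated token of the matching lines through the loop, which is why it needs the "/dev/" prefix test (and `${dev:0:5}` is a bashism). A sketch of the same idea with awk printing only the device field; the raidtab contents are inlined here purely so the example is self-contained, where a real script would read /etc/raidtab.

```shell
#!/bin/sh
# Inline sample of /etc/raidtab (illustrative); a real script
# would read the file directly.
raidtab='raiddev /dev/md0
raid-level 0
nr-raid-disks 2
raiddev /dev/md1
raid-level 0'

# Print only the second field of each "raiddev" line: the md device path.
devices=$(printf '%s\n' "$raidtab" | awk '$1 == "raiddev" { print $2 }')

for dev in $devices ; do
    echo "would run: raidstop $dev"
done
```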
but i want them started heh. basically, i would suggest you remove this line in /etc/init.d/checkfs:

    if [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]

Ok, what about changing that to:

---------------------------snip------------------------------------------------
# Start software raid
if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
then
	ebegin "Starting software RAID"
	# still echo stderr for debugging
	/sbin/raidstart --all >/dev/null
	eend $? "Failed to start software RAID"
fi
--------------------------------------------------------------------------------

BTW .. check the ChangeLog .. its not *my* piece of code :P

Also, I guess we need some shutdown stuff in halt.sh:

#stop RAID
if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
then
	ebegin "Stopping software RAID"
	raidstop -a
	eend $? "Failed to stop software RAID"
fi

Should do it (just before LVM in /etc/init.d/halt.sh)?

thats what i did to my code and it worked ... you talked about raid so i thought it was yours :)

aliz: check this stuff out? :)

Committed to CVS. Add to this bug if anything major is wrong.

Im getting boot errors from the checkfs raid startup code. My raid array autodetects at boot, so I don't need to start the raid in checkfs. The current code causes raidstart to run again, which fails since /dev/md0 already exists. The easiest solution I see is to re-add the

    if [ `grep "persistent-superblock" /etc/raidtab | awk '{print $2}'` == 0 ]

line so that it stops, or I could change persistent-superblock in /etc/raidtab to "0", but then why bother autodetecting at all. My solution for the time being is that I just commented out all the raid items in checkfs and halt.sh. Any other ideas? Id hate to have to edit the init.d scripts every time baselayout upgrades.
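Incidentally, the quoted one-liner has the same failure mode as the rux0r transcript earlier in the thread: with more than one raiddev entry the command substitution expands to several words and `[` sees too many arguments. A per-device sketch (illustrative only, run against an inline sample rather than the real /etc/raidtab) that lists just the arrays autodetection would skip:

```shell
#!/bin/sh
# Inline sample; a real script would read /etc/raidtab.
raidtab='raiddev /dev/md0
persistent-superblock 1
raiddev /dev/md1
persistent-superblock 0'

# Remember the current raiddev; print it when its
# persistent-superblock setting is 0 (i.e. no autodetection).
manual=$(printf '%s\n' "$raidtab" | awk '
    $1 == "raiddev"                           { dev = $2 }
    $1 == "persistent-superblock" && $2 == 0  { print dev }')

echo "needs raidstart: $manual"
```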
reply to #12

From Martin Schlemmer: raidstop -a does not work, check out the mail http://web.gnu.walfield.org/mail-archive/linux-raid/1999-November/0074.html . You have to specify the raid device if you want to stop it.

I'm with the sentiment that we don't need this raidstart in the startup scripts; that's what we have autodetection for. I'm also not quite seeing the benefit of having raidstop in the shutdown scripts.

Here are my proposed changes to checkfs and halt.sh. These can now handle auto-started raids, manually started raids, and combinations of the two.

=== begin halt.sh ===
--- /etc/init.d/halt.sh	Mon Aug 26 16:58:22 2002
+++ /tmp/halt.sh	Thu Sep 5 08:58:11 2002
@@ -85,10 +85,13 @@
 eend 0

 #stop RAID
-if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
+if [ -x /sbin/raidstop -a -f /etc/raidtab -a -f /proc/mdstat ]
 then
 	ebegin "Stopping software RAID"
-	raidstop -a
+	for a in $(grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print $1}')
+	do
+		raidstop /dev/$a
+	done
 	eend $? "Failed to stop software RAID"
 fi
=== end halt.sh ===

=== begin checkfs ===
--- /etc/init.d/checkfs	Mon Aug 26 16:58:22 2002
+++ /tmp/checkfs	Thu Sep 5 09:34:58 2002
@@ -28,8 +28,16 @@
 if [ -x /sbin/raidstart -a -f /etc/raidtab -a -f /proc/mdstat ]
 then
 	ebegin "Starting software RAID"
-	# still echo stderr for debugging
-	/sbin/raidstart --all >/dev/null
+	ACTIVE_RAID=`grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print $1}'`
+	if [ ! -z $ACTIVE_RAID ];
+	then
+		for a in $(grep -E "raiddev /dev/md[0-9]+" /etc/raidtab | awk '{print $2}' | grep -v "`grep -E "md[0-9]+[[:space:]]?: active raid" /proc/mdstat | awk -F ':' '{print "/dev/"$1}' | sed s'/[[:space:]]//g'`")
+		do
+			/sbin/raidstart $a
+		done
+	else
+		/sbin/raidstart --all
+	fi
 	eend $? "Failed to start software RAID"
 fi
=== end checkfs ===

Thoughts, comments?

Aliz .. tested this? Also, what is the feeling from the other guys who are RAID clued-up?
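The active-array detection in these diffs can be exercised on its own. This sketch runs the same grep/awk pipeline against an inline /proc/mdstat sample (the live file would be used in the real script); the device names in the sample are illustrative.

```shell
#!/bin/sh
# Inline sample of /proc/mdstat with two active arrays.
mdstat='Personalities : [raid0]
read_ahead 1024 sectors
md0 : active raid0 sdb1[1] sdc1[0]
      15952256 blocks 32k chunks
md2 : active raid0 sdd2[1] sde2[0]
      1959680 blocks 32k chunks
unused devices: <none>'

# Same pattern as the checkfs diff: take the md name from each
# "mdN : active raid..." line and strip stray whitespace.
active=$(printf '%s\n' "$mdstat" \
    | grep -E "md[0-9]+[[:space:]]?: active raid" \
    | awk -F ':' '{print $1}' \
    | sed 's/[[:space:]]//g')

for a in $active ; do
    echo "already active: /dev/$a"
done
```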
Like I said, I do not use it myself.

That shell code for checkfs worries the heck out of me. It's not that complicated, but it adds too much 'smarts' to the system startup. I'm especially against the fact that it's included in a generic file called "checkfs". It should be in a separate script, e.g. "raidfs", not enabled by default.

I'm certainly worried about fragile 'smarts' in my startup scripts. For me, because I use autodetection (which is highly recommended and simple to activate), this introduction of raid-knowledgeable stuff into the startup/shutdown scripts has caused nothing but trouble. I've had to remove checkfs from the boot scripts as a result. Are there really that many people not using autodetection that this stuff should be in checkfs at all?

Aliz?

it has actually made my life easier ... i was always adding 'raidstart' commands to checkfs so that both my raids would come up and be checked properly. but i do agree that a separate raid script would be better (disabled by default), with checkfs having a 'use raid' line ...

I can confirm that both scripts are working as expected, both on a system with autostarting raid and one without. I don't have any real opinion on whether or not to separate raid (and maybe lvm) from checkfs. My suggestion would be to include the fixes in checkfs and halt.sh; if it's still broken we could move them to a different script. Also, if defining "use raid" in checkfs, wouldn't it run raid even if raid wasn't added to the boot runlevel?

*** Bug 7405 has been marked as a duplicate of this bug. ***

The problem I have with a separate raid script is the following: will shutting down the raid any place other than halt.sh be correct? Won't it then be too soon? NB: Check the Mandrake/Redhat raid stuff in their rc.sysinit ... much more complicated, and also in *the* main rc script. I am guessing that otherwise controlling when shutdown happens (usually at the very end, I would assume), and to some extent startup, is difficult.
Martin, the checkfs script stopped working for me all of a sudden. I tracked the error down to the `-n` test in the if. Changing "-n" to "! -z" made the script start my raid again.

Yes, im an idiot. Please try the one in cvs.

*** Bug 7954 has been marked as a duplicate of this bug. ***

*** Bug 8052 has been marked as a duplicate of this bug. ***

I get this error when I boot now:

    Starting Software RAID: /sbin/runscript.sh [: too many arguments

It makes it to `raidstart --all`, because after this I get "file exists" for all 3 raid devices, but for some reason the grep -E from the script is failing. I have re-emerged baselayout to make sure that my file was right, and it is. There are no extra "[" in the file. The only place a "[" comes in is from line 1 of /proc/mdstat. Could that have anything to do with it? The script runs fine when I replace ewarn etc. with echo and run it through /bin/sh.

-Greg

Martin, no, the update didn't fix the problem. It seems you must use "! -z" instead of "-n".

Greg, can you post or mail your /etc/raidtab and /proc/mdstat? Also, do you have any autostarting raids?

All of them are autostarting, so I really do not need the script; I'm just pointing out that it is broken on my machine. Running the script from a Bourne shell works fine though.
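One plausible explanation for the "-n" vs "! -z" confusion, sketched below: the two tests are equivalent, but with the variable left unquoted and holding several device names, `[` receives too many arguments and the test errors out. Quoting the expansion is the actual fix. The variable name and contents here are illustrative, matching the multi-array case reported in this bug.

```shell
#!/bin/sh
# Multi-word value, as when several raids are active in /proc/mdstat.
ACTIVE_RAID="md0 md1 md2"

# Unquoted: the test sees extra words -> "[: too many arguments"
# (exit status 2), so the if takes the else branch.
if [ ! -z $ACTIVE_RAID ] 2>/dev/null ; then
    unquoted=ok
else
    unquoted=failed
fi

# Quoted: the whole string is one argument, so -n (or ! -z)
# behaves correctly.
if [ -n "$ACTIVE_RAID" ] ; then
    quoted=ok
else
    quoted=failed
fi

echo "unquoted=$unquoted quoted=$quoted"
# prints: unquoted=failed quoted=ok
```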
-Greg

----- /etc/raidtab -----

##### /USR - (Raid 0)
raiddev /dev/md0
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdb1
raid-disk 0
device /dev/sdc1
raid-disk 1

##### /HOME - (Raid 0)
raiddev /dev/md1
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdd1
raid-disk 0
device /dev/sde1
raid-disk 1

##### /VAR - (Raid 0)
raiddev /dev/md2
raid-level 0
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdd2
raid-disk 0
device /dev/sde2
raid-disk 1

----- /proc/mdstat -----

Personalities : [raid0]
read_ahead 1024 sectors
md0 : active raid0 scsi/host0/bus0/target2/lun0/part1[1] scsi/host0/bus0/target1/lun0/part1[0]
      15952256 blocks 32k chunks
md1 : active raid0 scsi/host0/bus0/target4/lun0/part1[1] scsi/host0/bus0/target3/lun0/part1[0]
      6522112 blocks 32k chunks
md2 : active raid0 scsi/host0/bus0/target4/lun0/part2[1] scsi/host0/bus0/target3/lun0/part2[0]
      1959680 blocks 32k chunks
unused devices: <none>

That doesnt really make sense: '! -z' is the same as '-n'. But whatever. Aliz, please just fix it and update the ChangeLog. Just make sure whatever you commit works on all kinds of setups, and try to keep it tidy. Thanks.

Committed a fix for my problem and hopefully Greg's problem. Greg, please try replacing your checkfs script with http://www.gentoo.org/cgi-bin/viewcvs.cgi/gentoo-src/rc-scripts/init.d/checkfs?rev=1.16&content-type=text/vnd.viewcvs-markup

this doesnt make sense but ive seen it happen before ...

*** Bug 8679 has been marked as a duplicate of this bug. ***