Bug 182818 - 2007.0 livecd cannot boot from many (eg MD or EVMS) disk partitions
Summary: 2007.0 livecd cannot boot from many (eg MD or EVMS) disk partitions
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: genkernel
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Genkernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-06-21 20:54 UTC by Ed Wildgoose
Modified: 2008-01-12 00:51 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
Use REAL_ROOT to select the block device containing the livecd identifier and loop file for CDROOT boot from disk or flash media (initrd_cdbootstrap.patch,413 bytes, patch)
2007-12-28 03:43 UTC, Peter Carmichael

Description Ed Wildgoose 2007-06-21 20:54:41 UTC
I have a Gentoo-based hosted server with serial access.  I find it very useful to put an unpacked minimal boot CD on a spare partition so that I can use it to rescue the system if I damage it.

Right now there is a small "bug" in the standard initrd distributed with 2007.0.  The chain of events is as follows:

- It only tests for install media in /dev/hdX and /dev/sdX (ok and some USB drives, etc)
- Doesn't test any /dev/mdX or /dev/evms/xx partitions
- Can't work around it using cdroot=/dev/mdX because this forces the script to mount the partition as iso9660 (which obviously fails unless it is a real CD)

I fixed this by making a small change to the initrd.scripts file.  Changes are documented at the bottom of here:
http://gentoo-wiki.com/HOWTO_LiveCD_on_disk

Patch is as follows:
--- initrd.scripts.old      2007-06-21 20:25:25.000000000 +0100
+++ initrd/etc/initrd.scripts   2007-06-21 20:55:44.000000000 +0100
@@ -77,7 +77,7 @@
                                        mount -r -t auto ${x} ${NEW_ROOT}/mnt/cdrom \
                                                > /dev/null 2>&1
                                else
-                                       mount -r -t iso9660 ${x} ${NEW_ROOT}/mnt/cdrom \
+                                       mount -r -t auto ${x} ${NEW_ROOT}/mnt/cdrom \
                                                > /dev/null 2>&1
                                fi
                                if [ "$?" = '0' ]
@@ -362,6 +362,8 @@
        DEVICES="$DEVICES /dev/ubd* /dev/ubd/*"
        # iSeries devices
        DEVICES="$DEVICES /dev/iseries/vcd*"
+       # MD and EVMS devices (eg Booting from HD partition)
+       DEVICES="$DEVICES /dev/evms/md/md* /dev/evms/sd* /dev/evms/hd* /dev/md*"
        # The device was specified on the command line.  Shold we even be doing a
        # scan at this point?  I think not.
        [ -n "${CDROOT_DEV}" ] && DEVICES="$DEVICES ${CDROOT_DEV}"
----------

I think it's pretty self-explanatory even if it gets slightly mangled pasting here.


Any comments on this?  Any reason not to incorporate this in future genkernel/livecd initrds?

Having made either of these small changes I can then boot my server in rescue mode and even access any MD, LVM or EVMS drives.  


Reproducible: Always

Steps to Reproduce:
Comment 1 Chris Gianelloni (RETIRED) gentoo-dev 2007-06-21 21:04:03 UTC
Yes, we cannot use -t auto everywhere, for a reason.  It causes lockups under some circumstances which are much more common than someone trying to boot the LiveCD media on a non-CD, which isn't supported by Release Engineering, at all.  Your solution is therefore mostly correct, but we'll need to figure out something else to do than setting auto on the second mount.  Needless to say, it isn't a "bug" so much as "done on purpose to keep systems from locking up at boot which is much less desirable than not being able to boot the minimal image from a partition"... ;]
Comment 2 Ed Wildgoose 2007-06-21 23:16:49 UTC
I hear what you are saying, but notice that it *only* stops using -t auto *if* you specifically name a device to use.  For normal boots it's still using -t auto...

I kind of tackled the issue on two fronts here.  The second part of the patch simply adds a number of less common locations to search for partitions to boot - they are also right at the end of the list of options so we should find the USB sticks, etc right at the beginning anyway.  

Perhaps some heuristics are in order here?  Or perhaps try each mount twice, once with -t auto and later with iso9660? (or vice versa)
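
The "try each mount twice" idea above could be sketched roughly as follows. This is only an illustration, not code from genkernel: try_mount_media and MOUNT_CMD are hypothetical names, and MOUNT_CMD exists purely so the function can be dry-run without root. Note that the fallback still ends up calling mount with -t auto, so it would not by itself avoid the lockups mentioned in comment 1.

```shell
#!/bin/sh
# Hypothetical helper (not part of genkernel): try the strict iso9660 type
# first, then fall back to filesystem autodetection.
# MOUNT_CMD is an assumption that lets the mount command be swapped out.
MOUNT_CMD="${MOUNT_CMD:-mount}"

try_mount_media() {
	dev="$1"
	mnt="$2"
	for fstype in iso9660 auto
	do
		# Read-only mount, output discarded as in the original script
		if ${MOUNT_CMD} -r -t "${fstype}" "${dev}" "${mnt}" > /dev/null 2>&1
		then
			# Report which type succeeded
			echo "${fstype}"
			return 0
		fi
	done
	return 1
}
```

A caller would use something like try_mount_media /dev/md2 /newroot/mnt/cdrom (both paths are placeholders) and treat a non-zero return as "media not found on this device".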

Is the minimal CD different to the normal cd then? Having a CD which can be used for rescue boots is quite a common use for gentoo livecds.  Would be nice to try and support some more corner cases?
Comment 3 Peter Carmichael 2007-12-28 00:13:39 UTC
Can you add REAL_ROOT to the start of the device search list in initrd.scripts:bootstrapCD?

This would allow the normal search sequence to be overridden with a known intended location.
Comment 4 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 01:42:35 UTC
We could do that, but we'd also still need -t auto. I was thinking about this, and perhaps we can just add an extra kernel commandline option (maybe domountauto) that switches the iso9660 to auto for the small number of cases where it's needed. Then, you could just add 'cdroot=/dev/md2 domountauto' to the kernel commandline to get the intended effect. Does this work for everyone?
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2007-12-28 02:01:31 UTC
+1 from me.
Comment 6 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 02:47:03 UTC
I've just committed a variation on my idea to SVN. There is a new kernel commandline parameter supported by the linuxrc: cdroot_type=foo. This defaults to iso9660 but can be overridden with something like "cdroot_type=auto", "cdroot_type=vfat", or even "cdroot_type=vfat,udf,ext2,foo,bar". With the next gk release, you'll be able to use "cdroot=/dev/evms/foo cdroot_type=auto" to accomplish what you want to do.
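
For illustration, the command-line handling described in this comment might look something like the sketch below. It is not the actual linuxrc code: parse_cdroot_opts is a hypothetical function name, and only the parameter names (cdroot, cdroot_type) and the iso9660 default are taken from the thread.

```shell
#!/bin/sh
# Sketch only: parse_cdroot_opts is a hypothetical name, not the genkernel linuxrc.
parse_cdroot_opts() {
	CDROOT=0
	CDROOT_DEV=""
	CDROOT_TYPE="iso9660"	# default stated in this comment
	# $1 is deliberately unquoted so it splits into individual parameters
	for x in $1
	do
		case "${x}" in
			cdroot)        CDROOT=1 ;;
			cdroot=*)      CDROOT=1; CDROOT_DEV="${x#cdroot=}" ;;
			cdroot_type=*) CDROOT_TYPE="${x#cdroot_type=}" ;;
		esac
	done
}

# On a running system the argument would typically be "$(cat /proc/cmdline)".
parse_cdroot_opts "cdroot=/dev/evms/foo cdroot_type=auto"
echo "CDROOT=${CDROOT} CDROOT_DEV=${CDROOT_DEV} CDROOT_TYPE=${CDROOT_TYPE}"
```

The resulting CDROOT_TYPE value, including a comma-separated list such as vfat,udf,ext2, would presumably be handed straight to mount -t, which accepts a list of types to try.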
Comment 7 Peter Carmichael 2007-12-28 03:21:35 UTC
From the way it is currently in x86, the -t iso9660 is only invoked specifically if the 'cdroot=/dev/xxx' form is used, giving rise to the CDROOT_DEV variable holding a value.  It looks like most of the device searching functionality is operating on the basis of unintended consequences.

If CDROOT_DEV contains a value this gets appended as the last of the list of potential CD bootstrap devices.    When each of the other devices is tried, it is with -t iso9660 which is likely to fail, of course.  Finally the loop reaches the intended device (by virtue of it being appended to the list in bootstrapCD).  There is a comment in the code that suggests that this bit of code has been mangled.  It says:

# The device was specified on the command line.  Shold we even be doing a
# scan at this point?  I think not.

The code then goes on to append the value of CDROOT_DEV to DEVICES whereas the logic of the preceding comment is that DEVICES should just be set to contain the single value specified.

Is it really the intended behaviour for the script to try a whole bunch of devices that have not been specified prior to the one that has been specified in a manner which assures lots and lots of failed attempts at mounting non-iso9660 partitions as iso9660 partitions?

I've managed to get a workaround with the simplest of patches adding REAL_ROOT to the start of the device search list.  This sidesteps the existing (weird) CDROOT behaviour, without me having to delve into the reasons for it being the way it is.
Comment 8 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 03:34:59 UTC
You're mostly right. The -t iso9660 comes into play when you boot with 'cdroot', not 'cdroot=foo'. In most cases, the scanning makes sense. However, in the case of 'cdroot=foo', scanning does not make sense, as the comment you quoted suggests. I've made the following change in SVN:

- # The device was specified on the command line.  Shold we even be doing a
- # scan at this point?  I think not.
- [ -n "${CDROOT_DEV}" ] && DEVICES="$DEVICES ${CDROOT_DEV}"
+ # The device was specified on the command line, so there's no need to scan
+ # a bunch of extra devices
+ [ -n "${CDROOT_DEV}" ] && DEVICES="${CDROOT_DEV}"
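
To make the effect of that change concrete, here is a small sketch comparing the two behaviours. old_list, new_list and SCAN are hypothetical names for illustration; SCAN is a shortened stand-in for the real scan list built in bootstrapCD.

```shell
#!/bin/sh
# Sketch: SCAN is an abbreviated stand-in for the built-in device scan list.
SCAN="/dev/cdroms/* /dev/ide/cd/* /dev/sr* /dev/sd*"

# Before the change: the explicitly named device is appended last,
# so every scanned device is tried before it.
old_list() {
	DEVICES="${SCAN}"
	[ -n "$1" ] && DEVICES="${DEVICES} $1"
	echo "${DEVICES}"
}

# After the change: an explicitly named device replaces the scan list.
new_list() {
	DEVICES="${SCAN}"
	[ -n "$1" ] && DEVICES="$1"
	echo "${DEVICES}"
}

old_list /dev/md2
new_list /dev/md2
```

With no device given, both functions return the full scan list, so a plain "cdroot" boot is unaffected by the change.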
Comment 9 Peter Carmichael 2007-12-28 03:43:18 UTC
Created attachment 139479 [details, diff]
Use REAL_ROOT to select the block device containing the livecd identifier and loop file for CDROOT boot from disk or flash media

I have attached the aforementioned simple patch.
Comment 10 Peter Carmichael 2007-12-28 03:49:04 UTC
That fix of the search behaviour looks like what was originally intended by the writer of the comment.

cdroot=foo is the only case which invokes -t iso9660

cdroot on its own invokes -t auto

This is absolutely fine with me and messing with this would break pretty much every embedded type build.  By just adding any real_root=foo value at the start of the cdroot search string, you can get the behaviour that Ed was seeking in the first place.

--- generic/initrd.scripts	2007-12-28 03:34:25.000000000 +0000
+++ generic/initrd.scripts	2007-12-28 03:34:04.000000000 +0000
@@ -381,7 +381,7 @@
 bootstrapCD() {
 	# Locate the cdrom device with our media on it.
 	# CDROM DEVICES
-	DEVICES="/dev/cdroms/* /dev/ide/cd/* /dev/sr*"
+	DEVICES="${REAL_ROOT} /dev/cdroms/* /dev/ide/cd/* /dev/sr*"
 	# USB Keychain/Storage
 	DEVICES="$DEVICES /dev/sd*"
 	# IDE devices

Comment 11 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 03:50:57 UTC
Your patch doesn't really make sense. In order to use your patch, you'd have to
use "cdroot real_root=/dev/foo/bar" instead of just "cdroot=/dev/foo/bar". It's
also "more" of an abuse of an existing parameter than using cdroot=.
Comment 12 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 03:55:44 UTC
I still don't understand where you're coming from. The only time the -t iso9660 thing is used is in findmediamount() (which was findcdmount() in previous versions), which is only called from bootstrapCD(), which is only called when ${CDROOT} = '1', which is true with either "cdroot" OR "cdroot=foo".
Comment 13 Andrew Gaffney (RETIRED) gentoo-dev 2007-12-28 04:00:21 UTC
Okay, I lied. What I described is the current behavior in SVN, but I see what you're talking about going back a few revisions. However, I'm not sure this was intended behavior, because it really doesn't make sense.

With current SVN, if the old -t auto behavior is needed, you can still get it with 'cdroot_type=auto'.
Comment 14 Peter Carmichael 2007-12-28 04:29:11 UTC
Yes.  I was commenting on the x86 stable branch and 2007.0 livecd as specified by the originator.  Agreed that it doesn't make sense.

I'm a bit of an amateur at this bug-sleuthing thing, so I'm not geared up on SVN at all.

This area of functionality has been shifting around a fair bit and is for all intents undocumented.  From the hint you have given about what is coming with current SVN versions, this is all set to change again.  At least this time I will have a heads up.  This one has kept me up until 4 in the morning.

In genkernel-3.4.9_pre6 it is still current usage to have "cdroot real_root=/dev/nfs" for PXE booting of livecd images, so "cdroot real_root=/dev/foo" for hard disk hosted images has felt consistent.
Comment 15 Chris Gianelloni (RETIRED) gentoo-dev 2008-01-10 02:51:07 UTC
Well, for the most part, we don't care about *anything* in the portage tree, as far as code is concerned.  We're only interested in SVN.  If there's a bug in something in the tree, check SVN.  If the bug still exists, file a bug report (or comment on an existing one, if it's already open)... If the bug doesn't exist anymore, it'll be "automagically" fixed in the next genkernel version.  Basically, SVN is always "ahead" of what's in the tree and what's in the tree is static and unchanging.  Once a tarball has been rolled, that version is "done with" and we're on to the next one.  As such, we don't look back at things like "2007.0" as that is old software (in our eyes) and we look to the future.  When we look back at older releases, it's always to make sure we've resolved bugs or made things better.  We almost *never* look at the actual code and try to base anything on it, as it is likely SVN has changed that code already and we'd waste a lot of time (as you're starting to notice).

Thanks for all the bug-hunting.  It *really* is appreciated.  Don't take any of the above as anything more than informational.  That was my only intention.
Comment 16 Chris Gianelloni (RETIRED) gentoo-dev 2008-01-12 00:51:15 UTC
This should be resolved in genkernel 3.4.9, which will be hitting the tree shortly.  If this is still an issue with that version, please REOPEN this bug (or comment, if you cannot REOPEN it).