Bug 19007 - /etc/init.d/halt.sh fails on certain volume group names and tries to umount /.
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High critical
Assignee: Martin Schlemmer (RETIRED)
Reported: 2003-04-08 17:48 UTC by Stefan Förster
Modified: 2003-04-12 01:07 UTC

Description Stefan Förster 2003-04-08 17:48:09 UTC
/etc/init.d/halt.sh fails on volume groups named "root"$SOMETHING, e.g. rootvg (a common name for VGs, e.g. under AIX).
Furthermore, /etc/init.d/halt.sh tries to umount /, although it explicitly tries to avoid that. The offending line is:

for x in $(awk '!/(^#|proc|devfs|tmpfs|^none|^\/dev\/root|[[:space:]]\/[[:space:]])/ {print $2}' /proc/mounts |sort -r)

Perhaps it would be better written as:
for x in $(awk '!/(^#|proc|devfs|tmpfs|^none|[[:space:]]\/[[:space:]])/  && ($2 != "/") {print $2}' /proc/mounts |sort -r)
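
The effect of the original filter can be reproduced without rebooting. The following is only a sketch: the sample lines imitate the /proc/mounts format, and the function name original_filter and the volume names are made up for illustration. Because the pattern contains ^\/dev\/root without a trailing anchor, the line for /dev/rootvg/home is dropped together with the real root device, so only /srv is printed.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk filter quoted from halt.sh above.
# It prints the mount points that halt.sh would try to unmount.
original_filter() {
    awk '!/(^#|proc|devfs|tmpfs|^none|^\/dev\/root|[[:space:]]\/[[:space:]])/ {print $2}'
}

# Made-up sample input in /proc/mounts format (device, mount point, type, ...).
printf '%s\n' \
    '/dev/root / xfs rw 0 0' \
    '/dev/rootvg/home /home xfs rw 0 0' \
    '/dev/datavg/srv /srv xfs rw 0 0' |
original_filter
```

On a live system the same function could be fed with "original_filter < /proc/mounts" to see which mount points are silently skipped.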

Reproducible: Always
Steps to Reproduce:

1. Create a VG named "rootvg".
2. Create a logical volume within this VG.
3. Mount this volume.
4. Reboot.
Actual Results:  
The VG is not shut down properly.

Expected Results:  
Well, umount all LVs within the VG.

Portage 2.0.47-r10 (default-x86-1.4, gcc-3.2.2, glibc-2.3.1-r4)
=================================================================
System uname: 2.4.20-xfs i686 AMD Athlon(TM) XP2200+
GENTOO_MIRRORS="http://www.fhh.opensource-mirror.de/gentoo.org/ http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://gentoo.oregonstate.edu/ http://www.ibiblio.org/pub/Linux/distributions/gentoo"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
PORTDIR="/usr/portage"
DISTDIR="/usr/portage/distfiles"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR_OVERLAY=""
USE="slang -3dfx 3dnow -aalib acl acpi -afs alsa -apache2 -apm arts -atlas avi berkdb -bonobo -canna cdr -cjk crypt cups -directfb -dga -doc -dvb dvd encode -esd ethereal -evo fbcon flash -freewnn -gb gd gdbm -ggi gif -gnome gphoto2 gpm -gps -gtk -guile -icc imap imlib innodb -ipv6 java jikes jpeg -junit kde -kerberos lcms -ldap -leim -libg++ -libgda libwww lirc -matrix -maildir mbox mikmod mmx motif mozilla mpeg -mule mysql -nas ncurses nls nocardbus -oav -oci8 odbc oggvorbis opengl oss pam -pcmcia pda pdflib perl pic plotutils png -pnp -postgres python qt quicktime readline -ruby samba sasl scanner sdl -slp snmp -socks5 spell sse ssl -static -svga tcltk tcpd tetex tiff truetype -trusted usb -voodoo3 -wavelan wmf X xface xml xml2 xmms xv zlib x86 xft"
COMPILER="gcc3"
CHOST="i686-pc-linux-gnu"
CFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer"
CXXFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer"
ACCEPT_KEYWORDS="x86"
MAKEOPTS="-j2"
AUTOCLEAN="yes"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage/"
FEATURES="sandbox ccache"
Comment 1 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-09 00:32:40 UTC
Do your changes work?
Comment 2 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-09 02:37:07 UTC
Actually, I did fix this already.  Could you try baselayout-1.8.6.5, or
change that bit to:

-----------------------------------------------------------
# Try to unmount all filesystems (no /proc,tmpfs,devfs,etc).
# This is needed to make sure we dont have a mounted filesystem
# on a LVM volume when shutting LVM down ...
ebegin "Unmounting filesystems"
for x in `mount | awk '{ if (($5 !~ /(proc|sysfs|devfs|tmpfs)/) &&
                             ($1 !~ /^rootfs|^\/dev\/root/) &&
                             ($3 != "/"))
                           print $3
                       }' | sort -r`
do
        umount -f -r ${x} &> /dev/null
done
eend 0
---------------------------------------------------------------

And let me know if it works ?

Thanks.
Comment 3 Stefan Förster 2003-04-09 05:12:01 UTC
This cannot work either (at least not with a volume group named root$SOMETHING), because

...($1 !~ /^rootfs|^\/dev\/root/)...

matches rootvg. As I said, two of my colleagues and I ran into this - we are simply used to calling our first volume group "rootvg", as AIX does.

Furthermore, I think it is a very bad idea to rely on the output of mount - /etc/mtab is not necessarily up to date all of the time, think about chrooted mounts.

I did _not_ try the new baselayout yet, because I thought the awk-statement you told me was an excerpt from that. If the new baselayout contains _another_ awk-statement, please let me know.

But with that new statement, it doesn't try to umount / too early, so one thing is gone.


Comment 4 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-09 10:15:13 UTC
It does not match it.  If it matches, it will not be in the list to unmount:

----------------------------------------------------------------------------------
workshop root # echo rootvg | awk '{ if ($1 !~ /(^rootfs|^\/dev\/root)/) print }'
rootvg
workshop root # echo rootfs | awk '{ if ($1 !~ /(^rootfs|^\/dev\/root)/) print }'
workshop root # 
----------------------------------------------------------------------------------

Also, it checks whether $3 is "/", and does not add it to the umount list if so.

The reason I do not use /proc/mounts is that it does not have the correct
filesystem type for 'bind' mounts, so in a case like this:

-----------------------------------------------------------------------
workshop root # mount --bind /dev/ /tmp/
workshop root # mount | grep tmp  
none on /dev/shm type tmpfs (rw)
/dev on /tmp type none (rw,bind)
workshop root # cat /proc/mounts | grep tmp
none /dev/shm tmpfs rw 0 0
none /tmp devfs rw 0 0
workshop root # 
-----------------------------------------------------------------------

/tmp does not get unmounted ...

Could you at least try the bit in comment #2 ?
Comment 5 Stefan Förster 2003-04-09 11:13:22 UTC
Well, OK, I tried that bit - as expected, it did _not_ work: devices on /dev/rootvg didn't get umounted. You chose a bad example to demonstrate; of course it is "/dev/rootvg" that is confused with "/dev/root", not just "rootvg". See also:

foerstes@abyss[~]$ echo /dev/rootvg | awk '{ if ($1 !~ /(^rootfs|^\/dev\/root)/) print }'
foerstes@abyss[~]$ 

Well, why loop over these umounts at all?

[tmpfs-stuff, devfs]
umount -ttmpfs -a -r
[vg-stuff]
umount -tnoproc -a -r -v
Comment 6 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-09 15:53:34 UTC
Ah, ok, missed that, sorry =)  The below should fix it; please try it and
let me know.

---------------------------------------------------------------------
ebegin "Unmounting filesystems"
for x in `mount | awk '{ if (($5 !~ /^(proc|sysfs|devfs|tmpfs)$/) &&
                             ($1 !~ /^(rootfs|\/dev\/root)$/) &&
                             ($3 != "/"))
                           print $3
                       }' | sort -r`
do
        umount -f -r ${x} &> /dev/null
done
----------------------------------------------------------------------
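
The anchored device pattern can be checked in isolation. This is a minimal sketch with made-up device names; the function name filter_devices is hypothetical and only wraps the second condition of the snippet above. With the trailing $ in place, only an exact "rootfs" or "/dev/root" is excluded, so devices under /dev/rootvg stay in the list.

```shell
#!/bin/sh
# Hypothetical helper: print only the device names that the anchored
# pattern does NOT exclude, i.e. those that remain candidates for umount.
filter_devices() {
    awk '{ if ($1 !~ /^(rootfs|\/dev\/root)$/) print $1 }'
}

# Made-up sample device names, one per line.
printf '%s\n' rootfs /dev/root /dev/rootvg/lv0 rootvg | filter_devices
```

The two excluded names are exactly "rootfs" and "/dev/root"; "/dev/rootvg/lv0" and "rootvg" are printed.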
Comment 7 Stefan Förster 2003-04-10 03:15:43 UTC
Yes, this works as it is supposed to. This means that the new baselayout package will fix all the bugs of the initial bug report (although I will slightly change these lines for private use).

Since using the output of "mount" leaves mounts that are not recorded in /etc/mtab (e.g. mounted in a chroot, / was mounted ro, ...) without being umounted, I wonder if I should open a new bug or if this issue could be included here.
Comment 8 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-10 13:30:10 UTC
Like I said, using /proc/mounts is a problem.  You cannot detect --bind
mounts, so things that should be unmounted are not, because they are seen as /, etc.

Is this still LVM-related?  Can't we rather use /proc/lvm to check which volumes
are still active, then check what is mounted on them, and try to unmount
those?
Comment 9 Stefan Förster 2003-04-10 19:50:07 UTC
Summary:
1. You can not trust /etc/mtab (chroot, mount -n, / was mounted ro).
2. /proc/mounts cannot be trusted because the fstype is displayed wrongly when using --bind.
3. Loopback-Mounts can also be --bind'ed.
4. You can not umount /dev when it is on devfs.

Would this be solved by:
a) Send all processes the term and then the kill signal.
b) Execute "umount -a -O bind".
c) Go through /proc/mounts in reverse order. For every mount which is not /, /dev or /proc: umount /mountpoint, not /device (yes, umount will work with mountpoints not recorded in /etc/mtab).
--> We should be left with an active LVM, / and perhaps /dev (as devfs) mounted.
d) Shut lvm down.
e) Remount / readonly.
[RAID-stuff, UPS]

This solves 1.) by using /proc/mounts. It solves 2.) because we umount "--bind"ed mounts regardless of the filesystem type as long as they are not mounted on / or /dev (and we do not get fooled, because we use the mountpoint, not the device, i.e. we won't try to "umount /" if somebody did a "mount --bind / /foo"). 3.) is solved the same way as 2.) and is further simplified because we call killall5 earlier, which also resolves the issue with 4.).
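
Step c) could look roughly like the following. This is only a sketch under assumptions, not the halt.sh code: the function name list_unmount_targets and the sample mount table are made up, it reverses the order in which mounts are listed instead of using "sort -r", and it prints the umount commands instead of running them. Mount points containing whitespace are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read lines in /proc/mounts format on stdin and print
# the mount points to unmount, last-mounted first. /, /dev and /proc are
# skipped by mount point (field 2), not by device or filesystem type.
list_unmount_targets() {
    awk '$2 != "/" && $2 != "/dev" && $2 != "/proc" { mp[n++] = $2 }
         END { for (i = n - 1; i >= 0; i--) print mp[i] }'
}

# Dry run on a made-up mount table: print the commands, do not execute them.
printf '%s\n' \
    'rootfs / rootfs rw 0 0' \
    '/dev/root / xfs rw 0 0' \
    'none /proc proc rw 0 0' \
    'none /dev devfs rw 0 0' \
    '/dev/rootvg/home /home xfs rw 0 0' \
    '/dev/rootvg/www /home/www xfs rw 0 0' |
list_unmount_targets |
while read -r mp; do
    echo "umount $mp"
done
```

On a real system the function would read "< /proc/mounts" instead of the sample lines, and /mnt/.init.d might have to be added to the skip list.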

Why were you explicitly excluding _all_ mounts of type tmpfs, devfs and procfs?

But this is no longer LVM-related; it's more like a complete rewrite of /etc/init.d/halt.sh. If you want me to write this code, just say so. I have no problem with exchanging some scripts on my system, but I won't emerge complete packages which are masked/unstable, sorry.
Comment 10 Stefan Förster 2003-04-10 20:03:40 UTC
Umm, well, I meant:
[...]
4. You can not umount /dev when it is on devfs, but you can umount every devfs except that one from the shutdown script if it is not referenced any longer.
[...]
Comment 11 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-11 14:57:20 UTC
1) "umount -a -O bind" will still fail if it was mounted in a chroot, as
   /proc/mounts does not record that info.

2)  halt.sh does try very early (before deactivating swap) to unmount all
    tmpfs mounts not in use.  Thus, because we use tmpfs for /mnt/.init.d,
    we do not want to force unmount of tmpfs mounts (and the Adelie server
    stuff needs it to be present until the very end).

What you propose is still not 100% fail-safe.  A better approach would
be to check /etc/mtab for critical stuff, and then go through /proc/mounts
and, if a mount point does not match a critical mount, try to unmount it:

------------------------------------------------------------------
ebegin "Unmounting filesystems"
no_unmount="`mount | awk '{ if (($5 ~ /^(proc|sysfs|devfs|tmpfs)$/) ||
                             ($1 ~ /^(rootfs|\/dev\/root|none)$/) ||
                             ($3 == "/"))
                           print $3
                       }' | uniq`"
for x in `awk '{ print $2 }' < /proc/mounts | sort -r`
do
    do_unmount="yes"

    for y in ${no_unmount}
    do
        [ "${x}" = "${y}" ] && do_unmount="no"
    done

    [ "${do_unmount}" = "yes" ] && umount -f -r ${x} &> /dev/null
done
eend 0
------------------------------------------------------------------
Comment 12 Stefan Förster 2003-04-11 15:49:33 UTC
To your point 1.), "umount -a -O bind" might fail: it doesn't matter whether it is successful, you don't need it at all.

However, what you propose is more or less what I said, so this should work reasonably well (although I still think you can ignore any mountpoints except /, /dev and perhaps /mnt/.init.d; I've tested this approach today with a wide variety of mounts and it works perfectly well for me). It's a "close this bug" as far as I'm concerned.
Comment 13 Martin Schlemmer (RETIRED) gentoo-dev 2003-04-12 01:07:45 UTC
Ok, thanks for all the input to get this done.

BTW: the 'unstable' baselayout is already well tested, there are only a few
     issues left ... if you want, have a look at bug #14946 .... the change
     in hostname breaks a few things like apache2, etc.  Some input on how
     it could be fixed would be nice ... also have a look at bug #18158 and
     bug #18801.