When stopping the netmount script on shutdown, CIFS shares won't unmount if the remote server is dead. I've added "umount -atl" as a fallback so the unmount goes through, but the script then hangs on fuser.

--- a/etc/init.d/netmount
+++ b/etc/init.d/netmount
@@ -80,7 +80,7 @@ stop()
         fs="$fs${fs:+,}$x"
     done
     if [ -n "$fs" ]; then
-        umount -at $fs || eerror "Failed to simply unmount filesystems"
+        umount -at $fs || umount -atl $fs || eerror "Failed to simply unmount filesystems"
     fi
     eindent

I've also added a timeout that kills fuser after 60 seconds.

+++ /lib/rc/sh/rc-mount.sh 2013-02-04 15:06:45.761257432 +0100
@@ -41,6 +41,17 @@
 retry=4 # Effectively TERM, sleep 1, TERM, sleep 1, KILL, sleep 1
 while ! LC_ALL=C $cmd "$mnt" 2>/dev/null; do
     if type fuser >/dev/null 2>&1; then
+        timeout=60
+        while true;do
+            sleep 3s;
+            if [ "$timeout" -le 0 ];then
+                pid_of_user="`ps -A -o pid,comm,args|grep "fuser $f_opts "$mnt""|awk '$2 !~ /grep/ {print $1}'`"
+                [ -n "$pid_of_user" ] && kill -KILL "$pid_of_user"
+                break
+            fi
+            let timeout-=3
+        done &
+        [[ $SPAMD_OPTS =~ \-u( |)([^\ ]*) ]] && USER=${BASH_REMATCH[2]}
         pids="$(fuser $f_opts "$mnt" 2>/dev/null)"
     fi
     case " $pids " in

The point is that the server must shut down no matter what: when a power outage occurs, all servers should shut down regardless of whether the remote server is still running or has already shut down.

Reproducible: Always

Steps to Reproduce:
1. mount a remote CIFS share
2. edit files in the mounted share, e.g. with vim
3. patch netmount with the "umount -atl" change above so it no longer hangs on umount
4. /etc/init.d/netmount stop --debug

Expected Results:
The service stops and the shutdown continues.
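The watchdog in the rc-mount.sh patch above finds fuser by grepping ps output, which can miss it or match an unrelated process. Below is a minimal sketch of the same idea that records the PID directly instead. It assumes a POSIX sh; run_bounded is a hypothetical helper name and is not part of OpenRC.

```shell
#!/bin/sh
# run_bounded LIMIT CMD [ARGS...]
# Hypothetical helper: run CMD in the background and send it SIGKILL
# if it is still alive after LIMIT seconds. Returns CMD's wait status.
run_bounded() {
    limit=$1
    shift

    # Start the real command and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleeps, then kills the command by PID (no ps/grep needed).
    # Its output is discarded so it never holds a caller's pipe open.
    (
        sleep "$limit"
        kill -KILL "$cmd_pid" 2>/dev/null
    ) >/dev/null 2>&1 &
    watchdog_pid=$!

    # Wait for the command; the status is 137 (128 + 9) if it was killed.
    wait "$cmd_pid"
    status=$?

    # The command is done, so the watchdog is no longer needed.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null

    return "$status"
}
```

In the patched loop this could be called as pids="$(run_bounded 60 fuser $f_opts "$mnt" 2>/dev/null)", using the variable names from the diff above.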
Sorry, this line obviously went into the patch by accident:

[[ $SPAMD_OPTS =~ \-u( |)([^\ ]*) ]] && USER=${BASH_REMATCH[2]}
Please attach the patches so that they can be downloaded. Also verify that they apply, so that they don't have to be manually corrected.
Created attachment 337998
lazy umount remote filesystems

add umount -atl to /etc/init.d/netmount
Created attachment 338000
add timeout to fuser command

timeout to fuser command
Created attachment 338002
add timeout to fuser command binary "timeout" from coreutils

(In reply to comment #4)
> Created attachment 338000
> add timeout to fuser command
>
> timeout to fuser command

This is an alternative to the patch mentioned in comment #4. It uses timeout from coreutils to implement the fuser timeout. This is a much simpler solution, but it depends on coreutils.
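For readers without the attachment at hand, the coreutils approach can be sketched roughly as follows. This is an illustration, not the attached patch. with_deadline is a hypothetical wrapper name, and the sketch assumes GNU coreutils timeout, whose -k option takes a grace period after which SIGKILL is sent.

```shell
#!/bin/sh
# with_deadline LIMIT GRACE CMD [ARGS...]
# Hypothetical wrapper around coreutils timeout: send SIGTERM to CMD
# after LIMIT seconds, then SIGKILL if it is still alive GRACE seconds later.
# timeout exits with status 124 when the command timed out after SIGTERM.
with_deadline() {
    limit=$1
    grace=$2
    shift 2
    timeout -k "$grace" "$limit" "$@"
}
```

Applied to the fuser call discussed in this bug, that would look like pids="$(with_deadline 60 5 fuser $f_opts "$mnt" 2>/dev/null)". The 60 and 5 second values are only examples.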
I was just informed that we should encourage users to add "nofail" to the mount options in fstab for network file systems. If you do this, how does that affect this bug?
(In reply to comment #6)
> I was just informed that we should encourage users to add "nofail" to
> the mount options in fstab for network file systems.
>
> If you do this, how does that affect this bug?

The mounted share is a CIFS share. I mounted it with the nofail option, then tested again -- same problem.
(In reply to comment #7)
> (In reply to comment #6)
> > I was just informed that we should encourage users to add "nofail" to
> > the mount options in fstab for network file systems.
> >
> > If you do this, how does that affect this bug?
>
> The share mounted is a cifs share. I mounted it with the nofail option then
> tested it again -- same problem.

Sorry, let me rephrase the question.

If you mount all of your network file systems with the nofail option and remove the lazy unmount option you added to netmount, the netmount script should terminate successfully regardless of the status of the remote host. Does this happen?
(In reply to comment #8)
> Sorry, let me rephrase the question.
>
> If you mount all of your network file systems with the nofail option and
> remove the lazy unmount option you added to netmount, the netmount script
> should terminate successfully regardless of the status of the remote host.
> Does this happen?

It will terminate with or without nofail. But if there are open files on the share, fuser tries to terminate the processes owning those files and gets stuck, so the script never finishes.
Created attachment 338070
add timeout to fuser command binary "timeout" from coreutils -- fixed

fixed missing value for -k
I spoke to Mike Frysinger, our base system lead, and he seems to think the cleanest solution would be to add timeout functionality to fuser, and I agree with him, so I am assigning this to base-system.
I will, however, add a patch to OpenRC that is similar to the one above but allows the user to configure the length of the timeout.
I have added a patch in commit 6794441 of OpenRC to handle this temporarily. However, the real fix should go in fuser; maybe adding some kind of timeout capability so we don't have to use an external program to time it out.
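As a rough illustration of what a user-configurable timeout could look like (this is not the code from the commit mentioned above), the limit can be read from a variable with a default. FUSER_TIMEOUT and FUSER_CMD are hypothetical names chosen for this sketch. FUSER_CMD exists only so the command can be swapped out for testing.

```shell
#!/bin/sh
# fuser_with_timeout ARGS...
# Hypothetical helper: run fuser under coreutils timeout, with the limit
# taken from FUSER_TIMEOUT (seconds, defaults to 60 when unset or empty).
# FUSER_CMD is an illustrative override and defaults to "fuser -m".
fuser_with_timeout() {
    # SIGTERM after the configured limit, SIGKILL 10 seconds later.
    # FUSER_CMD is left unquoted on purpose so "fuser -m" splits into words.
    timeout -k 10 "${FUSER_TIMEOUT:-60}" ${FUSER_CMD:-fuser -m} "$@"
}
```

An administrator could then set, for example, FUSER_TIMEOUT=20 in the relevant configuration file. The real fix proposed here, a timeout built into fuser itself, would make the external timeout program unnecessary.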