|Summary:||sys-apps/openrc: netmount hangs with dead remotes|
|Product:||Gentoo Hosted Projects||Reporter:||Gary <admin>|
|Component:||OpenRC||Assignee:||OpenRC Team <openrc>|
|Package list:||Runtime testing required:||---|
Description Gary 2011-07-18 18:33:33 UTC
During a parallel shutdown, netmount occasionally fails to unmount network shares. Everything then hangs for 50 seconds while the shutdown waits for netmount. The solution I was given on IRC is to add -timeout to the keyword line in /etc/init.d/netmount so that it reads:

keyword -jail -prefix -vserver -timeout

Reproducible: Sometimes

Steps to Reproduce:
1. Set parallel for OpenRC
2. Mount some network shares
3. Shutdown

Actual Results:
Netmount fails to umount the network share; localmount then waits 50 seconds for netmount.

Expected Results:
Netmount either unmounts the shares or gives up and lets the shutdown proceed anyway.

This can also happen if you lose the connection to an NFS share and then attempt to shut down.
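For readers who want to see where that keyword line would sit, here is a minimal sketch. Only the keyword line itself comes from the report. The `keyword` stub and the `netmount_keywords` variable are stand-ins invented so the fragment can run in a plain shell; in a real init script OpenRC supplies `keyword`, and the real depend() contains other lines that are not shown here. The exact effect of -timeout should be checked against the OpenRC documentation for the installed version.

```shell
# Stand-in for the keyword helper that OpenRC supplies to init scripts.
# Assumption: defined here only so the sketch runs outside OpenRC.
keyword() {
    netmount_keywords="$*"
}

depend() {
    # Keyword line as given in the report; the real depend() has more lines.
    keyword -jail -prefix -vserver -timeout
}

depend
echo "keywords: $netmount_keywords"
```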
Comment 2 Gary 2011-07-18 18:50:29 UTC
It can happen on Samba shares as well. Really, any network share will do this, not just NFS. On the occasions that a network filesystem does freeze, it would be nice not to sit there for 50 seconds waiting for netmount, then localmount, and so on. The problem also occurs when I know the network share was up and working seconds before shutdown, almost as if parallel stopped net.eth0 before netmount, which guarantees that netmount fails because it can no longer reach the remote server to unmount the share.
Comment 3 SpanKY 2011-07-18 18:54:47 UTC
the referenced bug isn't specific to nfs, but i see our netmount script doesn't use -l anyway. i wonder if we should.
Comment 4 William Hubbs 2011-07-20 00:28:18 UTC
It would be easy enough to add -l. Is this only an option for the Linux umount, or does it do the same thing on the BSDs?
Comment 5 SpanKY 2011-07-20 03:17:57 UTC
my reservation would be that a lazy unmount makes the mount point look unmounted to userspace immediately, so userspace could shut down before it has actually finished syncing
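To make the -l discussion concrete, here is a rough dry-run sketch of passing the lazy option only on Linux. The helper name `umount_opts_for` and the path /mnt/share are hypothetical and do not come from the netmount script. The sketch only prints the command it would run, and it does nothing about the syncing concern raised above.

```shell
# Hypothetical helper (not part of OpenRC): choose umount options per kernel.
# Assumption: -l (lazy) is passed only on Linux, where util-linux umount
# documents it; every other system gets a plain umount.
umount_opts_for() {
    case "$1" in
        Linux) echo "-l" ;;
        *)     echo "" ;;
    esac
}

# Dry run: print the command instead of executing it.
mountpoint="/mnt/share"   # example path, not taken from the report
opts=$(umount_opts_for "$(uname -s)")
echo "would run: umount ${opts:+$opts }$mountpoint"
```

Keeping the option choice in one small function leaves the question about the BSDs open: another case branch could be added once the right flag there is known.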
Comment 6 Viktor Avramov 2011-12-29 09:31:14 UTC
I can confirm that this also affects CIFS shares (Apple TimeCapsule).

baselayout = 2.0.3
openrc = 0.9.4

From another bug thread (https://bugs.gentoo.org/show_bug.cgi?id=299633) and Gary's comment (number 2), it became clear that netmount was attempting to unmount the network share AFTER net.eth0 had gone down. The share was therefore unreachable, so netmount and all the other scripts that depend on it entered a wait cycle.

Based on this reasoning I edited /etc/init.d/netmount and changed the dependency from "net" to "net.eth0" (that's my wired ethernet, with a persistent udev rule). This appears to have completely fixed the issue for me, and shutdowns now take only a few seconds.

I believe this also avoids SpanKY's concerns about the lazy unmount option, which presumably would also fail if the network is already down.

My other theory for fixing this would be to edit /etc/rc.conf and set rc_depend_strict to "YES", but I haven't tried this.

Hope this helps somebody else come up with a permanent solution that can be applied to the init.d script.
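The dependency change described in this comment might look roughly like the sketch below. It assumes the dependency is expressed with a `need` line inside depend(), which the comment does not state explicitly. The `need` stub and the `netmount_needs` variable are stand-ins so the fragment runs in a plain shell; inside a real init script OpenRC provides `need`.

```shell
# Stand-in for OpenRC's need helper.
# Assumption: stubbed here only so the sketch runs outside OpenRC.
need() {
    netmount_needs="${netmount_needs:+$netmount_needs }$*"
}

depend() {
    # Before the edit, as described in the comment: need net
    # After the edit: depend on the specific interface service instead.
    need net.eth0
}

netmount_needs=""
depend
echo "needs: $netmount_needs"

# Untried alternative from the same comment, as it would appear in /etc/rc.conf:
# rc_depend_strict="YES"
```

Hard-coding net.eth0 ties the script to one machine's interface name, which is presumably why the comment asks for a more permanent solution.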
Comment 7 Lyall Pearce 2012-11-12 10:08:13 UTC
As far as I can see, /etc/init.d/netmount is missing a 'use net'.

The only problem is that local satisfies this too...
Comment 8 William Hubbs 2012-11-18 21:55:02 UTC
(In reply to comment #7)
> As far as I see it, /etc/init.d/netmount is missing a 'use net'
>
> Only problem is that local satisfies this too...

See the following post for the reasons we are discouraging use of the net virtual for this, and for other things as well. In a nutshell, the virtual really can't tell you what it claims to.

http://blog.flameeyes.eu/2012/10/may-i-have-a-network-connection-please