When using the NetworkManager RC script in a runlevel, it purports to provide the net service that other services depend on. It does not, however, block on receiving an address, with the result that all the other things that need the network interface fail - except that instead of failing to start because a dependency isn't running yet (which is what should happen) they run, thinking they have net, but don't, and crash or otherwise misbehave. Not sure how to achieve it, but the NetworkManager RC script needs to block like (say) net.eth0 + dhcp does and only reach the "start" state on successfully receiving an IP address. AfC
(In reply to comment #0)
> Not sure how to achieve it, but the NetworkManager RC script needs to block
> like (say) net.eth0 + dhcp does and only reach the "start" state on
> successfully receiving an IP address.
>
> AfC

This is not feasible, as most people use wireless with NetworkManager, and that requires a key to be passed, whether from the user's keyring or entered manually after logging into a desktop environment of some sort.
Let's see what solution netmanager and baselayout maintainers can come up with...
Have you filed bugs on the services that crash when they don't have network access? If a service doesn't have access, it should sit silently and wait, not crash and burn. That's just my opinion (yes, I know: nfs, and blah blah blah).
No. The rest of the runlevel system is not in error. The NetworkManager RC script is at fault. It is claiming to provide "net", and it does not. AfC
Andrew, If you configure it right (aka add your connection to system-wide settings via keyfile plugin) it _does_ provide net. So I would mark this as user error. Currently we don't have migration guide finished, that's why NM is hard masked in the tree. If you need any help how to set up system wide settings, just drop me an email and I will gladly help you with that. Regards, Rob
(In reply to comment #1)
I disagree with this. Regardless of whether you use it for wireless or not, NetworkManager can have a default system connection specified, so the network connection can be brought up without the user logging in to a desktop first. If the system-wide connection is used, then NetworkManager should be included as a 'net' dependency.

(In reply to comment #5)
There seems to be some confusion here. Andrew is talking about NetworkManager not being part of the Gentoo init scripts' 'net' dependency service, not about NetworkManager providing a network interface.

Take a peek at the net_service() function in '/lib/rcscripts/sh/rc-services.sh'. For any init script to provide the 'net' dependency, its name must be prefixed with 'net.'. So a workaround can be done like so:

1. Delete NetworkManager from any runlevels.
2. Create a symlink called /etc/init.d/net.NetworkManager pointing to /etc/init.d/NetworkManager.
3. Add the new 'net.NetworkManager' service to your runlevel.

Now Gentoo will consider NetworkManager to be a 'net' service (note that the same workaround applies to 'wicd').

However, you still have the same problem: any init script with the 'need net' dependency will attempt to start before the network interface has been assigned an IP address. This is a bug in Gentoo's baselayout (see bug #268598).

Note that all of the above applies to sys-apps/baselayout-1.*; I have not tried baselayout-2/openrc, so I cannot verify whether the dependency bugs carry over.
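The three steps of the symlink workaround can be sketched as a small shell function. This is only an illustration for baselayout-1: the function name, the INITD and DRY_RUN variables, and the assumption that the service sits in the "default" runlevel are all made up for the example and are not part of baselayout.

```shell
#!/bin/sh
# Sketch of the symlink workaround described above (baselayout-1).
# ASSUMPTIONS: alias_nm_as_net, INITD and DRY_RUN are illustrative names;
# the runlevel defaults to "default" unless one is passed as $1.

alias_nm_as_net() {
    runlevel="${1:-default}"
    initd="${INITD:-/etc/init.d}"
    # With DRY_RUN set, the commands are printed instead of executed.
    run="${DRY_RUN:+echo}"

    # 1. Remove the plain NetworkManager service from the runlevel.
    $run rc-update del NetworkManager "$runlevel" || return 1
    # 2. Give the init script a second name that carries the "net." prefix.
    $run ln -s "$initd/NetworkManager" "$initd/net.NetworkManager" || return 1
    # 3. Add the renamed service to the runlevel.
    $run rc-update add net.NetworkManager "$runlevel"
}
```

Calling `DRY_RUN=1 alias_nm_as_net default` first shows what would be executed; without DRY_RUN the function has to be run as root.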
Does anybody know of a temporary workaround to prevent other services from being started before NetworkManager? I currently only use NetworkManager, but nfsmount starts too soon, without waiting for NetworkManager to bring the network up. Thanks :-)
(In reply to comment #7)
> Does anybody know any temporal workaround for preventing other services from
> being started before NetworkManager? I currently only use NetworkManager but
> nfsmount starts too soon, without waiting for networkmanager for getting
> network up
>
> Thanks :-)

Personally I don't use nfsmount. I just use a dispatcher script to mount NFS drives when NM gets the connection.
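A dispatcher script of that kind might look roughly like the sketch below. NetworkManager runs the scripts in /etc/NetworkManager/dispatcher.d with the interface name as the first argument and the action (for example "up" or "down") as the second. The function name, the DRY_RUN switch and the choice of `mount -a -t nfs` (which mounts the NFS entries from /etc/fstab) are assumptions for illustration, not the commenter's actual script.

```shell
#!/bin/sh
# Hypothetical dispatcher script, e.g. /etc/NetworkManager/dispatcher.d/nfs
# (file name and function name are illustrative).
# NetworkManager passes the interface as $1 and the action as $2.

nfs_on_event() {
    action="$2"
    # With DRY_RUN set, the command is printed instead of executed.
    run="${DRY_RUN:+echo}"

    case "$action" in
        up)
            # A connection came up: mount all NFS entries from /etc/fstab.
            $run mount -a -t nfs
            ;;
    esac
}

# Only act when called with the two arguments the dispatcher supplies.
if [ "$#" -ge 2 ]; then
    nfs_on_event "$1" "$2"
fi
```

Actions other than "up" are ignored here; a matching unmount on "down" could be added the same way if it is wanted.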
Ah, I didn't know about that. I will take a look at the dispatcher approach. Thanks.
(In reply to comment #5)
This is a little bit off topic, but hi Robert, maybe you could put a short text in the wiki on how to set up system-wide connections. I have searched hard, but found nothing.
I use Gentoo at work, where I have to authenticate against Active Directory. I now use NetworkManager to manage my network connections, instead of the net.eth scripts. One interesting effect of what is reported here is that, with the NetworkManager script, I cannot log in to the PC with my Active Directory account after a reboot. I first have to log in as a local user, restart "samba", log out, and only then can I log in with my Active Directory account. I really wish the NetworkManager / net scripts would get sorted out quickly.
*** Bug 271458 has been marked as a duplicate of this bug. ***
*** Bug 306351 has been marked as a duplicate of this bug. ***
(In reply to comment #7)
> Does anybody know any temporal workaround for preventing other services from
> being started before NetworkManager? I currently only use NetworkManager but
> nfsmount starts too soon, without waiting for networkmanager for getting
> network up

I'm not using NM; anyway, if you want to "fix" this, then set

    rc_config="/etc/NetworkManager/whatever"
    rc_provide="!net"

or you can set before/after/use/need or whatever in the same way.
(In reply to comment #14)
> (In reply to comment #7)
> > Does anybody know any temporal workaround for preventing other services from
> > being started before NetworkManager? I currently only use NetworkManager but
> > nfsmount starts too soon, without waiting for networkmanager for getting
> > network up
>
> I'm not using NM, anyway, if you want to "fix" this, then
>
> rc_config="/etc/NetworkManager/whatever"
> rc_provide="!net"
>
> or you can set before/after/use/need or whatever like this.

Hmm, where do I put that? In rc.conf? I tried it, but it didn't solve the problem.
(In reply to comment #15)
> Hmmm where do i put that. In rc.conf? I tried it but it didnt solve the
> problem.

You need to replace 'whatever' with something that exists... :) I have this in /etc/rc.conf and it works perfectly fine, so I don't know why it wouldn't work for NM:

    rc_xdm_config="/etc/conf.d/xdm"
    rc_xdm_use="lm_sensors"
    rc_net_lo_provide="!net"
    rc_net_pan0_provide="!net"
(In reply to comment #16)
> You need to replace 'whatever' with something that exists... :) I have this in
> /etc/rc.conf and it works perfectly fine, so dunno why it wouldn't work for NM
>
> rc_xdm_config="/etc/conf.d/xdm"
> rc_xdm_use="lm_sensors"
> rc_net_lo_provide="!net"
> rc_net_pan0_provide="!net"

I replaced it with something that exists, but it didn't work. I found another workaround though, at least for my problem with CIFS mounts that jammed shutdown while unmounting filesystems. I added this line to /etc/conf.d/local.stop:

    umount -t cifs -a
(In reply to comment #17)
> i replaced it with something that exists, but it didnt work. But i found
> another workaround though. At least for my problem with CIFS mounts which
> jammed shutdown on unmounting filesystems.

Well, the above is for openrc/baselayout-2 and works perfectly fine; if you are on stable, I don't know what to use and certainly don't intend to downgrade to test.

Wrt your CIFS unmount issue, the defaults are like this:

    # Network fstypes. Below is the default.
    net_fs_list="afs cifs coda davfs fuse fuse.sshfs gfs glusterfs lustre ncpfs nfs nfs4 ocfs2 shfs smbfs"

so that is a pretty weird issue as well. Again, this is valid for openrc.
I am pretty puzzled why this has not been "fixed" yet.

About the workaround "adding connections as system-connections makes NetworkManager provide net": how does that work if you have a laptop and none of your wireless connections are available when you start your computer? Should you start the computer, add the wireless network you are going to use to NetworkManager as a system-connection (waiting for ntp-client, freshclam and so on to time out on the way), only to restart the computer to ensure all services have started correctly?

And why has no one picked up the most useful workaround, the dispatcher scripts? My current workaround is that I created a runlevel "network" and then a script in /etc/NetworkManager/dispatcher.d/ (all scripts in there get executed when a network connection changes, with various arguments, so you can fine-tune which scripts react to which event for which interface). If the script gets the argument "up", it starts all services in said runlevel, and if it gets the argument "down", it stops them. This works fine with netmount, ntp-client, and some others that need the network during start.

It also has the nice side effect that services I do not want started when my computer has no network connection (because they would only sit in the background doing nothing and wasting system resources) can simply go into the "network" runlevel; they will then not be started until NetworkManager has a connection.

And to be honest, this feels like a fix for the problem: adding a network runlevel (IIRC at least baselayout-1 had a nonetwork runlevel) and having NetworkManager (and maybe openrc?) only run those services when a network is up.
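The runlevel-based workaround could be sketched as follows. It is a guess at the shape of such a script, not the commenter's own: it assumes the custom runlevel lives in /etc/runlevels/network as a directory with one entry per service, and the names runlevel_on_event, RUNLEVEL_DIR, INITD and DRY_RUN are invented for the example.

```shell
#!/bin/sh
# Hypothetical dispatcher script that starts or stops every service listed
# in a custom "network" runlevel.
# ASSUMPTIONS: the runlevel is a directory with one entry per service;
# RUNLEVEL_DIR, INITD and DRY_RUN are illustrative knobs.

runlevel_on_event() {
    action="$2"
    rl_dir="${RUNLEVEL_DIR:-/etc/runlevels/network}"
    initd="${INITD:-/etc/init.d}"
    # With DRY_RUN set, the commands are printed instead of executed.
    run="${DRY_RUN:+echo}"

    case "$action" in
        up)   verb=start ;;
        down) verb=stop ;;
        *)    return 0 ;;   # ignore every other dispatcher event
    esac

    for entry in "$rl_dir"/*; do
        # Skip the unexpanded glob of an empty runlevel and dangling links.
        [ -e "$entry" ] || continue
        $run "$initd/$(basename "$entry")" "$verb"
    done
}

# Only act when called with the two arguments the dispatcher supplies.
if [ "$#" -ge 2 ]; then
    runlevel_on_event "$1" "$2"
fi
```

Services are handled one by one through their init scripts rather than by switching runlevels, so the normal boot runlevel stays untouched.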
(In reply to comment #19)
Brilliant workaround for a Gentoo bug that has rolled on for far too long. Why do the Gentoo scripts *still* consider the 'need net' dependency to be fulfilled without a network address? Migrating over to NetworkManager and having it handle the network services is a nice solution. Thanks, Xake.
I use avahi and autoipd. It works in parallel boot.

/etc/conf.d/net:

    modules=( "autoipd" )
    config_eth0=( "autoipd" )
    config_eth1=( "autoipd" )

/etc/init.d/samba:

    ...
    depend() {
        ...
        before NetworkManager
        ...
    }
    ...
Created attachment 263083 [details]
NetworkManager initscript for baselayout2

So I was thinking: how can ifplugd/netplug delay and background the provision of "net" until a connection exists if NetworkManager cannot? Well, it turns out that baselayout can make this work.

Replace your current initscript with this one, and drop the following file in /etc/NetworkManager/dispatcher.d. baselayout/openrc will then start NetworkManager but mark it as "inactive" (so everything needing or using "net" gets the status "scheduled", with a warning during boot that it will be started when NetworkManager has started). When a connection becomes available and configured, the dispatcher runs the initscript again, this time marking it as "started", so everything waiting for that starts as well. When the connection is lost, the initscript goes back to "inactive" status, and all scripts needing "net" stop and again get the status "scheduled".
Created attachment 263085 [details]
dispatcher-script

Put this script into /etc/NetworkManager/dispatcher.d

This script has the downside that it currently does not handle multiple connections: it should be extended to run "/etc/init.d/NetworkManager stop" only when the last connection is lost, not when any connection is lost. The same is true for the start part, but that is not as "destructive", since those services will only return "already started" when this script runs.
Created attachment 263087 [details, diff]
patch against ${PORTDIR}/net-misc/networkmanager

A patch against the ebuild that makes it install the right initscript and the dispatcher script.

To use these scripts, remove ANYTHING related to the network (net.* and so on) from your runlevels and add ONLY NetworkManager to your favorite runlevel. Then you may add your favorite net-needing scripts to your favorite runlevel; they will not get started until NetworkManager has succeeded in creating a network connection.
Created attachment 263089 [details] updated dispatcher script I realized a quick way of checking if NetworkManager is connected or not. Works for my computer connecting/disconnecting two cables.
Xake,

These scripts are for Baselayout 1, correct?

I tried them, but at startup I get an error from the init.d script complaining about yesno not being defined (or something like that).

Can you help?
(In reply to comment #26)
> Xake,
>
> These scripts are for Baselayout 1, correct?

No, I use them with baselayout2/openrc.

> I tried them, however a at startup I get an error from the init.d script
> complaining about yesno not being defined (or something like that).
>
> Can you help?

It seems that yesno is not defined in /etc/init.d/functions.sh for baselayout1... I could port the script to the way the old baselayout1 scripts worked (which was essentially checking whether the variable is defined at all, instead of looking at what it is defined to), but the current way is cleaner. If only openrc could ever get stabilized...
Created attachment 263307 [details] NM-initscript for baselayout1 Quick changes, this should work with baselayout1 but totally untested.
Xake,

Using the new init script I did not get an error at startup; however, it seems to make no difference compared to the standard init script, i.e. the dependent net scripts fail. See:

 * Starting NetworkManager ... [ ok ]
 * Starting avahi-daemon ... [ ok ]
 * Starting avahi-dnsconfd ... [ ok ]
 * Loading CDemu userspace daemon ...
Daemon successfully started. [ ok ]
 * Starting cupsd ... [ ok ]
 * Starting gpm ... [ ok ]
 * Starting Hardware Abstraction Layer daemon ... [ ok ]
 * Mounting network filesystems ... [ ok ]
 * Setting clock via the NTP client 'sntp' ...
sntp: getaddrinfo(hostname, ntp) failed with Name or service not known
 * Failed to set clock [ !! ]
 * Starting ntpd ... [ ok ]
 * Enabling numlock on ttys ... [ ok ]
 * samba -> start: smbd ... [ ok ]
 * samba -> start: nmbd ... [ !! ]
 * samba -> start: winbind ... [ ok ]
 * Error: starting services (see system logs)
 * samba -> stop: smbd ... [ ok ]
 * samba -> stop: nmbd ... [ ok ]
 * samba -> stop: winbind ... [ ok ]
Looking at the messages.log, I see these messages once eth0 has successfully received an IP:

....
Feb 22 16:30:59 NetworkManager[15900]: <info> Activation (eth0) successful, device activated.
Feb 22 16:30:59 NetworkManager[15900]: <info> Activation (eth0) Stage 5 of 5 (IP Configure Commit) complete.
Feb 22 16:30:59 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/NetworkManager.action' exited with error status 1.
Feb 22 16:30:59 rc-scripts: WARNING: NetworkManager has already been started
Comment on attachment 263307 [details] NM-initscript for baselayout1 Ok, scratch this one. Seems like good old baselayout1 is braindead wrt some stuff. Will maybe take a look at this tomorrow, but it seems like this script needs different logic for bl1 and bl2...:-/
Took a deep dive into the abyss called baselayout1, and can say the following: my approach will not work with it. It seems that baselayout1 has some restrictions on what may provide net. In short, nothing not named net.* can provide net in bl1. So I can make NetworkManager switch between active and inactive depending on net status, but it does not shut services down or bring them back up on netplug. To make this work, it essentially needs to be turned into a /lib/rcscripts/net/networkmanager.sh module, but I think I would rather make my script only provide net with baselayout2, make this bug depend on openrc2, and leave an eventual baselayout1 implementation as an exercise for the reader.
IMHO, I wouldn't waste your time on baselayout-1. I'd work on supporting baselayout-2. Given that NetworkManager is going to support netcf as well, it might be a worthwhile investment to actually fix netcf to support baselayout-2 style networks; then we get a lot of wins all around.
(In reply to comment #32)
[...]
> So to make this work, it essentially needs to be made into a
> /lib/rcscripts/net/networkmanager.sh module, but I think I rather make my
> script only provide net with baselayout2, depend this bug on openrc2 and leave
> an eventual baselayout1 implementation as an excercise for the reader.

Maybe your modified script could be used now that baselayout-2 is stable :-/
Created attachment 298071 [details]
The latest incarnation of the init script

Sorry for taking some time to answer. I had totally forgotten about this bug, but today I was annoyed enough about having to work around this when updating networkmanager, so here we go.

The init script works for me. It is based on an old 0.9.x script, and on my system it does what it is supposed to, that is:

When it is called by OpenRC, it only starts the NetworkManager daemon, but never marks the service as "started", only "inactive". This leads OpenRC to let the service provide net, but never try to start any of the services depending on net until the NetworkManager service is marked as "active".

When it is called by the dispatcher script in this bug, it only marks the service as active or inactive depending on whether the interface is going up or down, making OpenRC start and stop the services depending on net when the event happens.

All this is so that NetworkManager is able to "provide net" for me, which fixes a couple of things:

1. OpenRC places NetworkManager in the right spot in the dependency chain during start and stop. This avoids, for example, the race condition where sshd tries to start before NetworkManager, which on my system makes OpenRC start a daemonized dhcpcd instance not controlled by NetworkManager, with all the hilarity that ensues.
2. Services are actually able to start, because there is an active network connection when they try to start (autofs, netmount...).
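The behaviour described above might be sketched as the following init script body. This is not the attached script, only a reading of the description: it relies on OpenRC helpers (yesno, ebegin, eend, mark_service_inactive, start-stop-daemon) that are available only when the file is run as an OpenRC init script, so the interpreter line is left out here. The pidfile path, the daemon path, the one-second nm-online timeout and the use of IN_BACKGROUND to detect a call from the dispatcher are assumptions.

```shell
# Sketch of an OpenRC init script body for NetworkManager.
# ASSUMPTIONS: pidfile and daemon paths, the nm-online timeout, and that the
# dispatcher re-invokes this script with IN_BACKGROUND set to a true value.

PIDFILE=/var/run/NetworkManager.pid

depend() {
    need dbus
    provide net
}

start() {
    # Re-invoked by the dispatcher: the daemon is already running, so simply
    # succeed and let OpenRC move the service from "inactive" to "started".
    yesno "${IN_BACKGROUND}" && return 0

    ebegin "Starting NetworkManager"
    start-stop-daemon --start --quiet --pidfile "${PIDFILE}" \
        --exec /usr/sbin/NetworkManager -- --pid-file "${PIDFILE}"
    retval=$?
    eend "${retval}"

    # Daemon is up but there is no connection yet: hold back everything
    # that needs net until the dispatcher reports a connection.
    if [ "${retval}" -eq 0 ] && ! nm-online -q -t 1; then
        mark_service_inactive "${RC_SVCNAME}"
    fi
    return "${retval}"
}

stop() {
    # Re-invoked by the dispatcher: keep the daemon, drop the "started" state.
    if yesno "${IN_BACKGROUND}"; then
        mark_service_inactive "${RC_SVCNAME}"
        return 0
    fi

    ebegin "Stopping NetworkManager"
    start-stop-daemon --stop --quiet --pidfile "${PIDFILE}"
    eend $?
}
```

The point of the design is that the daemon's lifetime and the service's status are decoupled: the daemon keeps running across connection changes, while the status that other services depend on follows the connection.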
Created attachment 298077 [details]
The most current dispatcher script

When it is called, this script does the following:

It checks whether the interface is "none", and if so does nothing (NetworkManager from time to time sends events for an interface "none", which we are really not that interested in).

After that it checks whether the interface is going up or down.
* If it is going up, it sets the ARGS for when it later runs the initscript again.
* If it is going down, it checks whether there is still an active connection, and if there is, it does NOT set ARGS.
* If something else is going on, it sends a debug message to the logger.

After that it is time to call the OpenRC init script. Here we do two things:
* Check whether the initscript exists; otherwise fail and log.
* If the script exists, check whether ARGS is set, and if so run the initscript.

This last step ensures that, whatever happens, /etc/init.d/NetworkManager is never marked inactive while an interface is still up, so that you will not lose "sshd" just because your VPN/wireless went down.
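The decision logic listed above can be sketched like this. It is a reconstruction from the description, not the attachment itself: the function names, the logger tag, the use of `nm-online -x -q` as the "is there still an active connection" check, and passing IN_BACKGROUND to the init script are all assumptions.

```shell
#!/bin/sh
# Sketch of the dispatcher logic described above.
# ASSUMPTIONS: function names, the logger tag, nm-online as connectivity
# check, and IN_BACKGROUND as the signal to the init script.

INITSCRIPT="${INITSCRIPT:-/etc/init.d/NetworkManager}"

# Placeholder check: succeeds while NetworkManager reports a connection.
has_active_connection() {
    nm-online -x -q
}

# Prints "start", "stop" or nothing for a given interface and action.
decide_args() {
    iface="$1"
    action="$2"

    # Events for the pseudo-interface "none" are ignored.
    if [ "$iface" = "none" ]; then
        return 0
    fi

    case "$action" in
        up)
            echo start
            ;;
        down)
            # Only report "stop" when the last connection is gone.
            has_active_connection || echo stop
            ;;
        *)
            logger -p daemon.debug -t nm-openrc "ignoring $action on $iface"
            ;;
    esac
}

dispatch() {
    args="$(decide_args "$1" "$2")"

    if [ ! -x "$INITSCRIPT" ]; then
        logger -p daemon.err -t nm-openrc "$INITSCRIPT not found"
        return 1
    fi

    # Nothing to do unless an argument was chosen above.
    if [ -n "$args" ]; then
        IN_BACKGROUND=yes "$INITSCRIPT" "$args"
    fi
}

# Only act when called with the two arguments the dispatcher supplies.
if [ "$#" -ge 2 ]; then
    dispatch "$1" "$2"
fi
```

Keeping the decision in its own function makes the "never mark inactive while something is still up" rule easy to check in isolation.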
Xake, thank you for your work on this. After investigating different alternatives, it appears that your idea of using a dispatcher.d script and IN_BACKGROUND is the least bad way of fitting a square peg (networkmanager) into a round hole (an openrc service). Sorry that this issue took so long to fix!

>*networkmanager-0.9.2.0-r3 (02 Feb 2012)
>
>  02 Feb 2012; Alexandre Rostovtsev <tetromino@gentoo.org>
>  +files/10-openrc-status, +networkmanager-0.9.2.0-r3.ebuild,
>  +files/networkmanager-0.9.2.0-ifnet-unquote-hostname.patch,
>  +files/networkmanager-0.9.2.0-init-provide-net.patch:
>  Change the NetworkManager OpenRC service to provide net; the service's status
>  is set to 'inactive' when NetworkManager is running but has no connections
>  up, and to 'started' when NetworkManager is connected (bug #252137, thanks to
>  Xake). Do not keepdir /var/run/NetworkManager, it's not needed in Gentoo (bug
>  #401019, thanks to Maxim Kammerer). Correctly parse single-quoted hostnames
>  in /etc/conf.d/hostname.
Thanks a lot for fixing this :D
Created attachment 322058 [details] simple init script without status tracking It would be great if this (quite intrusive, imho) status tracking could be made dependent on a USE flag.
(In reply to comment #39)
> Created attachment 322058 [details]
> simple init script without status tracking
>
> It would be great if this (quite intrusive, imho) status tracking could be
> made dependent on a USE flag.

Why? Your script is broken anyway.
(In reply to comment #40)
> Why?

Many services that "need net" don't actually need a network connection that's up; they just need to be able to bind to a socket. Examples: privoxy, tor, ntp (if I remember right). Some services are better managed via NetworkManager's dispatcher.d mechanism. Right now, an intrusive, heavy, and error-prone solution is forced on all users. Whereas disabling 10-openrc-status is trivial, disabling stuff in the init service is not.

As an example of error-prone: let's say NetworkManager is inactive, and privoxy waits for it. When NetworkManager is restarted, will privoxy wait for it again? It didn't in my tests. Maybe I did something wrong, but it's just one example of why such heavy solutions should be optional.

> Your script is broken anyway.

It works.
(In reply to comment #41)
> (In reply to comment #40)
> > Why?
>
> Many services that "need net" don't actually need a network connection
> that's up, they just need to be able to bind to a socket.

Then those services should be fixed; instead of "need net", they should be using "need lo". Please file bugs about them.

See http://archives.gentoo.org/gentoo-dev/msg_ab161abd642de7d1b8efa69cf9cd46c3.xml for the conclusion of a long discussion about this on the gentoo-dev mailing list. The error is not in networkmanager's behavior; it is in privoxy's outdated init script.
(In reply to comment #42)
> Then those services should be fixed; instead of "need net", they should be
> using "need lo". Please file bugs about them.

I will, but note that OpenRC 0.9.9 is not stable yet, so net.lo does not yet provide "lo" in stable.

> See
> http://archives.gentoo.org/gentoo-dev/msg_ab161abd642de7d1b8efa69cf9cd46c3.xml
> for the conclusion of a long discussion about this on the gentoo-dev
> mailing list.

I am not questioning the motivation, I am questioning the lack of choice. The approach taken for fixing this bug has both technical issues (IN_BACKGROUND is a hack that will eventually bite, and see also my previous comment wrt. restarts) and semantic issues.

The semantic issue is that "connecting to remote machines via any available network interface" is an ambiguous definition. Do you want ntpd stopped and started every time nm-online reports true? It will work either way (again, IIRC) -- it will adapt to interfaces being added and removed, but it will take some time. Who decides the trade-off? What if you have a complex [connectivity] setup in NetworkManager.conf? Let's say "uri" is an .onion URL, so it goes via Tor -- and here you have a circular deadlock because you used a high-level tool in a low-level OpenRC script.

If I want to disable nm-online calls, I can just chmod a-x .../10-openrc-status, but what are my options if I want NetworkManager to unconditionally provide "net"? I suggest a USE flag or a configuration variable in conf.d/NetworkManager.
(In reply to comment #43)
> I am not questioning the motivation, I am questioning the lack of choice.

IMHO the old behavior was unambiguously wrong; if networkmanager knows that it has not established a network connection, then claiming that it provides net is a lie. The new behavior is right in ~90% of cases. I do not see the sense in providing a choice between "wrong" and "almost right".

If you have unusual requirements, simply create your own /etc/init.d/MyNetworkManager script with blackjack and hookers; to prevent the "official" /etc/init.d/NetworkManager from getting started automatically, "eselect rc delete" it from your runlevels, and add rc_provide="!net" to /etc/conf.d/NetworkManager.

> The semantic issue is that "connecting to remote machines via any available
> network interface" is an ambiguous definition.

You are right; it's not really a definition at all, just a guideline.

> Do you want ntpd stopped and started every time nm-online
> reports true? It will work either way (again, IIRC) -- it will adapt to
> interfaces being added and removed, but it will take some time. Who decides
> the trade-off?

The net-misc/ntp maintainers decide the trade-off. Talk to them :)

> What if you have a complex [connectivity] setup in
> NetworkManager.conf? Let's say "uri" is an .onion URL, so it goes via Tor --
> and here you have a circular deadlock because you used a high-level tool in
> a low-level OpenRC script.

If you want to use .onion URLs in [connectivity], you could add rc_need="!net" to /etc/conf.d/tor to allow tor to run even when nothing is providing net.
(In reply to comment #43)
> (In reply to comment #42)
> > Then those services should be fixed; instead of "need net", they should be
> > using "need lo". Please file bugs about them.
>
> I will, but note that OpenRC 0.9.9 is not stable yet, so net.lo does not yet
> provide "lo" in stable.

Actually, don't do this. I have started a discussion about removing "provide lo" entirely, so do not make anything need lo. The logic is that I don't know of any situation where you won't have a loopback active, so there may not be a need for the "lo" virtual at all.
(In reply to comment #45)
> Actually don't do this. I have started a discussion about removing "provide
> lo" entirely, so do not make anything need lo.

I am already not doing this, because I can't think of any service that is non-ambiguous wrt. "need net". Not even Privoxy. The guideline should differentiate between "need network in general" and "need network now", with "now" being reserved for stuff like ntp-client and netmount (and I'm not even sure about netmount).
(In reply to comment #46) We had discussed this on IRC, and it seems that many/most services that want a network connection should have "use net" or "use net; after net" instead of "need net".
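As an illustration of that suggestion, the depend() function of a service that wants the network but can live without it might look like the sketch below. The service is hypothetical; only the dependency keywords come from the discussion.

```shell
# Hypothetical fragment of /etc/init.d/mydaemon (service name is made up).
# "use net" declares a soft dependency and "after net" only orders the
# service behind whatever provides net, in contrast to the hard "need net".
depend() {
    use net
    after net
}
```

Whether this is the right choice for a given service is up to its maintainers; stuff like ntp-client, which really needs the network at start time, would presumably keep "need net".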