After forking off the ypbind daemon in /etc/init.d/ypbind, the following code tries to detect a correct startup and usually makes the script wait until ypbind is ready:

    if [ -n "$?" ] ; then
        notfound=1
        for i in 0 1 2 3 4 5 6 7 8 9
        do
            ypwhich &>/dev/null && { notfound=0; break; }
            sleep 1
        done
        if [ $notfound -eq 1 ] ; then
            eend 1 "No NIS server found"
        else
            eend 0
        fi
    else
        eend $?
    fi

This fails completely if portmap doesn't become ready fast enough, as I reproduced in bug #222403. Although ypbind has forked and is running, it simply times out during the network broadcast. After the loop times out trying to talk to ypbind (which itself cannot talk to the portmapper), the script resets the service's state to "stopped" although ypbind is still running. It is then no longer possible to use /etc/init.d/ypbind {restart|start|stop|zap} to get it working again. I had to pkill ypbind and zap the service state before I could start it.

The init script should probably try longer and make sure that ypbind is really not running when it returns the "failed" state, e.g. by running the stop routine.

Reproducible: Always

Steps to Reproduce:
1. Make the portmapper not become ready fast enough (as stated above)
2. Probably only reproducible with baselayout-2 and parallel_startup=YES
3. Start ypbind directly after portmap

Actual Results:
The ypbind init script exits with "failed" but ypbind is still running, which leads to a race condition in Gentoo's knowledge about ypbind's service state.

Expected Results:
Make sure ypbind gets killed once the loop that checks whether ypbind is working has timed out.
Reassigning to maintainer-needed since eradicator has left Gentoo.
This still seems to be a problem. One workaround is to add a YPBIND_WAIT variable that specifies how many seconds to wait before giving up. If this solution is accepted, the default /etc/conf.d/ypbind should be updated with a comment documenting it.

I also have a related problem: /etc/init.d/ypbind reports that it has failed to start, but /usr/bin/ypbind is still running, though not bound. If ypbind fails to bind, the script needs to kill the ypbind process. This part can be fixed by adding a line that stops the daemon right before the "No NIS server found" line.

I think the real solution to all of this is to give the ypbind program an option not to fork until it has succeeded in binding, but that's obviously an upstream problem.

I've made the above changes, and they seem to work for me. I'll attach my /etc/init.d/ypbind, which might be enough to close this bug if it's added to the ebuild.
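For reference, the two changes described above can be sketched roughly as follows. This is only an illustration, not the attached script: wait_for_binding and check_ypbind are hypothetical helper names, the stop command and the probe are passed in as arguments rather than hard-coded, and the default of 10 seconds simply mirrors the ten iterations of the current loop.

```shell
# Sketch only: wait_for_binding and check_ypbind are hypothetical
# helper names, not functions from the shipped init script.

# wait_for_binding TIMEOUT PROBE [ARGS...]
# Run the probe once per second until it succeeds or TIMEOUT
# seconds have passed. Returns 0 on success, 1 on timeout.
wait_for_binding() {
    wfb_timeout=$1
    shift
    wfb_elapsed=0
    while [ "$wfb_elapsed" -lt "$wfb_timeout" ]; do
        "$@" >/dev/null 2>&1 && return 0
        sleep 1
        wfb_elapsed=$((wfb_elapsed + 1))
    done
    return 1
}

# check_ypbind STOP_CMD PROBE [ARGS...]
# Wait up to YPBIND_WAIT seconds (assumed default: 10) for the probe
# to succeed. If it never does, run STOP_CMD so that a "failed"
# result never leaves an unbound daemon behind.
check_ypbind() {
    cy_stop=$1
    shift
    if wait_for_binding "${YPBIND_WAIT:-10}" "$@"; then
        return 0
    fi
    $cy_stop
    return 1
}
```

In the init script this would be called with ypwhich as the probe and whatever the script's stop routine already uses to kill ypbind as the stop command, followed by eend 1 "No NIS server found" when it returns non-zero.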
Created attachment 226507 [details] /etc/init.d/ypbind
I have just added 1.32-r1 to the tree, which might address this issue, since I added USE=dbus which should use dbus' network monitoring capability. Could you try it out?
(In reply to comment #4)
> I have just added 1.32-r1 to the tree, which might address this issue, since I
> added USE=dbus which should use dbus' network monitoring capability. Could you
> try it out?