Race condition due to stop() not waiting for named to finish. A new instance is started while the old one is still tidying up. Happens on my machine during heavy load, with lots of swapping. Akin to bugs 31103, 29932, 28345.

Reproducible: Sometimes

Steps to Reproduce:
1. run memory_eater & (emerge kdebase will do as well...)
2. /etc/init.d/named restart

Actual Results:
 * Stopping named... [ ok ]
 * Starting named... [ !! ]

Expected Results:
 * Stopping named... [ ok ]
 * Starting named... [ ok ]

A trivial patch to fix:

--- _named.orig 2003-10-14 12:09:59.000000000 +0200
+++ named 2003-10-14 17:07:15.000000000 +0200
@@ -40,7 +40,7 @@
 stop() {
 	ebegin "Stopping named"
 	checkconfig || return 2
-	start-stop-daemon --stop --quiet --pidfile $PIDFILE
+	start-stop-daemon --quiet --retry -TERM/60 --stop --pidfile $PIDFILE
 	eend $?
 }
What does -TERM/60 do?
I've added some options to the runscript; it's in CVS now.
It's been a while since I worked on this bug, and I found it confusing. I think I ended up using strace to fully understand what was going on.

In brief, the "--retry -TERM/60" option sends SIGTERM and then waits up to 60 seconds for the process to finish. If it reaches the 60 second timeout, it returns an error.

If you just use "--retry 60" then I *think* the process just gets SIGKILL before the wait, which isn't very graceful for some daemons, as they don't get a chance to clean up. It's possible to add more SIGNAL/timeout pairs onto the sequence if you need to.

Like I said, my memory is hazy, but "man start-stop-daemon" explains.

Many thanks...
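For anyone who finds the schedule syntax hard to picture, here is a rough plain-shell sketch of the behaviour described above for "--retry -TERM/60": send SIGTERM, poll until the process is gone, and report failure if the timeout expires first. The function name term_and_wait is made up for illustration. This is not how start-stop-daemon is implemented, only an approximation of the observable effect.

```shell
#!/bin/sh
# Hypothetical helper (not part of the init script or start-stop-daemon).
# Usage: term_and_wait PID TIMEOUT_SECONDS
# Returns 0 once the process has exited, 1 if it is still alive at the timeout.
term_and_wait() {
    pid=$1
    timeout=$2

    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    waited=0
    # "kill -0" sends no signal; it only checks that the PID still exists.
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

Something like "term_and_wait $(cat $PIDFILE) 60" would then give stop() a chance to block until named has really exited before start() runs.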
Sorry to reopen, but /etc/init.d/named stop now fails if you are running in chroot. Adding the PIDFILE and KEY path correction logic to the stop function seems to fix this for me, YMMV. Not sure if this needs to be added in other functions or not.

--- /usr/portage/net-dns/bind/files/named.rc6 2004-01-12 16:07:45.000000000 -0
+++ named 2004-01-22 11:13:31.000000000 -0600
@@ -40,6 +40,13 @@
 stop() {
 	ebegin "Stopping ${CHROOT:+chrooted }named"
 	checkconfig || return 2
+	if [ $CHROOT -a -d $CHROOT ] ; then
+		PIDFILE="${CHROOT}/var/run/named/named.pid"
+		KEY="${CHROOT}/etc/bind/rndc.key"
+	else
+		PIDFILE="/var/run/named/named.pid"
+		KEY="/etc/bind/rndc.key"
+	fi
 	start-stop-daemon --stop --quiet --pidfile $PIDFILE \
 		--exec /usr/sbin/named -- stop
 	eend $?
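On the question of whether other functions need the same logic: one possible approach, sketched here as an assumption and not as what the ebuild actually does, is to move the path selection into a small helper that any function can call. The name set_named_paths is hypothetical. The test is also quoted so it stays well-formed when CHROOT is empty or unset.

```shell
#!/bin/sh
# Hypothetical helper, not in the posted patch: derive PIDFILE and KEY
# from CHROOT in one place so every function that needs them can share it.
set_named_paths() {
    # Use the chroot prefix only if CHROOT is set and is an existing directory.
    if [ -n "${CHROOT:-}" ] && [ -d "${CHROOT}" ]; then
        prefix=$CHROOT
    else
        prefix=
    fi
    PIDFILE="${prefix}/var/run/named/named.pid"
    KEY="${prefix}/etc/bind/rndc.key"
}
```

Calling set_named_paths at the top of stop() (and of any other function that touches the pid file or key) would then replace the duplicated if/else block.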
I applied the fix, but running

date;/etc/init.d/named restart;date;/etc/init.d/named restart;date;

still gives errors.