If mailman has a stale lock lying around at boot time, the init script will happily declare [ok] even though mailman has failed to start. If I run the main step of the start() function manually, I get:

    skagit root # su - mailman -c 'bin/mailmanctl start'
    The master qrunner lock could not be acquired. It appears as though there is a stale master qrunner lock. Try re-running mailmanctl with the -s flag.

This suggests two problems:

First, it looks like mailman and mailmanctl should be passing a failure code back to the init script, so that the init script won't incorrectly assume that everything is hunky-dory.

Second, perhaps

Reproducible: Didn't try

Steps to Reproduce:
1. Start mailman if it isn't already running:
    /etc/init.d/mailman start
2. Kill it abruptly:
    ps -u mailman -opid --noheader | xargs kill -9
3. Convince init.d that it hasn't actually started:
    rm /var/lib/init.d/started/mailman
   (At this point, you're at the right initial conditions for the bug: a stale lock file.)
4. Try starting mailman again:
    /etc/init.d/mailman start
5. Observe that it hasn't started:
    ps auxww | grep runner
   (Expect no processes in the output except maybe the grep.)
6. Manually invoke mailman as the init script does:
    su - mailman -c 'bin/mailmanctl start'
   You should see the message:
    The master qrunner lock could not be acquired. It appears as though there is a stale master qrunner lock. Try re-running mailmanctl with the -s flag.
7. Recover by invoking mailman with the -s flag, as it suggests:
    su - mailman -c 'bin/mailmanctl -s start'
8. Observe that it has started:
    ps auxww | grep runner
9. Now stop mailman and start it through the init script to get init.d back in sync:
    su - mailman -c 'bin/mailmanctl stop'
    /etc/init.d/mailman start

Actual Results:
(described inline in "Steps to Reproduce" section above)

Expected Results:
(described in "Details" section above)
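The first problem above (the init script reporting [ok] regardless of the outcome) could be addressed roughly as follows. This is only a sketch, and it assumes that mailmanctl exits with a non-zero status when it cannot acquire the lock, which the report suggests it should do but does not confirm. The name start_checked is a hypothetical helper for illustration, not something taken from the Gentoo init script or from mailman.

```shell
# Hypothetical helper: run the given start command and report success
# only if the command itself exits with status 0.
start_checked() {
    if "$@"; then
        echo "[ok]"
        return 0
    else
        # In the else branch, $? still holds the exit status of the command.
        status=$?
        echo "[!!] start command failed with status $status" >&2
        return 1
    fi
}
```

In the init script's start() function this might be called as start_checked su - mailman -c 'bin/mailmanctl start'. If mailmanctl turns out to return 0 even on a lock failure, the check would have to look at something else, for example whether any qrunner processes exist afterwards.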
This occurs with: net-mail/mailman-2.1.4 *
Jon, would you be able to come up with a patch? If not, I'll look into this when I get a chance. Thanks.
Is this even remotely relevant after two years?
> Is this even remotely relevant after two years?

No idea. I think it's a bug in the Gentoo init.d script, however, so it won't automatically have been cured by changes upstream. If the Gentoo init.d script hasn't changed in two years, then the bug is still there. Mailman is one of those services you really want to start up correctly every time...

(It's no longer relevant to me personally, though, because I'm not running any mailman lists.)
Shrug... User no longer uses this, closing.