During the glibc update I noticed that /etc/init.d/nrpe stop does not stop all nrpe processes.
How to reproduce:
1. /etc/init.d/nrpe start
2. schedule a check of all services on that host
3. ps ax | grep nrpe -> shows multiple processes
4. /etc/init.d/nrpe stop
5. ps ax | grep nrpe -> some are still running
Should this be reported upstream, or should we handle it ourselves in some way?
Which processes are left running? The main daemon process and its children (i.e. whatever gets created by /etc/init.d/nrpe start) should be killed by the corresponding "stop". Does that fail, or is it some other processes, created in the meantime, that stick around?
# ps auxww | grep nrpe
nagios 1259 0.0 0.0 25904 2432 ? Ss 16:14 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
nagios 3196 0.0 0.0 25904 2948 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
nagios 3242 3.0 0.1 26592 4156 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
nagios 3251 3.0 0.1 26592 4156 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
nagios 3259 0.0 0.0 26592 2084 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
nagios 3272 0.0 0.0 26592 2084 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
# /etc/init.d/nrpe stop
 * Stopping nrpe ...
# /etc/init.d/nrpe status
 * status: stopped
# ps auxww | grep nrpe
nagios 3196 0.0 0.0 25904 2948 ? S 16:17 0:00 /usr/bin/nrpe -c /etc/nagios/nrpe.cfg --daemon
Do any of the upstream init files (from the startup directory) fare any better? I'm not sure what the problem is, but I'm planning on sending our simplified init script upstream and don't want to make anything worse.
This seems to be fixed in 3.2.0, as I cannot reproduce it any more. The problem occurred with longer-running checks: nrpe had forked a few new processes (for example when forcing a check of all services on a host), and I restarted nrpe (after the upgrade) while they were still running.
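For anyone who wants to re-check this on another version, here is a minimal sketch of how the leftover processes could be identified. It compares the nrpe PIDs seen before and after the "stop". The function name survivors is hypothetical, and the sketch assumes pgrep from procps is available and that the process name is exactly "nrpe"; adjust as needed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the nrpe init script): print every PID
# from the "before" list that is still present in the "after" list.
# Both arguments are whitespace-separated PID lists.
survivors() {
    before=$1
    after=$2
    for pid in $before; do
        for other in $after; do
            if [ "$pid" = "$other" ]; then
                echo "$pid"
            fi
        done
    done
}

# Intended use, as root on a host with nrpe installed (assumption: the
# process name is exactly "nrpe", so pgrep -x matches it):
#   before=$(pgrep -x nrpe)
#   /etc/init.d/nrpe stop
#   after=$(pgrep -x nrpe)
#   survivors "$before" "$after"
```

Any PID it prints was running before the stop and survived it. Forcing a check of all services just before taking the first snapshot should make the race described above more likely to show up, if it still exists.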