
Bug 70043

Summary: Stopping nagios leaves running processes
Product: Gentoo Linux Reporter: Michal Margula <alchemyx>
Component: New packages Assignee: Gentoo Netmon project <netmon>
Status: RESOLVED INVALID    
Severity: normal    
Priority: High    
Version: 2004.2   
Hardware: All   
OS: All   
Whiteboard:
Package list:
Runtime testing required: ---

Description Michal Margula 2004-11-04 05:52:01 UTC
When nagios is running, it spawns a few dozen processes that work together (plugins, notify commands and the main nagios processes). Unfortunately, after issuing

/etc/init.d/nagios stop

they are still running:

nagios    9065  0.0  0.2  4584 2916 ?        Ss   Nov03   1:05 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg
nagios   22868  0.0  0.2  4588 3008 ?        S    14:48   0:00 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg
nagios   22869  0.0  0.0  1344  476 ?        S    14:48   0:00 /usr/nagios/libexec//check_ping -H a.b.c.d -w 50.0,5% -c 100.0,20% -p 20
nagios   22874  0.0  0.0  1496  464 ?        S    14:48   0:00 /bin/ping -n -U -c 20 a.b.c.d
nagios   22913  0.0  0.2  4588 3008 ?        S    14:48   0:00 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg
nagios   22914  0.0  0.0  1344  476 ?        S    14:48   0:00 /usr/nagios/libexec//check_ping -H a.b.c.d -w 100.0,20% -c 300.0,50% -p 20
nagios   22916  0.0  0.0  1492  460 ?        S    14:48   0:00 /bin/ping -n -U -c 20 a.b.c.d

And so on...

After waiting for some time, only one process is left:

nagios    9065  0.0  0.2  4584 2916 ?        Ss   Nov03   1:05 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg

And then it respawns the plugins again. Shouldn't all processes be killed after some time? For example, wait 10 seconds and then kill everything belonging to nagios.
Comment 1 Eldad Zack (RETIRED) gentoo-dev 2004-11-22 08:45:53 UTC
No. It should be very controlled, so killall can't be used.

Can you make this change in /etc/init.d/nagios:

Insert this line between lines 41 and 42 (after stop()):

einfo "Nagios PID: $(< /var/nagios/nagios.lock)"

This will show you the PID it is stopping. Then, when you stop nagios, update the report to say whether that PID corresponds to the process that is still running. I've had this problem myself with the original /etc/init.d/ script, but not on Gentoo.
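
A minimal sketch of that comparison, assuming the lock file holds a single numeric PID on its first line. The helper names pid_from_lock and lock_pid_is_running are hypothetical and not part of the Gentoo init script; the default path is the one used in the einfo line above.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from /etc/init.d/nagios.

# Print the PID stored in the lock file, or fail if the file is
# unreadable, empty, or does not contain a plain number.
pid_from_lock() {
    lockfile="${1:-/var/nagios/nagios.lock}"
    pid=''
    [ -r "$lockfile" ] || return 1
    # Read the first line; tolerate a missing trailing newline.
    read -r pid < "$lockfile" || [ -n "$pid" ] || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf '%s\n' "$pid"
}

# Succeed only if the PID from the lock file names a live process.
lock_pid_is_running() {
    pid="$(pid_from_lock "$1")" || return 1
    # kill -0 delivers no signal; it only checks that the process
    # exists and that we are permitted to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

Running lock_pid_is_running before and after the stop command, and comparing the output of pid_from_lock with the PIDs in the ps listing, would show whether the script is targeting the process that keeps running.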
Comment 2 Michal Margula 2004-11-22 14:19:55 UTC
It was lines 21 and 22 :)

I had the lock file for nagios in a different place. I have changed that now and am watching what happens:

root@gollum alchemyx # /etc/init.d/nagios stop
 * Nagios PID: 30077
 * Stopping nagios...                                                                                       [ ok ]

root@gollum alchemyx # dmesg
root@gollum alchemyx # ps aux | grep nagios
nagios     773  0.0  0.3  4592 3112 ?        S    23:12   0:00 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg
nagios     777  0.0  0.3  4592 3112 ?        S    23:12   0:00 /usr/nagios/bin/nagios -d /etc/nagios/nagios.cfg

And a few more running processes. After a few moments they disappear, so everything is fine now. My fault then, sorry!

By the way, why wasn't the script complaining about the missing lock file?
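
One way such a complaint could look, as a rough sketch only: the function name require_lock is hypothetical, and a real Gentoo init script would more likely report the problem through eerror than through a plain echo.

```shell
#!/bin/sh
# Hypothetical guard for a stop() routine: fail loudly when the lock
# file is absent or empty instead of silently reporting success.
require_lock() {
    lockfile="$1"
    if [ ! -s "$lockfile" ]; then
        echo "lock file $lockfile is missing or empty; cannot tell which process to stop" >&2
        return 1
    fi
    return 0
}
```

A stop() routine could call require_lock with the configured lock path first and return early on failure, so a misplaced lock file shows up as an error rather than as "[ ok ]".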
Comment 3 Eldad Zack (RETIRED) gentoo-dev 2004-11-22 15:23:21 UTC
See bug 72145, which I've just submitted...
Comment 4 Michal Margula 2004-11-22 15:31:07 UTC
Great, good idea. Thank you!