Bug 631316 - sys-apps/netplug-1.2.9.2-r2 - Creates zombies and leaves interfaces unconfigured
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Lars Wendler (Polynomial-C) (RETIRED)
URL:
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2017-09-18 09:14 UTC by Lev Danilski
Modified: 2019-04-20 23:07 UTC (History)
CC List: 0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Wait for multiple children in SIGCHLD handler (netplug-1.2.9.2-multi-waitpid-sigchld.patch, 1.82 KB, patch)
2017-09-18 09:14 UTC, Lev Danilski

Description Lev Danilski 2017-09-18 09:14:34 UTC
Created attachment 495198
Wait for multiple children in SIGCHLD handler

I'm using netplugd and teamd with over 20 interfaces. On boot, while the bundle is being configured over all interfaces, netplugd leaves some of the interfaces unconfigured. I also noticed that netplug leaves some zombies behind, and their number matches the number of unconfigured interfaces:

ps -A f
4620 ?        Ss     0:00 /sbin/netplugd -D -c /etc/netplug/netplugd.conf -p /tmp/netplug.pid
 5081 ?        Z      0:00  \_ [netplug] <defunct>
 5082 ?        Z      0:00  \_ [netplug] <defunct>
 5111 ?        Z      0:00  \_ [netplug] <defunct>

After some digging in the netplugd code, it turned out that it forks the configuration script (/etc/netplug.d/netplug) and waits for its completion using the SIGCHLD signal. It uses a state machine to track the state of every interface, and the exit of the configuration script is required to be able to get to the inning state of the interface and actually configure it.

The zombie processes are caused by the way the SIGCHLD handler is currently implemented: it waits only for the child whose death generated the signal. The catch is that SIGCHLD is blocked while the handler is executing, and standard signals are not queued. If several children die during that time, their signals are merged into a single pending SIGCHLD, and the extra ones are effectively discarded. The children behind the discarded signals are never waited for, so they stay as zombies and the matching interfaces are not configured.

I made a minimal patch that fixes the problem, based on the implementation described at https://www.gnu.org/software/libc/manual/html_node/Merged-Signals.html

Briefly, the patch changes the SIGCHLD handler to wait for all children. This way, if more children die while the handler is executing, they are waited for as well. If a child dies after we have stopped waiting but before we exit the handler, the new signal is left pending, and the process(es) are waited for on the next call of the handler. As a result, all children should be reaped.
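
For readers who do not want to open the attachment, the general technique can be sketched roughly as follows. This is not the attached patch and not netplugd's actual code; it is a minimal illustration of the "reap everything that has exited" pattern from the glibc manual. The names reaped_children and install_sigchld_handler are hypothetical and exist only for this example.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical counter, for illustration only: how many children
 * the handler has reaped so far. */
static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler that reaps every child that has already exited,
 * not just one. Several deaths may have been merged into the single
 * SIGCHLD that triggered this call. */
static void sigchld_handler(int signo)
{
    int saved_errno = errno;   /* waitpid() may clobber errno */
    int status;
    pid_t pid;

    (void)signo;

    /* WNOHANG makes waitpid() return 0 instead of blocking when
     * children exist but none of them has exited yet; it returns -1
     * with ECHILD when there are no children at all. */
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        /* A real daemon would look up the interface that belongs
         * to pid here and advance its state machine. */
        reaped_children++;
    }

    errno = saved_errno;
}

/* Hypothetical helper: install the handler. Returns 0 on success,
 * -1 on failure (as sigaction() does). */
static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    /* SA_NOCLDSTOP: do not signal for stopped children.
     * SA_RESTART: restart interrupted system calls where possible. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;

    return sigaction(SIGCHLD, &sa, NULL);
}
```

With a handler like this, forking 20 short-lived children at once should leave no <defunct> entries in ps, whereas a handler that calls wait() exactly once per signal can leave some behind whenever signals are merged.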
Comment 1 Larry the Git Cow gentoo-dev 2019-04-20 23:07:04 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=ca3c5c6b52e60bea1eab05a2b5bc97942aa7dc66

commit ca3c5c6b52e60bea1eab05a2b5bc97942aa7dc66
Author:     Lars Wendler <polynomial-c@gentoo.org>
AuthorDate: 2019-04-20 23:02:50 +0000
Commit:     Lars Wendler <polynomial-c@gentoo.org>
CommitDate: 2019-04-20 23:06:17 +0000

    sys-apps/netplug: Attempt to fix zombie creation
    
    Thanks-to: Lev Danilski <8o55kd+1v8xnjsby8b9k@pokemail.net>
    Closes: https://bugs.gentoo.org/631316
    Package-Manager: Portage-2.3.64, Repoman-2.3.12
    Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>

 .../netplug-1.2.9.2-multi-waitpid-sigchld.patch    | 65 +++++++++++++++++++
 sys-apps/netplug/netplug-1.2.9.2-r3.ebuild         | 73 ++++++++++++++++++++++
 2 files changed, 138 insertions(+)