"Hello. We faced a bug (?) in Linux kernel causing different misbehaviours on our server. After exploration, it seems that we found some security implications of this issue. When a process exits, it's parent is notified by SIGCHLD, and finished child is kept in process table in "zombie" state until parent process (or init, if parent is already ended) handles child exit. Similary, with linuxthreads, when a thread exits, another thread in the same process is notified by signal 33 (SIGRT_1), and exitted thread exists in the process table in "zombie" state until the exit is handled. When a signal that notifies about exit is generated by the kernel, kernel code allocates a "struct sigqueue" object. This object keeps information about the signal until the signal is delivered. Only a limited number of such objects may be allocated at a time. There is some code in the kernel that still allows signals with numbers less than 32 to be delivered when "struct sigqueue" object can't be allocated. However, for signal 33 signal generation routine just returns -EAGAIN in this case. As the result, process is not notified about thread exits, and ended thread is left in "zombie" state. Details are at http://www.ussg.iu.edu/hypermail/linux/kernel/0404.0/0208.html For long-living processes that create short-living threads (such as mysqld), this causes process table overflow in several minutes. "struct sigqueue" overflow may be easily caused from userspace, if a process blocks a signal and then receives a large number of such signals. 
The following sample code does that:

    #include <signal.h>
    #include <unistd.h>
    #include <stdlib.h>

    int main()
    {
        sigset_t set;
        int i;
        pid_t pid;

        /* block signal 40 so it stays pending instead of being delivered */
        sigemptyset(&set);
        sigaddset(&set, 40);
        sigprocmask(SIG_BLOCK, &set, 0);

        /* send 1024 instances of the blocked signal to ourselves */
        pid = getpid();
        for (i = 0; i < 1024; i++)
            kill(pid, 40);

        /* stay alive so the pending signals are never released */
        while (1)
            sleep(1);
    }

So if a user runs such code (or just runs a buggy program that blocks a signal and then receives 1000 such signals, which happens here), this will cause a DoS against anything running on the same system that uses linuxthreads, including daemons running as root.

On systems that use NPTL (such as Linux 2.6 kernel) there is no 'thread zombie' problem, because NPTL uses another notification mechanism. However, DoS is still possible (and really happens, in the form of daemon crashes), because when it is not possible to allocate a "struct sigqueue" object, kernel behaviour in signal-passing changes, causing random hangs and segfaults in different programs."

Can someone confirm this?
This is confirmed, see the LKML thread: http://marc.theaimsgroup.com/?t=108150234800003&r=1&w=2
Doesn't look very practical. No upstream fix for now; they are still discussing how to do it (and who should).
status = wait for upstream
Some more pointers:
http://www.securityfocus.com/bid/10096
http://xforce.iss.net/xforce/xfdb/15917
kernel local DoS = A3
no upstream fix yet
According to Marcelo Tosatti, "v2.6.7-mm tree contains a fix for this, adding a rlimit for pending signals.": http://marc.theaimsgroup.com/?l=linux-kernel&m=108725996708714&w=2
Not sure it's patchable on most kernels though...
Status update: a patch adding a per-user limit on pending signals is apparently in 2.6.8-rc1: http://kerneltrap.org/node/view/3443
The code is probably this one: http://lkml.org/lkml/2004/5/11/46 and the follow-up patches by Chris Wright. Apparently no backport to 2.4.x yet.
Downgrading severity, as this should not be worth a GLSA all by itself.
2.6.8 final includes the fixes. Not sure they are easy to backport, and this is not a very serious issue. We should probably wait for another vulnerability needing >=2.6.8 to include this one in a kernel GLSA.
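For anyone checking a fixed system: assuming the limit discussed above is the one that getrlimit(2) documents as RLIMIT_SIGPENDING (listed there as available since Linux 2.6.8), it can be read as in the sketch below. The helper name `pending_signal_limit` is made up for illustration and is not taken from any of the patches.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: store the soft limit on queued signals in *soft.
 * Returns 0 on success, -1 if getrlimit() fails (for example on a kernel
 * that does not know RLIMIT_SIGPENDING).
 */
int pending_signal_limit(rlim_t *soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_SIGPENDING, &rl) != 0)
        return -1;

    *soft = rl.rlim_cur;
    return 0;
}
```

A return value of -1 on a given box would suggest its kernel predates the limit, which is one cheap way to tell whether the fix is present.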
Moving to newly-created kernel-specific category
Closing: this doesn't seem to be fixed upstream, nor do we consider it to be a security risk...