Bug 181145 - net-dialup/ppp - killing pppd before detaching kills whole kde session of user
Bug#: 181145 Product:  Gentoo Linux Version: unspecified Platform: All
OS/Version: Linux Status: RESOLVED Severity: normal Priority: P2
Resolution: FIXED Assigned To: net-dialup@gentoo.org Reported By: bugs@saschahlusiak.de
Component: Applications
URL: 
Summary: net-dialup/ppp - killing pppd before detaching kills whole kde session of user
Keywords:  
Status Whiteboard: 
Opened: 2007-06-06 23:11 0000
Description:   Opened: 2007-06-06 23:11 0000
Problem is reported upstream but I'm not sure about upstream situation. Website
says, ppp-2.4.3 is current, portage lists 2.4.4.

http://ppp.samba.org/cgi-bin/ppp-bugs/incoming?id=1411;expression=kill;user=guest
http://ppp.samba.org/cgi-bin/ppp-bugs/incoming?id=1589;expression=kill;user=guest

When entering the command "/usr/sbin/pppd pty 'sleep 60' login nodetach" (or
updetach) in the KDE run box and then execute "killall pppd", it will kill my
whole kde session, due to a kill(0, SIGNAL) (killing process group) before
setsid is called in detach() in file pppd/main.c. With the option detach
(instead of nodetach or updetach) it works fine though.

Moving of setsid out of detach() to the beginning of pppd fixes it, but I have
no idea about possible side-effects.

My user can connect to ISP using "sudo pon" and when he calls "sudo poff"
before the connection is established, kde dies. To see output in a terminal,
the option updetach is enabled.

Reproducible: Always

Steps to Reproduce:
1. Enter "/usr/sbin/pppd pty 'sleep 60' login nodetach" in KDE run box
2. Execute "killall pppd"
3. Watch whole KDE session die

------- Comment #1 From Alin Năstac 2007-06-09 22:12:58 0000 -------
Fixed in 2.4.4-r6 by applying kill-pg.patch.
When detach() hasn't been called yet, kill_my_pg() will kill all children one
by one instead killing the process group.

------- Comment #2 From R. Müller 2008-04-04 13:02:03 0000 -------
IMHO this bug has to be reopened, since this is not bug at all in "pppd".
it clear documentated, that pppd sends the signals it receives to the whole
process-group  (from man-page) :

* SIGINT, SIGTERM, SIGHUP : [...] pppd will send the same signal to its process
group [...]

to start pppd in his own process-group you can use "setsid" which starts a new
session (process-group) on the given command. this also will avoid, that  the
process that spawned "pppd" will be killed e.g. kde.

thats why we say "it's not a bug - its a feature !"

there are several use-cases, in which the given fix causes serious problems,
for instance waiting for call, which means staying long time in the
connect-script. in this case you can not terminate pppd by a normal
term-signal. Not to mention that this fix changes fundamental behavior of the
ppp-daemon. 

summary :

this not a bug ! the mentioned problem can be easily solved without touching
"pppd" ! additionally the given fix causes new problems !

------- Comment #3 From Sascha Hlusiak 2008-04-04 16:11:37 0000 -------
Did you read the two links of my original statement?

I'd define it as a bug if it clearly does not do what the user wants in some
cases. For me it killed a whole KDE session just because setsid was NOT called
early enough.

I think the patch does not change fundamental behaviour of pppd. Killing the pg
should still kill all the scripts it started. Isn't the connect script ran by
pppd in the same process group than pppd?

------- Comment #4 From R. Müller 2008-04-07 07:58:54 0000 -------
I'm not really sure what you mean with :
"[...] just because setsid was NOT called early enough [...]"
Can you please explain a little more.

Here is a minimal testcase :

/usr/sbin/pppd /dev/ttyS3 115200 nodetach connect 'sleep 600'

(adjust the parameters to fit your system)

You can run it as root or as non-privileged user with option 'noauth' in
/etc/options. Its not possible to kill the 'pppd' by just killing pid of the
running 'pppd'. But in my opinion it has to. 'pppd' ignores any signal except
the evil ones, of cause, e.g. -9.

my solution for this bug-report is as follows :

/usr/bin/setsid /usr/sbin/pppd /dev/ttyS3 115200 nodetach connect 'sleep 600'

this will start pppd in his own process-group - so KDE or any other
parent-process is not touched by killing 'pppd'.

------- Comment #5 From Sascha Hlusiak 2008-04-07 08:28:53 0000 -------
Created an attachment (id=148960) [details]
Record connector script as child so it can be killed

You are right. But the problem is that the connector script was not recorded in
the array of child processes, so it's not killed in kill_my_pg.
The setsid logic in pppd is alright for me and the kill-pg.patch should not be
reverted. 

Please test the attached patch if it works for you. 

------- Comment #6 From Sascha Hlusiak 2008-04-07 08:30:25 0000 -------
Reopen because of side effects. Patch attached.

------- Comment #7 From R. Müller 2008-04-07 09:31:54 0000 -------
Its a pitty, that there is no response on your bug-report @ppp.samba.org, but
changing the behavior of ppp in a single distribution seems for me not the
goal.

I'm sticking to my opinion. the behavior of 'pppd' on kill-signals is clearly
documentated. if you want not, that pppd kills your KDE, than run it with
'setsid' or detached, but don't change the code. 

moreover changing this here is not the right location. you (better we) have to
discuss this @ ppp.samba.org. i think we are not at the position to change this
behavior here. both positions have there pros and cons. we can't decide what
might be wrong or right.

with your patch, you have to adjust the man-pages as well.

------- Comment #8 From Sascha Hlusiak 2008-04-07 10:03:50 0000 -------
ppp seems to be pretty dead upstream. And we are not changing behaviour but
fixing things.

pppd already calls setsid to create a process group but only when detaching
(detach or updetach option). Killing pppd after it has detached is fine, but
killing it before that kills much more than you want (apache, running boot
services, etc). Is that behaviour documented in the man page? Do all Gentoo
scripts and dialup programs (kppp, pon, etc) call setsid to prevent the system
to wreck?
Is it documented that signals behave completely different when pppd is run with
nodetach and when it is called with detach/updetach? What would users expect?
The man page does not document the process group at all so how would users know
at all about the setsid() call in pppd/main.c?

What's wrong with the patch? It will close all children, the connectors,
disconnectors, etc, just like the manpage tells. This does not occur by sending
the signal to the process group, but that's for good. Do you have a usecase
where this fails?

The original behaviour caused me a lot of headaches to find "random" crashes or
random programs.

------- Comment #9 From R. Müller 2008-04-08 08:18:30 0000 -------
> Please test the attached patch if it works for you.
thanks a lot for your work, but this is not the point.
simply don't fix an error where is no error.
if somebody needs a new process group (to avoid killing other processes) - he
can use the command "setsid" (see example above).

> Is it documented that signals behave completely different when pppd is run with nodetach and when it is called with detach/updetach?
pppd behaves same, sends the signals to all processes in the process group.
Just in deamon mode it creates a new session, what is fine.

> What would users expect?
exactly what is written in the manpage

> What's wrong with the patch?
you try to fix things at the wrong place.

> The original behaviour caused me a lot of headaches to find "random" crashes or random programs.
Your patch caused me a lot of headaches since with gentoo-linux pppd does not
behave as it should.

------- Comment #10 From Sascha Hlusiak 2008-04-08 10:40:05 0000 -------
> simply don't fix an error where is no error.
In my point of view pppd behaves unpredictable. It's not clear when it is
killing what processes. I don't think that it is designed to be that way so
yes, my patch changes the design, makes it more predicable and most important,
more safe. The man page really is not explicit about the process group. 

> if somebody needs a new process group (to avoid killing other processes) - he
> can use the command "setsid" (see example above).
Tell me, why would anybody _NOT_ want pppd in an own process group? Without
that it's a very dangerous program, killing other programs that have nothing to
do with pppd.

> pppd behaves same, sends the signals to all processes in the process group.
> Just in deamon mode it creates a new session, what is fine.
Why is the latter fine? That was the whole problem for me and it's not even
documentated. 

> > What's wrong with the patch?
> you try to fix things at the wrong place.
Again, what's wrong with my patch? Isn't it easier to change a few lines in
pppd to get a predictable behaviour instead of patching all the init scripts
and programs out there to use process groups so they don't accidentially kill
my apache and sshd?

> > The original behaviour caused me a lot of headaches to find "random" crashes or random programs.
> Your patch caused me a lot of headaches since with gentoo-linux pppd does not
> behave as it should.
True, I agree it caused other side effects and the report is appreciated.
So did you try the second patch? 

------- Comment #11 From Alin Năstac 2008-04-10 21:43:46 0000 -------
(In reply to comment #9)
> thanks a lot for your work, but this is not the point.
> simply don't fix an error where is no error.
> if somebody needs a new process group (to avoid killing other processes) - he
> can use the command "setsid" (see example above).

I agree with Sacha, killing another process group than your own *is* a bug (and
no, this is not in accord with the documentation, the documentation specifying
that it will kill *its* process group). Your setsid suggestion is simply a
workaround, nothing more.

Please test the Sacha's patch and post here the results. Btw, nice patch ;)

------- Comment #12 From R. Müller 2008-04-11 09:41:37 0000 -------
(In reply to comment #11)
> I agree with Sacha, killing another process group than your own *is* a bug (and
> no, this is not in accord with the documentation, the documentation specifying
> that it will kill *its* process group). Your setsid suggestion is simply a
> workaround, nothing more.
> 
> Please test the Sacha's patch and post here the results. Btw, nice patch ;)
> 

thank you for your opinion, Alin. but there is something, what you obviously
misunderstood. pppd does *not* kill other process-groups ! it *only* kills his
own! and thats exactly what the man-pages says. 

btw. do you know a similar patch for debian, suse or redhat and all the other
ditsribution out there ... ? i don't think so. the point is, if you wish to
change something, in how pppd behaves, than do it at the right location !

------- Comment #13 From Alin Năstac 2008-04-11 17:24:10 0000 -------
1) pppd call setsid() with the obvious intention to isolate itself and its
children and to have a easy way of killing whatever it started. Therefore, pppd
*defines* its own process group, it *does not* expect user to define it for
him.
2) when a signal is received before setsid() is called, the result is that the
parent's process group (the one that started the pppd) gets killed.

Now please tell me (with a straight face) this is not a bug... If the patch(es)
are not good enough for your case, please provide alternative. I thought about
calling setsid() first thing after starting pppd and I  think this is the best
way to fix that because it will klll all descendant processes not only the one
it created itself.

As for us modifying pppd sources to fix its bugs, let me tell you something: we
are not alone! All others do it. Also, usually we provide our patches to
upstream, but since ppp upstream is dead for years now, we only solve our
problems.

------- Comment #14 From Alin Năstac 2008-04-12 08:35:31 0000 -------
Fixed in -r15.

I've re-written the kill-pg.patch in a much cleaner way. I don't know why I
always find the complicate way of solving things while there are much simpler
solutions to the given problem :-\

------- Comment #15 From Alin Năstac 2009-11-16 22:41:20 0000 -------
I've changed kill-pg patch in the revision -r25, see bug 292374.