openssh-8.3_p1-r5 doesn't kill sessions when the machine is rebooted, which leaves hanging ssh sessions that have to be killed manually.

Reproducible: Always

Steps to Reproduce:
1. start sshd
2. reboot

Actual Results:
ssh session hangs

Expected Results:
open ssh sessions get killed when the machine reboots
Adding this to stop_pre fixes this:

if [ "${RC_RUNLEVEL}" = "shutdown" ]; then
    SSH_CLIENT_PIDS="$(pgrep -f 'sshd:')"
    if [[ -n ${SSH_CLIENT_PIDS} ]] ; then
        kill -TERM ${SSH_CLIENT_PIDS}
    fi
fi
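For anyone who wants to try this on a system where /bin/sh is not bash, here is a rough POSIX sh sketch of the same idea. It is not the packaged init script: `terminate_ssh_sessions` is a hypothetical helper name, and the assumption is that it would be called from `stop_pre` with `RC_RUNLEVEL` set by OpenRC as in the snippet above.

```shell
#!/bin/sh
# Sketch only, not the shipped sshd init script.
# terminate_ssh_sessions is a hypothetical helper name.

terminate_ssh_sessions() {
    # Do nothing unless the system is going down, so that stopping the
    # service by hand does not disconnect the administrator.
    [ "${RC_RUNLEVEL}" = "shutdown" ] || return 0

    # pgrep -f matches against the full command line; this selects every
    # process whose command line contains "sshd:". If nothing matches
    # (or pgrep is missing), there is nothing to do.
    pids="$(pgrep -f 'sshd:')" || return 0

    # Unquoted on purpose so that each PID becomes its own argument.
    # shellcheck disable=SC2086
    kill -TERM ${pids} 2>/dev/null || true
}
```

The only intended differences from the snippet above are the plain `[ ... ]` test instead of `[[ ... ]]` and the early returns, which make the function a no-op outside the shutdown runlevel.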
Is this a regression from a previous version?
No, not a regression. I remember filing a similar bug when I was new to Gentoo (other distributions actively kill sessions and Gentoo behaved differently), and I also had a conversation with William about this: the service itself only stops the master process, not any running children, because we don't want to disconnect the system administrator (the suggested fix takes care of this by checking the runlevel). I remember being told that /etc/init.d/killprocs should normally take care of child processes, so it was believed this isn't needed. It would be interesting to understand why killprocs isn't helping us here...
Hrm, I can no longer reproduce the problem -- it's working for me. The killprocs service kills the remaining processes, including sshd child processes, which terminates any active SSH connection as expected.
I don't think it makes sense to add logic to the sshd init script to kill existing ssh sessions.
It doesn't work for me. Rebooting the machine over an active ssh session leaves me with a "frozen" session.
Apply the following changes to debug this:

> --- /etc/init.d/killprocs.old 2020-11-06 17:06:26.000000000 +0100
> +++ /etc/init.d/killprocs 2020-11-05 20:32:23.000000000 +0100
> @@ -18,10 +18,15 @@
>
>  start()
>  {
> +    set -x
> +    pgrep sshd
> +
>      ebegin "Terminating remaining processes"
>      kill_all -v 15 ${killall5_opts}
>      eend 0
>      ebegin "Killing remaining processes"
>      kill_all -v 9 ${killall5_opts}
>      eend 0
> +
> +    sleep 20
>  }
Possibly the network interface(s) are stopped before killprocs is started?
(In reply to Mike Gilbert from comment #8)
> Possibly the network interface(s) are stopped before killprocs is started?

Very likely. This happens to me when I let NetworkManager handle my network devices. On OpenRC systems, NM usually gets stopped before the ssh logins are taken down.
ping
This works fine on systemd since it is smart enough to kill user sessions before stopping the network. OpenRC doesn't provide any way to identify user sessions, and doesn't have any logic to terminate them before stopping the network. We could add a workaround to the sshd init script, but I don't really see the point. I vote WONTFIX on this.
I changed my mind. There are multiple scenarios where killprocs will be too late, for example when the network has already been stopped. I am currently testing something like

> # diff -u /var/db/repos/gentoo/net-misc/openssh/files/sshd-r1.initd /etc/init.d/sshd
> --- /var/db/repos/gentoo/net-misc/openssh/files/sshd-r1.initd 2019-03-08 01:31:51.175977236 +0100
> +++ /etc/init.d/sshd 2021-03-07 20:34:27.006650772 +0100
> @@ -72,10 +72,23 @@
>  }
>
>  stop_pre() {
> -    # If this is a restart, check to make sure the user's config
> -    # isn't busted before we stop the running daemon.
>      if [ "${RC_CMD}" = "restart" ] ; then
> +        # If this is a restart, check to make sure the user's config
> +        # isn't busted before we stop the running daemon.
>          checkconfig || return $?
> +    elif yesno "${RC_GOINGDOWN}" && [ -s "${pidfile}" ] && hash pgrep 2>/dev/null ; then
> +        # Disconnect any clients before killing the master process
> +        local pid=$(cat "${pidfile}" 2>/dev/null)
> +        if [ -n "${pid}" ] ; then
> +            local ssh_session_pattern='sshd: \S.*@pts/[0-9]+'
> +
> +            IFS="${IFS}@"
> +            local daemon pid pty user
> +            pgrep -a -P ${pid} -f "$ssh_session_pattern" | while read pid daemon user pty ; do
> +                ewarn "Found ${daemon%:} session ${pid} on ${pty}; sending SIGTERM ..."
> +                kill "${pid}" || true
> +            done
> +        fi
>      fi
>  }
>

and I also plan to bring something like https://salsa.debian.org/ssh-team/openssh/-/blob/master/debian/systemd/ssh-session-cleanup.service for systemd users.
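To make the field splitting in that loop easier to follow, here is a small standalone sketch of just the parsing step. It is an illustration under assumptions, not part of the patch: `parse_sessions` is a hypothetical helper name, the sample input lines are made up to resemble `pgrep -a` output, and `IFS` is set only for the `read` call rather than changed for the whole function.

```shell
#!/bin/sh
# Sketch only: parse_sessions is a hypothetical helper that reads
# "pgrep -a"-style lines on stdin and prints "pid user pty" for each.

parse_sessions() {
    # Putting "@" into IFS for this read splits "user@pts/N" into two
    # fields; IFS is left untouched for the rest of the script.
    while IFS=" @" read -r pid daemon user pty; do
        # Skip lines that did not yield all four fields.
        [ -n "${pty}" ] || continue
        printf '%s %s %s\n' "${pid}" "${user}" "${pty}"
    done
}

# Sample input shaped like pgrep -a output (made up for illustration).
printf '%s\n' '1234 sshd: alice@pts/0' '1240 sshd: bob@pts/3' | parse_sessions
```

In the real script, the `printf` would be replaced by the `ewarn` and `kill` calls from the diff; printing keeps the sketch safe to run anywhere.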
You could use the openrc cgroup_cleanup function on hosts with cgroups enabled, it's probably nicer (and more complete) than the approach detailed here. It will also only get the ssh sessions for the sshd instance that is going down (in the case a host has multiple sshds running).
(In reply to Patrick McLean from comment #13)
> You could use the openrc cgroup_cleanup function on hosts with cgroups
> enabled, it's probably nicer (and more complete) than the approach detailed
> here. It will also only get the ssh sessions for the sshd instance that is
> going down (in the case a host has multiple sshds running).

Thank you for the feedback. I am not using cgroup_cleanup because, like you said, it's not available for everyone, and having two code paths would require two tests... My approach should take care of multiple sshd instances because we use the pidfile of the current master process to identify its child processes. However, during testing I came to think I was over-engineering: given that we are about to shut down, it should be fine to end *all* connections, not just those from this specific sshd instance.