I'm running Gentoo on 3 servers using systemd as init. I have seen this with systemd from 218 through sys-apps/systemd-219-r2.

Under conditions I can't pin down, but which are likely related to bursty high load (for instance when doing emerge -avuD @world), the sshd program dies. This happens often and almost randomly. For example:

    root     pts/2        2a02:8070:c697:a Thu Apr 23 11:59   still logged in
    root     ssh          2a02:8070:c697:a Thu Apr 23 11:59 - 12:13  (00:14)
    root     pts/0        2a02:8070:c697:a Thu Apr 23 11:41 - 12:13  (00:32)
    root     ssh          2a02:8070:c697:a Thu Apr 23 11:41 - 11:59  (00:17)

As you can see:

1. A second connection should be "named" pts/1, not pts/2.
2. At pts/0, the ssh process is terminated before pts/0 is terminated, exactly 15 minutes later.
3. At pts/2, the ssh process is terminated but pts/2 is still active.
4. It seems like the second ssh process is being associated with pts/0 and closed when pts/0 is terminated.

The symptom is that output which should be sent to the client isn't sent. However, if you hold down the enter key so that the keypress repeats for about 2 seconds, the pending output arrives at the client and the session continues working as usual.

I have not experienced this behaviour on any of the 3 machines with other distros. It doesn't always happen, but it happens often enough to be considered a bug.

Here's a console capture (last few lines) from when I don't keep pressing the enter key:

    Would you like to merge these packages? [Yes/No]
    >>> Verifying ebuild manifests
    >>> Running pre-merge checks for net-libs/iojs-1.8.1
    >>> Emerging (1 of 14) sys-libs/cracklib-2.9.4::gentoo
    >>> Installing (1 of 14) sys-libs/cracklib-2.9.4::gentoo
    >>> Emerging (2 of 14) dev-db/sqlite-3.8.9::gentoo
    >>> Jobs: 1 of 14 complete, 1 running Load avg: 0.32, 0.10, 0.07packet_write_wait: Connection to _the_ip_address_: Broken pipe

Reproducible: Sometimes
This is how a fresh session looks:

    # last|head
    root     pts/0        2a02:8070:c697:a Thu Apr 23 12:34   still logged in
    root     ssh          2a02:8070:c697:a Thu Apr 23 12:34   still logged in
    root     pts/2        2a02:8070:c697:a Thu Apr 23 11:59 - 12:34  (00:35)
    root     ssh          2a02:8070:c697:a Thu Apr 23 11:59 - 12:13  (00:14)
    root     pts/0        2a02:8070:c697:a Thu Apr 23 11:41 - 12:13  (00:32)
    root     ssh          2a02:8070:c697:a Thu Apr 23 11:41 - 11:59  (00:17)
First of all, please paste your "emerge --info" and "emerge -pv openssh" output.
# emerge --info
Portage 2.2.18 (python 2.7.9-final-0, default/linux/amd64/13.0, gcc-4.9.2, glibc-2.20-r2, 3.18.11-gentoo x86_64)
=================================================================
System uname: Linux-3.18.11-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_E3-1245_V2_@_3.40GHz-with-gentoo-2.2
KiB Mem: 16378924 total, 14279780 free
KiB Swap: 16776188 total, 16776188 free
Timestamp of repository gentoo: Mon, 27 Apr 2015 06:15:01 +0000
sh bash 4.3_p33-r2
ld GNU ld (Gentoo 2.25 p1.0) 2.25
distcc 3.2rc1 x86_64-pc-linux-gnu [enabled]
app-shells/bash: 4.3_p33-r2::gentoo
dev-lang/perl: 5.20.2::gentoo
dev-lang/python: 2.7.9-r2::gentoo, 3.3.5-r1::gentoo, 3.4.1::gentoo
dev-util/cmake: 3.0.2::gentoo
dev-util/pkgconfig: 0.28-r2::gentoo
sys-apps/baselayout: 2.2::gentoo
sys-apps/openrc: 0.14::gentoo
sys-apps/sandbox: 2.6-r1::gentoo
sys-devel/autoconf: 2.69-r1::gentoo
sys-devel/automake: 1.13.4::gentoo, 1.15::gentoo
sys-devel/binutils: 2.25::gentoo
sys-devel/gcc: 4.9.2::gentoo
sys-devel/gcc-config: 1.8-r1::gentoo
sys-devel/libtool: 2.4.6-r1::gentoo
sys-devel/make: 4.1-r1::gentoo
sys-kernel/linux-headers: 3.18::gentoo (virtual/os-headers)
sys-libs/glibc: 2.20-r2::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://mirror.hetzner.de/gentoo/portage
    priority: -1000

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O3 -pipe -march=native -fstack-protector-strong --param=ssp-buffer-size=4"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O3 -pipe -march=native -fstack-protector-strong --param=ssp-buffer-size=4"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build=y"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs buildpkg config-protect-if-modified distcc distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="ftp://mirror.hetzner.de/gentoo/"
LANG="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8 -l8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
USE="X509 acl aes amd64 audit avx berkdb bzip2 cli cracklib crypt cxx dri fortran gdbm go iconv icu ipv6 mmx mmxext modern-top modules multilib ncurses nls nptl openmp pam pcre popcnt readline seccomp session sse sse2 sse3 sse4_1 sse4_2 ssl ssse3 systemd tcpd unicode xattr zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="pc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" NGINX_MODULES_HTTP="access auth_basic autoindex browser charset empty_gif geo gzip map proxy referer rewrite userid addition auth_pam auth_request cache_purge echo geoip gunzip gzip_static headers_more metrics push_stream realip secure_link slowfs_cache spdy sticky stub_status sub upload_progress upstream_check" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_3" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

The 3 hosts are identical in terms of hardware.

# emerge -pv openssh

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild R ] net-misc/openssh-6.8_p1-r4::gentoo USE="hpn pam pie ssh1 ssl -X -X509 -bindist -debug -kerberos -ldap -ldns -libedit -sctp (-selinux) -skey -static" 0 KiB
This may have to do with the CFLAGS you're using. I would try re-emerging everything with "-march=native -O2 -pipe". If you don't want to do that, set up a cron job that restarts sshd every few minutes if it has been killed. Also check dmesg and the other logs to see what is happening on the boxes, without rebooting them.

I'm closing this as TEST-REQUEST. Reopen only if you still experience issues and you're sure that this is an openssh bug and not something related to your configuration.
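Editorial note: the cron workaround suggested above could look roughly like the sketch below. It is only a stopgap, not a fix, and it rests on assumptions that do not come from this report: the unit name sshd.service, the SYSTEMCTL override variable, and the --run switch are all illustrative.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: restart sshd only if systemd reports it inactive.
# UNIT is an assumption; check the actual unit name on your system.
UNIT="${UNIT:-sshd.service}"
# SYSTEMCTL can be overridden (e.g. for a dry run); defaults to the real tool.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

check_and_restart() {
    # "systemctl is-active --quiet" exits 0 only when the unit is active.
    if "$SYSTEMCTL" is-active --quiet "$UNIT"; then
        echo "$UNIT is active"
    else
        echo "$UNIT is not active, restarting"
        "$SYSTEMCTL" restart "$UNIT"
    fi
}

# Only act when explicitly asked, so the file can be sourced safely.
if [ "${1:-}" = "--run" ]; then
    check_and_restart
fi
```

Saved under a path of your choosing (for example the hypothetical /usr/local/sbin/sshd-watchdog.sh), it could be called from root's crontab with an entry such as "*/5 * * * * /usr/local/sbin/sshd-watchdog.sh --run". Running "journalctl -u sshd.service" and "dmesg" after an incident covers the log-checking part of the suggestion.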
I used -O2 previously: same result. I disabled distcc and used -O2: same result. But OK, I have now recompiled with -O2 and the machines will be rebooting in a minute. We'll see how that turns out. Today seems to be a rather stable day.

I'm not the only one experiencing this. Gentoo Forum user Schnulli is also seeing it; he works around it by using screen. I sent him a PM.

It's basically a stock configuration with -march=native -O3 and a few additional USE flags, and it has happened from the very beginning. A very minimal system (or 3).

I know it's a complicated bug, and there *is* an issue, but I don't know what it's related to. It might be network instability, but then how does the pts/0 ssh process get associated with pts/2? And why is there no pts/1? So there must be some software bug. I don't know what it's related to, hence the bug report. It might be openssh, it might be something else. It's also not clear whether the ssh process is killed or whether it segfaults. The logs show nothing, and a segfault would appear in the logs.

I've been observing this behaviour for about 2 months.
Happened again with -O2, after opening a new terminal and connecting to another server:

    root     pts/1        2a02:8070:c680:c Wed Jun  3 13:15   still logged in
    root     ssh          2a02:8070:c680:c Wed Jun  3 13:15 - 13:20  (00:05)
    root     pts/0        2a02:8070:c680:c Wed Jun  3 12:32 - 13:20  (00:48)
    root     ssh          2a02:8070:c680:c Wed Jun  3 12:32 - 13:15  (00:43)
    reboot   system boot  3.18.11-gentoo   Wed Jun  3 12:31   still running
    ^ term1 closed term2 connected
And again:

    root     pts/1        2a02:8070:c6c2:6 Mon Jun 22 16:19   still logged in
    root     ssh          2a02:8070:c6c2:6 Mon Jun 22 16:19   still logged in
    root     pts/0        2a02:8070:c6c2:6 Mon Jun 22 15:36   still logged in
    root     ssh          2a02:8070:c6c2:6 Mon Jun 22 15:36 - 16:19  (00:43)

And again, it's a stock configuration. Flagging this as RESOLVED TEST-REQUEST is the lazy way out.
And again, on a new system with a new installation:

    root     pts/1        2a02:8070:c688:a Mon Jan 25 03:44   still logged in
    root     ssh          2a02:8070:c688:a Mon Jan 25 03:44   still logged in
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:50 - 01:50  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:50 - 01:50  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:41 - 01:41  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:41 - 01:41  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:38 - 01:38  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:38 - 01:38  (00:00)
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:38 - 01:38  (00:00)
    root     pts/0        2a02:8070:c688:a Mon Jan 25 01:35   still logged in
    root     ssh          2a02:8070:c688:a Mon Jan 25 01:35 - 01:38  (00:03)

This is with -O2 -march=native -pipe and gcc-5.3.0. Come on, open the bug.

Same hardware, same system with Debian 7, CentOS 7, openSUSE 42.1, Fedora 23: no such issue.
Gentoo: issue.