I just debugged a problem where users could not connect to our Intel Xeon server (-march=broadwell) using winscp; the client reported "the server unexpectedly closed the connection". On the server, I could see a kernel trap ("general protection fault") in sshd (current net-misc/openssh-9.3_p1). I recently upgraded to sys-devel/gcc-13.1.0-r1 and use -O2 optimization. Compiling with symbols and attaching to the correct child of sshd while following forks finally showed that the fault occurs in optimized code (arguments shown as <optimized out>) in function ssh_aes_ctr at cipher-ctr-mt.c:419, called from digest-openssl.c:171. After recompiling without any -O flag, sshd worked again. This is not specific to -march=broadwell - it is the same across all of my systems, from ivybridge to haswell. These systems worked as long as openssh was compiled with gcc-12 (and otherwise the same flags, i.e. -O2).

Reproducible: Always
Any chance you can give a backtrace when it happens? I appreciate it's a bit awkward to replicate given how much sshd forks though. Can you give emerge --info too for completeness? Thanks.
Also, emerge -pvO net-misc/openssh please.
I'm going to put together a windows VM later. We may also need a reduced form of your ssh config, winscp version, and windows version of one of the clients. Also, obviously, if you can find a simpler reproducer, that'd be welcome ;) I can't reproduce it yet with just normal ssh and scp.
(In reply to Sam James from comment #3)
> I'm going to put together a windows VM later. We may also need a reduced
> form of your ssh config, winscp version, and windows version of one of the
> clients.
>
> Also, obviously, if you can find a simpler reproducer, that'd be welcome ;)
>
> I can't reproduce it yet with just normal ssh and scp.

I'll return to work where I have a Windows (10) VM with winscp. It's what made it spooky for me as well - no symptoms within Linux. A path to reproduce it more easily would probably be to force the aes_ctr cipher in ssh, though I had the impression it should be commonly used...
I can now reproduce it on my systems at home:

ssh -c aes128-ctr <hostname>

Of course it did not "work" at first because the host's openssh had been compiled with gcc-12 in March. After recompiling with gcc-13 and -O2, the server process segfaults immediately. I'll send the logs in a minute...
emerge --info: Portage 3.0.47 (python 3.11.3-final-0, default/linux/amd64/17.1/systemd/merged-usr, gcc-13, glibc-2.37-r2, 6.2.14-gentoo x86_64) ================================================================= System uname: Linux-6.2.14-gentoo-x86_64-AMD_Ryzen_7_5700X_8-Core_Processor-with-glibc2.37 KiB Mem: 32798100 total, 18398872 free KiB Swap: 39112696 total, 39112696 free Timestamp of repository gentoo: Fri, 05 May 2023 11:45:01 +0000 Head commit of repository gentoo: d6a71b3c9b301120a001114af3eb482c954da148 Head commit of repository flatpak-overlay: 4bf9a7815ca9361f86459c8a8e9bc403e3721704 sh bash 5.2_p15-r2 ld GNU ld (Gentoo 2.40 p4) 2.40.0 distcc 3.4 x86_64-pc-linux-gnu [enabled] ccache version 4.8 [enabled] app-misc/pax-utils: 1.3.7::gentoo app-shells/bash: 5.2_p15-r2::gentoo dev-java/java-config: 2.3.1::gentoo dev-lang/perl: 5.36.1-r1::gentoo dev-lang/python: 3.11.3::gentoo dev-lang/rust-bin: 1.69.0::gentoo dev-util/ccache: 4.8::gentoo dev-util/cmake: 3.26.3::gentoo dev-util/meson: 1.1.0::gentoo sys-apps/baselayout: 2.13-r1::gentoo sys-apps/sandbox: 2.30-r1::gentoo sys-apps/systemd: 253.4::gentoo sys-devel/autoconf: 2.13-r8::gentoo, 2.71-r6::gentoo sys-devel/automake: 1.16.5-r1::gentoo sys-devel/binutils: 2.40-r4::gentoo sys-devel/binutils-config: 5.5::gentoo sys-devel/clang: 15.0.7-r1::gentoo, 16.0.3::gentoo sys-devel/gcc: 13.1.0-r1::gentoo sys-devel/gcc-config: 2.10::gentoo sys-devel/libtool: 2.4.7-r1::gentoo sys-devel/lld: 15.0.7::gentoo, 16.0.3::gentoo sys-devel/llvm: 14.0.6-r2::bfown, 15.0.7::gentoo, 16.0.3::gentoo sys-devel/make: 4.4.1::gentoo sys-kernel/linux-headers: 6.3::gentoo (virtual/os-headers) sys-libs/glibc: 2.37-r2::gentoo Repositories: gentoo location: /usr/portage sync-type: rsync sync-uri: rsync://rsync.de.gentoo.org/gentoo-portage priority: -1000 volatile: True sync-rsync-verify-max-age: 24 sync-rsync-extra-opts: sync-rsync-verify-jobs: 1 sync-rsync-verify-metamanifest: yes flatpak-overlay location: /gentoo/local/flatpak-overlay 
sync-type: git sync-uri: https://github.com/fosero/flatpak-overlay.git masters: gentoo priority: 50 volatile: True science location: /gentoo/local/layman/science sync-type: laymansync sync-uri: https://anongit.gentoo.org/git/proj/sci.git masters: gentoo priority: 50 volatile: True sinustrom location: /gentoo/local/layman/sinustrom sync-type: laymansync sync-uri: https://github.com/zpuskas/sinustrom-gentoo-overlay.git masters: gentoo priority: 50 volatile: True bfown location: /gentoo/overlay masters: gentoo science priority: 100 volatile: True Installed sets: @system ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="@FREE" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=znver3 -O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c" CXXFLAGS="-march=znver3 -O2 -pipe" DISTDIR="/gentoo/distfiles/" ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live ccache config-protect-if-modified distcc distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ https://ftp.fau.de/gentoo http://distfiles.gentoo.org" 
LANG="de_DE.UTF-8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" LEX="flex" LINGUAS="de en" MAKEFLAGS="-j16" MAKEOPTS="-j26 -l16" PKGDIR="/gentoo/packages/x64" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" SHELL="/bin/bash" USE="R X a52 aac aacs acl acpi afs alsa amd64 argyllcms audit bacula-clientonly bdplus blas bluetooth bluray btrfs bzip2 cairo caps cddb cdparanoia cdr cli clutter color-management colord cpudetection crypt cscope cups dbus dga djvu dri dri3 dv dvd dvdr dvdread dvi eds enblend encode eselect-ldso evo exif ext4 extrafilters faac ffmpeg fftw flac fluidsynth fontconfig fortran fuse g3dvl gdbm gdk-pixbuf gegl gif gimp gles2 gmp gnome gnome-keyring gnome-shell graphics graphviz gs gstreamer gtk gtk3 hddtemp hdri heif iconv icu id3tag imagemagick introspection ipv6 ipython jack java jbig jingle jpeg jpeg2k kerberos kpathsea ladspa lame lapack latex lcms ldap lensfun libdrm libglvnd libnotify libtirpc lm_sensors lv2 lyx lzma lzo mad matplotlib matroska md5sum midi mjpeg mng mp2 mp3 mpeg mtp multilib musepack nautilus ncurses nfs nftables nls nptl numpy office ofx ogg openal opencl opencv opengl openh264 openmp opus otr pam pcre pda pdf pep8 pipewire playlist plotutils plugins png pnm policykit postgres postscript pulseaudio pylint python qt3support quicktime radio rar raw readline real rtc rtmp rubberband sbsms scanner science scipy seccomp sendto sift sndfile sound soundtouch speex spell sqlite srt ssh ssl svg systemd t1lib taglib test-rust theora threads tiff tivo tracker truetype twolame udev udisks umfpack unicode upnp user-session v4l vaapi vamp vdpau vim-syntax vorbis vpx vulkan wayland webengine webp win32codecs wmf x264 x265 xattr xcomposite xetex xinerama xml xmp xmpp xps xv xvid xvmc zeromq zlib zoran zstd" 
ABI_X86="64" ADA_TARGET="gnat_2021" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" CAMERAS="canon ptp2 samsung" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx avx2 f16c fma3 pclmul popcnt rdrand sha sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" L10N="de en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer nlpsolver pdfimport" LUA_SINGLE_TARGET="luajit" LUA_TARGETS="luajit" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" QEMU_SOFTMMU_TARGETS="x86_64 i386 arm aarch64" QEMU_USER_TARGETS="x86_64 i386 arm aarch64" RUBY_TARGETS="ruby31" SANE_BACKENDS="hp5590 mustek" USERLAND="GNU" VIDEO_CARDS="radeon radeonsi amdgpu" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, 
INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS emerge -pvO net-misc/openssh: These are the packages that would be merged, in order: [ebuild R ] net-misc/openssh-9.3_p1::gentoo USE="X audit hpn kerberos pam pie ssl -X509 (-debug) -ldns -libedit -livecd -sctp -security-key (-selinux) -static -test -verify-sig -xmss" 0 KiB Total: 1 package (1 reinstall), Size of downloads: 0 KiB
Right after I finished the hardened migration update, I couldn't connect to my host machine from my Windows guest through PuTTY, although I'm able to connect to my host through Termux on my phone.

While checking syslog I came across this error message:
kernel: traps: sshd[3605] general protection fault ip:556bd9464eb0 sp:7ffe69d84b20 error:0 in sshd[556bd93fe000+b0000]

@Sam, removing the HPN USE flag solved the issue.
And here is the trace, attaching to the [priv] thread of the three threads started during keyboard-interactive authentication when ssh asks for the password. Note that this is just the trick to be able to connect gdb to the process - the error also occurs with key auth. Since three threads are started, "set follow-fork-mode child" does not help when used directly on the sshd server process...

Note that I initially thought the problem would relate to Intel or even the broadwell chipset, but here I reproduce it on a Ryzen 7 system.

(gdb) set follow-fork-mode child
(gdb) symbol-file /usr/lib/debug/usr/sbin/sshd.debug
Reading symbols from /usr/lib/debug/usr/sbin/sshd.debug...
(gdb) continue
Continuing.
[Attaching after Thread 0x7fdb7b642380 (LWP 536695) fork to child process 536720]
[New inferior 2 (process 536720)]
[Detaching after fork from parent process 536695]
[Inferior 1 (process 536695) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
[New Thread 0x7fdb78bfb6c0 (LWP 536721)]
[New Thread 0x7fdb783fa6c0 (LWP 536722)]
[New Thread 0x7fdb77bf96c0 (LWP 536723)]
[New Thread 0x7fdb773f86c0 (LWP 536724)]
[New Thread 0x7fdb76bf76c0 (LWP 536725)]
[New Thread 0x7fdb763f66c0 (LWP 536726)]
[New Thread 0x7fdb75bf56c0 (LWP 536727)]
[New Thread 0x7fdb753f46c0 (LWP 536728)]
[New Thread 0x7fdb747ef6c0 (LWP 536729)]
[New Thread 0x7fdb73fee6c0 (LWP 536730)]
[New Thread 0x7fdb737ed6c0 (LWP 536731)]
[New Thread 0x7fdb72fec6c0 (LWP 536732)]
[New Thread 0x7fdb727eb6c0 (LWP 536733)]
[New Thread 0x7fdb71fea6c0 (LWP 536734)]
[New Thread 0x7fdb717e96c0 (LWP 536735)]
[New Thread 0x7fdb70fe86c0 (LWP 536736)]

Thread 2.1 "sshd" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fdb7b642380 (LWP 536720)]
0x000055c7b666e040 in ssh_aes_ctr (ctx=<optimized out>, dest=<optimized out>, src=<optimized out>, len=<optimized out>) at cipher-ctr-mt.c:419
(In reply to 1nra4yu0 from comment #7)
> Right after I finished the hardened migration update, I couldn't connect to
> my host machine though my Windows guest through Putty. Altought I'm able to
> connect to my host through Termux on phone.
>
> While checking syslog I came across this error message:
> kernel: traps: sshd[3605] general protection fault ip:556bd9464eb0
> sp:7ffe69d84b20 error:0 in sshd[556bd93fe000+b0000]
>
> @Sam, removing HPN USE flag solved the issue.

Just to note - as I mentioned, I did not touch the hpn use flag but leaving out -O2 in gcc13 also makes it work, and openssh - again with the same use flags - compiled with gcc12 also does not crash, so I'm pretty sure it's a compiler bug.

Now that it can be easily reproduced (ssh -c aes128-ctr), should I file a bug with gcc bugzilla?
Additional info on optimization flags:
[None]: Works
-O: Works
-O1: Works
-Os: Fails
-O2: Fails
-Ofast: Fails
(In reply to Bernd Feige from comment #9)
> Just to note - as I mentioned, I did not touch the hpn use flag but leaving
> out -O2 in gcc13 also makes it work, and openssh - again with the same use
> flags - compiled with gcc12 also does not crash, so I'm pretty sure it's a
> compiler bug.
>

I'm not saying it's not, but what you've said unfortunately doesn't mean it _is_ yet either. It's not considered a compiler bug if there's UB or the compiler just optimises better based on something behaving correctly. It's somewhat common for new compiler versions to break bad programs.

Thank you for the reproducer, I'll try it now.
Reproduced w/ ssh -c aes128-ctr <hostname> (thanks!)

I'm pretty sure it's an alignment issue:
* -UCIPHER_UNALIGNED_OK is enough to stop it (need to #undef it in cipher-ctr-mt.c, flag doesn't work because cipher-ctr-mt.c doesn't check if it's already defined)

* General protection faults (GPFs) tend to arise when giving unaligned operands to vector instructions, like we are here:
0x000055d139eb9ef5 <+357>: vmovdqa xmm0,XMMWORD PTR [rsi]

(See https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64).

* UBSAN rightly complains about it (these are from the client end):

GCC:
```
$ ssh -c aes128-ctr -v 127.0.0.1
[...]
debug1: AES-CTR MT spawned a thread with id 139726398674624 in ssh_aes_ctr_init (2, 15)
debug1: rekey in after 4294967296 blocks
debug1: channel 2: new session [client-session] (inactive timeout: 0)
cipher-ctr-mt.c:419:29: runtime error: load of misaligned address 0x55e14bf018a4 for type '__int128 unsigned', which requires 16 byte alignment
0x55e14bf018a4: note: pointer points here
00 00 00 20 07 5a 00 00 00 07 73 65 73 73 69 6f 6e 00 00 00 02 00 10 00 00 00 00 40 00 9b 98 96
^
mux_client_request_session: read from master failed: Broken pipe
Failed to connect to new control master
[...]
```

Clang:
```
$ ssh -c aes128-ctr -v 127.0.0.1
[...]
debug1: AES-CTR MT spawned a thread with id 140436574230208 in ssh_aes_ctr_init (2, 14)
debug1: AES-CTR MT spawned a thread with id 140436565837504 in ssh_aes_ctr_init (2, 15)
debug1: rekey in after 4294967296 blocks
debug1: channel 2: new session [client-session] (inactive timeout: 0)
cipher-ctr-mt.c:419:20: runtime error: load of misaligned address 0x5573f235d7d4 for type '__uint128_t' (aka 'unsigned __int128'), which requires 16 byte alignment
0x5573f235d7d4: note: pointer points here
00 00 00 20 07 5a 00 00 00 07 73 65 73 73 69 6f 6e 00 00 00 02 00 10 00 00 00 00 40 00 22 61 72
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior cipher-ctr-mt.c:419:20 in
cipher-ctr-mt.c:419:35: runtime error: load of misaligned address 0x7fb9ff610018 for type '__uint128_t' (aka 'unsigned __int128'), which requires 16 byte alignment
0x7fb9ff610018: note: pointer points here
00 00 00 00 b9 01 de 71 31 c4 84 02 35 1d 9d f3 3c 21 f1 61 41 38 b1 25 04 c8 66 ea e7 a4 42 de
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior cipher-ctr-mt.c:419:35 in
cipher-ctr-mt.c:419:4: runtime error: store to misaligned address 0x5573f2357d34 for type '__uint128_t' (aka 'unsigned __int128'), which requires 16 byte alignment
0x5573f2357d34: note: pointer points here
00 00 00 20 3f 3d c2 22 fe 79 a9 81 3e 0a de 75 6e 19 d0 4e 38 77 ad db 88 55 e9 67 9e ae 74 f6
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior cipher-ctr-mt.c:419:4 in
debug1: Sending environment.
[...]
```

* Backtrace:
```
(gdb) bt
#0 0x000055d139eb9ef5 in ssh_aes_ctr (ctx=<optimized out>, dest=<optimized out>, src=<optimized out>, len=<optimized out>) at cipher-ctr-mt.c:419
#1 0x000055d139eb71d4 in cipher_crypt (cc=0x55d13baa11a0, seqnr=<optimized out>, dest=0x55d13ba9a1bc "", src=0x55d13ba9cb40 "", len=len@entry=0x20, aadlen=aadlen@entry=0x4, authlen=0x0) at cipher.c:416
#2 0x000055d139ed0c9d in ssh_packet_send2_wrapped (ssh=ssh@entry=0x55d13ba8bf30) at packet.c:1220
#3 0x000055d139ed2304 in ssh_packet_send2 (ssh=0x55d13ba8bf30) at packet.c:1344
#4 0x000055d139ed72e1 in sshpkt_send (ssh=<optimized out>) at packet.c:2724
#5 0x000055d139f26fe8 in kex_send_newkeys (ssh=ssh@entry=0x55d13ba8bf30) at kex.c:519
#6 0x000055d139f319ce in input_kex_gen_reply (type=<optimized out>, seq=<optimized out>, ssh=0x55d13ba8bf30) at kexgen.c:222
#7 0x000055d139ee2186 in ssh_dispatch_run (ssh=ssh@entry=0x55d13ba8bf30, mode=mode@entry=0x1, done=done@entry=0x55d13a080700 <quit_pending>) at dispatch.c:112
#8 0x000055d139ee24cd in ssh_dispatch_run_fatal (ssh=ssh@entry=0x55d13ba8bf30, mode=mode@entry=0x1, done=done@entry=0x55d13a080700 <quit_pending>) at dispatch.c:132
#9 0x000055d139e3f6c4 in client_process_buffered_input_packets (ssh=0x55d13ba8bf30) at clientloop.c:1225
#10 client_loop (ssh=ssh@entry=0x55d13ba8bf30, have_pty=0x1, escape_char_arg=<optimized out>, ssh2_chan_id=ssh2_chan_id@entry=0x0) at clientloop.c:1372
#11 0x000055d139e0ea84 in ssh_session2 (cinfo=<optimized out>, ssh=<optimized out>) at ssh.c:2317
#12 main (ac=<optimized out>, av=<optimized out>) at ssh.c:1719
```

* Bad line:
```
(gdb) frame 0
#0 0x000055d139eb9ef5 in ssh_aes_ctr (ctx=<optimized out>, dest=<optimized out>, src=<optimized out>, len=<optimized out>) at cipher-ctr-mt.c:419
419 destp.u128[0] = srcp.u128[0] ^ bufp.u128[0];
```

* godbolt: https://godbolt.org/z/bG7ax17er.
Interestingly, with gcc 13, vmovdqa is emitted even with -fno-tree-vectorize (and -fno-tree-loop-vectorize and -fno-tree-slp-vectorize just to be sure, see https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-ftree-vectorize; don't think it's needed but wanted to try).
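For illustration only: line 419 loads and stores 16-byte values through pointers that carry no alignment guarantee, which is what UBSAN reports above. One common way to write such an XOR without relying on alignment is to go through memcpy. The following is a minimal sketch in standard C, not the fix applied in this bug; the function names xor_block16 and xor_blocks are hypothetical and do not come from the HPN patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the HPN patch): XOR one 16-byte block of
 * src with one 16-byte block of keystream and write the result to dest.
 * None of the three pointers needs any particular alignment. */
void xor_block16(uint8_t *dest, const uint8_t *src, const uint8_t *keystream)
{
    uint64_t s[2], k[2];

    /* memcpy into properly aligned locals instead of casting the byte
     * pointers to a wider type, so no misaligned access is expressed. */
    memcpy(s, src, sizeof s);
    memcpy(k, keystream, sizeof k);
    s[0] ^= k[0];
    s[1] ^= k[1];
    memcpy(dest, s, sizeof s);
}

/* XOR len bytes block by block; len must be a multiple of 16. */
void xor_blocks(uint8_t *dest, const uint8_t *src,
                const uint8_t *keystream, size_t len)
{
    assert(len % 16 == 0);
    for (size_t off = 0; off < len; off += 16)
        xor_block16(dest + off, src + off, keystream + off);
}
```

With fixed-size copies like these, optimizing compilers typically emit plain unaligned move instructions rather than a library call, so the cost relative to the cast-based version is usually small. That is a general expectation, not something measured for this code.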
I also hit this segfault with my openssh-server. Disabling USE=hpn helped.
commit 90aa476798f68b5196bdaf5223f8074cdac5cf6d
Author: David Seifert <soap@gentoo.org>
Date: Mon May 8 22:56:02 2023 +0200

    net-misc/openssh: remove USE={hpn,sctp,X509} functionality

    net-misc/openssh tracks vanilla upstream and aims to keep patching to an
    absolute minimum. The previous third-party patches for HPN/SCTP/X509 that
    upstream would never integrate are relegated to the new
    net-misc/openssh-contrib package. pkg_pretend includes a fail-safe to
    prevent users relying on the now removed functionality (especially X509)
    from losing access to their systems.

    HPN:
    * https://bugs.gentoo.org/347193#c1 - security updates end up delayed
    * https://bugs.gentoo.org/414401
    * https://bugs.gentoo.org/498514
    * https://bugs.gentoo.org/498632
    * https://bugs.gentoo.org/499552
    * https://bugs.gentoo.org/507210 - historically was enabled by default w/
      poor rationale, only ended up off because of one-of-many bugs in the
      patches (bug #634594), then never got turned back on
    * https://bugs.gentoo.org/634594
    * https://bugs.gentoo.org/719698
    * https://bugs.gentoo.org/830623
    * https://bugs.gentoo.org/905750

    X509:
    * https://bugs.gentoo.org/258795
    * https://bugs.gentoo.org/365655#c1
    * https://bugs.gentoo.org/891665
    * commit f7dcc5d
(In reply to Sam James from comment #12)
> Reproduced w/ ssh -c aes128-ctr <hostname> (thanks!)
>
> I'm pretty sure it's an alignment issue:
> * -UCIPHER_UNALIGNED_OK is enough to stop it (need to #undef it in
> cipher-ctr-mt.c, flag doesn't work because cipher-ctr-mt.c doesn't check if
> it's already defined)
>
> * General protection faults (GPFs) tend to arise when giving unaligned
> operands to vector instructions, like we are here:
> 0x000055d139eb9ef5 <+357>: vmovdqa xmm0,XMMWORD PTR [rsi]
>
> (See https://www.felixcloutier.com/x86/movdqa:vmovdqa32:vmovdqa64).
>
> * UBSAN rightly complains about it (these are from the client end):

> cipher-ctr-mt.c:419:20: runtime error: load of misaligned address
> 0x5573f235d7d4 for type '__uint128_t' (aka 'unsigned __int128'), which
> requires 16 byte alignment

> * Bad line:
> ```
> (gdb) frame 0
> #0 0x000055d139eb9ef5 in ssh_aes_ctr (ctx=<optimized out>, dest=<optimized
> out>, src=<optimized out>, len=<optimized out>) at cipher-ctr-mt.c:419
> 419 destp.u128[0] = srcp.u128[0] ^ bufp.u128[0];

Am I missing bits/consequences or...

Is not the problem with hpn (High performance SSH/SCP) actually a problem with the cipher-ctr-mt.c code?

Hence, "hpn" is then a victim of that whilst multithreading for (mis?)optimized compiled code from gcc13?

Rather than victimizing hpn, should not the multithreading or aes128-ctr be the victim? Until cipher-ctr-mt.c can be patched/fixed??

Full kudos for debugging that one!

Thanks,
Martin
(In reply to Martin from comment #15)
> (In reply to Sam James from comment #12)
> > Reproduced w/ ssh -c aes128-ctr <hostname> (thanks!)
> >
> > I'm pretty sure it's an alignment issue:
> > * -UCIPHER_UNALIGNED_OK is enough to stop it (need to #undef it in
> > cipher-ctr-mt.c, flag doesn't work because cipher-ctr-mt.c doesn't check if
> > it's already defined)

> Rather than victimizing hpn, should not the multithreading or aes128-ctr be
> the victim?

Ok for the very good move in packages:

 * Messages for package net-misc/openssh-9.3_p1-r1:

 * net-misc/openssh does not support USE='hpn' anymore.
 * The Base system team *STRONGLY* recommends you not rely on this functionality,
 * since these USE flags required third-party patches that often trigger bugs
 * and are of questionable provenance.
 *
 * If you must continue relying on this functionality, switch to
 * net-misc/openssh-contrib. You will have to remove net-misc/openssh from your
 * world file first: 'emerge --deselect net-misc/openssh'
 *
 * In order to prevent loss of SSH remote login access, we will abort the build.
 * Whether you proceed with disabling the USE flags or switch to the -contrib
 * variant, when re-emerging you will have to set

> > Until cipher-ctr-mt.c can be patched/fixed??

Thanks,
Martin
(In reply to Martin from comment #15)
>
> Is not the problem with hpn (High performance SSH/SCP) actually a problem
> with the cipher-ctr-mt.c code?

cipher-ctr-mt.c is provided by HPN, it doesn't exist in vanilla openssh upstream. cipher-ctr.c in upstream openssh is ok.

> Hence, "hpn" is then a victim of that whilst multithreading for
> (mis?)optimized compiled code from gcc13?
>
> Rather than victimizing hpn, should not the multithreading or aes128-ctr be
> the victim?

In this case, it's UB in HPN which causes the bad behaviour, and this was kind of the straw which broke the camel's back in terms of finally splitting it into another package. There's been a bunch of bugs like this over the years and we felt it was too risky to have these in the main package, given that the risk is users being locked out of their systems.

> Full kudos for debugging that one!

Thanks!
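As a side note on the UB itself: the addresses UBSAN printed earlier in this bug (0x55e14bf018a4, 0x5573f2357d34, 0x7fb9ff610018) are not multiples of 16, which is exactly what an access of type '__uint128_t' requires. A minimal sketch of that check follows; the helper name is_aligned is hypothetical and is not part of OpenSSH or HPN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if addr is a multiple of align.
 * align must be a nonzero power of two (e.g. 4, 8, 16). */
int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power of two, the low bits of addr are the remainder. */
    return (addr & (align - 1)) == 0;
}
```

Applied to the reported addresses, 0x55e14bf018a4 and 0x5573f2357d34 sit 4 bytes past a 16-byte boundary and 0x7fb9ff610018 sits 8 bytes past one, so none of them satisfies the 16-byte requirement.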
FYI, note also from 2022-01-06: https://bugs.gentoo.org/830623#c4 "DisableMTAES yes I am wondering if we should add this to sshd_config by default for USE=hpn. On modern hardware, the single threaded implementation is often faster with AES-NI. The multithreaded cipher is mostly useful on older hardware without AES-NI support." Thanks, Martin
(In reply to Sam James from comment #17)
> (In reply to Martin from comment #15)
> >
> > Is not the problem with hpn (High performance SSH/SCP) actually a problem
> > with the cipher-ctr-mt.c code?
>
> cipher-ctr-mt.c is provided by HPN, it doesn't exist in vanilla openssh

That's the missing bit! Good move.

Thanks!
Martin
What's the safe procedure to move from net-misc/openssh to net-misc/openssh-contrib on a remote server which is accessible only by ssh? Deselecting openssh doesn't do any good because it's not in the world file:

# emerge --deselect net-misc/openssh
>>> No matching atoms found in "world" favorites file...

# emerge -Cvp openssh
 * This action can remove important packages! In order to be safer, use
 * `emerge -pv --depclean <atom>` to check for reverse dependencies before
 * removing packages.

>>> These are the packages that would be unmerged:

!!! 'net-misc/openssh' (virtual/ssh) is part of your system profile.
!!! Unmerging it may be damaging to your system.

# emerge -av1 openssh-contrib
...
[ebuild N ~] net-misc/openssh-contrib-9.3_p1::gentoo USE="hpn pam pie ssl verify-sig -X -X509 -audit -debug -kerberos -ldns -libedit -livecd -sctp -security-key (-selinux) -static -test -xmss" 0 KiB
[blocks B ] net-misc/openssh ("net-misc/openssh" is soft blocking net-misc/openssh-contrib-9.3_p1)

Total: 1 package (1 new), Size of downloads: 0 KiB
Conflict: 1 block (1 unsatisfied)

 * Error: The above package list contains packages which cannot be
 * installed at the same time on the same system.

(net-misc/openssh-9.3_p1:0/0::gentoo, installed) pulled in by
net-misc/openssh required by (virtual/ssh-0-r1:0/0::gentoo, installed) USE="userland_GNU -minimal"

(net-misc/openssh-contrib-9.3_p1:0/0::gentoo, ebuild scheduled for merge) pulled in by
openssh-contrib
(In reply to Nexion Kind from comment #20)

Your installed virtual/ssh appears to be out of date. Run emerge --sync, and then this:

emerge -av1 virtual/ssh virtual/openssh net-misc/openssh-contrib
(In reply to Mike Gilbert from comment #21) Oops, missed it. Thanks a lot!
Following an IRC discussion where the multithreaded AES-CTR was criticized heavily, and there were threats to last-rite openssh-contrib from the ::gentoo tree over it, net-misc/openssh-contrib-9.3_p2 no longer carries the multithreaded AES-CTR patch. Since this patch is no longer present, I am closing this bug as obsolete.