Bug 872548 - net-misc/openssh: clang miscompilation? (after system update SSH no longer works (with <sys-devel/gcc-config-2.6))
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: systemwide-clang 915000
Reported: 2022-09-23 17:30 UTC by Horea Christian
Modified: 2023-12-20 07:14 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (file_872548.txt,6.76 KB, text/plain)
2022-09-23 17:31 UTC, Horea Christian

Description Horea Christian 2022-09-23 17:30:08 UTC
The error I get is:

ssh_dispatch_run_fatal: Connection to XX.XX.XX.XX port XX: incorrect signature

The openssh version on the machine giving this error is:
```
[I] net-misc/openssh
     Available versions:  8.9_p1-r2^t (~)9.0_p1-r1^t 9.0_p1-r2^t (~)9.0_p1-r4^t {X X509 audit debug hpn kerberos ldns libedit livecd pam +pie +scp sctp security-key selinux +ssl static test verify-sig xmss ABI_MIPS="n32"}
     Installed versions:  9.0_p1-r4^t(23:20:13 20/09/22)(X pam pie ssl -X509 -audit -debug -hpn -kerberos -ldns -libedit -livecd -sctp -security-key -selinux -static -test -verify-sig -xmss ABI_MIPS="-n32")
     Homepage:            https://www.openssh.com/
     Description:         Port of OpenBSD's free SSH release
```

but this is the same as on other machines on which I do not get the error.

Glibc however is different, with the system which gives me the error having:

```
[I] sys-libs/glibc
     Available versions:  (2.2) [M](~)2.19-r2^s [M]2.30-r9^t [M]2.31-r7^t [M]2.32-r8^t 2.33-r14^t 2.34-r14^t 2.35-r8^t (~)2.35-r10^t ~*2.36-r2^t (~)2.36-r3^t **9999*l^t
       {audit caps cet +clone3 compile-locales +crypt custom-cflags debug doc experimental-loong gd hash-sysv-compat headers-only +multiarch multilib multilib-bootstrap nscd profile selinux +ssp stack-realign +static-libs static-pie suid systemd systemtap test vanilla}
     Installed versions:  2.36-r3(2.2)^t(12:47:29 23/09/22)(multiarch multilib ssp stack-realign static-libs systemd -audit -caps -cet -compile-locales -crypt -custom-cflags -doc -gd -hash-sysv-compat -headers-only -multilib-bootstrap -nscd -profile -selinux -suid -systemtap -test -vanilla)
     Homepage:            https://www.gnu.org/software/libc/
     Description:         GNU libc C library
```

The systems that do not give me any error have:

```
[U] sys-libs/glibc
     Available versions:  (2.2) [M](~)2.19-r2^s [M]2.30-r9^t [M]2.31-r7^t [M]2.32-r8^t 2.33-r14^t 2.34-r14^t 2.35-r8^t (~)2.35-r10^t (~)2.36-r3^t **9999*l^t
       {audit caps cet +clone3 compile-locales +crypt custom-cflags debug doc experimental-loong gd hash-sysv-compat headers-only +multiarch multilib multilib-bootstrap nscd profile selinux +ssp stack-realign +static-libs static-pie suid systemd systemtap test vanilla}
     Installed versions:  2.35-r8(2.2)^t(10:07:21 12/07/22)(clone3 multiarch multilib ssp stack-realign static-libs -audit -caps -cet -compile-locales -crypt -custom-cflags -doc -experimental-loong -gd -headers-only -multilib-bootstrap -nscd -profile -selinux -suid -systemd -systemtap -test -vanilla)
     Homepage:            https://www.gnu.org/software/libc/
     Description:         GNU libc C library
```



From unaffected machines I can connect to all my servers, whereas from the affected machine I can connect to only some of them. The common denominator for the servers I cannot reach from the affected machine is that their entries in `~/.ssh/known_hosts` use ssh-ed25519 keys, and an ssh-ed25519 entry is created again if I delete the existing one. Servers whose entries in `~/.ssh/known_hosts` use ssh-rsa keys work from the affected machine just as well as from the unaffected ones.
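
To check which host key types a `known_hosts` file holds, something like the following may help. It is only a minimal sketch: the function name `known_hosts_key_types` is made up for this example, and it assumes the usual layout in which the key type is the second field, or the third field when the line starts with a marker such as `@cert-authority`.

```shell
# Count known_hosts entries per host key type.
# Usage: known_hosts_key_types [path]   (defaults to ~/.ssh/known_hosts)
# NOTE: helper name is hypothetical, not part of OpenSSH.
known_hosts_key_types() {
    file="${1:-$HOME/.ssh/known_hosts}"
    awk '
        /^[ \t]*(#|$)/ { next }        # skip comments and blank lines
        $1 ~ /^@/ { print $3; next }   # marker lines: key type is field 3
        { print $2 }                   # plain lines: key type is field 2
    ' "$file" | sort | uniq -c
}
```

Forcing a single algorithm, for example `ssh -o HostKeyAlgorithms=ssh-ed25519 user@host`, should then show whether only that key type fails on the affected machine.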

Reproducible: Always
Comment 1 Horea Christian 2022-09-23 17:31:25 UTC
Created attachment 813895 [details]
emerge --info
Comment 2 Horea Christian 2022-09-23 17:58:08 UTC
`CC=gcc emerge -1 openssh` seems to have solved it :-/
Comment 3 Ionen Wolkens gentoo-dev 2022-09-23 18:02:01 UTC
Had suspected it could be related because of bug #872416 but hadn't tried it.

Sounds like it may be a miscompilation with clang rather than a glibc issue.
Comment 4 Larry the Git Cow gentoo-dev 2022-09-23 23:02:56 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=961f11b11a0022c5a4a4a34cc4065d13a48906ba

commit 961f11b11a0022c5a4a4a34cc4065d13a48906ba
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-09-23 23:01:10 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-09-23 23:01:49 +0000

    net-misc/openssh: block older gcc-config w/o ${CTARGET}-cc
    
    >=sys-devel/gcc-config-2.6 will create ${CTARGET}-cc which avoids
    clang-toolchain-symlinks providing it even on systems with GCC.
    
    See cc6a27ec99c1e08ac51c69ff0ab4c2b8a5578e2e for the details but
    abuse a blocker given it can lead to runtime problems with say,
    OpenSSH.
    
    Bug: https://bugs.gentoo.org/872416
    Bug: https://bugs.gentoo.org/872548
    See: cc6a27ec99c1e08ac51c69ff0ab4c2b8a5578e2e
    Signed-off-by: Sam James <sam@gentoo.org>

 .../openssh/{openssh-9.0_p1-r4.ebuild => openssh-9.0_p1-r5.ebuild}    | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=91b1e01095b7af1b517bc45f94c0c32de9bf9f86

commit 91b1e01095b7af1b517bc45f94c0c32de9bf9f86
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-09-23 22:55:16 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-09-23 23:01:45 +0000

    sys-devel/clang-toolchain-symlinks: block older gcc-config w/o ${CTARGET}-cc
    
    >=sys-devel/gcc-config-2.6 will create ${CTARGET}-cc which avoids
    clang-toolchain-symlinks providing it even on systems with GCC.
    
    See cc6a27ec99c1e08ac51c69ff0ab4c2b8a5578e2e for the details but
    abuse a blocker given it can lead to runtime problems with say,
    OpenSSH.
    
    Bug: https://bugs.gentoo.org/872416
    Bug: https://bugs.gentoo.org/872548
    See: cc6a27ec99c1e08ac51c69ff0ab4c2b8a5578e2e
    Signed-off-by: Sam James <sam@gentoo.org>

 ...olchain-symlinks-14.ebuild => clang-toolchain-symlinks-14-r1.ebuild} | 2 ++
 ...olchain-symlinks-15.ebuild => clang-toolchain-symlinks-15-r1.ebuild} | 2 ++
 ...olchain-symlinks-16.ebuild => clang-toolchain-symlinks-16-r1.ebuild} | 2 ++
 3 files changed, 6 insertions(+)
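
Since the commits above hinge on which compiler `${CTARGET}-cc` resolves to, a quick check along these lines may be useful. This is a sketch under assumptions: the helper name `compiler_family` is invented here, and it simply classifies a compiler by looking for well-known strings in its `--version` output.

```shell
# Print "clang", "gcc", "unknown" or "missing" for the given compiler command.
# NOTE: helper name is hypothetical; detection is a plain string match on
# the --version banner, so treat the result as a hint only.
compiler_family() {
    cc="$1"
    if ! command -v "$cc" >/dev/null 2>&1; then
        echo "missing"
        return 1
    fi
    case "$("$cc" --version 2>/dev/null)" in
        *clang*) echo "clang" ;;
        *"Free Software Foundation"*) echo "gcc" ;;
        *) echo "unknown" ;;
    esac
}

# Example (on a Gentoo system):
#   compiler_family "$(portageq envvar CHOST)-cc"
```

If this prints `clang` on a system that is otherwise set up for GCC, the `-cc` symlink is probably the one provided by clang-toolchain-symlinks.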
Comment 5 Larry the Git Cow gentoo-dev 2022-09-23 23:17:40 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3194d18b9e7a583b3dc764bd1fdceada10417859

commit 3194d18b9e7a583b3dc764bd1fdceada10417859
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-09-23 23:15:07 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-09-23 23:17:07 +0000

    net-misc/openssh: depend on newer gcc-config or clang-toolchain-symlinks
    
    Made the classic mistake I always moan about! Blockers don't affect
    dependency resolution, although they do mean "eventual consistency"
    in that it ensures you do upgrade by the end of the run.
    
    Let's be safe given we're talking miscompilation here (i.e. runtime
    failure) and depend on newer gcc-config (or newer clang-toolchain-symlinks)
    in openssh to make sure the merge order is correct.
    
    Bug: https://bugs.gentoo.org/872416
    Bug: https://bugs.gentoo.org/872548
    See: cc6a27ec99c1e08ac51c69ff0ab4c2b8a5578e2e
    Fixes: 961f11b11a0022c5a4a4a34cc4065d13a48906ba
    Signed-off-by: Sam James <sam@gentoo.org>

 .../{openssh-9.0_p1-r5.ebuild => openssh-9.0_p1-r6.ebuild}  | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
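
The difference between the two approaches can be illustrated roughly as below. This is a hypothetical ebuild fragment and not the committed diff: the choice of `RDEPEND`/`BDEPEND` and the slot suffixes are assumptions, while the package names and revisions are the ones mentioned in the commits above.

```shell
# Hypothetical fragment for illustration only, not the actual ebuild change.

# Blocker: only guarantees the old gcc-config is gone by the END of the
# emerge run; openssh may still be built before gcc-config is upgraded.
RDEPEND="!<sys-devel/gcc-config-2.6"

# Dependency: the resolver has to merge one of these BEFORE building
# openssh, which is what fixes the merge order.
BDEPEND="
	|| (
		>=sys-devel/gcc-config-2.6
		>=sys-devel/clang-toolchain-symlinks-14-r1:14
		>=sys-devel/clang-toolchain-symlinks-15-r1:15
		>=sys-devel/clang-toolchain-symlinks-16-r1:16
	)
"
```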
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-26 00:10:29 UTC
I should add: I don't think this is necessarily done with, as chymera's emerge --info has LLVM 15.0.1, which already had the fix for the finit zero thing.

So it may not be the same as bug 869839.
Comment 7 Larry the Git Cow gentoo-dev 2023-12-20 07:14:50 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0b22d07f89b16ac3400e45077702ac4c4492e5a4

commit 0b22d07f89b16ac3400e45077702ac4c4492e5a4
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-20 07:12:26 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-20 07:14:16 +0000

    net-misc/openssh: disable problematic -fzero-call-used-regs=*
    
     --with-hardening adds the following in addition to flags we
     already set in our toolchain:
     * -ftrapv (which is broken with GCC anyway),
     * -ftrivial-auto-var-init=zero (which is nice, but not the end of
        the world to not have)
     * -fzero-call-used-regs=used (history of miscompilations with
        Clang (bug #872548), ICEs on m68k (bug #920350, gcc PR113086,
        gcc PR104820, gcc PR104817, gcc PR110934)).
    
     Furthermore, OSSH_CHECK_CFLAG_COMPILE does not use AC_CACHE_CHECK,
     so we cannot just disable -fzero-call-used-regs=used.
    
     Therefore, just pass --without-hardening, given it doesn't negate
     our already hardened toolchain defaults, and avoids adding flags
     which are known-broken in both Clang and GCC and haven't been
     proven reliable.
    
    Bug: https://bugs.gentoo.org/872548
    Bug: https://bugs.gentoo.org/920350
    Bug: https://bugs.gentoo.org/920292
    Bug: https://gcc.gnu.org/PR113086
    Bug: https://gcc.gnu.org/PR104820
    Bug: https://gcc.gnu.org/PR104817
    Bug: https://gcc.gnu.org/PR110934
    Signed-off-by: Sam James <sam@gentoo.org>

 net-misc/openssh/openssh-9.6_p1-r1.ebuild | 396 ++++++++++++++++++++++++++++++
 1 file changed, 396 insertions(+)
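
To see which of the flags listed in the commit message a local compiler accepts at all, a probe like the following could be used. It is a sketch, with the helper name `cc_accepts_flag` invented here. Note that a flag being accepted says nothing about whether the generated code is correct, which is the actual problem in this bug.

```shell
# Return 0 if compiler $1 accepts flag $2 when compiling a trivial C program.
# NOTE: helper name is hypothetical. Acceptance of a flag does NOT prove that
# the compiler generates correct code with it.
cc_accepts_flag() {
    cc="$1"
    flag="$2"
    printf 'int main(void) { return 0; }\n' |
        "$cc" "$flag" -Werror -x c -c -o /dev/null - >/dev/null 2>&1
}

# Example:
#   for f in -ftrapv -ftrivial-auto-var-init=zero -fzero-call-used-regs=used; do
#       if cc_accepts_flag clang "$f"; then echo "accepted: $f"; else echo "rejected: $f"; fi
#   done
```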