I wasn't sure whether this counts as a bug, but since it has already bitten me on 3 machines, I decided to file it.

After switching to gcc-4.1.1 and running `emerge -e world` a few days ago, today I ran `emerge world -vDatu --newuse` again and got openssl updated to -r1:

1158507943: Started emerge on: Sep 18, 2006 00:45:43
1158507943: *** emerge --verbose --deep --ask --tree --update --newuse world
1158508120: >>> emerge (1 of 3) dev-libs/openssl-0.9.8c-r1 to /
1158508120: === (1 of 3) Cleaning (dev-libs/openssl-0.9.8c-r1::/usr/portage/dev-libs/openssl/openssl-0.9.8c-r1.ebuild)
1158508121: === (1 of 3) Compiling/Merging (dev-libs/openssl-0.9.8c-r1::/usr/portage/dev-libs/openssl/openssl-0.9.8c-r1.ebuild)
1158508625: >>> AUTOCLEAN: dev-libs/openssl
1158508630: === Unmerging... (dev-libs/openssl-0.9.8c)
1158508632: >>> unmerge success: dev-libs/openssl-0.9.8c
1158508632: === (1 of 3) Post-Build Cleaning (dev-libs/openssl-0.9.8c-r1::/usr/portage/dev-libs/openssl/openssl-0.9.8c-r1.ebuild)
1158508632: ::: completed emerge (1 of 3) dev-libs/openssl-0.9.8c-r1 to /

After that, /bin/ssh stopped working:

old ~ # ssh gentoo
Segmentation fault
old ~ # ldd `which ssh`
linux-gate.so.1 => (0xffffe000)
libresolv.so.2 => /lib/libresolv.so.2 (0xb7fce000)
libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0xb7e8e000)
libutil.so.1 => /lib/libutil.so.1 (0xb7e8a000)
libz.so.1 => /lib/libz.so.1 (0xb7e76000)
libnsl.so.1 => /lib/libnsl.so.1 (0xb7e61000)
libcrypt.so.1 => /lib/libcrypt.so.1 (0xb7e33000)
libc.so.6 => /lib/libc.so.6 (0xb7d16000)
libdl.so.2 => /lib/libdl.so.2 (0xb7d12000)
/lib/ld-linux.so.2 (0xb7fe6000)
old ~ # env-update
>>> Regenerating /etc/ld.so.cache...
old ~ # . /etc/profile
old ~ # ssh gentoo
Segmentation fault
old ~ # strace -f ssh gentoo
[SNIP]
gettimeofday({1158510388, 553008}, NULL) = 0
poll([{fd=4, events=POLLOUT, revents=POLLOUT}], 1, 0) = 1
send(4, "\344\254\1\0\0\1\0\0\0\0\0\0\6gentoo\0\0,\0\1", 24, 0) = 24
poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 5000) = 1
recvfrom(4, "\344\254\205\203\0\1\0\0\0\0\0\0\6gentoo\0\0,\0\1", 65536, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.1.1")}, [16]) = 24
close(4) = 0
stat64(umovestr: Input/output error 0xfa, 0xbfffe548) = -1 EFAULT (Bad address)
stat64(umovestr: Input/output error 0x12, 0xbfffe548) = -1 EFAULT (Bad address)
open(umovestr: Input/output error 0x51, O_RDONLY|O_LARGEFILE) = -1 EFAULT (Bad address)
open(umovestr: Input/output error 0x52, O_RDONLY|O_LARGEFILE) = -1 EFAULT (Bad address)
open(umovestr: Input/output error 0x51, O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE, 0666) = -1 EFAULT (Bad address)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

(On another machine) I tried to reboot, but it didn't fix the problem.

There are similar problems with sshd as well. After it was successfully restarted, a remote login failed:

kalin@ss ~ $ ssh old
socket: Address family not supported by protocol
ssh: connect to host old port 22: Address family not supported by protocol

`emerge openssh` and everything is back to normal.

I have no idea how this should be solved, but I think at least a warning is needed. For people who administer remote boxen, it might turn fatal...

I also got an ssh client segfault on the -r1 upgrade. I had run revdep-rebuild on everything and removed the old 0.9.7 libs before this upgrade, and all was working fine. Then I upgraded to -r1 and got only segfaults. I had to re-emerge openssh to fix it. :/
re-emerge openssh

*** This bug has been marked as a duplicate of 147758 ***
Uh no? I didn't compile the thing w/ USE="sse2", in fact there's no such flag in -r1.
Confirmed here. It's bad that nothing (e.g. revdep-rebuild) hints to the user that a recompile is needed. And if openssh breaks, how about the rest of the packages depending on openssl?
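To get a rough idea of what else might be affected, one could list the executables that are dynamically linked against the library, the same way the `ldd` check on ssh was done in the original report. This is only a sketch: `uses_soname` and `scan_dir` are hypothetical helper names, not part of any Gentoo tool, and linking against libcrypto does not by itself prove that a program is broken.

```shell
#!/bin/sh
# Sketch: report which executables are dynamically linked against a given soname.
# uses_soname and scan_dir are hypothetical helpers invented for this example.

# Succeeds if the ldd-style text on stdin mentions the soname given in $1.
uses_soname() {
    grep -F -q -- "$1"
}

# Print every regular, executable file in directory $2 whose ldd output
# mentions the soname $1. Only run ldd on binaries you trust.
scan_dir() {
    soname=$1
    dir=$2
    for f in "$dir"/*; do
        [ -f "$f" ] && [ -x "$f" ] || continue
        if ldd "$f" 2>/dev/null | uses_soname "$soname"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# scan_dir libcrypto.so.0.9.8 /usr/bin
```

Anything this prints is a candidate for re-emerging if it starts crashing after the openssl update.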
I don't know whether it is related, but I had a similar segfault with ssh *without* upgrading openssl to -r1. I don't know exactly when ssh started failing, but probably the "only" thing which I had upgraded was linux-headers (from 2.6.11 to 2.6.17-r1) and the kernel (from hardened-2.6.16-r11 to hardened-2.6.17-r1). After this openssh always segfaulted - I certainly had openssl-0.9.8c on the system at this time. Reemerging openssh (which I tried immediately) fixed the problem, so I cannot give any more data now. On another system with almost identical configuration, no segfault had occurred, so the thing does not seem to be reproducible.
The reason is apparently ABI changes between 0.9.8c (first version) and 0.9.8c (second version), which were then reverted for 0.9.8c-r1. The sse2 flag for 0.9.8c has, from what I can tell, NOTHING to do with the ABI changes, and as was noted in the original bug, the crashes when upgrading from 0.9.7/0.9.8b/0.9.8c to 0.9.8c (second version) happened whether sse2 was used or not. Removing 0.9.8c from portage and adding a new 0.9.8c-r1 was IMO not the correct fix, as it predictably would break systems whose owners had just corrected the 0.9.8c problem. At the very least, I think a masked 0.9.8c should be added back to portage, so people who have it and have corrected the problem (by recompiling dependencies like ssh) can unmask it instead of being forced to install 0.9.8c-r1 and recompile dependencies again.
Well, well, I think we have a winner! Comment #6 from Arthur Hagen tells us the reason for the crash: ABI changes between -r versions... I have no idea how to confirm that; at least the openssl documentation does not say anything on the question. So, for now the solution will be to `revdep-rebuild world` not only between 0.9.7X and 0.9.8c, but also between 0.9.8c and 0.9.8c-r1... Anybody willing to push that message into openssl-0.9.8c-r1.ebuild (no, no revbump is needed)?
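A warning of the kind asked for here could look roughly like the following in the ebuild. This is only a sketch: the wording is made up for illustration and is not taken from any committed ebuild. `ewarn` is Portage's warning helper; the fallback definition is an assumption added only so the snippet also runs in a plain shell.

```shell
# Sketch of a post-install warning for an ebuild (illustrative wording only).
# In a real ebuild, ewarn is provided by Portage; this fallback exists only
# so the snippet can be run outside of Portage.
command -v ewarn >/dev/null 2>&1 || ewarn() { printf ' * %s\n' "$*"; }

pkg_postinst() {
    ewarn "Programs built against an earlier openssl-0.9.8c may segfault"
    ewarn "after this update (net-misc/openssh is a known example)."
    ewarn "If that happens, re-emerge the affected packages, for example:"
    ewarn "    emerge --oneshot net-misc/openssh"
}
```

Since only the message changes and nothing in the installed files does, this matches the remark that no revbump would be needed.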
Created attachment 97310
0.9.8c -> 0.9.8c-r1 diff

(In reply to comment #6)
> The reason is apparently abi changes between 0.9.8c (first version) and 0.9.8c
> (second version), which was then reverted for 0.9.8c-r1

This is the diff between what worked and what made openssh segfault for me. Well, I can't imagine what kind of ABI change could happen with this diff, tarballs being all the same.
I agree it has nothing to do with the sse2 flag, at least here. The flag only caused "no-sse2" to be passed to configure when the sse2 flag was NOT set, so for systems where sse2 is set (like mine), the result was always the same ("no-sse2" not passed). I think the real culprit is the set of changes in gentoo.config-0.9.8, see http://sources.gentoo.org/viewcvs.py/gentoo-x86/dev-libs/openssl/files/gentoo.config-0.9.8?hideattic=0&rev=1.10&sortby=date&view=log related to bug 146316.
Upgrade to 0.9.8c-r2, and if you still have crashes, re-emerge the offending apps.
*** Bug 148060 has been marked as a duplicate of this bug. ***