When sys-libs/libxcrypt-4.4.25 is built using the gold linker on a ppc64le system, sudo can no longer be used (it fails with the message 'illegal instruction'). Furthermore, local logins (root or normal user) aren't possible either. Chrooting into the system and rebuilding libxcrypt using the bfd linker makes the problem go away.
Reproducible: Always
Steps to Reproduce:
1. Build sys-libs/libxcrypt-4.4.25 using the gold linker
2. Try to use sudo
3. Try to log in locally
At least in my case the problem seems to be specific to ppc64le (I don't know about big endian yet). I didn't have the same issue on amd64 and aarch64.
Please can you run it under gdb or strace to find which instruction gets it killed? Thanks.
(In reply to Sam James from comment #1)
> Please can you run it under gdb or strace to find which instruction gets it
> killed? Thanks.
The version of sudo and emerge --info would be very helpful too. The libxcrypt build log would also help, if you can grab it.
Created attachment 748284 [details] output of emerge --info
(In reply to zin0 from comment #3)
> Created attachment 748284 [details]
> output of emerge --info
Could you share the sudo version, possibly the libxcrypt build log (not critical but it'd be useful to have), and also tell us if it happens with lld (from LLVM)?
Thanks for responding so quickly, Sam. The sudo version is 1.9.8_p2. As for the rest of the requested info, I need some additional time. (I'll try to reproduce the problem in a KVM guest to minimize potential damage to my host.)
(In reply to zin0 from comment #5)
> Thanks for responding so quickly, Sam. The sudo version is 1.9.8_p2.
> As for the rest of the requested info, in need some additional time.
> (I'll try to reproduce the problem in a kvm guest to minimize potential
> damage to my host.)
That's no problem at all, thank you for reporting it. I'm working on upgrading my ppc64le chroot to test this too.
Created attachment 748317 [details] strace output for sudo
Created attachment 748320 [details] strace output for su
Created attachment 748323 [details] build log for libxcrypt
Created attachment 748326 [details] kernel error messages
It seems that big endian ppc64 isn't affected by this. I just did the libxcrypt migration and sudo / su still work as intended. However, the problem might not be exclusively 64bit related. I have an additional system with the 'default/linux/powerpc/ppc64/17.0/32bit-userland' profile and the problem appears there, too.
(In reply to zin0 from comment #11)
> It seems that big endian ppc64 isn't affected by this.
> I just did the libxcrypt migration and sudo / su still work as intended.
>
> However, the problem might not be exclusively 64bit related. I have an
> additional system with the 'default/linux/powerpc/ppc64/17.0/32bit-userland'
> profile and the problem appears there, too.
That's.. weird. I've tried a ppc64le system and I can't reproduce it there.
Can't repro on my power9 LE system.
The straces don't show it either, I think: not a single SIGILL.
I've noticed you've built libxcrypt with -mcpu=power8, is it intentional? The rest of the system seems to be configured differently, with -mcpu=native -mtune=native; it's generally not a wise thing to do.
Also, as a test point, can you maybe install `sys-kernel/gentoo-kernel` WITHOUT the savedconfig USE flag (it provides a working config and will likely work on your machine, tested on Blackbird and Talos) and give it a go with a known-good kernel configuration?
Also, a TON of ppc64 bugfixes landed in gold recently: https://sourceware.org/git/?p=binutils-gdb.git;a=history;f=gold;hb=HEAD
Maybe look at testing some.
And last: my humble advice is not to use gold, it's under-maintained. Some brief info and links can be found here: https://en.wikipedia.org/wiki/Gold_(linker)
(In reply to Georgy Yakovlev from comment #13)
> can't repro on my power9 LE system
>
> straces lack those things, I think, not a single SIGILL
>
>
> I've noticed you've built libxcrypt with -mpcu=power8, is it intentional?
> rest of the system seems to be configured differently with -mcpu=native
> -mtune=native, it's generally not a wise thing to do.
>
> also, as a test point, can you maybe install `sys-kernel/gentoo-kernel`
> WITHOUT savedconfig useflag (it provides a working config and will likely
> work on your machine, tested on blackbird and talos) and give it a go with a
> known-good kernel configuration?
>
>
> also there's a TON of ppc64 bugfixes in gold landed recently:
> https://sourceware.org/git/?p=binutils-gdb.git;a=history;f=gold;hb=HEAD
>
> maybe look at testing some.
>
> and last: my humble advice is not to use gold, it's under-maintained.
> some brief info and links can be found here:
> https://en.wikipedia.org/wiki/Gold_(linker)
Not using the gold linker - at least system wide - is definitely sound advice. (I have been burnt by this linker before, so I guess it's time to reconsider USE='default-gold'.)
As for -mcpu=power8 vs -mcpu=native, those stem from two different systems. I first encountered the problem on my bare metal installation on an RCS Blackbird (-mcpu=native), filed the bug report and then did further testing / debugging on a KVM ppc64le guest (-mcpu=power8) in order to not mess with my bare metal installation too much. The issue, however, remained the same. Sorry for not mentioning this explicitly.
I will do some further testing regarding this issue out of sheer curiosity, but since neither you, Georgy, nor Sam can reproduce the bug, I don't think there's any need for you guys to spend any more time on this. Thank you both for your help.
After some more testing / experimenting, it seems that the bug described above can indeed be narrowed down to sys-libs/libxcrypt being built using the gold linker. It doesn't seem to matter whether gold or bfd was used to build the rest of @world. However, I'm unable to determine how exactly the gold linker messes up libxcrypt, or more specifically /lib64/libcrypt.so.2.0.0.
Interestingly enough, the same problem also occurs when using app-admin/doas instead of app-admin/sudo, so it doesn't seem to be a sudo-specific issue.
On the positive side, building sys-libs/libxcrypt using sys-devel/clang with sys-devel/lld as the linker works without problems (it compiles and works as expected).
In conclusion, the fix for this bug is not to use the gold linker. Furthermore, the less active development of the gold linker is referenced in the Gentoo Wiki. (See https://wiki.gentoo.org/wiki/Gold)
P.S.: As suggested, the testing was done using sys-kernel/gentoo-kernel without the savedconfig USE flag.
(In reply to zin0 from comment #15)
Thanks for investigating this further. Given how gold (as you note) is heading to the grave, I'm inclined to just force it off pending more information, given I really don't think people should use it for critical utilities anyway. It'd be interesting to compare a broken binary and a good one if you could upload them though.
The bug has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f9db221ddfadd7518e96fb63d93d1edabc5b97eb
commit f9db221ddfadd7518e96fb63d93d1edabc5b97eb
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-11-19 17:39:30 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-11-19 17:41:56 +0000
    sys-libs/libxcrypt: disable gold
    Disable gold as a precaution given report of odd behaviour (possible illegal instructions being emitted) with gold. It's not quite clear how this could happen but gold is already on its way out and I'd rather play it safe here pending more information.
    Bug: https://bugs.gentoo.org/821496
    Signed-off-by: Sam James <sam@gentoo.org>
 .../{libxcrypt-4.4.25.ebuild => libxcrypt-4.4.25-r1.ebuild} | 6 +++++-
 .../{libxcrypt-4.4.26.ebuild => libxcrypt-4.4.26-r1.ebuild} | 6 +++++-
 2 files changed, 10 insertions(+), 2 deletions(-)
Created attachment 754354 [details] libcrypt.so.2.0.0 bad version (gold linker)
Created attachment 754358 [details] libcrypt.so.2.0.0 good version (bfd linker)
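When comparing the two attached libraries, a quick sanity check is to confirm which one actually came out of gold. A minimal sketch, assuming gold's usual behaviour of adding a `.note.gnu.gold-version` section to its output (so the section name shows up in the file's string table); the helper name `linked_with_gold` is made up for illustration, and this is a heuristic byte search, not a real ELF parser:

```python
import sys
from pathlib import Path

# Section name that gold normally adds to the files it links (assumption:
# the note was not stripped or disabled when the library was built).
GOLD_NOTE = b".note.gnu.gold-version"


def linked_with_gold(path):
    """Heuristic: return True if the ELF file mentions gold's note section."""
    data = Path(path).read_bytes()
    if data[:4] != b"\x7fELF":
        raise ValueError(f"{path} does not look like an ELF file")
    return GOLD_NOTE in data


if __name__ == "__main__":
    # Usage: python check_gold.py libcrypt.so.2.0.0 [more files...]
    for name in sys.argv[1:]:
        verdict = "gold note present" if linked_with_gold(name) else "no gold note found"
        print(f"{name}: {verdict}")
```

Run against the "bad" and "good" attachments, this should flag only the gold-linked one. A missing note does not prove bfd or lld was used; it only means the marker is absent.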
A quick note on reproducing this bug: the type of password hash used for a user account in '/etc/shadow' seems to matter.
sha512 password hash -> sudo generates illegal instruction
md5 password hash -> sudo works as expected
As before, '/usr/bin/doas' and '/bin/login' show the same behaviour.
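Since the hash type seems to matter, the crash can probably be triggered without going through sudo, doas or login at all, by calling crypt() directly. A minimal sketch, assuming that the system's libcrypt is the gold-linked libxcrypt and that `ctypes.util.find_library("crypt")` resolves to it; the password, the salt `saltsalt` and the helper name `try_hash` are made up for illustration. Each call runs in a child process so a SIGILL only kills the child:

```python
import signal
import subprocess
import sys

# Child script: load libcrypt through ctypes and hash a throwaway password.
CHILD = r"""
import ctypes, ctypes.util, sys
path = ctypes.util.find_library("crypt")
if path is None:
    sys.exit(3)  # no libcrypt found on this system
lib = ctypes.CDLL(path)
lib.crypt.restype = ctypes.c_char_p
lib.crypt.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
out = lib.crypt(b"test-password", sys.argv[1].encode())
print(out.decode() if out else "")
"""


def try_hash(setting):
    """Call crypt() with the given setting string in a child process.

    Returns (status, output). status is "ok", "no-libcrypt", "exit N",
    or the name of the signal that killed the child (e.g. "SIGILL").
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, setting],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name, ""
    if proc.returncode == 3:
        return "no-libcrypt", ""
    if proc.returncode != 0:
        return f"exit {proc.returncode}", proc.stderr.strip()
    return "ok", proc.stdout.strip()


if __name__ == "__main__":
    # "$6$" selects sha512crypt, "$1$" selects md5crypt.
    for label, setting in [("sha512", "$6$saltsalt$"), ("md5", "$1$saltsalt$")]:
        status, out = try_hash(setting)
        print(f"{label:7s} {status:12s} {out}")
```

If the report above holds, an affected system would be expected to show a signal name for the sha512 line and a normal hash for md5. This was not run on an affected machine, so treat it as a starting point only.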
I'd love to spend more time looking into this but gold is really on the way out and we've got an effective solution here at least.