Bug 914657 - sys-devel/clang-17.0.1: Hangs when compiling dev-libs/nss-3.93 with -O2 -march=skylake
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: LLVM support project
URL: https://github.com/llvm/llvm-project/...
Whiteboard:
Keywords:
Depends on:
Blocks: 912821
Reported: 2023-09-25 11:40 UTC by David Carlos Manuelda
Modified: 2024-01-07 07:58 UTC (History)
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge_info.txt,21.01 KB, text/plain)
2023-09-25 11:42 UTC, David Carlos Manuelda
build.log (build.log,466.80 KB, text/x-log)
2023-09-25 11:42 UTC, David Carlos Manuelda
sha512.c preprocessed (sha512-preprocessed.c,262.23 KB, text/x-csrc)
2023-09-25 13:52 UTC, David Carlos Manuelda

Description David Carlos Manuelda 2023-09-25 11:40:43 UTC
During a regular system update, Clang was updated to version 17.0.1. Once it was updated, I ran an emerge -e world as usual.

Package dev-libs/nss never finishes compiling because the build stalls in the following compile unit (I will attach emerge --info and the build.log, captured before Ctrl+C was hit, for the complete logs). It is easier to see where it hangs when testing with MAKEOPTS=-j1:

clang -o Linux6.5_x86_64_clang_glibc_PTH_64_OPT.OBJ/Linux_SINGLE_SHLIB/sha512.o -c -std=c99  -fPIC  -m64 -pipe -ffunction-sections -fdata-sections -DHAVE_STRERROR -DLINUX -Dlinux -Wall -Wshadow -Qunused-arguments -Wno-parentheses-equality -Wno-array-bounds -Wno-unevaluated-expression -DNSS_NO_GCC48 -DXP_UNIX -DXP_UNIX -UDEBUG -DNDEBUG -D_DEFAULT_SOURCE -D_BSD_SOURCE -D_POSIX_SOURCE -DSDB_MEASURE_USE_TEMP_DIR -D_REENTRANT -DSHLIB_SUFFIX=\"so\" -DSHLIB_PREFIX=\"lib\" -DSHLIB_VERSION=\"3\" -DSOFTOKEN_SHLIB_VERSION=\"3\" -DRIJNDAEL_INCLUDE_TABLES -UDEBUG -DNDEBUG -D_DEFAULT_SOURCE -D_BSD_SOURCE -D_POSIX_SOURCE -DSDB_MEASURE_USE_TEMP_DIR -D_REENTRANT -DNSS_DISABLE_SSE3 -DNSS_NO_INIT_SUPPORT -DSEED_ONLY_DEV_URANDOM -DUSE_UTIL_DIRECTLY -DNO_NSPR_10_SUPPORT -DSSL_DISABLE_DEPRECATED_CIPHER_SUITE_NAMES -DNSS_USE_64 -DFREEBL_NO_DEPEND -DFREEBL_LOWHASH -DNSS_X86_OR_X64 -DNSS_X64 -DUSE_HW_SHA2 -DNSS_BEVAND_ARCFOUR -DMPI_AMD64 -DMP_ASSEMBLY_MULTIPLY -DNSS_USE_COMBA -DMP_IS_LITTLE_ENDIAN -DUSE_HW_AES -DINTEL_GCM -DHAVE_INT128_SUPPORT -DHACL_CAN_COMPILE_VEC256 -DMP_API_COMPATIBLE -I../../dist/Linux6.5_x86_64_clang_glibc_PTH_64_OPT.OBJ/include -I../../dist/public/nss -I../../dist/private/nss -I../../dist/Linux6.5_x86_64_clang_glibc_PTH_64_OPT.OBJ/include/dbm -Impi -Iecl -Iverified -Iverified/internal -Iverified/karamel/include -Iverified/karamel/krmllib/dist/minimal -Ideprecated -O2 -pipe -march=native -Wno-unused-command-line-argument  -I/usr/include/nspr sha512.c

Beyond this point the build is completely stalled, with clang consuming 100% of one core (which clearly points to an infinite loop somewhere).

I know this should be reported upstream, but I don't think I could write a proper report for it. Could a dev look at it and report it upstream properly?

Thanks.

Reproducible: Always
Comment 1 David Carlos Manuelda 2023-09-25 11:42:36 UTC
Created attachment 871378 [details]
emerge --info
Comment 2 David Carlos Manuelda 2023-09-25 11:42:52 UTC
Created attachment 871379 [details]
build.log
Comment 3 David Carlos Manuelda 2023-09-25 11:46:20 UTC
Also tested without ccache, with the same behaviour.
Comment 4 Ionen Wolkens gentoo-dev 2023-09-25 11:49:14 UTC
I actually just woke up earlier to the nss build being stuck on a single thread in my llvm-musl chroot w/ clang:17, after it had been going like that for about 8 hours, so I figure I hit the same thing. It updated fine after I set CC=clang-16 / CXX=clang++-16, but I did not dig further into it.
Comment 5 Ionen Wolkens gentoo-dev 2023-09-25 11:57:50 UTC
(In reply to Ionen Wolkens from comment #4)
> I actually just woke up earlier to nss build being stuck on a single thread
> in my llvm-musl chroot w/ clang:17 after going like that for like 8 hours,
> figure I hit the same thing. Updated fine after I did CC=clang-16 /
> CXX=clang++-16 but did not dig further into it.
... and seems I can reproduce easily on my regular (non-musl/llvm profile) system as well if I just pick a clang 17 toolchain.
Comment 6 Ionen Wolkens gentoo-dev 2023-09-25 12:49:45 UTC
Seems to build fine if I don't pass -march=native (i.e. it hangs with what is effectively `-march=skylake -O2` for me, while just -O2 is fine).

For reproducing without portage, seems this was sufficient (needs dev-util/gyp):

   nss-3.93/nss$ CFLAGS="-march=skylake -O2" ./build.sh --system-nspr --clang
   [1247/1264] LINK /root/nss-3.93/dist/Debug/bin/ssl_gtest (hang)

Haven't looked further to reduce more.
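
One way to check what -march=native resolves to on a given machine is to ask the clang driver to print its cc1 invocation with -### and read the -target-cpu argument. The following is only a minimal sketch and not something used in this report: the helper names are hypothetical, and it assumes a `clang` binary on PATH and that the driver quotes each argument in its -### output.

```python
import re
import subprocess
from typing import Optional


def parse_target_cpu(driver_output: str) -> Optional[str]:
    """Return the value that follows "-target-cpu" in `clang -###` output, if any."""
    match = re.search(r'"-target-cpu"\s+"([^"]+)"', driver_output)
    return match.group(1) if match else None


def native_target_cpu(clang: str = "clang") -> Optional[str]:
    """Ask the driver which CPU -march=native selects (hypothetical helper)."""
    # -### makes the driver print the commands it would run instead of running them.
    result = subprocess.run(
        [clang, "-###", "-march=native", "-x", "c", "-c", "/dev/null"],
        capture_output=True,
        text=True,
        check=False,
    )
    # The command lines are normally printed on stderr; stdout is scanned as well.
    return parse_target_cpu(result.stderr + result.stdout)
```

Calling `native_target_cpu()` on the machine from this comment would be expected to return `skylake`, which could then be passed explicitly as -march=skylake when reproducing on other hardware.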
Comment 7 David Carlos Manuelda 2023-09-25 13:52:34 UTC
Created attachment 871392 [details]
sha512.c preprocessed

I added the -E option to the generated command line for the affected compile unit to produce the preprocessed output. In theory, this file can be tested on any system with Clang without any of the dependencies (because it is only compiled with -c into an intermediate object).
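
For anyone testing the preprocessed file, a small timing harness may be handy so that a hanging compile is killed automatically. This is a rough sketch under assumptions: the function names and the 600 second limit are illustrative, the flags are a reduced subset of the original command line, and `clang` is assumed to be on PATH.

```python
import subprocess
import time
from typing import List, Optional, Sequence


def compile_command(source: str, extra_flags: Sequence[str],
                    clang: str = "clang") -> List[str]:
    """Build a compile-only command line for one preprocessed source file."""
    # The object file is discarded; only the compile time is of interest.
    return [clang, "-c", "-std=c99", "-fPIC", *extra_flags, source, "-o", "/dev/null"]


def time_command(cmd: Sequence[str], timeout_s: float) -> Optional[float]:
    """Run cmd and return elapsed seconds, or None if it exceeded timeout_s."""
    start = time.monotonic()
    try:
        # subprocess.run kills the child process when the timeout expires.
        subprocess.run(cmd, timeout=timeout_s, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


def compare_flag_sets(source: str, timeout_s: float = 600.0) -> dict:
    """Time the same file with and without -march=skylake (assumed flag sets)."""
    flag_sets = {"O2": ["-O2"], "O2+skylake": ["-O2", "-march=skylake"]}
    return {name: time_command(compile_command(source, flags), timeout_s)
            for name, flags in flag_sets.items()}
```

Running `compare_flag_sets("sha512-preprocessed.c")` returns a time per flag set, with `None` marking a compile that did not finish within the limit.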
Comment 8 David Carlos Manuelda 2023-09-25 14:03:15 UTC
I reported it upstream in the hope that it can be addressed better there.
Comment 9 Toralf Förster gentoo-dev 2023-09-25 14:12:39 UTC
I also killed an emerge process here after 3 hours, on a tinderbox image.
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-09-25 18:17:11 UTC
For bonus points, Cvise with timeout?
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-09-26 01:54:59 UTC
(In reply to Sam James from comment #10)
> For bonus points, Cvise with timeout?

I've hopefully done it well enough: https://github.com/llvm/llvm-project/issues/67333#issuecomment-1734707805.

I don't think it's really an infinite loop. I think it just scales very, very poorly with some code, which is why the smaller reproducers still take an outsized amount of time (but do complete).
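
For reference, a reduction like this needs an interestingness test that treats "too slow" as the property to preserve. The sketch below is not the script used for the upstream reduction. It is a hypothetical example that assumes cvise runs the test without arguments in a directory containing the current candidate file and treats exit status 0 as "interesting". The file name, flags, and 60 second limits are assumptions.

```python
import subprocess
from typing import Sequence

SOURCE = "sha512-preprocessed.c"  # assumed name of the file being reduced


def run_status(cmd: Sequence[str], limit_s: float) -> str:
    """Classify a command as "ok", "failed", or "timeout"."""
    try:
        done = subprocess.run(cmd, timeout=limit_s, check=False,
                              stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if done.returncode == 0 else "failed"


def is_interesting(fast_cmd: Sequence[str], slow_cmd: Sequence[str],
                   fast_limit_s: float, slow_limit_s: float) -> bool:
    """Keep a candidate only if it is still valid code and still slow."""
    # The candidate must still compile promptly without -march,
    # otherwise the reducer may drift towards invalid code.
    if run_status(fast_cmd, fast_limit_s) != "ok":
        return False
    # With -march=skylake the compile must exceed the time limit.
    return run_status(slow_cmd, slow_limit_s) == "timeout"


def main() -> int:
    """Return 0 (interesting) or 1 (not interesting)."""
    common = ["clang", "-c", "-std=c99", "-fPIC", "-O2"]
    tail = [SOURCE, "-o", "/dev/null"]
    fast = common + tail
    slow = common + ["-march=skylake"] + tail
    return 0 if is_interesting(fast, slow, 60.0, 60.0) else 1
```

To use it as a script, it would end with `raise SystemExit(main())`. Because the slow case only has to exceed a threshold rather than run forever, this also fits the observation that the reduced reproducers eventually complete.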
Comment 12 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-10-04 18:24:31 UTC
Fixed by:

commit 289280527735926a78b97688e3548cf2ca2a87fe
Author: Michał Górny <mgorny@gentoo.org>
Date:   Wed Oct 4 18:16:00 2023 +0200

    sys-devel/llvm: Backport hang fix on skylake

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

I'll keep it open for a little bit for visibility.
Comment 13 asdfg 2023-10-07 14:29:26 UTC
-march=znver2 is also affected.

I have also been affected by this on my Ryzen 3700X when compiling nss-3.94 today.
Comment 14 David Carlos Manuelda 2023-10-10 16:16:56 UTC
Issue closed upstream by merging backported patch so now there are two options:
a) Apply the backported patch (tested ok to current 17.0.2) and remove mask
b) Just wait 1 week for next release
Comment 15 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-10-10 16:21:06 UTC
(In reply to David Carlos Manuelda from comment #14)
> Issue closed upstream by merging backported patch so now there are two
> options:
> a) Apply the backported patch (tested ok to current 17.0.2) and remove mask

We already did: https://bugs.gentoo.org/914657#c12.
Comment 16 David Carlos Manuelda 2023-10-10 16:22:15 UTC
(In reply to Sam James from comment #15)
> (In reply to David Carlos Manuelda from comment #14)
> > Issue closed upstream by merging backported patch so now there are two
> > options:
> > a) Apply the backported patch (tested ok to current 17.0.2) and remove mask
> 
> We already did: https://bugs.gentoo.org/914657#c12.

Just realized it, too many things on my mind, sorry ;)