Chromium fails to build with clang-18.1.5 using -march=native on non-avx512 host. Example of an error message: ../../third_party/skia/modules/skcms/src/Transform_inl.h:852:9: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI Reproducible: Always Steps to Reproduce: emerge -va1 chromium Actual Results: build failure Expected Results: a successful build Use flags: www-client/chromium X cups custom-cflags libcxx proprietary-codecs system-harfbuzz system-icu system-toolchain vaapi -bindist -debug -ffmpeg-chromium -gtk4 -hangouts -headless -kerberos -l10n_af -l10n_am -l10n_ar -l10n_bg -l10n_bn -l10n_ca -l10n_cs -l10n_da -l10n_de -l10n_el -l10n_en-GB -l10n_es -l10n_es-419 -l10n_et -l10n_fa -l10n_fi -l10n_fil -l10n_fr -l10n_gu -l10n_he -l10n_hi -l10n_hr -l10n_hu -l10n_id -l10n_it -l10n_ja -l10n_kn -l10n_ko -l10n_lt -l10n_lv -l10n_ml -l10n_mr -l10n_ms -l10n_nb -l10n_nl -l10n_pl -l10n_pt-BR -l10n_pt-PT -l10n_ro -l10n_ru -l10n_sk -l10n_sl -l10n_sr -l10n_sv -l10n_sw -l10n_ta -l10n_te -l10n_th -l10n_tr -l10n_uk -l10n_ur -l10n_vi -l10n_zh-CN -l10n_zh-TW -lto -official -pax-kernel -pgo -pulseaudio -qt5 -qt6 -screencast -selinux -system-png -system-zstd -wayland -widevine Portage 3.0.64 (python 3.11.9-final-0, default/linux/amd64/17.1/no-multilib/hardened, gcc-13, glibc-2.39-r5, 6.8.7-gentoo x86_64) ================================================================= sh bash 5.2_p26-r3 ld GNU ld (Gentoo 2.42 p3) 2.42.0 app-misc/pax-utils: 1.3.7::gentoo app-shells/bash: 5.2_p26-r3::gentoo dev-build/autoconf: 2.13-r8::gentoo, 2.72-r1::gentoo dev-build/automake: 1.16.5-r2::gentoo dev-build/cmake: 3.29.3::gentoo dev-build/libtool: 2.4.7-r4::gentoo dev-build/make: 4.4.1-r1::gentoo dev-build/meson: 1.4.0-r1::gentoo dev-lang/perl: 5.38.2-r2::gentoo dev-lang/python: 3.11.9::gentoo dev-lang/rust: 1.77.1::gentoo sys-apps/baselayout: 2.15::gentoo sys-apps/openrc: 0.54::gentoo sys-apps/sandbox: 2.38::gentoo sys-devel/binutils: 2.42-r1::gentoo 
sys-devel/binutils-config: 5.5::gentoo sys-devel/clang: 17.0.6::gentoo, 18.1.5::gentoo sys-devel/gcc: 13.2.1_p20240210::gentoo sys-devel/gcc-config: 2.11::gentoo sys-devel/lld: 17.0.6::gentoo, 18.1.5::gentoo sys-devel/llvm: 17.0.6::gentoo, 18.1.5::gentoo sys-kernel/linux-headers: 6.8-r1::gentoo (virtual/os-headers) sys-libs/glibc: 2.39-r5::gentoo ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="@FREE" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c" CXXFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return" DISTDIR="/var/cache/distfiles" EMERGE_DEFAULT_OPTS="--nospinner --backtrack=4000 --verbose-conflicts --tree --unordered-display --changed-deps-report --with-bdeps=y" ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME" FCFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live collision-protect config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict suidctl unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe -march=native 
-fstack-clash-protection -fstack-protector-strong -fcf-protection=return" LANG="C.UTF8" LDFLAGS="-Wl,-z,now -Wl,-z,relro -Wl,-O1 -Wl,--as-needed" LEX="flex" LINGUAS="" MAKEOPTS="-j8" PKGDIR="/var/cache/binpkgs" PORTAGE_CONFIGROOT="/" PORTAGE_TMPDIR="/var/tmp" RUSTFLAGS="-C target-cpu=native" SHELL="/bin/bash" USE="amd64 pie split-usr ssp test-rust" ABI_X86="64" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" CURL_SSL="openssl" ELIBC="glibc" INPUT_DEVICES="libinput" KERNEL="linux" LLVM_TARGETS="AMDGPU X86" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11" Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, SIZE, STRINGS, STRIP, YACC, YFLAGS
Created attachment 892603 [details] build.log
Seems to be the same underlying issue as in https://bugs.gentoo.org/931267. The problem is not limited to the bundled skia; it affects at least xnnpack as well. I'll know more after the current build completes.
Testing a (better) workaround for qtwebengine atm, will link the commit when my build finishes if there are no issues -- just want to be sure it doesn't SIGILL (it shouldn't in theory). It will likely work just as well for chromium.
wrt -march=native, to clarify, it is *really* only with =native i.e. if you have a skylake and you do -march=skylake, it'll build fine, but -march=native will fail.
(In reply to Ionen Wolkens from comment #4)
> wrt -march=native, to clarify, it is *really* only with =native
>
> i.e. if you have a skylake and you do -march=skylake, it'll build fine, but
> -march=native will fail.

$ clang -march=native -mavx512f -E - <<<"__EVEX512__" | tail -n 1
__EVEX512__
$ clang -march=skylake -mavx512f -E - <<<"__EVEX512__" | tail -n 1
1

Problem is not limited to when -mavx512f is passed though (inlines and stuff).
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=754d6f5226a532ed086afa276b48e89ffafe0484 commit 754d6f5226a532ed086afa276b48e89ffafe0484 Author: Ionen Wolkens <ionen@gentoo.org> AuthorDate: 2024-05-09 08:19:17 +0000 Commit: Ionen Wolkens <ionen@gentoo.org> CommitDate: 2024-05-09 12:12:58 +0000 dev-qt/qtwebengine: improve clang-18 workaround w/ -mevex512 (qt6) Hoping it will be a short-lived and that this will be improved/fixed in clang itself. (have not tried nor looked at qtwebengine:5) For some rough explanation from the little I get from this: clang-18 added -mevex512 (missing from 17), and then -march=native is a bit quirky in that unlike -march=exact it goes out of its way to disable it resulting in e.g. -march=skylake -mavx512f = -mevex512 is auto-enabled -march=skylake -mevex512 = not "enabled" but can be used -march=native(skylake) -mavx512f = forced off(!) And then units that use avx512 / pass -mavx512f (for use with runtime cpu detection) end in build failure without evex512. Always passing -mevex512 on a machine without avx512 "seems" safe, it does not even set __EVEX512__ and believe won't use any avx512 instructions on a whim (__EVEX512__ does get set if add -mavx512f). Or at least my skylake (not skylake-x) passes test + can use the qtwebengine built that way. Considered passing only for files that need it at first with a patch (sounded safer), but chromium's Gn files don't have a variable to test clang version that I could see (or at least not in old qtwebengine) and didn't want this to become more involved nor use conditional patching. The !avx512 check may not be super necessary, but have not dug into the implications of forcing it when avx512 is actually enabled (sounds there are cases where it needs to be off, leaving it to compiler). 
Bug: https://bugs.gentoo.org/931623 Signed-off-by: Ionen Wolkens <ionen@gentoo.org> dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild | 18 +++++++----------- dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 18 +++++++----------- dev-qt/qtwebengine/qtwebengine-6.9999.ebuild | 18 +++++++----------- 3 files changed, 21 insertions(+), 33 deletions(-)
Reading through [1] it seems that the whole point of -m[no-]evex512 is to disambiguate -mavx512[...] flags, which might mean different things in the future: the current avx512 with 512-bit registers, and avx10-256, i.e. the avx512 set of instructions but limited to 256-bit registers:

> Based on the feedbacks from LLVM and GCC community, we have agreed to
> start from supporting -m[no-]evex512 on existing AVX512 features.
> The option -mno-evex512 can be used with -mavx512xxx to build
> binaries that can run on both legacy AVX512 targets and AVX10-256.

It can be tested with the following code sample:

float foo(const float * p1, const float * p2) {
    float r = 0.0f;
    for (int i = 0; i < 16; ++i) {
        r += p1[i] * p2[i];
    }
    return r;
}

and the following cmdlines:

clang++ -Ofast -mavx512f -mno-evex512 test.cc -S -o -
clang++ -Ofast -mavx512f -mevex512 test.cc -S -o -
clang++ -Ofast -msse4.2 -mevex512 test.cc -S -o -

The last command targeting SSE produces SSE instructions (not AVX512) despite -mevex512 being present. It seems like a logical error though, and compilers might well show an error (in the future) when -mevex512 options are used outside of an AVX512/AVX10 context.

[1] https://reviews.llvm.org/D159250
Created attachment 892627 [details, diff] patch to remove avx512 code from xnnpack and skia
I've attached a patch to remove avx512 code from the xnnpack and skia libraries bundled with chromium. It is not really useful in the general case (for example, when avx512 might be desired), but it worked for me as a temporary solution.
It's part of why, in the workaround, I don't pass -mevex512 when avx512 is present, and otherwise rely on that last behaviour. It could break eventually, but who knows what direction this will go in (maybe clang will be adjusted so we don't need this anymore at all). It makes a decent temporary solution versus maintaining patches (esp. in chromium), and doesn't remove avx512 bits for those that can use it.
(In reply to Ionen Wolkens from comment #10)
> Makes a decent temporary solution versus maintaining patches
> (esp. in chromium), and doesn't remove avx512 bits for those
> that can use it.

It is certainly better.

> wrt -march=native, to clarify, it is *really* only with =native

Come to think of it, it makes some sense to me: -mavx512[...] by itself is ambiguous now and use of -march=native can be interpreted as "resolve this ambiguity for me for the current host". So, we can have a host without avx512 being put into the avx10-256 bin resulting in evex512 not being used -- thus, triggering errors when facing 512-bit register avx512 intrinsics.

To confirm:

$ clang++ -march=native -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1

^ Look, we have AVX512 definitions, but also EVEX256 -- effectively meaning AVX10-256.

$ clang++ -march=native -mno-evex512 -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1

$ clang++ -march=native -mevex512 -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1
#define __EVEX512__ 1

$ clang++ -march=native -mevex512 -dM -E - </dev/null | grep 'EVEX\|AVX'
#define __AVX2__ 1
#define __AVX__ 1

Note that -mavx512vl is important -- AVX10 targets must have it enabled:

> AVX-512 Vector Length Extensions (VL) extends most AVX-512 operations
> to also operate on XMM (128-bit) and YMM (256-bit) registers.

$ clang++ -march=native -mavx512f -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
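For anyone who prefers checking this from inside a source file rather than via -dM -E, here is a minimal sketch of a compile-time probe for the same macros. The names kAvx512f, kAvx512vl, kEvex256, kEvex512, evex_width_bits(), avx512_without_512bit_vectors() and print_probe() are made up for illustration; only the __AVX512F__, __AVX512VL__, __EVEX256__ and __EVEX512__ macros come from the compiler output above.

```cpp
// probe.cc: compile-time view of the vector-width macros shown above.
// All k* names and function names are hypothetical, chosen for this sketch.
#include <cstdio>

#if defined(__AVX512F__)
constexpr bool kAvx512f = true;
#else
constexpr bool kAvx512f = false;
#endif

#if defined(__AVX512VL__)
constexpr bool kAvx512vl = true;
#else
constexpr bool kAvx512vl = false;
#endif

#if defined(__EVEX256__)
constexpr bool kEvex256 = true;
#else
constexpr bool kEvex256 = false;
#endif

#if defined(__EVEX512__)
constexpr bool kEvex512 = true;
#else
constexpr bool kEvex512 = false;
#endif

// Widest EVEX vector width (in bits) the compiler advertises, 0 if none.
constexpr int evex_width_bits() {
    return kEvex512 ? 512 : (kEvex256 ? 256 : 0);
}

// True when AVX-512F is enabled for this translation unit but __EVEX512__
// is not defined. Compilers that predate -mevex512 (e.g. clang-17) never
// define __EVEX512__, so this is only meaningful from clang-18 on.
constexpr bool avx512_without_512bit_vectors() {
    return kAvx512f && !kEvex512;
}

// Prints the same information as the grep pipeline, one macro per line.
inline void print_probe() {
    std::printf("__AVX512F__  defined: %d\n", kAvx512f ? 1 : 0);
    std::printf("__AVX512VL__ defined: %d\n", kAvx512vl ? 1 : 0);
    std::printf("__EVEX256__  defined: %d\n", kEvex256 ? 1 : 0);
    std::printf("__EVEX512__  defined: %d\n", kEvex512 ? 1 : 0);
    std::printf("widest EVEX width   : %d bits\n", evex_width_bits());
}
```

Calling print_probe() from a main() and building the file with each of the flag combinations above should mirror the grep results; a static_assert on avx512_without_512bit_vectors() could turn the situation into an early, readable build error.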
(In reply to Alexander Sergeyev from comment #11)
> It is certainly better.

I mean the -mevex512 workaround is better, of course :)

> Look, we have AVX512 definitions, but also EVEX256 -- effectively meaning AVX10-256.

There are some inconsistencies though:

$ clang++ -march=native -mavx512f -mavx512vl -E -v - </dev/null
[...] -target-feature +avx2 [...] -target-feature -avx10.1-256 [...]

So, we have __EVEX256__ definition and (with some other definitions in mind) it should indicate avx10-256 [1], but I doubt that is the intended behavior here since the target feature avx10.1-256 was actually disabled. Color me confused.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631562.html
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8cb01633a28dad0dd4f2bfb43262ae987429fa1f commit 8cb01633a28dad0dd4f2bfb43262ae987429fa1f Author: Matt Jolly <kangie@gentoo.org> AuthorDate: 2024-05-09 21:31:41 +0000 Commit: Matt Jolly <kangie@gentoo.org> CommitDate: 2024-05-09 21:34:42 +0000 www-client/chromium: add 125.0.6422.26 Add Ionen's clang-18 -mevex512 workaround to chromium: clang-18 added -mevex512 (missing from 17), and then -march=native is a bit quirky in that unlike -march=exact it goes out of its way to disable it resulting in e.g. -march=skylake -mavx512f = -mevex512 is auto-enabled -march=skylake -mevex512 = not "enabled" but can be used -march=native(skylake) -mavx512f = forced off(!) And then units that use avx512 / pass -mavx512f (for use with runtime cpu detection) end in build failure without evex512. Always passing -mevex512 on a machine without avx512 "seems" safe, it does not even set __EVEX512__ and believe won't use any avx512 instructions on a whim (__EVEX512__ does get set if add -mavx512f) Bug: https://bugs.gentoo.org/931623 Signed-off-by: Matt Jolly <kangie@gentoo.org> www-client/chromium/Manifest | 1 + www-client/chromium/chromium-125.0.6422.26.ebuild | 1453 +++++++++++++++++++++ 2 files changed, 1454 insertions(+)
The workaround in www-client/chromium-125.0.6422.26 fails on a no-avx512 machine:

FAILED: obj/skia/skcms_TransformSkx/skcms_TransformSkx.o
../../third_party/skia/modules/skcms/src/Transform_inl.h:828:9: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI
  828 | a = F_from_U8(load<U8>(src + 1*i));
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:584:12: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  584 | return cast<F>(v) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:832:17: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI
  832 | r = g = b = F_from_U8(load<U8>(src + 1*i));
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:838:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  838 | r = cast<F>((abgr >> 12) & 0xf) * (1/15.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:839:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  839 | g = cast<F>((abgr >> 8) & 0xf) * (1/15.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:840:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  840 | b = cast<F>((abgr >> 4) & 0xf) * (1/15.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:841:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  841 | a = cast<F>((abgr >> 0) & 0xf) * (1/15.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:847:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  847 | r = cast<F>(rgb & (uint16_t)(31<< 0)) * (1.0f / (31<< 0));
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:848:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  848 | g = cast<F>(rgb & (uint16_t)(63<< 5)) * (1.0f / (63<< 5));
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:849:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  849 | b = cast<F>(rgb & (uint16_t)(31<<11)) * (1.0f / (31<<11));
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:871:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  871 | r = cast<F>(load_3<U32>(rgb+0) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:871:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  871 | r = cast<F>(load_3<U32>(rgb+0) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:872:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  872 | g = cast<F>(load_3<U32>(rgb+1) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:872:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  872 | g = cast<F>(load_3<U32>(rgb+1) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:873:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  873 | b = cast<F>(load_3<U32>(rgb+2) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:873:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  873 | b = cast<F>(load_3<U32>(rgb+2) ) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:878:16: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  878 | U32 rgba = load<U32>(src + 4*i);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:880:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  880 | r = cast<F>((rgba >> 0) & 0xff) * (1/255.0f);
      | ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:881:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  881 | g = cast<F>((rgba >> 8) & 0xff) * (1/255.0f);
      | ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
If I were to guess, it got stripped because strip-flags is run after the append happens, and -mevex512 is not in the allowed flags as far as I can see.
LLVM fixed the issue (the fix landed in the main branch): -march=native no longer implies -mno-evex512 on non-avx512 hosts, so -march=native -mavx512xxx works as expected now.

https://github.com/llvm/llvm-project/issues/91076#issuecomment-2103791226
https://github.com/llvm/llvm-project/pull/91694
(In reply to Alexander Sergeyev from comment #16)
> LLVM fixed the issue (the fix landed into the main branch), -march=native no
> longer implies -mno-evex512 on non-avx512 hosts, so -march=native
> -mavx512xxx works as expected now.
>
> https://github.com/llvm/llvm-project/issues/91076#issuecomment-2103791226
> https://github.com/llvm/llvm-project/pull/91694

Nice, it'll help a few things but doesn't sound like it'll help the cases where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is different). It may be enough for current chromium though.
(In reply to Ionen Wolkens from comment #17)
> Nice, it'll help a few things but doesn't sound like it'll help the cases
> where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is
> different). It may be enough for current chromium though.

Could you elaborate more on this? I'm not sure I understand the issue with qtwebengine and missing -mavx512* flags just from reading commit message [1]:

> -march=skylake -mavx512f = -mevex512 is auto-enabled
> -march=skylake -mevex512 = not "enabled" but can be used
> -march=native(skylake) -mavx512f = forced off(!)

So, the first works as desired. The third is fixed in LLVM now. The second is somewhat fuzzy. Skylake does not have avx512 by itself, so -mevex512 is not doing anything actually -- avx512 can still be used in cpu-dispatching projects, but it would require adding -mavx512* flags -- which we were already doing before avx10 and clang-18.

I don't think that -mevex512 should imply avx512 by default. First of all, there are multiple parts of avx512 (avx512f, avx512vl, avx512vnni and so on), so it is not clear which parts should be implied (given that there are cpus with avx512 before avx10). Second, if we are talking about -mevex512 implying avx10-512, then it would make more sense to use something like -mavx10.1-512 instead to avoid confusion between avx512 and avx10-512.

So, -mevex512 just means that we have support for 512-bit vectors (and 64-bit masks), but this does not define a particular set of instructions available -- and for that -mavx512*/-mavx10* flags come in. But I'm not sure whether specifics of -mavx10* flags are already decided/finalized in either GCC/LLVM.

[1] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=754d6f5226a532ed086afa276b48e89ffafe0484
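To illustrate what "cpu-dispatching" means here, below is a rough sketch, assuming GCC or clang on x86. It is not taken from chromium, skia or xnnpack; the names dot, dot_baseline, dot_avx512 and DEMO_X86_DISPATCH are hypothetical. The per-function target attribute stands in for building one file with -mavx512f, and __builtin_cpu_supports does the run-time check.

```cpp
// Runtime CPU dispatch sketch. All function names here are hypothetical.
#include <cassert>
#include <cstddef>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#define DEMO_X86_DISPATCH 1  // GCC or clang targeting x86
#else
#define DEMO_X86_DISPATCH 0  // anything else: baseline only
#endif

// Baseline: compiled with whatever -march the translation unit uses.
static float dot_baseline(const float* p1, const float* p2, std::size_t n) {
    float r = 0.0f;
    for (std::size_t i = 0; i < n; ++i) r += p1[i] * p2[i];
    return r;
}

#if DEMO_X86_DISPATCH
// Same loop, but the compiler is allowed to use AVX-512F here regardless
// of -march. This is the per-function analogue of passing -mavx512f.
__attribute__((target("avx512f")))
static float dot_avx512(const float* p1, const float* p2, std::size_t n) {
    float r = 0.0f;
    for (std::size_t i = 0; i < n; ++i) r += p1[i] * p2[i];
    return r;
}
#endif

// Picks an implementation at run time, so the AVX-512 path is only
// entered on hosts that actually report avx512f.
float dot(const float* p1, const float* p2, std::size_t n) {
    assert(p1 != nullptr && p2 != nullptr);
#if DEMO_X86_DISPATCH
    if (__builtin_cpu_supports("avx512f")) return dot_avx512(p1, p2, n);
#endif
    return dot_baseline(p1, p2, n);
}
```

The point of the sketch is that the instruction set (avx512f) is requested explicitly for the dispatched function; nothing in it relies on -mevex512 selecting an ISA by itself.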
(In reply to Alexander Sergeyev from comment #18)
> (In reply to Ionen Wolkens from comment #17)
> > Nice, it'll help a few things but doesn't sound like it'll help the cases
> > where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is
> > different). It may be enough for current chromium though.
>
> Could you elaborate more on this? I'm not sure I understand the issue with
> qtwebengine and missing -mavx512* flags just from reading commit message [1]:
>

Showing -m* was just to make it easier to understand; the problem happens on qtwebengine with skcms even though it does *not* pass any -mavx512* flags. And if(?) I understand the fix, it only comes into effect when avx512 was enabled.

Aka, qtwebengine's skcms does not pass any flags at all except -std=c11 [1], while current chromium does [2].

In other words, [3] remains unfixed if I understand this right -- which suggests doing -mevex512 even though avx512 is not enabled (which solved it for qtwebengine for me). It does not even set __EVEX512__ and such (aka not really enabled), but it overrides -march=native's effects, thus working around the issue.

[1] https://github.com/qt/qtwebengine-chromium/blob/007bee8d/chromium/third_party/skia/modules/skcms/BUILD.gn
[2] https://github.com/google/skia/blob/40fcf198d/modules/skcms/BUILD.gn#L72
[3] https://github.com/llvm/llvm-project/issues/70002
By the "fuzzy" one, it's essentially that effect I was trying to describe:

-march=skylake = doesn't "hard" disable it, and it works fine
-march=native = errors out complaining about EVEX512
-march=native -mevex512 = works like the first one

In all three cases, no -mavx512* is passed.
(either way I'll give it another try when the fix lands in a release to see if the workaround is still needed -- this is just the impression I was getting looking at it but maybe it's fixed for this too)
(In reply to Ionen Wolkens from comment #19)
> In other words, [3] remains unfixed if I understand this right -- which
> suggest doing -mevex512 even though avx512 is not enabled (which solved it
> for qtwebengine for me).

A compilation test of the following snippet:

#include <immintrin.h>

__m512 foo(float x) {
    return _mm512_set1_ps(x);
}

shows that both clang-17 and clang-18 will not compile it without -mavx512* options, regardless of -mevex512 being used for clang-18.

> Aka, qtwebengine's skcms does not pass any flags at all except -std=c11[1],
> while current chromium does [2]

They actually do, but in a more tricky way by pushing target attributes via pragmas [1]. So, it seems that it is not an LLVM/Clang issue after all.

[1] https://github.com/qt/qtwebengine-chromium/blob/007bee8df524433cd9bc0fe818ce7800ef24679f/chromium/third_party/skia/modules/skcms/skcms.cc#L2459
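For readers who have not seen the pragma approach before, here is a rough sketch of the general technique, assuming GCC or clang on x86. It is not the skcms code; the avx512f-only target string and the names sum, sum_wide, sum_baseline and DEMO_TARGET_REGION are invented for illustration. A region of the file gets target options through pragmas instead of command-line flags, and callers outside the region check the CPU before entering it.

```cpp
// Sketch: enabling a target feature for a region of a file via pragmas.
// All function and macro names here are hypothetical.
#include <cassert>
#include <cstddef>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__clang__)
#  define DEMO_TARGET_REGION 1  // clang: attach a target attribute to each function
#  pragma clang attribute push(__attribute__((target("avx512f"))), apply_to = function)
#elif (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#  define DEMO_TARGET_REGION 2  // gcc: switch target options for the region
#  pragma GCC push_options
#  pragma GCC target("avx512f")
#else
#  define DEMO_TARGET_REGION 0  // other compilers/architectures: plain code
#endif

// Inside the region: the compiler may emit AVX-512F instructions here even
// though no -mavx512f was given on the command line.
static float sum_wide(const float* p, std::size_t n) {
    float r = 0.0f;
    for (std::size_t i = 0; i < n; ++i) r += p[i];
    return r;
}

#if DEMO_TARGET_REGION == 1
#  pragma clang attribute pop
#elif DEMO_TARGET_REGION == 2
#  pragma GCC pop_options
#endif

// Outside the region: built with the translation unit's normal options.
static float sum_baseline(const float* p, std::size_t n) {
    float r = 0.0f;
    for (std::size_t i = 0; i < n; ++i) r += p[i];
    return r;
}

// Only enters the region's code when the running CPU reports avx512f.
float sum(const float* p, std::size_t n) {
    assert(p != nullptr);
#if DEMO_TARGET_REGION != 0
    if (__builtin_cpu_supports("avx512f")) return sum_wide(p, n);
#endif
    return sum_baseline(p, n);
}
```

From the compiler's point of view the functions in the region carry avx512f even though the build files pass no -mavx512* flag, which is why such code can be affected by the evex512 handling without any flag showing up in BUILD.gn.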
(In reply to Alexander Sergeyev from comment #22)
> > Aka, qtwebengine's skcms does not pass any flags at all except -std=c11[1],
> > while current chromium does [2]
>
> They actually do, but in a more tricky way by pushing target attributes via
> pragmas [1]. So, it seems that is not a LLVM/Clang issue after all.

I see, thanks. Was wondering how these were working at all that way and made me imagine a strange scenario :) Not something I ever used in code myself.

Guess it'll work then unless the pragma way breaks something.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a167b24a4097bb2d462875285b170818a7336eb0 commit a167b24a4097bb2d462875285b170818a7336eb0 Author: Ionen Wolkens <ionen@gentoo.org> AuthorDate: 2024-05-10 10:25:16 +0000 Commit: Ionen Wolkens <ionen@gentoo.org> CommitDate: 2024-05-10 10:25:16 +0000 dev-qt/qtwebengine: note reminder of when to drop workaround Bug: https://bugs.gentoo.org/931623 Signed-off-by: Ionen Wolkens <ionen@gentoo.org> dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild | 1 + dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 1 + dev-qt/qtwebengine/qtwebengine-6.9999.ebuild | 1 + 3 files changed, 3 insertions(+)
(In reply to Ionen Wolkens from comment #23)
> Guess it'll work then unless the pragma way breaks something.

It should not, but I believe it might be more cumbersome (especially when support for all of gcc, clang and msvc is required).

I've seen somewhere that pragmas in skia were used in an older version and then they dropped the option to disable cpu runtime dispatch and went with compile flags instead of pragmas. But I cannot quickly find the source at the moment.
(In reply to Alexander Sergeyev from comment #25)
> I've seen somewhere that pragmas in skia were used in an older version
> and then they dropped the option to disable cpu runtime dispatch and
> went with compile flags instead of pragmas.

Well, this story is a bit more complicated and pragmas apparently do have unexpected downsides at least on LLVM. For details see [1] and [2].

[1] https://skia.googlesource.com/skcms.git/+/e9cc5993398f5bcad9bf62201538c73ae86424ca
[2] https://github.com/llvm/llvm-project/issues/64706
Good to know. ftr Qt6.8 will have caught up with that change (it's based on 122.0.6261.72 currently), so any issues these pragmas might cause will just have a few more months to live at best.
*** Bug 931660 has been marked as a duplicate of this bug. ***
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=ded2b2cd180ee3896423dca54c4f24962d5c9b0a commit ded2b2cd180ee3896423dca54c4f24962d5c9b0a Author: Sam James <sam@gentoo.org> AuthorDate: 2024-05-12 04:49:41 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2024-05-12 04:49:41 +0000 flag-o-matic.eclass: allow -mevex512 and -mno-evex512 The whole -m/-mno-* situation needs to be improved in the eclass but let's do this for now for the benefit of Chromium (see 754d6f5226a532ed086afa276b48e89ffafe0484). Bug: https://bugs.gentoo.org/931623 Signed-off-by: Sam James <sam@gentoo.org> eclass/flag-o-matic.eclass | 2 ++ 1 file changed, 2 insertions(+)
(In reply to Ionen Wolkens from comment #15)
> If I were to guess, it got stripped because strip-flags is ran after the
> append happened and it's not in allowed flags that I can see.

Indeed, from dupe:
"""
strip-flags: CXXFLAGS: changed '-march=native -O2 -pipe -fomit-frame-pointer -mevex512' to '-march=native -O2 -pipe'
"""
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=45550fe4d5b041d274b4a4822f9fb829df5f0d7a commit 45550fe4d5b041d274b4a4822f9fb829df5f0d7a Author: Matt Jolly <kangie@gentoo.org> AuthorDate: 2024-05-14 07:21:10 +0000 Commit: Matt Jolly <kangie@gentoo.org> CommitDate: 2024-05-14 07:21:59 +0000 www-client/chromium: add 124.0.6367.207 Adds the avx512 w/ -march=native fix for clang18. Bug: https://bugs.gentoo.org/931623 Bug: https://bugs.gentoo.org/931897 Signed-off-by: Matt Jolly <kangie@gentoo.org> www-client/chromium/Manifest | 1 + www-client/chromium/chromium-124.0.6367.207.ebuild | 1443 ++++++++++++++++++++ 2 files changed, 1444 insertions(+)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=43fecd92f7197cbeb286843f239a09fb994cecef commit 43fecd92f7197cbeb286843f239a09fb994cecef Author: Ionen Wolkens <ionen@gentoo.org> AuthorDate: 2024-05-17 01:23:40 +0000 Commit: Ionen Wolkens <ionen@gentoo.org> CommitDate: 2024-05-17 04:16:36 +0000 dev-qt/qtwebengine: update evex512 workaround for fixed llvm version The has_version is not *necessary* but will make it easier to know it's safe to drop when it becomes essentially a no-op. Bug: https://bugs.gentoo.org/931623 Signed-off-by: Ionen Wolkens <ionen@gentoo.org> dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild | 3 ++- dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 3 ++- dev-qt/qtwebengine/qtwebengine-6.9999.ebuild | 3 ++- 3 files changed, 6 insertions(+), 3 deletions(-)
It seems that LLVM 18.1.6 (which contains the fix) is out now.