Bug 931623

Summary: www-client/chromium-124.0.6367.155: AVX vector without evex512 enabled changes the ABI
Product: Gentoo Linux
Reporter: Alexander Sergeyev <sergeev917>
Component: Current packages
Assignee: Chromium Project <chromium>
Status: UNCONFIRMED
Severity: normal
CC: answer2019, casta, ionen, jstein, kangie, leonchik1976, patrick, sergeev917
Priority: Normal
Version: unspecified
Hardware: AMD64
OS: Linux
See Also: https://bugs.gentoo.org/show_bug.cgi?id=931267
https://bugs.gentoo.org/show_bug.cgi?id=916752
https://github.com/llvm/llvm-project/issues/70002
https://github.com/llvm/llvm-project/issues/91076
https://bugs.gentoo.org/show_bug.cgi?id=931656
https://github.com/llvm/llvm-project/issues/64706
https://bugs.gentoo.org/show_bug.cgi?id=931842
Attachments: build.log
patch to remove avx512 code from xnnpack and skia

Description Alexander Sergeyev 2024-05-09 10:14:19 UTC
Chromium fails to build with clang-18.1.5 using -march=native on a host without AVX-512.

Example of an error message:
../../third_party/skia/modules/skcms/src/Transform_inl.h:852:9: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI

Reproducible: Always

Steps to Reproduce:
emerge -va1 chromium
Actual Results:  
build failure

Expected Results:  
a successful build

Use flags:
www-client/chromium X cups custom-cflags libcxx proprietary-codecs system-harfbuzz system-icu system-toolchain vaapi -bindist -debug -ffmpeg-chromium -gtk4 -hangouts -headless -kerberos -l10n_af -l10n_am -l10n_ar -l10n_bg -l10n_bn -l10n_ca -l10n_cs -l10n_da -l10n_de -l10n_el -l10n_en-GB -l10n_es -l10n_es-419 -l10n_et -l10n_fa -l10n_fi -l10n_fil -l10n_fr -l10n_gu -l10n_he -l10n_hi -l10n_hr -l10n_hu -l10n_id -l10n_it -l10n_ja -l10n_kn -l10n_ko -l10n_lt -l10n_lv -l10n_ml -l10n_mr -l10n_ms -l10n_nb -l10n_nl -l10n_pl -l10n_pt-BR -l10n_pt-PT -l10n_ro -l10n_ru -l10n_sk -l10n_sl -l10n_sr -l10n_sv -l10n_sw -l10n_ta -l10n_te -l10n_th -l10n_tr -l10n_uk -l10n_ur -l10n_vi -l10n_zh-CN -l10n_zh-TW -lto -official -pax-kernel -pgo -pulseaudio -qt5 -qt6 -screencast -selinux -system-png -system-zstd -wayland -widevine

Portage 3.0.64 (python 3.11.9-final-0, default/linux/amd64/17.1/no-multilib/hardened, gcc-13, glibc-2.39-r5, 6.8.7-gentoo x86_64)
=================================================================
sh bash 5.2_p26-r3
ld GNU ld (Gentoo 2.42 p3) 2.42.0
app-misc/pax-utils:        1.3.7::gentoo
app-shells/bash:           5.2_p26-r3::gentoo
dev-build/autoconf:        2.13-r8::gentoo, 2.72-r1::gentoo
dev-build/automake:        1.16.5-r2::gentoo
dev-build/cmake:           3.29.3::gentoo
dev-build/libtool:         2.4.7-r4::gentoo
dev-build/make:            4.4.1-r1::gentoo
dev-build/meson:           1.4.0-r1::gentoo
dev-lang/perl:             5.38.2-r2::gentoo
dev-lang/python:           3.11.9::gentoo
dev-lang/rust:             1.77.1::gentoo
sys-apps/baselayout:       2.15::gentoo
sys-apps/openrc:           0.54::gentoo
sys-apps/sandbox:          2.38::gentoo
sys-devel/binutils:        2.42-r1::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/clang:           17.0.6::gentoo, 18.1.5::gentoo
sys-devel/gcc:             13.2.1_p20240210::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-devel/lld:             17.0.6::gentoo, 18.1.5::gentoo
sys-devel/llvm:            17.0.6::gentoo, 18.1.5::gentoo
sys-kernel/linux-headers:  6.8-r1::gentoo (virtual/os-headers)
sys-libs/glibc:            2.39-r5::gentoo
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--nospinner --backtrack=4000 --verbose-conflicts --tree --unordered-display --changed-deps-report --with-bdeps=y"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live collision-protect config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict suidctl unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -march=native -fstack-clash-protection -fstack-protector-strong -fcf-protection=return"
LANG="C.UTF8"
LDFLAGS="-Wl,-z,now -Wl,-z,relro -Wl,-O1 -Wl,--as-needed"
LEX="flex"
LINGUAS=""
MAKEOPTS="-j8"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_TMPDIR="/var/tmp"
RUSTFLAGS="-C target-cpu=native"
SHELL="/bin/bash"
USE="amd64 pie split-usr ssp test-rust" ABI_X86="64" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" CURL_SSL="openssl" ELIBC="glibc" INPUT_DEVICES="libinput" KERNEL="linux" LLVM_TARGETS="AMDGPU X86" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" PYTHON_SINGLE_TARGET="python3_11" PYTHON_TARGETS="python3_11"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, LC_ALL, LD, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 1 Alexander Sergeyev 2024-05-09 10:16:31 UTC
Created attachment 892603 [details]
build.log
Comment 2 Alexander Sergeyev 2024-05-09 10:19:56 UTC
Seems to be the same underlying issue as in https://bugs.gentoo.org/931267. The problem is not limited to the bundled skia; it affects at least xnnpack as well. I'll know more after the current build completes.
Comment 3 Ionen Wolkens gentoo-dev 2024-05-09 10:34:08 UTC
Testing a (better) workaround for qtwebengine atm; will link the commit when my build finishes if there are no issues -- just want to be sure it doesn't SIGILL (shouldn't in theory).

Will likely work just as well for chromium if no issues.
Comment 4 Ionen Wolkens gentoo-dev 2024-05-09 10:37:29 UTC
wrt -march=native, to clarify, it is *really* only with =native

i.e. if you have a skylake and you do -march=skylake, it'll build fine, but -march=native will fail.
Comment 5 Ionen Wolkens gentoo-dev 2024-05-09 10:42:47 UTC
(In reply to Ionen Wolkens from comment #4)
> wrt -march=native, to clarify, it is *really* only with =native
> 
> i.e. if you have a skylake and you do -march=skylake, it'll build fine, but
> -march=native will fail.
$ clang -march=native -mavx512f -E - <<<"__EVEX512__" | tail -n 1
__EVEX512__
$ clang -march=skylake -mavx512f -E - <<<"__EVEX512__" | tail -n 1
1

Problem is not limited to when -mavx512f is passed though (inlines and stuff).
Comment 6 Larry the Git Cow gentoo-dev 2024-05-09 12:13:27 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=754d6f5226a532ed086afa276b48e89ffafe0484

commit 754d6f5226a532ed086afa276b48e89ffafe0484
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-05-09 08:19:17 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-05-09 12:12:58 +0000

    dev-qt/qtwebengine: improve clang-18 workaround w/ -mevex512 (qt6)
    
    Hoping it will be a short-lived and that this will be improved/fixed
    in clang itself.
    
    (have not tried nor looked at qtwebengine:5)
    
    For some rough explanation from the little I get from this:
    
    clang-18 added -mevex512 (missing from 17), and then -march=native
    is a bit quirky in that unlike -march=exact it goes out of its way
    to disable it resulting in e.g.
    
            -march=skylake -mavx512f = -mevex512 is auto-enabled
            -march=skylake -mevex512 = not "enabled" but can be used
            -march=native(skylake) -mavx512f = forced off(!)
    
    And then units that use avx512 / pass -mavx512f (for use with runtime
    cpu detection) end in build failure without evex512.
    
    Always passing -mevex512 on a machine without avx512 "seems" safe,
    it does not even set __EVEX512__ and believe won't use any avx512
    instructions on a whim (__EVEX512__ does get set if add -mavx512f).
    Or at least my skylake (not skylake-x) passes test + can use the
    qtwebengine built that way.
    
    Considered passing only for files that need it at first with a patch
    (sounded safer), but chromium's Gn files don't have a variable to test
    clang version that I could see (or at least not in old qtwebengine) and
    didn't want this to become more involved nor use conditional patching.
    
    The !avx512 check may not be super necessary, but have not dug into
    the implications of forcing it when avx512 is actually enabled (sounds
    there are cases where it needs to be off, leaving it to compiler).
    
    Bug: https://bugs.gentoo.org/931623
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild    | 18 +++++++-----------
 dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 18 +++++++-----------
 dev-qt/qtwebengine/qtwebengine-6.9999.ebuild   | 18 +++++++-----------
 3 files changed, 21 insertions(+), 33 deletions(-)
Comment 7 Alexander Sergeyev 2024-05-09 17:29:24 UTC
Reading through [1] it seems that the whole point of -m[no-]evex512 is to disambiguate -mavx512[...] flags which might mean different things in the future: the current avx512 with 512-bit registers; and avx10-256 i.e. avx512 set of instructions but limited to 256-bit registers:

> Based on the feedbacks from LLVM and GCC community, we have agreed to
> start from supporting -m[no-]evex512 on existing AVX512 features.
> The option -mno-evex512 can be used with -mavx512xxx to build
> binaries that can run on both legacy AVX512 targets and AVX10-256.

It can be tested with the following code sample:

float foo(const float * p1, const float * p2)
{
    float r = 0.0f;
    for (int i = 0; i < 16; ++i) {
        r += p1[i] * p2[i];
    }
    return r;
}

and the following cmdlines:

clang++ -Ofast -mavx512f -mno-evex512 test.cc -S -o -
clang++ -Ofast -mavx512f -mevex512 test.cc -S -o -
clang++ -Ofast -msse4.2 -mevex512 test.cc -S -o -

The last command, which targets SSE, produces SSE instructions (not AVX512) despite -mevex512 being present. That combination looks like a logical error, though, and compilers might well start rejecting -mevex512 outside of an AVX512/AVX10 context in the future.

[1] https://reviews.llvm.org/D159250
Comment 8 Alexander Sergeyev 2024-05-09 17:44:02 UTC
Created attachment 892627 [details, diff]
patch to remove avx512 code from xnnpack and skia
Comment 9 Alexander Sergeyev 2024-05-09 17:47:21 UTC
I've attached a patch to remove avx512 code from the xnnpack and skia libraries bundled with chromium. It is not really useful in the general case (for example, when avx512 might be desired), but it worked for me as a temporary solution.
Comment 10 Ionen Wolkens gentoo-dev 2024-05-09 18:00:44 UTC
That's part of why the workaround doesn't pass -mevex512 when avx512 is present and relies on the last behaviour described above (-mevex512 changing nothing when no AVX512 feature is enabled). It could break eventually but who knows what direction this will go in (maybe clang will be adjusted so we don't need this anymore at all). Makes a decent temporary solution versus maintaining patches (esp. in chromium), and doesn't remove avx512 bits for those that can use it.
Comment 11 Alexander Sergeyev 2024-05-09 19:29:21 UTC
(In reply to Ionen Wolkens from comment #10)
> Makes a decent temporary solution versus maintaining patches
> (esp. in chromium), and doesn't remove avx512 bits for those
> that can use it.

It is certainly better.

> wrt -march=native, to clarify, it is *really* only with =native

Come to think of it, it makes some sense to me: -mavx512[...] by itself is ambiguous now and use of -march=native can be interpreted as "resolve this ambiguity for me for the current host". So, we can have a host without avx512 being put into the avx10-256 bin resulting in evex512 not being used -- thus, triggering errors when facing 512-bit register avx512 intrinsics.

To confirm:

$ clang++ -march=native -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1

^ Look, we have AVX512 definitions, but also EVEX256 -- effectively meaning AVX10-256.

$ clang++ -march=native -mno-evex512  -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1

$ clang++ -march=native -mevex512  -mavx512f -mavx512vl -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
#define __AVX512VL__ 1
#define __EVEX256__ 1
#define __EVEX512__ 1

$ clang++ -march=native -mevex512  -dM -E - </dev/null | grep 'EVEX\|AVX'
#define __AVX2__ 1
#define __AVX__ 1

Note that -mavx512vl is important -- AVX10 targets must have it enabled:

> AVX-512 Vector Length Extensions (VL) extends most AVX-512 operations
> to also operate on XMM (128-bit) and YMM (256-bit) registers.

$ clang++ -march=native -mavx512f  -dM -E - </dev/null | grep 'EVEX\|AVX512'
#define __AVX512F__ 1
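
The same predefined macros can also be inspected from inside a translation unit, which may be handy when the flags come from a build system rather than a hand-typed command line. This is only a minimal sketch: the names `kAvx512f`, `kEvex256`, `kEvex512` and `evex_mode()` are made up for illustration, and the labels reflect my reading of the outputs above rather than any documented meaning.

```cpp
// Compile-time probe for the macros grepped above.
// All identifiers here are hypothetical; only the __AVX512F__,
// __EVEX256__ and __EVEX512__ macros come from the compiler.
#if defined(__AVX512F__)
constexpr bool kAvx512f = true;
#else
constexpr bool kAvx512f = false;
#endif

#if defined(__EVEX256__)
constexpr bool kEvex256 = true;
#else
constexpr bool kEvex256 = false;
#endif

#if defined(__EVEX512__)
constexpr bool kEvex512 = true;
#else
constexpr bool kEvex512 = false;
#endif

// Summarise the macro combination as a short label.
constexpr const char* evex_mode() {
    return kEvex512 ? "512-bit EVEX enabled"
         : kEvex256 ? "EVEX limited to 256 bits"
         : kAvx512f ? "AVX512F without EVEX macros"
                    : "no AVX-512 macros defined";
}
```

Printing `evex_mode()` from a small `main` built with each of the command lines above should mirror the grep results, though I have not checked every combination.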
Comment 12 Alexander Sergeyev 2024-05-09 19:53:03 UTC
(In reply to Alexander Sergeyev from comment #11)
> It is certainly better.

I mean the -mevex512 workaround is better, of course :)

> Look, we have AVX512 definitions, but also EVEX256 -- effectively meaning AVX10-256.

There are some inconsistencies though:

$ clang++ -march=native -mavx512f -mavx512vl -E -v - </dev/null
[...] -target-feature +avx2 [...] -target-feature -avx10.1-256 [...]

So, we have __EVEX256__ definition and (with some other definitions in mind) it should indicate avx10-256 [1], but I doubt that is the intended behavior here since the target feature avx10.1-256 was actually disabled. Color me confused.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631562.html
Comment 13 Larry the Git Cow gentoo-dev 2024-05-09 21:38:26 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8cb01633a28dad0dd4f2bfb43262ae987429fa1f

commit 8cb01633a28dad0dd4f2bfb43262ae987429fa1f
Author:     Matt Jolly <kangie@gentoo.org>
AuthorDate: 2024-05-09 21:31:41 +0000
Commit:     Matt Jolly <kangie@gentoo.org>
CommitDate: 2024-05-09 21:34:42 +0000

    www-client/chromium: add 125.0.6422.26
    
    Add Ionen's clang-18 -mevex512 workaround to chromium:
    
    clang-18 added -mevex512 (missing from 17), and then -march=native
    is a bit quirky in that unlike -march=exact it goes out of its way
    to disable it resulting in e.g.
    
            -march=skylake -mavx512f = -mevex512 is auto-enabled
            -march=skylake -mevex512 = not "enabled" but can be used
            -march=native(skylake) -mavx512f = forced off(!)
    
    And then units that use avx512 / pass -mavx512f (for use with runtime
    cpu detection) end in build failure without evex512.
    
    Always passing -mevex512 on a machine without avx512 "seems" safe,
    it does not even set __EVEX512__ and believe won't use any avx512
    instructions on a whim (__EVEX512__ does get set if add -mavx512f)
    
    Bug: https://bugs.gentoo.org/931623
    Signed-off-by: Matt Jolly <kangie@gentoo.org>

 www-client/chromium/Manifest                      |    1 +
 www-client/chromium/chromium-125.0.6422.26.ebuild | 1453 +++++++++++++++++++++
 2 files changed, 1454 insertions(+)
Comment 14 Patrick Lauer gentoo-dev 2024-05-10 05:11:06 UTC
The workaround in www-client/chromium-125.0.6422.26 fails on a no-avx512 machine:

FAILED: obj/skia/skcms_TransformSkx/skcms_TransformSkx.o 
../../third_party/skia/modules/skcms/src/Transform_inl.h:828:9: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI
  828 |     a = F_from_U8(load<U8>(src + 1*i));
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:584:12: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  584 |     return cast<F>(v) * (1/255.0f);
      |            ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:832:17: error: AVX vector return of type 'F' (aka 'Vec<16, float>') without 'evex512' enabled changes the ABI
  832 |     r = g = b = F_from_U8(load<U8>(src + 1*i));
      |                 ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:838:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  838 |     r = cast<F>((abgr >> 12) & 0xf) * (1/15.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:839:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  839 |     g = cast<F>((abgr >>  8) & 0xf) * (1/15.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:840:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  840 |     b = cast<F>((abgr >>  4) & 0xf) * (1/15.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:841:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  841 |     a = cast<F>((abgr >>  0) & 0xf) * (1/15.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:847:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  847 |     r = cast<F>(rgb & (uint16_t)(31<< 0)) * (1.0f / (31<< 0));
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:848:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  848 |     g = cast<F>(rgb & (uint16_t)(63<< 5)) * (1.0f / (63<< 5));
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:849:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  849 |     b = cast<F>(rgb & (uint16_t)(31<<11)) * (1.0f / (31<<11));
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:871:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  871 |     r = cast<F>(load_3<U32>(rgb+0) ) * (1/255.0f);
      |                 ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:871:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  871 |     r = cast<F>(load_3<U32>(rgb+0) ) * (1/255.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:872:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  872 |     g = cast<F>(load_3<U32>(rgb+1) ) * (1/255.0f);
      |                 ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:872:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  872 |     g = cast<F>(load_3<U32>(rgb+1) ) * (1/255.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:873:17: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  873 |     b = cast<F>(load_3<U32>(rgb+2) ) * (1/255.0f);
      |                 ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:873:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  873 |     b = cast<F>(load_3<U32>(rgb+2) ) * (1/255.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:878:16: error: AVX vector return of type 'unsigned int __attribute__((ext_vector_type(16)))' (vector of 16 'unsigned int' values) without 'evex512' enabled changes the ABI
  878 |     U32 rgba = load<U32>(src + 4*i);
      |                ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:880:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  880 |     r = cast<F>((rgba >>  0) & 0xff) * (1/255.0f);
      |         ^
../../third_party/skia/modules/skcms/src/Transform_inl.h:881:9: error: AVX vector return of type 'float __attribute__((ext_vector_type(16)))' (vector of 16 'float' values) without 'evex512' enabled changes the ABI
  881 |     g = cast<F>((rgba >>  8) & 0xff) * (1/255.0f);
      |         ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
Comment 15 Ionen Wolkens gentoo-dev 2024-05-10 05:17:30 UTC
If I were to guess, it got stripped because strip-flags is run after the append happened and it's not in the allowed flags that I can see.
Comment 16 Alexander Sergeyev 2024-05-10 07:30:33 UTC
LLVM fixed the issue (the fix landed into the main branch), -march=native no longer implies -mno-evex512 on non-avx512 hosts, so -march=native -mavx512xxx works as expected now.

https://github.com/llvm/llvm-project/issues/91076#issuecomment-2103791226
https://github.com/llvm/llvm-project/pull/91694
Comment 17 Ionen Wolkens gentoo-dev 2024-05-10 07:49:55 UTC
(In reply to Alexander Sergeyev from comment #16)
> LLVM fixed the issue (the fix landed into the main branch), -march=native no
> longer implies -mno-evex512 on non-avx512 hosts, so -march=native
> -mavx512xxx works as expected now.
> 
> https://github.com/llvm/llvm-project/issues/91076#issuecomment-2103791226
> https://github.com/llvm/llvm-project/pull/91694
Nice, it'll help a few things but doesn't sound like it'll help the cases where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is different). It may be enough for current chromium though.
Comment 18 Alexander Sergeyev 2024-05-10 09:27:09 UTC
(In reply to Ionen Wolkens from comment #17)
> Nice, it'll help a few things but doesn't sound like it'll help the cases
> where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is
> different). It may be enough for current chromium though.

Could you elaborate more on this? I'm not sure I understand the issue with qtwebengine and missing -mavx512* flags just from reading commit message [1]:

> -march=skylake -mavx512f = -mevex512 is auto-enabled
> -march=skylake -mevex512 = not "enabled" but can be used
> -march=native(skylake) -mavx512f = forced off(!)

So, the first works as desired. The third is fixed in LLVM now. The second is somewhat fuzzy. Skylake does not have avx512 by itself, so -mevex512 is not doing anything actually -- avx512 can still be used in cpu-dispatching projects, but it would require adding -mavx512* flags -- which we were already doing before avx10 and clang-18.

I don't think that -mevex512 should imply avx512 by default. First of all, there are multiple parts of avx512 (avx512f, avx512vl, avx512vnni and so on), so it is not clear which parts should be implied (given that there are cpus with avx512 before avx10). Second, if we are talking about -mevex512 implying avx10-512, then it would make more sense to use something like -mavx10.1-512 instead to avoid confusion between avx512 and avx10-512. So, -mevex512 just means that we have support for 512-bit vectors (and 64-bit masks), but this does not define a particular set of instructions available -- that is what the -mavx512*/-mavx10* flags are for. But I'm not sure whether the specifics of the -mavx10* flags are already decided/finalized in either GCC or LLVM.

[1] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=754d6f5226a532ed086afa276b48e89ffafe0484
Comment 19 Ionen Wolkens gentoo-dev 2024-05-10 09:50:10 UTC
(In reply to Alexander Sergeyev from comment #18)
> (In reply to Ionen Wolkens from comment #17)
> > Nice, it'll help a few things but doesn't sound like it'll help the cases
> > where it wasn't passing -mavx512* (like skia/skcms in qtwebengine which is
> > different). It may be enough for current chromium though.
> 
> Could you elaborate more on this? I'm not sure I understand the issue with
> qtwebengine and missing -mavx512* flags just from reading commit message [1]:
> 
Showing -m* was just to make it easier to understand; the problem happens on qtwebengine with skcms even though it does *not* pass any -mavx512* flags. And if(?) I understand the fix, it only comes into effect when avx512 was enabled.

Aka, qtwebengine's skcms does not pass any flags at all except -std=c11[1], while current chromium does [2]

In other words, [3] remains unfixed if I understand this right -- which suggests passing -mevex512 even though avx512 is not enabled (which solved it for qtwebengine for me). It does not even set __EVEX512__ and such (aka not really enabled), but it overrides the effects of -march=native, thus working around the issue.

[1] https://github.com/qt/qtwebengine-chromium/blob/007bee8d/chromium/third_party/skia/modules/skcms/BUILD.gn
[2] https://github.com/google/skia/blob/40fcf198d/modules/skcms/BUILD.gn#L72
[3] https://github.com/llvm/llvm-project/issues/70002
Comment 20 Ionen Wolkens gentoo-dev 2024-05-10 10:06:21 UTC Comment hidden (obsolete)
Comment 21 Ionen Wolkens gentoo-dev 2024-05-10 10:08:51 UTC
(either way I'll give it another try when the fix lands in a release to see if the workaround is still needed -- this is just the impression I was getting looking at it but maybe it's fixed for this too)
Comment 22 Alexander Sergeyev 2024-05-10 10:11:28 UTC
(In reply to Ionen Wolkens from comment #19)
> In other words, [3] remains unfixed if I understand this right -- which
> suggest doing -mevex512 even though avx512 is not enabled (which solved it
> for qtwebengine for me).

A compilation test of the following snippet:

#include <immintrin.h>
__m512 foo(float x) { return _mm512_set1_ps(x); }

shows that both clang-17 and clang-18 will not compile it without -mavx512* options regardless of -mevex512 being used for clang-18.

> Aka, qtwebengine's skcms does not pass any flags at all except -std=c11[1],
> while current chromium does [2]

They actually do, but in a trickier way, by pushing target attributes via pragmas [1]. So it seems this is not an LLVM/Clang issue after all.

[1] https://github.com/qt/qtwebengine-chromium/blob/007bee8df524433cd9bc0fe818ce7800ef24679f/chromium/third_party/skia/modules/skcms/skcms.cc#L2459
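
For readers who have not seen the pragma approach, here is a rough sketch of the general pattern: a region of the file is compiled with an extra target feature and only entered after a runtime CPU check. It is not the skcms code (which, as far as I can tell, enables a longer feature list and uses vector types); the function names and the `DOT_HAS_AVX512_VARIANT` macro are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Baseline: built with whatever flags were given on the command line.
static float dot_baseline(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
    return acc;
}

// Open a region whose functions get the avx512f target attribute,
// regardless of -march / -mavx512* on the command line.
#if defined(__x86_64__) && defined(__clang__)
#  pragma clang attribute push(__attribute__((target("avx512f"))), apply_to = function)
#  define DOT_HAS_AVX512_VARIANT 1
#elif defined(__x86_64__) && defined(__GNUC__)
#  pragma GCC push_options
#  pragma GCC target("avx512f")
#  define DOT_HAS_AVX512_VARIANT 1
#endif

#if defined(DOT_HAS_AVX512_VARIANT)
// Scalar signature on purpose: no 512-bit vector crosses the call boundary.
static float dot_avx512(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
    return acc;
}
#endif

// Close the region again.
#if defined(__x86_64__) && defined(__clang__)
#  pragma clang attribute pop
#elif defined(__x86_64__) && defined(__GNUC__)
#  pragma GCC pop_options
#endif

// Runtime dispatch: use the AVX-512 build of the loop only if the CPU has it.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
#if defined(DOT_HAS_AVX512_VARIANT)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx512f")) {
        return dot_avx512(a.data(), b.data(), a.size());
    }
#endif
    return dot_baseline(a.data(), b.data(), a.size());
}
```

The point for this bug is that the feature set of `dot_avx512` is decided inside the source file, so nothing on the compiler command line reveals that AVX-512 code is being generated.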
Comment 23 Ionen Wolkens gentoo-dev 2024-05-10 10:18:02 UTC
(In reply to Alexander Sergeyev from comment #22)
> > Aka, qtwebengine's skcms does not pass any flags at all except -std=c11[1],
> > while current chromium does [2]
> 
> They actually do, but in a more tricky way by pushing target attributes via
> pragmas [1]. So, it seems that is not a LLVM/Clang issue after all.
I see, thanks. Was wondering how these were working at all that way and made me imagine a strange scenario :) Not something I ever used in code myself.

Guess it'll work then unless the pragma way breaks something.
Comment 24 Larry the Git Cow gentoo-dev 2024-05-10 10:26:52 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a167b24a4097bb2d462875285b170818a7336eb0

commit a167b24a4097bb2d462875285b170818a7336eb0
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-05-10 10:25:16 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-05-10 10:25:16 +0000

    dev-qt/qtwebengine: note reminder of when to drop workaround
    
    Bug: https://bugs.gentoo.org/931623
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild    | 1 +
 dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 1 +
 dev-qt/qtwebengine/qtwebengine-6.9999.ebuild   | 1 +
 3 files changed, 3 insertions(+)
Comment 25 Alexander Sergeyev 2024-05-10 10:33:30 UTC
(In reply to Ionen Wolkens from comment #23)
> Guess it'll work then unless the pragma way breaks something.

It should not, but I believe it might be more cumbersome (especially when support for all of gcc, clang and msvc is required). I've seen somewhere that pragmas in skia were used in an older version and then they dropped the option to disable cpu runtime dispatch and went with compile flags instead of pragmas. But I cannot quickly find the source at the moment.
Comment 26 Alexander Sergeyev 2024-05-10 23:09:18 UTC
(In reply to Alexander Sergeyev from comment #25)
> I've seen somewhere that pragmas in skia were used in an older version
> and then they dropped the option to disable cpu runtime dispatch and
> went with compile flags instead of pragmas.

Well, this story is a bit more complicated and pragmas apparently do have unexpected downsides at least on LLVM. For details see [1] and [2].

[1] https://skia.googlesource.com/skcms.git/+/e9cc5993398f5bcad9bf62201538c73ae86424ca
[2] https://github.com/llvm/llvm-project/issues/64706
Comment 27 Ionen Wolkens gentoo-dev 2024-05-11 00:21:44 UTC
Good to know. ftr Qt6.8 will have caught up with that change (it's based on 122.0.6261.72 currently), so any issues these pragmas might cause will just have a few more months to live at best.
Comment 28 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-12 04:46:33 UTC
*** Bug 931660 has been marked as a duplicate of this bug. ***
Comment 29 Larry the Git Cow gentoo-dev 2024-05-12 04:51:07 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=ded2b2cd180ee3896423dca54c4f24962d5c9b0a

commit ded2b2cd180ee3896423dca54c4f24962d5c9b0a
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-05-12 04:49:41 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-05-12 04:49:41 +0000

    flag-o-matic.eclass: allow -mevex512 and -mno-evex512
    
    The whole -m/-mno-* situation needs to be improved in the eclass but
    let's do this for now for the benefit of Chromium (see 754d6f5226a532ed086afa276b48e89ffafe0484).
    
    Bug: https://bugs.gentoo.org/931623
    Signed-off-by: Sam James <sam@gentoo.org>

 eclass/flag-o-matic.eclass | 2 ++
 1 file changed, 2 insertions(+)
Comment 30 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-12 05:00:41 UTC
(In reply to Ionen Wolkens from comment #15)
> If I were to guess, it got stripped because strip-flags is ran after the
> append happened and it's not in allowed flags that I can see.

Indeed, from dupe:
"""
strip-flags: CXXFLAGS: changed '-march=native -O2 -pipe -fomit-frame-pointer -mevex512' to '-march=native -O2 -pipe'
"""
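
To make the ordering problem concrete, here is a toy model of an allowlist filter. It is written in C++ purely for illustration and is not the eclass code: `kAllowedPrefixes`, `is_allowed()` and `strip_flags_model()` are hypothetical, and the prefix list is chosen only so that it reproduces the log line above.

```cpp
#include <string>
#include <vector>

// Illustrative allowlist of flag prefixes; NOT the real eclass contents.
constexpr const char* kAllowedPrefixes[] = {"-march=", "-O", "-pipe"};

// True if `s` begins with `prefix`.
constexpr bool has_prefix(const char* s, const char* prefix) {
    for (; *prefix != '\0'; ++s, ++prefix) {
        if (*s != *prefix) return false;
    }
    return true;
}

// True if the flag matches any allowlisted prefix.
constexpr bool is_allowed(const char* flag) {
    for (const char* prefix : kAllowedPrefixes) {
        if (has_prefix(flag, prefix)) return true;
    }
    return false;
}

// Keep only allowlisted flags, preserving their order.
std::vector<std::string> strip_flags_model(const std::vector<std::string>& flags) {
    std::vector<std::string> kept;
    for (const std::string& flag : flags) {
        if (is_allowed(flag.c_str())) kept.push_back(flag);
    }
    return kept;
}
```

In this model a flag appended before the filter runs is dropped unless it is on the list, so the workaround needs either a later append or an allowlist entry for -mevex512.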
Comment 31 Larry the Git Cow gentoo-dev 2024-05-14 07:23:42 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=45550fe4d5b041d274b4a4822f9fb829df5f0d7a

commit 45550fe4d5b041d274b4a4822f9fb829df5f0d7a
Author:     Matt Jolly <kangie@gentoo.org>
AuthorDate: 2024-05-14 07:21:10 +0000
Commit:     Matt Jolly <kangie@gentoo.org>
CommitDate: 2024-05-14 07:21:59 +0000

    www-client/chromium: add 124.0.6367.207
    
    Adds the avx512 w/ -march=native fix for clang18.
    
    Bug: https://bugs.gentoo.org/931623
    Bug: https://bugs.gentoo.org/931897
    Signed-off-by: Matt Jolly <kangie@gentoo.org>

 www-client/chromium/Manifest                       |    1 +
 www-client/chromium/chromium-124.0.6367.207.ebuild | 1443 ++++++++++++++++++++
 2 files changed, 1444 insertions(+)
Comment 32 Larry the Git Cow gentoo-dev 2024-05-17 04:32:34 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=43fecd92f7197cbeb286843f239a09fb994cecef

commit 43fecd92f7197cbeb286843f239a09fb994cecef
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-05-17 01:23:40 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-05-17 04:16:36 +0000

    dev-qt/qtwebengine: update evex512 workaround for fixed llvm version
    
    The has_version is not *necessary* but will make it easier to
    know it's safe to drop when it becomes essentially a no-op.
    
    Bug: https://bugs.gentoo.org/931623
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 dev-qt/qtwebengine/qtwebengine-6.7.0.ebuild    | 3 ++-
 dev-qt/qtwebengine/qtwebengine-6.7.9999.ebuild | 3 ++-
 dev-qt/qtwebengine/qtwebengine-6.9999.ebuild   | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)
Comment 33 Alexander Sergeyev 2024-05-18 20:16:15 UTC
It seems that LLVM 18.1.6 (which contains the fix) is out now.