Bug 900937 - sys-devel/gcc-12.2.1_p20230304: hang building media-video/ffmpeg[abi_x86_32] libavcodec/h264_cabac.c with -O3
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-12 16:06 UTC by Rafael Kitover
Modified: 2023-03-26 23:34 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
ffmpeg build log (ffmpeg-build.log.xz, 61.69 KB, application/x-xz)
2023-03-12 16:09 UTC, Rafael Kitover
emerge --info (emerge-info.txt.xz, 7.48 KB, application/x-xz)
2023-03-12 16:11 UTC, Rafael Kitover
emerge -pqv media-video/ffmpeg (emerge-pqv-ffmpeg.txt.xz, 788 bytes, application/x-xz)
2023-03-12 16:13 UTC, Rafael Kitover

Description Rafael Kitover 2023-03-12 16:06:02 UTC
When building media-video/ffmpeg-5.1.2-r1 for the 32-bit ABI on x86-64 (needed for Wine), one of the gcc compile commands hangs, using 100% CPU on one thread.

These are the relevant processes from the ps auxww output:

root     2820803  0.0  0.0   8532  2048 pts/7    S+   16:01   0:00 x86_64-pc-linux-gnu-gcc -m32 -mfpmath=sse -I. -Isrc/ -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 -DPIC -DZLIB_CONST -DHAVE_AV_CONFIG_H -DBUILDING_avcodec -march=native -O3 -pipe -march=znver1 -std=c11 -fPIC -pthread -I/usr/include/lilv-0 -I/usr/include/serd-0 -I/usr/include/sord-0 -I/usr/include/sratom-0 -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/fribidi -I/usr/include/libxml2 -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/bs2b -I/usr/include/libdrm -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/fribidi -I/usr/include/openh264 -I/usr/include/openjpeg-2.5 -I/usr/include/opus -I/usr/include/opus -D_REENTRANT -I/usr/include/librsvg-2.0 -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/lib/libffi/include -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/libpng16 -pthread -I/usr/include/cairo -I/usr/include/freetype2 -I/usr/include/harfbuzz -I/usr/include/pixman-1 -I/usr/include/samba-4.0 -I/usr/include/srt -I/usr/include/leptonica -DX264_API_IMPORTS -I/usr/include/libxml2 -I/usr/include/libdrm -Wdeclaration-after-statement -Wall -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef -Wmissing-prototypes -Wstrict-prototypes -Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length -Wno-pointer-sign -Wno-unused-const-variable -Wno-bool-operation -Wno-char-subscripts -march=native -O3 -pipe -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -Werror=format-security -Werror=implicit-function-declaration -Werror=missing-prototypes 
-Werror=return-type -Werror=vla -Wformat -Wno-maybe-uninitialized -I/usr/include/SDL2 -D_REENTRANT -MMD -MF libavcodec/h264_cabac.d -MT libavcodec/h264_cabac.o -c -o libavcodec/h264_cabac.o src/libavcodec/h264_cabac.c
root     2820805 99.7  0.0 107308 79600 pts/7    R+   16:01   2:47 /usr/libexec/gcc/x86_64-pc-linux-gnu/12/cc1 -quiet -I . -I src/ -I /usr/include/lilv-0 -I /usr/include/serd-0 -I /usr/include/sord-0 -I /usr/include/sratom-0 -I /usr/include/freetype2 -I /usr/include/harfbuzz -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I /usr/include/fribidi -I /usr/include/libxml2 -I /usr/include/freetype2 -I /usr/include/harfbuzz -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I /usr/include/bs2b -I /usr/include/libdrm -I /usr/include/freetype2 -I /usr/include/harfbuzz -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I /usr/include/freetype2 -I /usr/include/harfbuzz -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I /usr/include/fribidi -I /usr/include/openh264 -I /usr/include/openjpeg-2.5 -I /usr/include/opus -I /usr/include/opus -I /usr/include/librsvg-2.0 -I /usr/include/glib-2.0 -I /usr/lib/glib-2.0/include -I /usr/lib/libffi/include -I /usr/include/libmount -I /usr/include/blkid -I /usr/include/gdk-pixbuf-2.0 -I /usr/include/libpng16 -I /usr/include/cairo -I /usr/include/freetype2 -I /usr/include/harfbuzz -I /usr/include/pixman-1 -I /usr/include/samba-4.0 -I /usr/include/srt -I /usr/include/leptonica -I /usr/include/libxml2 -I /usr/include/libdrm -I /usr/include/SDL2 -imultilib 32 -MMD libavcodec/h264_cabac.d -MF libavcodec/h264_cabac.d -MT libavcodec/h264_cabac.o -D_REENTRANT -D _ISOC99_SOURCE -D _FILE_OFFSET_BITS=64 -D _LARGEFILE_SOURCE -D _POSIX_C_SOURCE=200112 -D _XOPEN_SOURCE=600 -D PIC -D ZLIB_CONST -D HAVE_AV_CONFIG_H -D BUILDING_avcodec -D _REENTRANT -D X264_API_IMPORTS -D _REENTRANT src/libavcodec/h264_cabac.c -march=znver1 -mmmx -mpopcnt -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -mavx -mavx2 -msse4a -mno-fma4 -mno-xop -mfma -mno-avx512f -mbmi -mbmi2 -maes -mpclmul -mno-avx512vl -mno-avx512bw -mno-avx512dq -mno-avx512cd -mno-avx512er -mno-avx512pf -mno-avx512vbmi -mno-avx512ifma -mno-avx5124vnniw -mno-avx5124fmaps 
-mno-avx512vpopcntdq -mno-avx512vbmi2 -mno-gfni -mno-vpclmulqdq -mno-avx512vnni -mno-avx512bitalg -mno-avx512bf16 -mno-avx512vp2intersect -mno-3dnow -madx -mabm -mno-cldemote -mclflushopt -mno-clwb -mclzero -mcx16 -mno-enqcmd -mf16c -mfsgsbase -mfxsr -mno-hle -msahf -mno-lwp -mlzcnt -mmovbe -mno-movdir64b -mno-movdiri -mmwaitx -mno-pconfig -mno-pku -mno-prefetchwt1 -mprfchw -mno-ptwrite -mno-rdpid -mrdrnd -mrdseed -mno-rtm -mno-serialize -mno-sgx -msha -mno-shstk -mno-tbm -mno-tsxldtrk -mno-vaes -mno-waitpkg -mno-wbnoinvd -mxsave -mxsavec -mxsaveopt -mxsaves -mno-amx-tile -mno-amx-int8 -mno-amx-bf16 -mno-uintr -mno-hreset -mno-kl -mno-widekl -mno-avxvnni -mno-avx512fp16 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=znver1 -quiet -dumpdir libavcodec/ -dumpbase h264_cabac.c -dumpbase-ext .c -m32 -mfpmath=sse -O3 -O3 -Wdeclaration-after-statement -Wall -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef -Wmissing-prototypes -Wstrict-prototypes -Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length -Wno-pointer-sign -Wunused-const-variable=0 -Wno-bool-operation -Wno-char-subscripts -Werror=format-security -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=return-type -Werror=vla -Wformat=1 -Wno-maybe-uninitialized -std=c11 -fPIC -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -o -



Reproducible: Always

Steps to Reproduce:
Add the USE flag:

media-video/ffmpeg abi_x86_32
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-12 16:08:21 UTC
Please include the full build.log up to the point it hangs and emerge --info.
Comment 2 Rafael Kitover 2023-03-12 16:09:37 UTC
Created attachment 857399 [details]
ffmpeg build log
Comment 3 Rafael Kitover 2023-03-12 16:11:00 UTC
Created attachment 857401 [details]
emerge --info
Comment 4 Rafael Kitover 2023-03-12 16:13:40 UTC
Created attachment 857403 [details]
emerge -pqv media-video/ffmpeg
Comment 5 Rafael Kitover 2023-03-12 16:14:23 UTC
Note that the build log was captured after I killed the gcc process, since it hangs.
Comment 6 Rafael Kitover 2023-03-12 16:58:20 UTC
Builds fine if I change -O3 to -O2 in CFLAGS.
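
For anyone who wants to keep -O3 globally and lower it for this one package, Portage's per-package environment files can do that. A minimal sketch: the function name `write_o2_override` and the file name `no-o3.conf` are made up for illustration, and it assumes the ffmpeg build passes the user's CFLAGS through unchanged (as the command line in the description suggests). Appending `-O2` relies on GCC using the last `-O` level it is given.

```shell
# Hypothetical helper: write a per-package -O2 override below a Portage
# config root ("/" on a live system; a scratch directory for a dry run).
# usage: write_o2_override ROOT
write_o2_override() {
    root=${1%/}
    mkdir -p "$root/etc/portage/env" || return 1
    # Appending -O2 is enough: GCC honours the last -O level on the command line.
    cat > "$root/etc/portage/env/no-o3.conf" <<'EOF'
CFLAGS="${CFLAGS} -O2"
CXXFLAGS="${CXXFLAGS} -O2"
EOF
    entry='media-video/ffmpeg no-o3.conf'
    # package.env may be a single file or a directory of files.
    if [ -d "$root/etc/portage/package.env" ]; then
        echo "$entry" >> "$root/etc/portage/package.env/ffmpeg"
    else
        echo "$entry" >> "$root/etc/portage/package.env"
    fi
}
```

Running `write_o2_override /tmp/portage-test` first and inspecting the result is a safe way to check the files before using `/` as the root.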
Comment 7 Ninpo 2023-03-12 17:47:47 UTC
I can confirm: I also hit this hang at -O3 with both gcc-12 and gcc-13 when building ffmpeg 4.4.3. It builds fine with -O2.
Comment 8 Ninpo 2023-03-12 18:09:55 UTC
My -march=native (znver1), expanded:

# for t in param target; do cmd="gcc -Q -O2 --help=$t"; diff -U0 <(LANG=C $cmd) <(LANG=C $cmd -march=native); done
--- /dev/fd/63  2023-03-12 18:07:58.606179015 +0000
+++ /dev/fd/62  2023-03-12 18:07:58.607179028 +0000
@@ -21 +21 @@
-  --param=avoid-fma-max-bits=<0,512>           0
+  --param=avoid-fma-max-bits=<0,512>           128
@@ -234 +234 @@
-  --param=simultaneous-prefetches=     6
+  --param=simultaneous-prefetches=     100
--- /dev/fd/63  2023-03-12 18:07:58.613179102 +0000
+++ /dev/fd/62  2023-03-12 18:07:58.613179102 +0000
@@ -12 +12 @@
-  -mabm                                [disabled]
+  -mabm                                [enabled]
@@ -15,2 +15,2 @@
-  -madx                                [disabled]
-  -maes                                [disabled]
+  -madx                                [enabled]
+  -maes                                [enabled]
@@ -27 +27 @@
-  -march=                              x86-64
+  -march=                              znver1
@@ -29,2 +29,2 @@
-  -mavx                                [disabled]
-  -mavx2                               [disabled]
+  -mavx                                [enabled]
+  -mavx2                               [enabled]
@@ -32 +32 @@
-  -mavx256-split-unaligned-store       [disabled]
+  -mavx256-split-unaligned-store       [enabled]
@@ -53,2 +53,2 @@
-  -mbmi                                [disabled]
-  -mbmi2                               [disabled]
+  -mbmi                                [enabled]
+  -mbmi2                               [enabled]
@@ -60 +60 @@
-  -mclflushopt                         [disabled]
+  -mclflushopt                         [enabled]
@@ -62 +62 @@
-  -mclzero                             [disabled]
+  -mclzero                             [enabled]
@@ -65,2 +65,2 @@
-  -mcrc32                              [disabled]
-  -mcx16                               [disabled]
+  -mcrc32                              [enabled]
+  -mcx16                               [enabled]
@@ -71 +71 @@
-  -mf16c                               [disabled]
+  -mf16c                               [enabled]
@@ -76 +76 @@
-  -mfma                                [disabled]
+  -mfma                                [enabled]
@@ -82 +82 @@
-  -mfsgsbase                           [disabled]
+  -mfsgsbase                           [enabled]
@@ -109 +109 @@
-  -mlzcnt                              [disabled]
+  -mlzcnt                              [enabled]
@@ -115 +115 @@
-  -mmovbe                              [disabled]
+  -mmovbe                              [enabled]
@@ -122,2 +122,2 @@
-  -mmwait                              [disabled]
-  -mmwaitx                             [disabled]
+  -mmwait                              [enabled]
+  -mmwaitx                             [enabled]
@@ -130 +130 @@
-  -mno-sse4                            [enabled]
+  -mno-sse4                            [disabled]
@@ -136 +136 @@
-  -mpclmul                             [disabled]
+  -mpclmul                             [enabled]
@@ -140 +140 @@
-  -mpopcnt                             [disabled]
+  -mpopcnt                             [enabled]
@@ -142 +142 @@
-  -mprefer-vector-width=               none
+  -mprefer-vector-width=               128
@@ -145 +145 @@
-  -mprfchw                             [disabled]
+  -mprfchw                             [enabled]
@@ -149,2 +149,2 @@
-  -mrdrnd                              [disabled]
-  -mrdseed                             [disabled]
+  -mrdrnd                              [enabled]
+  -mrdseed                             [enabled]
@@ -160 +160 @@
-  -msahf                               [disabled]
+  -msahf                               [enabled]
@@ -163 +163 @@
-  -msha                                [disabled]
+  -msha                                [enabled]
@@ -170,5 +170,5 @@
-  -msse3                               [disabled]
-  -msse4                               [disabled]
-  -msse4.1                             [disabled]
-  -msse4.2                             [disabled]
-  -msse4a                              [disabled]
+  -msse3                               [enabled]
+  -msse4                               [enabled]
+  -msse4.1                             [enabled]
+  -msse4.2                             [enabled]
+  -msse4a                              [enabled]
@@ -177 +177 @@
-  -mssse3                              [disabled]
+  -mssse3                              [enabled]
@@ -192 +192 @@
-  -mtune=                              generic
+  -mtune=                              znver1
@@ -205,4 +205,4 @@
-  -mxsave                              [disabled]
-  -mxsavec                             [disabled]
-  -mxsaveopt                           [disabled]
-  -mxsaves                             [disabled]
+  -mxsave                              [enabled]
+  -mxsavec                             [enabled]
+  -mxsaveopt                           [enabled]
+  -mxsaves                             [enabled]
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-12 18:18:05 UTC
Someone hit this on the forums at https://forums.gentoo.org/viewtopic-t-1162301.html too.
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-12 19:12:17 UTC
Reproduced with CFLAGS="-O3 -march=znver1" CXXFLAGS="-O3 -march=znver1" USE="-* X abi_x86_32 abi_x86_64 alsa amd64 amr amrenc bluray bs2b bzip2 cdio chromium codec2 cpu_flags_x86_aes cpu_flags_x86_avx cpu_flags_x86_avx2 cpu_flags_x86_fma3 cpu_flags_x86_mmx cpu_flags_x86_mmxext cpu_flags_x86_sse cpu_flags_x86_sse2 cpu_flags_x86_sse3 cpu_flags_x86_sse4_1 cpu_flags_x86_sse4_2 cpu_flags_x86_ssse3 dav1d elibc_glibc encode fdk fftools_aviocat fftools_cws2fws fftools_ffescape fftools_ffeval fftools_ffhash fftools_fourcc2pixfmt fftools_graph2dot fftools_ismindex fftools_pktdumper fftools_qt-faststart fftools_sidxindex fftools_trasher flite fontconfig frei0r fribidi gme gmp gnutls gpl gsm hardcoded-tables iconv iec61883 ieee1394 jack jpeg2k kernel_linux kvazaar ladspa lcms libaom libaribb24 libass libcaca libdrm libilbc librtmp libsoxr libtesseract libv4l libxml2 lv2 lzma modplug mp3 network openal opencl opengl openh264 opus oss postproc pulseaudio rubberband samba sdl snappy speex srt ssh svg theora threads truetype twolame userland_GNU v4l vaapi vdpau vidstab vorbis vpx vulkan webp x264 x265 xvid zeromq zimg zlib zvbi" ebuild ffmpeg-5.1.2-r1.ebuild clean compile
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-12 19:33:14 UTC
(In reply to Sam James from comment #10)

USE="encode libxml2 opus jpeg2k samba svg srt libtesseract opengl sdl x264 x265 openh264" CFLAGS="-O3 -march=znver1" CXXFLAGS="-O3 -march=znver1" is not enough to reproduce it, though.
Comment 12 Ninpo 2023-03-12 20:14:34 UTC
To add: I do not get a hang during the build with sys-devel/gcc-12.2.1_p20230121-r1.
Comment 13 Claus-Justus Heine 2023-03-14 06:45:55 UTC
Same problem here with

CFLAGS="-march=native -O3 -pipe -Wno-narrowing -fno-stack-check"
CXXFLAGS="-fpermissive ${CFLAGS}"

USE="X alsa amr amrenc bluray bs2b bzip2 cdio chromaprint chromium codec2 cpudetection dav1d doc encode fdk fontconfig frei0r fribidi gme gmp gnutls gpl gsm hardcoded-tables iconv iec61883 ieee1394 jpeg2k kvazaar ladspa libaom libaribb24 libass libcaca libdrm libilbc librtmp libsoxr libtesseract libv4l libxml2 lv2 lzma modplug mp3 network openal opencl opengl openh264 openssl opus postproc pulseaudio rav1e rubberband samba sdl snappy speex ssh svg theora threads truetype twolame v4l vaapi vdpau vidstab vorbis vpx vulkan webp x264 x265 xvid zeromq zimg zlib zvbi -amf (-appkit) -cuda -debug -flite -gcrypt -jack (-mipsdspr1) (-mipsdspr2) (-mipsfpu) (-mmal) -nvenc -oss -pic -sndio -srt -static-libs -svt-av1 -test -verify-sig -vmaf"

CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext sse sse2 sse3 sse4_1 sse4_2 ssse3 -3dnow -3dnowext -fma4 -xop"

FFTOOLS="aviocat cws2fws ffescape ffeval ffhash fourcc2pixfmt graph2dot ismindex pktdumper qt-faststart sidxindex trasher"
Comment 14 Ninpo 2023-03-14 09:28:01 UTC
Minimal flags to reproduce build hang:

CPU_FLAGS_X86="" ABI_X86="32 64" USE="-*" ebuild ffmpeg-4.4.3.ebuild clean compile

Using ABI_X86="64" builds successfully.
Comment 15 Ninpo 2023-03-14 12:17:10 UTC
Further experimentation shows that:

CFLAGS="-march=x86-64 -mtune=znver1 -O3 -pipe" CXXFLAGS="${CFLAGS}" CPU_FLAGS_X86="" ABI_X86="32" USE="-*" ebuild ffmpeg-4.4.3.ebuild clean compile

does NOT hang during the build, whereas:

CFLAGS="-march=znver1 -O3 -pipe" CXXFLAGS="${CFLAGS}" CPU_FLAGS_X86="" ABI_X86="32" USE="-*" ebuild ffmpeg-4.4.3.ebuild clean compile

DOES hang. So the hang needs the combination of -march=znver1 and -O3; -mtune=znver1 on its own does not trigger it.
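
A comparison like this can be scripted so that each flag set is tried under a time limit rather than watched by hand. A minimal sketch, assuming coreutils `timeout` is available; `probe_flags`, the time limit, and the preprocessed input file in the usage line are hypothetical, and whether a reduced command line still hangs is not something this report establishes.

```shell
# Try one compile per flag set and report whether it finished, hung, or failed.
# usage: probe_flags SECONDS COMPILER SOURCE FLAGSET...
probe_flags() {
    limit=$1 cc=$2 src=$3
    shift 3
    for flags in "$@"; do
        # $cc and $flags are intentionally unquoted so they split into words.
        if timeout "$limit" $cc $flags -c -o /dev/null "$src" >/dev/null 2>&1; then
            echo "OK    $flags"
        elif [ $? -eq 124 ]; then
            # timeout(1) exits with 124 when it had to kill the command.
            echo "HANG  $flags"
        else
            echo "FAIL  $flags"
        fi
    done
}
```

For example, `probe_flags 120 "x86_64-pc-linux-gnu-gcc -m32 -fPIC" h264_cabac.i "-O3 -march=x86-64 -mtune=znver1" "-O3 -march=znver1"` would print one line per flag set, where `h264_cabac.i` stands for a preprocessed copy of the file that hangs.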
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-14 23:41:37 UTC
Thanks! (FWIW, regarding my USE="-*" attempt in comment #11: I forgot to re-enable abi_x86_32, which is why I could not hit the hang when experimenting.)

Bisecting now.
Comment 17 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-15 00:05:46 UTC
Started with:
git bisect good 3ae7a822456bc538d4cefaa3a22fe56d43640a04 # 12.2.1_p20230121-r1
git bisect bad aa1f923af4d8cb5f5735d39f667f61aa7c900b5e # 12.2.1_p20230304

Result:
```
489c81db7d4f75894e9d34aa90fe7224cfafb53a is the first bad commit
commit 489c81db7d4f75894e9d34aa90fe7224cfafb53a
Author: Jan Hubicka <jh@suse.cz>
Date:   Thu Dec 22 10:55:46 2022 +0100

    Zen4 tuning part 2

    Adds tunes needed for zen4 microarchitecture.  I added two new knobs.
    TARGET_AVX512_SPLIT_REGS which is used to specify that internally 512 vectors
    are split to 256 vectors.  This affects vectorization costs and reassociation
    width. It probably should also affect RTX costs however I doubt it is very useful
    since RTL optimizers are usually not judging between 256 and 512 vectors.

    I also added X86_TUNE_AVOID_256FMA_CHAINS. Since fma has improved in zen4 this
    flag may not be a win except for very specific benchmarks. I am still doing some
    more detailed testing here.

    Oherwise I disabled gathers on zen4 for 2 parts nad 4 parts. We can open code them
    and since the latencies has only increased since zen3 opencoding is better than
    actual instrucction.  This shows at 4 tsvc benchmarks.

    I ended up setting AVX256_OPTIMAL. This is a compromise.  There are some tsvc
    benchmarks that increase noticeably (up to 250%) however there are also few
    regressions.  Most of these can be solved by incrasing vec_perm cost in the
    vectorizer.  However this does not cure about 14% regression on x264 that is
    quite important.  Here we produce vectorized loops for avx512 that probably
    would be faster if the loops in question had high enough iteration count.
    We hit this problem with avx256 too: since the loop iterates few times, only
    prologues/epilogues are used.  Adding another round of prologue/epilogue
    code does not make it better.

    Finally I enabled avx stores for constnat sized memcpy and memset.  I am not
    sure why this is an opt-in feature.  I think for most hardware this is a win.

    gcc/ChangeLog:

    2022-12-22  Jan Hubicka  <hubicka@ucw.cz>

            * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Add
            TARGET_AVX512_SPLIT_REGS
            * config/i386/i386-options.cc (ix86_option_override_internal):
            Honor x86_TONE_AVOID_256FMA_CHAINS.
            * config/i386/i386.cc (ix86_vec_cost): Honor TARGET_AVX512_SPLIT_REGS.
            (ix86_reassociation_width): Likewise.
            * config/i386/i386.h (TARGET_AVX512_SPLIT_REGS): New tune.
            * config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Disable
            for znver4.
            (X86_TUNE_USE_GATHER_4PARTS): Likewise.
            (X86_TUNE_AVOID_256FMA_CHAINS): Set for znver4.
            (X86_TUNE_AVOID_512FMA_CHAINS): New utne; set for znver4.
            (X86_TUNE_AVX256_OPTIMAL): Add znver4.
            (X86_TUNE_AVX512_SPLIT_REGS): New tune.
            (X86_TUNE_AVX256_MOVE_BY_PIECES): Add znver1-3.
            (X86_TUNE_AVX256_STORE_BY_PIECES): Add znver1-3.
            (X86_TUNE_AVX512_MOVE_BY_PIECES): Add znver4.
            (X86_TUNE_AVX512_STORE_BY_PIECES): Add znver4.

    (cherry picked from commit eef81eefcdc2a58111e50eb2162ea1f5becc8004)

 gcc/config/i386/i386-expand.cc  |  2 ++
 gcc/config/i386/i386-options.cc |  2 ++
 gcc/config/i386/i386.cc         | 11 ++++++++---
 gcc/config/i386/i386.h          |  2 ++
 gcc/config/i386/x86-tune.def    | 23 +++++++++++++++--------
 5 files changed, 29 insertions(+), 11 deletions(-)
bisect found first bad commit
```
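
For a hang rather than a crash, a bisection like this can be driven by `git bisect run` if the test step converts "did not finish in time" into a failing exit status. A minimal sketch, assuming coreutils `timeout`; the function name `bisect_hang_check` and any time limit passed to it are illustrative and not what was used for the result above.

```shell
# Map a possibly-hanging command onto the exit codes git bisect run expects.
# usage: bisect_hang_check SECONDS COMMAND [ARGS...]
bisect_hang_check() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    case $status in
        0)   return 0 ;;    # finished in time: mark the commit good
        124) return 1 ;;    # timeout(1) killed it: mark the commit bad (hang)
        *)   return 125 ;;  # unrelated failure: ask git bisect to skip
    esac
}
```

A bisect script would rebuild the compiler at each step and then end with a call such as `bisect_hang_check 300 <the compile command that hangs>`. `git bisect run` treats exit status 0 as good, 125 as skip, and any other status from 1 to 127 as bad.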
Comment 18 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-15 00:32:25 UTC
Reported upstream at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137.
Comment 19 Larry the Git Cow gentoo-dev 2023-03-15 02:15:53 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=60b38d402d8674ea08c9b69cf3147e0b92ab87c2

commit 60b38d402d8674ea08c9b69cf3147e0b92ab87c2
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-03-15 02:13:59 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-03-15 02:14:48 +0000

    media-video/ffmpeg: fix build on register-starved x86
    
    Newer compilers may optimise such that < 7 registers are free on 32-bit x86
    and then we get an "invalid asm" error. This is https://bugs.gentoo.org/901099
    and https://trac.ffmpeg.org/ticket/8903.
    
    Making matters worse, GCC sometimes hangs on invalid asm, so this also
    mitigates a hang with e.g. -O3 -march=znver1. See https://bugs.gentoo.org/900937
    and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137.
    
    In future, we may want to adjust the definition of HAVE_7REGS to just exclude
    32-bit x86, but that's a big sledgehammer, so let's avoid it for now until we have
    a reply on the upstream ffmpeg bug.
    
    Thanks to Ninpo.
    
    Bug: https://trac.ffmpeg.org/ticket/8903
    Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109137
    Bug: https://bugs.gentoo.org/900937
    Closes: https://bugs.gentoo.org/901099
    Signed-off-by: Sam James <sam@gentoo.org>

 media-video/ffmpeg/ffmpeg-4.4.3.ebuild             |  3 ++-
 media-video/ffmpeg/ffmpeg-5.1.2-r1.ebuild          |  3 ++-
 media-video/ffmpeg/ffmpeg-6.0.ebuild               |  3 ++-
 .../ffmpeg-4.4.3-get_cabac_inline_x86-32-bit.patch | 24 +++++++++++++++++++++
 .../ffmpeg-5.1.2-get_cabac_inline_x86-32-bit.patch | 25 ++++++++++++++++++++++
 5 files changed, 55 insertions(+), 3 deletions(-)
Comment 20 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-26 23:34:47 UTC
Fixed in https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=81d762cbec9685c2f2571da21d48f42c42eff33b for 13 (landed in sys-devel/gcc-13.0.1_pre20230326).

It is not yet in the GCC 12 branch. Let's call this fixed though, as we've worked around it in ffmpeg anyway, and the fix will get backported in due course.