https://blogs.gentoo.org/ago/2020/07/04/gentoo-tinderbox/

Issue: x11-drivers/xf86-video-intel-2.99.917_p20201215 fails to compile (lto).
Discovered on: amd64 (internal ref: lto_tinderbox)

NOTE:
This machine uses lto with CFLAGS=-flto -Werror=odr -Werror=lto-type-mismatch -Werror=strict-aliasing

Here is a bit of explanation:

-Werror=lto-type-mismatch:
Used to find possible runtime issues in packages. It likely means the package is unsafe to build & use with LTO. Projects that use the same identifier with different types across different files must be fixed to be consistent across the codebase.

-Werror=odr:
Used to find possible runtime issues in packages. These bugs are a problem anyway but may be even worse when combined with LTO. C++ code must comply with the One Definition Rule (ODR) - see https://en.cppreference.com/w/cpp/language/definition#One_Definition_Rule.

-Werror=strict-aliasing:
Used to find possible runtime issues in packages. These bugs are a problem anyway but may be even worse when combined with LTO.

Workarounds:
- If upstream is friendly and still active, file a bug upstream. For emulators, codecs, games, or multimedia packages, it may be worth just applying a workaround instead, as upstreams sometimes aren't receptive to these bugs (VALID FOR ALL).
- Use the new 'filter-lto' from flag-o-matic.eclass as it's likely to be unsafe with LTO (VALID FOR lto-type-mismatch - odr).
- Fix it yourself if interested, of course (VALID FOR ALL).
- Append-flags -fno-strict-aliasing (VALID FOR strict-aliasing).
- Use memcpy() but a union is sometimes suitable too (VALID FOR strict-aliasing).
- -fstrict-aliasing is implied by -O2, so this must be addressed in some form (VALID FOR strict-aliasing).

See also: https://marc.info/?l=gentoo-dev&m=165639574126280&w=2
Created attachment 798811: build.log
build log and emerge --info
I confirm. I let it run for a while, and the lto1 process was eventually killed because it ran out of memory.
The hanging command is:
```
/usr/bin/x86_64-pc-linux-gnu-gcc -shared -fPIC -DPIC .libs/backlight.o .libs/fd.o .libs/intel_device.o .libs/intel_options.o .libs/intel_module.o -Wl,--whole-archive legacy/.libs/liblegacy.a sna/.libs/libsna.a -Wl,--no-whole-archive -Wl,--as-needed -lpciaccess -lpixman-1 -ludev -lm -ldrm -O2 -march=native -ggdb3 -flto -Werror=strict-aliasing -Werror=lto-type-mismatch -O2 -march=native -ggdb3 -flto -Werror=strict-aliasing -Werror=lto-type-mismatch -ggdb3 -Wl,-O1 -Wl,--defsym=__gentoo_check_ldflags__=0 -Wl,-z -Wl,pack-relative-relocs -flto -Werror=odr -Werror=strict-aliasing -Werror=lto-type-mismatch -Wl,-O1 -Wl,--defsym=__gentoo_check_ldflags__=0 -Wl,-z -Wl,pack-relative-relocs -flto -Werror=odr -Werror=strict-aliasing -Werror=lto-type-mismatch -ggdb3 -Wl,-z -Wl,lazy -pthread -Wl,-soname -Wl,intel_drv.so -o .libs/intel_drv.so
```

Just this hangs for me too (can drop the -flto as it auto-uses it from seeing the objects were built using it):
```
/usr/bin/x86_64-pc-linux-gnu-gcc -shared -fPIC -DPIC .libs/intel_module.o -Wl,--whole-archive legacy/.libs/liblegacy.a sna/.libs/libsna.a -Wl,--no-whole-archive
```

In fact, even this does: `/usr/bin/x86_64-pc-linux-gnu-gcc -shared .libs/intel_module.o sna/.libs/libsna.a`

The latter is pretty big:
-rw-r--r-- 1 portage portage 334K Jun 1 09:26 .libs/intel_module.o
-rw-r--r-- 1 portage portage 33M Jun 1 09:26 sna/.libs/libsna.a

Interestingly, someone filed https://gitlab.freedesktop.org/xorg/driver/xf86-video-intel/-/issues/28 ages ago which ended up with https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71991, but obviously an issue is still around even if it's not the same one.
Actually, marxin went on to file https://bugzilla.opensuse.org/show_bug.cgi?id=1133292 *after* the fix in GCC, so it seems like something that people know about.
(In reply to Sam James from comment #3)
> In fact, even this does: `/usr/bin/x86_64-pc-linux-gnu-gcc -shared
> .libs/intel_module.o sna/.libs/libsna.a`
>

I extracted libsna.a and then e.g. gcc -shared .libs/intel_module.o *.o would still hang (good).

Then cheesily copied all of that to /tmp/hang for use with cvise w/ test.sh:
```
#!/usr/bin/env bash
# Start each run from a fresh copy of the objects
cp -r /tmp/reduce/*.o "${PWD}"
cp -r /tmp/reduce/.libs/ "${PWD}"

timeout 30s gcc -shared .libs/intel_module.o @obj_list
timeout_result=$?

case ${timeout_result} in
	124)
		# Timed out
		exit 0
		;;
	0)
		;;
	*)
		# Something else happened, skip
		exit 125
		;;
esac

exit 1
```
and obj_list just being a list of *.o in the dir.

This ends up giving me @obj_list as:
```
blt.o brw_eu_emit.o brw_eu.o brw_wm.o
gen2_render.o gen3_render.o gen4_common.o gen4_render.o gen4_source.o gen4_vertex.o
gen5_render.o gen6_common.o gen6_render.o gen7_render.o gen8_eu.o gen8_render.o
libfb_la-fbimage.o libfb_la-fbline.o libfb_la-fbstipple.o libfb_la-fbtile.o libfb_la-fbutil.o
sna_accel.o sna_acpi.o sna_blt.o sna_composite.o sna_cpu.o sna_damage.o
sna_display_fake.o sna_display.o sna_dri2.o sna_dri3.o sna_driver.o
sna_glyphs.o sna_gradient.o sna_io.o sna_stream.o sna_threads.o sna_tiling.o sna_transform.o
sna_trapezoids_boxes.o sna_trapezoids_imprecise.o sna_trapezoids_mono.o sna_trapezoids.o sna_trapezoids_precise.o
sna_vertex.o sna_video.o sna_video_overlay.o sna_video_sprite.o sna_video_textured.o
```
so `gcc -shared .libs/intel_module.o @obj_list` is apparently the minimal command needed to hang for >= 30s.

I'm a bit suspicious of this; surely not all of those objects are genuinely responsible?
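For reference, a response file like obj_list could be produced along these lines. This is only a sketch: `make_obj_list` is a hypothetical helper name, not something used in the reduction itself, and it assumes the object file names contain no whitespace (GCC splits `@file` contents on whitespace).

```shell
# Hypothetical helper: write the name of every *.o in DIR to OUT, one per line.
# Returns non-zero if DIR cannot be entered or contains no object files.
make_obj_list() {
	dir=$1
	out=$2
	(
		cd "$dir" || exit 1
		# Expand the glob into the positional parameters
		set -- *.o
		# If nothing matched, $1 is still the literal pattern "*.o"
		[ -e "$1" ] || exit 1
		printf '%s\n' "$@"
	) > "$out"
}
```

Calling `make_obj_list . obj_list` in the directory holding the extracted objects would give a file that can be passed to gcc as `@obj_list`.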
... and actually, this does terminate for me while the original doesn't, so I may need to re-run the reduction with an increased timeout or something.
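One way to make the time limit adjustable for such a re-run is sketched below. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit expires; `is_hang` and `HANG_TIMEOUT` are hypothetical names chosen for illustration, and the 30s default simply mirrors the value in test.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original test.sh.
# HANG_TIMEOUT is an assumed variable name; the default matches the 30s used before.

# Succeeds (status 0) only if the given command is still running when the
# time limit expires, i.e. GNU timeout reports status 124.
is_hang() {
	timeout "${HANG_TIMEOUT:-30s}" "$@" >/dev/null 2>&1
	[ $? -eq 124 ]
}
```

An interestingness test could then end with something like `HANG_TIMEOUT=10m is_hang gcc -shared .libs/intel_module.o @obj_list`, at the cost of each cvise step taking correspondingly longer.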
given it's using flatten (https://gitlab.freedesktop.org/xorg/driver/xf86-video-intel/-/issues/28), this is borderline an upstream bug even if gcc should perhaps terminate, so let's just filter-lto
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c65b98fb168626fb7c90358869f2694bdf2809bb

commit c65b98fb168626fb7c90358869f2694bdf2809bb
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-12-16 07:56:26 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-12-16 08:23:54 +0000

    x11-drivers/xf86-video-intel: filter LTO due to 'flatten' attribute

    The flatten attribute recursively inlines and causes OOM with GCC.

    Closes: https://bugs.gentoo.org/864379
    Signed-off-by: Sam James <sam@gentoo.org>

 x11-drivers/xf86-video-intel/xf86-video-intel-2.99.917_p20230201.ebuild | 2 ++
 x11-drivers/xf86-video-intel/xf86-video-intel-9999.ebuild | 2 ++
 2 files changed, 4 insertions(+)
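The diffstat shows two added lines per ebuild but not the lines themselves. As a rough sketch only, a filter-lto workaround in an ebuild usually looks something like the fragment below. Placing the call in src_configure and following it with a bare econf are assumptions for illustration, not taken from the commit.

```shell
# Ebuild fragment; only meaningful inside Portage.
# Assumes the ebuild already has `inherit flag-o-matic` in global scope,
# which is what provides filter-lto.

src_configure() {
	# The 'flatten' attribute makes GCC inline recursively and the LTO
	# link runs out of memory (bug #864379), so drop the LTO flags.
	filter-lto

	econf
}
```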