Created attachment 917254 [details]
llvm-19.1.4 build failure

Build failure on llvm-core/llvm-19.1.4 (also tested 19.1.7, same error) on AMD64 Zen5.

# emerge -pqv =llvm-core/llvm-19.1.4
emerge -pqv '=llvm-core/llvm-19.1.4::gentoo'
[ebuild   R    ] llvm-core/llvm-19.1.4  USE="binutils-plugin libffi verify-sig* xml zstd -debug -debuginfod -doc -exegesis -libedit -test -z3" ABI_X86="(64) -32 (-x32)" LLVM_TARGETS="(AArch64) (AMDGPU) (ARM) (AVR) (BPF) (Hexagon) (Lanai) (LoongArch) (MSP430) (Mips) (NVPTX) (PowerPC) (RISCV) (Sparc) (SystemZ) (VE) (WebAssembly) (X86) (XCore) -ARC -CSKY -DirectX -M68k -SPIRV -Xtensa"

emerge --info attached. Build log attached.
Created attachment 917255 [details]
emerge --info
My gcc was built with LTO, if that's relevant. I tried the llvm build both with -flto and without, and got the same error in both cases.
# sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (__v8si)__Y);

There is no "W_v8si" here. Can you show the output of the command above?
(In reply to Zhixu Liu from comment #3)
> # sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
> return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (__v8si)__Y);
>
> no "W_v8si" here, can you show the output of the command above?

# sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (W_v8si)_[Y);
(In reply to Dan Arnold from comment #4)
> (In reply to Zhixu Liu from comment #3)
> > # sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
> > return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (__v8si)__Y);
> >
> > no "W_v8si" here, can you show the output of the command above?
>
> # sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
> return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (W_v8si)_[Y);

I think this might be filesystem corruption. There has never been a 'W_v8si' in that file, looking at git history, and that line has been unchanged since 2011.

commit 977e83a3edc1a58077e33143ad3cc1f9349d6197
Author: Kirill Yukhin <kirill.yukhin@intel.com>
Date:   Mon Aug 22 13:57:18 2011 +0000

    Add support for AVX2 builtin functions.

    2011-08-22  Kirill Yukhin  <kirill.yukhin@intel.com>
(In reply to Sam James from comment #5)
> (In reply to Dan Arnold from comment #4)
> > (In reply to Zhixu Liu from comment #3)
> > > # sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
> > > return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (__v8si)__Y);
> > >
> > > no "W_v8si" here, can you show the output of the command above?
> >
> > # sed -n '642p' /usr/lib/gcc/x86_64-pc-linux-gnu/14/include/avx2intrin.h
> > return (__m256i) __builtin_ia32_psignd256 ((__v8si)__X, (W_v8si)_[Y);
>
> I think this might be filesystem corruption. There has never been a 'W_v8si'
> in that file, looking at git history, and that line has been unchanged since
> 2011.
>
> commit 977e83a3edc1a58077e33143ad3cc1f9349d6197
> Author: Kirill Yukhin <kirill.yukhin@intel.com>
> Date:   Mon Aug 22 13:57:18 2011 +0000
>
>     Add support for AVX2 builtin functions.
>
>     2011-08-22  Kirill Yukhin  <kirill.yukhin@intel.com>

Weird. This is a brand new install. I did experience a segfault during a build, so maybe that somehow corrupted it. I'm going to re-emerge gcc and see if that fixes it, thank you.
Could this be caused by a cosmic ray? Only one bit changed:

_ = 0x5f = 0101 1111
W = 0x57 = 0101 0111
(In reply to Zhixu Liu from comment #7)
> could be caused by cosmic ray? only one bit changed
>
> _ = 0x5f = 0101 1111
> W = 0x57 = 0101 0111

If you haven't re-emerged yet, can you try 'equery check gcc'? If you have already re-emerged, is there any backup left so we can verify the checksum?
(In reply to Zhixu Liu from comment #7)
> could be caused by cosmic ray? only one bit changed
>
> _ = 0x5f = 0101 1111
> W = 0x57 = 0101 0111

The other difference is also a one-bit change:

[ = 0x5b = 0101 1011
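The one-bit observations above are easy to verify: XOR the ASCII codes of each original/corrupted character pair and count the set bits in the result. A minimal sketch (my own illustration, not part of the original report):

```python
def bit_flips(a: str, b: str) -> int:
    """Number of bits that differ between the ASCII codes of two characters."""
    return bin(ord(a) ^ ord(b)).count("1")

# The two corruptions observed in avx2intrin.h, each a single flipped bit:
print(bit_flips("_", "W"))  # 0x5f ^ 0x57 = 0x08 -> 1
print(bit_flips("_", "["))  # 0x5f ^ 0x5b = 0x04 -> 1
```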
I already re-emerged gcc, sorry! I'm guessing the checksum would have failed had I run it before re-emerging; it passes now.

But I do have a log of what happened. I was emerging plasma-meta, then opened another pane in tmux and tried to emerge lm-sensors at the same time, while watching btop in a third tmux pane. The CPU was totally pegged, 100% on all 24 logical cores. It seemed like lm-sensors should have been moving faster (I emerged it later when idle and it took less than a minute), so after waiting a while I ctrl-C'd the lm-sensors emerge, and at that exact moment the plasma-meta merge blew up with a segfault. Attaching my journalctl output from the segfault.

Maybe something had that file open and it got corrupted? I am using zfs-2.3.0 from guru if that's relevant; root is on ZFS.
Created attachment 917304 [details]
emerge segfault log

This segfault happened when I was emerging plasma-meta and then emerged lm-sensors simultaneously in another terminal. It seemed frozen (lm-sensors is tiny and emerges very quickly on this brand-new AMD Ryzen 9 9900X), so I ctrl-C'd the lm-sensors emerge, and that's when the segfault happened.
Created attachment 917305 [details]
qlop -m output

qlop -m output around the time of the segfault.
Created attachment 917306 [details]
qlop -qEm output

qlop -qEm output around the time of the emerge segfault.
This PC is brand new with brand new hardware, built yesterday, so if there's a possibility there's a hardware issue causing this I'd love to know. Thanks for bearing with me on this :)
(In reply to Dan Arnold from comment #10)
> I am using zfs-2.3.0 from guru if that's relevant; root is on ZFS.

I *have* hit various issues with ZFS before (such that I even have a wiki page listing them), but I'm not convinced that's what's going on here.

Note: hopefully not from guru, as it's in ::gentoo, and guru isn't allowed to have ebuilds for ::gentoo packages.

(In reply to Dan Arnold from comment #14)
> This PC is brand new with brand new hardware, built yesterday, so if there's
> a possibility there's a hardware issue causing this I'd love to know. Thanks
> for bearing with me on this :)

The first candidate that comes to mind here is XMP defaulting to on in the BIOS/UEFI firmware settings -- could you check that, and for any other silly overclocking options? A lot of OEMs turn them on by default these days.

I'd also do a memtest at least overnight (ideally 12+ hours minimum).
(In reply to Sam James from comment #15)
> (In reply to Dan Arnold from comment #10)
> > I am using zfs-2.3.0 from guru if that's relevant; root is on ZFS.
>
> I *have* hit various issues with ZFS before (such that I even have a wiki
> page listing them) but I'm not convinced that's what's going on here.
>
> note: Hopefully not from guru, as it's in ::gentoo, and guru isn't allowed
> to have ebuilds for ::gentoo packages.
>
> (In reply to Dan Arnold from comment #14)
> > This PC is brand new with brand new hardware, built yesterday, so if there's
> > a possibility there's a hardware issue causing this I'd love to know. Thanks
> > for bearing with me on this :)
>
> First candidate coming to mind here is XMP defaulting on in
> bios/uefi/firmware settings -- check that and for any other silly overclock
> stuff? A lot of OEMs turn them on by default these days.
>
> I'd also do a memtest at least overnight (ideally 12+ hours minimum).

You're right, my zfs package is from ::gentoo (~amd64), not guru.

I'm using the built-in EXPO profile for the RAM, running at 5600 MT/s 1:1 with CAS latency 36 (the RAM is DDR5-5600). The RAM is on the motherboard manufacturer's compatibility list. I'm not otherwise overclocking; that's the only thing I changed in the UEFI settings other than fan curves.

I will try a memtest overnight. Thank you!
Haha, you were spot on Sam, it's bad RAM.

Time: 0:20:09  Status: Failed!  Pass: 0  Errors: 64360

pCPU  Pass  Test  Failing Address        Expected          Found
----  ----  ----  ---------------------  ----------------  ----------------
   6     0     8  000fc8adaf70 (63.1GB)  948338d38a501631  94c388d38a501631
...etc

We can close this bug, thanks for everyone's assistance!
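As a side note, the memtest word can be checked the same way as the header corruption earlier in this bug. A quick sketch (my own check, not output from memtest) XORs the expected and found values:

```python
# First failing word from the memtest report above
expected = 0x948338D38A501631
found    = 0x94C388D38A501631

diff = expected ^ found
print(hex(diff))             # 0x40b00000000000
print(bin(diff).count("1"))  # 4
```

Four differing bits across two bytes, so unlike the single-bit header corruption, this looks like a genuinely failing DIMM (or unstable EXPO timings) rather than a one-off single-event upset.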
Hardware error, resolving
You're most welcome -- sorry for the bad news, but really glad we got to the bottom of it quickly!