Created attachment 907408 [details]
backtrace

Works when compiled with clang.

Thread 1 "firefox" received signal SIGSEGV, Segmentation fault.
0x00007ffff12e028f in MakeDay (year=<optimized out>, month=<optimized out>, date=<optimized out>) at /usr/src/debug/www-client/firefox-132.0/firefox-132.0/js/src/jsdate.cpp:429
429 double monthday = DayFromMonth(mn, leap);
Created attachment 907409 [details] emerge --info
This will be fun :)

* Does Valgrind work on your CPU? I think it should, but there are a few instructions on some old AMD CPUs which it can't handle. If you can, please try launching Firefox under Valgrind and show me the output.
* Can you tell me what -march=native expands to for you? resolve-march-native can give the value.
* Does it happen with a fresh profile?
* Is it literally as soon as FF starts up?
My hunch is it'll be specific to some tuning which only happens on your older AMD, so when I have your values, I'll try repro.
(In reply to Sam James from comment #2)
> This will be fun :)
>
> * Does Valgrind work on your CPU? I think it should but there's a few
> instructions on some old AMD CPUs which it can't handle. If you can, please
> try launching Firefox under Valgrind and show me the output.

I used it on the clang compiled Firefox, there is a lot of output but I am not sure if that is usable.

> * Can you tell me what -march=native expands to for you?
> resolve-march-native can give the value

resolve-march-native
-march=amdfam10 --param=l1-cache-line-size=64 --param=l1-cache-size=64 --param=l2-cache-size=1024

processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 6
model name : AMD Athlon(tm) II X2 255 Processor
stepping : 3
microcode : 0x10000c8
cpu MHz : 3100.000
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips : 6228.14
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

> * Does it happen with a fresh profile?

Need to test it when I compile it again with gcc.
Unfortunately I can't use firefox-bin as fallback for a good working browser on my system. Bug 940767

> * Is it literally as soon as FF starts up?

Yes. No window seen.
(In reply to jospezial from comment #4) > (In reply to Sam James from comment #2) > > This will be fun :) > > > > * Does Valgrind work on your CPU? I think it should but there's a few > > instructions on some old AMD CPUs which it can't handle. If you can, please > > try launching Firefox under Valgrind and show me the output. > I used it on the clang compiled Firefox, there is a lot of output but I am > not sure if that is usable . Please attach it compressed, although I will need it for the GCC build too unfortunately. > > > > * Does it happen with a fresh profile? > Need to test it when I compile it again with gcc. > Unfortunately I can't use firefox-bin as fallback for a good working browser > on my system. Bug 940767 Yeah, I wish I had some ideas for that too. Although, speaking of, I actually wonder if it works for you with a new Linux user, and/or also a clean FF profile. I find that bug really odd.
Shot in the dark: can you try disabling the JS JIT by setting: javascript.options.baselinejit = false javascript.options.ion = false just to see whether it's the JIT or JS itself.
Created attachment 907496 [details] valgrind on gcc firefox I hope this helps you. I also unmasked and enabled valgrind USEflag on firefox. (Bug 906509) I don't know if that is good. The output looks nearly same as with clang firefox. The crash also happens in new user profile.
(In reply to Holger Hoffstätte from comment #6) > Shot in the dark: can you try disabling the JS JIT by setting: > > javascript.options.baselinejit = false > javascript.options.ion = false > > just to see whether it's the JIT or JS itself. That does not change the crash. I did this setting from fedora where I share my home folder with gentoo.
Looking at https://crash-stats.mozilla.org/signature/?signature=MakeDay&date=%3E%3D2024-04-30T18%3A38%3A00.000Z&date=%3C2024-10-31T18%3A38%3A00.000Z&_sort=-date I mostly see Windows listed as the OS. Maybe because of http://support.microsoft.com/kb/982107 ? And the problem seems to be very old. https://bugzilla.mozilla.org/buglist.cgi?quicksearch=ALL+MakeDay https://bugzilla.mozilla.org/show_bug.cgi?id=635617 https://bugzilla.mozilla.org/show_bug.cgi?id=732897 These bugs are 13 years old, but the signature still gets crash report hits. Btw, what does the date 2022-02-08 in the backtrace stand for, and why is it processed?
Building using gcc trunk with -march=k8 -mtune=k8 fails on znver4 because of pi2fd not being available. Building using gcc trunk with -mtune=k8 works fine.
Could you try to get me the build.log from Firefox?
(In reply to jospezial from comment #7) > Created attachment 907496 [details] > valgrind on gcc firefox > > I hope this helps you. I also unmasked and enabled valgrind USEflag on > firefox. > (Bug 906509) > I don't know if that is good. The output looks nearly same as with clang > firefox. > > The crash also happens in new user profile. Thank you -- unfortunately, it does not appear helpful (not your fault) because Valgrind is dying on that issue I mentioned with some AMD CPUs (it cannot recognise a somewhat-rare instruction) :(
(In reply to Sam James from comment #10) > Building using gcc trunk with -march=k8 -mtune=k8 fails on znver4 because of > pi2fd not being available. > > Building using gcc trunk with -mtune=k8 works fine. k8 != k10, retrying...
Thunderbird behaves the same. I used thunderbird-128.4.0.ebuild and modified it for 132.0:

-FIREFOX_PATCHSET="firefox-128esr-patches-04.tar.xz"
+FIREFOX_PATCHSET="firefox-132-patches-01.tar.xz"
-MOZ_ESR=yes
+MOZ_ESR=
- --disable-gpsd \

The build log for jsdate.cpp has many warnings and notes. I'm trying to upload the build.log. The uncompressed size is 40MB, and with bzip2 --best it is 1.4MB. The maximum allowed is 1MB. Sam, I have sent it to you by e-mail.
Try xz -9
Created attachment 907681 [details] thunderbird-132.0_gcc_build.log.bz2 part2 xz -9 did only save a few kb. xz -9 -e compressed to 1.1MB . Now I use split on the bz2 file.
Created attachment 907682 [details] thunderbird-132.0_gcc_build.log.bz2 part1 part1
Thanks, thunderbird might be a nicer case. Bit smaller and less complex. Will check log when back at pc.
Minor status update: * Asked around for more hardware I can hopefully reproduce on * Trying a chroot w/ qemu-user using `/usr/bin/qemu-x86_64 -cpu Opteron_G2-v1 /bin/bash` with firefox+xvfb-run
jospezial, while I work on those leads, can you try something a bit tedious for me? :( I am really hoping we can get Valgrind output for that crash. The problem is, Valgrind can't decode certain old AMD instructions in libraries that Firefox uses. In your output, there was: ``` vex amd64->IR: unhandled instruction bytes: 0xF 0xF 0x43 0x28 0xD 0x48 0x89 0x4 0x24 0x48 vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 ==18537== valgrind: Unrecognised instruction at address 0x6998e1c. ==18537== at 0x6998E1C: ??? (in /usr/lib64/libharfbuzz.so.0.61001.0) ==18537== by 0x6BDFA89: ??? (in /usr/lib64/libpangoft2-1.0.so.0.5200.2) ==18537== by 0x6904652: pango_font_get_hb_font (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x6927A0A: ??? (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x6928476: ??? (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x6928BCE: pango_shape_item (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x6915062: ??? (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x691681F: ??? (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x6918A29: ??? (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x691AB12: pango_layout_get_unknown_glyphs_count (in /usr/lib64/libpango-1.0.so.0.5200.2) ==18537== by 0x5DDC771: ??? (in /usr/lib64/libgtk-3.so.0.2410.32) ==18537== by 0x5DDCAD7: ??? (in /usr/lib64/libgtk-3.so.0.2410.32) ``` Could you try build pango+harfbuzz without -march=... and then try Valgrind + Firefox again? If it fails again with "Unrecognised instruction" inside of non-Firefox, repeat the same steps (build $library without -march). If it is Firefox itself at the top of the stack, then we're stuck ofc.
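The "unhandled instruction bytes" line can be decoded by hand. Below is a minimal Python sketch, assuming the usual 3DNow! encoding (0F 0F, then ModRM with any SIB and displacement bytes, then a one-byte opcode suffix) and no prefixes. The function names are made up for illustration and the suffix table is deliberately partial.

```python
# Partial table of 3DNow! opcode suffixes (the byte that follows the operand bytes).
SUFFIXES = {0x0D: "pi2fd", 0x1D: "pf2id", 0x9A: "pfsub", 0x9E: "pfadd", 0xB4: "pfmul"}


def modrm_operand_length(code, pos):
    """Bytes taken by ModRM plus optional SIB and displacement (64-bit mode)."""
    modrm = code[pos]
    mod, rm = modrm >> 6, modrm & 7
    length = 1
    if mod != 3 and rm == 4:               # a SIB byte follows ModRM
        sib = code[pos + 1]
        length += 1
        if mod == 0 and (sib & 7) == 5:    # SIB without base register: disp32
            length += 4
    if mod == 0 and rm == 5:               # RIP-relative operand: disp32
        length += 4
    elif mod == 1:                         # disp8
        length += 1
    elif mod == 2:                         # disp32
        length += 4
    return length


def decode_3dnow(text):
    """Return the 3DNow! mnemonic for a Valgrind-style byte dump, or None."""
    code = [int(tok, 16) for tok in text.split()]
    if code[:2] != [0x0F, 0x0F]:
        return None                        # not the 3DNow! escape sequence
    suffix = code[2 + modrm_operand_length(code, 2)]
    return SUFFIXES.get(suffix, "unknown 3DNow! suffix 0x%02X" % suffix)


print(decode_3dnow("0xF 0xF 0x43 0x28 0xD 0x48 0x89 0x4 0x24 0x48"))
```

Under these assumptions the bytes reported inside libharfbuzz decode to pi2fd, the same instruction named in the -march=k8 build failure earlier in this bug.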
Created attachment 907825 [details] firefox_valgrind_gcc libs_march_x86-64 I have rebuilt media-libs/harfbuzz x11-libs/gtk+ x11-libs/pango with -march=x86-64 -Og -pipe -ggdb3 . Now firefox and thunderbird crash with the same MakeDay segfault right after the window opens. Valgrind now goes a lot further but has: vex amd64->IR: unhandled instruction bytes: 0xF 0xF 0x44 0x24 0x20 0xD 0x48 0xC7 0x84 0x24 vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 ==30431== valgrind: Unrecognised instruction at address 0x83a0f99. ==30431== at 0x83A0F99: mozilla::layers::WebRenderLayerManager::EndTransactionWithoutLayer(mozilla::nsDisplayList*, mozilla::nsDisplayListBuilder*, WrFiltersHolder&&, mozilla::layers::WebRenderBackgroundData*, double) (WebRenderLayerManager.cpp:446) ==30431== by 0xB5DD04F: mozilla::nsDisplayList::PaintRoot(mozilla::nsDisplayListBuilder*, gfxContext*, unsigned int, mozilla::Maybe<double>) (nsDisplayList.cpp:2294) ==30431== by 0xB309C29: nsLayoutUtils::PaintFrame(gfxContext*, nsIFrame*, nsRegion const&, unsigned int, mozilla::nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) (nsLayoutUtils.cpp:3195) ==30431== by 0xB271B69: mozilla::PresShell::PaintInternal(nsView*, mozilla::PaintInternalFlags) (PresShell.cpp:6513) ==30431== by 0xAF2363D: nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) (nsViewManager.cpp:406) ==30431== by 0xAF23A4B: nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) (nsViewManager.cpp:341) ==30431== by 0xAF24002: ProcessPendingUpdates (nsViewManager.cpp:896) ==30431== by 0xAF24002: nsViewManager::ProcessPendingUpdates() (nsViewManager.cpp:882) ==30431== by 0xB2470F8: nsRefreshDriver::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsRefreshDriver::IsExtraTick) (nsRefreshDriver.cpp:2885) ==30431== by 0xB247C21: TickDriver (nsRefreshDriver.cpp:368) ==30431== by 0xB247C21: 
mozilla::RefreshDriverTimer::TickRefreshDrivers(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) [clone .isra.0] (nsRefreshDriver.cpp:346) ==30431== by 0xB247D93: mozilla::RefreshDriverTimer::Tick(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) (nsRefreshDriver.cpp:362) ==30431== by 0xB247F73: RunRefreshDrivers (nsRefreshDriver.cpp:952) ==30431== by 0xB247F73: mozilla::VsyncRefreshDriverTimer::TickRefreshDriver(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>, mozilla::TimeStamp) (nsRefreshDriver.cpp:862) ==30431== by 0xB248940: NotifyVsyncTimerOnMainThread (nsRefreshDriver.cpp:593) ==30431== by 0xB248940: operator() (nsRefreshDriver.cpp:565) ==30431== by 0xB248940: mozilla::detail::RunnableFunction<mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::VsyncEvent const&)::{lambda()#1}>::Run() (nsThreadUtils.h:548)
(In reply to Sam James from comment #20)
> Could you try build pango+harfbuzz without -march=... and then try Valgrind
> + Firefox again?
>
> If it fails again with "Unrecognised instruction" inside of non-Firefox,
> repeat the same steps (build $library without -march). If it is Firefox
> itself at the top of the stack, then we're stuck ofc.

Can you explain to me the benefit of compiling the libs a second time with the same settings? Or am I misunderstanding you?
(In reply to jospezial from comment #22) The idea was that if a different library shows up again, build that one with generic -march too, until no libraries are causing an error. So it might be pango, then you rebuild pango, then valgrind shows an issue in freetype, then ...
(In reply to jospezial from comment #21) > Created attachment 907825 [details] > firefox_valgrind_gcc libs_march_x86-64 > > I have rebuilt media-libs/harfbuzz x11-libs/gtk+ x11-libs/pango with > -march=x86-64 -Og -pipe -ggdb3 . > > Now firefox and thunderbird crash with the same MakeDay segfault right after > the window opens. > Valgrind now goes a lot further but has: Thanks. Unfortunately, we cannot proceed further: > On x86 and amd64, there is no support for 3DNow! instructions. If the translator encounters these, Valgrind will generate a SIGILL when the instruction is executed. Apart from that, on x86 and amd64, essentially all instructions are supported, up to and including AVX and AES in 64-bit mode and SSSE3 in 32-bit mode. 32-bit mode does in fact support the bare minimum SSE4 instructions needed to run programs on MacOSX 10.6 on 32-bit targets. I have some more requests (thank you for your patience too): * Could you create a binpkg of broken Firefox, ideally with debug symbols? * Could you tell me if Firefox works if you drop -march=...? My current theory is that it's a GCC bug involving 3dnow instructions (which I will debug once we get there) but I need to be sure.
https://www.phoronix.com/news/Linux-Kernel-Drop-AMD-3DNow And since then I have in /etc/portage/make.conf the line CPU_FLAGS_X86="3dnow 3dnowext mmx mmxext popcnt sse sse2 sse3 sse4a" changed to CPU_FLAGS_X86="mmx mmxext popcnt sse sse2 sse3 sse4a" But does gcc use that variable? So GCC with native still uses 3dnow instructions on my system? llvm/clang has removed that since 19.1. But I get a working firefox and tb with sys-devel/clang-18.1.8-r6 https://github.com/search?q=repo%3Allvm%2Fllvm-project+3dnow&type=commits&s=committer-date&o=desc Where do you see 3dnow in the logs?
(In reply to jospezial from comment #25) > https://www.phoronix.com/news/Linux-Kernel-Drop-AMD-3DNow > This is a misunderstanding. Your CPU still supports 3dnow instructions and it still works. But the kernel dropped some accelerated paths using it. > And since then I have in /etc/portage/make.conf > the line > CPU_FLAGS_X86="3dnow 3dnowext mmx mmxext popcnt sse sse2 sse3 sse4a" > changed to > CPU_FLAGS_X86="mmx mmxext popcnt sse sse2 sse3 sse4a" > This only controls hand-written asm in programs. > But does gcc use that variable? No. > > So GCC with native still uses 3dnow instructions on my system? Yes! > llvm/clang has removed that since 19.1. But I get a working firefox and tb > with sys-devel/clang-18.1.8-r6 My theory is that it is a GCC bug when it is emitting 3dnow instructions, so even older Clang + 3dnow would work.
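To see what the CPU itself advertises, independent of CPU_FLAGS_X86, one can read the flags line of /proc/cpuinfo. A small Python sketch follows; the helper names are hypothetical, and the sample string is a shortened version of the flags line posted earlier in this bug.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_3dnow(cpuinfo_text):
    """Report whether the 3DNow!-related flags (and sse4a) are advertised."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("3dnow", "3dnowext", "sse4a")}


# Shortened sample; on Linux, pass open("/proc/cpuinfo").read() instead.
SAMPLE = "flags : fpu mmx sse sse2 mmxext 3dnowext 3dnow popcnt sse4a\n"
print(has_3dnow(SAMPLE))
```

This only shows what the hardware reports. Whether GCC emits such instructions is decided by -march, as explained above.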
(In reply to Sam James from comment #24) > > I have some more requests (thank you for your patience too): > * Could you create a binpkg of broken Firefox, ideally with debug symbols? Non public Downloadlink per e-mail.
(In reply to jospezial from comment #27) > Non public Downloadlink per e-mail. Thanks! I have this saved locally. Minor update: * I'm currently preparing a machine (an Opteron 252) with some help from a kind volunteer & contributor. Its native -march is k8-sse3, not amdfam10. * That machine seemed to work when I opened FF but it was built with GCC 13. I'm upgrading everything now to ~arch and so on. * floppym has a Phenom which *does* reproduce the crash (!) but it is his main workstation so I can't have/request access to it. We may have to ask him to probe it. * Chiitoo has mentioned he may have a Phenom around that may be an option too. The differences between -march=k8-sse3 and -march=amdfam10 aren't too big, the main issue is -msse4a is available on amdfam10. If needed, I'll try diff your provided binary with mine with k8-sse3 to see if jsdate even differed much there.
jospezial, if you are able, it might be useful to know:
* does GCC 13 work? unfortunately, you cannot test this easily because Firefox depends on some C++ libraries. You would have to test it with USE=-system-icu at least. It's OK if you don't try this, but it may be useful data.
* does dropping -march=native help? (does "-O2 -mtune=native" fail?)
* does "-O2 -march=amdfam10" fail?
(In reply to Sam James from comment #29) > jospezial, if you are able, it might be useful to know: > * does GCC 13 work? unfortunately, you cannot test this easily because > Firefox depends on some C++ libraries. You would have to test it with > USE=-system-icu at least. It's OK if you don't try this, but it may be > useful data. > > * does dropping -march=native help? (does "-O2 -mtune=native" fail?) > > * does "-O2 -march=amdfam10" fail? * Under gdb, when you are at: Thread 1 "firefox" received signal SIGSEGV, Segmentation fault. 0x00007ffff12e028f in MakeDay (year=<optimized out>, month=<optimized out>, date=<optimized out>) at /usr/src/debug/www-client/firefox-132.0/firefox-132.0/js/src/jsdate.cpp:429 429 double monthday = DayFromMonth(mn, leap); (gdb) bt f Can you please do: 'x/5i $pc'.
I am writing this now from firefox built with -march=x86-64 -Og -pipe -ggdb3 I remember I have seen in build.log that firefox changes that to something like -march=x86-64 -O2 -pipe -gdwarf-4 So far no crash in these first minutes.
(In reply to Sam James from comment #30)
>
> * Under gdb, when you are at:
>
> Thread 1 "firefox" received signal SIGSEGV, Segmentation fault.
> 0x00007ffff12e028f in MakeDay (year=<optimized out>, month=<optimized out>,
> date=<optimized out>) at
> /usr/src/debug/www-client/firefox-132.0/firefox-132.0/js/src/jsdate.cpp:429
> 429 double monthday = DayFromMonth(mn, leap);
> (gdb) bt f
>
> Can you please do: 'x/5i $pc'.

This is from thunderbird. I hope it has all the info, because I cleaned up /usr/src/debug/ when I needed space:

(gdb) x/5i $pc
=> 0x7ffff159ab8f: movd (%rdx,%rax,4),%xmm0
   0x7ffff159ab94: cvtdq2pd %xmm0,%xmm0
   0x7ffff159ab98: addsd %xmm1,%xmm0
   0x7ffff159ab9c: addsd %xmm2,%xmm0
   0x7ffff159aba0: subsd %xmm4,%xmm0
(In reply to Sam James from comment #29) > jospezial, if you are able, it might be useful to know: > * does dropping -march=native help? (does "-O2 -mtune=native" fail?) > > * does "-O2 -march=amdfam10" fail? -march=amdfam10 -Og -pipe -ggdb3 Without mtune works. no crash. thunderbird-133.0_beta2
(In reply to jospezial from comment #33) > (In reply to Sam James from comment #29) > > jospezial, if you are able, it might be useful to know: > > * does dropping -march=native help? (does "-O2 -mtune=native" fail?) > > > > * does "-O2 -march=amdfam10" fail? > > -march=amdfam10 -Og -pipe -ggdb3 > Without mtune works. no crash. thunderbird-133.0_beta2 Can you clarify? -march=amdfam10 should already imply -mtune=amdfam10. You can verify this with: 'for t in param target optimize optimizer; do cmd="gcc -Q --help=$t"; diff -U0 <(LANG=C $cmd -O2 -march=amdfam10) <(LANG=C $cmd -O2 -march=amdfam10 -mtune=amdfam10); done'.
But note that -march=amdfam10 is different from -march=native because -march=native may include more --param ... So does '-march=amdfam10 --param=l1-cache-line-size=64 --param=l1-cache-size=64 --param=l2-cache-size=1024' crash? And then '-march=amdfam10' works?
(with -O2 on top of course for both)
(In reply to Sam James from comment #36) > (with -O2 on top of course for both) The build process changes that to -O2 anyway.
(In reply to Sam James from comment #34) > (In reply to jospezial from comment #33) > > (In reply to Sam James from comment #29) > > > jospezial, if you are able, it might be useful to know: > > > * does dropping -march=native help? (does "-O2 -mtune=native" fail?) > > > > > > * does "-O2 -march=amdfam10" fail? > > > > -march=amdfam10 -Og -pipe -ggdb3 > > Without mtune works. no crash. thunderbird-133.0_beta2 > > > Can you clarify? > > -march=amdfam10 should already imply -mtune=amdfam10. > > You can verify this with: 'for t in param target optimize optimizer; do > cmd="gcc -Q --help=$t"; diff -U0 <(LANG=C $cmd -O2 -march=amdfam10) <(LANG=C > $cmd -O2 -march=amdfam10 -mtune=amdfam10); done'. no output
(In reply to jospezial from comment #38) > > > > -march=amdfam10 should already imply -mtune=amdfam10. > > > > You can verify this with: 'for t in param target optimize optimizer; do > > cmd="gcc -Q --help=$t"; diff -U0 <(LANG=C $cmd -O2 -march=amdfam10) <(LANG=C > > $cmd -O2 -march=amdfam10 -mtune=amdfam10); done'. > > no output Exactly - they're the same.
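The same comparison can be scripted for arbitrary pairs of flag sets. This is a rough Python sketch, not the one-liner from the thread. It uses GCC's documented class names params, target and optimizers for --help=, and assumes a gcc binary on PATH; the function names are illustrative.

```python
import difflib
import os
import subprocess


def gcc_help_dump(kind, flags, gcc="gcc"):
    """Run `gcc -Q --help=<kind> <flags>` with LANG=C and return its stdout."""
    env = dict(os.environ, LANG="C")
    result = subprocess.run([gcc, "-Q", "--help=" + kind, *flags],
                            capture_output=True, text=True, env=env, check=True)
    return result.stdout


def changed_lines(a_text, b_text):
    """Return only the removed/added lines between two option dumps."""
    lines = list(difflib.unified_diff(a_text.splitlines(), b_text.splitlines(),
                                      n=0, lineterm=""))
    # The first two lines are the '---' / '+++' file headers; skip them.
    return [line for line in lines[2:] if line.startswith(("+", "-"))]


def compare(flags_a, flags_b, kinds=("params", "target", "optimizers")):
    """Map each --help class to the option lines that differ between the flag sets."""
    return {kind: changed_lines(gcc_help_dump(kind, flags_a),
                                gcc_help_dump(kind, flags_b))
            for kind in kinds}
```

For example, `compare(["-O2", "-march=amdfam10"], ["-O2", "-march=native"])` would list whatever -march=native adds on a given machine, such as the cache --param values. Because `check=True` is set, a gcc invocation that fails raises an error instead of silently producing two empty dumps.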
(In reply to jospezial from comment #32) > (In reply to Sam James from comment #30) > > > > * Under gdb, when you are at: > > > > > > Thread 1 "firefox" received signal SIGSEGV, Segmentation fault. > > 0x00007ffff12e028f in MakeDay (year=<optimized out>, month=<optimized out>, > > date=<optimized out>) at > > /usr/src/debug/www-client/firefox-132.0/firefox-132.0/js/src/jsdate.cpp:429 > > 429 double monthday = DayFromMonth(mn, leap); > > (gdb) bt f > > > > Can you please do: 'x/5i $pc'. > > from thunderbird > I hope it has all infos > because I cleaned up /usr/src/debug/ when I needed space.: > (gdb) x/5i $pc > => 0x7ffff159ab8f: movd (%rdx,%rax,4),%xmm0 > 0x7ffff159ab94: cvtdq2pd %xmm0,%xmm0 > 0x7ffff159ab98: addsd %xmm1,%xmm0 > 0x7ffff159ab9c: addsd %xmm2,%xmm0 > 0x7ffff159aba0: subsd %xmm4,%xmm0 Does that tell us why it crashes? I have read this is SSE2 stuff. Not 3Dnow. Could we isolate a testcase? My PC is working for about 7 hours on each build.
Yes, I'm trying to, but it's not easy when I can't yet reproduce it. The Opteron is still updating. If you are able to try work on that, that would be great though. In the meantime, finding the exact options which do trigger it would help a lot.
(In reply to jospezial from comment #40) > I have read this is SSE2 stuff. Not 3Dnow. Yes, it's not necessarily 3dnow, but tuning related to the instructions you have (including 3dnow).
Understanding exactly which options do and don't trigger it would mean that I can at least have a minimised diff b/t binaries, and also speeds up being able to reproduce. That includes understanding which instruction sets trigger it and if -mtune is required or not.
I can reproduce it now on the Opteron.
Sam, any news from your builds and debugging on your Opteron? Did you try the equivalent for your machine of '-march=amdfam10 --param=l1-cache-line-size=64 --param=l1-cache-size=64 --param=l2-cache-size=1024', or whatever resolve-march-native reports?
I'll post updates as they occur. The only bits so far aren't really worth mentioning, but given I'm writing this anyway: -Og works, -O2 doesn't; __attribute__((optimize("O0"))) on MakeDay still fails (as does noipa on it, which I tried first). All of that was with -march=k8-sse3. I don't remember if I tried without it yet. To give an idea: miscompilations (suspected compiler bugs) usually take me at least a week of effort for non-trivial applications. Some are much quicker though. Firefox, on the other hand, is a massive application where I don't even have the luxury of debugging it on a fast machine. The last few days I was busy with other work. Iteration time is pretty painful, and I spent some time trying to set up an environment where it was shorter (build locally, copy over, run).
(In reply to Sam James from comment #35) > But note that -march=amdfam10 is different from -march=native because > -march=native may include more --param ... > > So does '-march=amdfam10 --param=l1-cache-line-size=64 > --param=l1-cache-size=64 --param=l2-cache-size=1024' crash? No crash, works. -march=amdfam10 --param=l1-cache-line-size=64 --param=l1-cache-size=64 --param=l2-cache-size=1024 -O2 -pipe firefox-133.0_beta7 USE="-valgrind"
That is a bit unexpected.
From looking at disassembly, I suspect the ParseISOStyleDate, the caller of MakeDay, is miscompiled by GCC such that MakeDay receives a NaN as the first argument. Since you already can reliably hit the segfault under gdb, you may be able to confirm that by placing a breakpoint on MakeDay, and then using 'p $xmm0' to print the first argument (assuming the caller is ParseISOStyleDate on the first hit).
(In reply to Alexander Monakov from comment #49) > From looking at disassembly, I suspect the ParseISOStyleDate, the caller of > MakeDay, is miscompiled by GCC such that MakeDay receives a NaN as the first > argument. > > Since you already can reliably hit the segfault under gdb, you may be able > to confirm that by placing a breakpoint on MakeDay, and then using 'p $xmm0' > to print the first argument (assuming the caller is ParseISOStyleDate on the > first hit). Thread 1 hit Breakpoint 1, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) p $xmm0 $1 = {v8_bfloat16 = {0, 0, -1.654e-24, 4.969, 0, 0, 0, 0}, v8_half = {0, 0, -0.0019531, 2.3105, 0, 0, 0, 0}, v4_float = {0, 4.98730469, 0, 0}, v2_double = {2022, 0}, v16_int8 = {0, 0, 0, 0, 0, -104, -97, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, -26624, 16543, 0, 0, 0, 0}, v4_int32 = {0, 1084200960, 0, 0}, v2_int64 = { 4656607665491804160, 0}, uint128 = 4656607665491804160} (gdb) bt #0 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so #1 0x00007ffff170cefd in bool ParseISOStyleDate<unsigned char>(js::DateTimeInfo::ForceUTC, unsigned char const*, unsigned long, JS::ClippedTime*) () from target:/usr/lib64/firefox/libxul.so [...]
Thanks. Here you show 2022 as the first argument (double year), which looks fine, but then presumably this invocation of MakeDay will not segfault at all.

In situations like these, when you want to breakpoint on the faulting call to a function, but it's not the first call, you can use the 'ignore' command in GDB, first to count the number of non-faulting calls, then to stop after the last non-faulting call:

(gdb) b MakeDay
breakpoint 1 at ...
(gdb) ignore 1 9999999
(gdb) r

after GDB reports the segfault:

(gdb) i b 1

GDB will inform you that breakpoint 1 was already hit N times. Next, 'ignore' it N-1 times, restart, and you will be stopped on the faulting call, where you can inspect its arguments, the callers' arguments, etc.
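The N-1 arithmetic is easy to get wrong by hand, so here is a tiny Python sketch that builds the follow-up command from the text of 'info breakpoints'. It only parses the "breakpoint already hit N times" phrase; the function name is hypothetical.

```python
import re


def ignore_command(info_breakpoints_text, bp_number):
    """Build the 'ignore' command that skips every call before the faulting one."""
    match = re.search(r"breakpoint already hit (\d+) times?", info_breakpoints_text)
    if match is None:
        raise ValueError("no hit count found in 'info breakpoints' output")
    hits = int(match.group(1))
    # The faulting call itself is counted in 'hits', so skip one fewer.
    return "ignore %d %d" % (bp_number, hits - 1)


print(ignore_command("breakpoint already hit 4 times", 1))
```

The count is attached to the breakpoint number passed to 'ignore', so it has to match the number GDB assigns to the breakpoint in the new run.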
gah, thanks. Thread 1 received signal SIGSEGV, Segmentation fault. 0x00007ffff170a551 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) i b 1 Num Type Disp Enb Address What 1 breakpoint keep y 0x00007ffff170a190 <MakeDay(double, double, double)> breakpoint already hit 4 times ignore next 9999995 hits (gdb) so Thread 1 hit Breakpoint 1, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) p $xmm0 $1 = {v8_bfloat16 = {0, 0, -1.084e-19, 4.969, 0, 0, 0, 0}, v8_half = {0, 0, -0.0078125, 2.3105, 0, 0, 0, 0}, v4_float = {0, 4.98828125, 0, 0}, v2_double = {2024, 0}, v16_int8 = {0, 0, 0, 0, 0, -96, -97, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, -24576, 16543, 0, 0, 0, 0}, v4_int32 = {0, 1084203008, 0, 0}, v2_int64 = { 4656616461584826368, 0}, uint128 = 4656616461584826368} (gdb) n Single stepping until exit from function _ZL7MakeDayddd, which has no line number information. Thread 1 received signal SIGSEGV, Segmentation fault. 0x00007ffff170a551 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) p $xmm0 $2 = {v8_bfloat16 = {0, 0, 96, 6.594, 0, 0, 0, 0}, v8_half = {0, 0, 3.375, 2.4121, 0, 0, 0, 0}, v4_float = {0, 6.60189819, 0, 0}, v2_double = {19723, 0}, v16_int8 = {0, 0, 0, 0, -64, 66, -45, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, 17088, 16595, 0, 0, 0, 0}, v4_int32 = {0, 1087587008, 0, 0}, v2_int64 = {4671150630914490368, 0}, uint128 = 4671150630914490368} (gdb) n Single stepping until exit from function _ZL7MakeDayddd, which has no line number information. [Thread 25819.25847 exited] 0x00007ffff1d3fbd0 in WasmTrapHandler(int, siginfo_t*, void*) () from target:/usr/lib64/firefox/libxul.so (gdb)
Okay, 2024 is still correct. Can you check $xmm1 and $xmm2 (month and day, respectively) on entry too? I was looking at immolo's binaries, there's a chance yours are miscompiled differently. If all of xmm0/xmm1/xmm2 on entry are fine, that would mean that my analysis of immolo's binary is inapplicable to you, and you'll have to step through your MakeDay to figure out what causes an out-of-bounds access (again, looking at full backtrace supplied by immolo I deduced that it was a rogue NaN).
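To illustrate how a NaN could turn into an out-of-bounds access at the faulting instruction quoted earlier, `movd (%rdx,%rax,4),%xmm0`, here is a Python model. It assumes the index in %rax comes from a truncating double-to-int conversion such as cvttsd2si and is sign-extended to 64 bits; that is an assumption about the generated code, not something shown in this bug. The table base address is invented.

```python
import math

# Value x86 cvttsd2si stores in a 32-bit destination for NaN or out-of-range input.
INT32_INDEFINITE = -2**31


def cvttsd2si32(x):
    """Model cvttsd2si with a 32-bit destination: truncate toward zero, or
    return the 'integer indefinite' value if the input is not representable."""
    if math.isnan(x) or math.isinf(x):
        return INT32_INDEFINITE
    truncated = math.trunc(x)
    return truncated if -2**31 <= truncated < 2**31 else INT32_INDEFINITE


def effective_address(base, index, scale):
    """AT&T operand (base,index,scale): base + index*scale, wrapped to 64 bits."""
    return (base + index * scale) % 2**64


TABLE = 0x7ffff0000000  # hypothetical address of a table of 4-byte entries
print(hex(effective_address(TABLE, cvttsd2si32(7.0), 4)))
print(hex(effective_address(TABLE, cvttsd2si32(float("nan")), 4)))
```

With a valid month index the load stays a few bytes past the table base. With a NaN, the modelled index is -2^31 and the load lands 8 GiB below the table, which would plausibly be unmapped.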
(gdb) p $xmm0 $11 = {v8_bfloat16 = {0, 0, -1.084e-19, 4.969, 0, 0, 0, 0}, v8_half = {0, 0, -0.0078125, 2.3105, 0, 0, 0, 0}, v4_float = {0, 4.98828125, 0, 0}, v2_double = {2024, 0}, v16_int8 = {0, 0, 0, 0, 0, -96, -97, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, -24576, 16543, 0, 0, 0, 0}, v4_int32 = {0, 1084203008, 0, 0}, v2_int64 = { 4656616461584826368, 0}, uint128 = 4656616461584826368} (gdb) p $xmm1 $8 = {v8_bfloat16 = {0, 0, 0, 2.5, 0, 0, 0, 0}, v8_half = {0, 0, 0, 2.0625, 0, 0, 0, 0}, v4_float = {0, 2.5, 0, 0}, v2_double = {8, 0}, v16_int8 = {0, 0, 0, 0, 0, 0, 32, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, 0, 16416, 0, 0, 0, 0}, v4_int32 = {0, 1075838976, 0, 0}, v2_int64 = {4620693217682128896, 0}, uint128 = 4620693217682128896} (gdb) p $xmm2 $9 = {v8_bfloat16 = {0, 0, 0, 2.625, 0, 0, 0, 0}, v8_half = {0, 0, 0, 2.0781, 0, 0, 0, 0}, v4_float = {0, 2.625, 0, 0}, v2_double = {12, 0}, v16_int8 = {0, 0, 0, 0, 0, 0, 40, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, 0, 16424, 0, 0, 0, 0}, v4_int32 = {0, 1076363264, 0, 0}, v2_int64 = {4622945017495814144, 0}, uint128 = 4622945017495814144}
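GDB prints every possible view of the register, which makes the dumps hard to read. As a cross-check, the v2_int64 lane can be reinterpreted as an IEEE 754 double with a few lines of Python. This is a sketch for reading the dumps, with an illustrative function name.

```python
import struct


def int64_to_double(bits):
    """Reinterpret a signed 64-bit integer (gdb's v2_int64 lane) as a double."""
    return struct.unpack("<d", struct.pack("<q", bits))[0]


# Low lanes of xmm0, xmm1 and xmm2 from the dump above.
for name, bits in (("xmm0", 4656616461584826368),
                   ("xmm1", 4620693217682128896),
                   ("xmm2", 4622945017495814144)):
    print(name, int64_to_double(bits))
```

The low lanes come out as 2024.0, 8.0 and 12.0, matching the v2_double view. Only the low lane carries a scalar double argument.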
so it looks like it's fine and I need to step through? I'll ask for advice on doing that if possible, but I will also build again manually given that'll be useful to have anyway (and am curious as to if it has the NaN instead).
If it works fine the first three times around, and segfaults on the fourth call with (2024.0, 8.0, 12.0) in arguments, that is surprising.

For stepping, the 'display' GDB command might be helpful to request printing values of floating-point registers after each 'si' command, for instance:

display $st0
display $xmm0.v2_double[0]
Thanks. Let me first double check (as I agree it's suspicious).
To verify: ``` # Attaching to the remote (gdbserver) 0x00007ffff7fe4840 in _start () from target:/lib64/ld-linux-x86-64.so.2 (gdb) b MakeDay Function "MakeDay" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (MakeDay) pending. (gdb) ignore 1 9999999 Will ignore next 9999999 crossings of breakpoint 1. (gdb) c Continuing. Thread 1 received signal SIGSEGV, Segmentation fault. 0x00007ffff170a551 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) p yearday i No symbol "yearday" in current context. (gdb) i b 1 Num Type Disp Enb Address What 1 breakpoint keep y 0x00007ffff170a190 <MakeDay(double, double, double)> breakpoint already hit 4 times ignore next 9999995 hits (gdb) ``` OK, so N=4, ignore it 3 times. Had to restart the session as FF has an annoying "clean startup" warning/error prompt first if the last start failed. Then trying again: ``` Reading symbols from target:/lib64/ld-linux-x86-64.so.2... Reading /usr/lib/debug/.build-id/ed/a8453b0094ddfaae7ee9a1f557682089f9abef.debug from remote target... 0x00007ffff7fe4840 in _start () from target:/lib64/ld-linux-x86-64.so.2 (gdb) b MakeDay Function "MakeDay" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 2 (MakeDay) pending. (gdb) ignore 1 3 Will ignore next 3 crossings of breakpoint 1. (gdb) c Continuing. [...] 
Thread 1 hit Breakpoint 2, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so (gdb) p $xmm0 $2 = {v8_bfloat16 = {0, 0, -1.654e-24, 4.969, 0, 0, 0, 0}, v8_half = {0, 0, -0.0019531, 2.3105, 0, 0, 0, 0}, v4_float = {0, 4.98730469, 0, 0}, v2_double = {2022, 0}, v16_int8 = {0, 0, 0, 0, 0, -104, -97, 64, 0, 0, 0, 0, 0, 0, 0, 0}, v8_int16 = {0, 0, -26624, 16543, 0, 0, 0, 0}, v4_int32 = {0, 1084200960, 0, 0}, v2_int64 = { 4656607665491804160, 0}, uint128 = 4656607665491804160} (gdb) p $xmm1 $3 = {v8_bfloat16 = {0, 0, 0, 1.875, 2.503e-06, 1.685e-33, 1.414e-34, -nan(0x7e)}, v8_half = {0, 0, 0, 1.9844, 0.38477, 0.00015402, 0.00011039, -nan(0x3fe)}, v4_float = { 0, 1.875, 1.68773512e-33, -nan(0x7e073c)}, v2_double = {1, -nan(0xe073c090c3628)}, v16_int8 = {0, 0, 0, 0, 0, 0, -16, 63, 40, 54, 12, 9, 60, 7, -2, -1}, v8_int16 = { 0, 0, 0, 16368, 13864, 2316, 1852, -2}, v4_int32 = {0, 1072693248, 151795240, -129220}, v2_int64 = {4607182418800017408, -554995522193880}, uint128 = 340272129060578498174165271179148918784} ``` so I made a mistake last time!
(although I wonder what happened in https://bugs.gentoo.org/942573#c52, and maybe the profile affected it, but w/e)
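For anyone cross-checking the dump above: under the SysV x86-64 calling convention the first two double arguments arrive in the low 64 bits of xmm0 and xmm1, so the first `v2_int64` element of each register can be reinterpreted as an IEEE 754 double. A small Python sketch of that check; the helper name `bits_to_double` is made up for illustration and is not part of gdb or Firefox.

```python
import struct

def bits_to_double(bits: int) -> float:
    """Reinterpret an unsigned 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

# Low 64-bit lanes (v2_int64[0]) as printed by gdb on entry to MakeDay.
year = bits_to_double(4656607665491804160)   # from $xmm0
month = bits_to_double(4607182418800017408)  # from $xmm1
print(year, month)
```

This agrees with the `v2_double` views gdb printed (2022 and 1). The NaN in the upper lane of xmm1 is leftover register content and is not part of a scalar double argument.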
Oh, please also do 'p $ftag' on entry to MakeDay. It was visible in the pastebinned log supplied by immolo on IRC, but I don't have that part of the conversation anymore. Perhaps some previous function executed an mmx/3dnow instruction without a subsequent (f)emms, leaving the x87 state invalid.
Immediately after the above prints in the same session:

(gdb) p $ftag
$6 = 65535
(gdb) info registers
rax            0x0                 0
rbx            0x0                 0
rcx            0xa                 10
rdx            0x8                 8
rsi            0x0                 0
rdi            0x0                 0
rbp            0x1                 0x1
rsp            0x7fffffff9db8      0x7fffffff9db8
r8             0xa                 10
r9             0x0                 0
r10            0x8                 8
r11            0x7e6               2022
r12            0x0                 0
r13            0x0                 0
r14            0x7fffffff9f90      140737488330640
r15            0x1                 1
rip            0x7ffff170a190      0x7ffff170a190 <MakeDay(double, double, double)>
eflags         0x242               [ ZF IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
fs_base        0x7ffff7e9c780      140737352681344
gs_base        0x0                 0
(gdb)
Created attachment 913323 [details] gdb-info-registers-all.txt
Hunting for the missing femms would have been a spicy challenge, but (un)fortunately $ftag being 0xffff is perfectly fine. But that would explain why the first three calls work fine, and the fourth call faults while arguments are still okay.

I hope Firefox is not multithreaded yet at that point, and it's really, deterministically, the fourth call that segfaults each time?
(In reply to Alexander Monakov from comment #63)
> Hunting for the missing femms would have been a spicy challenge, but
> (un)fortunately $ftag being 0xffff is perfectly fine. But that would explain
> why the first three calls work fine, and the fourth call faults while
> arguments are still okay.
>
> I hope Firefox is not multithreaded yet at that point, and its really,
> deterministically, the fourth call that segfaults each time?

I have bad news. Immediately after the above:

(gdb) p $ftag
$10 = 65535
(gdb) c
Continuing.
[Thread 17683.17713 exited]

Thread 1 hit Breakpoint 2, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so
(gdb) p $ftag
$11 = 65535
(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so
(gdb) c
Continuing.

Thread 1 hit Breakpoint 1, 0x00007ffff170a190 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so
(gdb) p $ftag
$12 = 20822
(gdb) c
Continuing.
[New Thread 17683.19969]

Thread 1 received signal SIGSEGV, Segmentation fault.
0x00007ffff170a551 in MakeDay(double, double, double) () from target:/usr/lib64/firefox/libxul.so
(gdb) p $ftag
$13 = 64854
(gdb) i b 1
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00007ffff170a190 <MakeDay(double, double, double)>
        breakpoint already hit 8 times

so it's not deterministically 4th at all...
(gdb) p $ftag
$12 = 20822

Bingo! This confirms that MakeDay is being called with invalid x87 state: 20822 is 0x5156, and none of its eight 2-bit tag fields reads 11 (empty), so all x87 stack slots are in use. Woohoo, progress!

The most likely cause would be some other function using an MMX or 3DNow! instruction, which marks the x87 stack registers as used, and not releasing them via the emms or femms instruction. (I don't have a ready recipe for finding that particular needle.)
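To make the $ftag values in this thread easier to read, here is a minimal Python sketch that splits the 16-bit x87 tag word into its eight 2-bit fields (00 valid, 01 zero, 10 special, 11 empty, one field per physical register). The function names are hypothetical helpers, not gdb features.

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}

def decode_ftag(ftag: int) -> list[str]:
    """Return the tags of physical registers R0..R7 from a 16-bit x87 tag word."""
    return [TAG_NAMES[(ftag >> (2 * i)) & 0b11] for i in range(8)]

def x87_stack_clean(ftag: int) -> bool:
    """True when every register is tagged empty (tag word 0xffff)."""
    return ftag == 0xFFFF

# Values printed by gdb earlier in this thread.
for value in (65535, 20822, 64854):
    tags = decode_ftag(value)
    print(f"{value:#06x}: empty={tags.count('empty')} clean={x87_stack_clean(value)} {tags}")
```

With these inputs, 65535 decodes to eight empty slots, 20822 to none, and 64854 (read at the SIGSEGV) to three, which is consistent with the reading that the x87 stack was already full when MakeDay was entered.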
Created attachment 913370 [details]
instrumentation for missed emms hunting

Okay, here's a recipe. Use the attachment to create a libgcc.a with instrumentation helpers and replace the system libgcc with it (gcc -print-file-name=libgcc.a gives you the location). Then rebuild Firefox with -pg -mfentry -minstrument-return=call in the flags (as 'make test' in the attachment does).

Then you should get SIGILL under GDB; check the backtrace and work from there. With luck, you will be stopped exactly when the naughty function returns (if you're in the __return__ handler) or calls another function (if you're in the __fentry__ handler) with invalid x87 state.
Thread 1 received signal SIGILL, Illegal instruction.
0x00005555555cae3d in __return__ ()
(gdb) bt
#0  0x00005555555cae3d in __return__ ()
#1  0x00007fffeba939f5 in ?? ()
#2  0x00007fffebf3a013 in ?? ()
#3  0x00007ffff4e3006c in ?? () from target:/usr/lib64/libfreetype.so.6
#4  0x00007fffe530d780 in ?? ()
#5  0x0000000000000000 in ?? ()
(gdb)

I'm not sure why the backtrace is useless, it was certainly built w/ -ggdb3 and `file` at least claims firefox isn't stripped.
. Reading /home/sjames/build/ff-instrumented/dist/bin/libxul.so from remote target... Error while mapping shared library sections: `target:/home/sjames/build/ff-instrumented/dist/bin/libxul.so': not in executable format: file format not recognized
Created attachment 913410 [details]
backtrace-on-sigill.txt

Moving on from gdbserver for now..

#0  0x00005555555cae3d in __return__ ()
#1  0x00007fffeba939f5 in hb_font_set_scale (font=<optimized out>, x_scale=<optimized out>, y_scale=<optimized out>) at /home/sjames/git/firefox-132.0/gfx/harfbuzz/src/hb-font.cc:2347
#2  0x00007fffebf3a013 in gfxHarfBuzzShaper::CreateHBFont (aFont=0x7fffeba939f5 <hb_font_set_scale(hb_font_t*, int, int)+341>, aFontFuncs=<optimized out>, aCallbackData=aCallbackData@entry=0x7fffe01a95d8) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxHarfBuzzShaper.cpp:1323
#3  0x00007fffebf3a50e in gfxHarfBuzzShaper::Initialize (this=this@entry=0x7fffe01a9580) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxHarfBuzzShaper.cpp:1302
#4  0x00007fffebf3a794 in gfxHarfBuzzShaper::Initialize (this=this@entry=0x7fffe01a9580) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxHarfBuzzShaper.cpp:1305
#5  0x00007fffebefd7c4 in gfxFont::GetHarfBuzzShaper (this=this@entry=0x7fffe15e0c30) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxFont.cpp:1078
#6  0x00007fffebf0678b in gfxFont::ShapeText (this=0x7fffe15e0c30, aDrawTarget=<optimized out>, aText=<optimized out>, aOffset=0, aLength=4, aScript=mozilla::intl::Script::LATIN, aLanguage=<optimized out>, aVertical=<optimized out>, aRounding=<optimized out>, aShapedText=<optimized out>) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxFont.cpp:3443
#7  0x00007fffebf005ea in gfxFont::ShapeText (this=this@entry=0x7fffe15e0c30, aDrawTarget=aDrawTarget@entry=0x7fffe64cf340, aText=aText@entry=0x7fffffff2bd0 "File\020", aOffset=aOffset@entry=0, aLength=aLength@entry=4, aScript=aScript@entry=mozilla::intl::Script::LATIN, aLanguage=aLanguage@entry=0x7fffe08277a0, aVertical=aVertical@entry=false, aRounding=aRounding@entry=gfxFontShaper::RoundingFlags::kRoundY, aShapedText=aShapedText@entry=0x7fffdf3fe180) at /home/sjames/git/firefox-132.0/gfx/thebes/gfxFont.cpp:3412
Created attachment 913411 [details] disas-of-hb_font_set_scale.txt
(gdb) p $ftag $2 = 342
Created attachment 913414 [details] Unified_cpp_gfx_harfbuzz_src0.ii.xz /usr/bin/ccache /usr/bin/g++ -o Unified_cpp_gfx_harfbuzz_src0.o -c -I/home/sjames/build/ff-instrumented/dist/stl_wrappers -I/home/sjames/build/ff-instrumented/dist/system_wrappers -include /home/sjames/git/firefox-132.0/config/gcc_hidden.h -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fstrict-flex-arrays=1 -DNDEBUG=1 -DTRIMMED=1 '-DPACKAGE_VERSION="moz"' '-DPACKAGE_BUGREPORT="http://bugzilla.mozilla.org/"' -DHAVE_OT=1 -DHAVE_ROUND=1 -DHB_NO_BUFFER_VERIFY -DHB_NO_FALLBACK_SHAPE -DHB_NO_UCD -DHB_NO_UNICODE_FUNCS -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DMOZ_SUPPORT_LEAKCHECKING -DSTATIC_EXPORTABLE_JS_API -I/home/sjames/git/firefox-132.0/gfx/harfbuzz/src -I/home/sjames/build/ff-instrumented/gfx/harfbuzz/src -I/home/sjames/build/ff-instrumented/dist/include -I/usr/include/nspr -I/usr/include/nss -I/usr/include/nspr -I/home/sjames/build/ff-instrumented/dist/include/nss -I/usr/include/pixman-1 -DMOZILLA_CLIENT -include /home/sjames/build/ff-instrumented/mozilla-config.h -fno-rtti -pthread -fno-sized-deallocation -fno-aligned-new -ffunction-sections -fdata-sections -fno-math-errno -fno-exceptions -pipe -fPIC -specs=/home/sjames/gcc.specs -O2 -ggdb3 -pipe -march=k8-sse3 -pg -mfentry -minstrument-return=call -gdwarf-4 -O2 -fomit-frame-pointer -funwind-tables -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wcomma-subscript -Wvolatile -Wno-deprecated-enum-enum-conversion -Wduplicated-cond -Wimplicit-fallthrough -Wlogical-op -Wno-error=maybe-uninitialized -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object -Wno-multistatement-macros -Wno-error=class-memaccess -Wformat -Wformat-security -Wformat-overflow=2 -Wno-psabi -Wno-error=builtin-macro-redefined -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/lib64/libffi/include 
-fno-strict-aliasing -ffp-contract=off -MD -MP -MF .deps/Unified_cpp_gfx_harfbuzz_src0.o.pp Unified_cpp_gfx_harfbuzz_src0.cpp -save-temps
(In reply to Sam James from comment #68)
> .
> Reading /home/sjames/build/ff-instrumented/dist/bin/libxul.so from remote
> target...
> Error while mapping shared library sections:
> `target:/home/sjames/build/ff-instrumented/dist/bin/libxul.so': not in
> executable format: file format not recognized

FTR, I think this is https://sourceware.org/bugzilla/show_bug.cgi?id=26196.
Thank you jospezial for the report, amonakov for the extensive help debugging & analysing the problem, immolo for doing initial debugging with amonakov, an unnamed contributor who kindly set up a machine for me to use and test on, and all other offers of help. amonakov reported it to GCC at https://gcc.gnu.org/PR117926 and Uros has fixed it already on trunk (not yet backported to 14). I'll test the fix over the weekend. I'll let you know when a version in-tree is expected to work.
sys-devel/gcc-14.3.9999 www-client/firefox-133.0 -march=native -O2 -pipe No crash, runs fine.
(In reply to jospezial from comment #75)
> sys-devel/gcc-14.3.9999 www-client/firefox-133.0
> -march=native -O2 -pipe
>
> No crash, runs fine.

Fantastic. Uros has already backported it to the 14 branch, so I imagine you had it in there, depending on when you started the build.

Alexander pointed out on IRC that you will need to rebuild a lot of packages, unfortunately. The issue is that it's not as simple as a particular package crashing. The bug involved x87 FPU state being left corrupted, which means it can "carry across" packages.

You have a few options:
1) We could try to find which binaries on your system at least use MMX and rebuild those;
2) We could analyse those results and see if they seem miscompiled, and only rebuild those;
3) Just rebuild everything.

Which would you like to do?
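As a rough illustration of option 1, the sketch below scans objdump disassembly for instructions that touch the MMX registers (which 3DNow! also uses) and for emms/femms. It is only a heuristic under my own assumptions: the function names are made up, it expects objdump's default AT&T syntax, and a hit says nothing about whether the code was miscompiled (hand-written assembly shows up too).

```python
import re
import subprocess

MMX_REG = re.compile(r"%mm[0-7]\b")          # does not match %xmm/%ymm registers
CLEANUP = re.compile(r"\b(?:emms|femms)\b")  # instructions that reset the x87 tags

def scan_disassembly(text: str) -> dict:
    """Count disassembly lines that use MMX registers and lines with emms/femms."""
    lines = text.splitlines()
    return {
        "mmx_lines": sum(1 for line in lines if MMX_REG.search(line)),
        "cleanup_lines": sum(1 for line in lines if CLEANUP.search(line)),
    }

def scan_file(path: str) -> dict:
    """Disassemble one ELF file with objdump and scan the result."""
    out = subprocess.run(
        ["objdump", "-d", "--no-show-raw-insn", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return scan_disassembly(out)
```

Something like `scan_file("/usr/lib64/firefox/libxul.so")` would then give a first idea of whether a binary is worth a closer look; files with a nonzero `mmx_lines` count are the candidates for option 2.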
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7e3e2d5257fe376adc87504bc09505eccff7aab0 commit 7e3e2d5257fe376adc87504bc09505eccff7aab0 Author: Sam James <sam@gentoo.org> AuthorDate: 2024-12-15 01:10:54 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2024-12-15 01:10:54 +0000 sys-devel/gcc: add 14.2.1_p20241214 Bug: https://bugs.gentoo.org/942573 Signed-off-by: Sam James <sam@gentoo.org> sys-devel/gcc/Manifest | 1 + sys-devel/gcc/gcc-14.2.1_p20241214.ebuild | 54 +++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+)
https://forums.gentoo.org/viewtopic-p-8849550.html may be another instance.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=589141eab7000d561f95958da64317c461f3595b commit 589141eab7000d561f95958da64317c461f3595b Author: Sam James <sam@gentoo.org> AuthorDate: 2024-12-23 02:30:05 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2024-12-23 02:33:48 +0000 sys-devel/gcc: keyword 14.2.1_p20241221 Has a bunch of misc. fixes but importantly fixes a miscompilation with 3DNow! instructions where x87 FPU state was left corrupted. This only affected >= GCC 14 and is now fixed. Unfortunately, the nature of the bug means that all packages may need to be recompiled (see https://bugs.gentoo.org/942573#c76 for more detail there). We can still consider a news item describing how to find potentially affected packages, but that's not a reason to put off keywording (and shortly, stabilisation). Thanks again to amonakov for the help in debugging, jospezial for the report, immolo for initially working with amonakov on it, and all others who helped & offered help. And Uros for fixing it upstream, of course. Will file a stable bug soon. I'd been planning on keywording this shortly anyway but was waiting for Christmas for things to settle down: now is a good time, and also was prompted by a potential other report of this on the forums at https://forums.gentoo.org/viewtopic-p-8849550.html. Bug: https://gcc.gnu.org/PR117926 Bug: https://bugs.gentoo.org/942573 Signed-off-by: Sam James <sam@gentoo.org> sys-devel/gcc/gcc-14.2.1_p20241221.ebuild | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)