Created attachment 919803 [details]
emerge --info firefox

Reporting the known issue that sam has been afflicted with as well, now with attached backtraces.

#0 0x00007fb7a7cabefc in pthread_kill () at /lib64/libc.so.6
#1 0x00007fb7a7c47c96 in raise () at /lib64/libc.so.6
#2 0x00007fb7a0befd8a in nsProfileLock::FatalSignalHandler (signo=11, info=0x7fffc1c19db0, context=0x7fffc1c19c80) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/toolkit/profile/nsProfileLock.cpp:177
#3 0x00007fb7a7c47dc0 in <signal handler called> () at /lib64/libc.so.6
#4 mozilla::RefPtrTraits<mozilla::dom::BrowsingContext>::Release (aPtr=<optimized out>, aPtr=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/dist/include/mozilla/RefPtr.h:49
#5 RefPtr<mozilla::dom::BrowsingContext>::ConstRemovingRefPtrTraits<mozilla::dom::BrowsingContext>::Release (aPtr=<optimized out>, aPtr=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/dist/include/mozilla/RefPtr.h:409
#6 RefPtr<mozilla::dom::BrowsingContext>::~RefPtr (this=<optimized out>, this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/dist/include/mozilla/RefPtr.h:80
#7 mozilla::dom::MaybeDiscarded<mozilla::dom::BrowsingContext>::~MaybeDiscarded (this=<optimized out>, this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/dist/include/mozilla/dom/MaybeDiscarded.h:23
#8 IPC::ReadResult<mozilla::dom::MaybeDiscarded<mozilla::dom::BrowsingContext>, true>::~ReadResult (this=<optimized out>, this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/chromium/src/chrome/common/ipc_message_utils.h:248
#9 mozilla::dom::PSessionStoreParent::OnMessageReceived (this=<optimized out>, msg__=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/ipc/ipdl/PSessionStoreParent.cpp:352
#10 0x00007fb79fb566cd in mozilla::dom::PContentParent::OnMessageReceived (this=<optimized out>, msg__=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/ipc/ipdl/PContentParent.cpp:6735
#11 0x00007fb79cb48da1 in mozilla::ipc::MessageChannel::DispatchAsyncMessage (this=0x7fb754322c80, aProxy=0x7fb75b3b0da0, aMsg=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessageChannel.cpp:1790
#12 mozilla::ipc::MessageChannel::DispatchMessage (this=this@entry=0x7fb754322c80, aProxy=aProxy@entry=0x7fb75b3b0da0, aMsg=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessageChannel.cpp:1717
#13 0x00007fb79cb4a5df in mozilla::ipc::MessageChannel::RunMessage (this=<optimized out>, aProxy=0x7fb75b3b0da0, aTask=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessageChannel.cpp:1508
#14 mozilla::ipc::MessageChannel::RunMessage (this=<optimized out>, aProxy=0x7fb75b3b0da0, aTask=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessageChannel.cpp:1469
#15 mozilla::ipc::MessageChannel::MessageTask::Run (this=0x7fb752962350) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessageChannel.cpp:1608
#16 0x00007fb79c459252 in mozilla::RunnableTask::Run (this=0x7fb753bfd180) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:688
#17 0x00007fb79c4ce976 in mozilla::TaskController::RunTask (aTask=0x7fb753bfd180) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:247
#18 mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal (this=this@entry=0x7fb798773260, aProofOfLock=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:1015
#19 0x00007fb79c4cf574 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal (this=0x7fb798773260, aProofOfLock=...) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:838
#20 mozilla::TaskController::ProcessPendingMTTask (aMayWait=false, this=0x7fb798773260) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:624
#21 operator() (__closure=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/TaskController.cpp:336
#22 mozilla::detail::RunnableFunction<mozilla::TaskController::TaskController()::<lambda()> >::Run(void) (this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/nsThreadUtils.h:548
#23 0x00007fb79c50346c in nsThread::ProcessNextEvent (this=0x7fb7a7a90280, aMayWait=<optimized out>, aResult=0x7fffc1c1bc70) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/nsThread.cpp:1159
#24 0x00007fb79cafeee2 in nsThread::ProcessNextEvent (aMayWait=false, this=0x7fb7a7a90280, aResult=0x7fffc1c1bc70) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/nsThread.cpp:1055
#25 NS_ProcessNextEvent (aMayWait=false, aThread=0x7fb7a7a90280) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/xpcom/threads/nsThreadUtils.cpp:480
#26 mozilla::ipc::MessagePump::Run (this=0x7fb79a7d9300, aDelegate=0x7fb7a7a69040) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/glue/MessagePump.cpp:85
#27 0x00007fb7a006f680 in MessageLoop::RunInternal (this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/chromium/src/base/message_loop.cc:369
#28 MessageLoop::RunHandler (this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/chromium/src/base/message_loop.cc:362
#29 MessageLoop::Run (this=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/ipc/chromium/src/base/message_loop.cc:344
#30 nsBaseAppShell::Run (this=0x7fb79a75c500) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/widget/nsBaseAppShell.cpp:148
#31 nsAppShell::Run (this=0x7fb79a75c500) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/widget/gtk/nsAppShell.cpp:470
#32 0x00007fb7a0c68361 in nsAppStartup::Run (this=0x7fb795efe600) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/dist/include/nsCOMPtr.h:751
#33 XREMain::XRE_mainRun (this=this@entry=0x7fffc1c1c400) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/toolkit/xre/nsAppRunner.cpp:5835
#34 0x00007fb7a0c6b362 in XREMain::XRE_main (this=0x7fffc1c1c400, argc=<optimized out>, argv=<optimized out>, aConfig=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/toolkit/xre/nsAppRunner.cpp:6075
#35 XRE_main (argc=<optimized out>, argv=<optimized out>, aConfig=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/toolkit/xre/nsAppRunner.cpp:6148
#36 0x000055d378478a25 in do_main (argc=<optimized out>, argv=<optimized out>, envp=envp@entry=0x7fffc1c1d7a8) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/browser/app/nsBrowserApp.cpp:232
#37 0x000055d37846cc7a in main (argc=<optimized out>, argv=<optimized out>, envp=0x7fffc1c1d7a8) at /usr/src/debug/www-client/firefox-135.0.1/firefox-135.0.1/browser/app/nsBrowserApp.cpp:464
about:buildconfig

Build Configuration
Please be aware that this page doesn't reflect all the options used to build Firefox.

Build platform
target: x86_64-pc-linux-gnu

Build tools
Compiler: /usr/bin/x86_64-pc-linux-gnu-gcc -std=gnu17
Version: 15.0.1
Compiler flags: -pthread -ffunction-sections -fdata-sections -fno-math-errno -pipe -fPIC -march=znver2 -pipe -fuse-linker-plugin -frecord-gcc-switches -Werror=strict-aliasing

Compiler: /usr/bin/x86_64-pc-linux-gnu-g++
Version: 15.0.1
Compiler flags: -fno-rtti -pthread -fno-sized-deallocation -fno-aligned-new -ffunction-sections -fdata-sections -fno-math-errno -fno-exceptions -pipe -fPIC -march=znver2 -pipe -fuse-linker-plugin -frecord-gcc-switches -Werror=strict-aliasing -gdwarf-4 -O3 -fomit-frame-pointer -funwind-tables

Compiler: /usr/lib/rust/1.85.0/bin/rustc
Version: 1.85.0

Configure options
--enable-application=browser --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-update-channel=release MOZBUILD_STATE_PATH=/var/tmp/notmpfs/portage/www-client/firefox-135.0.1/work/firefox_build --prefix=/usr --libdir=/usr/lib64 CPPFLAGS= 'CFLAGS=-march=znver2 -pipe -fuse-linker-plugin -frecord-gcc-switches -Werror=strict-aliasing' 'CXXFLAGS=-march=znver2 -pipe -fuse-linker-plugin -frecord-gcc-switches -Werror=strict-aliasing' 'LDFLAGS=-Wl,-O1 -Wl,--as-needed -Wl,--defsym=__gentoo_check_ldflags__=0 -frecord-gcc-switches -Werror=strict-aliasing -Wl,--undefined-version -Wl,-rpath=/usr/lib64/firefox,--enable-new-dtags' --enable-optimize=-O3 --with-toolchain-prefix=x86_64-pc-linux-gnu- CC=x86_64-pc-linux-gnu-gcc LD=x86_64-pc-linux-gnu-ld CXX=x86_64-pc-linux-gnu-g++ HOST_CC=x86_64-pc-linux-gnu-gcc HOST_CXX=x86_64-pc-linux-gnu-g++ --enable-linker=bfd 'AS=x86_64-pc-linux-gnu-gcc -c' AR=x86_64-pc-linux-gnu-ar NM=x86_64-pc-linux-gnu-nm PKG_CONFIG=x86_64-pc-linux-gnu-pkg-config --enable-lto=full READELF=llvm-readelf RUSTC=/usr/lib/rust/1.85.0/bin/rustc CARGO=/usr/lib/rust/1.85.0/bin/cargo --disable-cargo-incremental --with-libclang-path=/usr/lib/llvm/19/lib64 --with-system-ffi --enable-rust-simd --with-system-icu --enable-default-toolkit=cairo-gtk3-x11-wayland --disable-wmf --with-system-av1 --disable-real-time-tracing --with-mozilla-api-keyfile=/var/tmp/notmpfs/portage/www-client/firefox-135.0.1/work/firefox-135.0.1/api-mozilla.key --with-google-location-service-api-keyfile=/var/tmp/notmpfs/portage/www-client/firefox-135.0.1/work/firefox-135.0.1/api-location.key --with-google-safebrowsing-api-keyfile=/var/tmp/notmpfs/portage/www-client/firefox-135.0.1/work/firefox-135.0.1/api-google.key --with-system-webp --with-system-graphite2 --with-system-harfbuzz --disable-geckodriver --enable-elf-hack=relr --with-unsigned-addon-scopes=app,system --allow-addon-sideload --with-system-libvpx --with-system-jpeg --without-wasm-sandboxed-libraries --with-system-nss --disable-updater --with-system-libevent --disable-crashreporter --disable-necko-wifi --disable-parental-controls --enable-system-pixman --disable-legacy-profile-creation XARGS=/usr/bin/xargs --disable-install-strip --with-system-zlib --enable-official-branding --x-includes=/usr/include --x-libraries=/usr/lib64
Created attachment 919805 [details] gdb bt full
Example of a previous such bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1664151.
Created attachment 919873 [details] disassembly
[Saturday 25 January 2025] [09:11:40 Greenwich Mean Time] <sam_> not sure I get it yet
[Saturday 25 January 2025] [09:11:42 Greenwich Mean Time] <sam_> x
[Saturday 25 January 2025] [09:11:42 Greenwich Mean Time] <sam_> 0x00007ffff068382f <+1471>: test %rcx,%rcx
[Saturday 25 January 2025] [09:11:42 Greenwich Mean Time] <sam_> 0x00007ffff0683832 <+1474>: je 0x7ffff0683845 <_ZN7mozilla3dom19PSessionStoreParent17OnMessageReceivedERKN3IPC7MessageE+1493>
[Saturday 25 January 2025] [09:11:42 Greenwich Mean Time] <sam_> => 0x00007ffff0683834 <+1476>: mov (%rcx),%rax
[Saturday 25 January 2025] [09:12:12 Greenwich Mean Time] <sam_> so %rcx is 0 and it dereferences i
[Saturday 25 January 2025] [09:12:45 Greenwich Mean Time] <sam_> and it was supposed to be whatever is 0x58 on the stack?
[Saturday 25 January 2025] [09:12:49 Greenwich Mean Time] <sam_> *is at
[Saturday 25 January 2025] [09:44:29 Greenwich Mean Time] <sam_> i'm not sure if it's hard to follow because of the std::move
Created attachment 919874 [details] register dump
For anyone else coming across this: what would be *really* useful is a way to reproduce it on demand. It crashes with high probability within, say, 5 minutes of FF startup, but not always, and I can't trigger it on demand.
https://searchfox.org/mozilla-release/source/__GENERATED__/ipc/ipdl/PSessionStoreParent.cpp#352

#9 mozilla::dom::PSessionStoreParent::OnMessageReceived (this=<optimized out>, msg__=<optimized out>) at /usr/src/debug/www-client/firefox-135.0.1/firefox_build/ipc/ipdl/PSessionStoreParent.cpp:352
        maybe__aBrowsingContext = {mIsOk = true, mData = {mId = 57982058497, mPtr = {mRawPtr = 0x0}}}
        aBrowsingContext = <optimized out>
[...]
(I think we're in the PSessionStore::Msg_IncrementalSessionStoreUpdate__ID case, and then we die on leaving the scope when it destructs the obj)
then down the line

  ~RefPtr() {
    if (mRawPtr) {
      ConstRemovingRefPtrTraits<T>::Release(mRawPtr);
    }
  }

and I guess the check for it being null gets optimised out
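To make the destructor path above concrete, here is a minimal sketch of the same shape. Everything in it is a hypothetical stand-in (ToyContext, ToyRefPtr, ToyMaybeDiscarded, refcount_after_scope); none of it is Mozilla's code. It only mirrors the gdb dump, where the holder has an id but a null mRawPtr, and shows that leaving the scope is harmless as long as the null check in the destructor survives.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a refcounted object such as BrowsingContext.
struct ToyContext {
    int refcnt = 1;
    void AddRef() { ++refcnt; }
    void Release() { --refcnt; }  // a real implementation would delete at zero
};

// Hypothetical, heavily simplified smart pointer modelled on the quoted ~RefPtr().
template <typename T>
class ToyRefPtr {
public:
    explicit ToyRefPtr(T* raw) : mRawPtr(raw) {
        if (mRawPtr) mRawPtr->AddRef();
    }
    ToyRefPtr(const ToyRefPtr&) = delete;
    ToyRefPtr& operator=(const ToyRefPtr&) = delete;
    ~ToyRefPtr() {
        // The null check that the crash suggests was optimised away.
        if (mRawPtr) mRawPtr->Release();
    }

private:
    T* mRawPtr;
};

// Mirrors the dump: an id plus a pointer that may legitimately be null.
struct ToyMaybeDiscarded {
    ToyMaybeDiscarded(std::uint64_t id, ToyContext* raw) : mId(id), mPtr(raw) {}
    std::uint64_t mId;
    ToyRefPtr<ToyContext> mPtr;
};

// Builds a holder, lets it go out of scope, and reports the refcount
// afterwards (or -1 when there was no object to begin with).
int refcount_after_scope(ToyContext* ctx) {
    {
        ToyMaybeDiscarded holder(57982058497ULL, ctx);
        assert(!ctx || ctx->refcnt == 2);  // the holder took one reference
    }  // ~ToyMaybeDiscarded -> ~ToyRefPtr runs here
    return ctx ? ctx->refcnt : -1;
}
```

With a live object the count returns to 1 after the scope ends; with nullptr nothing is touched. If the `if (mRawPtr)` test were removed, the nullptr case would call Release() through a null pointer, which matches the faulting frames #4 to #8 in the backtrace.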
kostadin gave me a useful reproducer: opening youtube.com shorts and scrolling down with just the down arrow key seems to hit it in under 10s for me
*** Bug 950352 has been marked as a duplicate of this bug. ***
So far, bisected objects to:
* build/toolkit/components/sessionstore/Unified_cpp_sessionstore0.o
* build/docshell/base/Unified_cpp_docshell_base0.o
w/ LTO.

Combining the two of them builds fine but either building it with LTO (to allow merging with other LTO objects, not that there's many left, just thinking of the .a which I didn't need to split out) or without doesn't lead to the crash.

Uploaded a tarball [0] as a checkpoint of progress with preprocessed sources (Unified_cpp_sessionstore0.ii, Unified_cpp_docshell_base0.ii), compile/link commands for those, and the full link command + list of objects from bisection.

[0] https://dev.gentoo.org/~sam/bugs/gentoo/950229/950229-checkpoint.tar.xz
Next steps are (given the merging attempt failed):
* split the unified versions out to make it easier to see what's going on (and discard a lot of code) -- hopefully it still reproduces there, and/or
* do a full build w/ unified disabled and hope it still repros there and if needed redo full bisection of objects (bleh)
# Split out from Unified_cpp_docshell_base0.cpp
merge_1=(
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/BaseHistory.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/BrowsingContext.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/BrowsingContextGroup.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/BrowsingContextWebProgress.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/CanonicalBrowsingContext.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/ChildProcessChannelListener.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/LoadContext.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/SerializedLoadContext.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/WindowContext.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsAboutRedirector.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDSURIContentListener.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDocShell.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDocShellEditorData.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDocShellEnumerator.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDocShellLoadState.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/docshell/base/nsDocShellTelemetryUtils.cpp
)

# Split out from Unified_cpp_sessionstore0.cpp
merge_2=(
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/BrowserSessionStore.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/RestoreTabContentObserver.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreChangeListener.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreChild.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreFormData.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreListener.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreParent.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreRestoreData.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreScrollData.cpp
	/opt/sam-debugging/ff/firefox-135.0.1/toolkit/components/sessionstore/SessionStoreUtils.cpp
	/opt/sam-debugging/ff/build/ipc/ipdl/PSessionStore.cpp
	/opt/sam-debugging/ff/build/ipc/ipdl/PSessionStoreChild.cpp
	/opt/sam-debugging/ff/build/ipc/ipdl/PSessionStoreParent.cpp
	/opt/sam-debugging/ff/build/ipc/ipdl/SessionStoreTypes.cpp
)
(In reply to Sam James from comment #14)
> Next steps are (given the merging attempt failed):
> * split the unified versions out to make it easier to see what's going on
> (and discard a lot of code) -- hopefully it still reproduces there

This builds fine and runs fine w/o any crashes :(

> * do a full build w/ unified disabled and hope it still repros there and if
> needed redo full bisection of objects (bleh)

I'll try this next but will probably call it a night here.
(In reply to Sam James from comment #13)
> So far, bisected objects to:
> * build/toolkit/components/sessionstore/Unified_cpp_sessionstore0.o
> * build/docshell/base/Unified_cpp_docshell_base0.o
> w/ LTO.
>
> Combining the two of them builds fine but either building it with LTO (to
> allow merging with other LTO objects, not that there's many left, just
> thinking of the .a which I didn't need to split out) or without doesn't lead
> to the crash.
>
> Uploaded a tarball [0] as a checkpoint of progress with preprocessed sources
> (Unified_cpp_sessionstore0.ii, Unified_cpp_docshell_base0.ii), compile/link
> commands for those, and the full link command + list of objects from
> bisection.
>
> [0] https://dev.gentoo.org/~sam/bugs/gentoo/950229/950229-checkpoint.tar.xz

I'm sure you are aware of this, but you can't really bisect lto failures by compiling some files with -flto and others without.

I learned that when debugging an lto segfault due to strict aliasing violations. I ended up writing a separate file to put functions into, and got it down to a few (3 iirc) functions that when built without lto, everything would work, but build any of them with lto, and the segfault would start happening.

In the end, the strict aliasing violation wasn't in those functions, it was in a completely different place. If it helps give any ideas here, I tracked it down by adding volatile's to function args and then removing some of them, and bisected it that way.

When debugging that lto crash, gdb was showing some insane output. Something like:
foo = NULL, foo->bar = something, foo->bar->baz (segfault)

Again I'm sure you already know what can happen when debugging lto crashes, but I wanted to point it out.
(In reply to stefan11111 from comment #17)
> (In reply to Sam James from comment #13)
> > So far, bisected objects to:
> > * build/toolkit/components/sessionstore/Unified_cpp_sessionstore0.o
> > * build/docshell/base/Unified_cpp_docshell_base0.o
> > w/ LTO.
> >
> > Combining the two of them builds fine but either building it with LTO (to
> > allow merging with other LTO objects, not that there's many left, just
> > thinking of the .a which I didn't need to split out) or without doesn't lead
> > to the crash.
> >
> > Uploaded a tarball [0] as a checkpoint of progress with preprocessed sources
> > (Unified_cpp_sessionstore0.ii, Unified_cpp_docshell_base0.ii), compile/link
> > commands for those, and the full link command + list of objects from
> > bisection.
> >
> > [0] https://dev.gentoo.org/~sam/bugs/gentoo/950229/950229-checkpoint.tar.xz
>
> I'm sure you are aware of this, but you can't really bisect lto failures
> by compiling some files with -flto and others without.
>

No, you can -- that's what I did. The rub is partitioning, but it doesn't mean you can't do it at all, or that it has no value. It just means you can't (obviously) get it down to a single TU.

Any files built with LTO *can't* be relevant because they don't have any GIMPLE in them anymore, they're simply object files with no metadata.

(Most LTO issues also boil down to "more inlining" which is why you can almost-always transform an LTO bug into a single-file testcase in the end. It's just getting there that sucks.)
(In reply to Sam James from comment #18)
>
> Any files built with LTO *can't* be relevant because they don't have any
> GIMPLE in them anymore, they're simply object files with no metadata.
>

with->without
(https://gcc.gnu.org/PR117315 is a recent example where it ended up being the malloc attribute causing DSE a while down the line, by the way.)
s/DSE/DCE, and now bed..
Testing the patch from https://bugzilla.mozilla.org/show_bug.cgi?id=1790526 (https://bugzilla.mozilla.org/attachment.cgi?id=9465385). If it works, I think it'd make sense too.

https://searchfox.org/mozilla-central/source/toolkit/components/sessionstore/SessionStoreParent.cpp#197

Because aBrowsingContext is accessed in:

  aBrowsingContext.GetMaybeDiscarded()->Canonical(), aFormData,

the compiler might, with LTO, conclude "ok, it's not null, as you accessed it before", and then kill the mRawPtr check later (which would be legitimate).
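A small sketch of the pattern being described, for readers who want to see its shape. This is not the upstream patch and not Mozilla's code: ToyContext, ToyCanonical, update_unchecked and update_checked are hypothetical names, and guarding before the first use is only assumed here to be the general kind of remedy.

```cpp
#include <cassert>

// Hypothetical stand-ins; only the call shape matters.
struct ToyCanonical {
    int updates = 0;
};

struct ToyContext {
    ToyCanonical canonical;
    ToyCanonical* Canonical() { return &canonical; }
};

// Risky shape: ctx->Canonical() is called unconditionally. Calling a member
// function through a null pointer is undefined behaviour, so an optimiser is
// entitled to assume ctx != nullptr from this point on, including in code
// that gets inlined later (for example a destructor's null check).
int update_unchecked(ToyContext* ctx) {
    ToyCanonical* c = ctx->Canonical();
    c->updates++;
    return c->updates;
}

// Safer shape: reject null before anything touches the pointer, so no later
// check can be "proven" redundant by an earlier use.
int update_checked(ToyContext* ctx) {
    if (!ctx) {
        return -1;  // hypothetical error value for this sketch
    }
    return update_unchecked(ctx);
}

// Exercises only the defined-behaviour paths.
void self_check() {
    ToyContext ctx;
    assert(update_checked(&ctx) == 1);
    assert(update_checked(nullptr) == -1);
}
```

Note that update_unchecked is never called with nullptr here; doing so would be the very UB under discussion, and whether a particular GCC version then drops a later check depends on inlining and LTO decisions that this sketch does not try to reproduce.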
Awesome, let me know if it works and can be added to 136.
(In reply to Joonas Niilola from comment #23)
> Awesome, let me know if it works and can be added to 136.

Thank you! It's looking very good so far. No luck crashing on the standard reproducer with a manual build, and my system FF has been patched since around the time of my comment.

Please do add it in if you can on the next version. I'll comment on the upstream bug/review later today.
Stefan, it would be nice if you could check it out and see if it helps your crash too. parona too?
(Not GCC 15 specific; kostadin hit it w/ GCC 14 at least, not 13. It also started between Firefox 129 and 131, we think.)
(In reply to Sam James from comment #25)
> Stefan, it would be nice if you could check it out and see if it helps your
> crash too. parona too?

The patch fixes both my crashes.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7dfd8a45f803cd33f4714dbcc664fc965a5748d9

commit 7dfd8a45f803cd33f4714dbcc664fc965a5748d9
Author:     Joonas Niilola <juippis@gentoo.org>
AuthorDate: 2025-03-03 20:13:21 +0000
Commit:     Joonas Niilola <juippis@gentoo.org>
CommitDate: 2025-03-03 20:16:59 +0000

    www-client/firefox: add 136.0

     - add a patch that's currently being worked upstream which fixes runtime issues when compiled with gcc and lto,
     - attempt removing "*bgo-925101-force-software-rendering-during-pgo-build.patch" as it's supposed to be fixed by mesa updates,
     - disable pref "permissions.manager.remote.enabled" by default - can be enabled by corporations managing their browsers via remote-settings,
     - handle rust-simd by enabling it on supported arches, so unkeyworded arches can probably compile the browser with --disable-rust-simd by default without editing the ebuild,
     - increase nss, icu and libpng version requirements,
     - remove our custom patch enabling vaapi on all amd cards since it's merged upstream,
     - remove our custom system-av1 & system-libvpx patches as they've been merged upstream.

    Bug: https://bugs.gentoo.org/950229
    Bug: https://bugs.gentoo.org/950305
    Signed-off-by: Joonas Niilola <juippis@gentoo.org>

 www-client/firefox/Manifest                      |  102 ++
 www-client/firefox/files/gentoo-default-prefs.js |    3 +
 www-client/firefox/firefox-136.0.ebuild          | 1382 ++++++++++++++++++++++
 3 files changed, 1487 insertions(+)
Thank you all! Given that we think it got introduced after the last ESR, closing.
(In reply to Sam James from comment #29)
> Thank you all! Given that we think it got introduced after the last ESR,
> closing.

or "introduced", I should say -- the upstream bug is older than that (2 years old) and it's been latent, I think.
Can someone explain why this patch fixes the issue?

I know gcc removes NULL-pointer checks in cases like:

foo (char *ptr)
{
    if (ptr)
        bar(ptr);
}

bar (char *ptr)
{
    if (ptr)
        *ptr = 0;
}

And cases like:

foo (volatile char *ptr) /* volatile so if won't get optimized for other reasons */
{
    *ptr = 0;
    if (ptr)
        *ptr = 1;
}

If it were the first case, then the NULL-pointer check wouldn't cause segfaults.
If it were the second case, It wouldn't have mattered if the NULL-pointer check was removed or not, a segfault would still happen.

Is there something I'm missing here?
It's nearly the first case.

int bar(int *p) {
    if (!p) /* check is deleted with LTO if all callers are known because it
               has to be non-NULL for the earlier printf to be valid */
        __builtin_abort();
}

int foo(int* p) {
    printf("%p\n", p); /* p must be non-null given we used it */
    bar(p);
}
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4e2aeadbeb3d2f677a1ed5d4286b090620c896ec

commit 4e2aeadbeb3d2f677a1ed5d4286b090620c896ec
Author:     Joonas Niilola <juippis@gentoo.org>
AuthorDate: 2025-03-04 14:37:32 +0000
Commit:     Joonas Niilola <juippis@gentoo.org>
CommitDate: 2025-03-04 14:37:32 +0000

    www-client/firefox: add 128.8.0

     - add a patch that's currently being worked upstream which fixes runtime issues when compiled with gcc and lto,
     - handle rust-simd by enabling it on supported arches, so unkeyworded arches can probably compile the browser with --disable-rust-simd by default without editing the ebuild,
     - sync the updated configure option from rapid to esr (system-ffi, update-channel).
     - while "permissions.manager.remote.enabled" is disabled through the default pref settings, the setting will actually only be active in the rapid (136) version.

    Bug: https://bugs.gentoo.org/950229
    Signed-off-by: Joonas Niilola <juippis@gentoo.org>

 www-client/firefox/Manifest               |  102 +++
 www-client/firefox/firefox-128.8.0.ebuild | 1380 +++++++++++++++++++++++++++++
 2 files changed, 1482 insertions(+)
(https://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html)
(In reply to Sam James from comment #32)
> It's nearly the first case.
>
> int bar(int *p) {
>     if (!p) /* check is deleted with LTO if all callers are known because it
> has to be non-NULL for the earlier printf to be valid */
>         __builtin_abort();
> }
>
> int foo(int* p) {
>     printf("%p\n", p); /* p must be non-null given we used it */
>     bar(p);
> }

Printing a NULL pointer is UB or something in c++?
I don't see why a NULL pointer can't be printed, as it's just printing an unsigned long as hex, with some formatting.

$ cat printf.cpp
#include <stdio.h>

int main()
{
    int *p = NULL;
    printf("%p\n", p);
    return 0;
}

$ g++ printf.cpp -O3 -flto -o printf
$ ./printf
(nil)
I think printf was a bad example. The rest stands. Please read the linked post.
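For what it's worth, a variant of the example that avoids the printf problem might look like the sketch below. Passing a null pointer to printf("%p") only prints it, so it gives the compiler nothing to assume; an actual dereference does. The names checked_read and read_twice are hypothetical, and whether a given GCC build really deletes the check depends on inlining/LTO and on options such as -fdelete-null-pointer-checks, which is not verified here.

```cpp
#include <cassert>
#include <cstdlib>

// Callee with a defensive null check, like bar() in the thread.
int checked_read(const int* p) {
    if (!p) {
        // Candidate for deletion once the compiler can see that every
        // caller has already dereferenced p.
        std::abort();
    }
    return *p;
}

// Caller that dereferences p *before* handing it to the callee.
int read_twice(const int* p) {
    int first = *p;  // UB if p is null, so p may be assumed non-null below
    return first + checked_read(p);
}

// Exercises the code with a valid pointer only; a null pointer passed to
// read_twice would be undefined behaviour and is deliberately not shown.
void self_check() {
    int value = 21;
    assert(checked_read(&value) == 21);
    assert(read_twice(&value) == 42);
}
```

The difference from the original example is only the first line of read_twice: `*p` requires p to be non-null, whereas `printf("%p\n", p)` does not, which is why the printf.cpp test above happily prints (nil).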