Summary: | www-client/firefox-bin-119.0 with apulse [alsa,-pulseaudio] : SEGV on launch | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Phil Stracchino (Unix Ronin) <phils> |
Component: | Current packages | Assignee: | Mozilla Gentoo Team <mozilla> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | aj-lists, andy, dark.knight.ita, email200202, herrtimson, phils, preed, teika |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugzilla.mozilla.org/show_bug.cgi?id=1850866, https://bugzilla.mozilla.org/show_bug.cgi?id=1839740 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | strace output; strace firefox-bin; Fixed ebuild.; Patch to the official ebuild. | ||
Description
Phil Stracchino (Unix Ronin)
2023-10-24 15:15:41 UTC
Can someone hitting this please run it under gdb and try to get a backtrace? Thanks.

(In reply to Sam James from comment #1)
> Can someone hitting this please run it under gdb and try to get a backtrace?
> Thanks.

I tried ...

(gdb) file /usr/bin/firefox-bin
"/usr/bin/firefox-bin": not in executable format: file format not recognized

Ugh, the wrapper thing.. sorry, I can't look up the real path right now, can when home.

(In reply to Sam James from comment #3)
> Ugh, the wrapper thing.. sorry, I can't look up the real path right now, can
> when home.

I found it, it's /opt/firefox/firefox-bin. Don't know whether this is enough information:

(gdb) file /opt/firefox/firefox-bin
Reading symbols from /opt/firefox/firefox-bin...
(No debugging symbols found in /opt/firefox/firefox-bin)
(gdb) run
Starting program: /opt/firefox/firefox-bin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff77ff6c0 (LWP 23068)]
[Thread 0x7ffff77ff6c0 (LWP 23068) exited]
[Detaching after fork from child process 23069]

Thread 1 "firefox-bin" received signal SIGSEGV, Segmentation fault.
0x0000000005f3f650 in ?? ()
(gdb) bt
#0 0x0000000005f3f650 in ()
#1 0x00007ffff7fcee5e in () at /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7fcef4c in () at /lib64/ld-linux-x86-64.so.2
#3 0x00007ffff7fcb556 in _dl_catch_exception () at /lib64/ld-linux-x86-64.so.2
#4 0x00007ffff7fd583f in () at /lib64/ld-linux-x86-64.so.2
#5 0x00007ffff7fcb4c9 in _dl_catch_exception () at /lib64/ld-linux-x86-64.so.2
#6 0x00007ffff7fd5bdd in () at /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7aa9a18 in () at /lib64/libc.so.6
#8 0x00007ffff7fcb4c9 in _dl_catch_exception () at /lib64/ld-linux-x86-64.so.2
#9 0x00007ffff7fcb5ef in () at /lib64/ld-linux-x86-64.so.2
#10 0x00007ffff7aa94ea in () at /lib64/libc.so.6
#11 0x00007ffff7aa9ad1 in dlopen () at /lib64/libc.so.6
#12 0x00005555555e47e9 in _start ()
(gdb)

(I still have the gdb session open if you can give me specific commands you'd like me to run.)

It's failing to load some shared library via dlopen(). You would need to rebuild sys-libs/glibc with debug symbols to determine which library it is trying to load.

Alternatively, you could run strace to see what file is getting loaded.

(In reply to Mike Gilbert from comment #6)
> It's failing to load some shared library via dlopen(). You would need to
> rebuild sys-libs/glibc with debug symbols to determine which library it is
> trying to load.
>
> Alternatively, you could run strace to see what file is getting loaded.

Here's the tail end of the strace:

openat(AT_FDCWD, "/usr/lib64/apulse/libX11-xcb.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/firefox/libX11-xcb.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/apulse/libX11-xcb.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib64/libX11-xcb.so.1", O_RDONLY|O_CLOEXEC) = 7
read(7, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
newfstatat(7, "", {st_mode=S_IFREG|0755, st_size=13816, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 16400, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 7, 0) = 0x7f49ed45a000
mmap(0x7f49ed45b000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 7, 0x1000) = 0x7f49ed45b000
mmap(0x7f49ed45c000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 7, 0x2000) = 0x7f49ed45c000
mmap(0x7f49ed45d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 7, 0x2000) = 0x7f49ed45d000
close(7) = 0
mprotect(0x7f49ed45d000, 4096, PROT_READ) = 0
mprotect(0x7f49ebacd000, 4096, PROT_READ) = 0
mprotect(0x7f49ed466000, 4096, PROT_READ) = 0
mprotect(0x7f49ebbb6000, 24576, PROT_READ) = 0
mprotect(0x7f49eb3ae000, 5156864, PROT_READ) = 0
mprotect(0x7f49eb3ae000, 5156864, PROT_READ|PROT_WRITE) = 0
mprotect(0x7f49eb3ae000, 5156864, PROT_READ) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x5f3f650} ---
+++ killed by SIGSEGV +++
Segmentation fault

Seeing a lot of ENOENTs there. It looks to me like firefox-bin-119 is trying to open those libraries from places where they shouldn't be...?

(In reply to Phil Stracchino (Unix Ronin) from comment #8)
> Seeing a lot of ENOENTs there. It looks to me like firefox-bin-119 is
> trying to open those libraries from places where they shouldn't be...?

And/or versions of libraries that aren't installed...

libx11-xcb.so.1 is likely being loaded as a dependency of some other library.

Please attach the full strace output.

Created attachment 873415 [details]
strace output
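
For readers reproducing the strace step suggested in this thread, the following is a minimal sketch of how the interesting part of a trace can be isolated. It lists only the shared objects that were opened successfully, so the last entries show what the loader mapped just before the crash. The function name `opened_libs` and the log file name `firefox.strace` are hypothetical and not taken from the report; the sed pattern assumes the `openat(AT_FDCWD, "...", ...) = N` line shape seen in the trace quoted in this bug.

```shell
#!/bin/sh
# Read strace output on stdin and print, in order, the paths of shared
# objects whose openat() call succeeded (returned a file descriptor).
# Failed lookups such as "= -1 ENOENT" are skipped, because they are
# normal library search-path misses rather than errors.
opened_libs() {
    sed -n 's/^.*openat(AT_FDCWD, "\([^"]*\.so[^"]*\)".*) = [0-9][0-9]*$/\1/p'
}

# Possible use (not executed here; paths are the ones named in this bug):
#   strace -f -e trace=openat -o firefox.strace /opt/firefox/firefox-bin
#   opened_libs < firefox.strace | tail -n 5
```

Filtering this way makes it easier to see that the ENOENT lines are only the loader walking its search path, and that the crash follows a successful load.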
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=0512a42f7fccccc16c44a2047cdc9d314cd044db

commit 0512a42f7fccccc16c44a2047cdc9d314cd044db
Author: Joonas Niilola <juippis@gentoo.org>
AuthorDate: 2023-10-24 18:05:23 +0000
Commit: Joonas Niilola <juippis@gentoo.org>
CommitDate: 2023-10-24 18:06:41 +0000

www-client/firefox-bin: fix RDEPEND typo on 119.0

Closes: https://bugs.gentoo.org/916230
Signed-off-by: Joonas Niilola <juippis@gentoo.org>

.../{firefox-bin-119.0.ebuild => firefox-bin-119.0-r1.ebuild} | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

I guess this is it but not really sure how you hit it now. Please reopen if it still segfaults.

Wait, no, that can't be it.

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=866e2402f3b421eb12b4de2714cbdc43f48ef61d

commit 866e2402f3b421eb12b4de2714cbdc43f48ef61d
Author: Joonas Niilola <juippis@gentoo.org>
AuthorDate: 2023-10-24 18:13:14 +0000
Commit: Joonas Niilola <juippis@gentoo.org>
CommitDate: 2023-10-24 18:13:19 +0000

www-client/firefox-bin: re-add 118.0.2

- in case 119.0 is broken for some people.

Bug: https://bugs.gentoo.org/916230
Signed-off-by: Joonas Niilola <juippis@gentoo.org>

www-client/firefox-bin/Manifest | 98 ++++++
www-client/firefox-bin/firefox-bin-118.0.2.ebuild | 382 ++++++++++++++++++++++
2 files changed, 480 insertions(+)

Based on the strace log, it looks like it is crashing in dlopen("/opt/firefox/libxul.so").

Okay, reproduced with apulse.

The problem is still there in www-client/firefox-bin-119.0-r1

$ firefox-bin
Segmentation fault

See attached strace output

Created attachment 873416 [details]
strace firefox-bin
> The problem is still there in www-client/firefox-bin-119.0-r1
Me too, with +alsa -pulseaudio.
Thanks all. Regards.
Re: www-client/firefox-bin-119.0-r1

Ditto. (Me too.) Reverting to 118.0.2 restored functionality.

So I think apulse needs to be updated to work with 119.0. Looking at https://github.com/i-rinat/apulse/issues/121 I'm not very hopeful that would happen. Someone should still report this to apulse upstream, maybe there's someone with an idea how to fix it. Unfortunately as the issue seems to be at least in firefox-bin, there's not much that can be done to fix it.

Created attachment 873560 [details]
Fixed ebuild.
Dear patients, please try the attached ebuild.
It simply drops the line of "patchelf". (And accompanying BDEPEND fix.) At least it works for me.
Created attachment 873561 [details, diff]
Patch to the official ebuild.
If you're sane, you would prefer this patch rather than directly relying on my ebuild.
Hope this works. Thanks all!
(In reply to Teika kazura from comment #23)
> Created attachment 873560 [details]
> Fixed ebuild.
>
> Dear patients, please try the attached ebuild.
>
> It simply drops the line of "patchelf". (And accompanying BDEPEND fix.) At
> least it works for me.

YES!

I downloaded the ebuild as firefox-bin-119.0-r2.ebuild in my local repo, emerged it then have it running now. Appears to work fine. Has sound. Only error running from shell, which has been present for a while now, is:

$ firefox-bin
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.

Very happy. Wondering if there are any long term casualties in adopting this? I don't know. I'm just a patient.

Mysterious. I wonder if they added some apulse detection upstream in 119 which makes it obsolete or something.

/usr/bin/firefox-bin still has these lines:

APULSELIB_DIR="/usr/lib64/apulse"
...
export LD_LIBRARY_PATH="${APULSELIB_DIR:+${APULSELIB_DIR}:}${MOZILLA_FIVE_HOME}"

So they suffice, I guess.

# Or rather, why LD_LIBRARY_PATH is insufficient, and patchelf is necessary, unless libpulse.so path is hardcoded?
## Dunno what patchelf is exactly. Sorry if my comment misses the point.

(In reply to Teika kazura from comment #23)
> Created attachment 873560 [details]
> Fixed ebuild.
>
> Dear patients, please try the attached ebuild.
>
> It simply drops the line of "patchelf". (And accompanying BDEPEND fix.) At
> least it works for me.

(In reply to Andy Figueroa from comment #25)
> (In reply to Teika kazura from comment #23)
> > Created attachment 873560 [details]
> > Fixed ebuild.
> >
> > Dear patients, please try the attached ebuild.
> >
> > It simply drops the line of "patchelf". (And accompanying BDEPEND fix.) At
> > least it works for me.
>
> YES!
>
> I downloaded the ebuild as firefox-bin-119.0-r2.ebuild in my local repo,
> emerged it then have it running now. Appears to work fine. Has sound. Only
> error running from shell, which has been present for a while now, is:
>
> $ firefox-bin
> ATTENTION: default value of option mesa_glthread overridden by environment.
> ATTENTION: default value of option mesa_glthread overridden by environment.
>
> Very happy. Wondering if there are any long term casualties in adopting
> this? I don't know. I'm just a patient.

Do either of you have pulseaudio or pipewire installed? It shouldn't be using apulse after that modification. And I also doubt upstream Mozilla has done anything to support apulse either, buuut who knows, might have to take a deeper look.

(In reply to Joonas Niilola from comment #28)
>
> Do either of you have pulseaudio or pipewire installed? It shouldn't be
> using apulse after that modification. And I also doubt upstream Mozilla has
> done anything to support apulse either, buuut who knows, might have to take
> a deeper look.

No, my system is strictly -pulseaudio +alsa with no pipewire. apulse is installed.

(In reply to Andy Figueroa from comment #29)
> (In reply to Joonas Niilola from comment #28)
> >
> > Do either of you have pulseaudio or pipewire installed? It shouldn't be
> > using apulse after that modification. And I also doubt upstream Mozilla has
> > done anything to support apulse either, buuut who knows, might have to take
> > a deeper look.
>
> No, my system is strictly -pulseaudio +alsa with no pipewire. apulse is
> installed.

Same here: +alsa -pulseaudio, apulse, no pipewire. I long ago learned that any of the open-source software mixers, even if you can get them correctly configured, are a poor substitute for an EMU10K1 card with hardware mixing.

I've tested Teika's patch and verified it appears to work correctly for me.

I, too, can confirm that removing the patchelf command gives a working firefox build (including sound!)

I did some poking around Mozilla's bugzilla and didn't see anything related to audio that looked all that interesting, so I used Mozilla's mozregression tool to see if it provided anything useful... here's what it said:

13:25.85 INFO: Narrowed integration regression window from [c8336f0a, 68879cdd] (3 builds) to [c8336f0a, 032b87ff] (2 builds) (~1 steps left)
13:25.85 INFO: No more integration revisions, bisection finished.
13:25.85 INFO: Last good revision: c8336f0ad9fba7249e6fa871f0cf96660c9c3b97
13:25.85 INFO: First bad revision: 032b87ff55061bcbdc7a85d9e18fde814797073a
13:25.85 INFO: Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c8336f0ad9fba7249e6fa871f0cf96660c9c3b97&tochange=032b87ff55061bcbdc7a85d9e18fde814797073a

This is the specific commit referenced in the pushlog: https://hg.mozilla.org/integration/autoland/rev/032b87ff55061bcbdc7a85d9e18fde814797073a

And this is the original Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1850866

This, in turn, references https://bugzilla.mozilla.org/show_bug.cgi?id=1839740, which is about "modernizing elfhack," (whatever that is... but wait for it!) the bug conveniently references a blog post on elfhack: https://glandium.org/blog/?p=4297 (which, in turn, goes into a fair amount of details about what "relhack" is, and that seems to be the smoking gun).

So, tl;dr: I don't think this has (or ever had) anything to do with apulse or audio in general; it was the patchelf command in the ebuild that probably conflicted with Mozilla's new "relhack" linker munging tool, which got turned on in their official build system for 119.x.

I'd also guess that the reason "everything works" after removing that command is because the /usr/bin/firefox-bin wrapper still sets LD_LIBRARY_PATH to include the apulse lib directory for -pulseaudio +alsa configurations.

So, I think the above patch is a pretty solid candidate for a firefox-bin-119.0-r2 ebuild.

Many thanks for doing that work.

Agreed!

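
To illustrate why the wrapper alone is enough, here is a minimal sketch of how the `LD_LIBRARY_PATH` line quoted from /usr/bin/firefox-bin expands. The helper name `build_ld_library_path` and its positional arguments are hypothetical and exist only for this illustration; the expansion itself is the standard `${VAR:+word}` form used in the quoted wrapper line.

```shell
#!/bin/sh
# Mirror the wrapper's expression:
#   "${APULSELIB_DIR:+${APULSELIB_DIR}:}${MOZILLA_FIVE_HOME}"
# ${VAR:+word} expands to word only when VAR is set and non-empty, so the
# apulse directory and its ":" separator are prepended only when defined.
build_ld_library_path() {
    apulse_dir=$1   # e.g. /usr/lib64/apulse, or empty when apulse is unused
    moz_home=$2     # e.g. /opt/firefox
    printf '%s\n' "${apulse_dir:+${apulse_dir}:}${moz_home}"
}

build_ld_library_path "/usr/lib64/apulse" "/opt/firefox"
build_ld_library_path "" "/opt/firefox"
```

With the apulse directory set, it comes first in the search path, which matches the lookup order visible in the strace excerpt earlier in this bug (/usr/lib64/apulse, then /opt/firefox, then the system directories).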
(In reply to J. Paul Reed from comment #31)
>
> I'd also guess that the reason "everything works" after removing that
> command is because the /usr/bin/firefox-bin wrapper still sets
> LD_LIBRARY_PATH to include the apulse lib directory for -pulseaudio +alsa
> configurations.
>

Ahh indeed, that explains it. Thanks for digging into it :) let's go with this.

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e64a17210c65081ee68e1e63031e66e45b6bc09c

commit e64a17210c65081ee68e1e63031e66e45b6bc09c
Author: Joonas Niilola <juippis@gentoo.org>
AuthorDate: 2023-10-29 07:37:31 +0000
Commit: Joonas Niilola <juippis@gentoo.org>
CommitDate: 2023-10-29 07:37:31 +0000

www-client/firefox-bin: fix 119.0 when using apulse

Closes: https://bugs.gentoo.org/916230
Signed-off-by: Joonas Niilola <juippis@gentoo.org>

...119.0-r1.ebuild => firefox-bin-119.0-r2.ebuild} | 28 +++++++---------------
1 file changed, 8 insertions(+), 20 deletions(-)