Summary: | www-client/firefox-90.0.2: fails to build with glibc 2.34 (note: non-constexpr function '__sysconf' cannot be used in a constant expression) | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Sam James <sam> |
Component: | Current packages | Assignee: | Mozilla Gentoo Team <mozilla> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | canarauc, da5id2001, finkandreas, toolchain |
Priority: | Normal | Keywords: | PATCH |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: |
https://bugzilla.redhat.com/show_bug.cgi?id=1983703 https://bugs.gentoo.org/show_bug.cgi?id=803953 https://bugzilla.mozilla.org/show_bug.cgi?id=1721326 https://bugs.gentoo.org/show_bug.cgi?id=828070 |
Bug Blocks: | 803482 | ||
Attachments: |
build.log.bz2
glibc-2.34-r3 patch |
Description
Sam James
2021-07-25 04:16:24 UTC
Created attachment 726670 [details]
build.log.bz2
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=704508d6adb23fef2ce6e14a25166848ff3bcbcb

    commit 704508d6adb23fef2ce6e14a25166848ff3bcbcb
    Author: Thomas Deutschmann <whissi@gentoo.org>
    AuthorDate: 2021-08-11 01:19:37 +0000
    Commit: Thomas Deutschmann <whissi@gentoo.org>
    CommitDate: 2021-08-11 01:38:15 +0000

    www-client/firefox: bump to v91.0

    Bug: https://bugs.gentoo.org/803950
    Package-Manager: Portage-3.0.21, Repoman-3.0.3
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

    www-client/firefox/Manifest | 98 +++
    www-client/firefox/files/firefox-r1.sh | 116 ++++
    www-client/firefox/firefox-91.0.ebuild | 1148 ++++++++++++++++++++++++++++++++
    3 files changed, 1362 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=daeb80d270ac8aa5c8a70002e4a2dff5c976e7c7

    commit daeb80d270ac8aa5c8a70002e4a2dff5c976e7c7
    Author: Thomas Deutschmann <whissi@gentoo.org>
    AuthorDate: 2021-08-10 21:48:54 +0000
    Commit: Thomas Deutschmann <whissi@gentoo.org>
    CommitDate: 2021-08-11 01:38:15 +0000

    www-client/firefox: bump to v78.13.0 ESR

    Bug: https://bugs.gentoo.org/803950
    Package-Manager: Portage-3.0.21, Repoman-3.0.3
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

    www-client/firefox/Manifest | 97 +++
    www-client/firefox/firefox-78.13.0.ebuild | 1187 +++++++++++++++++++++++++++++
    2 files changed, 1284 insertions(+)

Upstream has done some good stuff with firefox-91:

    firefox + rust-1.54 -- works
    firefox + python-3.10 -- works
    firefox + glibc-2.34 -- works

Firefox 91.0.2 compiles fine with glibc-2.34, no patch required, but most extensions do not work out of the box. I had to export MOZ_DISABLE_CONTENT_SANDBOX=1 to restore extensions to a working state. It seems that I am not the only one affected: https://forums.gentoo.org/viewtopic-t-1141441-highlight-chromium.html

Nothing else works: firefox-bin, a new profile, safe mode, or restoring default settings.

Just to make this clear: Only >=www-client/firefox-78.13.0 & >=www-client/firefox-91.0 are carrying the linked patches, which will be added upstream with their next releases. So www-client/firefox-bin currently doesn't have these patches. If >=www-client/firefox-78.13.0 or >=www-client/firefox-91.0 doesn't work for you with glibc-2.34, please give us a ping.

Firefox's extensions do not work.

    emerge -pvO sys-libs/glibc www-client/firefox

    These are the packages that would be merged, in order:

    [ebuild R *] sys-libs/glibc-2.34:2.2::gentoo USE="caps multiarch ssp (static-libs) systemd -audit (-cet) -compile-locales (-crypt) -custom-cflags -doc -gd -headers-only (-multilib) -multilib-bootstrap -nscd -profile (-selinux) -static-pie -suid -systemtap -test (-vanilla)" 0 KiB
    [ebuild R ] www-client/firefox-91.0.2:0/91::gentoo USE="clang dbus gmp-autoupdate hwaccel lto openh264 pulseaudio system-av1 system-harfbuzz system-icu system-jpeg system-libevent system-libvpx system-webp -debug -eme-free -geckodriver -hardened -jack -pgo -screencast (-selinux) -sndio -wayland -wifi" L10N="ro -ach -af -an -ar -ast -az -be -bg -bn -br -bs -ca -ca-valencia -cak -cs -cy -da -de -dsb -el -en-CA -en-GB -eo -es-AR -es-CL -es-ES -es-MX -et -eu -fa -ff -fi -fr -fy -ga -gd -gl -gn -gu -he -hi -hr -hsb -hu -hy -ia -id -is -it -ja -ka -kab -kk -km -kn -ko -lij -lt -lv -mk -mr -ms -my -nb -ne -nl -nn -oc -pa -pl -pt-BR -pt-PT -rm -ru -sco -si -sk -sl -son -sq -sr -sv -szl -ta -te -th -tl -tr -trs -uk -ur -uz -vi -xh -zh-CN -zh-TW" 0 KiB

    Total: 2 packages (2 reinstalls), Size of downloads: 0 KiB

    diff -u /usr/bin/firefox.orig /usr/bin/firefox
    --- /usr/bin/firefox.orig 2021-08-31 21:08:33.632344218 +0300
    +++ /usr/bin/firefox 2021-08-30 10:45:24.158266864 +0300
    @@ -111,6 +111,7 @@
     # Don't throw "old profile" dialog box.
     export MOZ_ALLOW_DOWNGRADE=1
    +export MOZ_DISABLE_CONTENT_SANDBOX=1
     # Run the browser
     exec ${MOZ_PROGRAM} "${@}"

Without MOZ_DISABLE_CONTENT_SANDBOX=1, ublock Origin, Enhancer for Youtube, and h264ify do not work.

This issue persists in today's build of www-client/firefox-94.0.2.

Are the sandbox crashes in extensions (ublock origin) in FF94 related to the glibc version 2.34? Has anyone tried glibc 2.33 to see if extension crashes go away?

So, just did 10 builds with randomized use flags for 94.0.2 on a fresh ~amd64. glibc-2.34-r2. No issues at all here.

I can confirm that there is a crash with firefox. It seems to be related to glibc-2.34, because it showed up on my side at around the time when glibc-2.34 was installed. Another way to let it crash is to try dragging a (large enough) image. Downgrading to glibc-2.33 does not seem like a road that I would like to go down, as downgrades are not really recommended. I wanted to create a backtrace, but to my surprise `ulimit -c unlimited` did not dump a core file after the segfault. I'm not sure why it did not leave any core dump (a test executable left a core dump on crash, so generally it works).

(In reply to Andreas Fink from comment #10)
> Another way to let it crash is to try dragging a
> (large enough) image.

Can confirm that this crashes my firefox-94.0.2 here, too, with glibc-2.34-r2. Makes me suspect this might have something to do with the compositor, since the extensions that crash for me open up new windows with rounded corners (and thus require a cross-window alpha over, which in turn requires a compositor). Also note that I'm grossly spitballing, and any information above can be completely wrong.
Anyways, I'm running on sway (wayland):

    sway version 1.6-fc25e494 (Nov 13 2021, branch 'HEAD')

with wlroots-0.15.0 and mesa-21.3.0 on an Ellesmere graphics card (Radeon RX 580, iirc), on a somewhat recent kernel:

    Linux Ryzen2600 5.13.2-gentoo #1 SMP PREEMPT Sat Jul 17 00:58:11 CEST 2021 x86_64 AMD Ryzen 5 2600 Six-Core Processor AuthenticAMD GNU/Linux

> Downgrading to glibc-2.33 does not seem like a road
> that I would like to go, as downgrades are not really recommended.

I tried to force the downgrade with

    I_ALLOW_TO_BREAK_MY_SYSTEM=yes emerge -1Ov "=sys-libs/glibc-2.33-r7"

But it (luckily) bailed out in the post-install step, where it checks whether some binaries (something like `file`) can still be launched with the selected glibc, and, it turned out, they couldn't.

> I wanted to create a backtrace but to my surprise `ulimit -c unlimited` did
> not dump a core file after the segfault. I'm not sure why it did not leave
> any core dump (a test executable left a core dump on crash, so generally it
> works).

Yeah, same. Not sure how the ulimit -c propagates to the processes forked by firefox. I also went a step further and modified the core-dump pattern

    echo "/tmp/cores/core.%e.%p.%h.%t" > /proc/sys/kernel/core_pattern

... to no avail. No coredumps generated.

Anyways, with MOZ_DISABLE_CONTENT_SANDBOX=1, it works.

I'm having the same issue, and also ran into the same thing that Dominik Schmidt did when I tried to downgrade glibc for testing:

    /usr/bin/free: ./libc.so.6: version `GLIBC_2.34' not found (required by /lib64/libsystemd.so.0).

However, the workaround does work for me as well.

1. Is it worth opening an issue with upstream glibc to even understand what the firefox extension crash is?
2. Should we file a new bug or change the title of this one to reflect the issue we are discussing? It's not a build problem.

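The workaround reported above can also be applied per invocation instead of editing the installed /usr/bin/firefox launcher, which is likely to be overwritten the next time the package is merged. A minimal sketch, assuming a POSIX shell and that `firefox` is on PATH; the function name `run_without_content_sandbox` is made up for illustration, and only the `MOZ_DISABLE_CONTENT_SANDBOX` variable comes from this thread. Since it switches the content sandbox off, it is better suited to diagnosis than to everyday use.

```shell
#!/bin/sh
# Sketch: run a command with the Firefox content sandbox disabled for that
# one invocation only. The function name is hypothetical; the environment
# variable is the one reported as a workaround in this bug.
run_without_content_sandbox() {
    # env(1) sets the variable only for the child process, so the calling
    # shell's environment is left untouched.
    env MOZ_DISABLE_CONTENT_SANDBOX=1 "$@"
}

# Example (not executed here):
#   run_without_content_sandbox firefox
```

Going through `env` rather than `export` keeps the setting out of the rest of the session, so a later plain `firefox` start runs with the sandbox enabled again.
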
I'm trying to summarize here a little bit of what I have found so far (see also https://wiki.mozilla.org/Security/Sandbox#Content_Levels):

- Starting firefox with MOZ_DISABLE_CONTENT_SANDBOX=1 works just fine
- Changing security.sandbox.content.level to 2 also fixes the problem
- Changing security.sandbox.content.level to 3 will make it crash again (this should give an idea that a read violation happens)

Now a bit of insight into what I have found so far (always started without setting MOZ_DISABLE_CONTENT_SANDBOX):

Starting firefox with MOZ_SANDBOX_LOGGING=1, the last lines before the crash are:

    Sandbox: SandboxBroker: denied op=stat rflags=0 perms=0 path=/ for pid=263
    Sandbox: Failed errno -13 op stat flags 00 path /

Now, reading the documentation for the sandbox from the above link, I whitelisted '/' in the config option security.sandbox.content.read_path_whitelist. However, that effectively gives read access to all of the filesystem (doc: To allow access to an entire directory tree (rather than just the directory itself), include a trailing / character). If I whitelist '/' and keep security.sandbox.content.level at 4, everything works fine again. Giving read access to '/' via whitelist is probably almost the same as having security.sandbox.content.level==2.

Attaching gdb to the sandboxed process and letting it crash, I see this backtrace (the SIGSYS is the sandbox violation, followed by a SIGSEGV):

    Thread 1 "Web Content" received signal SIGSYS, Bad system call.
    Thread 1 "Web Content" received signal SIGSEGV, Segmentation fault.
    0x00007f2749dbca69 in __GI___nss_lookup_function (fct_name=fct_name@entry=0x7f2749e3a7e9 "getpwuid_r", ni=<optimized out>) at nsswitch.c:136
    136 if (ni->module == NULL)
    (gdb) bt
    #0 0x00007f2749dbca69 in __GI___nss_lookup_function (fct_name=fct_name@entry=0x7f2749e3a7e9 "getpwuid_r", ni=<optimized out>) at nsswitch.c:136
    #1 __GI___nss_lookup (ni=ni@entry=0x7ffd894d9d88, fct_name=fct_name@entry=0x7f2749e3a7e9 "getpwuid_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7ffd894d9d90) at nsswitch.c:68
    #2 0x00007f2749dbdc93 in __GI___nss_passwd_lookup2 (ni=ni@entry=0x7ffd894d9d88, fct_name=fct_name@entry=0x7f2749e3a7e9 "getpwuid_r", fct2_name=fct2_name@entry=0x0, fctp=fctp@entry=0x7ffd894d9d90) at /usr/src/debug/sys-libs/glibc-2.34-r3/glibc-2.34/nss/XXX-lookup.c:58
    #3 0x00007f2749d60303 in __getpwuid_r (uid=uid@entry=1000, resbuf=resbuf@entry=0x7f2749e7da40 <resbuf>, buffer=0x7f27336f2c00 '\345' <repeats 199 times>, <incomplete sequence \345>..., buflen=buflen@entry=1024, result=result@entry=0x7ffd894d9de0) at ../nss/getXXbyYY_r.c:265
    #4 0x00007f2749d5fd0b in getpwuid (uid=1000) at ../nss/getXXbyYY.c:135
    #5 0x00007f27309247c4 in () at /usr/lib64/libfam.so.0
    #6 0x00007f2730924f10 in FAMOpen () at /usr/lib64/libfam.so.0
    #7 0x00007f2734813436 in () at /usr/lib64/gio/modules/libgiofam.so
    #8 0x00007f2748280307 in () at /usr/lib64/libgio-2.0.so.0
    #9 0x00007f2748337b26 in () at /usr/lib64/libgio-2.0.so.0
    #10 0x00007f2748338cd6 in () at /usr/lib64/libgio-2.0.so.0
    #11 0x00007f27482c91f6 in () at /usr/lib64/libgio-2.0.so.0
    #12 0x00007f27482ca074 in g_app_info_get_default_for_type () at /usr/lib64/libgio-2.0.so.0
    #13 0x00007f27454f5719 in () at /usr/lib64/firefox/libxul.so
    #14 0x00007f2742bb14bb in () at /usr/lib64/firefox/libxul.so
    #15 0x00007f2742ba0e0b in () at /usr/lib64/firefox/libxul.so
    #16 0x00007f2742ba1344 in () at /usr/lib64/firefox/libxul.so
    #17 0x00007f2742bacc38 in () at /usr/lib64/firefox/libxul.so
    ...
    (and many many more libxul.so without any further information)

This is my insight into it. I will try to recompile glibc to check for NULL-ptr, and return an error code instead of segfaulting. Will report back once I've got results.

Created attachment 757176 [details, diff]
glibc-2.34-r3 patch
Recompiling glibc with the patch in the folder
/etc/portage/patches/sys-libs/glibc-2.34-r3/
This fixes the NULL-ptr access in glibc (this is a trivial fix; I think the real fix should make the call to __nss_database_get already return a false value, but this function looked way too complicated to mess with).
No need to recompile firefox afterwards. No need for any other workarounds.
Could somebody please report it to libc?
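
For readers who want to try the attached patch, the placement described above can be sketched as follows. This is an illustration under assumptions: the file name `glibc-nss-null-check.patch` and the function `install_user_patch` are hypothetical (the attachment is only known here as "glibc-2.34-r3 patch"), and the optional root prefix exists only so the function can be tried in a scratch directory. Portage's user-patch mechanism picks up files ending in `.patch` or `.diff` from this directory.

```shell
#!/bin/sh
# Sketch: copy a patch into the Portage user-patch directory named in this bug.
# Usage: install_user_patch PATCH_FILE [ROOT_PREFIX]
install_user_patch() {
    patch_file=$1
    root=${2:-}   # empty means the real /etc/portage (needs root privileges)
    dest="${root}/etc/portage/patches/sys-libs/glibc-2.34-r3"

    if [ ! -f "$patch_file" ]; then
        echo "no such patch file: $patch_file" >&2
        return 1
    fi
    # Create the directory if needed, then copy the patch into it.
    mkdir -p "$dest" && cp "$patch_file" "$dest/"
}

# Example (not executed here); the file name is hypothetical:
#   install_user_patch ./glibc-nss-null-check.patch
#   emerge --oneshot sys-libs/glibc   # rebuild glibc so the patch is applied
```

Because the directory name includes the version and revision, the patch would only be applied to glibc-2.34-r3 and not to later revisions.
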
(In reply to devsk from comment #13)
> 2. Should we file a new bug or change the title of this one to reflect the
> issue we are discussing? It's not a build problem.

Yes please.

@ Andreas: Please include in your new bug how to reproduce. Never seen this yet and I wonder if you have a customized /etc/nsswitch.conf. Are you able to reproduce in headless mode, i.e. `firefox --headless --screenshot https://www.gentoo.org/`? If you are able to reproduce in headless mode, please try to reproduce in a fresh stage3.

I will not be able to reproduce it in headless mode, as it needs user interaction to crash. I have two ways to consistently crash it:

1. Open a website with a ("large") image -> Start dragging the image --> Tab crash
2. Install ublock Origin addon, click on the ublock Origin button in the top bar --> Only a thin vertical line pops up (and in /var/log/messages a crash in libc is logged)

Both seem to have the same root cause, as my patch fixed it for both.

I will try to set up a small test program with a sandbox that denies read access to / and see if I can reproduce it there. After all, the backtrace code is just calling getpwuid(getuid()). Not sure if I'll have the time in the next days though, so don't wait for me to get the test program up and running. In my understanding, a sandbox that forbids read access to '/' should crash a test program that is only calling getpwuid(getuid()).

I opened a new bug with a little reproducer that crashes on my system. I wonder if it does not crash for people where firefox does work correctly: https://bugs.gentoo.org/828070

Excellent job there Andreas! Really appreciate your effort in tracking this down!

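The lookup at the bottom of the backtrace, getpwuid(getuid()), can also be exercised from a shell, since `getent passwd` with a numeric key resolves the entry through the same NSS machinery. The sketch below is not the reproducer from bug 828070 and does not set up any sandbox, so on its own it is not expected to crash; it only shows the call path and how a failed lookup surfaces as an exit status. The function name `lookup_current_user` is made up for illustration.

```shell
#!/bin/sh
# Sketch: resolve the current UID to a passwd entry via NSS.
lookup_current_user() {
    uid=$(id -u)
    if entry=$(getent passwd "$uid"); then
        # passwd entries are colon-separated; the first field is the login name
        printf 'uid %s resolves to %s\n' "$uid" "${entry%%:*}"
    else
        status=$?
        # getent exits with 2 when the key is not found in the database
        printf 'lookup for uid %s failed (getent exit status %s)\n' "$uid" "$status" >&2
        return "$status"
    fi
}

# Example (not executed here):
#   lookup_current_user
```

A return status of 128 plus a signal number (139 for SIGSEGV) would instead indicate that the lookup process itself was killed, which is the kind of failure the backtrace shows inside the sandboxed content process.
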
(In reply to Andreas Fink from comment #15)
> Created attachment 757176 [details, diff]
> glibc-2.34-r3 patch
>
> Recompiling glibc with the patch in the folder
> /etc/portage/patches/sys-libs/glibc-2.34-r3/
>
> This fixes the NULL-ptr access in glibc (this is a trivial fix, I think the
> real fix should make the call to __nss_database_get return already a false
> value, but this function looked way too complicated to mess with it).
>
> No need to recompile firefox afterwards. No need for any other workarounds.
>
> Could somebody please report it to libc?

I can confirm that the patch at least corrected the issues I was having with Firefox. Thank you!

Tracking Status

    firefox-esr78 --- wontfix
    firefox-esr91 --- fixed
    firefox90 --- wontfix
    firefox91 --- wontfix
    firefox92 --- fixed

So, also fixed in stable.

This patch does not help with Firefox 97 and latest glibc. The extensions are broken again. If I use MOZ_DISABLE_CONTENT_SANDBOX=1, the extensions work again and FF works. Otherwise, FF just hangs loading the tab.

(In reply to devsk from comment #22)
> This patch does not help with Firefox 97 and latest glibc. The extensions
> are broken again.
>
> If I use MOZ_DISABLE_CONTENT_SANDBOX=1, the extensions work again and FF
> works. Otherwise, FF just hangs loading the tab.

That's a different bug. See the latest comments in bug 828070.