With a hardened-sources kernel with PaX RANDMMAP
(randomization of mmap addresses) enabled by default,
firefox 3.5.1 runs into an infinite loop early during startup
when allocating memory:
* It tries to allocate a block of memory at a specific address with mmap.
* The kernel returns a block of the requested size, but (due to RANDMMAP)
at a different address.
* Firefox doesn't like that, immediately unmaps the block, and tries again,
without end.
"paxctl -r /usr/lib64/mozilla-firefox/firefox" (turning off RANDMMAP)
makes firefox happy, but turning off security features against code
injection is a very bad idea, especially for web browsers!
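The loop described above can be sketched in C. This is only an illustration of the pattern, not code taken from Firefox or its allocator: the helper names map_at_hint and insist_on_hint are made up, and the retry cap exists only so that the sketch terminates where the real loop would not.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: request `size` bytes at `hint` without MAP_FIXED.
 * The kernel treats the address only as a hint and may place the block
 * somewhere else. Returns NULL if the mapping fails. */
static void *map_at_hint(void *hint, size_t size)
{
    void *p = mmap(hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Fragile pattern: accept the block only if it landed exactly at `hint`,
 * otherwise unmap it and try again. Returns the number of attempts used
 * on success, or -1 if no attempt produced the hinted address.
 * Without the max_tries cap this would spin forever whenever the kernel
 * keeps choosing a different address. */
static int insist_on_hint(void *hint, size_t size, int max_tries, void **out)
{
    *out = NULL;
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        void *p = map_at_hint(hint, size);
        if (p == NULL)
            return -1;          /* mapping failed outright */
        if (p == hint) {
            *out = p;           /* the kernel honoured the hint */
            return attempt;
        }
        munmap(p, size);        /* wrong address: throw it away and retry */
    }
    return -1;
}
```

Passing a hint that is already occupied by another mapping reproduces the situation without needing RANDMMAP: every attempt lands somewhere else, so only the cap stops the loop.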
This is not only a 64-bit problem.
It behaves exactly the same on the x86 platform.
*** Bug 279036 has been marked as a duplicate of this bug. ***
Please post your emerge --info and kernel config.
Portage 2.2_rc33-r4 (unavailable, gcc-4.4.1, glibc-2.9_p20081201-r5, 2.6.28-hardened-r9 x86_64)
System uname: Linux-2.6.28-hardened-r9-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q6600_@_2.40GHz-with-gentoo-2.0.1
Timestamp of tree: Mon Jul 27 07:41:44 CEST 2009
ccache version 2.4 [enabled]
sys-devel/autoconf: 2.13, 2.63-r1
sys-devel/automake: 1.4_p6, 1.9.6-r2, 1.10.2, 1.11
CFLAGS="-march=nocona -O2 -pipe"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=nocona -O2 -pipe"
FEATURES="assume-digests autoconfig ccache collision-protect distlocks fixpackages metadata-transfer parallel-fetch preserve-libs protect-owned sandbox sfperms sign strict unmerge-logs unmerge-orphans userfetch userpriv usersandbox"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTDIR_OVERLAY="/usr/local/portage/layman/toolchain-overlay /usr/local/portage/layman/toolchain-overlay-testing /usr/local/portage/layman/multilib-overlay /usr/local/portage/layman/sunrise /usr/local/portage/layman/java-overlay /usr/local/portage/layman/enlightenment /usr/local/portage"
USE="3dnow X alsa amd64 berkdb cracklib crypt cups custom-cflags custom-cxxflags gpm hardened justify midi mmx multilib ncurses nls nptl nptlonly nsplugin ogg opengl openmp pam pic readline scanner sse sse2 ssl sysfs tcpd unicode urandom vorbis xorg zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="de" USERLAND="GNU" VIDEO_CARDS="nvidia"
Unset: CPPFLAGS, CTARGET, FFLAGS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 199354 [details]
In addition, I use hardened gcc:4.4 from the zorry/xake overlay.
Created attachment 199358 [details]
Created attachment 199360 [details]
My emerge --info
This bug has been present since the 3.1 betas, ever since the native memory allocator was introduced. I filed a bug 8 months ago:
The fix is nontrivial since the allocator relies on these flawed assumptions about mmap behavior to make its threading model work. It even works around OS specific quirks in mmap to shoehorn it into this purpose.
IMHO the whole thing is a gigantic mess and needs to be rewritten to use proper threading primitives, but good luck convincing Mozilla of that.
*** Bug 280580 has been marked as a duplicate of this bug. ***
(In reply to comment #0)
> "paxctl -r /usr/lib64/mozilla-firefox/firefox" (turning off RANDMMAP)
> makes firefox happy, but turning off security features against code
> injection is a very bad idea, especially for web browsers!
(In reply to comment #8)
> This bug has been present since the 3.1 beta's, ever since the native memory
> allocator was introduced. I filed a bug 8 months ago:
> The fix is nontrivial since the allocator relies on these flawed assumptions
> about mmap behavior to make its threading model work. It even works around OS
> specific quirks in mmap to shoehorn it into this purpose.
> IMHO the whole thing is a gigantic mess and needs to be rewritten to use proper
> threading primitives, but good luck convincing mozilla of that.
So, IIUC, the obvious two choices are to deactivate randmmap in 3.5.2 and rely on residual kernel-level memory protection, or to keep it functioning and downgrade to 3.0.13 (which is not yet in the repository).
(I apologize in advance for the impossible question.) Has anyone guesstimated the "amount" and nature of the memory protection lost by deactivating randmmap, and whether the remaining memory protection in a hardened system is "pretty good" (with FF running in a chroot jail)? (IIUC, the primary reason for upgrading is the speed improvements in 3.5.2, and both 3.5.2 and 3.0.13 are still maintained by Moz.)
(In reply to comment #10)
Just my personal opinion:
I would not worry too much about randmmap being turned off
*if* (and only if) it is guaranteed that firefox runs with mprotect enabled:
as long as code cannot be executed at all in writable mappings,
making code injection harder by varying the addresses of those mappings
does not add much safety.
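
To make the "no execution in writable mappings" point concrete, here is a small probe, written as a sketch. The function name is hypothetical. On a kernel that enforces such a policy (PaX MPROTECT being the case discussed here) the mprotect call is expected to be refused, while a stock kernel normally allows it; the probe itself never executes anything from the page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical probe: map one page read/write, then ask the kernel to make
 * that same page read/execute. Returns 1 if the kernel allows the switch,
 * 0 if it refuses, and -1 if the initial mapping could not be created. */
static int writable_page_can_become_executable(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* This is the step an injected payload would need to succeed. */
    int allowed = (mprotect(p, (size_t)page, PROT_READ | PROT_EXEC) == 0);

    munmap(p, (size_t)page);
    return allowed;
}
```

A result of 0 means data written into the page could not be turned into runnable code this way, which is why address randomization adds comparatively little on top.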
However, it is just a question of time until mprotect must be turned off
for firefox, and then randmmap could make the big difference against an exploit.
with mprotect, because it is their fundamental principle to execute
dynamically generated code.
And running firefox in a chroot jail takes extra effort and reduces convenience
(what percentage of firefox users do so?),
whereas mprotect and randmmap come for free from the user's point of view
(if they worked).
Running thunderbird mail in a chroot jail would be even more annoying w.r.t.
(In reply to comment #11)
> Just my personal opinion:
Thank you for replying to a horrible question.
> However, it is just a question of time until mprotect must be turned off
> for firefox, and then randmmap could make the big difference against an
> with mprotect, because it is their fundamental principle to execute
> dynamically generated code.
Would one be able to simply turn mprotect off for the plugin?
> And running firefox in a chroot jail adds efforts and reduces comfort
> (how many percent of the firefox users do so?),
> whereas mprotect and randmmap come for free from the user's point of view
> (if they would work).
> Running thunderbird mail in a chroot jail would be even more annoying w.r.t.
Yes, it can be annoying; I have them each jailed in separate jails, and therefore need to copy links in TB and paste them into FF. But that has actually seemed easier than trying to keep RBAC current (I'm always changing something).
OTOH, given the loss of both randmmap and mprotect, I'd guess that RBAC will become much more compelling for hardened users - though I suppose that neither RBAC nor jailing would protect against a Firefox memory exploit that quietly logged passwords :-( .
Are other OSs (e.g. BSDs; Solaris) having indigestion with 3.5.x? If they are, could there be a united appeal to Moz to "fix fox"?
Please test the changes as soon as possible. If all works fine I will get it committed to the tree.
Hi Jory, still doesn't work with hardened defaults on x86/gcc-3.4.6. The endless mmap2/munmap loop is gone but it segfaults on startup. Recompiled xulrunner and mozilla-firefox with gcc-3.4.6-vanilla profile and it works. I'll figure out what part of gcc-3.4.6-hardened it doesn't like and report back.
Still needs test on:
amd64/gcc-3.4.6 - can do
amd64/>=gcc-4.3.x+ssp - someone else plz, hardened-experimental overlay will do fine for testing.
x86/>=gcc-4.3.x+ssp - can do
Since the mozilla team is in a hurry to stabilize =www-client/mozilla-firefox-3.5.2, will you mask the related packages on hardened for now before proceeding? We'll get it sorted asap and then remove the masks. The profiles needing the masks are at hardened/package.mask and hardened/linux/package.mask.
Keywording or USE with "hardened" is the wrong approach:
"hardened" means that one uses a hardened gcc toolchain and PIE executables.
It has nothing to do with using the hardened-sources kernel with PaX
address space protection and randomization.
I don't have USE="hardened", because I use a standard userland
with plain gcc toolchain and standard executables.
Nevertheless, I have a hardened-sources kernel with most PaX features
(among them RANDMMAP) turned on.
Should all be working now on hardened, with jemalloc disabled however.
x86/gcc-3.4.6 also required -fstack-protector disabled on xulrunner entirely, rather than just CXXFLAGS as before. Additionally, sync'd up =net-libs/xulrunner-18.104.22.168 and =www-client/mozilla-firefox-3.5.2 with overlay differences as requested.
amd64/gcc-3.4.6 not tested, should work now too. Might be able to add back -fstack-protector for C and only disable for C++ but I doubt it (likely have the same problem as x86/gcc-3.4.6). Will confirm though.
>=gcc-4.3.x SSP is not so problematic typically, probably won't need -fstack-protector disabled for either x86 or amd64. Both should work as well. I still plan to test/confirm for x86/>=gcc-4.3. If someone wants to test/confirm for amd64/>=gcc-4.3 on hardened and report back here it would be appreciated.
Thanks Jory for the workaround until this can be resolved better. Patches welcome.
(In reply to comment #15)
> Keywording or USE with "hardened" is the wrong approach:
> "hardened" means that one uses a hardened gcc toolchain and PIE executables.
> It has nothing to do with using the hardened-sources kernel with PaX
> address space protection and randomization.
Unfortunately it's the best we've got at the moment, as there's a rush to stabilize firefox-3.5.2. hardened is intended to be used as toolchain + kernel. If you're doing something different you can adjust your local flags via /etc/portage/package.use.
But yeah, this fact had crossed my mind. I'm thinking it would be nice to make some additions to linux-info.eclass that allow toggling a USE flag on/off depending on config options, rather than just checking whether a particular config option exists or not.
Even better in this case would be a patch to make jemalloc compatible with PaX RANDMMAP. :)
Also seamonkey-2.0 in the overlay will likely need similar treatment, as it appears to use jemalloc too. I'm getting an unrelated error building it, though, which I need to figure out before I can test.
THANK YOU THANK YOU All for the attention and quick action!
FWIW, a couple of newbie thoughts:
1. Please come out with a different ebuild number. The welcome changes you've made are big, and need to be signaled with something like www-client/mozilla-firefox-22.214.171.124. I caught it only because of an email from Bugzilla. I understand that these changes won't affect non-hardened users - the majority of Gentoo - so perhaps a new ebuild number and an additional little note within portage could explain that it affects hardened users only. FWICT, gcc does this - though not as informatively as it could.
2. Please feed this issue up the chain to Mozilla. FWICT, they take a lot of pride in their BSD-originated jemalloc tool, and may entertain changes at this early point - especially if other hardened distributions and OpenSolaris coordinated a bit in expressing concern about this common issue.
(In reply to comment #18)
> 1. Please come out with a different ebuild number. The welcome changes you've
> made are big; and need to be signaled with something like
> www-client/mozilla-firefox-126.96.36.199. I caught it only because of an email from
> Bugzilla. I understanding that these changes won't affect non-hardened users -
> the majority of Gentoo - so perhaps a new ebuild number and an additional
> little note within portage could explain that it affects hardened users only.
> FWICT, gcc does this - though not as informatively as it could be.
This only affects hardened; if it were to affect a larger number of people, such as a whole arch like amd64, it would warrant a revision bump. There is no need to force all users to rebuild a package when the problem only affects hardened users.
I understand it's not ideal but have to agree with Jory: most hardened users are stable users, not ~arch, and so will never hit this. For those who have hit this, it's quite obvious firefox doesn't work, so they can find info here on bugzilla. All hardened ~arch users should be pretty familiar with bugzilla by now. ;)
(In reply to comment #14)
> Still needs test on:
> amd64/>=gcc-4.3.x+ssp - someone else plz, hardened-experimental overlay will do
just tested amd64/gcc-4.4.1+ssp (from overlay), seems to work fine
Created attachment 200840 [details, diff]
fix jemalloc vs. ASLR
guys, can you please try this patch instead (and handle upstream submission as usual, i don't have time for fighting this)?
Hmmm, just from a quick look at the patch (I did not try it):
1.) You are dropping the #ifdef conditions MAP_ALIGN and JEMALLOC_NEVER_USES_MAP_ALIGN in the Solaris case.
I assume they are there for a reason?
2.) In the MALLOC_PAGEFILE case (which is true for Linux I think),
your mmap(ret, ... MAP_FIXED, pfd, 0) lacks the "assert(ret != NULL)" check.
3.) You are not checking the return code of the two munmap calls.
4.) If "size" is not a multiple of the pagesize (no idea if this ever happens),
your second munmap call has an invalid first argument (not page-aligned).
Otherwise, the changes look reasonable to me.
(In reply to comment #22)
> Created an attachment (id=200840) 
> fix jemalloc vs. ASLR
> guys, can you please try this patch instead (and handle upstream submission as
> usual, i don't have time for fighting this)?
i applied this patch to both xulrunner+mozilla-firefox, dropped the hardened useflag for them and can start firefox now without touching paxctl.
This is with amd64/gcc-4.4.1 from overlay
Thank you PaX Team. Less than 23 hours after adding you to CC you provide a patch. :)
Will try the patch as soon as I can.
(In reply to comment #23)
> Hmmm, just from a quick look at the patch (I did not try it):
> 1.) You are dropping the #ifdef conditions MAP_ALIGN and
it's no longer needed, i emulate its behaviour by using mmap and munmap to chop off the unaligned/unneeded parts.
> JEMALLOC_NEVER_USES_MAP_ALIGN in the Solaris case.
> I assume they are there for a reason?
that define is nowhere to be found in the tree, so i assume it's some leftover debugging code, but upstream can reinstate it if they so wish, it's pretty pointless otherwise.
> 2.) In the MALLOC_PAGEFILE case (which is true for Linux I think),
> your mmap(ret, ... MAP_FIXED, pfd, 0) lacks the "assert(ret != NULL)" check.
it is implicitly checked by the anon mmap case. if that returns NULL (a silly test btw, since NULL is a valid address to mmap in general and it won't save anyone from NULL derefs since such bugs can go well beyond the first page, but i digress) then the assert will trigger, otherwise the 2nd mmap cannot go at NULL either due to the MAP_FIXED.
> 3.) You are not checking the return code of the two munmap.
it's on purpose, it's pointless to check. if you look at the function below the one i patched, it simply aborts in that case or returns NULL. neither makes much sense to me. munmap can only fail when the kernel fails to allocate memory for the new vma structures (we're splitting the first mmap into 2 or 3 parts), in which case you will have bigger issues at hand than jemalloc failing at the munmap (and since the first mmap succeeded, technically the allocation succeeded, it's just that there'll be some virtual address space wasted). again, it's something upstream can tighten up for generic use, i just wanted to get the generic case to work.
> 4.) If "size" is not a multiple of the pagesize (no idea if this ever happens),
> your second munmap call has an invalid first argument (not page-aligned).
as far as i checked the callers, size is always a multiple of some chunk size variable, iirc, 1 or 2MB at least. but if upstream or anyone knows better, feel free to align it.
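
The technique described in the reply to point 1 (map more than needed, then munmap the unaligned head and the unused tail) can be sketched as follows. This is not the attached patch: map_aligned is a hypothetical name, the rounding of size up to a whole page is added here only to cover point 4, and the munmap return values are ignored for brevity, which stricter code may not want.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical sketch: return `size` bytes (rounded up to whole pages)
 * whose start address is a multiple of `alignment`, wherever the kernel
 * chooses to put them. Returns NULL on bad arguments or mmap failure. */
static void *map_aligned(size_t size, size_t alignment)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0)
        return NULL;
    size_t page = (size_t)ps;

    /* alignment must be a power of two and at least one page */
    if (alignment < page || (alignment & (alignment - 1)) != 0)
        return NULL;
    if (size == 0 || size > SIZE_MAX - alignment - page)
        return NULL;

    /* keep both trim points page-aligned, as munmap requires */
    size = (size + page - 1) & ~(page - 1);
    size_t span = size + alignment;

    /* over-allocate: no address is requested, so randomization is harmless */
    char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t base = (uintptr_t)raw;
    uintptr_t aligned = (base + alignment - 1) & ~((uintptr_t)alignment - 1);
    size_t head = (size_t)(aligned - base);   /* unaligned part in front */
    size_t tail = span - head - size;         /* unused part behind */

    if (head != 0)
        munmap(raw, head);
    if (tail != 0)
        munmap(raw + head + size, tail);

    return raw + head;
}
```

Because the address is derived from whatever the kernel returned rather than demanded up front, there is nothing to retry.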
(In reply to comment #24)
> i applied this patch to both xulrunner+mozilla-firefox, dropped the hardened
> useflag for them and can start firefox now without touching paxctl.
Many thanks to the PAX team! The aslr patch together with
--disable-jit has made firefox usable again on my
hardened system without any paxctl intervention (I don't use
flash so that's all I really need).
Patched xulrunner and firefox ebuilds are in the tree. Leaving open to track upstream report and possibly looking into issues with gcc-3.4.6 and js JIT.
Hmmm I see a new flag with xulrunner and mozilla-firefox: (-hardened%*)
ISTM that I would want (hardened%*) .
So I ran an emerge --info and got a bunch of stuff, including hardened and pic; but emerge still gives me -hardened.
So I added hardened to my use flags, and mozilla-firefox still wants to emerge with -hardened.
(BTW, last night I effected the layman -o http://github.com/Xake/toolchain-overlay.git/xake-toolchain.xml -fa xake-toolchain procedure)
1. Am I correct in presuming that the flag should be hardened, and not -hardened?
2. Any ideas on why my make.conf use flags are not taking effect?
Thanks in advance
The 'hardened' USE flag is gone from those ebuilds as we no longer need it.
err FWICT, it is still there:
Calculating dependencies... done!
o.k. I get it. that indicates a change, not a new flag
The ebuild now turns off MPROTECT for firefox, implicitly and unconditionally.
That's not what we want in a security-aware environment.
and to leave MPROTECT enabled (works fine as far as I can tell).
One loses the java plugin and 70% of all flash, but I don't need them anyway.
(In reply to comment #34)
> The ebuild now turns off MPROTECT for firefox, implicitely and unconditionally.
> That's not what we want in a security-aware environment.
> and to leave MPROTECT enabled (works fine as far as I can tell).
> One looses the java plugin and 70 % of all flash, but I don't need them anyway.
If so, then FF could be compiled with jemalloc off - the earlier workaround that allowed MPROTECT, and was fully satisfactory on my box - and those wanting it could use gnash as an alternative for flash (presuming gnash will not be killed by MPROTECT).
MPROTECT must be retained, IMHO.
As far as I can tell, firefox is fine with mprotect,
even with jemalloc (at least with the fix from the PaX team mentioned above);
turning mprotect off is needed only for the Java plugin and for the Flash plugin.
And for Flash:
* If necessary, I can go without it, at least on security-sensitive machines
(anyway, there was no flash for pure amd64 environments like mine for years...).
* Depending on the actual flash contents, some flash sites even work
with Adobe Flash and mprotect turned on (Youtube for example I think).
* There are free alternatives to Adobe Flash which should not have problems with mprotect.
Calm down gents, we're aware of the mprotect vs. jit thing. MPROTECT off with jit enabled is only temporary.
With 3.5.2 I need to disable MPROTECT to keep firefox from crashing at startup. With 3.5 this was only necessary for flash.
(In reply to comment #38)
> With 3.5.2 I need to disable MPROTECT to avoid that firefox crashes when
> starting. With 3.5 this was only necessary for flash.
have you got any logs? is it text relocations in a library? or is it a PaX kill? in any case, open a new bug please (and CC me and hardened) because this one is about randomization triggering a jemalloc bug.
Could you please cross-reference the number of the new bug here?
Mozilla team is out. Soon we shall have a standalone libjemalloc, which has addressed the issue.
Current versions are fine.