| Summary: | =gnome-base/gnome-shell-3.6.2-r1 segfaults with dev-libs/libffi-3.0.12[pax_kernel] | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Charles Svitlik <staticsunn> |
| Component: | [OLD] GNOME | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
| Status: | RESOLVED DUPLICATE | ||
| Severity: | normal | CC: | gnome, hardened |
| Priority: | Normal | ||
| Version: | unspecified | ||
| Hardware: | AMD64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
| Attachments: | Log of abrt report @0 from amd64 system; abrt log from intel system |
||
|
Description
Charles Svitlik
2013-02-13 19:16:44 UTC
> If I paxctl -pemrxs /usr/bin/gnome-shell it segfaults anyways.

Does the dmesg message change in this case?

> List of packages updated: http://pastebin.com/3R0ni5h6

Please attach files to the bug report as attachments, not in pastebin. (First, pastebins are automatically deleted after some time, but we sometimes need to refer to a bug report months later. Second, pastebin.com specifically is blocked by many company firewalls.)

In any case, we need a gdb backtrace to diagnose the crash. Please
1. re-emerge dev-libs/glib, spidermonkey, gjs, cogl, clutter, mutter, and gnome-shell with -ggdb in CFLAGS and splitdebug in FEATURES (see http://www.gentoo.org/proj/en/qa/backtraces.xml for more information);
2. install app-admin/abrt, do /etc/init.d/abrt start
3. make gnome-shell crash
4. obtain the backtrace from abrt (you can use abrt-gui or abrt-cli), and add it to this bug report as attachment.

Created attachment 338816 [details]
Log of abrt report @0 from amd64 system
OK.
I guess I should also mention that I have two systems - one is Intel and one is AMD64. When I filed this bug report I filed it from my Intel system, but marked the Architecture field as AMD64. I am having this problem on both systems, both of which are PaX kernels.
This abrt log is from my AMD64 system.
Also, I rebuilt the Intel kernel with grsec/pax disabled, and am still having this issue. I will attach abrt output from the Intel system shortly.
Created attachment 338822 [details]
abrt log from intel system
This one seems like it could be more helpful...
The problem may be here: /usr/lib64/libffi.so.6.0.1

(In reply to comment #3)
> Created attachment 338822 [details]
> abrt log from intel system
>
> This one seems like it could be more helpful...

Thanks. This one contains enough detail, but it presents a scenario that simply makes no sense.

gnome-shell, via gjs, spidermonkey, and libffi, calls clutter_actor_add_constraint(), which calls _clutter_meta_group_add_meta() to append the new constraint to the actor's constraint list. After doing the appending, that function in turn calls _clutter_actor_meta_set_actor() to point the new constraint's actor field to our actor. So far so good.

Now, _clutter_actor_meta_set_actor looks like this:

    void
    _clutter_actor_meta_set_actor (ClutterActorMeta *meta,
                                   ClutterActor     *actor)
    {
      g_return_if_fail (CLUTTER_IS_ACTOR_META (meta));
      g_return_if_fail (actor == NULL || CLUTTER_IS_ACTOR (actor));

      CLUTTER_ACTOR_META_GET_CLASS (meta)->set_actor (meta, actor);
    }

We pass the first two lines, which means "meta" is a valid constraint pointer, and "actor" is a valid actor pointer. Which means CLUTTER_ACTOR_META_GET_CLASS (meta)->set_actor (meta, actor) simply cannot fail.

But the next function call is to address 0x00007ff0f360a010, which (as you may notice) is not in libclutter's memory mapping range at all. In fact, 0x00007ff0f360a010 is the address of the address of some class (I am guessing a wrapper for manipulating the actor via javascript) in libmozjs in spidermonkey!

In other words, at some point, "meta"'s set_actor field got changed from a valid function pointer to a pointer to a spidermonkey javascript class. And since a javascript class is not a C function, this results in a crash.

There is a memory access bug somewhere, but I have no idea where.
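For anyone repeating the address check described above: a minimal Python sketch of looking up which mapping a suspicious call target falls into, given text in the /proc/<pid>/maps format. The helper names (Mapping, parse_maps, find_mapping) and every line of SAMPLE_MAPS are made up for illustration, including the library file names, address ranges and permissions. Only the address 0x00007ff0f360a010 comes from this report; the real ranges have to be taken from the abrt log or the live process.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int   # inclusive
    end: int     # exclusive, as in /proc/<pid>/maps
    perms: str   # e.g. "r-xp"
    path: str    # empty for anonymous mappings


def parse_maps(text: str) -> List[Mapping]:
    """Parse text in /proc/<pid>/maps format into Mapping records."""
    mappings = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_hex, end_hex = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(
            Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)
        )
    return mappings


def find_mapping(mappings: List[Mapping], addr: int) -> Optional[Mapping]:
    """Return the mapping that contains addr, or None if it is unmapped."""
    for m in mappings:
        if m.start <= addr < m.end:
            return m
    return None


# Hypothetical sample data: NOT taken from the attached abrt logs.
SAMPLE_MAPS = """\
7ff0f2a00000-7ff0f2c00000 r-xp 00000000 08:03 1234 /usr/lib64/libclutter.so
7ff0f3400000-7ff0f3600000 r-xp 00000000 08:03 5678 /usr/lib64/libmozjs.so
7ff0f3600000-7ff0f3610000 rw-p 00200000 08:03 5678 /usr/lib64/libmozjs.so
"""

if __name__ == "__main__":
    target = 0x00007FF0F360A010  # call target reported in this bug
    hit = find_mapping(parse_maps(SAMPLE_MAPS), target)
    if hit is None:
        print(hex(target), "is not mapped")
    else:
        executable = "x" in hit.perms
        print(hex(target), "is in", hit.path, hit.perms, "executable:", executable)
```

A call target that lands in another library's mapping, or in a mapping without the x permission, fits the picture of a function pointer that was overwritten with a pointer to data.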
Could be libffi, could be spidermonkey, could be gjs, could be clutter :/

(In reply to comment #5)
Corrections:

> Which means
> CLUTTER_ACTOR_META_GET_CLASS (meta)->set_actor (meta, actor) simply cannot
> fail

as long as set_actor points to the right place

> 0x00007ff0f360a010 is the address of the address of some class

should be: is the address of a structure representing some javascript class.

I think it's libffi. I downgraded to libffi-3.0.11 on both systems and re-merged all the packages you mentioned (clutter, mutter, glib, ...) as well as updated to the latest unstable video card drivers (mesa-9999, llvm-9999, libdrm-9999, and xf86-video-intel-9999 on the intel system, and xf86-video-radeon-9999 on the amd64 system), and am able to log in, everything is back to normal.

While chatting in #gentoo-hardened, someone mentioned something about PaX markings, then I noticed that libffi-3.0.11 has USE=(-pax_kernel) and libffi-3.0.12 has USE=pax_kernel. Maybe they're connected? Maybe not? I don't know either, sorry!

Anyways, I'm able to use my system as normal again. I guess you can close this bug? I'm more than willing to help figure out what's up, if anything. Thanks for all your help.

Assigning to libffi maintainers.

*** This bug has been marked as a duplicate of bug 457194 *** |