Emacs segfaults on startup when merged through the sandbox. I've just spent about 4 hours today and 2 hours yesterday trying to figure out why emacs segfaults when started. I rebuilt gcc/glibc/binutils/.../, but then I remembered a problem gbevin was having with xemacs when merged through the sandbox. So I disabled the sandbox, remerged emacs, and voila!
Sadly enough, I've been aware of this problem, but with xemacs. Azarah and I have been trying to work around it, but I can't seem to find a solution, nor why it happens. My wild guess is that something might be going wrong during the compilation of the lisp files. Could you maybe try to selectively disable/enable the sandbox during the compilation to pinpoint the exact cause of failure? I have isolated the code that creates this segfaulting emacs in a separate file, which I'll attach. Anyone, feel free to take a look at it.
#define _GNU_SOURCE
#define _REENTRANT
#define open xxx_open
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#undef open

extern int open(const char*, int, mode_t);
int (*orig_open)(const char*, int, mode_t) = NULL;

int open(const char* pathname, int flags, mode_t mode)
{
    int old_errno = errno;

    /* code that makes xemacs' compilation produce a segfaulting executable */
    char** test = NULL;
    test = (char**)malloc(sizeof(char*));
    free(test);
    /* end of that code */

    if (!orig_open) {
        orig_open = dlsym(RTLD_NEXT, "open");
    }

    errno = old_errno;
    return orig_open(pathname, flags, mode);
}
Fixed this by adding SANDBOX_DISABLED to the ebuild.
I don't think this bug should be marked FIXED yet, since we didn't find the exact cause of the problem. Putting SANDBOX_DISABLED in the ebuild is just a patchy workaround. The real issue should be tracked down.
ok
Doesn't seem to be resolved. :((( On cvs.gentoo.org it segfaults even when SANDBOX_DISABLED="1" is in the ebuild.
Could you try with sandbox not in the maintainer settings? Then it isn't even called at all and nothing goes through it.
I will ask Kabau to do so. A user claims to have this problem without MAINTAINER set, so it might not be the sandbox :( On my machines, all three worked when I disabled the sandbox. This is very strange!
I'll second the unnamed user's reports of segfaulting without MAINTAINER set. I'm currently getting my system up to speed again (wiped it yesterday before I knew about bugs.gentoo.org) and would be happy to help try and troubleshoot this issue.
This has been happening to people that don't use the sandbox too. Also with the latest versions I haven't had this problem anymore.
I just unmerged and then re-merged the latest version of emacs with SANDBOX_DISABLED="1" in the ebuild, and it still segfaults.
I also reemerged it as Michael did with the exact same result. No solution so far for me
I updated portage from version 2.0.23 to 2.0.25. Now everything works fine again. I can emerge emacs-21.2-r1 without any change to the config file.
I had the same experience: it stopped working, I updated to portage 2.0.25 (from 2.0.24, I believe), and it started merging with the sandbox again.
With portage-2.0.27, I'm seeing this problem again for emacs-21.2-r1, but not for emacs-21.2. The problem is that emerge apparently thinks that 21.2-r1 is a newer version than just 21.2 (it isn't, right?). So maybe this is a portage bug, but I see two solutions:
* Remove emacs-21.2-r1.ebuild (this is what I did, and it worked)
* Put something in the make profile to give the newer version preference
Maybe not everyone has problems with -r1, but shouldn't it be removed anyway?
I may be chasing the wrong lead here, but I figured I'd share this stack trace from the crashing emacs:

#0 0x405a57ba in chunk_alloc (ar_ptr=0x40650e40, nb=32) at malloc.c:2904
#1 0x405a653b in chunk_realloc (ar_ptr=0x40650e40, oldp=0x8337f48, oldsize=12, nb=32) at malloc.c:3514
#2 0x405a61db in __libc_realloc (oldmem=0x40650e40, bytes=1) at malloc.c:3388
#3 0x08159696 in emacs_blocked_realloc (ptr=0x8337f50, size=22) at alloc.c:801
#4 0x405a60fc in __libc_realloc (oldmem=0x40650e40, bytes=1) at malloc.c:3329
#5 0x081592ce in xrealloc (block=0x8337f50, size=22) at alloc.c:544
#6 0x0814c0b8 in regex_compile (pattern=0x825eab8 " +", size=2, syntax=3408388, bufp=0x82da47c) at regex.c:2383
#7 0x08158205 in re_compile_pattern (pattern=0x825eab8 " +", length=2, bufp=0x82da47c) at regex.c:5720
#8 0x08143b9c in compile_pattern_1 (cp=0x82da474, pattern=942009000, translate=1211012816, regp=0x82d18a4, posix=0, multibyte=0) at search.c:166
#9 0x08143d8b in compile_pattern (pattern=942009000, regp=0x82d18a4, translate=1211012816, posix=0, multibyte=0) at search.c:237
...

That malloc.c is inside glibc; alloc.c is in emacs. It wasn't clear from looking at the code where the segfault comes from. I printed some variable values and didn't find any null pointers or anything. If the line number in malloc.c is to be trusted, that's inside a cpp macro doing some pointer dereferencing, but none of the pointers were null.

I also tried capturing a log of the entire emerge of emacs with sandbox on and then with sandbox off, to compare. Some gcc warnings moved around, but the output of each was exactly the same byte length and there wasn't any meaningful difference. A diff of the two emacs binaries shows substantial differences, but reading a hex dump of a binary doesn't teach me anything. More investigation is needed.
This bug has been driving me nuts! I have been able to modify sandbox to avoid the segfaulting emacs, but I haven't been able to determine the exact cause. My suspicion is that it's a combination of how emacs does its own custom malloc things and how it produces the final emacs binary with a coredump-like technique of writing out the current in-memory process.

I can confirm comment #1 that the issue stems from calling malloc. I narrowed it down to the open64 wrapper. If you disable just that wrapper, this bug doesn't happen. If you replace it with a simple malloc/free combo (like in comment #2), the bug happens. I tried defining my own __malloc_hooks in libsandbox in the hope that the problem was related to sandbox using the hooks emacs defines, but that wasn't fruitful.

My workaround was to avoid calling malloc as much as possible inside the syscall wrappers. before_syscall() in libsandbox.c used to parse and allocate strings for each path in a number of environment variables, do some tests, then free all the strings. This function is called by every syscall wrapper, so it gets called a lot. I changed it so that it only does the parse/allocate if the environment variables' values have changed since the last time it was called.

There is something really confusing here. If I replace before_syscall() with a simple function that just calls malloc(), free(), and then returns, the resulting emacs is broken. One would assume then that calling malloc/free here is at fault. However, with my changes in place, before_syscall() calls check_syscall(), and in that function at least one call to malloc always occurs. But even with this malloc call, emacs builds fine. Something really nasty is going on.

I'm not sure if I should commit my changes. They make emacs work, but don't really fix the problem. I did clean up a fair amount of code, so it may be worth committing for that sake.
Created attachment 6347: libsandbox.c.diff. Here are my changes if you'd like to examine them.
OK, to be honest, I was too lazy to go through the whole thing with a fine-toothed comb ... shoot me :P But a quick scroll through looks good :) It being what it is, I think it will be better if you commit it to sandbox-dev (it's there for this purpose) and *not* sandbox-1.1, and then we can give it a bit of testing first. Also have a look at the execve wrapper (and whether we need to add to other execve calls, although I think most that modify env call this one, but I haven't checked that glibc code for some time), as that is pretty much all my code, and I do not know how buffer-overflow-proof it is ... Great work btw =)
I don't need to know when the env var is changed (I don't want to have to trap even more syscalls like setenv()). Instead, I save away the value of getenv() the first time it is parsed, and on each syscall I strcmp() it with the current value to determine whether it needs to be parsed again. This is why you mention execve, right?
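For readers following along, the save-and-strcmp() idea described above can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached libsandbox.c.diff: the names env_cache and env_cache_refresh are hypothetical, and the real parsing of the value into path prefixes is left as a comment.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache for one environment variable (not from libsandbox.c). */
struct env_cache {
    const char *name;   /* variable to watch; the caller picks the name */
    char *last_value;   /* private copy of the value seen at the last parse */
    int parse_count;    /* number of times the value was (re)parsed */
};

/*
 * Returns 1 if the value changed and was re-parsed, 0 if the cached
 * parse is still valid, and -1 if copying the new value failed.
 * The unchanged path performs no allocation at all.
 */
static int env_cache_refresh(struct env_cache *c)
{
    const char *cur = getenv(c->name);

    /* Unchanged: both unset, or both set to identical strings. */
    if (cur == NULL && c->last_value == NULL)
        return 0;
    if (cur != NULL && c->last_value != NULL &&
        strcmp(cur, c->last_value) == 0)
        return 0;

    /* Changed: drop the old copy and remember the new value. */
    free(c->last_value);
    c->last_value = NULL;
    if (cur != NULL) {
        size_t n = strlen(cur) + 1;
        c->last_value = malloc(n);
        if (c->last_value == NULL)
            return -1;
        memcpy(c->last_value, cur, n);
    }

    /* A real wrapper would split the value into path prefixes here. */
    c->parse_count++;
    return 1;
}
```

A wrapper would call env_cache_refresh() once per watched variable at the top of each intercepted call. Since the comparison reads the environment directly, nothing like setenv() needs to be trapped, and malloc is only reached when a value actually differs from the saved copy.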
Nope, I just thought that while you were busy with sandbox, you could check that it is sound (no possible segfaults/overflows/etc.) ...
jared -- solve this
This bug is ancient. Is this still an issue?
i don't think so...
Latest ebuilds don't have this problem. I've tested on emacs-21.4-r1. This could be closed.
Have you tried with the broken-out sandbox versions?