Summary: | redis-cli 2.6.7 crashes with segmentation fault | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Thomas Stein <himbeere> |
Component: | Current packages | Assignee: | Dirkjan Ochtman (RETIRED) <djc> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bug, bugs, lu_zero, robbat2 |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | 453606 | ||
Bug Blocks: | |||
Attachments: | build.log |
Description
Thomas Stein
2013-01-10 11:16:14 UTC
Confirm. Also it fails only with jemalloc. Sorry for my late response. I somehow missed this. Questions to both: 1. What version of jemalloc do you run? 2. Did you just update this? 3. Did you rebuild redis afterwards? Also, please try rebuilding redis. The jemalloc 3.2.0 ebuild did a change that would result in something similar to redis. Hello.
Just updated to 2.6.8. Problem remains.
To your questions:
> 1. What version of jemalloc do you run?
[ebuild R ~] dev-libs/jemalloc-3.2.0 USE="-debug -static-libs -stats"
2. Did you just update this?
No. I suppose a while back. But e redid it.
3. Did you rebuild redis afterwards?
Yes.
best regards
t.
(In reply to comment #3) > Hello. > > Just updated to 2.6.8. Problem remains. > > To your questions: > > > 1. What version of jemalloc do you run? > > [ebuild R ~] dev-libs/jemalloc-3.2.0 USE="-debug -static-libs -stats" > > 2. Did you just update this? > > No. I suppose a while back. But e redid it. > > 3. Did you rebuild redis afterwards? > > Yes. > > best regards > t. thanks, I think I managed to reproduce this. Will look into it tomorrow. Here's output from debug builds: Program received signal SIGSEGV, Segmentation fault. 0x00007ffff7bbc8c9 in arena_mapbits_get () from /usr/lib64/libjemalloc.so.1 (gdb) bt #0 0x00007ffff7bbc8c9 in arena_mapbits_get () from /usr/lib64/libjemalloc.so.1 #1 0x00007ffff7bbcb09 in arena_mapbits_allocated_get () from /usr/lib64/libjemalloc.so.1 #2 0x00007ffff7bbf662 in arena_salloc () from /usr/lib64/libjemalloc.so.1 #3 0x00007ffff7bb85d7 in free () from /usr/lib64/libjemalloc.so.1 #4 0x000000000040cfbb in linenoise () #5 0x000000000040474b in main () The problem is essentially that it needs access to the internal jemalloc headers. Why these aren't installed, I'm not so sure about - but I'll keep investigating. If I rebuild jemalloc with --with-jemalloc-prefix=je_ everything works as intended. This brings us back to the jemalloc ebuild removing this for 3.2.0.ebuild. I _think_ its a naming conflict, since the default prefix for jemalloc functions are je_malloc,free etc. A couple of options 1. Start bashing at the jemalloc ebuild again 2. Use a static jemalloc (as upstream does) 3. Dive deeper into the issue (what I'm doing now) I don't want this bug open anymore, so I opted for #2. Please test ebuild from my overlay: https://github.com/jbergstroem/gentoo-overlay/compare/master...redis-jemalloc-static (note: this also version bumps redis to 2.6.9). I would really like to avoid using a static jemalloc, but lets solve that later. Hello. Unfortunately it still does not work: [ebuild R ~] dev-db/redis-2.6.9::jbergstroem USE="jemalloc -tcmalloc {-test}" 0 kB Total: 1 package (1 reinstall), Size of downloads: 0 kB hostname /usr/local # ls -l /usr/bin/redis-cli -rwxr-xr-x 1 root root 94528 Jan 21 10:44 /usr/bin/redis-cli hostname /usr/local # redis-cli redis 127.0.0.1:6379> set 1Segmentation fault hostname /usr/local # But thanks for your work so far. t. Uahh. I had to set -jemalloc. Now it works. redis 127.0.0.1:6379> help redis-cli 2.6.9 Type: "help @<group>" to get a list of commands in <group> "help <command>" for help on <command> "help <tab>" to get a list of possible help topics "quit" to exit redis 127.0.0.1:6379> set 1 (error) ERR wrong number of arguments for 'set' command redis 127.0.0.1:6379> cheers t. Created attachment 336390 [details] build.log (In reply to comment #9) > Uahh. I had to set -jemalloc. Now it works. > > redis 127.0.0.1:6379> help > redis-cli 2.6.9 > Type: "help @<group>" to get a list of commands in <group> > "help <command>" for help on <command> > "help <tab>" to get a list of possible help topics > "quit" to exit > redis 127.0.0.1:6379> set 1 > (error) ERR wrong number of arguments for 'set' command > redis 127.0.0.1:6379> > > cheers > t. I can't reproduce this. If you use USE=-jemalloc, you're not using jemalloc, but the libc one. Did you perhaps forget to checkout the branch? Attached results are built with jemalloc 3.2.0 from gentoo-x86 (irrelevant, but anyway), using the redis-jemalloc-static branch from my repo (commit edb15599743542c2e385459c0ae687e52fb399b4): g2-x86_64 /home/jbergstroem # emerge -av redis These are the packages that would be merged, in order: Calculating dependencies... done! [ebuild R ~] dev-db/redis-2.6.9::jbergstroem USE="jemalloc -tcmalloc {-test}" 0 kB Total: 1 package (1 reinstall), Size of downloads: 0 kB Would you like to merge these packages? [Yes/No] <see attached log> g2-x86_64 /home/jbergstroem # redis-cli info # Server redis_version:2.6.8 ... >Did you perhaps forget to checkout the branch? I just used your ebuild. >redis_version:2.6.8 Why 2.6.8? Shouldn't this be 2.6.9? best regards t. (In reply to comment #11) > >Did you perhaps forget to checkout the branch? > > I just used your ebuild. Yes, but as I showed earlier, I did all the commits for this change in a specific branch. Please check this out since you otherwise will be running the same ebuild as in tree. Exact steps: git clone/layman git://github.com/jbergstroem/gentoo-overlay.git cd gentoo-overlay && git checkout redis-jemalloc-static emerge redis > > >redis_version:2.6.8 > > Why 2.6.8? Shouldn't this be 2.6.9? I just upgraded without restarting server. Not really relevant since the issue here is -cli. > > best regards > t. Let me know how you go. I just hit this and I figured out what's happening. So the backtrace: #3 0x00007ffff7bb85d7 in free () from /usr/lib64/libjemalloc.so.1 #4 0x000000000040cfbb in linenoise () #5 0x000000000040474b in main () linenoise is where the issue is. In particular: AM_CFLAGS="-std=c99 -pedantic -Wall -W -D__EXTENSIONS__ -D_XPG6" http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/dev-db/redis/files/configure.ac-2.2?view=markup So that means linenoise gets compiled with -std=c99. Which noting this line as part of linenoiseHistoryAdd: linecopy = strdup(line); Which means this happens: "gcc enables the appropriate macro(s) by default, but using -ansi or -std=c99 will disable them" ... "In the absence of a visible declaration, the compiler assumes (under C90 rules) that strdup() returns int. This results in undefined behavior." http://stackoverflow.com/questions/8359966/strdup-returning-address-out-of-bounds Thus leading to the fun segfault we notice. (In reply to comment #13) > I just hit this and I figured out what's happening. So the backtrace: > > #3 0x00007ffff7bb85d7 in free () from /usr/lib64/libjemalloc.so.1 > #4 0x000000000040cfbb in linenoise () > #5 0x000000000040474b in main () > > linenoise is where the issue is. In particular: > > AM_CFLAGS="-std=c99 -pedantic -Wall -W -D__EXTENSIONS__ -D_XPG6" > > http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/dev-db/redis/files/ > configure.ac-2.2?view=markup > > So that means linenoise gets compiled with -std=c99. Which noting this line > as part of linenoiseHistoryAdd: > > linecopy = strdup(line); > > Which means this happens: > > "gcc enables the appropriate macro(s) by default, but using -ansi or > -std=c99 will disable them" > ... > "In the absence of a visible declaration, the compiler assumes (under C90 > rules) that strdup() returns int. This results in undefined behavior." > > http://stackoverflow.com/questions/8359966/strdup-returning-address-out-of- > bounds > > Thus leading to the fun segfault we notice. This is very interesting. Thanks for your work! What I don't quite understand is why this only happens when we compile jemalloc shared using JEMALLOC_NO_DEMANGLE and not with --with-jemalloc-prefix. If you've read through the thread, you've probably picked up that upstream chooses same prefixing as default names. I still don't get how that is tied to compiling linenoise with -std=c99. Also there seems to be some parts of the redis codebase that need -std=c99 so taking it out completely won't be a good idea imho. One solution would be to remove it from the generated Makefile.in. Also addition of '-D_GNU_SOURCE' seems to help with that strdup issue at least according to here: http://gcc.gnu.org/ml/gcc-help/2011-10/msg00117.html Though I'm not sure what effect that would have if enabled globally. > This is very interesting. Thanks for your work! What I don't quite
> understand is why this only happens when we compile jemalloc shared using
> JEMALLOC_NO_DEMANGLE and not with --with-jemalloc-prefix.
>
> If you've read through the thread, you've probably picked up that upstream
> chooses same prefixing as default names. I still don't get how that is tied
> to compiling linenoise with -std=c99.
Either that jemalloc thing is another issue, or it's causing free() to be a lot more strict. Either way if you notice this linenoise code:
case 13: /* enter */
history_len--;
free(history[history_len]);
return (int)len;
case 3: /* ctrl-c */
When you hit ENTER (which is what most people are going to do) that free() happens which if you see here:
linecopy = strdup(line);
if (!linecopy) return 0;
if (history_len == history_max_len) {
free(history[0]);
memmove(history,history+1,sizeof(char*)*(history_max_len-1));
history_len--;
}
history[history_len] = linecopy;
That history array is going to get populated with unknown data because you're compiling with -std=c99 which means that:
"gcc enables the appropriate macro(s) by default, but using -ansi or -std=c99 will disable them"
strdup's macro will not be visible and:
"In the absence of a visible declaration, the compiler assumes (under C90 rules) that strdup() returns int. This results in undefined behavior."
strdup has undefined behavior so you're going to be trying to free garbage basically. Also note this doesn't happen in 2.4 series because the linenoise Makefile doesn't contain CFLAGS so this sed doesn't happen:
for MKF in $(find -name 'Makefile' | cut -b 3-); do
mv "${MKF}" "${MKF}.in"
sed -i -e 's:$(CC):@CC@:g' \
-e 's:$(CFLAGS):@AM_CFLAGS@:g' \
-e 's: $(DEBUG)::g' \
-e 's:$(OBJARCH)::g' \
-e 's:ARCH:TARCH:g' \
-e '/^CCOPT=/s:$: $(LDFLAGS):g' \
"${MKF}.in" \
|| die "Sed failed for ${MKF}"
in 2.6 series it does contain CFLAGS so the -std=c99 replacement doesn't occur.
Without diving to deep into jemalloc (which I'm actually looking forward to, with my friend Jim Bean), here's an updated branch that removes -std=c99 for linenoise (and a cosmetic one for Lua): https://github.com/jbergstroem/gentoo-overlay/compare/redis-linenoise-c99 Please let me know how it goes! (big thanks to Chris for looking into this) See "final" ebuild in bug 453606. Fixed in 2.6.9. |