mutt segfaults now after updating to ncurses-6.1, both 6.1-r1 and 6.1-r2 are the same.
$ mutt
Segmentation fault
$ gdb mutt
GNU gdb (Gentoo 7.12.1 vanilla) 7.12.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from mutt...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/mutt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007f08e04855fd in _nc_setupscreen_sp () from /lib64/libncursesw.so.6
(gdb) bt
#0  0x00007f08e04855fd in _nc_setupscreen_sp () from /lib64/libncursesw.so.6
#1  0x00007f08e04800de in newterm_sp () from /lib64/libncursesw.so.6
#2  0x00007f08e04805db in newterm () from /lib64/libncursesw.so.6
#3  0x00007f08e047b47a in initscr () from /lib64/libncursesw.so.6
#4  0x000056373536ef91 in ?? ()
#5  0x00007f08dee455ad in __libc_start_main () from /lib64/libc.so.6
#6  0x000056373536feea in ?? ()
(gdb) quit
A debugging session is active.
Inferior 1 [process 59341] will be killed.
Quit anyway? (y or n) y
ok, I've been running 6.0 for a while now, so this is a new "feature" from 6.1
I think this is because Mutt sets up its signal handlers before initscr (a workaround dating from ncurses-4.2), whereas the ncurses manpage nowadays documents that signal handlers need to be set up _after_ initscr.
(In reply to Jason Zaman from comment #0)
> both 6.1-r1 and 6.1-r2 are the same.
-r2 simply fixes a problem with x11-terms/st.
what version of ncurses did you compile mutt against? compiled against 6.0, mutt runs fine with 6.1 here.
Recompiling against 6.1 also results in no crash for me.
I see an identical crash in mutt with ncurses-6.1. This occurs both when mutt is compiled against ncurses-6.0, and against ncurses-6.1. Downgrading to ncurses-6.0 results in no crash.
My ncurses is compiled with these use-flags:
sys-libs/ncurses-6.1-r2::gentoo was built with the following:
USE="cxx threads tinfo unicode -ada -debug -doc -gpm -minimal -profile -static-libs -test -trace" ABI_X86="(64) -32 (-x32)"
strace mutt gives:
...
brk(0x55fa1db09000) = 0x55fa1db09000
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=0x55fa1ce16750, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGHUP, {sa_handler=0x55fa1ce16750, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=0x55fa1ce16750, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGALRM, {sa_handler=0x55fa1ce16670, sa_mask=[TSTP], sa_flags=SA_RESTORER, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGCONT, {sa_handler=0x55fa1ce16670, sa_mask=[TSTP], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGTSTP, {sa_handler=0x55fa1ce16670, sa_mask=[TSTP], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x55fa1ce16670, sa_mask=[TSTP], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGWINCH, {sa_handler=0x55fa1ce16670, sa_mask=[TSTP], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
rt_sigaction(SIGCHLD, {sa_handler=0x55fa1ce16660, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_NOCLDSTOP, sa_restorer=0x7f1493bc4fc0}, NULL, 8) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
stat("/home/alex/.terminfo", 0x55fa1daeb1b0) = -1 ENOENT (No such file or directory)
stat("/etc/terminfo", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/usr/share/terminfo", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
access("/etc/terminfo/x/xterm-256color", R_OK) = 0
openat(AT_FDCWD, "/etc/terminfo/x/xterm-256color", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=3713, ...}) = 0
read(3, "\36\2%\0&\0\17\0\235\1\2\6xterm-256color|xterm"..., 4096) = 3713
read(3, "", 4096) = 0
close(3) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=92, ws_col=176, ws_xpixel=0, ws_ypixel=0}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=92, ws_col=176, ws_xpixel=0, ws_ypixel=0}) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV +++
Segmentation fault
ltrace mutt gives:
...
sigemptyset(<>) = 0
sigaction(SIGPIPE, { 0x1, <>, 0, 0 }, nil) = 0
sigaction(SIGTERM, { 0x562bfd006750, <>, 0, 0 }, nil) = 0
sigaction(SIGHUP, { 0x562bfd006750, <>, 0, 0 }, nil) = 0
sigaction(SIGQUIT, { 0x562bfd006750, <>, 0, 0 }, nil) = 0
sigaddset(<19>, SIGTSTP) = 0
sigaction(SIGALRM, { 0x562bfd006670, <19>, 0, 0 }, nil) = 0
sigaction(SIGCONT, { 0x562bfd006670, <19>, 0, 0 }, nil) = 0
sigaction(SIGTSTP, { 0x562bfd006670, <19>, 0, 0 }, nil) = 0
sigaction(SIGINT, { 0x562bfd006670, <19>, 0, 0 }, nil) = 0
sigaction(SIGWINCH, { 0x562bfd006670, <19>, 0, 0 }, nil) = 0
sigemptyset(<>) = 0
sigaction(SIGCHLD, { 0x562bfd006660, <>, 0, 0 }, nil) = 0
initscr(17, 0x7fff759ff480, 0, 0 <no return ...>
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
(In reply to Fabian Groffen from comment #4)
> what version of ncurses did you compile mutt against?
>
> compiled against 6.0, mutt runs fine with 6.1 here.
I've recompiled mutt, neomutt and ncurses a whole bunch of times and it still fails in exactly the same way.
ok, amd64-linux I suppose?
(In reply to Fabian Groffen from comment #8)
> ok, amd64-linux I suppose?
I'm on stable amd64, yes.
Portage 2.3.24 (python 2.7.14-final-0, default/linux/amd64/17.0/hardened/selinux, gcc-6.4.0, glibc-2.25-r10, 4.15.13-gentoo x86_64)
Stable mutt failed, so I updated to the latest ~arch version, which still failed, so I tried neomutt too, with the same result. Nothing else that uses ncurses is failing, though, and I can't figure out what is different on my machine compared to yours. Even switching SELinux to permissive makes no difference.
The gdb crash sounds similar, but I haven't had the chance to test on amd64-linux yet, so nothing conclusive. It just seems the codepath /does/ work on some setups.
Mutt 1.9.4 (2018-02-28, gentoo-1.9.4/r0)
Copyright (C) 1996-2016 Michael R. Elkins and others.
Mutt comes with ABSOLUTELY NO WARRANTY; for details type `mutt -vv'.
Mutt is free software, and you are welcome to redistribute it
under certain conditions; type `mutt -vv' for details.
System: Linux 4.12.12-gentoo (x86_64)
ncurses: ncurses 6.1.20180127 (compiled with 6.0)
hcache backend: lmdb LMDB 0.9.18: (February 5, 2016)
% qlist -Iv mutt ncurses
mail-client/mutt-1.9.4
sys-libs/ncurses-6.1-r2
This works fine for me. Can you try testing this with an empty config? Does it crash as well in that case?
Also, if it doesn't, just as a wild guess: does removing the first mutt_signal_init() call from start_curses in main.c (line 293, below the comment about ncurses-4.2 trying to install a SIGWINCH handler) prevent the crash? The code is odd there, and seems to re-init the signals afterwards anyway.
(In reply to Fabian Groffen from comment #12)
> Also, if it doesn't, just as wild guess, does removing the first
> mutt_signal_init() (line 293 below the comment about ncurses-4.2 trying to
> install SIGWINCH handler) call from start_curses in main.c prevent the
> crash? The code is odd there, and seems to re-init the signals anyway
> afterwards.
commenting that out makes no diff :(
jason@baraddur ~ $ mv .muttrc .muttrc.bak
jason@baraddur ~ $ mutt
Segmentation fault
jason@baraddur ~ $ gdb mutt
(gdb) run
Starting program: /usr/bin/mutt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007fbd747295fd in _nc_setupscreen_sp () from /lib64/libncursesw.so.6
(gdb) bt
#0  0x00007fbd747295fd in _nc_setupscreen_sp () from /lib64/libncursesw.so.6
#1  0x00007fbd747240de in newterm_sp () from /lib64/libncursesw.so.6
#2  0x00007fbd747245db in newterm () from /lib64/libncursesw.so.6
#3  0x00007fbd7471f47a in initscr () from /lib64/libncursesw.so.6
#4  0x000055a07566ef8c in ?? ()
#5  0x00007fbd730e95ad in __libc_start_main () from /lib64/libc.so.6
#6  0x000055a07566feda in ?? ()
I've spent some time debugging the shit out of ncurses due to this bug, and it seems this is a link issue: mutt is linked against both libtinfo AND libtinfow, as well as against ncursesw. libtinfo comes first, so it gets to build the objects that will be used by ncursesw, but these two are not ABI-compatible.
As a quick workaround, use:
LD_PRELOAD=/lib64/libtinfow.so.6 mutt
This forces libtinfow to be used over libtinfo.
This did not happen with ncurses-6.0 because libtinfo and libtinfow were ABI-compatible (at least wrt the TERMINAL structure in tinfo).
I'm guessing this problem does not show up if ncurses is built with USE -tinfo. Hopefully I can try to understand why mutt is linked against both tinfo libs later today.
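For anyone who wants to check whether their own binary is affected, the double linkage described above should be visible in the output of ldd. What follows is only a minimal sketch: `check_tinfo_mix` is a hypothetical helper name (not part of mutt, ncurses or Portage), and it assumes ldd-style text on stdin in which the libraries show up as `libtinfo.so` and `libtinfow.so`.

```shell
#!/bin/sh
# Sketch only: classify ldd-style output read from stdin by which tinfo
# flavours it mentions. check_tinfo_mix is a hypothetical helper name.
check_tinfo_mix() {
    libs=$(cat)
    narrow=no
    wide=no
    # "libtinfo.so" does not occur inside "libtinfow.so", so the two
    # patterns below cannot match the same library.
    case "$libs" in *libtinfo.so*) narrow=yes ;; esac
    case "$libs" in *libtinfow.so*) wide=yes ;; esac
    if [ "$narrow" = yes ] && [ "$wide" = yes ]; then
        echo "mixed"      # both flavours linked: the situation in this bug
    elif [ "$wide" = yes ]; then
        echo "wide"       # only libtinfow
    elif [ "$narrow" = yes ]; then
        echo "narrow"     # only libtinfo
    else
        echo "none"       # no separate tinfo library at all
    fi
}

# Example invocation (not run here):
#   ldd /usr/bin/mutt | check_tinfo_mix
```

A result of "mixed" would match the diagnosis in this comment. Any other result does not prove the binary is fine, it only means this particular mismatch is not present.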
hmmmm, let me check this on my setups, it seems wrong to link to both in any case Thanks so much for diving into this!!!
Created attachment 526196
0007-ncurses-matching-tinfo.patch
Can you check whether this patch works for you? Let me know if you need help applying the patch.
The patch works great! mutt is now linked to libtinfow only, and starts up fine. I haven't looked at mutt's configure script, but I'm wondering if things wouldn't have been easier if it made use of pkg-config? Since `pkg-config --libs ncursesw` correctly yields `-lncursesw -ltinfow`.
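As a rough illustration of the pkg-config idea, a build script could ask pkg-config for the link flags and only fall back to a hard-coded guess when pkg-config or the ncursesw module is unavailable. This is a sketch, not what mutt's configure script actually does: `curses_link_flags` is a hypothetical helper, the `PKG_CONFIG` override follows the usual convention, and the bare `-lncursesw` fallback is an assumption.

```shell
#!/bin/sh
# Sketch only: obtain ncursesw link flags via pkg-config, with a fallback.
# curses_link_flags is a hypothetical helper, not part of mutt's build system.
curses_link_flags() {
    # Allow the caller to point at a specific pkg-config binary.
    pc=${PKG_CONFIG:-pkg-config}
    if command -v "$pc" >/dev/null 2>&1 && "$pc" --exists ncursesw; then
        # pkg-config knows which tinfo flavour matches this ncursesw.
        "$pc" --libs ncursesw
    else
        # Assumed fallback; may be incomplete on split-tinfo systems.
        echo "-lncursesw"
    fi
}

# Example invocation (not run here):
#   LIBS="$LIBS $(curses_link_flags)"
```

The point of going through pkg-config is that the matching tinfo library is chosen by the ncurses installation itself rather than guessed by the consumer.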
(In reply to Quentin Minster from comment #14)
Awesome detective work!
@grobian: the patch works for me too. Will it be applied to 1.9? The stable mutt (1.7.2) fails too, which is why I tried the ~arch one in the first place. Can we either apply the patch to both 1.7 and 1.9, or stabilize 1.9?
And yeah, it may be more robust to just use pkg-config (or maybe ncursesw6-config --libs, which exists too?), but switching to pkg-config may be something upstream should do instead.
Ok, thanks for trying. I'm going to roll out a new patchball for 1.9.4 and a new revision. For 1.7.2 there's nothing I can do because it is stable, and if I were to request a new stable keyword, I'd go for 1.9.4 immediately anyway. So I guess my only option there is a blocker for ncurses-6.1. The bug on 1.7.2 does, however, mention tinfo isn't involved.
The bug has been closed via the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=9d21d15760711027915b4b2b36d66be2b6ff8ed4
commit 9d21d15760711027915b4b2b36d66be2b6ff8ed4
Author: Fabian Groffen <grobian@gentoo.org>
AuthorDate: 2018-04-01 07:11:58 +0000
Commit: Fabian Groffen <grobian@gentoo.org>
CommitDate: 2018-04-01 07:11:58 +0000
    mail-client/mutt: bump patchset for 1.9.4 for ncurses-6.1, bug #651552
    Closes: https://bugs.gentoo.org/651552
    Package-Manager: Portage-2.3.24, Repoman-2.3.6
 mail-client/mutt/Manifest                          |   3 +-
 .../{mutt-1.9.3.ebuild => mutt-1.9.4-r1.ebuild}    |   2 +-
 mail-client/mutt/mutt-1.9.4.ebuild                 | 277 ---------------------
 3 files changed, 2 insertions(+), 280 deletions(-)
The bug is FIXED but, for the record, someone posted a good write-up of the debugging session for this issue here: https://0f5f.blogs.minster.io/2018/04/debugging-ncurses-to-fix-a-mutt-segfault-on-gentoo