Created attachment 928164 [details]
Output of: emerge --info

Originally, building app-editors/emacs failed with an Abort error when running the "temacs" binary. Only later did I find that the glibc backtrace(3) call itself is broken ("temacs" calls it in main() just to test it).

Using the example code below from https://issues.chromium.org/issues/41145149

// ------------- backtrace.c - start -----------
#define _GNU_SOURCE
#include <execinfo.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    void *bt[10];
    int i, cnt;

    cnt = backtrace(bt, 10);
    printf("backtrace() = %i\n", cnt);
    for (i = 0; i < cnt; ++i)
        printf("\t%p\n", bt[i]);
    return EXIT_SUCCESS;
}
// ------------- backtrace.c - end -----------

Building with the usual:

cc -Wall -ggdb3 backtrace.c -o backtrace

When just running:

./backtrace
Aborted

When running the session in GDB (I already followed https://wiki.gentoo.org/wiki/Debugging and https://wiki.gentoo.org/wiki/GCC/ICE_Reporting_Guide to get debug symbols), I get something like:

$ gdb ./backtrace
...
(gdb) run
...
Program received signal SIGABRT, Aborted.
0x00007ffff7e5db0c in ?? () from /usr/lib64/libc.so.6
#0  0x00007ffff7e5db0c in ?? () from /usr/lib64/libc.so.6
#1  0x00007ffff7e07be6 in raise () from /usr/lib64/libc.so.6
#2  0x00007ffff7def8f7 in abort () from /usr/lib64/libc.so.6
#3  0x00007ffff7d9e79d in uw_init_context_1 (context=context@entry=0x7fffffffdbd0, outer_cfa=outer_cfa@entry=0x7fffffffde00, outer_ra=0x7ffff7ee72c1 <backtrace+97>) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1336
#4  0x00007ffff7dbd796 in _Unwind_Backtrace (trace=0x7ffff7ee71c0, trace_argument=0x7fffffffde00) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind.inc:296
#5  0x00007ffff7ee72c1 in backtrace () from /usr/lib64/libc.so.6
#6  0x00005555555551bc in main (argc=1, argv=0x7fffffffdfe8) at backtrace.c:12
(gdb) frame 3
#3  0x00007ffff7d9e79d in uw_init_context_1 (context=context@entry=0x7fffffffdbd0, outer_cfa=outer_cfa@entry=0x7fffffffde00, outer_ra=0x7ffff7ee72c1 <backtrace+97>) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1336
(gdb) list
1336      gcc_assert (code == _URC_NO_REASON);
1331      context->ra = ra;
1332      if (!ASSUME_EXTENDED_UNWIND_CONTEXT)
1333        context->flags = EXTENDED_CONTEXT_BIT;
1334
1335      code = uw_frame_state_for (context, &fs);
1336      gcc_assert (code == _URC_NO_REASON);
1337
1338    #if __GTHREADS
1339      {
1340        static __gthread_once_t once_regsizes = __GTHREAD_ONCE_INIT;

I tried to use "list" and "disass" but was unable to get the value of the "code" variable (print $code says "optimized out", and with disass I'm unable to find that gcc_assert to infer the value from registers).

I even tried the standard rebuild procedure (along the lines of https://wiki.gentoo.org/wiki/Changing_the_CHOST_variable, even though I did not change CHOST):

emerge -av1 sys-devel/binutils
emerge -a1v sys-devel/gcc
emerge -a1v sys-libs/glibc
emerge -av dev-debug/gdb

But with the same results.

What is even more puzzling: a similar Gentoo installation (same glibc, same gcc) works fine. There, the above command produces:

./backtrace
backtrace() = 4
        0x560e3baf81bc
        0x7f76948ae16e
        0x7f76948ae229
        0x560e3baf80c5

Emacs also builds fine there.
Found the exact cause: it was my custom "-fno-strict-aliasing" in COMMON_FLAGS.

Removing it from COMMON_FLAGS in /etc/portage/make.conf and rebuilding glibc with

emerge -a1v sys-libs/glibc

fixes the crash in backtrace(3).
(In reply to Henryk Paluch from comment #1)
> Found exact cause - it was my custom "-fno-strict-aliasing" in COMMON_FLAGS.

I think this is still curious, as -fno-strict-aliasing shouldn't cause that (the opposite could, if a program normally builds with -fno-strict-aliasing and your -fstrict-aliasing overrode that).

I'd read your bug in bed earlier but hadn't yet had a chance to ask for some bits, but suspected it might be this.
I fully agree that such behavior of -fno-strict-aliasing is unexpected and counterintuitive, but I verified it on two Gentoo installations.

As soon as I rebuild sys-libs/glibc without -fno-strict-aliasing, everything works properly:
* the backtrace.c example works
* the emacs build works (temacs no longer aborts when run during the build stage)
* and even the binary /usr/src/linux/scripts/sorttable works - task 'SORTTAB vmlinux' at the end of the kernel build (it also aborted before; I had forgotten about it, but it turned out to have the same cause :-)

As soon as I rebuild sys-libs/glibc with -fno-strict-aliasing, all regressions are back.

Yes, in a sane world the opposite switch, -fstrict-aliasing, should be the one that breaks things, but the real world is somehow playful and full of surprises.
(In reply to Henryk Paluch from comment #3)
> I fully agree that such behavior of -fno-strict-aliasing is unexpected and
> counterintuitive, but I verified it on two Gentoo installations.
>

What I'm saying is that it's a bug that merits investigation, not that you're wrong (hence reopening it).
Yes, I understood your comment that way (that -fno-strict-aliasing should not cause regressions, so there is something wrong on the glibc or gcc side).

I forgot to mention why I used -fno-strict-aliasing at all. It was inspired by the following comment from https://man.openbsd.org/gcc-local.1

> The -O2 option does not include -fstrict-aliasing,
> as this option causes issues on some legacy code.
> -fstrict-aliasing is very unsafe with code that
> plays tricks with casts, bypassing the already weak type system of C.

And because I use the default -O2 in COMMON_FLAGS, I thought it would be the right idea to add -fno-strict-aliasing, but ... now we know what happened.
Created attachment 928763 [details]
gdb ./backtrace session of "bad" (glibc build with -fno-strict-aliasing)

Details will follow on next attachment...
Created attachment 928764 [details]
gdb ./backtrace session of "ok" (glibc build with default flags)

I compared two gdb sessions (one on Gentoo with the "ok" glibc, where backtrace(3) works, and another with the "bad" glibc, built with -fno-strict-aliasing).

Did this:

break main
run
break find_fde_tail # answer y
cont

The difference is here:
- on "bad", find_fde_tail returns NULL, which later leads to the abort
- on "ok", find_fde_tail returns non-NULL data and the program finishes without error

Here is the relevant gdb part for "ok":

Breakpoint 2, find_fde_tail (dbase=0, pc=140737351776149, hdr=0x7ffff7dc4b88, bases=0x7fffffffdd48) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2-fde-dip.c:396
396       if (hdr->version != 1)
+where
#0  find_fde_tail (dbase=0, pc=140737351776149, hdr=0x7ffff7dc4b88, bases=0x7fffffffdd48) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2-fde-dip.c:396
#1  _Unwind_Find_FDE (pc=<optimized out>, bases=bases@entry=0x7fffffffdd48) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2-fde-dip.c:552
#2  0x00007ffff7dbd50a in uw_frame_state_for (context=context@entry=0x7fffffffdca0, fs=fs@entry=0x7fffffffdb60) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1005
#3  0x00007ffff7dbe860 in uw_init_context_1 (context=context@entry=0x7fffffffdca0, outer_cfa=outer_cfa@entry=0x7fffffffded0, outer_ra=0x7ffff7ee8201 <backtrace+97>) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1335
#4  0x00007ffff7dbf796 in _Unwind_Backtrace (trace=0x7ffff7ee8100, trace_argument=0x7fffffffded0) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind.inc:296
#5  0x00007ffff7ee8201 in backtrace () from /usr/lib64/libc.so.6
#6  0x00005555555551bc in main (argc=1, argv=0x7fffffffe0b8) at backtrace.c:12
+info frame
Stack level 0, frame at 0x7fffffffdb00:
 rip = 0x7ffff7dc1e89 in find_fde_tail (/usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2-fde-dip.c:396); saved rip = 0x7ffff7dbd50a
 inlined into frame 1
 source language c.
 Arglist at unknown address.
 Locals at unknown address, Previous frame's sp in rsp
+next
399       if (__builtin_expect (hdr->eh_frame_ptr_enc == (DW_EH_PE_sdata4
+next
406       eh_frame = (_Unwind_Ptr) (p + value);
+next
407       p += sizeof (value);
+next
418       if (hdr->fde_count_enc != DW_EH_PE_omit
+next
_Unwind_Find_FDE (pc=<optimized out>, bases=bases@entry=0x7fffffffdd48) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2-fde-dip.c:552
552       return find_fde_tail ((_Unwind_Ptr) pc, dlfo.dlfo_eh_frame,
+print hdr
No symbol "hdr" in current context.
+info locals
dlfo = {dlfo_flags = 0, dlfo_map_start = 0x7ffff7d9c000, dlfo_map_end = 0x7ffff7dc91e8, dlfo_link_map = 0x5555555592e0, dlfo_eh_frame = 0x7ffff7dc4b88, __dflo_reserved = {140737351639552, 93824992252640, 140737488345896, 140737488345892, 140737353949664, 140737488346096, 140737353563753}}
data = <optimized out>
ret = <optimized out>
dbase = <optimized out>
+next
530       bases->func = (void *) func;
+info locals
data = <optimized out>
ret = <optimized out>
dbase = <optimized out>
+next
uw_frame_state_for (context=context@entry=0x7fffffffdca0, fs=fs@entry=0x7fffffffdb60) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1007
1007      if (fde == NULL)
+print fde
$1 = (const struct dwarf_fde *) 0x7ffff7dc7430
+where
#0  uw_frame_state_for (context=context@entry=0x7fffffffdca0, fs=fs@entry=0x7fffffffdb60) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1007
#1  0x00007ffff7dbe860 in uw_init_context_1 (context=context@entry=0x7fffffffdca0, outer_cfa=outer_cfa@entry=0x7fffffffded0, outer_ra=0x7ffff7ee8201 <backtrace+97>) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind-dw2.c:1335
#2  0x00007ffff7dbf796 in _Unwind_Backtrace (trace=0x7ffff7ee8100, trace_argument=0x7fffffffded0) at /usr/src/debug/sys-devel/gcc-14.2.1_p20241221/gcc-14-20241221/libgcc/unwind.inc:296
#3  0x00007ffff7ee8201 in backtrace () from /usr/lib64/libc.so.6
#4  0x00005555555551bc in main (argc=1, argv=0x7fffffffe0b8) at backtrace.c:12
+list
1002      if (context->ra == 0)
1003        return _URC_END_OF_STACK;
1004
1005      fde = _Unwind_Find_FDE (context->ra + _Unwind_IsSignalFrame (context) - 1,
1006                              &context->bases);
1007      if (fde == NULL)
1008        {
1009    #ifdef MD_FALLBACK_FRAME_STATE_FOR
1010          /* Couldn't find frame unwind info for this function.  Try a
1011             target-specific fallback mechanism.  This will necessarily

On the "bad" machine, "fde" is NULL at line 1007, which executes some kind of fallback (lines 1008 onward) and ends up in abort.

Unfortunately I don't yet understand why find_fde_tail() returns NULL in the "bad" case...
(Sorry, not had a chance to poke more yet, and probably won't for a week or two.) Mind checking if FEATURES=test emerge -v1 glibc (i.e. the glibc testsuite) passes with -fno-strict-aliasing too?
Created attachment 928821 [details]
List of failed tests on "bad" configuration (glibc build with -fno-strict-aliasing)

Test results are consistent with expectations:
- the "bad" glibc build fails with an error due to 123 FAIL tests (list uploaded in this attachment)
- the "ok" glibc build finishes successfully (not counting XFAIL, but those are the same)

Failed tests include all backtrace tests:

$ fgrep backtrace gentoo-srv2-bad-failed-tests.log
FAIL: debug/backtrace-tst
FAIL: debug/tst-backtrace2
FAIL: debug/tst-backtrace3
FAIL: debug/tst-backtrace4
FAIL: debug/tst-backtrace5
FAIL: debug/tst-backtrace6
FAIL: nptl/tst-backtrace1

Here is a quick diff comparing the summaries ("ok" vs "bad"):

diff -w gentoo-srv2-failed-tests-summary.log gentoo-srv2-bad-failed-tests-summary.log
8c8,9
<    5313 PASS
---
>     123 FAIL
>    5190 PASS
12c13
< make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.40-r8/work/glibc-2.40'
---
> make[1]: *** [Makefile:672: tests] Error 1

And here are the full summaries:

$ cat gentoo-srv2-failed-tests-summary.log # "ok" build
                === Summary of results ===
   4949 PASS
     44 UNSUPPORTED
     17 XFAIL
     10 XPASS
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.40-r8/work/glibc-2.40'
                === Summary of results ===
   5313 PASS
     99 UNSUPPORTED
     17 XFAIL
      8 XPASS
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.40-r8/work/glibc-2.40'

$ cat gentoo-srv2-bad-failed-tests-summary.log # "bad" build
                === Summary of results ===
   4949 PASS
     44 UNSUPPORTED
     17 XFAIL
     10 XPASS
make[1]: Leaving directory '/var/tmp/portage/sys-libs/glibc-2.40-r8/work/glibc-2.40'
                === Summary of results ===
    123 FAIL
   5190 PASS
     99 UNSUPPORTED
     17 XFAIL
      8 XPASS
make[1]: *** [Makefile:672: tests] Error 1