I get the following error when trying to run genkernel --lvm --luks --install initramfs:
ERROR: Command 'b2 --user-config=/var/tmp/genkernel/gk_ldJZhGwB/boost/boost_1_73_0/user-config.jam --without-python gentoorelease -j17 -q -d+2 pch=off --disable-icu boost.locale.icu=off --without-mpi --without-locale --without-context --without-coroutine --without-fiber --without-stacktrace --boost-build=/var/tmp/genkernel/gk_ldJZhGwB/boost/buildroot/usr/share/boost-build --prefix=/usr --layout=system --no-cmake-config threading=multi link=shared,static -sNO_BZIP2=1 -sNO_LZMA=1 -sNO_ZLIB=1 -sNO_ZSTD=1' failed!
* ERROR: create_initramfs(): append_data(): append_lvm(): populate_binpkg(): populate_binpkg(): gkbuild(): Failed to create binpkg of boost-1.73.0!
It seems to be a problem with boost-1.73.0. I didn't have the problem before upgrading to version 1.73.0.
Reproducible: Always
Steps to Reproduce: I have boost 1.73.0 installed together with Genkernel 4.0.9. I am trying to upgrade the kernel (gentoo-sources) to version 5.7.6.
Actual Results: ERROR: Command 'b2 --user-config=/var/tmp/genkernel/gk_ldJZhGwB/boost/boost_1_73_0/user-config.jam --without-python gentoorelease -j17 -q -d+2 pch=off --disable-icu boost.locale.icu=off --without-mpi --without-locale --without-context --without-coroutine --without-fiber --without-stacktrace --boost-build=/var/tmp/genkernel/gk_ldJZhGwB/boost/buildroot/usr/share/boost-build --prefix=/usr --layout=system --no-cmake-config threading=multi link=shared,static -sNO_BZIP2=1 -sNO_LZMA=1 -sNO_ZLIB=1 -sNO_ZSTD=1' failed!
* ERROR: create_initramfs(): append_data(): append_lvm(): populate_binpkg(): populate_binpkg(): gkbuild(): Failed to create binpkg of boost-1.73.0
Expected Results: Working version of genkernel installed in /boot
Created attachment 647330 [details] genkernel.log
Portage 2.3.103 (python 3.7.8-final-0, default/linux/amd64/17.1/desktop/plasma, gcc-10.1.0, glibc-2.31-r5, 5.6.13-gentoo x86_64) ================================================================= System uname: Linux-5.6.13-gentoo-x86_64-AMD_Ryzen_7_1700X_Eight-Core_Processor-with-gentoo-2.7 KiB Mem: 65814752 total, 63491736 free KiB Swap: 75497468 total, 75497468 free Timestamp of repository gentoo: Wed, 01 Jul 2020 18:00:01 +0000 Head commit of repository gentoo: 56911e5b7b0e6a0e71a495f7768123184ac7ac3b sh bash 5.0_p17 ld GNU ld (Gentoo 2.34 p4) 2.34.0 app-shells/bash: 5.0_p17::gentoo dev-java/java-config: 2.3.1::gentoo dev-lang/perl: 5.30.3-r1::gentoo dev-lang/python: 2.7.18::gentoo, 3.6.10-r2::gentoo, 3.7.8::gentoo, 3.8.3::gentoo, 3.9.0_beta3::gentoo dev-util/cmake: 3.17.3::gentoo sys-apps/baselayout: 2.7::gentoo sys-apps/openrc: 0.42.1::gentoo sys-apps/sandbox: 2.20::gentoo sys-devel/autoconf: 2.13-r1::gentoo, 2.69-r5::gentoo sys-devel/automake: 1.16.2::gentoo sys-devel/binutils: 2.34-r1::gentoo sys-devel/gcc: 8.4.0-r1::gentoo, 10.1.0-r1::gentoo sys-devel/gcc-config: 2.3.1::gentoo sys-devel/libtool: 2.4.6-r6::gentoo sys-devel/make: 4.3::gentoo sys-kernel/linux-headers: 5.7::gentoo (virtual/os-headers) sys-libs/glibc: 2.31-r5::gentoo Repositories: gentoo location: /usr/portage sync-type: rsync sync-uri: rsync://rsync.gentoo.org/gentoo-portage priority: -1000 sync-rsync-verify-max-age: 24 sync-rsync-extra-opts: sync-rsync-verify-metamanifest: yes sync-rsync-verify-jobs: 1 guru location: /var/lib/layman/guru masters: gentoo priority: 50 steam-overlay location: /var/lib/layman/steam-overlay masters: gentoo priority: 50 ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="* -@EULA" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=znver1 -O2 -pipe -ggdb" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /etc/libvirt/libvirtd.conf /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/sddm/scripts/Xsetup" 
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/sandbox.d /etc/terminfo" CXXFLAGS="-O2 -pipe" DISTDIR="/usr/portage/distfiles" EMERGE_DEFAULT_OPTS="--with-bdeps=y --keep-going=y --jobs 3" ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/ ftp://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/ rsync://ftp-stud.hs-esslingen.de/gentoo/" LANG="da_DK.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" MAKEOPTS="-j17" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/tmp" USE="X a52 aac acl acpi activities alsa amd64 bash-completion berkdb bluetooth branding bzip2 cairo cdda cdr cli crypt cups cxx dbus declarative dri dts dvd dvdr elogind emboss encode exif fam flac fortran gdbm gif glamor gtk hardened iconv icu ipv6 java jpeg kde kerberos kipi kwallet lcms ldap libnotify libtirpc mad mng modules mp3 mp4 mpeg multilib ncurses nls nptl ogg opengl openmp pam pango pcre pdf phonon plasma png policykit ppds pulseaudio qml qt5 readline sasl sdl seccomp semantic-desktop spell split-usr ssl startup-notification svg tcpd tiff 
truetype udev udisks unicode upower usb vorbis widgets wxwidgets x264 xattr xcb xcomposite xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64" INPUT_DEVICES="libinput" KERNEL="linux" L10N="da" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_7" PYTHON_TARGETS="python3_6 python3_7" RUBY_TARGETS="ruby25 ruby27" SANE_BACKENDS="hp" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, 
PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Paging in slyfox, as this smells a lot like bug 724314: *** stack smashing detected ***: terminated
Is it reproducible for you? Or does it work after multiple attempts? Do you have the same problem with =dev-libs/boost-1.73.0? I was told that this is likely an obscure bug with gcc-10 which only triggers under very specific conditions... you may even have been lucky that 1.72.0 worked for you.
You could also test if reducing MAKEOPTS helps for some reason.
(In reply to Thomas Deutschmann from comment #4) > Is it reproducible for you? Or does it work after multiple attempts? > > Do you have same problem with =dev-libs/boost-1.73.0? > > I was told that this is likely an obscure bug with gcc-10 which only > triggers under very specific conditions... you can be even lucky that > 1.72.0 did work for you. I have tried 5 times, over two days, with the same outcome (waited a day and emerge --sync'd, trying my luck). I also tried reemerging dev-libs/boost. I just tried to run genkernel without makeopts=j17, same results. I will now try with =dev-libs/boost-1.73.0 and then retry genkernel. Btw thank you all for helping.
(In reply to Thomas Deutschmann from comment #5) > You could also test if reducing MAKEOPTS helps for some reason. (In reply to Stig Nielsen from comment #6) > (In reply to Thomas Deutschmann from comment #4) > > Is it reproducible for you? Or does it work after multiple attempts? > > > > Do you have same problem with =dev-libs/boost-1.73.0? > > > > I was told that this is likely an obscure bug with gcc-10 which only > > triggers under very specific conditions... you can be even lucky that > > 1.72.0 did work for you. > > I have tried 5 times, over two days, with the same outcome (waited a day and > emerge --sync'd, trying my luck). I also tried reemerging dev-libs/boost. > > I just tried to run genkernel without makeopts=j17, same results. I will now > try with =dev-libs/boost-1.73.0 and then retry genkernel. > > Btw thank you all for helping. I have tried it now and it gives the same results. I will try on my Intel machine and see if the CPU/AM4 is one of the reasons.
As in bug #724314, -march=znver1 on znver1 managed to break your gcc. Are you building the whole system with the flags in your make.conf? CFLAGS="-march=znver1 -O2 -pipe -ggdb" CXXFLAGS="-O2 -pipe" Or do you have some overrides in /etc/portage/package.env or the equivalent? The flags are fine, I'm just trying to find any pattern in why some systems manage to exhibit this behaviour while most znver systems don't. Rebuilding gcc with CFLAGS="-O2 -pipe -ggdb" will probably work around it. Worth a try. If you can spend some time debugging the actual failure, can you try to minimize the sample source file that gcc fails on? https://wiki.gentoo.org/wiki/Gcc-ICE-reporting-guide#.5Bbonus.5D_minimize_self-contained_source_using_creduce has step-by-step instructions.
(In reply to Sergei Trofimovich from comment #8) > As in bug #724314, -march=znver1 on znver1 managed to break your gcc. Are > you building whole system with flags as in your make.conf? > CFLAGS="-march=znver1 -O2 -pipe -ggdb" > CXXFLAGS="-O2 -pipe" > Or you have some overrides in /etc/portage/package.env or the equivalent? > > The flags are fine, I'm just trying to find any pattern why some systems > manage to exhibit this behaviour while most of znvers don't. Rebuilding gcc > with CFLAGS="-O2 -pipe -ggdb" will probably workaround it. Worth a try. > > If you can spend some time debugging actual failure can you try to > minimize the sample source file that gcc fails on? > https://wiki.gentoo.org/wiki/Gcc-ICE-reporting-guide#.5Bbonus.5D_minimize_self-contained_source_using_creduce > has step by step instruction. Thank you for the comment. I don't have any other overrides, but I do have cross-x86_64-mingw32 set up; could that be the source of the error? I will try to recompile with the CFLAGS that you suggest. Thank you for the link. I will read it and follow it, so we don't get swamped in information. Btw the Intel machine works without problems.
(In reply to Stig Nielsen from comment #9) > (In reply to Sergei Trofimovich from comment #8) > > As in bug #724314, -march=znver1 on znver1 managed to break your gcc. Are > > you building whole system with flags as in your make.conf? > > CFLAGS="-march=znver1 -O2 -pipe -ggdb" > > CXXFLAGS="-O2 -pipe" > > Or you have some overrides in /etc/portage/package.env or the equivalent? > > > > The flags are fine, I'm just trying to find any pattern why some systems > > manage to exhibit this behaviour while most of znvers don't. Rebuilding gcc > > with CFLAGS="-O2 -pipe -ggdb" will probably workaround it. Worth a try. > > > > If you can spend some time debugging actual failure can you try to > > minimize the sample source file that gcc fails on? > > https://wiki.gentoo.org/wiki/Gcc-ICE-reporting-guide#.5Bbonus.5D_minimize_self-contained_source_using_creduce > > has step by step instruction. > > Thank you for the comment. I dont have any other overrides but I do have > cross-x86_64-mingw32 set up, could that be source of the error? I would not expect it to interfere. But it's an interesting hypothesis. I'll try to install it on a znver2 machine to see if it has any effect for me.
I recompiled gcc without znver1, then reemerged libtool and boost, as you hinted, and it works. I will have some time over the next few days, so I will try to work on the error on my other znver1 machine (same setup hardware-wise, but a smaller Gentoo build; I use it as a server).
I get the same error with sys-kernel/genkernel-9999, dev-libs/boost-1.73.0, and sys-kernel/gentoo-sources-5.7.9. I'll post emerge --info in a minute.
Created attachment 650084 [details] output from "emerge --info"
FYI: `emerge --info` does not really help here. Genkernel isn't using emerge or anything from portage. Please always add the genkernel.log from the failing run. And please note slyfox's comment that your toolchain could be broken. Try rebuilding or switching the compiler, for example.
Created attachment 650112 [details] genkernel log I suppose a broken toolchain is possible, but after my last gcc upgrade, I did an "emerge -e @system" and the few failures were likely timing issues, as they all rebuilt with no errors when done one at a time. However, I'll go ahead and rebuild gcc (I only have one version installed.) Let me know if there is anything else useful I can try, or information to provide.
> System uname: Linux-5.7.7-gentoo-x86_64-01-x86_64-AMD_Ryzen_5_2600_Six-Core_Processor-with-gentoo-2.6 > CFLAGS="-march=native -O2 -pipe" You also hit a jackpot. -march=znver1 somehow breaks gcc-10.1.0. Rebuild would likely not help (unless you remove -march=native). https://bugs.gentoo.org/730406#c8 suggests to extract minimal example that can reproduce crash on your gcc. That might ease stepping through gcc with gdb to see where stack is corrupted.
Taking the first error, I have the original cpp file (52 lines, 1939 bytes) the .ii file created using --save-temps (147096 lines, 4129327 bytes) and the command which fails. I can try stepping through gcc with gdb, but I'm not sure I even know what to look for. Also, if I'm looking for info on where gcc is when failing, do I need to re-emerge gcc with any flags for debug? I'm willing to put some time into this, but I need some guidance.
I added a section at https://wiki.gentoo.org/wiki/Project:Toolchain/724314-gcc-10-and-znver1#More_info with some hints on how to extract more info to debug it further. Specifically, we need to track down where the stack gets corrupted: https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide
(Minor typo in that first link - the Symptom line has mach instead of march.) I'll work through that, but I'm curious why I didn't have any trouble emerging boost 1.73.0, but I do trip the bug when genkernel tries to compile the same version for the initramfs. Would tracking down the differences between those two builds be of any help?
We should be carrying the same patches (except the python stuff we don't need): https://gitweb.gentoo.org/proj/genkernel.git/tree/patches/boost-build/1.73.0?h=v4.0.10. The only difference I can spot is in the gkbuild, where we are removing stuff that the ebuild no longer removes (https://gitweb.gentoo.org/proj/genkernel.git/tree/gkbuilds/boost-build.gkbuild?h=v4.0.10#n15), but this should be harmless.
(In reply to Jack from comment #17) > Taking the first error, I have the original cpp file (52 lines, 1939 bytes) > the .ii file created using --save-temps (147096 lines, 4129327 bytes) and > the command which fails. I can try stepping through gcc with gdb, but I'm > not sure I even know what to look for. Also, if I'm looking for info on > where gcc is when failing, do I need to re-emerge gcc with any flags for > debug? I'm willing to put some time into this, but I need some guidance. Try shrinking the example first with cvise or creduce: https://wiki.gentoo.org/wiki/Gcc-ICE-reporting-guide#.5Bbonus.5D_minimize_self-contained_source_using_creduce (In reply to Jack from comment #19) > (Minor typo in that first link - the Symptom line has mach instead of march.) > > I'll work through that, but I'm curious why I didn't have any trouble > emerging boost 1.73.0, but I do trip the bug when genkernel tries to compile > the same version for the initramfs. Would tracking down the differences > between those two builds be of any help? It probably comes down to a difference in gcc flags. At the very least I see '-O2 -march=native' (make.conf) vs '-Os -fomit-frame-pointer' (genkernel's flags). It should not matter. It just makes gcc behave slightly differently with different options. We need to debug whatever crashes.
Created attachment 650230 [details] output of gcc Unfortunately, the crash is now sporadic, and I've gotten genkernel to complete a build of initramfs with no crash. I also rebuilt gcc with debugging on, and got the attached output. I'm just starting with creduce, but perhaps the attached can provide some hints. The log includes the command line, and it was run in a directory with only the .ii and .s files produced by -save-temps.
(In reply to Jack from comment #22) > Created attachment 650230 [details] > output of gcc > > Unfortunately, the crash is now sporadic, and I've gotten genkernel to > complete a build of initramfs with no crash. I also rebuild gcc with > debugging on, and got the attached output. I'm just starting with creduce, > but perhaps the attached can provide some hints. > > The log includes the command line, and it was run in a directory with only > the .ii and .s files produced by -save-temps. The backtrace looks similar to backtrace reported by others. Unfortunately it still does not tell where stack corruption happens.
Created attachment 650246 [details] crash with backtrace I'm terrible at reading backtraces, so I don't know if this is useful or not. I have not exited gdb yet, in case I can get any more useful information from it. It has some similarities to https://bugs.gentoo.org/724314#c10 but it's somewhat shorter, and DOES have source names and line numbers.
(In reply to Jack from comment #24) > Created attachment 650246 [details] > crash with backtrace > > I'm terrible at reading backtraces, so I don't know if this is useful or > not. I have not exited gdb yet, in case I can get any more useful > information from it. It has some similarities to > https://bugs.gentoo.org/724314#c10 but it's somewhat shorter, and DOES have > source names and line numbers. Unfortunately this crash was reported only after the stack smashing had already happened. You will need to catch the stack smash as it happens, at the moment the canary is overwritten, to find the offending line of code. https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide has a step-by-step guide on how to do it.
Created attachment 650402 [details] gdb output Next attempt. I don't know if it matters, but I couldn't get gcc to build without pie. The only thing I'm really not sure of is whether I got the watch location correct. The example went from "mov %rax,-0x8(%rbp)" to watching "*(long*)($rbp-8)". I used "mov %rax,0x78(%rsp)" to watch "*(long*)($rsp+78)". At least I'm getting used to gdb, so let me know what to change if I need to try again.
(In reply to Jack from comment #26) > Created attachment 650402 [details] > gdb output > > Next attempt. I don't know if it matters, but I couldn't get gcc to build > without pie. The only thing I'm really not sure of is if I got the watch > location correct. From the example from "mov %rax,-0x8(%rbp)" was to > watch "*(long*)($rbp-8)". I used "mov %rax,0x78(%rsp)" to watch > "*(long*)($rsp+78)". At least I'm getting used to gdb, so let me know what > to change if I need to try again. No need to build gcc without pie. It actually is never a pie binary (one of the exceptions). Looks like you are very close! You did it almost correctly. The small annoyance is that 0x78 is different from 78:
> 0x00000000006595de <+46>: mov %rax,0x78(%rsp)
> ...
> (gdb) watch *(long*)($rsp+78)
It has to be 'watch *(long*)($rsp+0x78)'. Otherwise you are off by 42 bytes.
"""
(gdb) print 78
$1 = 78
(gdb) print 0x78
$2 = 120
"""
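The hex/decimal mix-up above is easy to repeat, so here is a minimal Python sketch that turns a disassembly operand into the matching gdb watch expression. The function name `watch_expression` is hypothetical and not part of gdb or of the debugging guide; it only encodes the rule that displacements in the disassembly are hexadecimal while a bare number typed into gdb is decimal.

```python
def watch_expression(register: str, displacement: str) -> str:
    """Build a gdb watch expression from an operand such as 0x78(%rsp).

    `displacement` is the hexadecimal text from the disassembly
    (for example "0x78" or "-0x8"); `register` is the name without '%'.
    """
    offset = int(displacement, 16)           # "0x78" -> 120, "-0x8" -> -8
    sign = "+" if offset >= 0 else "-"
    # Keep the 0x prefix so gdb reads the offset as hexadecimal too.
    return f"watch *(long*)(${register}{sign}{abs(offset):#x})"


# How far off a watch lands when 0x78 is typed as decimal 78.
misplaced_by = int("0x78", 16) - 78
```

Calling `watch_expression("rsp", "0x78")` gives the corrected command from this comment, and `misplaced_by` reproduces the 42-byte offset.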
Posted to #gentoo-toolchain: it hits the watch at the same location, but with different old and new values. Then, hitting "c" repeatedly hits the watch in different functions (including gimplify_bind_expr), all with different values. Is there some value change to look for, or should I just keep going and capture the output? ------- Multiple (c)ontinue's don't seem to end up anywhere useful. Would a screen-share be of any use here, or can you suggest anything I may have missed?
(In reply to Jack from comment #28) > posted to #gentoo-toolchain: hits the watch at the same location, but with > differernt old and new. Then, hitting "c" repeatedly hits the watch in > different functions (including gimplify_bind_expr, all with different > values. Is there some value change to look for, or just keep going and > capture the output? > ------- > multiple (c)ontinue's don't seem to end up anywhere useful. would a > screen-share be of any use here, or can you suggest anything I may have > missed? If gimplify_bind_expr() was called from that very cp_gimplify_expr(), maybe that was it. Hard to say without seeing the output. Ideally you need to add a break and a watch on every 'cp_gimplify_expr()' occurrence, as we can get many nested calls. Do you still run gdb against the full preprocessed file? Or did you reduce it with creduce/cvise? The full preprocessed file will likely be very hard to follow, as it has a lot (millions?) of *_gimplify_* calls that repeat. Hopefully a reduced example will be a lot smaller.
Run on full preprocessed file. I'll try C-reduce again, but don't really trust it will work correctly if test.sh is not guaranteed to fail consistently. More correctly, if the full run doesn't always fail, then test.sh may give wrong answer, and mislead creduce.
(In reply to Jack from comment #30) > Run on full preprocessed file. I'll try C-reduce again, but don't really > trust it will work correctly if test.sh is not guaranteed to fail > consistently. More correctly, if the full run doesn't always fail, then > test.sh may give wrong answer, and mislead creduce. It's OK. It will not reduce too much. It will reduce too little. Once it stabilizes you can make test.sh to run gcc 10 times (or similar) to reduce it further.
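The idea of letting the test retry a flaky crash can be sketched in Python. This is a minimal sketch rather than the test.sh used in this bug: the helper names are hypothetical, and it assumes creduce's convention that an interestingness test exits with status 0 when the variant still shows the problem.

```python
import subprocess
from typing import Callable, Sequence


def gcc_crashes(cmd: Sequence[str]) -> bool:
    """Run the compiler once and report whether it printed an ICE."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return "internal compiler error" in result.stderr


def interesting(run_once: Callable[[], bool], attempts: int = 10) -> int:
    """Return the exit status for creduce.

    0 means at least one attempt crashed (keep this variant),
    1 means every attempt compiled cleanly (discard it).
    """
    for _ in range(attempts):
        if run_once():
            return 0  # stop at the first crash, no need to keep retrying
    return 1
```

A wrapper script would end with something like `sys.exit(interesting(lambda: gcc_crashes(cmd)))`, where `cmd` is the failing command line taken from genkernel.log. If a crash is missed, a reduction step is wrongly rejected, so the result ends up larger than necessary rather than wrong.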
I altered test.sh to try up to ten times, and only return 1 if it works all ten times, otherwise return 0. creduce has now been running for almost two hours. The original cpp file was 52 lines, 1939 bytes. The current cpp is 5 lines, under 300 bytes. Before I got that started, I made a modified version of test.sh that would put out the error or else OK, and then sort the results and do uniq -d. On multiple runs, it averaged about
20% OK
40% ./boost/spirit/home/classic/core/composite/composite.hpp:67:31: internal compiler error: Aborted
40% ./boost/wave/grammars/cpp_predef_macros_grammar.hpp:94:31: internal compiler error: Aborted
Above is running on the .ii file. Running on the .cpp file (in a fresh directory, out of the boost build dir) those results start in /usr/include/boost instead of ./boost. Is there anything worth my posting now? Otherwise, I'll just wait until I can try working with the reduced file?
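The tally of outcomes described above can be sketched in Python as follows. This is an illustration under assumptions, not the modified test.sh: `tally` is a hypothetical helper, and each input string stands for the first error line of one compiler run, or "OK" when the run succeeded.

```python
from collections import Counter
from typing import Dict, Iterable


def tally(outcomes: Iterable[str]) -> Dict[str, float]:
    """Map each distinct outcome to the percentage of runs that produced it."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    # most_common() lists the most frequent outcome first.
    return {outcome: 100.0 * n / total for outcome, n in counts.most_common()}
```

Feeding it the outcomes of repeated runs gives the same kind of breakdown as the 20% / 40% / 40% split reported in this comment.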
(In reply to Jack from comment #32) > I altered test.sh to try up to ten times, and only return 1 if it works all > ten times, otherwise return 0. creduce has now been running for almost two > hours. The original cpp file was 52 lines, 1939 bytes. The current cpp is > 5 lines, under 300 bytes. Woohoo! That one should be easier to step through in gdb! > Before I got that started, I made a modified version of test.sh that would > put out the error or else OK, and then sort the results and do uniq -d. On > multiple runs, it averaged about > 20% OK > 40% ./boost/spirit/home/classic/core/composite/composite.hpp:67:31: internal > compiler error: Aborted > 40% ./boost/wave/grammars/cpp_predef_macros_grammar.hpp:94:31: internal > compiler error: Aborted > > Above is running on the .ii file. Running on the .cpp file (in a fresh > directory, out of the boost build dir) those results start in > /usr/include/boost instead of ./boost. > > Is there anything worth my posting now? Otherwise, I'll just wait until I > can try working with the reduced file? That sounds like great progress! You can attach the minimized example for posterity (I will also try to break gcc locally). I did not understand whether the sample you got is self-contained or still uses external includes. If it's self-contained, you can try again to trace gcc through gdb and see if you can catch the stack corruption.
I attempted a self-contained example, but when I removed the -I pointing into the build area, I think it found some includes in the installed system, so the example will work for others, if they have boost 1.73.0 installed. If necessary, I suppose I can go back and try to find the necessary includes. creduce took almost 3 hours, and ended up with
#include <boost/wave/cpplexer/cpp_lex_iterator.hpp>
#include <boost/wave/grammars/cpp_predef_macros_grammar.hpp>
typedef boost::wave::cpplexer::lex_token<> a;
typedef boost::wave::cpplexer::lex_iterator<a> b;
template struct boost::wave::grammars::predefined_macros_grammar_gen<b>;
I'll attach my gdb output, but I feel like I'm just going in circles. I feel like I've missed something subtle, or maybe I'm just not really sure what I'm supposed to be looking for in the gdb output.
Created attachment 650644 [details] gdb output using the creduced source shown in previous comment.
(In reply to Jack from comment #34) > I attempted for a self contained example, but when I removed the -I pointing > into the build area, I think it found some includes in the installed system, > so the example will work for others, if they have boost 1.73.0 installed. > If necessary, I suppose I can go back and try to find the necessary includes. > > creduce took almost 3 hours, and ended up with > > #include <boost/wave/cpplexer/cpp_lex_iterator.hpp> > #include <boost/wave/grammars/cpp_predef_macros_grammar.hpp> > typedef boost::wave::cpplexer::lex_token<> a; > typedef boost::wave::cpplexer::lex_iterator<a> b; > template struct boost::wave::grammars::predefined_macros_grammar_gen<b>; Ah, that's not a self-contained example. I'm surprised .ii file contained them. It should not. You can expand includes with 'gcc -E <other-options> file.cpp -o file.pp.cpp' and continue reducing preprocessed file. > I'll attach my gdb output, but I feel like I'm just going in circles. I > feel like I've missed something subtle, or maybe I'm just not really sure > what I'm supposed to be looking for in the gdb output. We can try to do it backwards once/if you'll be able to shrink the example further.
I am getting confused. I ran creduce on the .cpp, not on the .ii. Are those five lines still too big, or is the problem the includes? Or, should I run your suggestion and then run creduce on that? The discussion(s) on tracking this down need to be more explicit about all this. It's not at all obvious to folks who aren't very experienced in gdb and gcc details.
(In reply to Jack from comment #37) > I am getting confused. I ran creduce on the .cpp, not on the .ii. Are > those five lines still too big, or is the problem the includes? The includes are the problem. Currently we see the crash in the conversion of a C++ expression (sort of) into a lower-level representation in gcc. Boost includes import many complicated expressions into the final result. Reducing the .ii (if possible), where all includes are inlined into a single file, will make the example shorter. To be easily inspectable, the source file should consist of tens of expressions, not tens of thousands. > Or, should I run your suggestion and then run creduce on that? There should be no fundamental difference between .ii reduction and reduction of a 'gcc -E' preprocessed file. Reducing the 'gcc -E' preprocessed file can be faster as you already made it so small. > The discussion(s) on tracking this down need to be more explicit about all this. It's not at all obvious to folks who aren't very experienced in gdb and gcc details. It's not specific to gcc or gdb. We need the smallest possible input for a faulty program because we are tracing very low-level per-instruction execution of the program to see where it got miscompiled. External includes make the input relatively large.
the creduced version of the -E preprocessed file only fails <5% of the time. I have to wonder if my test.sh wasn't really right. I'm going to try from scratch again.
(In reply to Jack from comment #39) > the creduced version of the -E preprocessed file only fails <5% of the time. > I have to wonder if my test.sh wasn't really right. I'm going to try from > scratch again. Having slightly lower failure probability does not sound bad. As long as gcc still crashes it means we have something wrong happening.
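A lower failure rate does change how many retries the interestingness test needs. A rough Python estimate, assuming each compiler run fails independently with the same probability (an assumption; nothing in this bug confirms the runs are independent), and using 5% as an illustrative upper bound for the "<5%" figure:

```python
import math


def detection_probability(p_fail: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent runs crashes."""
    return 1.0 - (1.0 - p_fail) ** attempts


def attempts_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of runs that catches a crash with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))
```

With a 5% failure rate, ten runs catch the crash only about 40% of the time (`detection_probability(0.05, 10)`), and `attempts_needed(0.05, 0.99)` comes out at 90 runs, so a ten-try test.sh would often report a still-crashing variant as fine.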
Created attachment 650886 [details] output of gcc and of gdb Started again. Preprocessed with gcc -E -P. Creduce ran about 13 hours and reduced to a single line template <long()()->decltype(0) which fails 100%, although the failures seem different from the pre-reduced file. Attachment has output of gcc, showing "instantiate_predef_macros.pp.cpp:1:21: internal compiler error: in splice_late_return_type, at cp/pt.c:29095" In gdb, with breakpoint on that line, the break is: Thread 2.1 "cc1plus" hit Breakpoint 2, 0x000000000072ad8c in splice_late_return_type(tree_node*, tree_node*) () at /var/tmp/portage/sys-devel/gcc-10.1.0-r2/work/gcc-10.1.0/gcc/wide-int-bitmask.h:86 I don't understand how a breakpoint in one .c file seems to trigger in a different .h file. I'm open to suggestions on what else I can do.
(In reply to Jack from comment #41) > Created attachment 650886 [details] > output of gcc and of gdb > > Started again. Preprocessed with gcc -E -P. Creduce ran about 13 hours and > reduced to a single line > > template <long()()->decltype(0) > > which fails 100%, although the failures seem different from the pre-reduced > file. > > Attachment has output of gcc, showing > "instantiate_predef_macros.pp.cpp:1:21: internal compiler error: in > splice_late_return_type, at cp/pt.c:29095" > > In gdb, with breakpoint on that line, the break is: > Thread 2.1 "cc1plus" hit Breakpoint 2, 0x000000000072ad8c in > splice_late_return_type(tree_node*, tree_node*) () > at > /var/tmp/portage/sys-devel/gcc-10.1.0-r2/work/gcc-10.1.0/gcc/wide-int- > bitmask.h:86 > I don't understand how a breakpoint in one .c file seems to trigger in a > different .h file. > > I'm open to suggestions on what else I can do. Well done! I can reproduce the failure on gcc-10.2.0 locally as well: $ cat a.cc template <long()()->decltype(0) $ LANG=C gcc-10.2.0 a.cc a.cc:1:21: internal compiler error: in splice_late_return_type, at cp/pt.c:29152 1 | template <long()()->decltype(0) | ^~~~~~~~~~~ It might happen to be yet another problem. But let me fix a fix for this one first and then we can retry with fixed gcc.
(In reply to Sergei Trofimovich from comment #42) > (In reply to Jack from comment #41) > > Created attachment 650886 [details] > > output of gcc and of gdb > > > > Started again. Preprocessed with gcc -E -P. Creduce ran about 13 hours and > > reduced to a single line > > > > template <long()()->decltype(0) > > > > which fails 100%, although the failures seem different from the pre-reduced > > file. > > > > Attachment has output of gcc, showing > > "instantiate_predef_macros.pp.cpp:1:21: internal compiler error: in > > splice_late_return_type, at cp/pt.c:29095" > > > > In gdb, with breakpoint on that line, the break is: > > Thread 2.1 "cc1plus" hit Breakpoint 2, 0x000000000072ad8c in > > splice_late_return_type(tree_node*, tree_node*) () > > at > > /var/tmp/portage/sys-devel/gcc-10.1.0-r2/work/gcc-10.1.0/gcc/wide-int- > > bitmask.h:86 > > I don't understand how a breakpoint in one .c file seems to trigger in a > > different .h file. > > > > I'm open to suggestions on what else I can do. > > Well done! I can reproduce the failure on gcc-10.2.0 locally as well: > > $ cat a.cc > template <long()()->decltype(0) > $ LANG=C gcc-10.2.0 a.cc > a.cc:1:21: internal compiler error: in splice_late_return_type, at > cp/pt.c:29152 > 1 | template <long()()->decltype(0) > | ^~~~~~~~~~~ > > It might happen to be yet another problem. But let me fix a fix for this one > first and then we can retry with fixed gcc. Probably a https://gcc.gnu.org/PR95820.
Created attachment 650896 [details, diff] gcc-10.2.0-backport-PR95820-ICE-fundecl.patch
gcc-10.2.0-backport-PR95820-ICE-fundecl.patch should fix the "internal compiler error: in splice_late_return_type, at cp/pt.c:29095" failure. To enable it, drop the file into /etc/portage/patches/sys-devel/gcc:10/ and rebuild gcc. Once done, check whether the "splice_late_return_type, at cp/pt.c:29095" ICE is gone, then try to re-reduce the preprocessed boost file again.
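For anyone wanting to script the user-patch step, a rough sketch follows. The `install_user_patch` helper and the `PATCH_ROOT` variable are made up for illustration (the variable only exists so the function can be tried against a scratch directory); the rebuild is left as a comment because it needs root and a real Portage setup.

```shell
#!/bin/bash
# Hypothetical helper: copy a patch into the user-patch directory for a
# package atom such as sys-devel/gcc:10.
# PATCH_ROOT is an assumption of this sketch; it defaults to the
# /etc/portage/patches directory named in the comment above.
install_user_patch() {
    local patch_file=$1 atom=$2
    local root=${PATCH_ROOT:-/etc/portage/patches}

    # Refuse to continue if the patch file is missing.
    [ -f "$patch_file" ] || { echo "no such patch: $patch_file" >&2; return 1; }

    # Create <root>/<category>/<package>:<slot>/ and copy the patch there.
    mkdir -p "$root/$atom" || return 1
    cp -- "$patch_file" "$root/$atom/" || return 1
    echo "installed $(basename "$patch_file") into $root/$atom"
}

# Typical use, as root, followed by a rebuild of the patched slot:
#   install_user_patch gcc-10.2.0-backport-PR95820-ICE-fundecl.patch sys-devel/gcc:10
#   emerge --oneshot sys-devel/gcc:10
```

Setting `PATCH_ROOT` to a temporary directory allows a dry run without touching /etc/portage.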
gcc rebuilt with the patch. Running on the one-line file complains "error: 'parameter' function with trailing return type not declared with 'auto' type specifier" and "error: expected '>' at end of input". Has the required syntax changed? Running on the previously preprocessed file failed 28 out of 100. Running on the original file (with the #include's) failed 21 out of 100. I re-emerged boost-1.73.0 with no error, but I had no error before. I ran genkernel with no error, and even though I told it not to clean up, the boost build dir was cleaned, so I can't try on the new version of the original file which failed. I wonder why some of the old files still fail, even if only sometimes, but perhaps it's best to call it fixed for now, and see if the patch also works for the few others who also found the problem.
(In reply to Jack from comment #46) > gcc rebuilt with the patch. Running on the one line file complains "error: > 'parameter' function with trailing return type not declared with 'auto' type > specifier" and "error: expected '>' at end of input". Has required syntax > changed? The one-liner reduced file had an incorrect c++ syntax. It even lacked ';' after the statement. So the proper compiler error instead of compiler ICE is fine. That's why we needed to start from any syntactically valid state for re-reduction. > Running on the previously preprocessed file failed 28 out of 100. > Running on the original file (with the #include's) failed 21 out of 100. The original bug is still present then. Just harder to hit.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=fafbb4148cb5c2cf7e1ae02679240cba43e95992 commit fafbb4148cb5c2cf7e1ae02679240cba43e95992 Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-07-27 06:48:54 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-07-27 06:48:54 +0000 10.2.0: backport ICE on invalid function declarations Reported-by: Jack Ostroff Bug: https://bugs.gentoo.org/730406 Bug: https://gcc.gnu.org/PR95820 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> 10.2.0/gentoo/34_all_fundecl-ICE-PR95820.patch | 25 +++++++++++++++++++++++++ 10.2.0/gentoo/README.history | 1 + 2 files changed, 26 insertions(+)
First of all, thank you all for the hard work, and a special thank you to Jack and Sergei. I have tried the patch and am able to rebuild GCC etc. and then re-emerge boost. After that genkernel works without problems (I did not reinstall genkernel). So the patch is working for a Ryzen 1700X. I haven't yet had time to try the path Jack has taken to find the bug and, honestly, I am not sure what to do, but I will try on my alternative setup. BTW, I am not sure if it means anything, but when I have CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sha sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3" set in make.conf, boost fails to build. @Sergei, please advise whether I should open a new bug for this error or whether it is related and should stay here. In any case, commenting it out allows boost to build.
ARRHHHH, I wrote too soon. I re-emerged genkernel and tried: genkernel --lvm --luks --install initramfs and got the same error again with: *** stack smashing detected ***: terminated In file included from libs/serialization/src/xml_grammar.cpp:64:...... Sorry guys, I got excited about the patch. So to be clear, the patch does not resolve the error on a Ryzen 1700X.
Note I've never had this error emerging boost, and have now run genkernel several times without seeing the error. I think there is something non-deterministic involved here. I have a looping test set up, and with gcc rebuilt with the patch, I am getting 20-40% errors and the rest complete OK (with some warnings, but no errors). Some of the intermediate versions while creduce is running fail with errors other than the stack smashing, but as those are marked uninteresting to creduce, it doesn't matter. It might have to do with the order in which things get done, with multiple threads, although I'm not even certain if that compiler is multithreaded or not. Further, with a single invocation of x86_64-pc-linux-gnu-g++, I'm not sure how to control that, whereas "-j 1" and related incantations would work with emerge or maybe even genkernel.
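A looping test of the kind mentioned above can be as small as the sketch below. The `count_failures` name is made up for illustration, and it counts every non-zero exit status, not specifically the stack smash, so ordinary compile errors also count as failures.

```shell
#!/bin/bash
# count_failures RUNS CMD [ARGS...]
# Hypothetical helper: run CMD RUNS times and print how many of the runs
# exited with a non-zero status. Output of CMD is discarded.
count_failures() {
    local runs=$1; shift
    local i=0 failed=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}

# Example (substitute the full compiler command line under test):
#   count_failures 100 x86_64-pc-linux-gnu-g++ -c instantiate_predef_macros.pp.cpp -o /dev/null
```

Running it several times on the same input gives a feel for how much the failure rate itself fluctuates.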
Created attachment 651584 [details] Minimize self-contained source using creduce Hi all, I have tried to follow the Gcc-Ice guide and have gotten the following output. But I need help with where to go from here. I am not sure how to apply the https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide to this code. Please advise. Regards Stig
Created attachment 651586 [details] Reproducer
(In reply to Stig Nielsen from comment #52) > Created attachment 651584 [details] > Minimize self-contained source using creduce Oh, nice! Can you also post the exact compiler command you ran against this source and the shell script you used for creduce? > Hi all, > > I have tried to follow the Gcc-Ice guide and have gotten the following > output. But I need help with where to go from here. I am not sure how to > apply the https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide on > this code. > > Please advice. > > Regards Stig Normally you can run gcc under gdb. For that you need to extract the raw compiler command. I usually use the '-v' option: $ gcc-10.2.0 -c a.cc -Os -v ... /usr/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/cc1plus -quiet -v -D_GNU_SOURCE a.cc -quiet -dumpbase a.cc -mtune=generic -march=x86-64 -auxbase-strip a.o -Os -version -fdiagnostics-color=always -o /tmp/ccCW1wtS.s ... Now we can run it under gdb: $ gdb --args /usr/libexec/gcc/x86_64-pc-linux-gnu/10.2.0/cc1plus -quiet -v -D_GNU_SOURCE a.cc -quiet -dumpbase a.cc -mtune=generic -march=x86-64 -auxbase-strip a.o -Os -version -fdiagnostics-color=always -o /tmp/ccCW1wtS.s > start Temporary breakpoint 1, main (argc=17, argv=0x7fffffffd748) at /usr/src/debug/sys-devel/gcc-10.2.0/gcc-10.2.0/gcc/main.c:35 It's ready to debug.
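Picking the cc1plus line out of the '-v' output by hand gets tedious when the command is long. A small filter along these lines might help; the function name is invented for this sketch, and it simply assumes the first line containing "/cc1plus " is the one wanted.

```shell
#!/bin/bash
# extract_cc1plus_command (hypothetical name): read `gcc -v` output on
# stdin and print the first line that invokes cc1plus, with the leading
# blanks removed so it can be pasted after `gdb --args`.
extract_cc1plus_command() {
    grep '/cc1plus ' | head -n 1 | sed 's/^[[:space:]]*//'
}

# Example (gcc prints the -v trace on stderr, hence 2>&1):
#   gcc-10.2.0 -c a.cc -Os -v 2>&1 | extract_cc1plus_command
```

If nothing matches, the filter prints nothing, which is a hint that the driver was not run with '-v' or that the input was not C++.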
Created attachment 651680 [details] Shellscript I used this command: creduce test.sh instantiate_cpp_grammar.ii @Sergei, I hope this is what you were asking for. The debug I will try when I get back home. Thank you for helping me.
I was away for the weekend, and the latest creduce (which took over five days) came up with struct a { template <typename> static char ao; template <class> struct ao<>: which looks like another invalid syntax that triggers an ICE. Unless someone does it first, I should get to debugging within a few days.
(In reply to Stig Nielsen from comment #55) > Created attachment 651680 [details] > Shellscript > > I used this command: creduce test.sh instantiate_cpp_grammar.ii > > @Sergei, I hope this is what you were asking for. > > The debug I will try when I get back home. > > Thank you for helping me. Oh, you will need to tweak the script for this specific failure. > LANG=C++ x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -march=znver1 -O2 -pipe -ggdb -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DNDEBUG -I"." -c instantiate_cpp_grammar.ii -o instantiate_cpp_grammar.o \ > >gcc_out.txt 2>&1 > grep "internal compiler error" gcc_out.txt >/dev/null 2>&1 LANG=C is a human language locale, not programming language flavour. It's used to make gcc produce english errors. Otherwise 'internal compiler error' may be localised. 'grep "internal compiler error" gcc_out.txt >/dev/null 2>&1' is a typical string to search for any problem in compiler. We are specifically interested in smashed stack, thus command should be: """ LANG=C x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -march=znver1 -O2 -pipe -ggdb -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DNDEBUG -I"." -c instantiate_cpp_grammar.ii -o instantiate_cpp_grammar.o \ >gcc_out.txt 2>&1 grep "stack smashing detected" gcc_out.txt >/dev/null 2>&1 """ This script should maintain the property of a compiler broken by stack corruption instead of any other invalid C++ construct.
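Because the stack smash only shows up in a fraction of runs, a single compile per interestingness check can wrongly report a still-broken file as fine. One possible workaround, sketched here under the assumption that a few repeated compiles per check are affordable, is to retry the command and accept the file as soon as the message appears once. The `seen_within` name is invented for this sketch.

```shell
#!/bin/bash
# seen_within TRIES PATTERN CMD [ARGS...]
# Hypothetical helper: run CMD up to TRIES times with LANG=C and return 0
# as soon as its combined stdout/stderr contains PATTERN; return 1 if the
# pattern never shows up.
seen_within() {
    local tries=$1 pattern=$2; shift 2
    local i=0
    while [ "$i" -lt "$tries" ]; do
        if LANG=C "$@" 2>&1 | grep -q -- "$pattern"; then
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Example interestingness test body (compiler flags elided here; use the
# full command line from the script above):
#   seen_within 20 "stack smashing detected" \
#       x86_64-pc-linux-gnu-g++ -c instantiate_cpp_grammar.ii -o instantiate_cpp_grammar.o
```

The trade-off is run time: each rejected candidate now costs TRIES compiles, so the retry count is a balance between reduction speed and the risk of wandering off to an unrelated failure.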
(In reply to Jack from comment #56) > I was away for the weekend, and the latest creduce (which took over five > days) came up with > > struct a { > template <typename> static char ao; > template <class> struct ao<>: > > which looks like another invalid syntax that triggers an ICE. Unless > someone does it first, I should get to debugging within a few days. I filed https://gcc.gnu.org/PR96425 to fix the ICE on invalid code. My guess is that, as in https://bugs.gentoo.org/730406#c54, you used 'internal compiler error' as the interestingness test. But to track down stack corruption we will need to test for the presence of 'stack smashing detected'. Otherwise we will uncover more unrelated (or maybe related!) crashes.
Created attachment 652808 [details] debug - stack smashing detected Hi all, I have tried to debug and get a stack smashing detected. The error is: *** stack smashing detected ***: terminated In file included from libs/serialization/src/xml_grammar.cpp:64: libs/serialization/src/basic_xml_grammar.ipp: In constructor 'boost::archive::basic_xml_grammar<CharType>::basic_xml_grammar() [with CharType = char]': libs/serialization/src/basic_xml_grammar.ipp:328:9: internal compiler error: Aborted (SIGABRT) 325 | str_p(BOOST_ARCHIVE_XML_CLASS_ID()) >> NameTail | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 326 | >> Eq | ~~~~~ 327 | >> L'"' | ~~~~~~~ 328 | >> int_p [xml::assign_object(rv.class_id)] | ^~~~~~~~~~ 0xc7662f crash_signal Does it look right? It is the first time I am trying this, and there was no debugging other than running gcc-10.1.0 -c libs/serialization/src/xml_grammar.cpp -Os -v in the Boost work directory (/tmp/portage/dev-libs/boost-1.73.0/work/boost_1_73_0-abi_x86_64.amd64)
Created attachment 652810 [details] gcc -v and statistics for creduce I have tried to run creduce as instructed in earlier comments. Here are the gcc -v output and the creduce statistics.
Created attachment 652812 [details] instantiate_cpp_grammar.ii The script for creduce (test.sh) is: LANG=C x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -march=znver1 -O2 -pipe -ggdb -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DNDEBUG -I"." -c instantiate_cpp_grammar.ii -o instantiate_cpp_grammar.o \ >gcc_out.txt 2>&1 grep "stack smashing detected" gcc_out.txt >/dev/null 2>&1
(In reply to Stig Nielsen from comment #59) > Does it look right. It is the first time I am trying it and there was no > debugging other then running gcc-10.1.0 -c > libs/serialization/src/xml_grammar.cpp -Os -v in the Boost workdirectory > (/tmp/portage/dev-libs/boost-1.73.0/work/boost_1_73_0-abi_x86_64.amd64) Yeah, this looks correct and looks similar enough to other failures.
(In reply to Stig Nielsen from comment #61) > Created attachment 652812 [details] > instantiate_cpp_grammar.ii > > The script for creduce (test.sh) is: > > LANG=C x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -march=znver1 -O2 > -pipe -ggdb -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline > -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 > -DNDEBUG -I"." -c instantiate_cpp_grammar.ii -o instantiate_cpp_grammar.o \ > >gcc_out.txt 2>&1 > grep "stack smashing detected" gcc_out.txt >/dev/null 2>&1 Interesting. The reduction looks legitimate. The resulting file is invalid c++ (for me g++-10.2.0 rejects it). That probably means corruption happens very early in c++ frontend in template matches/instantiations. Which might be a good thing as before it crashed in GIMPLE internals (way later in gcc stages). Now you can try to get exact place where stack gets corrupted. https://bugs.gentoo.org/730406#c54
My run of creduce was stopped by an Isaias-induced power loss, and the last version of the cpp file it left did not trigger the error in over 100 tries. Based on a message from the creduce-dev list, it had to have triggered the error at least once, or it would not have been left there as a candidate file. I've restarted creduce from an intermediate file I saved earlier. However, part of the list response included "The real answer here is to avoid nondeterminism, for example by disabling ASLR on your machine. I'm not sure that you're going to get a useful result out of C-Reduce here, if you don't do that." So - is it reasonable (or maybe even necessary) to turn off ASLR while debugging this stack smashing, including while running creduce?
(In reply to Jack from comment #64) > My run of creduce was stopped by an Isaias induced power loss, and the last > version of the cpp file it left did not trigger the error in over 100 tries. That means the breakage is probably somehow dependent on some volatile state, like hardware memory layout. > Based on a message from the creduce-dev list, it had to have triggered the > error at least once, or it would not have been left there as a candidate > file. I've restarted creduce from an intermediate file I saved earlier. > However, part of the list response included "The real answer here is to > avoid nondeterminism, for example by disabling ASLR on your machine. I'm not > sure that you're going to get a useful result out of C-Reduce here, if you > don't do that." > So - is it reasonable (or maybe even necessary) to turn off ASLR while > debugging this stack smashing debugging, including running creduce? I agree that determinism would help. Unfortunately we have to work with what we have. Disabling ASLR might help a bit. By either making the error more frequent or not happen at all. The knob is https://github.com/torvalds/linux/blob/master/Documentation/admin-guide/sysctl/kernel.rst#randomize-va-space
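For reference, the knob linked above is the kernel.randomize_va_space sysctl. The sketch below only decodes its value; the `describe_aslr` name is made up, and the commands that actually change the setting are left as comments since they affect the whole system or need root.

```shell
#!/bin/bash
# describe_aslr VALUE (hypothetical helper): translate a value of
# kernel.randomize_va_space into a short description.
#   0 = randomization off
#   1 = mmap base, stack and VDSO page randomized
#   2 = as 1, plus heap randomization
describe_aslr() {
    case $1 in
        0) echo "disabled" ;;
        1) echo "partial" ;;
        2) echo "full" ;;
        *) echo "unknown" ;;
    esac
}

# Show the current state:
#   describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
# Disable system-wide until reboot (as root):
#   sysctl -w kernel.randomize_va_space=0
# Or disable for a single command only, leaving the rest of the system alone:
#   setarch "$(uname -m)" -R ./test.sh
```

The per-command `setarch -R` form is probably the less intrusive choice for a long creduce run, as it does not weaken the rest of the machine.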
Created attachment 653616 [details] new reduced test case and gcc output Disabling aslr did not seem to change anything - about 20% fails for this file. New attachment is the latest reduced cpp file and output of gcc. I wonder if this isn't another syntax error, and not the real bug?
Someone on the creduce-dev list also suggested building gcc with --with-build-config=bootstrap-asan. Does that make any sense? I don't see how to do so, so I'd need some pointers or instruction.
(In reply to Jack from comment #66) > Created attachment 653616 [details] > new reduced test case and gcc output > > Disabling aslr did not seem to change anything - about 20% fails for this > file. New attachment is the latest reduced cpp file and output of gcc. I > wonder if this isn't another syntax error, and not the real bug? This attachment contains both the C++ file and the gcc errors. I struggle to find where one finishes and the other starts. It also does not contain the actual gcc command. Can you post them separately?
The gcc command was the same one I've been using all along. The first line of the gcc output is the first line that contains "error:" and the error reflects the first line of the file. I'll repost separately when I'm back at my desktop.
(In reply to Jack from comment #67) > Someone on the creduce-dev list also suggested building gcc with > --with-build-config=bootstrap-asan. Does that make any sense? I don't see > how to do so, so I'd need some pointers or instruction. Might work for you. I tried to build such a gcc and it did not flag anything for me on znver2. Maybe you will be luckier. To build asan'd gcc you need to build it as: # EXTRA_ECONF=--with-build-config=bootstrap-asan GCC_MAKE_TARGET=all FEATURES="-sandbox -usersandbox" emerge -v1 gcc FEATURES="-sandbox -usersandbox" will probably be needed for all emerge invocations that will try to use this gcc as asan is not compatible with sandbox.
(In reply to Jack from comment #69) > The gcc command was the same one I've been using all along. The first line > of the gcc output is the first line that contains "error:" and the error > reflects the first line of the file. I'll repost separately when I'm back > at my desktop. I assume it's a command """ /usr/bin/x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -Os -pipe -fomit-frame-pointer -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DNDEBUG -I. -c -o instantiate_predef_macros.o instantiate_predef_macros.cpp """
(In reply to Sergei Trofimovich from comment #70) [snip] > To build asan'd gcc you need to build it as: > > # EXTRA_ECONF=--with-build-config=bootstrap-asan GCC_MAKE_TARGET=all > FEATURES="-sandbox -usersandbox" emerge -v1 gcc The build failed during install phase with five cases of ==14750==ERROR: LeakSanitizer: detected memory leaks but I think it's a delayed report, as the previous lines are make[2]: Leaving directory '/var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/build/x86_64-pc-linux-gnu/libatomic' make[1]: Leaving directory '/var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/build' I assume this is not the place to discuss that issue, but where is?
(In reply to Jack from comment #72) > (In reply to Sergei Trofimovich from comment #70) > [snip] > > To build asan'd gcc you need to build it as: > > > > # EXTRA_ECONF=--with-build-config=bootstrap-asan GCC_MAKE_TARGET=all > > FEATURES="-sandbox -usersandbox" emerge -v1 gcc > The build failed during install phase with five cases of > > ==14750==ERROR: LeakSanitizer: detected memory leaks > > but I think it's a delayed report, as the previous lines are > > make[2]: Leaving directory > '/var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/build/x86_64-pc-linux-gnu/ > libatomic' > make[1]: Leaving directory > '/var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/build' It means something is broken. ./config/bootstrap-asan.mk already disables leak detection: export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1. You can try to also set the variable explicitly in case it leaks out somewhere else. But it should not. It works for me as is. > I assume this is not the place to discuss that issue, but where is? A separate bug on bugs.gentoo.org might be ok.
Thanks. That worked. Some quick tests show lots of leaks detected. I'm going to try to figure out if I can work with the creduced version, or if the leaks on compiling the original preprocessed version need reporting separately.
(In reply to Jack from comment #74) > Thanks. That worked. Some quick tests show lots of leaks detected. I'm > going to try to figure out if I can work with the creduced version, or if > the leaks on compiling the original preprocessed version need reporting > separately. Memory leaks should not be important and are possibly a distraction in this case. We are looking for memory corruption here. Note that the asan-ed version of gcc might not smash the stack anymore. The error could transform into an asan report of a memory write to an out-of-bounds location. To avoid memory leaks getting in the way you will probably need something like: export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1 added to your test script for creduce. Otherwise memory leaks will prevent you from discovering possible memory corruption.
So far, using asan enabled gcc, I have not seen one stack smashing. Could that imply that some out-of-bound write (trapped by asan) was causing the stack smash? I'm going to keep trying to debug using the latest creduced version, but that only gives me a stack smash 15-30% of the time.
(In reply to Jack from comment #76) > So far, using asan enabled gcc, I have not seen one stack smashing. Could > that imply that some out-of-bound write (trapped by asan) was causing the > stack smash? It's not clear if asan detects out-of-bound writes for you. Or everything just works? If it does can you post exact gcc output with a write failure?
Created attachment 653939 [details] gcc output You're right - I only see memory leaks, no out-of-bound writes. Attached is the gcc output from #!/bin/bash export ASAN_OPTIONS=detect_leaks=0:use_odr_indicator=1 x86_64-pc-linux-gnu-g++ -fvisibility-inlines-hidden -Os -pipe -fomit-frame-pointer -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DNDEBUG -c -o instantiate_predef_macros.o instantiate_predef_macros.pp.cpp > gcc_out.txt 2>&1 grep "stack smashing detected" gcc_out.txt || echo OK where instantiate_predef_macros.pp.cpp is the first part of my previous attachment. Without setting the ASAN_OPTIONS, I get exactly the same, but without the memory leak detections at the end.
(In reply to Jack from comment #78) > Created attachment 653939 [details] > gcc output > > You're right - I only see memory leaks, no out-of-bound writes. Attached is > the gcc output from Yeah, that means the asan-enabled gcc generated code that is different enough not to exhibit the failure. We will have to continue with the standard gcc build.
I haven't given up, but I'm pretty close. I've generated several additional initramfs files with genkernel, and not seen the error again. I can no longer generate the stack smash on the original file. The last output from creduce only fails 2 to 5 times out of 100, and I have not yet gotten a stack smash while in gdb. Since it was never 100% reproducible, I'll guess that some other package update (other than gcc) has changed the environment during compilation to decrease the probability of the failure. There are plenty of syntax errors in the gcc output, but I have not checked if they are consistent or not, and in any case they would be errors in boost, not gcc.
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=9bba1f72a9210743fddf664b716b5cf288132922 commit 9bba1f72a9210743fddf664b716b5cf288132922 Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-08-23 09:11:31 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-08-23 09:11:31 +0000 10.2.0: cut 2 patchset Four new patches: + 33_all_lto-O0-mix-ICE-ipa-PR96291.patch: fix -O0 crash for ipa/lto + 34_all_fundecl-ICE-PR95820.patch: fix ICE on invalid templates + 35_all_ipa-fix-bit-CP.patch: fix bad code generation in ipa bit constprop + 36_all_ipa-fix-bit-CP-p2.patch: part 2 of previous patch Bug: https://bugs.gentoo.org/733482 Bug: https://gcc.gnu.org/PR96291 Bug: https://bugs.gentoo.org/730406 Bug: https://gcc.gnu.org/PR95820 Bug: https://bugs.gentoo.org/736685 Bug: https://gcc.gnu.org/PR96482 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> 10.2.0/gentoo/README.history | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The bug has been closed via the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=653d6bf6ea15bdf2db033e3099913bac47f5b0e0 commit 653d6bf6ea15bdf2db033e3099913bac47f5b0e0 Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-08-23 09:17:12 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-08-23 09:17:12 +0000 sys-devel/gcc: cut 2 patchset Four new patches: + 33_all_lto-O0-mix-ICE-ipa-PR96291.patch: fix -O0 crash for ipa/lto + 34_all_fundecl-ICE-PR95820.patch: fix ICE on invalid templates + 35_all_ipa-fix-bit-CP.patch: fix bad code generation in ipa bit constprop + 36_all_ipa-fix-bit-CP-p2.patch: part 2 of previous patch Closes: https://bugs.gentoo.org/733482 Bug: https://gcc.gnu.org/PR96291 Closes: https://bugs.gentoo.org/730406 Bug: https://gcc.gnu.org/PR95820 Closes: https://bugs.gentoo.org/736685 Bug: https://gcc.gnu.org/PR96482 Package-Manager: Portage-3.0.4, Repoman-3.0.1 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> sys-devel/gcc/Manifest | 1 + sys-devel/gcc/gcc-10.2.0-r1.ebuild | 13 +++++++++++++ 2 files changed, 14 insertions(+)
I'm still encountering a gcc stack smashing on boost with this new gcc version, but only on AMD systems, and it doesn't happen all the time. Currently working on reducing the source that reproduces the problem.
(In reply to cJ from comment #83) > I'm still encountering a gcc stack smashing on boost with this new gcc > version, but only on AMD systems, and it doesn't happen all the time. Oh, I closed the bug by accident. We only fixed ICE on invalid code here. You are seeing stack smash with gcc-10.2.0-r1, right? > Currently working on reducing the source that reproduces the problem. Thank you!
Created attachment 656910 [details] reduced boost 1.74 xml_grammar file Attaching current reduction code, don't know if it's exploitable yet. It's still running... bear in mind that there is a lot of CPU days behind this.
(In reply to cJ from comment #85) > Created attachment 656910 [details] > reduced boost 1.74 xml_grammar file > > Attaching current reduction code, don't know if it's exploitable yet. To try to reproduce the crash locally I'll need more details: 1. the exact reduction script, including the compiler parameters used 2. 'emerge --info' output
Created attachment 656980 [details] testcase Attaching testcase that I used; it runs compilation with a simple ${CXX} -o -c ${x}.o ${x}.ii But with several attempts pipelined and distributed on remote hosts. Sample output: 20200826T212900 __main__ INFO Attempt 0 20200826T212901 __main__ INFO Attempt 1 20200826T212901 __main__ INFO Attempt 2 20200826T212901 __main__ INFO Attempt 3 20200826T212902 __main__ INFO Attempt 4 20200826T212902 __main__ INFO Attempt 5 20200826T212902 __main__ INFO Attempt 6 20200826T212902 __main__ INFO Attempt 7 20200826T212902 __main__ INFO Attempt 8 20200826T212902 __main__ INFO Attempt 9 20200826T212902 __main__ INFO Interesting: 2020-08-26T21:29:00.678114 on MacGyver
The emerge --info is not very useful with the configuration that I have (per-package options), so I give you what (I think) may be relevant: - Epyc (3000 or 7000) with ECC RAM, no EDAC issues found - Linux-5.8.{0-2} - all systems were @system up-to-date on ~amd64 as of 2 days ago (glibc 2.32 dev-libs/gmp-6.2.0-r1 dev-libs/mpfr-4.1.0 dev-libs/mpc-1.2.0) except that I was using binutils 2.33.1 - CFLAGS = -pipe -march=znver1 -mno-3dnow -mno-lwp -mno-fma4 -mno-xop -mno-tbm -mno-hle -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mtune=znver1 -O2 - gcc USE = abi_x86_64 amd64 cxx elibc_glibc fortran kernel_linux multilib nls nptl openmp pch pie sanitize ssp userland_GNU vtv I also forgot to put LC_LANG=C in the tests (but adding it still reproduces the problem) and I didn't disable ASLR.
If the stack smash is reasonably reproducible try to extract the backtrace where stack smash happens: https://wiki.gentoo.org/wiki/Stack-smashing-debugging-guide
Count me in, I just got > *** stack smashing detected ***: terminated while building boost via genkernel on a > processor : 31 > vendor_id : AuthenticAMD > cpu family : 23 > model : 8 > model name : AMD Ryzen Threadripper 2950X 16-Core Processor > stepping : 2 > microcode : 0x800820d for the first time myself :(
> gcc.compile.c++ bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/link-static/pch-off/threading-multi/visibility-hidden/xml_wgrammar.o > > "x86_64-pc-linux-gnu-g++" -fvisibility-inlines-hidden -Os -pipe -fomit-frame-pointer -I/var/tmp/genkernel/gk_F0bcQHTT/boost/buildroot/usr/include -std=c++14 -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -ftemplate-depth-255 -fvisibility=hidden -fvisibility-inlines-hidden -DBOOST_ALL_NO_LIB=1 -DNDEBUG -I"." -c -o "bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/link-static/pch-off/threading-multi/visibility-hidden/xml_wgrammar.o" "libs/serialization/src/xml_wgrammar.cpp" > > *** stack smashing detected ***: terminated > In file included from libs/serialization/src/xml_wgrammar.cpp:146: > libs/serialization/src/basic_xml_grammar.ipp: In constructor 'boost::archive::basic_xml_grammar<CharType>::basic_xml_grammar() [with CharType = wchar_t]': > libs/serialization/src/basic_xml_grammar.ipp:364:9: internal compiler error: Aborted > 360 | str_p(BOOST_ARCHIVE_XML_CLASS_NAME()) > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > 361 | >> Eq > | ~~~~~ > 362 | >> L'"' > | ~~~~~~~ > 363 | >> ClassName > | ~~~~~~~~~~~~ > 364 | >> L'"' > | ^~~~~~~ > 0xc77b4f crash_signal > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/toplev.c:328 > 0x7f1ae543837f ??? 
> /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 > 0x7f1ae5438301 __GI_raise > ../sysdeps/unix/sysv/linux/raise.c:50 > 0x7f1ae5421535 __GI_abort > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/stdlib/abort.c:79 > 0x7f1ae547b1d6 __libc_message > ../sysdeps/posix/libc_fatal.c:155 > 0x7f1ae550cbf1 __GI___fortify_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/fortify_fail.c:26 > 0x7f1ae550cbcf __stack_chk_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/stack_chk_fail.c:24 > 0x65a941 cp_gimplify_expr(tree_node**, gimple**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/cp/cp-gimplify.c:955 > 0x9f5d6b gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13465 > 0x9ff060 gimplify_target_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6744 > 0x9f70bf gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13952 > 0x9fc3c0 gimplify_addr_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6171 > 0x9f6cde gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13603 > 0x9fc7e4 gimplify_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:14603 > 0x9fd165 gimplify_call_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:3497 > 0x9f70e2 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13528 > 0x9fbc20 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > 0x9fbc20 gimplify_cleanup_point_expr > 
/var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6567 > 0x9f7073 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13948 > 0x9f6f05 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <https://bugs.gentoo.org/> for instructions. > gcc.compile.c++ bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden/xml_grammar.o > > "x86_64-pc-linux-gnu-g++" -fvisibility-inlines-hidden -Os -pipe -fomit-frame-pointer -I/var/tmp/genkernel/gk_F0bcQHTT/boost/buildroot/usr/include -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -ftemplate-depth-255 -fvisibility=hidden -fvisibility-inlines-hidden -DBOOST_ALL_NO_LIB=1 -DBOOST_SERIALIZATION_DYN_LINK=1 -DNDEBUG -I"." -c -o "bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden/xml_grammar.o" "libs/serialization/src/xml_grammar.cpp" > > *** stack smashing detected ***: terminated > In file included from libs/serialization/src/xml_grammar.cpp:64: > libs/serialization/src/basic_xml_grammar.ipp: In constructor 'boost::archive::basic_xml_grammar<CharType>::basic_xml_grammar() [with CharType = char]': > libs/serialization/src/basic_xml_grammar.ipp:328:9: internal compiler error: Aborted > 325 | str_p(BOOST_ARCHIVE_XML_CLASS_ID()) >> NameTail > | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > 326 | >> Eq > | ~~~~~ > 327 | >> L'"' > | ~~~~~~~ > 328 | >> int_p [xml::assign_object(rv.class_id)] > | ^~~~~~~~~~ > 0xc77b4f crash_signal > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/toplev.c:328 > 0x7fa532c2b37f ??? 
> /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 > 0x7fa532c2b301 __GI_raise > ../sysdeps/unix/sysv/linux/raise.c:50 > 0x7fa532c14535 __GI_abort > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/stdlib/abort.c:79 > 0x7fa532c6e1d6 __libc_message > ../sysdeps/posix/libc_fatal.c:155 > 0x7fa532cffbf1 __GI___fortify_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/fortify_fail.c:26 > 0x7fa532cffbcf __stack_chk_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/stack_chk_fail.c:24 > 0x65a941 cp_gimplify_expr(tree_node**, gimple**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/cp/cp-gimplify.c:955 > 0x9f5d6b gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13465 > 0x9ff060 gimplify_target_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6744 > 0x9f70bf gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13952 > 0x9fc3c0 gimplify_addr_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6171 > 0x9f6cde gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13603 > 0x9fc7e4 gimplify_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:14603 > 0x65a720 cp_gimplify_expr(tree_node**, gimple**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/cp/cp-gimplify.c:891 > 0x9f5d6b gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13465 > 0xa02403 gimplify_modify_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:5766 > 0x9f6bb3 
gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13556 > 0x9ff060 gimplify_target_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6744 > 0x9f70bf gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13952 > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <https://bugs.gentoo.org/> for instructions. > ...skipped <pbin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden>libboost_serialization.so.1.73.0 for lack of <pbin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden>xml_grammar.o... > ...skipped <p/var/tmp/genkernel/gk_F0bcQHTT/boost/boost_1_73_0/stage/lib>libboost_serialization.so.1.73.0 for lack of <pbin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden>libboost_serialization.so.1.73.0... > ...skipped <p/var/tmp/genkernel/gk_F0bcQHTT/boost/boost_1_73_0/stage/lib>libboost_serialization.so.1 for lack of <p/var/tmp/genkernel/gk_F0bcQHTT/boost/boost_1_73_0/stage/lib>libboost_serialization.so.1.73.0... > gcc.compile.c++ bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden/xml_wgrammar.o > > "x86_64-pc-linux-gnu-g++" -fvisibility-inlines-hidden -Os -pipe -fomit-frame-pointer -I/var/tmp/genkernel/gk_F0bcQHTT/boost/buildroot/usr/include -std=c++14 -fPIC -m64 -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -ftemplate-depth-255 -fvisibility=hidden -fvisibility-inlines-hidden -DBOOST_ALL_NO_LIB=1 -DBOOST_SERIALIZATION_DYN_LINK=1 -DNDEBUG -I"." 
-c -o "bin.v2/libs/serialization/build/gcc-10.2/gentoorelease/pch-off/threading-multi/visibility-hidden/xml_wgrammar.o" "libs/serialization/src/xml_wgrammar.cpp" > > *** stack smashing detected ***: terminated > In file included from ./boost/spirit/home/classic/core/non_terminal/rule.hpp:33, > from ./boost/spirit/include/classic_rule.hpp:11, > from ./boost/archive/impl/basic_xml_grammar.hpp:53, > from libs/serialization/src/xml_wgrammar.cpp:19: > ./boost/spirit/home/classic/core/non_terminal/impl/rule.ipp: In constructor 'boost::spirit::classic::impl::concrete_parser<ParserT, ScannerT, AttrT>::concrete_parser(const ParserT&) [with ParserT = boost::spirit::classic::sequence<boost::spirit::classic::sequence<boost::spirit::classic::sequence<boost::spirit::classic::sequence<boost::spirit::classic::strlit<const char*>, boost::spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t> >, boost::spirit::classic::scanner_policies<> >, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t> >, boost::spirit::classic::chlit<wchar_t> >, boost::spirit::classic::action<boost::spirit::classic::uint_parser<unsigned int>, boost::archive::xml::assign_impl<unsigned int> > >, boost::spirit::classic::chlit<wchar_t> >; ScannerT = boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<wchar_t*, std::__cxx11::basic_string<wchar_t> >, boost::spirit::classic::scanner_policies<> >; AttrT = boost::spirit::classic::nil_t]': > ./boost/spirit/home/classic/core/non_terminal/impl/rule.ipp:235:54: internal compiler error: Aborted > 235 | concrete_parser(ParserT const& p_) : p(p_) {} > | ^ > 0xc77b4f crash_signal > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/toplev.c:328 > 0x7f318499137f ??? 
> /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 > 0x7f3184991301 __GI_raise > ../sysdeps/unix/sysv/linux/raise.c:50 > 0x7f318497a535 __GI_abort > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/stdlib/abort.c:79 > 0x7f31849d41d6 __libc_message > ../sysdeps/posix/libc_fatal.c:155 > 0x7f3184a65bf1 __GI___fortify_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/fortify_fail.c:26 > 0x7f3184a65bcf __stack_chk_fail > /var/tmp/portage/sys-libs/glibc-2.32-r1/work/glibc-2.32/debug/stack_chk_fail.c:24 > 0x65a941 cp_gimplify_expr(tree_node**, gimple**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/cp/cp-gimplify.c:955 > 0x9f5d6b gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13465 > 0x9fbc20 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > 0x9fbc20 gimplify_cleanup_point_expr > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6567 > 0x9f7073 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13948 > 0x9f6f05 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > 0x9f6f05 gimplify_statement_list > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:1869 > 0x9f6f05 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:14000 > 0x9f6ab1 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > 0x9f6ab1 gimplify_and_add(tree_node*, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:486 > 0x9f6ab1 
gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:13907 > 0x9f6f05 gimplify_stmt(tree_node**, gimple**) > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:6825 > 0x9f6f05 gimplify_statement_list > /var/tmp/portage/sys-devel/gcc-10.2.0-r1/work/gcc-10.2.0/gcc/gimplify.c:1869 > Please submit a full bug report, > with preprocessed source if appropriate. > Please include the complete backtrace with any bug report. > See <https://bugs.gentoo.org/> for instructions. >
Now see if you can extract the exact command and source file that triggered the stack smash. See comment 36 for creating a fully standalone example, and then comment 8 (and others) for using creduce to create a minimal example. I'd also be curious how reproducible the error is. I finally gave up when I could only get the stack smash one or two times out of a hundred tries.
It already passed on the second attempt.
I now get the stack smash when emerging boost (either 1.73.0 or 1.74.0) with gcc 10.2.0-r1. In a few days, I should have enough time to make a new attempt to see how consistent it is (I've had three or four failures so far) and to try creduce again.
Sorry guys I'm catching up on other stuff right now.
The emerge of boost failed repeatedly for me last week. After some days away, I extracted the failing command and created a preprocessed file with gcc -E -P. That version hit the stack smash in 50-60% of tries. Emerging creduce also pulled in compiler-rt and compiler-rt-sanitizers 9.01 (10.0.0 were already installed). After that, I have not had a single stack smash. Given that the compile command uses x86_64-pc-linux-gnu-g++, I have no idea why that would make any difference. When the creduce emerge is completely done, I'll quickpkg the newly re-emerged slotted packages and see if the stack smashing returns. Does this make any sense to anyone?
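For anyone repeating the preprocessing step, here is a rough sketch of how a logged compile command might be turned into a `-E -P` preprocessing command. The helper name `to_preprocess_command` and the output file name are made up for illustration, and it only handles `-o` given as a separate argument, as in the b2 log lines quoted earlier in this bug.

```python
import shlex


def to_preprocess_command(compile_cmd, out_path):
    """Turn a logged 'g++ ... -c -o obj src' line into a preprocess-only command.

    Hypothetical helper: drops '-c' and the original '-o <file>' pair, then
    adds '-E -P' so the compiler emits preprocessed source without linemarkers.
    Only the separate-argument form of '-o' is handled.
    """
    args = shlex.split(compile_cmd)  # also strips the quotes b2 puts around paths
    kept = []
    skip_next = False
    for arg in args:
        if skip_next:          # this is the old output file name; drop it
            skip_next = False
            continue
        if arg == "-c":        # compile-only flag is replaced by -E
            continue
        if arg == "-o":        # drop '-o' and remember to drop its value
            skip_next = True
            continue
        kept.append(arg)
    compiler, rest = kept[0], kept[1:]
    return [compiler, "-E", "-P", *rest, "-o", out_path]
```

The returned list can be passed straight to `subprocess.run`. The resulting file can then be compiled with the original optimisation flags to check whether the crash still occurs without the boost source tree.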
(In reply to Jack from comment #96) > The emerge of boost failed repeatedly for me last week. After some days > away, I extracted the failing command, and created a preprocessed file with > gcc -E -P. That version had the stack smash 50-60% of tries. Emerging > creduce also pulled in compiler-rt and compiler-rt-sanitizers 9.01. (10.0.0 > were already installed). After that, I have not had a single stack smash. > Given the compile command uses x86_64-pc-linux-gnu-g++ I have no idea why > that would make any difference. When the creduce emerge is completely done, > I'll quickpkg the newly re-emerge slotted packages and see if the stack > smashing returns. > > Does this make any sense to anyone? llvm install should not affect gcc install.
It wasn't llvm; it was just the irreproducibility showing up again. I did another creduce and ended up with a 24-line file. In runs of 100, I generally get 60-80 stack smashes. However, I've had times when I got 0 (that's ZERO) smashes in 100 runs, three or four times in a row. Is it conceivable that it depends on which core it runs on? I wouldn't think so, but trying to go further is no fun with the failure that inconsistent.
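A loop like the "runs of 100" above could be scripted roughly as follows. This is only a sketch: the function name `count_failures` is made up, and it assumes the failure can be recognised by the "stack smashing detected" text that appears in the logs quoted earlier.

```python
import subprocess

# Text printed by glibc when the stack protector fires (taken from the logs above).
MARKER = "stack smashing detected"


def count_failures(cmd, runs=100, marker=MARKER):
    """Run cmd `runs` times and count how often `marker` shows up in its output."""
    failures = 0
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # the message goes to stderr, so merge it
            text=True,
            errors="replace",          # compiler output may not be clean UTF-8
        )
        if marker in proc.stdout:
            failures += 1
    return failures


# Example call (file name is hypothetical):
# count_failures(["x86_64-pc-linux-gnu-g++", "-Os", "-c", "reduced.ii", "-o", "/dev/null"])
```

Counting on the message rather than the exit status keeps ordinary compile errors from being mistaken for the crash.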
(In reply to Jack from comment #98) > It wasn't llvm, it was just the irreproducibility showing up again. I did > another creduce, and ended up with a 24 line file. On runs of 100, I > generally get 60-80 stack smashes. However, I've had times when I got 0 > (that's ZERO) smashes in 100 runs, three or four times in a row. Is it > conceivably possible it depends on which core it runs on? I wouldn't think > so, but trying to go further is no fun with the failure that inconsistent. The core might be it, or the physical memory layout. Cores can be tested by pinning the binary to a specific CPU, say with taskset; to run on cpu0: $ taskset -c 0 ./the-command
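To compare cores systematically, the taskset idea could be wrapped in a small script along these lines. This is a sketch only: `pin_command` and `failures_per_cpu` are made-up names, it assumes `taskset` from util-linux is installed, and it detects the crash by the "stack smashing detected" text from the logs.

```python
import subprocess


def pin_command(cmd, cpu):
    """Prefix cmd with 'taskset -c <cpu>' so it runs on one logical CPU only."""
    return ["taskset", "-c", str(cpu), *cmd]


def failures_per_cpu(cmd, cpus, runs=20, marker="stack smashing detected"):
    """Return {cpu: number of runs (out of `runs`) whose output contained marker}."""
    counts = {}
    for cpu in cpus:
        failed = 0
        for _ in range(runs):
            proc = subprocess.run(
                pin_command(cmd, cpu),
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,  # the crash message is written to stderr
                text=True,
                errors="replace",
            )
            if marker in proc.stdout:
                failed += 1
        counts[cpu] = failed
    return counts


# Example call (file name is hypothetical), covering 16 logical CPUs:
# failures_per_cpu(["x86_64-pc-linux-gnu-g++", "-Os", "-c", "reduced.ii", "-o", "/dev/null"], range(16))
```

If the counts differ sharply between CPUs, that would point at a core-specific problem; roughly equal counts would suggest looking elsewhere, such as memory layout.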
I'm wondering if this should just be closed as "worksforme", because I've since done several clean emerges of boost 1.73, and now one of 1.74, and also multiple successful runs of genkernel. Using the latest result from creduce, run in a loop of 100 tries, I generally get 0 stack smashes, and I don't think I've had more than about 30% in over a month. I've also not identified any pattern in when I get no failures and when I get 20-30%.
In bug #724314 I got access to a system where it is somewhat reliably reproducible. Let's close this one as a dupe of that bug. *** This bug has been marked as a duplicate of bug 724314 ***
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=7adab39a82fd07085f600603bdc5c440aa1c142a commit 7adab39a82fd07085f600603bdc5c440aa1c142a Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-12-29 09:51:52 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-12-29 09:51:52 +0000 10.2.0: revert PR95820 backporting The backport breaks parsing as seen in https://gcc.gnu.org/PR98441 Bug: https://gcc.gnu.org/PR95820 Bug: https://bugs.gentoo.org/730406 Reported-by: Daniel Santos Bug: https://gcc.gnu.org/PR98441 Bug: https://bugs.gentoo.org/762382 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> 10.2.0/gentoo/34_all_fundecl-ICE-PR95820.patch | 25 ------------------------- 10.2.0/gentoo/README.history | 3 +++ 2 files changed, 3 insertions(+), 25 deletions(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/proj/gcc-patches.git/commit/?id=3779ee05e8c035d435c62198f3b761516e63fdf0 commit 3779ee05e8c035d435c62198f3b761516e63fdf0 Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-12-29 10:05:50 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-12-29 10:05:50 +0000 10.2.0: cut 6 patchset Single dropped patch: - 34_all_fundecl-ICE-PR95820.patch: revert PR95820 backporting Bug: https://gcc.gnu.org/PR95820 Bug: https://bugs.gentoo.org/730406 Reported-by: Daniel Santos Bug: https://gcc.gnu.org/PR98441 Bug: https://bugs.gentoo.org/762382 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> 10.2.0/gentoo/README.history | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The bug has been referenced in the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8667621e1a1fb7684b9b8d112255f33f8d5ec978 commit 8667621e1a1fb7684b9b8d112255f33f8d5ec978 Author: Sergei Trofimovich <slyfox@gentoo.org> AuthorDate: 2020-12-29 10:11:31 +0000 Commit: Sergei Trofimovich <slyfox@gentoo.org> CommitDate: 2020-12-29 10:14:44 +0000 sys-devel/gcc: 10.2.0: cut 6 patchset Single dropped patch: - 34_all_fundecl-ICE-PR95820.patch: revert PR95820 backporting Bug: https://gcc.gnu.org/PR95820 Bug: https://bugs.gentoo.org/730406 Reported-by: Daniel Santos Bug: https://gcc.gnu.org/PR98441 Closes: https://bugs.gentoo.org/762382 Package-Manager: Portage-3.0.12, Repoman-3.0.2 Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org> sys-devel/gcc/Manifest | 1 + sys-devel/gcc/gcc-10.2.0-r5.ebuild | 18 ++++++++++++++++++ 2 files changed, 19 insertions(+)