During an emerge -e world, my system starts misbehaving at some point. After a few restarts, I'm quite confident that things go wrong during the merge phase of gcc.

Symptoms are:
- windowmanager (blackbox) crashes
- kde apps (konqueror) segfault
- xorg terminates

Also, I've seen portage abort the gcc merge with "Segmentation fault".

The system runs stable for >2 days (too noisy -> shut down at night). This problem appeared about 3 weeks ago and has been pretty consistent since then.

Reproducible: Always

Steps to Reproduce:
1. emerge -e world

Expected Results:
The emerge should have proceeded

Portage 2.0.51.19 (default-linux/x86/2004.3, gcc-3.4.3-20050110, glibc-2.3.4.20050125-r0, 2.6.10 i686)
=================================================================
System uname: 2.6.10 i686 AMD Athlon(tm) XP 2600+
Gentoo Base System version 1.6.9
Python: dev-lang/python-2.3.5 [2.3.5 (#1, Mar 9 2005, 16:44:00)]
ccache version 2.3 [enabled]
dev-lang/python: 2.3.5
sys-devel/autoconf: 2.59-r6, 2.13
sys-devel/automake: 1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.5
sys-devel/binutils: 2.15.92.0.2-r5
sys-devel/libtool: 1.5.10-r5
virtual/os-headers: 2.6.8.1-r2
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -march=athlon-xp -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig buildpkg ccache distlocks sandbox sfperms"
GENTOO_MIRRORS="http://pandemonium.tiscali.de/pub/gentoo/"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 3dnow 3dnowex X a52 aac aalib acpi alsa apache2 apm arts avi bash-completion berkdb bidi bigger-fonts bitmap-fonts bootsplash ccache cdda cddb cdparanoia cdr chroot codecs crypt cups curl dga dv dvd dvdr dvdread ecc emboss encode esd faad fam fbcon ffmpeg fftw font-server foomaticdb fortran gd gd-external gdbm gif glep gmp gpm imagemagick imlib ipv6 jabber jack jikes jit jpeg jpeg2k kde kdeenablefinal ladcca libg++ libwww lm_sensors lzo lzw-tiff mad matroska mhash mikmod mime mjpeg mmx mmx2 mng monkey motif mozilla mozsvg mp3 mpeg mpeg4 ncurses nls nptl nvidia ogg oggvorbis openal opengl oss pam parse-clocks pdf pdflib perl physfs png python qt quicktime readline real recode ruby samba sdl spell sse sse2 ssl svga tcpd threads tiff transcode truetype truetype-fonts type1-fonts utf8 xml2 xv zlib"
Unset: ASFLAGS, CBUILD, CTARGET, LANG, LC_ALL, PORTDIR_OVERLAY
You need to provide more information, such as the crash messages from GCC and backtraces from kde/xorg/blackbox. But I'd venture your hardware isn't up to scratch.
It's not the hardware. I've been compiling for ~8h now without any problems. If I understand it correctly, the weirdness happens during the gcc merge phase. Providing any backtraces etc. is quite difficult since most running programs terminate ... I've seen blackbox quit with "signal 11", not much info on the other apps. It's one of those strange bugs that are hard to track down, but it is really annoying and might affect other users in the future.
Compile your glibc and the crashing apps with FEATURES="nostrip" CFLAGS="-ggdb3 -O2 -march=athlon-xp", then enable core dumps (ulimit -c ......). Once you get a core dump file, run 'gdb binary corefile' and use the 'bt' command in gdb to generate the backtrace.
Quick "me too" here, so that Patrick doesn't feel alone. :) I'm doing the same emerge -e, but I have no X running on this host. Symptoms are: 61 packages emerged without problems, then comes gcc-config-1.3.10-r1, big badaboom. Segmentation fault. Same result with gcc-3.4.3-20050110; everything (!) else compiles without any problems whatsoever. Tell me what you need in terms of info from me. Here's an emerge info for starters:

Portage 2.0.51.19 (default-linux/x86/2004.0, gcc-3.4.3-20050110, glibc-2.3.4.20050125-r0, 2.6.11-gentoo-r1 i686)
=================================================================
System uname: 2.6.11-gentoo-r1 i686 AMD Athlon(tm) Processor
Gentoo Base System version 1.6.9
Python: dev-lang/python-2.2.3-r5,dev-lang/python-2.3.5 [2.3.5 (#1, Mar 10 2005, 02:06:20)]
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
dev-lang/python: 2.2.3-r5, 2.3.5
sys-devel/autoconf: 2.59-r6, 2.13
sys-devel/automake: 1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.5
sys-devel/binutils: 2.15.92.0.2-r5
sys-devel/libtool: 1.5.10-r5
virtual/os-headers: 2.4.19-r1, 2.4.22-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon -O3 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3.3/env /usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=athlon -O3 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs autoconfig ccache distcc sandbox sfperms userpriv"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j7"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="x86 X apm arts avi berkdb bitmap-fonts canna cdr crypt cups curl dvd emboss encode esd fam fbcon flac font-server foomaticdb fortran freewnn gdbm gif gpm gtk gtk2 imlib immqt-bc ipv6 java jpeg kde libg++ libwww mad mikmod motif mozaccess-builtin mozctl mozilla mp3 mpeg ncurses nls nocardbus oggvorbis opengl oss pam pdflib perl png python qt quicktime readline scanner sdl slang spell ssl svga tcltk tcpd tiff truetype truetype-fonts type1-fonts userlocales xml2 xmms xv zlib"
Unset: ASFLAGS, CBUILD, CTARGET, LANG, LC_ALL, LDFLAGS, PORTDIR_OVERLAY
Building in a chroot doesn't trigger it. Really interesting ...
Quick request: please turn off the sandbox during these tests (eradicator has reported some craziness with sandbox and some gcc-config stuff he's hacking on).
FEATURES="-sandbox" doesn't help, I still get the segfault right after ">>> Regenerating /etc/ld.so.cache"
when you get the segfault, there should be a core file somewhere as I noted.
It won't create a core file unless you set a new core file size limit with ulimit -c (the default is 0 = no core file). The unit is 1024 bytes, so better choose a large value, e.g. 32 MB = 32768. The error message will then change to "Segmentation fault (core dumped)".
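The core dump steps discussed above can be sketched roughly as follows. This is only an illustration: the helper name backtrace_from_core and the example paths are hypothetical, and the 1024-byte unit is what bash uses (other shells may count differently).

```shell
# Raise the core-file size limit for this shell and its children.
# In bash the value is counted in 1024-byte blocks, so 32768 means 32 MB.
ulimit -c 32768 2>/dev/null || echo "could not raise the core limit (hard limit may be lower)"

# Read the limit back; 0 means no core file will be written.
core_limit=$(ulimit -c)
echo "core file size limit: ${core_limit}"

# Hypothetical helper: print a backtrace from a core file non-interactively.
# Example call (paths are assumptions): backtrace_from_core /usr/bin/python2.3 ./core
backtrace_from_core() {
    binary=$1
    corefile=$2
    if [ ! -r "$binary" ] || [ ! -r "$corefile" ]; then
        echo "usage: backtrace_from_core BINARY COREFILE" >&2
        return 2
    fi
    # -batch exits after running the commands; -ex bt runs gdb's 'bt' command.
    gdb -batch -ex bt "$binary" "$corefile"
}
```

The limit only applies to processes started from the shell where ulimit was run, so the failing emerge has to be launched from that same shell.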
Besides the fact that I've been out of town since my last comment, had to emerge gdb to begin with, and am currently watching glibc compile with the flags Robin suggested, I'm having a bit of a chicken-and-egg problem here: I get the segfaults exclusively from `emerge gcc` and `emerge gcc-config`, so what's the "crashing application" I should debug in those cases? If what Patrick said is true, that's probably gcc, but since the segmentation fault occurs every time I emerge just that, I'll have difficulties building it with nostrip and -ggdb3... You need those to make sense of the core file, right?
Build gcc and glibc with the required use flags on another machine that works, or get somebody else to (I've got cpu power if you need it).
Just noticed this bug. Hey, same thing here. Host system is amd64 non-multilib.

+ gcc-config x86_64-pc-linux-gnu-3.4.3
 * Switching to x86_64-pc-linux-gnu-3.4.3 compiler... [ ok ]
+ rm -f //usr/sbin/gcc-config
+ set +x
>>> Regenerating /etc/ld.so.cache...
>>> sys-devel/gcc-config-1.3.10-r1 merged.
Segmentation fault (core dumped)

python is compiled with all the good stuff, but it still yields no human-readable backtrace. The core does show the segfault is caused by '/usr/bin/python2.3 -O /usr/bin/emerge gcc-config', and it only happens in the qmerge phase. Running from the command line is fine. My guess is a (python@ || portage@) bug and not toolchain@. I'll continue to debug today to see what I can learn about this, as it's a blocker for hardened doing any amd64 stages.
It's a toolchain bug. This fixes it.

--- gcc-config-1.3.10.orig 2005-03-19 20:27:01.223207288 +0000
+++ gcc-config-1.3.10 2005-03-19 20:27:32.493453488 +0000
@@ -229,7 +229,7 @@
 	# On many systems (x86/amd64/etc...), this will probably never matter,
 	# but on other systems (arm/mips/etc...), this is quite critical.
 	# http://bugs.gentoo.org/show_bug.cgi?id=60190
-	if ! is_cross_compiler ; then
+	if ! is_cross_compiler && [[ $PORTAGE_CALLER != "emerge" ]] ; then
 		for multilib in $(${ROOT}/${GCC_BIN_PATH}/gcc -print-multi-lib); do
 			multiarg=${multilib#*;}
 			multiarg=${multiarg/@/-}
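To illustrate what the changed condition does, here is a small standalone sketch. It assumes Portage exports PORTAGE_CALLER=emerge while merging, as the patch implies. is_cross_compiler is stubbed out and should_scan_multilib is a hypothetical name; neither is copied from gcc-config. The sketch uses the portable [ ] test instead of the bash-only [[ ]] from the patch.

```shell
# Hypothetical stand-in for gcc-config's is_cross_compiler: report "native".
is_cross_compiler() { return 1; }

# Returns 0 when the multilib loop would run, 1 when the patched condition skips it.
should_scan_multilib() {
    if ! is_cross_compiler && [ "${PORTAGE_CALLER:-}" != "emerge" ]; then
        return 0
    fi
    return 1
}

# Case 1: gcc-config invoked from within an emerge run.
PORTAGE_CALLER="emerge"
if should_scan_multilib; then from_emerge="scan"; else from_emerge="skip"; fi

# Case 2: gcc-config invoked by hand from the command line.
PORTAGE_CALLER=""
if should_scan_multilib; then from_cli="scan"; else from_cli="skip"; fi

echo "from emerge: ${from_emerge}; from command line: ${from_cli}"
```

With the stub above, the multilib loop is skipped under emerge and still runs from the command line, which matches the reports that running gcc-config by hand never crashed.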
Confirmed, changing that line in /usr/portage/sys-devel/gcc-config/files/gcc-config-1.3.10 makes the segfault go away. Thanks a lot, solar!
patrick: can you confirm ?
gcc-config seems to work. emerge gcc 3.4.3-20050110-r1 fails; will apply the patch and test.
patch works. Awaiting new gcc-config release ;-)
already been fixed in portage