I'm doing 'emerge -e world' in a fresh stage3 chroot with an updated portage tree. Everything looks great until gcc-3.4.6 starts compiling. It hangs here:

 * Updating libf2c/libU77/configure ...
 * Updating libffi/configure ...
 * Updating boehm-gc/configure ...

A bit of digging shows that it's stuck in 'touch_files' in contrib/gcc_update. It reliably gets stuck here in a wide range of circumstances. Entering the chroot while it's hung and running 'make -s -f Makefile.XX' completes successfully with no output, which should exit the 'while' loop in 'touch_files'. Suspecting things were hung elsewhere, I tried 'echo all: > Makefile.XX', and that caused things to un-hang and carry on.

I see the same behavior on a different version of gcc in http://bugs.gentoo.org/show_bug.cgi?id=117473; the solution there was to upgrade to a newer 2.4 sparc64 kernel version. I'm running 2.6.15-gentoo-r7. Emerge info follows.

Portage 2.0.54 (default-linux/amd64/2006.0, gcc-3.4.4, glibc-2.3.5-r2, 2.6.15-gentoo-r7 x86_64)
=================================================================
System uname: 2.6.15-gentoo-r7 x86_64
Gentoo Base System version 1.6.14
app-admin/eselect-compiler: [Not Present]
dev-lang/python: 2.4.2
dev-python/pycrypto: [Not Present]
dev-util/ccache: [Not Present]
dev-util/confcache: [Not Present]
sys-apps/sandbox: 1.2.12
sys-devel/autoconf: 2.13, 2.59-r6
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1-r3
sys-devel/gcc-config: 1.3.13-r3
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=opteron -O2 -fomit-frame-pointer -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/env.d"
CXXFLAGS="-march=opteron -O2 -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig buildpkg digest distcc distlocks nodoc sandbox sfperms strict"
GENTOO_MIRRORS="http://gentoo.cites.uiuc.edu/pub/gentoo/"
LC_ALL="en_US.utf8"
PKGDIR="/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.us.gentoo.org/gentoo-portage"
USE="amd64 alsa avi bash-completion berkdb bitmap-fonts bzip2 cli crypt cups dlloader dri eds encode expat foomaticdb fortran gif gpm imlib isdnlog jpeg lzw lzw-tiff ncurses nls nptl nptlonly opengl pam pcre pdflib perl png pppd python qt3 qt4 readline reflection session spell spl ssl tcpd threads tiff truetype-fonts type1-fonts udev usb xml xml2 xorg xpm xv zlib input_devices_keyboard input_devices_mouse input_devices_evdev userland_GNU kernel_linux elibc_glibc"
Unset: CTARGET, INSTALL_MASK, LANG, LDFLAGS, LINGUAS, MAKEOPTS, PORTAGE_RSYNC_EXTRA_OPTS, PORTAGE_RSYNC_OPTS
run `ps aux` and post the output
Created attachment 92216: Edited ps -auxwwf output during hang
This is the relevant section of the output.
sync up your tree ... then edit toolchain.eclass and go to the place where ./contrib/gcc_update is actually called ... change it to run like:

    bash -x ./contrib/gcc_update --touch

that should give you some useful debug output
OK, sounds good. This is driven from a script which basically untars stuff into the chroot, then runs

    chroot $img emerge --sync
    chroot $img emerge -C pam-login
    chroot $img emerge -e world

and it's within that emerge world that things go awry. Breaking the script and running

    chroot $img emerge gcc

reliably does *not* hang this way. So I'll add the patch you suggest to the script and run it again and report back.
Well, shoot, it looks like your change in the tree caused the patch I included in my script to fail. Anyway, here is the result:

 * Touching generated files
 * Touching gcc/cstamp-h.in
 * Touching gcc/config.in
 * Touching libjava/aclocal.m4
 * Touching libjava/Makefile.in

It'd be nice to run the 'make' with -d, but that's not something I can patch quite so easily.
I tried adding MAKEFLAGS:

    einfo "Touching generated files"
    MAKEFLAGS=-d ./contrib/gcc_update --touch | \
    while read f ; do
        einfo " ${f%%...}"
    done

but that didn't get the 'make' debugging output. However, on this run:

 * Touching generated files
 * Touching gcc/cstamp-h.in
 * Touching gcc/config.in
 * Touching libjava/aclocal.m4
 * Touching libjava/Makefile.in
 * Touching libjava/configure

the differences from the last run suggest that there's something funny with the timing going on here.
well can you try the change i suggested already ?

- ./contrib/gcc_update --touch | \
+ bash -x ./contrib/gcc_update --touch | \
result: a whole lot of

 * + make -s -f Makefile.11805 all
 * + grep .
 * + sleep 1
 * + make -s -f Makefile.11805 all
 * + grep .
 * + sleep 1

it would seem that the 'grep' invocation isn't doing what you want.
Yet this:

    [chroot] gcc-3.4.6 # while ${MAKE-make} -s -f Makefile.11805 all | grep . > /dev/null; do sleep 1; done

completes immediately, even while the 'emerge world' is still hung on the same line; removing that '>/dev/null' does not give any output, either. Putting the above in 'foo.sh' and running 'sandbox ./foo.sh' *also* returns immediately. I'm in over my head here. What else could it be?
    find /export/stuff/img/var/tmp/portage/gcc-3.4.6-r1/work/ | xargs touch

causes the Makefile to touch configure, but not break out of the loop. Again grasping at straws: the directory containing this image is *exported* via NFS (but not mounted by anything at the moment). Here's the lsof output for the bash invocation of gcc_update:

bash 11805 root cwd DIR 253,1 4096 5685252 /export/stuff/img/var/tmp/portage/gcc-3.4.6-r1/work/gcc-3.4.6
bash 11805 root rtd DIR 253,1 4096 4947969 /export/stuff/img
bash 11805 root txt REG 253,1 791792 5017885 /export/stuff/img/bin/bash
bash 11805 root mem REG 0,0 0 [heap] (stat: No such file or directory)
bash 11805 root mem REG 253,1 107498 4948032 /export/stuff/img/lib64/ld-2.3.5.so
bash 11805 root mem REG 253,1 373 4983659 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_IDENTIFICATION
bash 11805 root mem REG 253,1 21546 4951876 /export/stuff/img/usr/lib64/gconv/gconv-modules.cache
bash 11805 root mem REG 253,1 23 4983652 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_MEASUREMENT
bash 11805 root mem REG 253,1 59 4983657 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_TELEPHONE
bash 11805 root mem REG 253,1 155 4983647 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_ADDRESS
bash 11805 root mem REG 253,1 77 4983655 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_NAME
bash 11805 root mem REG 253,1 34 4983658 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_PAPER
bash 11805 root mem REG 253,1 52 4983649 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_MESSAGES/SYS_LC_MESSAGES
bash 11805 root mem REG 253,1 286 4983651 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_MONETARY
bash 11805 root mem REG 253,1 882134 4983656 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_COLLATE
bash 11805 root mem REG 253,1 2451 4983654 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_TIME
bash 11805 root mem REG 253,1 54 4983650 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_NUMERIC
bash 11805 root mem REG 253,1 26280 4952331 /export/stuff/img/usr/lib64/libsandbox.so.0.0.0
bash 11805 root mem REG 253,1 11280 4948003 /export/stuff/img/lib64/libdl-2.3.5.so
bash 11805 root mem REG 253,1 1255872 4948013 /export/stuff/img/lib64/tls/libc-2.3.5.so
bash 11805 root mem REG 253,1 208464 4983653 /export/stuff/img/usr/lib64/locale/en_US.utf8/LC_CTYPE
bash 11805 root 0u CHR 136,2 4 /dev/pts/2
bash 11805 root 1w FIFO 0,5 567851 pipe
bash 11805 root 2w FIFO 0,5 567851 pipe
bash 11805 root 10u CHR 136,2 4 /dev/pts/2
bash 11805 root 255r REG 253,1 7222 5752118 /export/stuff/img/var/tmp/portage/gcc-3.4.6-r1/work/gcc-3.4.6/contrib/gcc_update
SpanKY, I think I've found it. A 'strace -p 11805 -f' showed me that the 'make' was, in fact, producing output:

    [pid 6422] write(1, "make[2]: Leaving directory `/var"..., 74) = 74

but my "by-hand" make invocation didn't do that. I checked the environment for process 11805, and found MAKEFLAGS=w. Searching up the process tree, it turns out this comes from my script at the toplevel, which is run from a Makefile. That poor variable is handed down through /many/ intervening processes!

So now I have a solution -- add 'MAKEFLAGS=' and 'MAKELEVEL=' to my script -- but it would probably be a good idea to patch gcc_update to handle this, too. The easiest would probably be to add --no-print-directory to the 'make' invocation. Thoughts? It's an easy change, but I can submit a patch if you'd like.
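For anyone chasing a similar inherited variable, the check described here (look at the environment of the hung process, then walk up its ancestors) can be sketched roughly as below. This is only an illustration, not what was actually run for this report. It assumes a Linux /proc filesystem and enough privileges to read other processes' environ files, and the helper names parse_environ, parent_pid and trace_variable are made up for the example. Note that /proc/<pid>/environ reflects the environment a process was started with.

```python
def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE records of /proc/<pid>/environ."""
    env = {}
    for record in raw.split(b"\0"):
        if b"=" in record:
            key, _, value = record.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def parent_pid(stat_line: str) -> int:
    """Extract the parent PID from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split after the last ')'. What follows is: state, ppid, ...
    fields_after_comm = stat_line.rsplit(")", 1)[1].split()
    return int(fields_after_comm[1])


def trace_variable(pid: int, name: str):
    """Yield (pid, value) for pid and every ancestor that has `name` set."""
    while pid > 0:
        try:
            with open(f"/proc/{pid}/environ", "rb") as f:
                env = parse_environ(f.read())
            with open(f"/proc/{pid}/stat") as f:
                next_pid = parent_pid(f.read())
        except OSError:
            # Process is gone or we are not allowed to read it: stop here.
            break
        if name in env:
            yield pid, env[name]
        pid = next_pid
```

Iterating over trace_variable(11805, "MAKEFLAGS") while the build is hung would list each process in the chain that was started with the variable set, which points at the place where it first enters the tree.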
*** Bug 140921 has been marked as a duplicate of this bug. ***
On second thought, does it make sense to patch emerge to clear those variables? I imagine that there are a small but apparently non-trivial number of ebuilds that will break in funny ways when those variables are set, and I'm probably not the first or last person to run an emerge from make.

  # $Id$

  import os,sys
  os.environ["PORTAGE_CALLER"]="emerge"
+ if os.environ.has_key("MAKEFLAGS"): del os.environ["MAKEFLAGS"]
+ if os.environ.has_key("MAKELEVEL"): del os.environ["MAKELEVEL"]
+ if os.environ.has_key("MFLAGS"): del os.environ["MFLAGS"]
  sys.path = ["/usr/lib/portage/pym"]+sys.path

  import errno
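The same idea can also be applied in the wrapper that launches emerge rather than in emerge itself. The sketch below is written for Python 3, where dict.has_key no longer exists. The names MAKE_VARS, scrubbed_env and run_clean are hypothetical, and the variable list is simply the three names mentioned in this comment, not an exhaustive set.

```python
import os
import subprocess

# Variables GNU make exports to child processes; taken from this report.
MAKE_VARS = ("MAKEFLAGS", "MAKELEVEL", "MFLAGS")


def scrubbed_env(env=None):
    """Return a copy of env (default: os.environ) without make's variables."""
    cleaned = dict(os.environ if env is None else env)
    for name in MAKE_VARS:
        # pop() with a default does nothing when the variable is not set.
        cleaned.pop(name, None)
    return cleaned


def run_clean(argv):
    """Run argv with the scrubbed environment and return its exit status."""
    return subprocess.run(argv, env=scrubbed_env(), check=False).returncode
```

A make-driven script would then call something like run_clean(["chroot", img, "emerge", "-e", "world"]), with img being whatever image path the script already uses, so that nothing below it ever sees the variables.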
looks like this is fixed upstream by changing the grep:

- while ${MAKE-make} -s -f Makefile.$$ all | grep . > /dev/null; do
+ while ${MAKE-make} -s -f Makefile.$$ all | grep Touching > /dev/null; do
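Why the new pattern helps can be seen with a small model of the loop condition. The loop keeps going as long as grep finds a matching line in make's output. With 'w' in MAKEFLAGS, make prints directory-change lines even when there is nothing left to touch (as the strace output earlier in this bug shows), so 'grep .' always matches. The sample output strings and the function name loop_continues below are invented for illustration; only the two patterns come from the diff.

```python
import re


def loop_continues(make_output: str, pattern: str) -> bool:
    """Model `make ... | grep PATTERN > /dev/null` used as a while condition."""
    # grep succeeds (exit 0) when at least one line matches the pattern.
    return any(re.search(pattern, line) for line in make_output.splitlines())


# Made-up output of a run where every generated file is already up to date,
# but MAKEFLAGS=w makes make announce the directories it enters and leaves.
NOTHING_LEFT_TO_TOUCH = (
    "make[2]: Entering directory `/some/dir'\n"
    "make[2]: Leaving directory `/some/dir'\n"
)

# Made-up output of a run that still had a file to touch.
STILL_TOUCHING = "Touching gcc/config.in...\n"

old_pattern, new_pattern = ".", "Touching"
hangs_with_old = loop_continues(NOTHING_LEFT_TO_TOUCH, old_pattern)
exits_with_new = not loop_continues(NOTHING_LEFT_TO_TOUCH, new_pattern)
```

Matching on the word the Makefile itself prints makes the loop independent of whatever extra chatter make adds, which is more robust than trying to silence make with additional flags.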
Created attachment 93097: 08_all_gcc4-gcc-update-tweak.patch
try this patch please
Confirmed -- that patch does the job. Thanks!
ok, added to our cvs patchset, thanks for testing/figuring out the bug ;)