I am not really sure how to provide useful information regarding this issue. I encounter it on a 64-thread system with 128GB of RAM after only a few packages have been compiled. One core stays at full utilization in htop, which is attributed to emerge. This first occurred on a fresh install a few days ago, and installing more software and doing updates has not changed anything. Stopping emerge does not provide a useful backtrace, and using SIGUSR1 to drop into the debugger does not provide useful information either, although maybe someone else will find it more useful than I did:

Total: 314 packages (1 new, 313 reinstalls), Size of downloads: 140 kB

!!! Ebuilds for the following packages are either all
!!! masked or don't exist:
>=sys-apps/baselayout-2

Would you like to merge these packages? [Yes/No]
>>> Verifying ebuild manifests
>>> Running pre-merge checks for sci-libs/mkl-10.3.4.191
 * QA: Package does not specify unit for the size check
 * QA: Assuming mebibytes.
 * QA: File bug against the package. It should specify the unit.
 * Checking for at least 1536 mebibytes disk space at "/home/stonybrook/gentoo/var/tmp/portage/sci-libs/mkl-10.3.4.191/temp" ... [ ok ]
>>> Starting parallel fetch
>>> Emerging (1 of 314) sys-libs/ncurses-5.9-r1
>>> Emerging (2 of 314) app-arch/xz-utils-5.0.3
>>> Emerging (3 of 314) virtual/libintl-0
>>> Emerging (4 of 314) app-arch/bzip2-1.0.6-r3
>>> Emerging (5 of 314) virtual/libiconv-0
>>> Emerging (6 of 314) sys-devel/gnuconfig-20110814
>>> Emerging (7 of 314) sys-devel/binutils-config-3-r1
>>> Emerging (8 of 314) app-misc/mime-types-8
>>> Emerging (9 of 314) app-misc/pax-utils-0.2.1
>>> Emerging (10 of 314) sys-apps/tcp-wrappers-7.6-r8
>>> Emerging (11 of 314) sys-apps/which-2.20
>>> Emerging (12 of 314) app-arch/cpio-2.11
>>> Emerging (13 of 314) sys-process/numactl-2.0.7-r1
>>> Emerging (14 of 314) virtual/os-headers-0
>>> Emerging (15 of 314) sys-devel/autoconf-wrapper-12
>>> Emerging (16 of 314) sys-devel/automake-wrapper-6
>>> Emerging (17 of 314) dev-util/gperf-3.0.4
>>> Emerging (18 of 314) dev-util/byacc-20120115
>>> Emerging (19 of 314) mail-client/mailx-support-20060102-r1
>>> Emerging (20 of 314) net-mail/mailbase-1
>>> Emerging (21 of 314) sys-process/lsof-4.84
>>> Emerging (22 of 314) app-portage/portage-utils-0.9
>>> Installing (14 of 314) virtual/os-headers-0
>>> Installing (16 of 314) sys-devel/automake-wrapper-6
>>> Installing (15 of 314) sys-devel/autoconf-wrapper-12
>>> Installing (20 of 314) net-mail/mailbase-1
>>> Installing (19 of 314) mail-client/mailx-support-20060102-r1
>>> Installing (21 of 314) sys-process/lsof-4.84
>>> Installing (13 of 314) sys-process/numactl-2.0.7-r1
>>> Installing (18 of 314) dev-util/byacc-20120115
>>> Installing (17 of 314) dev-util/gperf-3.0.4
>>> Installing (22 of 314) app-portage/portage-utils-0.9
>>> Installing (12 of 314) app-arch/cpio-2.11
>>> Installing (3 of 314) virtual/libintl-0
>>> Jobs: 12 of 314 complete Load avg: 1.42, 0.78, 0.82--Return--
> /home/stonybrook/gentoo/usr/bin/emerge(28)debug_signal()->None
-> pdb.set_trace()
(Pdb) bt
  /home/stonybrook/gentoo/usr/bin/emerge(44)<module>()
-> retval = emerge_main()
  /home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/main.py(2031)emerge_main()
-> myopts, myaction, myfiles, spinner)
  /home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/actions.py(442)action_build()
-> retval = mergetask.merge()
  /home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py(1009)merge()
-> rval = self._merge()
  /home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py(1354)_merge()
-> self._main_loop()
  /home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py(1495)_main_loop()
-> while self._keep_scheduling():
> /home/stonybrook/gentoo/usr/bin/emerge(28)debug_signal()->None
-> pdb.set_trace()
(Pdb) quit

 * Messages for package sci-libs/mkl-10.3.4.191:
 * QA: Package does not specify unit for the size check
 * QA: Assuming mebibytes.
 * QA: File bug against the package. It should specify the unit.

 * Messages for package net-mail/mailbase-1:
 * 'enewgroup()' disabled in Prefixed Portage with non root user
 * 'enewuser()' disabled in Prefixed Portage with non-root user
 * 'enewuser()' disabled in Prefixed Portage with non-root user
 * fowners ignored in Prefix with non-privileged user

 * Messages for package app-portage/portage-utils-0.9:
 * ERROR: app-portage/portage-utils-0.9 failed (test phase):
 * Make check failed. See above for details.
 *
 * Call stack:
 * ebuild.sh, line 85: Called call-ebuildshell 'src_test'
 * environment, line 490: Called src_test
 * environment, line 2565: Called _eapi0_src_test
 * phase-helpers.sh, line 550: Called die
 * The specific snippet of code:
 * $emake_cmd -j1 check || \
 * die "Make check failed. See above for details."
 *
 * If you need support, post the output of 'emerge --info =app-portage/portage-utils-0.9',
 * the complete build log and the output of 'emerge -pqv =app-portage/portage-utils-0.9'.
 * The complete build log is located at '/home/stonybrook/gentoo/var/tmp/portage/app-portage/portage-utils-0.9/temp/build.log'.
 * The ebuild environment file is located at '/home/stonybrook/gentoo/var/tmp/portage/app-portage/portage-utils-0.9/temp/environment'.
 * S: '/home/stonybrook/gentoo/var/tmp/portage/app-portage/portage-utils-0.9/work/portage-utils-0.9'
 * /home/stonybrook/gentoo/etc/portage/postsync.d/q-reinitialize has been installed for convenience
 * If you wish for it to be automatically run at the end of every --sync:
 * # chmod +x /home/stonybrook/gentoo/etc/portage/postsync.d/q-reinitialize
 * Normally this should only take a few seconds to run but file systems
 * such as ext3 can take a lot longer. To disable, simply do:
 * # chmod -x /home/stonybrook/gentoo/etc/portage/postsync.d/q-reinitialize

Traceback (most recent call last):
  File "/home/stonybrook/gentoo/usr/bin/emerge", line 44, in <module>
    retval = emerge_main()
  File "/home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/main.py", line 2031, in emerge_main
    myopts, myaction, myfiles, spinner)
  File "/home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/actions.py", line 442, in action_build
    retval = mergetask.merge()
  File "/home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py", line 1009, in merge
    rval = self._merge()
  File "/home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py", line 1354, in _merge
    self._main_loop()
  File "/home/stonybrook/gentoo/usr/lib/portage/pym/_emerge/Scheduler.py", line 1495, in _main_loop
    while self._keep_scheduling():
  File "/home/stonybrook/gentoo/usr/bin/emerge", line 28, in debug_signal
    pdb.set_trace()
  File "/home/stonybrook/gentoo/usr/lib/python2.7/bdb.py", line 52, in trace_dispatch
    return self.dispatch_return(frame, arg)
  File "/home/stonybrook/gentoo/usr/lib/python2.7/bdb.py", line 86, in dispatch_return
    if self.quitting: raise BdbQuit
BdbQuit

I briefly spoke with zmedico in #gentoo-portage about this and he suggested that the recent recursion patches from bug #402335 might help.
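For anyone who wants to capture the same kind of information without an interactive session: the backtrace above shows that emerge's SIGUSR1 handler (debug_signal) calls pdb.set_trace(). Below is a minimal sketch of the same idea that records the stack instead of opening pdb. It is not Portage code; the names dump_stack, spin_until_signalled and captured are hypothetical, and the busy loop is only a stand-in for a main loop that spins without making progress.

```python
import os
import signal
import threading
import time
import traceback

captured = []  # stack dumps recorded by the handler (hypothetical name)


def dump_stack(signum, frame):
    # 'frame' is whatever the main thread was executing when the signal arrived.
    captured.append("".join(traceback.format_stack(frame)))


def spin_until_signalled(timeout=5.0):
    # Hypothetical stand-in for a main loop that burns CPU without progress.
    deadline = time.monotonic() + timeout
    while not captured and time.monotonic() < deadline:
        pass


# Python runs signal handlers in the main thread, so this must be set up there.
signal.signal(signal.SIGUSR1, dump_stack)

# Send SIGUSR1 to this process shortly after the loop starts,
# much like running "kill -USR1 <pid>" from another terminal.
threading.Timer(0.1, os.kill, args=(os.getpid(), signal.SIGUSR1)).start()
spin_until_signalled()

stack_text = captured[0] if captured else ""
print(stack_text)
```

The printed stack should end inside spin_until_signalled, in the same way that the (Pdb) bt output above ends at the while self._keep_scheduling() line in _main_loop. A single dump only shows where the loop was at that instant, so sending the signal several times may give a better picture of a spinning process.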
While this is grasping at straws, this might be related to bug #403287. I imagine that a deadlock would turn into an infinite loop if a thread is polling a thread that deadlocked.
(In reply to comment #1)
> While this is grasping at straws, this might be related to bug #403287. I
> imagine that a deadlock would turn into an infinite loop if a thread is polling
> a thread that deadlocked.

That doesn't affect portage, since portage doesn't use threads.
There are lots of EventLoop changes in v2.2.0_alpha87, so it's worth merging those into the prefix branch to see if that helps with this.
Please try sys-apps/portage-2.2.01.20239, which is the Prefix equivalent of v2.2.0_alpha87.
I was able to reproduce this issue in a single-core Ubuntu 10.04 VM, where I can confirm that the newer ebuild fixes it. I also tested the fix on the original 64-thread system and it appears to have worked, although I think one ebuild's build system deadlocked, judging from:

>>> Jobs: 216 of 282 complete, 1 running Load avg: 0.06, 0.05, 0.06

I am closing this as fixed.