I ran "sudo emerge -uDNav --jobs @world @system", watched the output scroll past, and suddenly remembered that portage needed an update, so I hit Ctrl+C:

>>> Verifying ebuild manifests
>>> Starting parallel fetch
>>> Emerging (1 of 36) sys-devel/automake-wrapper-7
>>> Emerging (2 of 36) dev-libs/libyaml-0.1.4
>>> Emerging (3 of 36) media-sound/teamspeak-server-bin-3.0.5-r2
>>> Emerging (4 of 36) sys-devel/gcc-config-1.6
>>> Emerging (5 of 36) sys-devel/binutils-config-3-r3
>>> Emerging (6 of 36) app-misc/pax-utils-0.4
>>> Emerging (7 of 36) sys-devel/m4-1.4.16
>>> Emerging (8 of 36) dev-util/ccache-3.1.7
>>> Emerging (9 of 36) sys-process/psmisc-22.16
>>> Emerging (10 of 36) sys-apps/man-pages-3.40
>>> Emerging (11 of 36) sys-kernel/linux-headers-3.4
>>> Emerging (12 of 36) dev-perl/LWP-MediaTypes-6.20.0
>>> Emerging (13 of 36) dev-perl/Encode-Locale-1.30.0
>>> Emerging (14 of 36) perl-core/Encode-2.430.0
>>> Emerging (15 of 36) dev-perl/Net-HTTP-6.30.0
>>> Emerging (16 of 36) dev-perl/File-Listing-6.40.0
>>> Jobs: 0 of 36 complete, 14 running Load avg: 0.50, 0.21, 0.11^C

Exiting on signal 2
Exiting on signal 2

(Despite the two "exiting" messages, I only hit Ctrl+C once.) My prompt never returned.

This is consistent and repeatable on my server in its current state. If I let it emerge, it may temporarily resolve the issue, but the Ctrl+C hang may happen again in the future. It's happened several times in the past. Fwiw, this also occurs in pre-2.2 portage on another computer, but obviously we can focus on 2.2+. I will attempt to maintain the state portage is in now so I can continue poking it as desired.

Portage quits immediately after a SIGHUP. It remains completely unresponsive to Ctrl+C after the first message.
Reproducible: Always Portage 2.2.0_alpha116 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.14.1-r3, 2.6.35.4-rscloud x86_64) ================================================================= System uname: Linux-2.6.35.4-rscloud-x86_64-Six-Core_AMD_Opteron-tm-_Processor_2423_HE-with-gentoo-2.1 Timestamp of tree: Mon, 09 Jul 2012 13:00:01 +0000 ccache version 3.1.6 [enabled] app-shells/bash: 4.2_p20 dev-java/java-config: 2.1.11-r3 dev-lang/python: 2.6.8, 2.7.3-r1, 3.1.5, 3.2.3 dev-util/ccache: 3.1.6 dev-util/pkgconfig: 0.26 sys-apps/baselayout: 2.1-r1 sys-apps/openrc: 0.9.8.4 sys-apps/sandbox: 2.5 sys-devel/autoconf: 2.68 sys-devel/automake: 1.10.2, 1.11.1 sys-devel/binutils: 2.21.1-r1 sys-devel/gcc: 4.3.4, 4.5.3-r2 sys-devel/gcc-config: 1.5-r2 sys-devel/libtool: 2.4-r1 sys-devel/make: 3.82-r1 sys-kernel/linux-headers: 3.1 (virtual/os-headers) sys-libs/glibc: 2.14.1-r3 Repositories: gentoo java-overlay local ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="* -@EULA" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/share/openvpn/easy-rsa" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo" CXXFLAGS="-O2 -pipe" DISTDIR="/usr/portage/distfiles" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-logs ccache collision-protect config-protect-if-modified distlocks ebuild-locks fail-clean fixlafiles news nodoc noinfo noman parallel-fetch parse-eapi-ebuild-head preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://distfiles.gentoo.org" LANG="en_US.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" MAKEOPTS="-j4" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times 
--compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/var/lib/layman/java-overlay /usr/local/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="acl amd64 bazaar berkdb bzip2 cgi cli cracklib crypt ctype curl cxx dri fastcgi fortran ftp gd gdbm git gpm ipv6 json libwww maildir mercurial mmx modules mudflap multilib ncurses nptl nptlonly openmp pam pcre perl php posix pppd python python2 python3 raw readline ruby session snmp sqlite sse sse2 ssh ssl subversion tcpd threads truetype unicode unzip vhosts vim-syntax xml xmlrpc xorg zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 
glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" PHP_TARGETS="php5-3" PYTHON_TARGETS="python3_2 python2_7" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Created attachment 317738: lsof, pstree, and pdb's backtrace output
*** Bug 433958 has been marked as a duplicate of this bug. ***
Why is this still not confirmed? Should I provide additional information?
When you send ^C to emerge, the Scheduler._terminate_tasks method sends SIGTERM to direct child processes, and then it waits for them to terminate. It may help if we also add a timeout, and send SIGKILL if the SIGTERM fails to make them terminate within a reasonable amount of time (10 seconds or so).
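To make the timeout idea concrete, here is a minimal sketch of "SIGTERM first, SIGKILL after a deadline" using only the standard subprocess module. It is not Portage's code: the function name terminate_with_timeout and the assumption that the children are plain subprocess.Popen objects are mine, purely for illustration.

```python
import subprocess
import time


def terminate_with_timeout(procs, timeout=10.0):
    """Ask every child to stop with SIGTERM, then SIGKILL the stragglers.

    `procs` is a list of subprocess.Popen objects (an assumption made for
    this sketch). The timeout is shared by all children rather than applied
    to each one in turn. Returns the children that had to be killed.
    """
    deadline = time.monotonic() + timeout

    # Polite request first, so children can still print tracebacks etc.
    for proc in procs:
        if proc.poll() is None:
            proc.terminate()  # SIGTERM on POSIX

    killed = []
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX, cannot be caught or ignored
            proc.wait()  # reap the child so it does not linger as a zombie
            killed.append(proc)
    return killed
```

Children that exit on SIGTERM keep the chance to flush their output; only the ones still alive at the deadline get SIGKILL.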
Actually, I was about to file a bug about this when I found this one... why not turn it into a feature? My suggestion: Ctrl+C kills emerge and all of its subprocesses. Sending it a signal (SIGUSR1 for example) could instead make it finish the currently running compiles/merges and then *not* proceed with the scheduled merges? E.g. make it finish gracefully. The current behavior (for which this bug was opened) also happens in 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill -9 to stop.
(In reply to comment #5)
> The current behavior (for which this bug was opened) also happens in
> 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill
> -9 to stop.

Just FYI, a HUP signal to the python process does the trick. It seems python is responsive even if the code it's executing is not. If you Ctrl+Z'ed it, you just have to be sure to run "fg" to let it receive and react to the signal.
(In reply to comment #5)
> My suggestion:
> Ctrl+C kills emerge and all of its subprocesses.

If it sends SIGKILL first then we risk losing potentially useful output, such as the traceback captured in bug 463960, comment #26. In order to avoid losing useful information like that, I think it's better to use a timeout. Alternatively, we could send SIGKILL if more than one SIGINT signal is received.

> Sending it a signal (SIGUSR1 for example) could instead make it finish the
> currently running compiles/merges and then *not* proceed with the scheduled
> merges? E.g. make it finish gracefully.

I think the graceful exit is a good default, but it will be better if enhanced with a timeout and translation of multiple SIGINT to SIGKILL.

> The current behavior (for which this bug was opened) also happens in
> 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill
> -9 to stop.

Yes, it is certainly annoying.
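The "translate multiple SIGINT to SIGKILL" idea could look roughly like the sketch below. The class name InterruptEscalator and the send_to_children callback are hypothetical; in a real scheduler the callback would forward the chosen signal to each child process.

```python
import signal


class InterruptEscalator:
    """SIGINT handler: the first interrupt asks children to stop with
    SIGTERM, every later interrupt escalates to SIGKILL.

    `send_to_children` is a hypothetical callback that takes a signal
    number and delivers it to all child processes.
    """

    def __init__(self, send_to_children):
        self._send = send_to_children
        self.interrupts = 0

    def __call__(self, signum, frame):
        self.interrupts += 1
        if self.interrupts == 1:
            # Graceful path: children may still emit useful output.
            self._send(signal.SIGTERM)
        else:
            # The user pressed Ctrl+C again, so stop waiting.
            self._send(signal.SIGKILL)

    def install(self):
        """Register this object as the process-wide SIGINT handler."""
        return signal.signal(signal.SIGINT, self)
```

This keeps the graceful exit as the default while giving an impatient user a way out without reaching for kill -9 by hand.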
This suggests several different signals should be sent before a SIGKILL is: http://pthree.org/2012/08/14/appropriate-use-of-kill-9-pid/
I am still suffering from this with 2.2.8-r1 (just hit it).
According to the article "Proper handling of SIGINT/SIGQUIT", the only proper way to exit due to SIGINT would be essentially as follows:

void sigint_handler(int sig)
{
    /* do some cleanup */
    signal(SIGINT, SIG_DFL);
    kill(getpid(), SIGINT);
}

Our current code doesn't do that, so that's something to fix. We should certainly avoid calling signal.signal(signal.SIGINT, signal.SIG_IGN) and leaving it in that state (that's what we do now).
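For reference, the same idiom translated to Python might look like the following. This is only a sketch of the pattern from the article, not a patch against Portage; the handler name exit_on_sigint is made up.

```python
import os
import signal


def exit_on_sigint(signum, frame):
    """Clean up, then die *from* SIGINT so the parent sees the real cause."""
    # ... do some cleanup here ...

    # Restore the default disposition and re-send the signal to ourselves.
    # The process then terminates with "killed by SIGINT" status, which is
    # what a calling shell inspects to decide whether to abort a script.
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGINT)
```

It would be installed with signal.signal(signal.SIGINT, exit_on_sigint). The point is that the process never leaves SIGINT set to SIG_IGN and never turns the interrupt into an ordinary exit code.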
I've experienced this bug just now, and used SIGUSR1 + pdb to get some insight into what causes it. The problem is that in Scheduler there are 2 system packages in self._merge_wait_scheduled, and they are also in self._running_tasks (which keeps the main loop running). These tasks have been cancelled by self._terminate_tasks, and cleared from self._task_queues.merge. So, they're just sitting there in a cancelled state, keeping the loop running.
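A toy model of that state may help show why the loop never exits. Everything below (ToyTask, ToyScheduler, the prune_cancelled flag) is invented for illustration and only mirrors the attribute names mentioned above; it is not Portage's Scheduler, and the pruning shown is one possible remedy rather than a description of any actual patch.

```python
class ToyTask:
    def __init__(self, name):
        self.name = name
        self.cancelled = False


class ToyScheduler:
    """Illustrative stand-in for the scheduler state described above."""

    def __init__(self, tasks):
        self.running_tasks = set(tasks)          # keeps the main loop alive
        self.merge_wait_scheduled = list(tasks)  # waiting for their merge slot
        self.merge_queue = list(tasks)           # what would actually run them

    def terminate_tasks(self, prune_cancelled):
        for task in self.running_tasks:
            task.cancelled = True
        # Nothing will ever execute these tasks once the queue is cleared.
        self.merge_queue.clear()
        if prune_cancelled:
            # Hypothetical remedy: also drop the cancelled tasks from the
            # bookkeeping that the main loop consults.
            self.merge_wait_scheduled.clear()
            self.running_tasks.clear()

    def loop_can_exit(self):
        return not self.running_tasks


hung = ToyScheduler([ToyTask("pkg-a"), ToyTask("pkg-b")])
hung.terminate_tasks(prune_cancelled=False)

fixed = ToyScheduler([ToyTask("pkg-a"), ToyTask("pkg-b")])
fixed.terminate_tasks(prune_cancelled=True)
```

In the first case the cancelled tasks remain in running_tasks with nothing left to complete them, so loop_can_exit() stays false forever; in the second it becomes true immediately.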
This patch seems to work for me: https://github.com/gentoo/portage/pull/43
Posted for review: https://archives.gentoo.org/gentoo-portage-dev/message/bf102d1a2558b43655bf17251ca9b53d
This is in the master branch: https://gitweb.gentoo.org/proj/portage.git/commit/?id=d54a795615ccb769a25a0f8d6cc15ba930ec428f
Fixed in portage-2.3.3.