Bug 425554

Summary: sys-apps/portage-2.2.0_alpha116: Ctrl+C during --jobs simply halts progress w/o quitting
Product: Gentoo Linux Reporter: Jacob Godserv <jacobgodserv>
Component: [OLD] Core system Assignee: Portage team <dev-portage>
Status: RESOLVED FIXED    
Severity: minor CC: 4glitch, esigra, mrueg, pacho, rdalek1967, SebastianLuther, tomboy64
Priority: Normal Keywords: InVCS
Version: unspecified   
Hardware: All   
OS: Linux   
URL: http://www.cons.org/cracauer/sigint.html
See Also: https://bugs.gentoo.org/show_bug.cgi?id=617550
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 184128, 604854    
Attachments: lsof, pstree, and pdb's backtrace output

Description Jacob Godserv 2012-07-10 00:30:48 UTC
I ran "sudo emerge -uDNav --jobs @world @system", watched the output scroll past, and suddenly remembered that portage needed an update, so I hit Ctrl+C:

>>> Verifying ebuild manifests
>>> Starting parallel fetch
>>> Emerging (1 of 36) sys-devel/automake-wrapper-7
>>> Emerging (2 of 36) dev-libs/libyaml-0.1.4
>>> Emerging (3 of 36) media-sound/teamspeak-server-bin-3.0.5-r2
>>> Emerging (4 of 36) sys-devel/gcc-config-1.6
>>> Emerging (5 of 36) sys-devel/binutils-config-3-r3
>>> Emerging (6 of 36) app-misc/pax-utils-0.4
>>> Emerging (7 of 36) sys-devel/m4-1.4.16
>>> Emerging (8 of 36) dev-util/ccache-3.1.7
>>> Emerging (9 of 36) sys-process/psmisc-22.16
>>> Emerging (10 of 36) sys-apps/man-pages-3.40
>>> Emerging (11 of 36) sys-kernel/linux-headers-3.4
>>> Emerging (12 of 36) dev-perl/LWP-MediaTypes-6.20.0
>>> Emerging (13 of 36) dev-perl/Encode-Locale-1.30.0
>>> Emerging (14 of 36) perl-core/Encode-2.430.0
>>> Emerging (15 of 36) dev-perl/Net-HTTP-6.30.0
>>> Emerging (16 of 36) dev-perl/File-Listing-6.40.0
>>> Jobs: 0 of 36 complete, 14 running              Load avg: 0.50, 0.21, 0.11^C
Exiting on signal 2
Exiting on signal 2

(Despite the two "exiting" messages, I only hit Ctrl+C once.) My prompt never returned. This is consistent and repeatable on my server in its current state. If I let it emerge, it may temporarily resolve the issue, but the Ctrl+C hang may happen again in the future. It's happened several times in the past.

Fwiw, this also occurs in pre-2.2 portage on another computer, but obviously we can focus on 2.2+.

I will attempt to maintain the state portage is in now so I can continue poking it as desired.

Portage quits immediately after a SIGHUP. It remains completely unresponsive to Ctrl+C after the first message.

Reproducible: Always




Portage 2.2.0_alpha116 (default/linux/amd64/10.0/server, gcc-4.5.3, glibc-2.14.1-r3, 2.6.35.4-rscloud x86_64)
=================================================================
System uname: Linux-2.6.35.4-rscloud-x86_64-Six-Core_AMD_Opteron-tm-_Processor_2423_HE-with-gentoo-2.1
Timestamp of tree: Mon, 09 Jul 2012 13:00:01 +0000
ccache version 3.1.6 [enabled]
app-shells/bash:          4.2_p20
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.6.8, 2.7.3-r1, 3.1.5, 3.2.3
dev-util/ccache:          3.1.6
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1-r1
sys-apps/openrc:          0.9.8.4
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.10.2, 1.11.1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.3.4, 4.5.3-r2
sys-devel/gcc-config:     1.5-r2
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r1
sys-kernel/linux-headers: 3.1 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r3
Repositories: gentoo java-overlay local
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/openvpn/easy-rsa"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.3/ext-active/ /etc/php/cgi-php5.3/ext-active/ /etc/php/cli-php5.3/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs ccache collision-protect config-protect-if-modified distlocks ebuild-locks fail-clean fixlafiles news nodoc noinfo noman parallel-fetch parse-eapi-ebuild-head preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/java-overlay /usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="acl amd64 bazaar berkdb bzip2 cgi cli cracklib crypt ctype curl cxx dri fastcgi fortran ftp gd gdbm git gpm ipv6 json libwww maildir mercurial mmx modules mudflap multilib ncurses nptl nptlonly openmp pam pcre perl php posix pppd python python2 python3 raw readline ruby session snmp sqlite sse sse2 ssh ssl subversion tcpd threads truetype unicode unzip vhosts vim-syntax xml xmlrpc xorg zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" PHP_TARGETS="php5-3" PYTHON_TARGETS="python3_2 python2_7" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx 
trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 Jacob Godserv 2012-07-10 00:32:23 UTC
Created attachment 317738 [details]
lsof, pstree, and pdb's backtrace output
Comment 2 Zac Medico gentoo-dev 2012-09-04 20:17:53 UTC
*** Bug 433958 has been marked as a duplicate of this bug. ***
Comment 3 Jacob Godserv 2013-02-26 15:27:27 UTC
Why is this still not confirmed? Should I provide additional information?
Comment 4 Zac Medico gentoo-dev 2013-02-26 17:00:27 UTC
When you send ^C to emerge, the Scheduler._terminate_tasks method sends SIGTERM to direct child processes, and then it waits for them to terminate. It may help if we also add a timeout, and send a SIGKILL if the SIGTERM does not cause them to terminate in a reasonable amount of time (10 seconds or so).
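
The SIGTERM-then-timeout-then-SIGKILL idea above could look roughly like the following. This is a minimal sketch using the standard `subprocess` module, not Portage's Scheduler code; the function name `terminate_with_timeout` and the 10-second default are assumptions for illustration.

```python
import subprocess


def terminate_with_timeout(proc, timeout=10.0):
    """Ask a child to exit with SIGTERM, then force it with SIGKILL.

    `proc` is a subprocess.Popen object. Returns the child's return code,
    which on POSIX is the negative signal number if a signal ended it.
    """
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        # Give the child a chance to clean up and exit on its own.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored or did not finish handling SIGTERM in time.
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()
```

A child that exits promptly on SIGTERM never sees the SIGKILL, so any output it prints while shutting down is still preserved.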
Comment 5 M. B. 2013-05-07 13:32:36 UTC
Actually, I was about to file a bug when I found this one. Why not turn it into a feature?

My suggestion:
Ctrl+C kills emerge and all of its subprocesses.
Sending it a signal (SIGUSR1 for example) could instead make it finish the currently running compiles/merges and then *not* proceed with the scheduled merges? I.e. make it finish gracefully.

The current behavior (for which this bug was opened) also happens in 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill -9 to stop.
Comment 6 Jacob Godserv 2013-05-07 13:50:45 UTC
(In reply to comment #5)
> The current behavior (for which this bug was opened) also happens in
> 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill
> -9 to stop.

Just FYI, a HUP signal to the python process does the trick. It seems python is responsive even if the code it's executing is not. If you Ctrl+Z'ed it, you just have to be sure to run "fg" to let it receive and react to the signal.
Comment 7 Zac Medico gentoo-dev 2013-05-07 14:00:55 UTC
(In reply to comment #5)
> My suggestion:
> Ctrl+C kills emerge and all of its subprocesses.

If it sends SIGKILL first then we risk losing potentially useful output, such as the traceback captured in bug 463960, comment #26.

In order to avoid losing useful information like that, I think it's better to use a timeout. Alternatively, we could send SIGKILL if more than one SIGINT signal is received.

> Sending it a signal (SIGUSR1 for example) could instead make it finish the
> currently running compiles/merges and then *not* proceed with the scheduled
> merges? E.g. make it finish gracefully.

I think the graceful exit is a good default, but it will be better if enhanced with a timeout and translation of multiple SIGINT to SIGKILL.

> The current behavior (for which this bug was opened) also happens in
> 2.1.11.62. And I find it quite a nuisance, since it needs an explicit kill
> -9 to stop.

Yes, it is certainly annoying.
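
The "translate multiple SIGINT to SIGKILL" idea from this comment might be sketched as below. It is an illustration only: the class name `InterruptEscalator` and its attributes are hypothetical, and it assumes the children are `subprocess.Popen` objects rather than Portage's own task objects.

```python
import signal


class InterruptEscalator:
    """Signal handler: first SIGINT is graceful, later ones are forceful.

    The first call forwards SIGTERM to every live child; any further call
    sends SIGKILL instead.
    """

    def __init__(self, children):
        self.children = list(children)  # subprocess.Popen objects (assumed)
        self.count = 0                  # number of interrupts seen so far

    def __call__(self, signum, frame):
        self.count += 1
        sig = signal.SIGTERM if self.count == 1 else signal.SIGKILL
        for child in self.children:
            if child.poll() is None:  # only signal children still running
                child.send_signal(sig)
```

It would be installed with `signal.signal(signal.SIGINT, InterruptEscalator(children))`. Combined with a timeout, this keeps the graceful exit as the default while still giving the user a way out with a second Ctrl+C.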
Comment 8 Jacob Godserv 2013-05-07 14:21:09 UTC
This suggests several different signals should be sent before a SIGKILL is:
http://pthree.org/2012/08/14/appropriate-use-of-kill-9-pid/
Comment 9 Pacho Ramos gentoo-dev 2014-03-09 17:36:45 UTC
I am still suffering this with 2.2.8-r1 (just hit)
Comment 10 Zac Medico gentoo-dev 2015-04-22 01:43:17 UTC
According to the article "Proper handling of SIGINT/SIGQUIT", the only proper way to exit due to SIGINT would be essentially as follows:

    void sigint_handler(int sig)
    {
        [do some cleanup]
        signal(SIGINT, SIG_DFL);
        kill(getpid(), SIGINT);
    }

Our current code doesn't do that, so that's something to fix.

We should certainly avoid calling signal.signal(signal.SIGINT, signal.SIG_IGN) and leaving it in that state (that's what we do now).
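
For reference, a Python rendering of the C handler above could look like this. It is a sketch of the pattern from the article, not the code that was eventually committed to Portage; `cleanup` and `install_handler` are hypothetical placeholder names.

```python
import os
import signal


def cleanup():
    """Hypothetical placeholder for application-specific cleanup."""


def sigint_handler(signum, frame):
    cleanup()
    # Restore the default disposition instead of leaving SIGINT ignored...
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    # ...then re-send the signal so the process really dies from SIGINT and
    # the parent (e.g. the shell) can see that in the wait status.
    os.kill(os.getpid(), signal.SIGINT)


def install_handler():
    signal.signal(signal.SIGINT, sigint_handler)
```

The point of re-raising rather than calling `sys.exit()` is that a calling shell script can tell the child was interrupted and stop as well.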
Comment 11 Zac Medico gentoo-dev 2016-08-20 12:44:33 UTC
I've experienced this bug just now, and used SIGUSR1 + pdb to get some insight into what causes it. The problem is that in Scheduler there are 2 system packages in self._merge_wait_scheduled, and they are also in self._running_tasks (which keeps the main loop running). These tasks have been cancelled by self._terminate_tasks, and cleared from self._task_queues.merge. So, they're just sitting there in a cancelled state, keeping the loop running.
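
The state described here can be pictured with a toy model. Everything below is hypothetical: `ToyScheduler` and its attributes only mirror the names mentioned in this comment, and the `purge_waiting` flag is one possible remedy used for illustration, not necessarily what the actual patch does.

```python
class ToyScheduler:
    """Toy model of a main loop that runs while any task is 'running'."""

    def __init__(self):
        self.running_tasks = set()      # the loop continues while non-empty
        self.merge_queue = []           # tasks waiting to be started
        self.merge_wait_scheduled = []  # tasks already handed to the queue

    def schedule_merge(self, name):
        # A waiting merge is tracked in all three places at once.
        self.running_tasks.add(name)
        self.merge_wait_scheduled.append(name)
        self.merge_queue.append(name)

    def terminate_tasks(self, purge_waiting):
        # Clearing the queue alone means the tasks will never start, and
        # therefore never finish and never leave running_tasks.
        self.merge_queue.clear()
        if purge_waiting:
            for name in self.merge_wait_scheduled:
                self.running_tasks.discard(name)
            self.merge_wait_scheduled.clear()

    def keep_running(self):
        return bool(self.running_tasks)
```

With `purge_waiting=False` the model reproduces the hang: `keep_running()` stays true forever even though nothing is left in the queue.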
Comment 12 Zac Medico gentoo-dev 2016-08-20 13:39:13 UTC
This patch seems to work for me:

https://github.com/gentoo/portage/pull/43
Comment 15 Zac Medico gentoo-dev 2017-02-10 18:46:04 UTC
Fixed in portage-2.3.3.