Bug 720050 - dev-lang/python-3.8.2-r2 : hangs at install phase (related to multiprocessing, maybe related to qemu?)
Summary: dev-lang/python-3.8.2-r2 : hangs at install phase (related to multiprocessing...
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Python Gentoo Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-29 23:11 UTC by Alexander Tsoy
Modified: 2021-07-05 10:42 UTC
CC List: 10 users

See Also:
Package list:
Runtime testing required: ---


Attachments
python-3.8.2-r2:20200429-201324.log.gz (python-3.8.2-r2:20200429-201324.log.gz,83.34 KB, application/gzip)
2020-04-29 23:11 UTC, Alexander Tsoy
Patch Makefile.pre.in to avoid ProcessPoolExecutor with compilall.py -j1 (bug_720050_python_compileall_j1.patch,1.70 KB, patch)
2020-07-21 19:25 UTC, Zac Medico

Description Alexander Tsoy 2020-04-29 23:11:06 UTC
Created attachment 635252 [details]
python-3.8.2-r2:20200429-201324.log.gz

python-3.8.2-r2 hangs in the install phase. It looks like a deadlock related to multiprocessing (again?). This is an armv7 container on an x86_64 host (via qemu-arm). See the result of a double Ctrl+C in the attached build log.

Also note that the Makefile unconditionally passes -j0 to compileall.py:

$ ps auxww | grep python3.8
root      216855  0.1  0.1 4441276 35792 pts/0   SNl+ апр29   0:07 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216859  0.0  0.0 4309500 32680 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216861  0.0  0.1 4309500 33024 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216863  0.0  0.1 4309500 33012 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216865  0.0  0.0 4309500 32408 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216867  0.0  0.0 4309368 32344 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216868  0.0  0.0 4309368 32608 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216870  0.0  0.1 4309500 33780 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8
root      216872  0.0  0.1 4309500 33552 pts/0   SNl+ апр29   0:02 /usr/bin/qemu-arm ./python -E -Wi /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.py -j0 -d /usr/lib/python3.8 -f -x bad_coding|badsyntax|site-packages|lib2to3/tests/data /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8


# emerge --info
Portage 2.3.89 (python 3.6.10-final-0, default/linux/arm/17.0/armv7a, gcc-9.3.0, glibc-2.30-r8, 5.4.35-gentoo armv7l)
=================================================================
System uname: Linux-5.4.35-gentoo-armv7l-AMD_Opteron-tm-_Processor_4386-with-gentoo-2.6
KiB Mem:    32930692 total,    825448 free
KiB Swap:   16777212 total,  16747248 free
Timestamp of repository gentoo: Wed, 29 Apr 2020 12:00:01 +0000
Head commit of repository gentoo: f11bfef6e9a2e6d01a42be507be96b3dbd89e265
Head commit of repository puleglot: 962d3485ccd6efa24b85661ee72ba731fe459a2f

sh dash 0.5.9.1-r3
ld GNU ld (Gentoo 2.33.1 p2) 2.33.1
distcc 3.3.3 armv7a-unknown-linux-gnueabihf [enabled]
app-shells/bash:          5.0_p17::gentoo
dev-lang/perl:            5.30.1::gentoo
dev-lang/python:          2.7.18::gentoo, 3.6.10-r2::gentoo, 3.7.7-r2::gentoo, 3.8.2-r1::gentoo
dev-util/cmake:           3.16.5::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.69-r4::gentoo
sys-devel/automake:       1.16.1-r1::gentoo
sys-devel/binutils:       2.33.1-r1::gentoo
sys-devel/gcc:            9.3.0::gentoo
sys-devel/gcc-config:     2.2.1::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 5.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.30-r8::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://puleglot.ru/gentoo-portage
    priority: -1000
    sync-rsync-verify-jobs: 1
    sync-rsync-extra-opts: 
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-metamanifest: yes

puleglot
    location: /var/db/repos/puleglot
    sync-type: git
    sync-uri: https://puleglot.ru/git/gentoo/puleglot-overlay.git
    masters: gentoo
    priority: 900

local
    location: /usr/local/portage
    masters: gentoo
    priority: 1000

ACCEPT_KEYWORDS="arm"
ACCEPT_LICENSE="@FREE"
CBUILD="armv7a-unknown-linux-gnueabihf"
CFLAGS="-O2 -march=armv7-a -mtune=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -pipe"
CHOST="armv7a-unknown-linux-gnueabihf"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=armv7-a -mtune=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -pipe"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--dynamic-deps=n --with-bdeps=y --binpkg-respect-use=y --ask-enter-invalid"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -march=armv7-a -mtune=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg cgroup clean-logs compress-build-logs config-protect-if-modified distcc distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch preserve-libs protect-owned qa-unresolved-soname-deps sfperms split-log strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersync xattr"
FFLAGS="-O2 -march=armv7-a -mtune=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -pipe"
GENTOO_MIRRORS="http://mirror.yandex.ru/gentoo-distfiles/ http://distfiles.gentoo.org/"
LANG="ru_RU.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j6"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="acl aio alsa arm armv5te armv6 armv6t2 avahi bash-completion berkdb bzip2 caps cli crypt dbus dri flac fortran gdbm gpm hardened iconv idn ipv6 lz4 lzma mp3 ncurses neon nls nptl ogg openmp pam pcre pie pulseaudio readline sasl seccomp split-usr ssl ssp systemd tls udev unicode urandom vim-syntax vorbis xattr xtpax xz zeroconf zlib zstd" ADA_TARGET="gnat_2018" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_ARM="neon edsp thumb thumb2 v4 v5 v6 v7 vfp" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" L10N="en ru" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python3_6" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="exynos fbdev omap dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Alexander Tsoy 2020-04-29 23:21:04 UTC
Forgot to add: according to strace, all processes are waiting on a mutex.
Comment 2 Matt Whitlock 2020-06-02 22:19:38 UTC
I wonder if this is related to a different problem I've been having with Portage on Python 3.8. I've been seeing hangs upon completion of rsyncing the main Gentoo repo (with 'emaint sync -a' or 'emerge --sync'). The process goes idle, blocked in a FUTEX_WAIT_PRIVATE syscall. It might be related to https://bugs.python.org/issue39360 , whose minimal reproducer in the first comment hangs on my system (with some non-zero probability).
Comment 3 Andreas K. Hüttel archtester gentoo-dev 2020-07-21 11:18:11 UTC
I can confirm this, with qemu-user chroots both for riscv64 and arm. 

It happens in about 50% of emerges and, of all system packages (including python-3.7), *only* with python-3.8.

(Which makes building stages with catalyst somewhat painful.)

Given that no one else has reported it yet, it might also be qemu-specific.
Comment 4 Andreas K. Hüttel archtester gentoo-dev 2020-07-21 11:19:57 UTC
(In reply to Matt Whitlock from comment #2)
> I wonder if this is related to a different problem I've been having with
> Portage on Python 3.8. I've been seeing hangs upon completion of rsyncing
> the main Gentoo repo (with 'emaint sync -a' or 'emerge --sync'). The process
> goes idle, blocked in a FUTEX_WAIT_PRIVATE syscall. It might be related to
> https://bugs.python.org/issue39360 , whose minimal reproducer in the first
> comment hangs on my system (with some non-zero probability).

That looks interesting indeed.
Comment 5 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2020-07-21 11:28:30 UTC
I've seen a somewhat similar problem recently. I was building multiple Python versions in parallel, and the install phases of all of them suddenly hung. It turned out to be caused by a parallel emerge process that I had stopped via ^Z (i.e. a process that shouldn't affect this emerge at all).
Comment 6 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2020-07-21 11:29:02 UTC
Might have something to do with ptys.
Comment 7 Zac Medico gentoo-dev 2020-07-21 16:56:30 UTC
(In reply to Matt Whitlock from comment #2)
> I wonder if this is related to a different problem I've been having with
> Portage on Python 3.8. I've been seeing hangs upon completion of rsyncing
> the main Gentoo repo (with 'emaint sync -a' or 'emerge --sync'). The process
> goes idle, blocked in a FUTEX_WAIT_PRIVATE syscall. It might be related to
> https://bugs.python.org/issue39360 , whose minimal reproducer in the first
> comment hangs on my system (with some non-zero probability).

I hope this patch for bug 730192 solves that:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=bde44b75407dfe0a390033636894a136af4e7533
Comment 8 Zac Medico gentoo-dev 2020-07-21 19:15:15 UTC
(In reply to Alexander Tsoy from comment #0)
> Created attachment 635252 [details]
> python-3.8.2-r2:20200429-201324.log.gz
> 
> python-3.8.2-r2 hangs at install phase. Looks like a deadlock related to
> multiprocessing (again?). This is an armv7 container on an x86_64 host (via
> qemu-arm). See the result of double Ctrl+C in attached build log.
> 
> Also note that Makefile is unconditionally passing -j0 to compileall.py
> 
> $ ps auxww | grep python3.8
> root      216855  0.1  0.1 4441276 35792 pts/0   SNl+ апр29   0:07
> /usr/bin/qemu-arm ./python -E -Wi
> /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8/compileall.
> py -j0 -d /usr/lib/python3.8 -f -x
> bad_coding|badsyntax|site-packages|lib2to3/tests/data
> /var/tmp/portage/dev-lang/python-3.8.2-r2/image/usr/lib/python3.8

With -j0, it uses concurrent.futures.ProcessPoolExecutor, and this looks very similar to the gemato deadlock from bug 647964.

If we patch it to use -j1 instead, then it won't use ProcessPoolExecutor.
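
As a rough illustration of the difference (a sketch, not the attached patch): in the standard-library compileall module, the -j option corresponds to the workers argument of compileall.compile_dir(), and workers=1 keeps compilation in the calling process instead of handing files to a ProcessPoolExecutor. The helper name compile_tree_sequentially and the throwaway hello.py file are made up for this example.

```python
import compileall
import importlib.util
import os
import pathlib
import tempfile


def compile_tree_sequentially(root):
    """Byte-compile every .py file under root without worker processes.

    workers=1 is the in-process path of compileall.compile_dir();
    workers=0 (what -j0 requests) lets a process pool pick the worker count.
    """
    return bool(compileall.compile_dir(str(root), force=True, quiet=1, workers=1))


def demo():
    # Hypothetical one-file tree, only here to exercise the helper.
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "hello.py"
        src.write_text("print('hello')\n")
        ok = compile_tree_sequentially(tmp)
        # cache_from_source() gives the .pyc path compileall writes by default.
        pyc_exists = os.path.exists(importlib.util.cache_from_source(str(src)))
        return ok, pyc_exists


if __name__ == "__main__":
    print(demo())
```

The trade-off is speed: a single process compiles the whole standard library serially, which is slower under emulation but avoids the worker pool entirely.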
Comment 9 Zac Medico gentoo-dev 2020-07-21 19:25:57 UTC
Created attachment 650130 [details, diff]
Patch Makefile.pre.in to avoid ProcessPoolExecutor with compilall.py -j1
Comment 10 Matt Whitlock 2020-07-21 19:44:46 UTC
(In reply to Michał Górny from comment #6)
> Might have to do something with ptys.

If the underlying cause is the same as my 'emaint sync -a' hangs, then it's not related to PTYs, as I see the hangs even when running emaint-sync from a cronjob.

By the way, the hang at the end of repo syncing is very reproducible for me. I don't know enough about Python debugging to get a Python stacktrace, but I can get a native stacktrace, which is how I know the process is blocked in a futex syscall.
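
For anyone in the same position, one possible way to get a Python-level stack trace out of a stuck process is the standard-library faulthandler module. This is only a sketch: it assumes the code can be modified (or wrapped) so that the handler is registered before the hang occurs, which is not something Portage is known to do here, and the function names below are invented for the example. faulthandler.register() is not available on Windows.

```python
import faulthandler
import signal
import tempfile


def enable_stack_dump_on_signal(signum=signal.SIGUSR1):
    """Dump every thread's Python traceback to stderr when signum arrives."""
    faulthandler.register(signum, all_threads=True)


def dump_stacks_to_text():
    """Return the current Python tracebacks of all threads as a string."""
    with tempfile.TemporaryFile(mode="w+") as out:
        # faulthandler writes straight to the file descriptor.
        faulthandler.dump_traceback(file=out, all_threads=True)
        out.seek(0)
        return out.read()


if __name__ == "__main__":
    enable_stack_dump_on_signal()
    print(dump_stacks_to_text())
```

With the handler registered, sending SIGUSR1 to the process from another shell (kill -USR1 followed by the PID) should print the tracebacks on its stderr, even while the main thread is blocked in a futex wait.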
Comment 11 Zac Medico gentoo-dev 2020-07-21 19:50:32 UTC
(In reply to Matt Whitlock from comment #10)
> (In reply to Michał Górny from comment #6)
> > Might have to do something with ptys.
> 
> If the underlying cause is the same as my 'emaint sync -a' hangs, then it's
> not related to PTYs, as I see the hangs even when running emaint-sync from a
> cronjob.
> 
> By the way, the hang at the end of repo syncing is very reproducible for me.
> I don't know enough about Python debugging to get a Python stacktrace, but I
> can get a native stacktrace, which is how I know the process is blocked in a
> futex syscall.

Please try https://github.com/gentoo/portage/pull/565.patch for emaint sync and/or emerge --sync deadlocks.
Comment 12 Matt Whitlock 2020-07-21 19:53:08 UTC
(In reply to Zac Medico from comment #11)
> Please try https://github.com/gentoo/portage/pull/565.patch for emaint sync
> and or emerge --sync deadlocks.

Would I need PR 580 also? I see both PRs linked at https://bugs.gentoo.org/730192, and it looks like it's still a work in progress.
Comment 13 Zac Medico gentoo-dev 2020-07-21 19:56:29 UTC
(In reply to Matt Whitlock from comment #12)
> (In reply to Zac Medico from comment #11)
> > Please try https://github.com/gentoo/portage/pull/565.patch for emaint sync
> > and or emerge --sync deadlocks.
> 
> Would I need PR 580 also? I see both PRs linked at
> https://bugs.gentoo.org/730192, and it looks like it's still a work in
> progress.

For emaint sync and emerge --sync, PR 565 is enough (it's also included in sys-apps/portage-3.0.0-r1). PR 580 applies the same fix to merge / unmerge code.
Comment 14 Zac Medico gentoo-dev 2020-07-21 22:15:04 UTC
Since compileall.py deadlocks in concurrent.futures.ProcessPoolExecutor, I've searched for issues mentioning ProcessPoolExecutor and found at least these:

https://bugs.python.org/issue35809
https://bugs.python.org/issue35866
Comment 15 Andreas K. Hüttel archtester gentoo-dev 2020-07-25 21:42:37 UTC
(In reply to Zac Medico from comment #9)
> Created attachment 650130 [details, diff]
> Patch Makefile.pre.in to avoid ProcessPoolExecutor with compilall.py -j1

Over the last few days I used a patch equivalent to this one for stage builds.
I haven't seen a new hang yet, so it looks good.
Comment 16 Andreas K. Hüttel archtester gentoo-dev 2020-07-25 21:43:33 UTC
(In reply to Andreas K. Hüttel from comment #15)
> (In reply to Zac Medico from comment #9)
> > Created attachment 650130 [details, diff]
> > Patch Makefile.pre.in to avoid ProcessPoolExecutor with compilall.py -j1
> 
> Over the last days I used a patch equivalent to this one for stage builds. 
> I haven't seen a new hang yet, so looks good.

https://gitweb.gentoo.org/proj/releng.git/tree/releases/portage/stages-qemu/patches/dev-lang/python:3.8/compileall-singlethreaded.patch

for reference...