Bug 828535 - dev-lang/python-3.9.9[pgo]: hangs when running tests for PGO
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Python Gentoo Team
URL:
Whiteboard:
Keywords:
Depends on: 834643 834644
Blocks:
Reported: 2021-12-08 00:05 UTC by 2porcupines
Modified: 2023-04-06 00:32 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
ps faux while hung. Removed all but relevant parts (ps_faux_while_hanged.txt, 3.05 KB, text/plain)
2021-12-08 00:06 UTC, 2porcupines
Build log for Python 3.9.9 with PGO enabled (build.log, 238.62 KB, text/plain)
2021-12-08 00:07 UTC, 2porcupines
Emerge --info output (emerge_info.txt, 5.48 KB, text/plain)
2021-12-08 00:07 UTC, 2porcupines
Build log of the following command: "FEATURES=test emerge --verbose dev-lang/python:3.9" (build_test.log, 274.69 KB, text/plain)
2021-12-08 01:28 UTC, 2porcupines
Verbose build logs of Python 3.9.9 (build_test_verbose.log.xz, 278.18 KB, text/plain)
2021-12-08 02:37 UTC, 2porcupines

Description 2porcupines 2021-12-08 00:05:06 UTC
After enabling PGO and updating to Python 3.9.9, the build hangs after test 423 or 424.

Reproducible: Always

Steps to Reproduce:
1. USE="pgo" emerge dev-lang/python:3.9
Actual Results:  
The build hangs during what I believe is the testing phase (step 423 or 424)

Expected Results:  
The build completes successfully
Comment 1 2porcupines 2021-12-08 00:06:14 UTC
Created attachment 757657
ps faux while hung. Removed all but relevant parts
Comment 2 2porcupines 2021-12-08 00:07:17 UTC
Created attachment 757658
Build log for Python 3.9.9 with PGO enabled
Comment 3 2porcupines 2021-12-08 00:07:46 UTC
Created attachment 757659
Emerge --info output
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-12-08 01:11:21 UTC
Forums post too: https://forums.gentoo.org/viewtopic-p-8683896.html
Comment 5 2porcupines 2021-12-08 01:28:53 UTC
Created attachment 757660
Build log of the following command: "FEATURES=test emerge --verbose dev-lang/python:3.9"
Comment 6 2porcupines 2021-12-08 02:37:02 UTC
Created attachment 757661
Verbose build logs of Python 3.9.9

Using a local repository, I modified EXTRATESTOPTS in the ebuild to add --verbose:

    emake test EXTRATESTOPTS="-u-network -j${jobs} --verbose" \

Then I ran:

    FEATURES=test ebuild python-3.9.9.ebuild clean install

This is the resulting build log.
Comment 7 Larry the Git Cow gentoo-dev 2021-12-08 02:41:23 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bf316365f10d66bb16871635d3296819a32a2c61

commit bf316365f10d66bb16871635d3296819a32a2c61
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-12-08 02:39:52 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-12-08 02:41:15 +0000

    dev-lang/python: skip known-hanging/fragile tests for PGO
    
    test_socket was the one reported originally here but
    others have reported hangs with:
    - test_asyncio
    - test_httpservers
    - test_logging
    - test_multiprocessing_fork
    - test_xmlrpc
    
    This is consistent with some of the odd hangs
    I've seen in src_test sporadically. Let's
    just skip them for PGO for now in lieu of more
    information.
    
    (--verbose for the test suite lives up to
    its name but isn't necessarily enough
    to get us what we need?)
    
    Bug: https://bugs.gentoo.org/828535
    Bug: https://bugs.gentoo.org/788022
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/python/python-3.10.1.ebuild        | 7 ++++++-
 dev-lang/python/python-3.11.0_alpha2.ebuild | 7 ++++++-
 dev-lang/python/python-3.9.9.ebuild         | 7 ++++++-
 3 files changed, 18 insertions(+), 3 deletions(-)
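
As a rough illustration of the approach this commit describes, the sketch below assembles a profiling command line that leaves out the tests named in the commit message. It assumes regrtest's `--pgo` and `-x`/`--exclude` options; the helper name `build_profile_task` is hypothetical, and the actual ebuild change may be written quite differently.

```python
import shlex

# Tests named in the commit message as hanging or fragile under PGO.
SKIPPED_TESTS = [
    "test_socket",
    "test_asyncio",
    "test_httpservers",
    "test_logging",
    "test_multiprocessing_fork",
    "test_xmlrpc",
]


def build_profile_task(skipped):
    """Return a regrtest argument string for the PGO run minus `skipped`.

    Hypothetical helper for illustration only, not taken from the ebuild.
    """
    # With -x, regrtest treats the positional test names as an exclusion list.
    args = ["-m", "test", "--pgo", "-x", *skipped]
    return shlex.join(args)


print(build_profile_task(SKIPPED_TESTS))
```

CPython's build reads the profiling command from the `PROFILE_TASK` make variable, so a string like this is the kind of value that would be supplied there.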
Comment 8 Ed Wildgoose 2021-12-08 14:51:21 UTC
Thanks for this. Not sure it adds much, but on a single machine I think I can reproduce this consistently: the hang occurs on the very last PGO test in an arm32 qemu-user chroot and in an x86 chroot, both running musl. I think it might not occur in an amd64 chroot, though it possibly does.

It definitely does not occur on the bare-metal host machine (amd64, glibc).

So at least for me, musl is a factor. The machine also has 16 cores (AMD 5950X).

I checked strace, but it didn't show anything useful to me.
Comment 9 Ed Wildgoose 2021-12-08 17:38:32 UTC
Using a Portage tree from an hour or two before this message, which seems to include the changed ebuild, I still see hangs on at least arm32/musl.

I'm not clear whether the tests always run in the same order. My hang is now several tests before the end, whereas yesterday the hangs were right on the last test.

...
0:06:17 load avg: 18.64 [410/418] test_regrtest passed
0:06:23 load avg: 17.39 [411/418] test_venv failed (1 failure)
0:06:31 load avg: 16.56 [412/418] test_subprocess failed (2 failures)
0:06:54 load avg: 13.46 [413/418] test_pickle passed
0:07:26 load avg: 10.95 [414/418] test_weakref passed
0:08:38 load avg: 5.35 [415/418] test_concurrent_futures passed
0:28:28 load avg: 1.01 [416/418] test_peg_generator passed
.. hangs here with zero load ..
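
One way to read a log like the one above is to measure the wait before each result line, which shows where the run stalled. The following is a minimal sketch that parses only the progress-line format quoted in this comment; the function names are hypothetical. With parallel jobs, the gap before a result is not necessarily that test's own duration.

```python
import re

# Matches lines such as "0:06:17 load avg: 18.64 [410/418] test_regrtest passed".
# The optional "/N" inside the brackets tolerates a failure counter, if present.
PROGRESS = re.compile(
    r"^(\d+):(\d{2}):(\d{2}) load avg: [\d.]+ \[\s*(\d+)/(\d+)(?:/\d+)?\] (\S+) (.*)$"
)


def parse_progress(lines):
    """Return (elapsed_seconds, index, total, test_name, status) per progress line."""
    rows = []
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            h, mi, s, idx, total, name, status = m.groups()
            elapsed = int(h) * 3600 + int(mi) * 60 + int(s)
            rows.append((elapsed, int(idx), int(total), name, status))
    return rows


def slowest_gap(lines):
    """Return (gap_seconds, test_name) for the longest wait before a result."""
    rows = parse_progress(lines)
    gaps = [(cur[0] - prev[0], cur[3]) for prev, cur in zip(rows, rows[1:])]
    return max(gaps) if gaps else None


def unreported(lines):
    """Return how many tests had not reported a result when the log ends."""
    rows = parse_progress(lines)
    return rows[-1][2] - rows[-1][1] if rows else None


log = """\
0:06:17 load avg: 18.64 [410/418] test_regrtest passed
0:06:23 load avg: 17.39 [411/418] test_venv failed (1 failure)
0:06:31 load avg: 16.56 [412/418] test_subprocess failed (2 failures)
0:06:54 load avg: 13.46 [413/418] test_pickle passed
0:07:26 load avg: 10.95 [414/418] test_weakref passed
0:08:38 load avg: 5.35 [415/418] test_concurrent_futures passed
0:28:28 load avg: 1.01 [416/418] test_peg_generator passed
""".splitlines()

print(slowest_gap(log))
print(unreported(log))
```

For this excerpt, the longest wait is the one before test_peg_generator reported, and two tests never report at all. The log alone does not name them, which is why inspecting the process list is still needed.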
Comment 10 Ed Wildgoose 2021-12-08 17:43:32 UTC
OK, running ps faux, I can see that it's hanging while testing "test.test_multiprocessing_forkserver"
Comment 11 Willard Dawson 2022-01-07 02:44:38 UTC
Not sure I have much to add here, except I am also seeing this issue on an x86 box. I could upload emerge info if that's useful?
Comment 12 r7l 2022-01-20 09:16:08 UTC
I have the same issue on amd64 QEMU VMs. The odd thing is that it builds fine with 3.9 but keeps hanging on 3.10 after this line:

....
0:23:16 load avg: 0.76 [425/426] test_zoneinfo passed
Comment 13 r7l 2022-01-20 10:43:09 UTC
I've kept on testing a bit and it is only affecting the current stable version of Python (currently 3.10.0) but is working fine with any of the unstable ones (3.10.1 and above).
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-06 05:46:58 UTC
(In reply to r7l from comment #13)
> I've kept on testing a bit and it is only affecting the current stable
> version of Python (currently 3.10.0) but is working fine with any of the
> unstable ones (3.10.1 and above).

Thanks.
Comment 15 Larry the Git Cow gentoo-dev 2023-04-06 00:32:05 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=82692290c64f6dd2f11f25625489dab7f7749fab

commit 82692290c64f6dd2f11f25625489dab7f7749fab
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-04-05 23:53:32 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-04-06 00:31:47 +0000

    dev-lang/python: add timeout for PGO task for >= 3.10
    
    Add a timeout matching the upstream default PROFILE_TASK
    given how often we've seen hangs here.
    
    Bug: https://bugs.gentoo.org/828535
    Bug: https://bugs.gentoo.org/850154
    Bug: https://bugs.gentoo.org/900429
    Bug: https://bugs.gentoo.org/903890
    Thanks-to: Martin Jansa <Martin.Jansa@gmail.com>
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/python/python-3.10.11.ebuild       | 7 ++++++-
 dev-lang/python/python-3.11.3.ebuild        | 7 ++++++-
 dev-lang/python/python-3.12.0_alpha7.ebuild | 7 ++++++-
 3 files changed, 18 insertions(+), 3 deletions(-)
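
The point of a timeout here is that a hung profiling task fails after a bounded wait instead of blocking the build forever. The sketch below illustrates that effect generically with `subprocess.run(timeout=...)`. It is not the mechanism used by the ebuild or by regrtest (which presumably relies on its own `--timeout` option, applied per test); `run_with_timeout` is a hypothetical helper, and the sleeping child merely stands in for a hung test.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run `cmd`; return its exit code, or None if it exceeded `seconds`.

    Hypothetical helper for illustration only.
    """
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before re-raising,
        # so no stray process is left behind here.
        return None


# Stand-in for a hung test: a child that sleeps far longer than the limit.
hung = [sys.executable, "-c", "import time; time.sleep(60)"]
print(run_with_timeout(hung, 1))
```

A caller can treat `None` as a failed profiling run and either retry, skip PGO, or abort with a clear error.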

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=33f6f605fb2bb432134103d3de13d8ebe9f5b146

commit 33f6f605fb2bb432134103d3de13d8ebe9f5b146
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-04-05 23:45:18 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-04-06 00:31:46 +0000

    dev-lang/python: skip test_tools for PGO
    
    Skip the 'test_tools' test when running PGO for now to avoid
    hanging the build (and also kind of fork-bombing the system
    with recursive cpython build attempts).
    
    Interestingly, not seen this when running the actual regular
    testsuite, but I suppose far fewer people actually run that,
    so could just be a frequency thing.
    
    Bug: https://bugs.gentoo.org/828535
    Bug: https://bugs.gentoo.org/850154
    Bug: https://bugs.gentoo.org/903890
    Closes: https://bugs.gentoo.org/900429
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-lang/python/python-3.10.11.ebuild       | 4 ++++
 dev-lang/python/python-3.11.3.ebuild        | 4 ++++
 dev-lang/python/python-3.12.0_alpha7.ebuild | 4 ++++
 dev-lang/python/python-3.9.16_p3.ebuild     | 4 ++++
 4 files changed, 16 insertions(+)