Bug 908411 - pypi.eclass: slow performance
Summary: pypi.eclass: slow performance
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Eclasses
Hardware: All Linux
Importance: Normal normal
Assignee: Python Gentoo Team
Reported: 2023-06-12 15:49 UTC by Sam James
Modified: 2023-06-16 01:16 UTC
Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-12 15:49:45 UTC
The performance of pypi.eclass isn't abysmal, but it's not great either.

Using `pk pkg source '*/*' --repo ~/g/` (from sys-apps/pkgcraft-tools), I see a bunch of surprisingly high sourcing-times for simple ebuilds just using pypi.eclass:

```
$ pk pkg source $(pkg) --repo ~/g/ --bench 5s
dev-python/asgiref-3.7.2::/home/sam/g/: min: 20.445ms, mean: 25.656ms, max: 28.229ms, σ = 1.232ms, N = 195
dev-python/asgiref-3.7.1::/home/sam/g/: min: 20.097ms, mean: 25.333ms, max: 32.116ms, σ = 1.264ms, N = 198
dev-python/asgiref-3.7.0::/home/sam/g/: min: 19.475ms, mean: 25.066ms, max: 30.063ms, σ = 1.257ms, N = 200
dev-python/asgiref-3.6.0::/home/sam/g/: min: 16.634ms, mean: 20.744ms, max: 24.084ms, σ = 1.039ms, N = 242
```

It's not terrible (timings vary a fair bit on my machine at the moment, sorry; it was worse on arthur's), but it could be better:
```
<@arthurzam> mgorny: sam_: after some experimentation with pkgcraft `pk pkg source` to measure the "speed" of python eclass, I've noted that pypi.eclass is somewhat not fastest. Considering this is an eclass that is inherited a lot, and especially each gentoo system has ebuilds that use it, I think we should try to improve it.
<@arthurzam> For example, this very stupid change http://dpaste.com/AXCJH365W, which breaks behaviour across tree but not for the ebuild I was checking, got us consistent 10% improvement for the ebuild, which is inheriting distutils+pypi
<@arthurzam> Maybe we should perform faster and less "smart" default code for the global scope to optimize the "happy frequent flow"
```

(paste expired, sorry).
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-12 15:52:04 UTC
The worst Python offenders are:

$ cat foo | sort -k2 -n | grep python | tail -n 25
dev-python/anyio-3.6.1::/home/sam/g/: 50.802ms
dev-python/pygraphviz-1.10::/home/sam/g/: 51.118ms
dev-python/stevedore-5.0.0::/home/sam/g/: 51.152ms
dev-python/pudb-2022.1.3::/home/sam/g/: 51.432ms
dev-python/trustme-1.0.0::/home/sam/g/: 52.627ms
dev-python/orjson-3.8.11::/home/sam/g/: 52.662ms
dev-python/tasklib-2.5.1::/home/sam/g/: 53.111ms
dev-python/matplotlib-3.7.1::/home/sam/g/: 53.521ms
dev-python/cryptography-40.0.2::/home/sam/g/: 53.62ms
dev-python/tox-4.5.1::/home/sam/g/: 54.124ms
dev-python/orjson-3.9.1::/home/sam/g/: 54.162ms
dev-python/orjson-3.8.14::/home/sam/g/: 54.81ms
dev-python/jellyfish-0.11.2-r1::/home/sam/g/: 55.14ms
dev-python/jsmin-3.0.1::/home/sam/g/: 55.426ms
dev-python/networkx-3.1::/home/sam/g/: 55.907ms
dev-python/bashate-2.1.1::/home/sam/g/: 56.022ms
dev-python/ipython_genutils-0.2.0-r4::/home/sam/g/: 56.216ms
dev-python/keyrings-alt-4.2.0::/home/sam/g/: 56.227ms
dev-python/orjson-3.8.13::/home/sam/g/: 56.303ms
dev-python/cryptography-41.0.1::/home/sam/g/: 57.131ms
dev-python/cryptography-40.0.2-r1::/home/sam/g/: 57.699ms
dev-python/zeroconf-0.64.0::/home/sam/g/: 57.984ms
dev-python/asgiref-3.7.0::/home/sam/g/: 58.566ms
dev-python/reedsolomon-2.1.0_beta1::/home/sam/g/: 59.856ms
dev-python/orjson-3.8.12::/home/sam/g/: 66.529ms
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-14 05:41:39 UTC
Pre-fixes:
```
$ time pk pkg source --repo ~/g/ 'dev-python/*'
[...]
real    0m3.198s
user    0m33.096s
sys     0m18.493s

$ time pk pkg source --sort --repo ~/g/ > list
$ grep python list | tail -n15 # python doesn't appear in the overall top 15 bad, so this is the top 15 bad *python* ones
dev-python/zeroconf-0.62.0::/home/sam/git/gentoo: 45.493ms
dev-python/cryptography-40.0.2-r1::/home/sam/git/gentoo: 45.541ms
dev-python/zope-hookable-5.4::/home/sam/git/gentoo: 47.159ms
dev-python/zope-testing-5.0.1::/home/sam/git/gentoo: 48.445ms
dev-python/zope-deprecation-5.0::/home/sam/git/gentoo: 49.298ms
dev-python/y-py-0.6.0::/home/sam/git/gentoo: 50.681ms
dev-python/zope-component-6.0::/home/sam/git/gentoo: 54.138ms
dev-python/ytmusicapi-1.0.2::/home/sam/git/gentoo: 54.945ms
dev-python/zeep-4.2.1::/home/sam/git/gentoo: 56.422ms
dev-python/zope-configuration-5.0::/home/sam/git/gentoo: 58.091ms
dev-python/zeroconf-0.64.1::/home/sam/git/gentoo: 59.816ms
dev-python/zope-exceptions-4.6::/home/sam/git/gentoo: 62.056ms
dev-python/zeroconf-0.63.0::/home/sam/git/gentoo: 65.505ms
dev-python/zeroconf-0.64.0::/home/sam/git/gentoo: 72.05ms
dev-python/zstd-1.5.5.1::/home/sam/git/gentoo: 78.324ms
```

Post-fixes:
```
$ time pk pkg source --repo ~/g/ 'dev-python/*'
[...]
real    0m2.711s
user    0m31.642s
sys     0m12.112s

$ time pk pkg source --sort --repo ~/g/ > list
$ grep python list | tail -n15 # python doesn't appear in the overall top 15 bad, so this is the top 15 bad *python* ones
dev-python/pydantic-core-0.38.0::/home/sam/git/gentoo: 30.2ms
dev-python/cryptography-40.0.2-r1::/home/sam/git/gentoo: 30.243ms
dev-python/jellyfish-0.11.2-r1::/home/sam/git/gentoo: 30.356ms
dev-python/hypothesis-6.77.0::/home/sam/git/gentoo: 30.516ms
dev-python/orjson-3.8.14::/home/sam/git/gentoo: 30.52ms
dev-python/orjson-3.8.12::/home/sam/git/gentoo: 30.542ms
dev-python/subunit-1.4.2::/home/sam/git/gentoo: 30.683ms
dev-python/setuptools-67.7.2::/home/sam/git/gentoo: 30.75ms
dev-python/setuptools-rust-1.6.0::/home/sam/git/gentoo: 31.299ms
dev-python/orjson-3.8.10::/home/sam/git/gentoo: 31.469ms
dev-python/orjson-3.8.13::/home/sam/git/gentoo: 31.548ms
dev-python/markups-4.0.0::/home/sam/git/gentoo: 31.773ms
dev-python/orjson-3.9.1::/home/sam/git/gentoo: 31.972ms
dev-python/cryptography-40.0.2::/home/sam/git/gentoo: 31.993ms
dev-python/cryptography-41.0.1::/home/sam/git/gentoo: 32.017ms
```
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-14 05:42:27 UTC
And for completeness (see original comment), for asgiref with fixes:
```
$ pk pkg source $(pkg) --repo ~/g/ --bench 5s
dev-python/asgiref-3.6.0::/home/sam/git/gentoo: mean: 12.037ms, min: 11.246ms, max: 18.97ms, σ = 623µs, N = 416
dev-python/asgiref-3.7.1::/home/sam/git/gentoo: mean: 16.256ms, min: 14.277ms, max: 22.471ms, σ = 856µs, N = 308
dev-python/asgiref-3.7.2::/home/sam/git/gentoo: mean: 16.312ms, min: 14.419ms, max: 22.379ms, σ = 1.067ms, N = 307
dev-python/asgiref-3.7.0::/home/sam/git/gentoo: mean: 16.114ms, min: 14.084ms, max: 22.217ms, σ = 1.088ms, N = 311
```
Comment 4 Larry the Git Cow gentoo-dev 2023-06-15 12:19:30 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c84d64b357a7f8eca3f309f8fc773e928f57cc52

commit c84d64b357a7f8eca3f309f8fc773e928f57cc52
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2023-06-13 05:55:34 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2023-06-15 12:19:26 +0000

    pypi.eclass: Avoid subshell for extglob setting
    
    Suggested by Eli Schwartz.  This gives roughly 5260 ops / s, over 550%
    speedup.
    
    The complete patch series therefore increases the speed from roughly
    326 ops / s to 5260 ops / s, making the common case 16 times faster.
    
    Closes: https://bugs.gentoo.org/908411
    Closes: https://github.com/gentoo/gentoo/pull/31404
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 eclass/pypi.eclass | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
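
For readers unfamiliar with the trick named in the commit subject, here is a minimal sketch of toggling `extglob` without a subshell. It is not the actual eclass code: the function name `normalize_name_nosubshell` and the variable `NORMALIZED_NAME` are hypothetical, and it assumes the slower variant saved the option state with a command substitution such as `$(shopt -p extglob)`. `shopt -q` only sets an exit status, so nothing is forked.

```shell
# Requires bash. All names here are hypothetical, for illustration only.
normalize_name_nosubshell() {
	local name=${1}
	local had_extglob=0
	# shopt -q only sets an exit status, so no $(...) fork is needed
	# to remember whether extglob was already enabled.
	shopt -q extglob && had_extglob=1
	shopt -s extglob
	# Collapse runs of '.', '_' and '-' into a single underscore.
	name=${name//+([._-])/_}
	# Restore the caller's setting if we changed it.
	(( had_extglob )) || shopt -u extglob
	# Lowercase and hand the result back through a variable.
	NORMALIZED_NAME=${name,,}
}

normalize_name_nosubshell "Foo.Bar--baz"
echo "${NORMALIZED_NAME}"
```

The caller's `extglob` state is left as it was found, which matters for code sourced into a shared global scope.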
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-16 01:16:02 UTC
Full list of commits for completeness:

commit c84d64b357a7f8eca3f309f8fc773e928f57cc52
Author: Michał Górny <mgorny@gentoo.org>
Date:   Tue Jun 13 07:55:34 2023 +0200

    pypi.eclass: Avoid subshell for extglob setting

    Suggested by Eli Schwartz.  This gives roughly 5260 ops / s, over 550%
    speedup.

    The complete patch series therefore increases the speed from roughly
    326 ops / s to 5260 ops / s, making the common case 16 times faster.

    Closes: https://bugs.gentoo.org/908411
    Closes: https://github.com/gentoo/gentoo/pull/31404
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit 9640490da48c35124a6f2e27c46931cf1db718f6
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:51:51 2023 +0200

    pypi.eclass: Replace pypi_sdist_url in global scope

    Introduce an internal helper for _pypi_sdist_url that doesn't require
    subshell, and therefore eliminate all subshells from global scope.
    We're nearing 952 ops / s, further 39% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
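
A rough sketch of what a subshell-free URL helper might look like. The names `_sdist_url` and `_SDIST_URL` are hypothetical, and the URL layout (the `files.pythonhosted.org` "source" path) is stated here as an assumption rather than copied from the eclass.

```shell
# Requires bash. Hypothetical names; the URL layout is an assumption.
_sdist_url() {
	local project=${1} version=${2} suffix=${3:-.tar.gz}
	local fn=${project}-${version}${suffix}
	# Store the result in a variable so the caller needs no $(...).
	_SDIST_URL="https://files.pythonhosted.org/packages/source/${project::1}/${project}/${fn}"
}

_sdist_url asgiref 3.7.2
SRC_URI=${_SDIST_URL}
echo "${SRC_URI}"
```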


commit 78cb7a0709eea188d764b9ac77d120414e07e7b5
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:41:31 2023 +0200

    pypi.eclass: Translate version without subshell in common case

    Provide an internal helper to translate versions without a subshell,
    and use it in the common case.  Now the benchmark gives 685 ops / s,
    which means it's another 28% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
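
As an illustration of translating a version without a subshell, the sketch below rewrites Gentoo-style suffixes using parameter expansion only. The helper name `_translate_version`, the variable `_TRANSLATED_VERSION` and the exact suffix mapping are assumptions made for this example, not a copy of the eclass.

```shell
# Requires bash. Hypothetical names; the suffix mapping is an assumption
# about how Gentoo version suffixes correspond to PEP 440 ones.
_translate_version() {
	local version=${1}
	version=${version/_alpha/a}
	version=${version/_beta/b}
	# _pre must be handled before _p, otherwise "_pre" would be
	# rewritten as ".postre".
	version=${version/_pre/.dev}
	version=${version/_rc/rc}
	# The result goes into a variable instead of stdout.
	_TRANSLATED_VERSION=${version/_p/.post}
}

_translate_version "2.1.0_beta1"
echo "${_TRANSLATED_VERSION}"
_translate_version "1.2.3_p4"
echo "${_TRANSLATED_VERSION}"
```

Versions without a suffix, which is the common case, pass through unchanged.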

commit acba55d412d48f9d93147b185874f186dbd6cb22
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:31:43 2023 +0200

    pypi.eclass: Normalize names without subshell

    Provide an internal helper to normalize names without a subshell.
    This gives 535 ops / s, so a further 44% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
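
The "internal helper" pattern can be sketched as follows: the helper writes its result into a variable, and an echo-based wrapper stays available for callers that still want `$(...)`. All names are hypothetical, and `extglob` is enabled globally only to keep the example short.

```shell
# Requires bash. Hypothetical names; not the actual eclass code.
shopt -s extglob  # enabled globally only to keep the sketch short

# Internal helper: stores the result in a variable, no output, no fork.
_normalize_name() {
	local name=${1//+([._-])/_}
	_NORMALIZED_NAME=${name,,}
}

# Public wrapper: keeps the echo-based interface for existing callers.
normalize_name() {
	_normalize_name "${1}"
	echo "${_NORMALIZED_NAME}"
}

# Slow path: command substitution forks a subshell.
via_subshell=$(normalize_name "Zope.Hookable")

# Fast path: plain function call in the current shell.
_normalize_name "Zope.Hookable"
via_variable=${_NORMALIZED_NAME}

echo "${via_subshell} ${via_variable}"
```

Both paths yield the same string; only the first one pays for a fork on every call.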

commit ff4c2bbf52663cc7a233e2dbccdd33b193821168
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:26:19 2023 +0200

    pypi.eclass: Translate version once in the default scenario

    Instead of translating version two times, once in pypi_sdist_url
    and then when setting S, do it once and store the result.  This gives
    roughly 371 ops / s, i.e. a 13% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>
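
The "translate once and store the result" idea, sketched under simplifying assumptions: the names `set_globals`, `ARCHIVE` and `S_DIR` are hypothetical stand-ins, and the translation only handles `_beta`. A counter shows that the translation runs a single time even though two values depend on it.

```shell
# Requires bash. Hypothetical names; counts translations to show the saving.
translate_calls=0
_translate_version_simple() {
	(( ++translate_calls ))
	_TRANSLATED_VERSION=${1/_beta/b}  # simplified: only handles _beta
}

set_globals() {
	local pn=${1} pv=${2}
	# Translate once and keep the result ...
	_translate_version_simple "${pv}"
	local v=${_TRANSLATED_VERSION}
	# ... then reuse it for both the archive name and the source directory.
	ARCHIVE="${pn}-${v}.tar.gz"
	S_DIR="${pn}-${v}"
}

set_globals reedsolomon 2.1.0_beta1
echo "${ARCHIVE} ${S_DIR} calls=${translate_calls}"
```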

commit 87807b72eb990e9f9dd8b768c1b9ea0d054a0118
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:23:41 2023 +0200

    pypi.eclass: Move setting globals to a function

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

There's also https://github.com/gentoo/gentoo/pull/31465 for another cleanup.
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-06-16 01:16:22 UTC
(.. and https://github.com/gentoo/gentoo/pull/31454).
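
The ops/s figures quoted in the commit messages can be cross-checked with a few lines of integer arithmetic. This is only a sanity check on the numbers as quoted (326, 371, 535, 685, 952 and 5260 ops/s, oldest first); it does not re-run any benchmark.

```shell
# Requires bash. Integer arithmetic only; figures are the ops/s values
# quoted in the commit messages, oldest first.
ops=(326 371 535 685 952 5260)

for (( i = 1; i < ${#ops[@]}; i++ )); do
	# New rate relative to the previous step, as a truncated percentage.
	echo "${ops[i-1]} -> ${ops[i]}: $(( ops[i] * 100 / ops[i-1] ))%"
done

# Overall ratio from first to last measurement, in tenths.
overall=$(( ops[5] * 10 / ops[0] ))
echo "overall: ${overall%?}.${overall: -1}x"
```

The per-step ratios agree with the quoted speedups to within rounding. Note that the final step is a ratio of roughly 5.5, so the "over 550%" in that commit message reads as the new rate relative to the old one. The overall ratio of about 16.1 is consistent with "16 times faster".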