The performance of pypi.eclass isn't abysmal, but it's not great either.

Using `pk pkg source '*/*' --repo ~/g/` (from sys-apps/pkgcraft-tools), I see a bunch of surprisingly high sourcing-times for simple ebuilds just using pypi.eclass:
```
$ pk pkg source $(pkg) --repo ~/g/ --bench 5s
dev-python/asgiref-3.7.2::/home/sam/g/: min: 20.445ms, mean: 25.656ms, max: 28.229ms, σ = 1.232ms, N = 195
dev-python/asgiref-3.7.1::/home/sam/g/: min: 20.097ms, mean: 25.333ms, max: 32.116ms, σ = 1.264ms, N = 198
dev-python/asgiref-3.7.0::/home/sam/g/: min: 19.475ms, mean: 25.066ms, max: 30.063ms, σ = 1.257ms, N = 200
dev-python/asgiref-3.6.0::/home/sam/g/: min: 16.634ms, mean: 20.744ms, max: 24.084ms, σ = 1.039ms, N = 242
```

It's not terrible (there's a fair bit of variation on my machine at the moment, sorry, it was worse on arthur's) but it could be better:
```
<@arthurzam> mgorny: sam_: after some experimentation with pkgcraft `pk pkg source` to measure the "speed" of python eclass, I've noted that pypi.eclass is somewhat not fastest. Considering this is an eclass that is inherited a lot, and especially each gentoo system has ebuilds that use it, I think we should try to improve it.
<@arthurzam> For example, this very stupid change http://dpaste.com/AXCJH365W, which breaks behaviour across tree but not for the ebuild I was checking, got us consistent 10% improvement for the ebuild, which is inheriting distutils+pypi
<@arthurzam> Maybe we should perform faster and less "smart" default code for the global scope to optimize the "happy frequent flow"
```
(paste expired, sorry).
The worst Python offenders are:

$ cat foo | sort -k2 -n | grep python | tail -n 25
dev-python/anyio-3.6.1::/home/sam/g/: 50.802ms
dev-python/pygraphviz-1.10::/home/sam/g/: 51.118ms
dev-python/stevedore-5.0.0::/home/sam/g/: 51.152ms
dev-python/pudb-2022.1.3::/home/sam/g/: 51.432ms
dev-python/trustme-1.0.0::/home/sam/g/: 52.627ms
dev-python/orjson-3.8.11::/home/sam/g/: 52.662ms
dev-python/tasklib-2.5.1::/home/sam/g/: 53.111ms
dev-python/matplotlib-3.7.1::/home/sam/g/: 53.521ms
dev-python/cryptography-40.0.2::/home/sam/g/: 53.62ms
dev-python/tox-4.5.1::/home/sam/g/: 54.124ms
dev-python/orjson-3.9.1::/home/sam/g/: 54.162ms
dev-python/orjson-3.8.14::/home/sam/g/: 54.81ms
dev-python/jellyfish-0.11.2-r1::/home/sam/g/: 55.14ms
dev-python/jsmin-3.0.1::/home/sam/g/: 55.426ms
dev-python/networkx-3.1::/home/sam/g/: 55.907ms
dev-python/bashate-2.1.1::/home/sam/g/: 56.022ms
dev-python/ipython_genutils-0.2.0-r4::/home/sam/g/: 56.216ms
dev-python/keyrings-alt-4.2.0::/home/sam/g/: 56.227ms
dev-python/orjson-3.8.13::/home/sam/g/: 56.303ms
dev-python/cryptography-41.0.1::/home/sam/g/: 57.131ms
dev-python/cryptography-40.0.2-r1::/home/sam/g/: 57.699ms
dev-python/zeroconf-0.64.0::/home/sam/g/: 57.984ms
dev-python/asgiref-3.7.0::/home/sam/g/: 58.566ms
dev-python/reedsolomon-2.1.0_beta1::/home/sam/g/: 59.856ms
dev-python/orjson-3.8.12::/home/sam/g/: 66.529ms
Pre-fixes:
```
$ time pk pkg source --repo ~/g/ 'dev-python/*'
[...]
real    0m3.198s
user    0m33.096s
sys     0m18.493s

$ time pk pkg source --sort --repo ~/g/ > list
$ grep python list | tail -n15 # python doesn't appear in the overall top 15 bad, so this is the top 15 bad *python* ones
dev-python/zeroconf-0.62.0::/home/sam/git/gentoo: 45.493ms
dev-python/cryptography-40.0.2-r1::/home/sam/git/gentoo: 45.541ms
dev-python/zope-hookable-5.4::/home/sam/git/gentoo: 47.159ms
dev-python/zope-testing-5.0.1::/home/sam/git/gentoo: 48.445ms
dev-python/zope-deprecation-5.0::/home/sam/git/gentoo: 49.298ms
dev-python/y-py-0.6.0::/home/sam/git/gentoo: 50.681ms
dev-python/zope-component-6.0::/home/sam/git/gentoo: 54.138ms
dev-python/ytmusicapi-1.0.2::/home/sam/git/gentoo: 54.945ms
dev-python/zeep-4.2.1::/home/sam/git/gentoo: 56.422ms
dev-python/zope-configuration-5.0::/home/sam/git/gentoo: 58.091ms
dev-python/zeroconf-0.64.1::/home/sam/git/gentoo: 59.816ms
dev-python/zope-exceptions-4.6::/home/sam/git/gentoo: 62.056ms
dev-python/zeroconf-0.63.0::/home/sam/git/gentoo: 65.505ms
dev-python/zeroconf-0.64.0::/home/sam/git/gentoo: 72.05ms
dev-python/zstd-1.5.5.1::/home/sam/git/gentoo: 78.324ms
```

Post-fixes:
```
$ time pk pkg source --repo ~/g/ 'dev-python/*'
[...]
real    0m2.711s
user    0m31.642s
sys     0m12.112s

$ time pk pkg source --sort --repo ~/g/ > listt
$ grep python list | tail -n15 # python doesn't appear in the overall top 15 bad, so this is the top 15 bad *python* ones
dev-python/pydantic-core-0.38.0::/home/sam/git/gentoo: 30.2ms
dev-python/cryptography-40.0.2-r1::/home/sam/git/gentoo: 30.243ms
dev-python/jellyfish-0.11.2-r1::/home/sam/git/gentoo: 30.356ms
dev-python/hypothesis-6.77.0::/home/sam/git/gentoo: 30.516ms
dev-python/orjson-3.8.14::/home/sam/git/gentoo: 30.52ms
dev-python/orjson-3.8.12::/home/sam/git/gentoo: 30.542ms
dev-python/subunit-1.4.2::/home/sam/git/gentoo: 30.683ms
dev-python/setuptools-67.7.2::/home/sam/git/gentoo: 30.75ms
dev-python/setuptools-rust-1.6.0::/home/sam/git/gentoo: 31.299ms
dev-python/orjson-3.8.10::/home/sam/git/gentoo: 31.469ms
dev-python/orjson-3.8.13::/home/sam/git/gentoo: 31.548ms
dev-python/markups-4.0.0::/home/sam/git/gentoo: 31.773ms
dev-python/orjson-3.9.1::/home/sam/git/gentoo: 31.972ms
dev-python/cryptography-40.0.2::/home/sam/git/gentoo: 31.993ms
dev-python/cryptography-41.0.1::/home/sam/git/gentoo: 32.017ms
```
And for completeness (see original comment), for asgiref with fixes:
```
$ pk pkg source $(pkg) --repo ~/g/ --bench 5s
dev-python/asgiref-3.6.0::/home/sam/git/gentoo: mean: 12.037ms, min: 11.246ms, max: 18.97ms, σ = 623µs, N = 416
dev-python/asgiref-3.7.1::/home/sam/git/gentoo: mean: 16.256ms, min: 14.277ms, max: 22.471ms, σ = 856µs, N = 308
dev-python/asgiref-3.7.2::/home/sam/git/gentoo: mean: 16.312ms, min: 14.419ms, max: 22.379ms, σ = 1.067ms, N = 307
dev-python/asgiref-3.7.0::/home/sam/git/gentoo: mean: 16.114ms, min: 14.084ms, max: 22.217ms, σ = 1.088ms, N = 311
```
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c84d64b357a7f8eca3f309f8fc773e928f57cc52

commit c84d64b357a7f8eca3f309f8fc773e928f57cc52
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2023-06-13 05:55:34 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2023-06-15 12:19:26 +0000

    pypi.eclass: Avoid subshell for extglob setting

    Suggested by Eli Schwartz. This gives roughly 5260 ops / s, over 550%
    speedup.

    The complete patch series therefore increases the speed from roughly
    326 ops / s to 5260 ops / s, making the common case 16 times faster.

    Closes: https://bugs.gentoo.org/908411
    Closes: https://github.com/gentoo/gentoo/pull/31404
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 eclass/pypi.eclass | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
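For anyone reading this later without the patch in front of them, here is a rough sketch of what "avoid subshell for extglob setting" can look like in bash. This is not the eclass code: `normalize_name`, `normalize_name_subshell` and `NORMALIZED` are hypothetical names, and the normalization rule (collapse runs of `.`, `_`, `-` into a single `_`, then lowercase) is assumed here purely for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function and variable names are hypothetical, not pypi.eclass API.

# Variant with a subshell: $(shopt -p extglob) forks just to remember
# whether extglob was on or off.
normalize_name_subshell() {
	local name=${1}
	local saved
	saved=$(shopt -p extglob)  # "shopt -s extglob" or "shopt -u extglob"
	shopt -s extglob
	name=${name//+([._-])/_}
	${saved}                   # run the saved command to restore the state
	NORMALIZED=${name,,}
}

# Variant without a subshell: shopt -q reports the state through its
# exit status, so nothing has to be captured.
normalize_name() {
	local name=${1}
	if shopt -q extglob; then
		name=${name//+([._-])/_}
	else
		shopt -s extglob
		name=${name//+([._-])/_}
		shopt -u extglob
	fi
	NORMALIZED=${name,,}
}

normalize_name "Foo.Bar--baz"
echo "${NORMALIZED}"  # prints foo_bar_baz
```

Both variants leave extglob in whatever state the caller had it; the second one just gets there without forking.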
Full list of commits for completeness:

commit c84d64b357a7f8eca3f309f8fc773e928f57cc52
Author: Michał Górny <mgorny@gentoo.org>
Date:   Tue Jun 13 07:55:34 2023 +0200

    pypi.eclass: Avoid subshell for extglob setting

    Suggested by Eli Schwartz. This gives roughly 5260 ops / s, over 550%
    speedup.

    The complete patch series therefore increases the speed from roughly
    326 ops / s to 5260 ops / s, making the common case 16 times faster.

    Closes: https://bugs.gentoo.org/908411
    Closes: https://github.com/gentoo/gentoo/pull/31404
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit 9640490da48c35124a6f2e27c46931cf1db718f6
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:51:51 2023 +0200

    pypi.eclass: Replace pypi_sdist_url in global scope

    Introduce an internal helper for _pypi_sdist_url that doesn't require
    subshell, and therefore eliminate all subshells from global scope.
    We're nearing 952 ops / s, further 39% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit 78cb7a0709eea188d764b9ac77d120414e07e7b5
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:41:31 2023 +0200

    pypi.eclass: Translate version without subshell in common case

    Provide an internal helper to translate versions without a subshell,
    and use it in the common case. Now the benchmark gives 685 ops / s,
    which means it's another 28% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit acba55d412d48f9d93147b185874f186dbd6cb22
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:31:43 2023 +0200

    pypi.eclass: Normalize names without subshell

    Provide an internal helper to normalize names without a subshell.
    This gives 535 ops / s, so a further 44% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit ff4c2bbf52663cc7a233e2dbccdd33b193821168
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:26:19 2023 +0200

    pypi.eclass: Translate version once in the default scenario

    Instead of translating version two times, once in pypi_sdist_url
    and then when setting S, do it once and store the result. This gives
    roughly 371 ops / s, i.e. a 13% speedup.

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

commit 87807b72eb990e9f9dd8b768c1b9ea0d054a0118
Author: Michał Górny <mgorny@gentoo.org>
Date:   Mon Jun 12 21:23:41 2023 +0200

    pypi.eclass: Move setting globals to a function

    Signed-off-by: Michał Górny <mgorny@gentoo.org>

There's also https://github.com/gentoo/gentoo/pull/31465 for another cleanup.
(.. and https://github.com/gentoo/gentoo/pull/31454).
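The common thread in the commit series above is "internal helper that stores its result in a variable, so the caller doesn't need `$(...)`". A minimal sketch of that pattern, under assumptions: `translate_version`, `_translate_version` and `TRANSLATED_VERSION` are made-up names, and the suffix mapping is a simplified stand-in rather than the eclass's real translation rules. The two `time` loops are only a crude way to see the fork overhead and are not the benchmark quoted in the commit messages.

```shell
#!/usr/bin/env bash
# Sketch only: all names and the suffix mapping are hypothetical.

# Internal helper: leaves the result in a global, so no fork is needed.
_translate_version() {
	local version=${1}
	version=${version/_alpha/a}
	version=${version/_beta/b}
	version=${version/_rc/rc}
	TRANSLATED_VERSION=${version}
}

# Public-style wrapper: prints the result, so callers must use $(...),
# which forks a subshell on every call.
translate_version() {
	_translate_version "${1}"
	echo "${TRANSLATED_VERSION}"
}

n=1000

# With command substitution (one fork per iteration).
time for ((i = 0; i < n; i++)); do
	v=$(translate_version "1.0_rc${i}")
done

# Without command substitution (no forks).
time for ((i = 0; i < n; i++)); do
	_translate_version "1.0_rc${i}"
	v=${TRANSLATED_VERSION}
done
```

Keeping the printing wrapper around means existing callers keep working, while hot paths such as global scope can call the internal helper directly.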