Bug 866085 - installing binpkgs in parallel doesn't actually install in parallel
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Binary packages support
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 835380
Reported: 2022-08-23 00:11 UTC by SpanKY
Modified: 2023-08-24 20:38 UTC (History)
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Portage Versions Bisection Findings.pdf (66.90 KB, application/pdf)
2023-06-19 18:00 UTC, Matt Turner

Description SpanKY gentoo-dev 2022-08-23 00:11:22 UTC
in CrOS, we install a lot of binpkgs.  we'll often install hundreds of them into an empty root when creating a new disk image for a device.

on systems with a lot of cores (e.g. 32+), we find that portage has very low parallelism.  historically it hasn't been great, but it's gotten much worse in recent versions.  our baseline is (largely) v2.3.75.  the full branch you can find here:
https://chromium.googlesource.com/chromiumos/third_party/portage_tool/+/refs/heads/chromeos-2.3.75

we rebased our fork onto 3.0.21 and that's when we detected a huge drop in performance.  you can find that branch here:
https://chromium.googlesource.com/chromiumos/third_party/portage_tool/+/refs/heads/chromeos-3.0.21

it's been hard bisecting things down due to the async migrations breaking things, but we've identified at least one place where there was a large regression.  reverting that in 3.0.21 seems to bring back a lot, but not all, of the performance.
https://gitweb.gentoo.org/proj/portage.git/commit/?id=d66e9ec0b10522528d62e18b83e012c1ec121787

here's one way of reproducing with vanilla Gentoo.  i use -j32 here because that's what our builders tend to have, and my workstation has 36 cores (72 if you count SMT).  but this should be reproducible with even just -j4 or more.

* start with new Gentoo install
https://bouncer.gentoo.org/fetch/root/all/releases/amd64/autobuilds/20220821T170533Z/stage3-amd64-openrc-20220821T170533Z.tar.xz

emerge --version
Portage 3.0.30 (python 3.10.5-final-0, default/linux/amd64/17.1, gcc-11.3.0, glibc-2.35-r8, 5.17.11-1rodete2-amd64 x86_64)

* install a few extra packages
emerge -q --sync
emerge -j32 -1 dev-libs/nss dev-vcs/git

* clone a small CrOS repo for accounts
cd /var/db/repos/
git clone --depth=1 https://chromium.googlesource.com/chromiumos/overlays/eclass-overlay

* workaround a bug in CrOS packages
mkdir -p /usr/share/fonts/foo

* setup a sysroot for the board
SYSROOT=/build/amd64-generic
mkdir -p $SYSROOT/etc/portage/profile
cd $SYSROOT/etc/portage
cat <<EOF >make.conf
SYSROOT=$SYSROOT
CHOST=x86_64-cros-linux-gnu
PKGDIR=$SYSROOT/packages
ACCEPT_LICENSE="*"
MAKEOPTS=-j32
PORTDIR_OVERLAY=/var/db/repos/eclass-overlay
PORTAGE_BINHOST=https://storage.googleapis.com/chromeos-prebuilt/test-vapier/amd64-generic/2022.08.22/packages
FEATURES="parallel-fetch parallel-install"
EOF
ln -sfT /var/db/repos/gentoo/profiles/default/linux/amd64/17.1 make.profile
wget https://storage.googleapis.com/chromeos-prebuilt/test-vapier/amd64-generic/2022.08.22/package.provided -P profile/

* try to emerge things and watch load average
emerge --with-bdeps=n -Gv --binpkg-respect-use=n virtual/target-os -j32 --root $SYSROOT --config-root $SYSROOT |& tee /log
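
To watch the load average while the emerge above runs, something like the following can be left running in a second terminal. This is only a sketch, assuming a Linux /proc/loadavg; the function names `loadavg1` and `watch_load` and the 5 second default interval are made up for illustration and are not part of the original reproduction.

```shell
# Print the 1-minute load average, i.e. the first field of /proc/loadavg.
# The optional argument lets the function read a saved sample file instead.
loadavg1() {
    cut -d' ' -f1 "${1:-/proc/loadavg}"
}

# Print a timestamped sample every N seconds (default 5) until interrupted.
watch_load() {
    while :; do
        printf '%s load=%s\n' "$(date +%T)" "$(loadavg1)"
        sleep "${1:-5}"
    done
}
```

Running `watch_load 2` in another shell samples every 2 seconds; stop it with Ctrl-C once emerge finishes.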

even though there are many many places that say "32 running", the load average barely gets above 2.  that means portage isn't actually installing anything in parallel.
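
The mismatch can also be read back out of /log afterwards. A rough sketch, assuming emerge's job status lines look like `Jobs: 5 of 300 complete, 32 running   Load avg: 1.93, 1.20, 0.90` (the exact layout may differ between Portage versions, so the patterns may need adjusting); `max_running` and `peak_load` are hypothetical helper names.

```shell
# Highest "N running" job count seen in the log.
# Status lines may be redrawn with carriage returns, so split on those first.
max_running() {
    tr '\r' '\n' < "$1" |
        sed -n 's/.*[^0-9]\([0-9][0-9]*\) running.*/\1/p' |
        LC_ALL=C sort -n | tail -n 1
}

# Highest 1-minute load average reported in the log's status lines.
peak_load() {
    tr '\r' '\n' < "$1" |
        sed -n 's/.*Load avg: *\([0-9.][0-9.]*\).*/\1/p' |
        LC_ALL=C sort -n | tail -n 1
}
```

For example, `echo "$(max_running /log) jobs claimed, peak load $(peak_load /log)"` puts the two numbers side by side.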

NB: the overall install will probably fail after installing ~ packages, but that's to be expected due to CrOS specific things not working in vanilla Gentoo.  the problem should manifest itself well before that point though so i didn't bother fixing it.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-24 01:26:17 UTC
ccing zmedico in case he has any ideas
Comment 2 Matt Turner gentoo-dev 2023-06-17 23:49:58 UTC
Sam asked me to update this bug with what I know. Firstly, the document go/cros-portage-bisection-findings from within Google has a bit more analysis. I'm not sure why all of it didn't find its way here. I also emailed Zac a few months ago.

(In reply to SpanKY from comment #0)
> it's been hard bisecting things down due to the async migrations breaking
> things, but we've identified at least one place where there was a large
> regression.  reverting that in 3.0.21 seems to bring back a lot, but not
> all, of the performance.
> https://gitweb.gentoo.org/proj/portage.git/commit/?id=d66e9ec0b10522528d62e18b83e012c1ec121787

This commit is in portage-2.3.90.

In a private email, Zac noted that this commit was reverted in commit https://gitweb.gentoo.org/proj/portage.git/commit/?id=71ae5a58fe72bc32dce030210a73ea5c9eeb4a1c, which is in portage-2.3.97.

From looking at the Google doc, I suspect that vapier meant a different commit (this one isn't mentioned in the doc, and it was already reverted before 3.0.21).


The two commits mentioned in the Google doc are

1) https://gitweb.gentoo.org/proj/portage.git/commit/?id=9b755b46f9e88f25fecada0a32095ea614a73b57 (in portage-2.3.99), reported to cause a large increase in time taken to calculate dependencies (+800%). This may have been the issue mitigated by https://gitweb.gentoo.org/proj/portage.git/commit/?id=839ab46be1777e5886da28b98b53a462b992c5bf

2) https://gitweb.gentoo.org/proj/portage.git/commit/?id=50da2e16599202b9ecb3d4494f214a0d30b073d (in portage-2.3.93), reported to keep various processes alive for longer than before, which leads to less parallelization. I believe this is the commit vapier meant to cite in the original report.
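
Which release contains a given commit can be double-checked in a clone of the portage repository. A minimal sketch, assuming release tags are named like portage-2.3.93 as in this comment; `commit_in_ref` is a hypothetical helper name, and it has to be run from inside the clone.

```shell
# Succeed if commit $1 is an ancestor of ref $2 (so the ref contains it).
# Must be run from inside the git repository being inspected.
commit_in_ref() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example (tag name is an assumption based on the version strings above):
#   commit_in_ref d66e9ec0b10522528d62e18b83e012c1ec121787 portage-2.3.90 && echo contained
```

`git tag --contains <commit>` gives the complementary view by listing every tag that already includes the commit.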
Comment 3 Matt Turner gentoo-dev 2023-06-18 00:03:05 UTC
Zac also noted that the rdep_cache in
https://chromium-review.googlesource.com/c/chromiumos/third_party/portage_tool/+/3780786
may help here.
Comment 4 Matt Turner gentoo-dev 2023-06-19 18:00:28 UTC
Created attachment 864225
Portage Versions Bisection Findings.pdf

Here's the document.