Consider systems that have very slow single-core performance but lots of cores, such as aarch64 systems. I have such a system, and have portage build binpkgs of all installed packages. If I run `emerge --emptytree @world`, around 200 binpkgs get queued for installation. That is great binary-package coverage, but what I observe is that the merge takes a *very long time*, hours, even though we're only dealing with a few GB worth of extracted files. Watching the list of processes in htop, I see that portage schedules the decompression of these binary packages in roughly the same order in which it would install the packages from source.

I propose that binpkg extraction/decompression be done without regard to the installation order, in parallel up to the number of --jobs, and that the decompression/extraction run ahead of the installation of the already-decompressed/extracted packages. Installation of the extracted/decompressed files can still happen in the normal order.

Reproducible: Always
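To make the proposal concrete, here is a minimal sketch (not portage code; `extract_binpkg`, `merge_package`, the package names and the staging path are hypothetical stand-ins): extraction of every binpkg is queued immediately and runs in parallel up to the job limit, while the merge step stays serial and in the normal order.

```python
# Hedged sketch, not portage internals: the helpers below are hypothetical
# stand-ins for portage's unpack and merge steps.
from concurrent.futures import ThreadPoolExecutor

JOBS = 4  # would come from emerge --jobs

def extract_binpkg(pkg):
    """Stand-in for decompressing/unpacking one binpkg into a staging dir."""
    print(f"extracting {pkg}")
    return f"/tmp/stage/{pkg}"          # hypothetical image directory

def merge_package(pkg, image_dir):
    """Stand-in for the serial merge-to-live-fs and VDB update."""
    print(f"merging {pkg} from {image_dir}")

def install_all(install_order):
    # Stage 1: queue every extraction up front, bounded by JOBS workers,
    # without regard to the installation order.
    with ThreadPoolExecutor(max_workers=JOBS) as pool:
        futures = {pkg: pool.submit(extract_binpkg, pkg) for pkg in install_order}

        # Stage 2: install serially in the normal order, blocking only if a
        # package's extraction has not finished yet.
        for pkg in install_order:
            image_dir = futures[pkg].result()
            merge_package(pkg, image_dir)

install_all(["sys-libs/zlib-1.3", "dev-lang/python-3.12", "app-editors/vim-9.1"])
```

The point is only that the serial merge stage never waits for decompression unless it outruns the extraction pool.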
What compression are you using? Even on slower machines, you can often use e.g. zstd and get very low resource usage and fast decompression... but also, you could just use a format for binpkgs which supports parallel compression and decompression, like xz?
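For reference, the binpkg compressor is chosen in make.conf; a hedged example with purely illustrative values (check make.conf(5) for what your portage version supports):

```
# /etc/portage/make.conf -- illustrative values
BINPKG_COMPRESS="zstd"
# Threads mainly help when *creating* binpkgs; decompression speed still
# depends on the format and level chosen.
BINPKG_COMPRESS_FLAGS="-T0"
```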
I think it's much more that portage should try to unpack X binpkgs at a time (jobs?) and only serialise the actual "merge to live-fs and VDB update". I wonder if the existing parallel emerge code would actually achieve this. With the vast number of cores available, disk IO should be completely saturated, since creating and moving files should be the main bottleneck here. With the final merge and VDB update requiring serial execution, you want to issue as many IOs as you can in the preparatory steps (unpacking).
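A quick way to check whether the existing parallel emerge code already gets part of the way there (hedged example; the job counts are illustrative):

```
# Illustrative: let emerge run several binpkg merges concurrently.
emerge --usepkgonly --emptytree --jobs=8 --load-average=16 @world
```

That parallelises whole packages rather than splitting unpack from merge, so it may or may not address the ordering issue described above.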
That's a fair point.
(In reply to Sam James from comment #3)
> That's a fair point.

(+ qmerge being so much faster for doing this kind of shows the potential here, even though it's not parallel IIRC).
(In reply to Sam James from comment #4)
> (In reply to Sam James from comment #3)
> > That's a fair point.
>
> (+ qmerge being so much faster for doing this kind of shows the potential
> here, even though it's not parallel IIRC).

Correct, qmerge is completely single-threaded at this point. If it is faster than Portage, the overhead likely lives in filesystem interaction and mask calculations.
(In reply to Fabian Groffen from comment #2)
> I think it's much more that portage should try to unpack X binpkgs at a time
> (jobs?) and only serialise the actual "merge to live-fs and VDB update".

This is what I'm suggesting, yes.
(In reply to Sam James from comment #1)
> What compression are you using? Even on slower machines, often you can use
> e.g. zstd and get very low resource usage & fast decompression

I've tried different compression formats and settings. You're right that zstd is faster, but the work is still bottlenecked by the same serialised installation order; it just completes each step a bit faster.
Currently the binpkg is decompressed after pkg_setup, which I don't think is easy to move to another queue. You can try the "parallel-install" feature to see whether it does this in parallel (and see the issue in the above link).
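If you want to experiment with that, a hedged example of enabling it (FEATURES syntax per make.conf(5); combine with --jobs as above):

```
# /etc/portage/make.conf -- illustrative
FEATURES="${FEATURES} parallel-install"
```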