Bug 425682

Summary: True parallel fetch for each job running
Product: Portage Development
Reporter: Marcus Becker <marcus.disi>
Component: Enhancement/Feature Requests
Assignee: Portage team <dev-portage>
Status: UNCONFIRMED
Severity: enhancement
CC: pacho
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 377365    

Description Marcus Becker 2012-07-10 15:15:39 UTC
Since jobs were introduced in Portage, it is nice to see up to X jobs running at a time. However, I noticed that there still seems to be only one fetch going on in the background.

Reproducible: Always

Steps to Reproduce:
1. It starts with 4 jobs, downloads the first, then the second etc.
2. Let's say the first was a small package of ~500 kB and is already done, and the third is a larger one.
3. At some stage only 1 job is running, because the other jobs have to wait for it to be downloaded.
Actual Results:  
As an example: I can build PHP in ~2 min on my machine, but it takes me ~5-6 min to download (OK, my connection is not very good). This stalls every following job that could have been done in the meantime.

Expected Results:  
It would be nice if every job triggered its own fetch in parallel.
Comment 1 Jeremy Olexa (darkside) (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2012-07-10 15:54:44 UTC
If you have a slow connection, adding more fetch jobs to be processed at once will NOT help anything, it will just stall in a different way.
Comment 2 Marcus Becker 2012-07-10 16:11:56 UTC
But if one package has ~50 MB to download and it stalls a 500 kB package, I think it would be an improvement. How many jobs you want to run can be set in make.conf anyway.
Comment 3 Zac Medico gentoo-dev 2012-07-10 20:28:32 UTC
(In reply to comment #0)
> 1. It starts with 4 jobs, downloads the first, then the second etc.

It will actually download all 4 in parallel. The relevant code is here:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=ef58bc7573ddce5e3a5466eea50160b81de8edf4

When downloading in parallel, each fetcher's output goes to the corresponding build log (and that part of the build log is discarded if the fetch is successful).
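
The log handling described above can be sketched roughly as follows. This is a minimal illustration, not Portage's actual code: the function name `fetch_with_log` and the `fetch` callback are hypothetical, and the only behavior taken from the comment is that fetch output lands in the build log and is dropped again when the fetch succeeds.

```python
import os


def fetch_with_log(log_path, fetch):
    """Append a fetcher's output to a build log (hypothetical helper).

    `fetch` is any callable that writes bytes to the open log file and
    returns True on success. On success the fetch output is discarded;
    on failure it is kept so the error stays visible in the log.
    """
    with open(log_path, "ab") as log:
        # Remember where the fetch output starts.
        start = log.seek(0, os.SEEK_END)
        ok = fetch(log)
        log.flush()
        if ok:
            # Cut the log back to its state before the fetch.
            log.truncate(start)
    return ok
```

Recording the offset before the fetch keeps the earlier log content untouched, so only the fetcher's own output is ever removed.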

> 2. Let's say the first was a small package of ~500 kB and is already done,
> and the third is a larger one.
> 3. At some stage only 1 job is running, because the other jobs have to wait
> for it to be downloaded.

Something like this could happen if all other jobs depend on the one that's currently being fetched/logged in /var/log/emerge-fetch.log. In order to fix this, we'd have to create separate logs for each fetcher.
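
A rough sketch of what "separate logs for each fetcher" could look like, assuming one thread per fetcher. Everything here is hypothetical and for illustration only: the helper names, the `.fetch.log` suffix, and the file naming scheme are not taken from Portage.

```python
import os
import threading


def fetcher_log_path(log_dir, cpv):
    """Build a per-package log path (hypothetical naming scheme).

    `cpv` looks like "media-libs/libpng-1.5.12"; the slash is replaced
    because it cannot appear in a file name.
    """
    return os.path.join(log_dir, cpv.replace("/", "_") + ".fetch.log")


def run_fetchers(log_dir, packages, fetch):
    """Run one fetcher thread per package, each with its own log file."""

    def worker(cpv):
        with open(fetcher_log_path(log_dir, cpv), "w") as log:
            fetch(cpv, log)

    threads = [threading.Thread(target=worker, args=(cpv,)) for cpv in packages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Because no two fetchers share a log file, none of them has to wait for another one to release a single shared log.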

(In reply to comment #1)
> If you have a slow connection, adding more fetch jobs to be processed at
> once will NOT help anything, it will just stall in a different way.

We can add a --fetch-jobs=N option so that people can tune the number of concurrent fetch jobs for their connection speed.
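
To illustrate the idea of a tunable limit, here is a minimal sketch that caps the number of simultaneous fetches with a thread pool. It is not Portage code: `fetch_all` and the `fetch` callback are hypothetical, and `fetch_jobs` merely stands in for the value of the proposed `--fetch-jobs=N` option.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(packages, fetch, fetch_jobs):
    """Fetch every package with at most `fetch_jobs` fetches in flight.

    `fetch` is any callable taking a package name. Results come back in
    the same order as `packages`.
    """
    with ThreadPoolExecutor(max_workers=fetch_jobs) as pool:
        return list(pool.map(fetch, packages))
```

With `fetch_jobs=1` this degrades to fetching one package after another, which may suit a slow connection; a larger value lets small downloads finish while a big one is still running.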
Comment 4 Marcus Becker 2012-07-14 12:09:21 UTC
One example:
Calculating dependencies... done!
>>> Verifying ebuild manifests
>>> Starting parallel fetch
>>> Emerging (1 of 13) sys-kernel/linux-firmware-20120708
>>> Emerging (2 of 13) media-libs/libpng-1.5.12
>>> Jobs: 0 of 13 complete, 1 running               Load avg: 1.07, 0.86, 0.88

Since linux-firmware is a 15M download, does it stall the other jobs?
Comment 5 Zac Medico gentoo-dev 2012-07-14 20:37:13 UTC
(In reply to comment #4)
> One example:
> Calculating dependencies... done!
> >>> Verifying ebuild manifests
> >>> Starting parallel fetch
> >>> Emerging (1 of 13) sys-kernel/linux-firmware-20120708
> >>> Emerging (2 of 13) media-libs/libpng-1.5.12
> >>> Jobs: 0 of 13 complete, 1 running               Load avg: 1.07, 0.86, 0.88
> 
> Since linux-firmware is a 15M download, does it stall the other jobs?

Well, you could be looking at a case of bug 403895 there, which is fixed in portage-2.1.11.x.