Bug 936287

Summary: Retry failed fetches (or allow a fallback URL?)
Product: Portage Development Reporter: Sam James <sam>
Component: Binary packages support Assignee: Portage team <dev-portage>
Status: CONFIRMED ---    
Severity: normal CC: binhost, infra-bugs, zmedico
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=463964
https://bugs.gentoo.org/show_bug.cgi?id=936288
https://bugs.gentoo.org/show_bug.cgi?id=425682
https://bugs.gentoo.org/show_bug.cgi?id=579526
https://bugs.gentoo.org/show_bug.cgi?id=932739
https://bugs.gentoo.org/show_bug.cgi?id=930730
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 377365    

Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-19 08:08:28 UTC
This came up in bug 463964 (sort of in passing).

The problem has two parts:
1) On the portage side, we should.. retry? have fallback URLs for the same binhost? when fetching a binpkg fails.

2) On the infra side, there's inconsistency between the host which serves us the binhost Packages index vs the binpkgs themselves.

This manifests often as:
```
>>> Running pre-merge checks for sys-apps/portage-3.0.65-r1
--2024-07-19 08:00:16--  https://mirror.bytemark.co.uk/gentoo/releases/amd64/binpackages/23.0/x86-64/sys-apps/portage/portage-3.0.65-r1-1.gpkg.tar
Resolving mirror.bytemark.co.uk... 2001:41c8:20:5fc::13, 2001:41c8:20:5e6::150, 80.68.83.150, ...
Connecting to mirror.bytemark.co.uk|2001:41c8:20:5fc::13|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2024-07-19 08:00:16 ERROR 404: Not Found.
```

Let's use this bug for 1).
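
To make part 1) concrete, here is a minimal sketch of "retry, then fall back to another URL for the same binhost". It is not Portage code: `fetch_with_fallback`, `http_get`, and the parameters are hypothetical names, and the policy (move to the next mirror immediately on a 404, retry other errors with exponential backoff) is only one possible choice.

```python
import time
import urllib.error
import urllib.request


def http_get(url):
    """Default fetcher: return the response body, raising on HTTP errors."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_with_fallback(urls, fetch=http_get, retries=2, delay=1.0, sleep=time.sleep):
    """Try each URL in order, retrying transient failures with backoff.

    Hypothetical helper, not part of Portage. `urls` lists the same file
    on several mirrors of one binhost, in order of preference.
    """
    errors = []
    for url in urls:
        for attempt in range(retries + 1):
            try:
                return fetch(url)
            except urllib.error.HTTPError as exc:
                errors.append((url, str(exc)))
                if exc.code == 404:
                    # The file is missing on this mirror; retrying the same
                    # URL is unlikely to help, so move on to the next one.
                    break
            except OSError as exc:
                # Covers URLError, timeouts, and connection failures.
                errors.append((url, str(exc)))
            if attempt < retries:
                sleep(delay * 2 ** attempt)
    raise RuntimeError(f"all fetch attempts failed: {errors}")
```

Passing `fetch` and `sleep` as parameters keeps the retry policy testable without network access.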
Comment 1 Zac Medico gentoo-dev 2024-07-20 04:25:05 UTC
Surely we'll want the ability to retry multiple files simultaneously, which makes this related to bug 425682.
Comment 2 Zac Medico gentoo-dev 2024-07-29 04:01:16 UTC
Since we can't merge a package until its files are fetched, we should change the Scheduler's _choose_pkg method to delay packages that haven't been fully fetched by prefetchers. If _choose_pkg chooses a currently unfetchable package, then it has effectively allocated a "job" (from the --jobs count) that would be better allocated to a package which has already been completely fetched.

So, retry can mostly be localized to the prefetcher / parallel-fetch area of code.

The pkg_pretend "pre-merge checks" code currently has the ability to cancel prefetch and trigger fetch in the foreground if it detects that prefetch is not active. With prefetch handling the retry, we'll want the "pre-merge checks" area of code to have scheduling that is aware of fetch status, like the _choose_pkg fetch status awareness described above.
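
The fetch-status-aware selection described in this comment could look roughly like the sketch below. It is not the real Scheduler._choose_pkg, which also has to respect dependency ordering; `QueuedPkg`, its `fetched` flag, and `choose_pkg` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class QueuedPkg:
    """Hypothetical stand-in for a package waiting in the merge queue."""
    cpv: str
    fetched: bool  # True once the prefetcher has completed all of its files


def choose_pkg(queue: Sequence[QueuedPkg]) -> Optional[QueuedPkg]:
    """Return the first queued package whose files are fully fetched.

    Returning None leaves the job slot free instead of spending it on a
    package that would only block waiting for its download.
    """
    for pkg in queue:
        if pkg.fetched:
            return pkg
    return None
```

A caller would need to poll again when a prefetch finishes, and would need some fallback for the case where prefetch is not active, otherwise nothing would ever become eligible.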
Comment 3 Zac Medico gentoo-dev 2024-07-29 04:16:22 UTC
(In reply to Zac Medico from comment #1)
> Surely we'll want the ability to retry multiple files simultaneously, which
> makes this related to bug 425682.

Since pkg_pretend scheduling needs to be aware of package fetch status as described in comment #2, bug 579526 is also related.
Comment 4 Zac Medico gentoo-dev 2024-07-29 04:52:02 UTC
(In reply to Sam James from comment #0)
> This came up in bug 463964 (sort of in passing).
> 
> The problem has two parts:
> 1) On the portage side, we should.. retry? have fallback URLs for the same
> binhost? when fetching a binpkg fails.

The default behavior of fetching every Packages index update is harmful, because it makes it more likely that users will fetch an inconsistent copy. Currently the binrepos.conf "frozen" setting (bug 932739) is the only means we have available to modify the default behavior.
Comment 5 Zac Medico gentoo-dev 2024-07-29 16:24:04 UTC
(In reply to Zac Medico from comment #4)
> (In reply to Sam James from comment #0)
> > This came up in bug 463964 (sort of in passing).
> > 
> > The problem has two parts:
> > 1) On the portage side, we should.. retry? have fallback URLs for the same
> > binhost? when fetching a binpkg fails.
> 
> The default behavior of fetching every Packages index update is harmful,
> because it makes it more likely that users will fetch an inconsistent copy.
> Currently the binrepos.conf "frozen" setting (bug 932739) is the only means
> we have available to modify the default behavior.

I've been thinking about ways to automatically enable this "frozen" behavior, in order to avoid a manual binrepos.conf edit. A simple solution would be to add a flag to trigger this "frozen" behavior when REPO_REVISIONS (from bug 924772) is in sync with the source repository.

However this simple solution could lead to missed Packages index updates if some of the package builds were delayed as reported in bug 924772 comment 23. I think it would be ideal to cope with this kind of issue on the server side by treating a solver failure as a fatal QA issue as I suggested in bug 924772 comment 25.
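
As a rough illustration of the "freeze when in sync" check, the sketch below compares the revisions recorded by the binhost with those of the local source repositories. It assumes that REPO_REVISIONS is a JSON object mapping repository names to revision identifiers; the function names are hypothetical and this is not Portage's implementation.

```python
import json


def parse_repo_revisions(value):
    """Parse a REPO_REVISIONS value (assumed to be a JSON object) into a dict."""
    return json.loads(value) if value else {}


def index_in_sync(index_revisions, local_revisions):
    """Return True when every repository listed by the binhost index is at
    the same revision locally, which is when "frozen" behavior could be
    enabled automatically under this proposal."""
    if not index_revisions:
        # Without revision data there is nothing to compare, so do not freeze.
        return False
    return all(
        local_revisions.get(repo) == revision
        for repo, revision in index_revisions.items()
    )
```

This check alone does not address the delayed-build case mentioned above: an index can match the local revisions and still be missing packages that have not been built yet.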
Comment 6 Zac Medico gentoo-dev 2024-07-29 16:39:41 UTC
(In reply to Zac Medico from comment #5)
> (In reply to Zac Medico from comment #4)
> However this simple solution could lead to missed Packages index updates if
> some of the package builds were delayed as reported in bug 924772 comment
> 23. I think it would be ideal to cope with this kind of issue on the server
> side by treating a solver failure as a fatal QA issue as I suggested in bug
> 924772 comment 25.

An alternative coping mechanism would be to implement a conditional Packages index update mechanism that only updates when REPO_REVISIONS is unchanged, as suggested in bug 924772 comment 26.
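
One possible reading of this conditional update is sketched below: the client replaces its cached Packages index only when the remote copy reports the same REPO_REVISIONS as the cached one. The sketch assumes the usual "KEY: value" header lines terminated by a blank line and a JSON-encoded REPO_REVISIONS value; `read_header` and `should_update_index` are hypothetical names, and the exact policy intended in bug 924772 comment 26 may differ.

```python
import json


def read_header(index_text):
    """Collect the "KEY: value" lines that precede the first blank line."""
    header = {}
    for line in index_text.splitlines():
        if not line.strip():
            break
        key, sep, value = line.partition(":")
        if sep:
            header[key.strip()] = value.strip()
    return header


def should_update_index(cached_text, remote_text):
    """Accept the remote index only if its REPO_REVISIONS matches the cached one."""
    cached = read_header(cached_text).get("REPO_REVISIONS")
    remote = read_header(remote_text).get("REPO_REVISIONS")
    if cached is None or remote is None:
        # No basis for comparison; leave the decision to the caller.
        return False
    return json.loads(cached) == json.loads(remote)
```

Under this policy a client would still pick up packages that were built late for the same revision, while ignoring an index generated from a newer repository state.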