before commit 328dd4712f88cbb8ef390ae9eb471afa1ef781d7 [1], the order of PORTAGE_BINHOST determined preference when multiple binhosts offered the same binpkg. since that commit, there doesn't seem to be any way to control this.

[1] https://gitweb.gentoo.org/proj/portage.git/commit/?id=328dd4712f88cbb8ef390ae9eb471afa1ef781d7

reproduction:

# *Good* output.
$ git checkout v2.2.18
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/chrome-R66-10365.0.0-rc1/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2

# *Bad* output.
$ git checkout v2.2.19
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
$ git checkout portage-2.3.24
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2

$ cat test.sh
#!/bin/bash
run() {
	./bin/emerge --root-deps -Gfpqvv --nodeps chromeos-base/chromeos-chrome |& grep ^gs://
}
rm -rf foo
export ROOT="$PWD/foo"
export FEATURES="-binpkg-multi-instance"
export CHOST="x86_64-cros-linux-gnu"
base="https://storage.googleapis.com/chromeos-prebuilt/board/amd64-generic"
export PORTAGE_BINHOST="${base}/paladin-R66-10367.0.0-rc2/packages ${base}/chrome-R66-10365.0.0-rc1/packages"
run
export PORTAGE_BINHOST="${base}/chrome-R66-10365.0.0-rc1/packages ${base}/paladin-R66-10367.0.0-rc2/packages"
run
The difference in behavior is controlled by this code in the fakedbapi class:

	if multi_instance:
		self._instance_key = self._instance_key_multi_instance
	else:
		self._instance_key = self._instance_key_cpv

This patch makes it behave the way you want with FEATURES="-binpkg-multi-instance":

> diff --git a/pym/portage/dbapi/bintree.py b/pym/portage/dbapi/bintree.py
> index 201666c41..134c15aa1 100644
> --- a/pym/portage/dbapi/bintree.py
> +++ b/pym/portage/dbapi/bintree.py
> @@ -79,8 +79,9 @@ class bindbapi(fakedbapi):
>  		# * if binpkg-multi-instance is disabled, it's still possible
>  		#   to properly access a PKGDIR which has binpkg-multi-instance
>  		#   layout (or mixed layout)
> +		multi_instance = "binpkg-multi-instance" in kwargs["settings"].features
>  		fakedbapi.__init__(self, exclusive_slots=False,
> -			multi_instance=True, **kwargs)
> +			multi_instance=multi_instance, **kwargs)
>  		self.bintree = mybintree
>  		self.move_ent = mybintree.move_ent
>  		# Selectively cache metadata in order to optimize dep matching.

This change has 2 effects when binpkg-multi-instance is disabled, which are mentioned in the comment directly above the code:

1) Only one binary package for a given cpv will ever be considered, even though there may be multiple instances available from multiple binhosts.

2) If the local PKGDIR had binpkg-multi-instance enabled at some point, only one binary package for a given cpv will be accessible (this case has not really been tested).

We could add a separate FEATURE to toggle the behavior that you want, maybe call it binpkg-cpv-unique or something.
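As a rough illustration of why the choice of instance key matters, here is a toy model (not portage's code). The `Pkg` record, the shape of the multi-instance key, the binhost labels, and the build times are all assumptions made for the sketch; the only point is that keying on cpv alone leaves one slot per cpv, while a richer key keeps every build.

```python
from collections import namedtuple

# Hypothetical stand-in for a binary package record; not portage's _pkg_str.
Pkg = namedtuple("Pkg", "cpv build_id build_time binhost")

def key_cpv(pkg):
    # One slot per cpv: a later add replaces an earlier one.
    return pkg.cpv

def key_multi_instance(pkg):
    # Assumed key shape for illustration: distinct builds get distinct slots.
    return (pkg.cpv, pkg.build_id, pkg.build_time)

def index(pkgs, key):
    # Build a dict index the way a simple in-memory package db might.
    db = {}
    for pkg in pkgs:
        db[key(pkg)] = pkg
    return db

CPV = "chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1"
# Made-up build ids and timestamps, listed in binhost order.
pkgs = [
    Pkg(CPV, 1, 1517554800, "paladin"),
    Pkg(CPV, 2, 1517468400, "chrome"),
]

by_cpv = index(pkgs, key_cpv)
by_instance = index(pkgs, key_multi_instance)

print(len(by_cpv), len(by_instance))
print(by_cpv[CPV].binhost)
```

In this toy model the cpv-keyed index ends up holding whichever package was added last, so the order in which binhosts are scanned decides the winner. With the multi-instance key both builds survive, and some later selection step has to choose between them.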
(In reply to Zac Medico from comment #1)
> We could add a separate FEATURE to toggle the behavior that you want, maybe
> call it binpkg-cpv-unique or something.

We could have a behavior that only applies to the binhost logic, while retaining the ability to access multiple packages per cpv in the local PKGDIR. Maybe call it binhost-cpv-unique. This would be implemented in the binarytree._populate_remote method, by discarding any packages that are unwanted. For example:

> diff --git a/pym/portage/dbapi/bintree.py b/pym/portage/dbapi/bintree.py
> index 201666c41..e32b8c116 100644
> --- a/pym/portage/dbapi/bintree.py
> +++ b/pym/portage/dbapi/bintree.py
> @@ -811,6 +811,8 @@ class binarytree(object):
>  
>  		self._remote_has_index = False
>  		self._remotepkgs = {}
> +		cpv_unique = (set() if 'binhost-cpv-unique' in
> +			self.settings.features else None)
>  		for base_url in self.settings["PORTAGE_BINHOST"].split():
>  			parsed_url = urlparse(base_url)
>  			host = parsed_url.netloc
> @@ -1027,6 +1029,10 @@ class binarytree(object):
>  			if pkgindex:
>  				remote_base_uri = pkgindex.header.get("URI", base_url)
>  				for d in pkgindex.packages:
> +					if cpv_unique is not None:
> +						if d["CPV"] in cpv_unique:
> +							continue
> +						cpv_unique.add(d["CPV"])
>  					cpv = _pkg_str(d["CPV"], metadata=d,
>  						settings=self.settings)
>  					# Local package instances override remote instances
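The filtering idea in the patch above can be tried in isolation with a small standalone sketch. This is not the portage implementation: the `unique_by_cpv` helper, the binhost names, and the package lists are hypothetical, and the input shape only loosely mirrors a parsed Packages index.

```python
def unique_by_cpv(binhosts):
    """Keep one package per CPV across binhosts, scanning in the given order.

    `binhosts` is a list of (base_url, packages) pairs, where each package is
    a dict with at least a "CPV" key. This shape is an assumption made for
    the sketch.
    """
    seen = set()
    kept = []
    for base_url, packages in binhosts:
        for d in packages:
            # Skip any cpv that an earlier binhost already provided.
            if d["CPV"] in seen:
                continue
            seen.add(d["CPV"])
            kept.append((base_url, d["CPV"]))
    return kept

# Hypothetical binhost names and package lists, for illustration only.
binhosts = [
    ("binhost-a", [{"CPV": "cat/foo-1.0"}, {"CPV": "cat/bar-2.0"}]),
    ("binhost-b", [{"CPV": "cat/foo-1.0"}, {"CPV": "cat/baz-3.0"}]),
]

first_wins = unique_by_cpv(binhosts)
last_wins = unique_by_cpv(list(reversed(binhosts)))

print(first_wins)
print(last_wins)
```

In this sketch the first binhost that lists a cpv wins. Scanning the list in reverse gives the opposite precedence, so whichever direction is chosen would need to be stated wherever PORTAGE_BINHOST ordering is documented.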
does it really need a dedicated FEATURE ? wouldn't this be the behavior we always want ?
Created attachment 519274
ignore all but 1 package per cpv for multi-binhost

(In reply to SpanKY from comment #3)
> does it really need a dedicated FEATURE ?  wouldn't this be the behavior we
> always want ?

That would be fine as long as nobody is relying on the current behavior. It may be that chromeos is the only user of multi-binhost support, so reverting the behavior will not matter to anyone.
(In reply to Zac Medico from comment #4)

i'm not sure i follow. istm CrOS's use of binhosts is not out of the ordinary:
- PORTAGE_BINHOST has supported multiple remotes since at least Jan 2011, which suggests people use it this way
- vars in PMS/portage in general exhibit an ordering preference when they permit more than one value (usually last in the list has the higher precedence)

given multiple equivalent sources of data, some decision has to be made as to which value to prefer. before the aforementioned commit (from Mar 2015), the ordering was based on the repo (hence PORTAGE_BINHOST ordering controlled it). since that commit, it was changed to (i think) last build time wins regardless of PORTAGE_BINHOST order.

CrOS has been relying on the historical behavior -- PORTAGE_BINHOST order dictates when all other settings match. i guess you could add a FEATURE to select between PORTAGE_BINHOST ordering and timestamp ordering. but i think restoring the "original" behavior makes sense here, and then we can see what request comes up next.

do we have a good source of user-facing documentation and expected behavior ? the make.conf.5 man page is fairly light (it likes to refer the reader to another section, and that other section tends to just redirect as well). perhaps it'd warrant creating a dedicated page for just this topic that would correlate all the different knobs in one place.
comment #6 posted prematurely, trying again...

(In reply to SpanKY from comment #5)
> (In reply to Zac Medico from comment #4)
> 
> i'm not sure i follow.  istm CrOS's use of binhosts is not out of the
> ordinary:
> - PORTAGE_BINHOST has supported multiple remotes since at least Jan 2011
> which suggests people use it this way
> - vars in PMS/portage in general exhibit an ordering preference when it
> permits more than one value (usually last in the list has the higher
> precedence)

Prior behavior was based on the assumption that only one build per cpv would be accessible, but now it's possible to access more than one. All other things being equal, a package with newer BUILD_TIME metadata is preferred. In the case of USE dependencies, a package with different USE flags enabled can be selected to satisfy the target USE configuration.

> given multiple equivalent sources of data, some decision has to be made as
> to which value to prefer.  before the aforementioned commit (from Mar 2015),
> the ordering was based on the repo (hence the PORTAGE_BINHOST ordering
> controlled).  since that commit, it was changed to (i think) last build time
> wins regardless of PORTAGE_BINHOST order.

BUILD_TIME only matters if all other things are equal. That could be overridden by USE configuration, built slot operator dependencies which indicate a link to a specific slot, and also soname dependencies if you use --getbinpkgonly together with --ignore-built-slot-operator-deps=n.

> CrOS has been relying on the historical behavior -- PORTAGE_BINHOST order
> dictates when all other settings match.  i guess you could add a FEATURE to
> select between PORTAGE_BINHOST ordering and timestamp ordering.  but i think
> restoring to the "original" behavior makes sense here and see what request
> comes up next.

As noted above, there's much more than binhost order and build time, there's also USE, built slot operator deps, and soname deps.

> do we have a good source of user-facing documentation and expected behavior
> ?  the make.conf.5 man page is fairly light (it likes to refer the reader to
> another section, and that other section tends to just redirect as well). 
> perhaps it'd warrant creating a dedicated page for just this topic that
> would correlate all the different knobs in one place.

Yes, additional documentation would be very helpful.
(In reply to Zac Medico from comment #7)
> and also soname dependencies you use
> --getbinpkgonly together with --ignore-built-slot-operator-deps=n.

I mean --getbinpkgonly together with --ignore-soname-deps=n.
sure, i grok that when other metadata matches better (USE/CHOST/etc...), those are preferred regardless of ordering. i think in our case, it's the BUILD_TIME mucking things up. we have bots that produce "better" binpkgs, but at a slower pace than the other builders. we've been relying on the ordering here to make it work. that's the part where we want to level, at least temporarily if we can figure out a better way of generating the binhosts.
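To make the two tie-breaking rules discussed in this thread concrete, here is a minimal sketch that compares them. It assumes the candidates are already equivalent in every other respect (USE, built slot operator deps, soname deps), and the bot names, timestamps, and helper functions are hypothetical; none of this is portage code.

```python
def pick_newest(candidates):
    # Timestamp rule: the package with the newest BUILD_TIME wins.
    return max(candidates, key=lambda c: c["BUILD_TIME"])

def pick_by_binhost_order(candidates, binhost_order, last_wins=True):
    # Ordering rule: position in the binhost list decides, ignoring BUILD_TIME.
    rank = {name: i for i, name in enumerate(binhost_order)}
    chooser = max if last_wins else min
    return chooser(candidates, key=lambda c: rank[c["binhost"]])

# Made-up data: the slower bot builds the preferred package, but it is older.
candidates = [
    {"binhost": "fast-bot", "BUILD_TIME": 2000},
    {"binhost": "slow-bot", "BUILD_TIME": 1000},
]
order = ["fast-bot", "slow-bot"]

newest = pick_newest(candidates)
ordered = pick_by_binhost_order(candidates, order)

print(newest["binhost"])
print(ordered["binhost"])
```

With these made-up values the timestamp rule selects the fast bot's package and the ordering rule selects the slow bot's, which is the situation described above, where the slower builders produce the packages that should be preferred.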