Bug 647076 - PORTAGE_BINHOST lacks preference ordering now (FEATURES=-binpkg-multi-instance)
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Binary packages support
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-02-08 23:28 UTC by SpanKY
Modified: 2023-08-24 20:36 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
ignore all but 1 package per cpv for multi-binhost (0001-binarytree-ignore-all-but-1-package-per-cpv-for-mult.patch,2.03 KB, patch)
2018-02-12 20:09 UTC, Zac Medico
Description SpanKY gentoo-dev 2018-02-08 23:28:06 UTC
before commit 328dd4712f88cbb8ef390ae9eb471afa1ef781d7 [1], the order of PORTAGE_BINHOST determined preference when multiple binhosts offered the same binpkg.  since that commit, there doesn't seem to be any way to control this.

[1] https://gitweb.gentoo.org/proj/portage.git/commit/?id=328dd4712f88cbb8ef390ae9eb471afa1ef781d7

reproduction:
# *Good* output.
$ git checkout v2.2.18
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/chrome-R66-10365.0.0-rc1/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2

# *Bad* output.
$ git checkout v2.2.19
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
$ git checkout portage-2.3.24
$ ./test.sh
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2
gs://chromeos-prebuilt/board/amd64-generic/full-2018.02.02.070328/packages/chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1.tbz2

$ cat test.sh
#!/bin/bash
run() {
  ./bin/emerge --root-deps -Gfpqvv --nodeps chromeos-base/chromeos-chrome |& grep ^gs://
}

rm -rf foo
export ROOT="$PWD/foo"
export FEATURES="-binpkg-multi-instance"
export CHOST="x86_64-cros-linux-gnu"
base="https://storage.googleapis.com/chromeos-prebuilt/board/amd64-generic"
export PORTAGE_BINHOST="${base}/paladin-R66-10367.0.0-rc2/packages ${base}/chrome-R66-10365.0.0-rc1/packages"
run
export PORTAGE_BINHOST="${base}/chrome-R66-10365.0.0-rc1/packages ${base}/paladin-R66-10367.0.0-rc2/packages"
run
Comment 1 Zac Medico gentoo-dev 2018-02-09 00:43:36 UTC
The difference in behavior is controlled by this code in the fakedbapi class:

	if multi_instance:
		self._instance_key = self._instance_key_multi_instance
	else:
		self._instance_key = self._instance_key_cpv

This patch makes it behave like you want with FEATURES="-binpkg-multi-instance":

> diff --git a/pym/portage/dbapi/bintree.py b/pym/portage/dbapi/bintree.py
> index 201666c41..134c15aa1 100644
> --- a/pym/portage/dbapi/bintree.py
> +++ b/pym/portage/dbapi/bintree.py
> @@ -79,8 +79,9 @@ class bindbapi(fakedbapi):
>  		# * if binpkg-multi-instance is disabled, it's still possible
>  		#   to properly access a PKGDIR which has binpkg-multi-instance
>  		#   layout (or mixed layout)
> +		multi_instance = "binpkg-multi-instance" in kwargs["settings"].features
>  		fakedbapi.__init__(self, exclusive_slots=False,
> -			multi_instance=True, **kwargs)
> +			multi_instance=multi_instance, **kwargs)
>  		self.bintree = mybintree
>  		self.move_ent = mybintree.move_ent
>  		# Selectively cache metadata in order to optimize dep matching.

This change has 2 effects when binpkg-multi-instance is disabled, which are mentioned in the comment directly above the code:

1) Only one binary package for a given cpv will ever be considered, even though there may be multiple instances available from multiple binhosts.

2) If the local PKGDIR had binpkg-multi-instance enabled at some point, only one binary package for a given cpv will be accessible (this case has not really been tested).

We could add a separate FEATURE to toggle the behavior that you want, maybe call it binpkg-cpv-unique or something.
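
To illustrate why the choice of instance key matters, here is a minimal sketch. It is not portage's actual fakedbapi: the `TinyDb` class, its `inject` and `instances` methods, the `load` helper, and the sample records are hypothetical, and the multi-instance key is assumed to be just cpv plus build time. With a cpv-only key, a later package with the same cpv replaces the earlier one; with the multi-instance key, both stay accessible.

```python
CPV = "chromeos-base/chromeos-chrome-66.0.3336.3_rc-r1"


class TinyDb:
    """Hypothetical, simplified stand-in for a db keyed the way fakedbapi is."""

    def __init__(self, multi_instance):
        self._pkgs = {}
        # Same switch as in the snippet above: pick the key function once.
        if multi_instance:
            self._instance_key = self._instance_key_multi_instance
        else:
            self._instance_key = self._instance_key_cpv

    @staticmethod
    def _instance_key_cpv(pkg):
        # One slot per cpv: a second package with the same cpv overwrites.
        return pkg["CPV"]

    @staticmethod
    def _instance_key_multi_instance(pkg):
        # Assumption: cpv plus build time is enough to tell instances apart.
        return (pkg["CPV"], pkg["BUILD_TIME"])

    def inject(self, pkg):
        self._pkgs[self._instance_key(pkg)] = pkg

    def instances(self, cpv):
        return [p for p in self._pkgs.values() if p["CPV"] == cpv]


def load(multi_instance):
    """Inject two builds of the same cpv from two hypothetical binhosts."""
    db = TinyDb(multi_instance)
    db.inject({"CPV": CPV, "BUILD_TIME": 100, "binhost": "a"})
    db.inject({"CPV": CPV, "BUILD_TIME": 200, "binhost": "b"})
    return db


print(len(load(False).instances(CPV)))  # cpv key: one instance survives
print(len(load(True).instances(CPV)))   # multi-instance key: both survive
```

In this sketch the surviving package under the cpv key is simply the last one injected, so injection order decides the outcome.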
Comment 2 Zac Medico gentoo-dev 2018-02-09 09:58:35 UTC
(In reply to Zac Medico from comment #1)
> We could add a separate FEATURE to toggle the behavior that you want, maybe
> call it binpkg-cpv-unique or something.

We could have a behavior that only applies to the binhost logic, while retaining the ability to access multiple packages per cpv in the local PKGDIR. Maybe call it binhost-cpv-unique. This would be implemented in the binarytree._populate_remote method, by discarding any packages that are unwanted. For example:

> diff --git a/pym/portage/dbapi/bintree.py b/pym/portage/dbapi/bintree.py
> index 201666c41..e32b8c116 100644
> --- a/pym/portage/dbapi/bintree.py
> +++ b/pym/portage/dbapi/bintree.py
> @@ -811,6 +811,8 @@ class binarytree(object):
>  
>  		self._remote_has_index = False
>  		self._remotepkgs = {}
> +		cpv_unique = (set() if 'binhost-cpv-unique' in
> +			self.settings.features else None)
>  		for base_url in self.settings["PORTAGE_BINHOST"].split():
>  			parsed_url = urlparse(base_url)
>  			host = parsed_url.netloc
> @@ -1027,6 +1029,10 @@ class binarytree(object):
>  			if pkgindex:
>  				remote_base_uri = pkgindex.header.get("URI", base_url)
>  				for d in pkgindex.packages:
> +					if cpv_unique is not None:
> +						if d["CPV"] in cpv_unique:
> +							continue
> +						cpv_unique.add(d["CPV"])
>  					cpv = _pkg_str(d["CPV"], metadata=d,
>  						settings=self.settings)
>  					# Local package instances override remote instances
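
The filtering idea in the diff above can be tried in isolation with a small sketch. The `merge_indexes` function, the `BASE_URI` field, and the example URLs are hypothetical and only mimic the loop structure of the patch; this is not the real binarytree._populate_remote. Binhosts are visited in PORTAGE_BINHOST order, and a cpv that has already been seen is skipped, so the earliest listed binhost wins.

```python
def merge_indexes(indexes, cpv_unique_enabled=True):
    """Merge package lists from several binhosts, visited in the given order.

    indexes: list of (base_url, packages) pairs, where packages is a list
    of dicts that each carry at least a "CPV" entry.
    """
    # Mirrors the patch: a set when the feature is on, None when it is off.
    cpv_unique = set() if cpv_unique_enabled else None
    merged = []
    for base_url, packages in indexes:
        for d in packages:
            if cpv_unique is not None:
                if d["CPV"] in cpv_unique:
                    # An earlier binhost already offered this cpv.
                    continue
                cpv_unique.add(d["CPV"])
            # Record where the package came from (hypothetical field name).
            merged.append(dict(d, BASE_URI=base_url))
    return merged


indexes = [
    ("https://a.example/packages", [{"CPV": "cat/pkg-1"}, {"CPV": "cat/other-2"}]),
    ("https://b.example/packages", [{"CPV": "cat/pkg-1"}]),
]
for pkg in merge_indexes(indexes):
    print(pkg["CPV"], pkg["BASE_URI"])
```

Swapping the two entries in `indexes` makes the copy of cat/pkg-1 from the other host win, which is the ordering control requested in the description.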
Comment 3 SpanKY gentoo-dev 2018-02-12 19:02:36 UTC
does it really need a dedicated FEATURE ?  wouldn't this be the behavior we always want ?
Comment 4 Zac Medico gentoo-dev 2018-02-12 20:09:58 UTC
Created attachment 519274
ignore all but 1 package per cpv for multi-binhost

(In reply to SpanKY from comment #3)
> does it really need a dedicated FEATURE ?  wouldn't this be the behavior we
> always want ?

That would be fine as long as nobody is relying on the current behavior. It may be that chromeos is the only user of multi-binhost support, so reverting the behavior will not matter to anyone.
Comment 5 SpanKY gentoo-dev 2018-02-15 08:49:21 UTC
(In reply to Zac Medico from comment #4)

i'm not sure i follow.  istm CrOS's use of binhosts is not out of the ordinary:
- PORTAGE_BINHOST has supported multiple remotes since at least Jan 2011 which suggests people use it this way
- vars in PMS/portage in general exhibit an ordering preference when it permits more than one value (usually last in the list has the higher precedence)

given multiple equivalent sources of data, some decision has to be made as to which value to prefer.  before the aforementioned commit (from Mar 2015), the ordering was based on the repo (hence the PORTAGE_BINHOST ordering controlled it).  since that commit, it was changed to (i think) last build time wins regardless of PORTAGE_BINHOST order.

CrOS has been relying on the historical behavior -- PORTAGE_BINHOST order dictates when all other settings match.  i guess you could add a FEATURE to select between PORTAGE_BINHOST ordering and timestamp ordering.  but i think restoring to the "original" behavior makes sense here and see what request comes up next.

do we have a good source of user-facing documentation and expected behavior ?  the make.conf.5 man page is fairly light (it likes to refer the reader to another section, and that other section tends to just redirect as well).  perhaps it'd warrant creating a dedicated page for just this topic that would correlate all the different knobs in one place.
Comment 6 Zac Medico gentoo-dev 2018-02-16 19:35:26 UTC
(In reply to SpanKY from comment #5)
> (In reply to Zac Medico from comment #4)
> 
> i'm not sure i follow.  istm CrOS's use of binhosts is not out of the
> ordinary:
> - PORTAGE_BINHOST has supported multiple remotes since at least Jan 2011
> which suggests people use it this way
> - vars in PMS/portage in general exhibit an ordering preference when it
> permits more than one value (usually last in the list has the higher
> precedence)

Prior behavior was based on the assumption that only one build per cpv would be accessible, but now it's possible to access more than one. All other things being equal, a package with newer BUILD_TIME metadata is preferred. In the case of USE dependencies, a package with different USE flags enabled can be selected to satisfy 

> given multiple equivalent sources of data, some decision has to be made as
> to which value to prefer.  before the aforementioned commit (from Mar 2015),
> the ordering was based on the repo (hence the PORTAGE_BINHOST ordering
> controlled).  since that commit, it was changed to (i think) last build time
> wins regardless of PORTAGE_BINHOST order.
> 
> CrOS has been relying on the historical behavior -- PORTAGE_BINHOST order
> dictates when all other settings match.  i guess you could add a FEATURE to
> select between PORTAGE_BINHOST ordering and timestamp ordering.  but i think
> restoring to the "original" behavior makes sense here and see what request
> comes up next.
> 
> do we have a good source of user-facing documentation and expected behavior
> ?  the make.conf.5 man page is fairly light (it likes to refer the reader to
> another section, and that other section tends to just redirect as well). 
> perhaps it'd warrant creating a dedicated page for just this topic that
> would correlate all the different knobs in one place.
Comment 7 Zac Medico gentoo-dev 2018-02-16 20:05:42 UTC
comment #6 posted prematurely, trying again...

(In reply to SpanKY from comment #5)
> (In reply to Zac Medico from comment #4)
> 
> i'm not sure i follow.  istm CrOS's use of binhosts is not out of the
> ordinary:
> - PORTAGE_BINHOST has supported multiple remotes since at least Jan 2011
> which suggests people use it this way
> - vars in PMS/portage in general exhibit an ordering preference when it
> permits more than one value (usually last in the list has the higher
> precedence)

Prior behavior was based on the assumption that only one build per cpv would be accessible, but now it's possible to access more than one. All other things being equal, a package with newer BUILD_TIME metadata is preferred. In the case of USE dependencies, a package with different USE flags enabled can be selected to satisfy the target USE configuration.

> given multiple equivalent sources of data, some decision has to be made as
> to which value to prefer.  before the aforementioned commit (from Mar 2015),
> the ordering was based on the repo (hence the PORTAGE_BINHOST ordering
> controlled).  since that commit, it was changed to (i think) last build time
> wins regardless of PORTAGE_BINHOST order.

BUILD_TIME only matters if all other things are equal. That could be overridden by USE configuration, built slot operator dependencies which indicate a link to a specific slot, and also soname dependencies if you use --getbinpkgonly together with --ignore-built-slot-operator-deps=n.
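
As a rough illustration of "BUILD_TIME only matters if all other things are equal", the sketch below filters candidates by USE first and only then compares build times. The `pick_binpkg` function and the sample data are hypothetical, USE matching is reduced to exact set equality, and built slot operator deps and soname deps are not modelled at all, so this is far simpler than what the resolver really does.

```python
def pick_binpkg(candidates, wanted_use):
    """Pick a binary package for one cpv.

    candidates: list of dicts with "USE" (iterable of enabled flags) and
    "BUILD_TIME" (int). wanted_use: iterable of flags the target wants.
    Returns the chosen dict, or None if no candidate has matching USE.
    """
    wanted = frozenset(wanted_use)
    # Step 1: metadata must match (simplified here to exact USE equality).
    matching = [c for c in candidates if frozenset(c["USE"]) == wanted]
    if not matching:
        return None
    # Step 2: among otherwise equal candidates, newer BUILD_TIME wins.
    return max(matching, key=lambda c: c["BUILD_TIME"])


candidates = [
    {"name": "old-right-use", "USE": ["foo"], "BUILD_TIME": 100},
    {"name": "new-right-use", "USE": ["foo"], "BUILD_TIME": 200},
    {"name": "newest-wrong-use", "USE": ["foo", "bar"], "BUILD_TIME": 300},
]
print(pick_binpkg(candidates, ["foo"])["name"])
```

The newest build overall is passed over because its USE differs; only the two candidates with identical USE are compared by BUILD_TIME.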

> CrOS has been relying on the historical behavior -- PORTAGE_BINHOST order
> dictates when all other settings match.  i guess you could add a FEATURE to
> select between PORTAGE_BINHOST ordering and timestamp ordering.  but i think
> restoring to the "original" behavior makes sense here and see what request
> comes up next.

As noted above, there's much more than binhost order and build time, there's also USE, built slot operator deps, and soname deps.

> do we have a good source of user-facing documentation and expected behavior
> ?  the make.conf.5 man page is fairly light (it likes to refer the reader to
> another section, and that other section tends to just redirect as well). 
> perhaps it'd warrant creating a dedicated page for just this topic that
> would correlate all the different knobs in one place.

Yes, additional documentation would be very helpful.
Comment 8 Zac Medico gentoo-dev 2018-02-16 21:55:58 UTC
(In reply to Zac Medico from comment #7)
> and also soname dependencies you use
> --getbinpkgonly together with --ignore-built-slot-operator-deps=n.

I mean --getbinpkgonly together with --ignore-soname-deps=n.
Comment 9 SpanKY gentoo-dev 2018-02-23 17:21:25 UTC
sure, i grok that when other metadata matches better (USE/CHOST/etc...), those are preferred regardless of ordering.

i think in our case, it's the BUILD_TIME mucking things up.  we have bots that produce "better" binpkgs, but at a slower pace than the other builders.  we've been relying on the ordering here to make it work.  that's the part we want to restore, at least temporarily, until we can figure out a better way of generating the binhosts.
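
The conflict between the two tie-breakers can be shown with a small sketch. The function names, the binhost labels, and the build times are hypothetical, and the sketch assumes that the earliest listed binhost wins, as in the patch from comment 2. All other metadata is taken to be equal.

```python
def prefer_newest(candidates):
    """Tie-break by BUILD_TIME: the most recently built package wins."""
    return max(candidates, key=lambda c: c["BUILD_TIME"])


def prefer_binhost_order(candidates, binhost_order):
    """Tie-break by position in the binhost list: earliest listed wins."""
    return min(candidates, key=lambda c: binhost_order.index(c["binhost"]))


# The slow bot produces the "better" binpkg, but it finishes earlier
# builds later than the fast builder finishes newer ones.
candidates = [
    {"binhost": "slow-bot", "BUILD_TIME": 1000},
    {"binhost": "fast-builder", "BUILD_TIME": 2000},
]
binhost_order = ["slow-bot", "fast-builder"]

print(prefer_newest(candidates)["binhost"])
print(prefer_binhost_order(candidates, binhost_order)["binhost"])
```

Under the BUILD_TIME rule the fast builder's package is always chosen, no matter how the binhosts are listed; under the ordering rule the listed preference decides.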