Bug 806592 - [Future EAPI] ECLASS_REVISION and autogenerated ebuild microrevisions
Status: CONFIRMED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: PMS/EAPI
Hardware: All Linux
Importance: Normal normal
Assignee: PMS/EAPI
URL:
Whiteboard: in-eapi-9
Keywords:
Depends on:
Blocks: future-eapi
Reported: 2021-08-05 18:54 UTC by Andreas K. Hüttel
Modified: 2022-08-08 18:06 UTC

See Also:
Package list:
Runtime testing required: ---


Description Andreas K. Hüttel archtester gentoo-dev 2021-08-05 18:54:14 UTC
As discussed on IRC... proposal for EAPI=9

An eclass can define a variable ECLASS_REVISION, assigned to a positive integer. If the variable is not defined, it defaults to 0.

Together with changes to an eclass, this variable should be increased by 1 whenever the outward-facing interface of the eclass changes, in particular the dependency strings it generates. It must never be decreased.

The package manager shall track the revisions of all eclasses used when sourcing an ebuild. From these it calculates a microrevision value and associates it with the ebuild and eventually with the installed package or the binary package. 

A possible calculation method for the microrevision would be the sum of all eclass revisions.

The microrevision is taken into account in version comparisons, as a level below the ebuild revision. In particular, an increased microrevision triggers an upgrade-like rebuild.
 
Since ebuild microrevisions are in general autogenerated and invisible, there is no way to depend on a specific microrevision.
The = operator in dependencies treats different microrevisions as equal.
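
For illustration only, the scheme above can be sketched in Python. This is a simplified model and not any package manager's implementation: the names microrevision, sort_key, needs_rebuild and equals_dep_matches are hypothetical, a version is reduced to an (upstream version tuple, ebuild revision) pair, and the sum is just the "possible calculation method" mentioned above.

```python
from typing import Dict, Tuple

# Hypothetical, simplified model: (upstream version tuple, ebuild revision).
Version = Tuple[Tuple[int, ...], int]


def microrevision(eclass_revisions: Dict[str, int]) -> int:
    """Sum of ECLASS_REVISION over all eclasses used by an ebuild.

    An eclass that does not define the variable is assumed to be passed as 0.
    """
    return sum(eclass_revisions.values())


def sort_key(version: Version, microrev: int):
    """The microrevision is compared last, one level below the ebuild revision."""
    upstream, revision = version
    return (upstream, revision, microrev)


def needs_rebuild(installed: Tuple[Version, int], available: Tuple[Version, int]) -> bool:
    """True if the available ebuild sorts above the installed package."""
    return sort_key(*available) > sort_key(*installed)


def equals_dep_matches(dep: Version, candidate: Version, candidate_microrev: int) -> bool:
    """An '=' dependency ignores the microrevision entirely."""
    return dep == candidate
```

With the same version and ebuild revision on both sides, raising the revision of one eclass from 1 to 2 is enough for needs_rebuild to return True, while an = dependency on that version keeps matching either build.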
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-08-05 19:04:20 UTC
(In reply to Andreas K. Hüttel from comment #0)
> As discussed on IRC... proposal for EAPI=9
> 
> An eclass can define a variable   
> ECLASS_REVISION
> assigned to a positive integer. If the variable is not defined, it defaults
> to 0.

Nit: the default value doesn't fit in the defined value range.
Comment 2 Andreas K. Hüttel archtester gentoo-dev 2021-08-05 19:12:20 UTC
As for what this is good for... here's an example.

perl-module.eclass adds a dependency on dev-lang/perl to each package.
Now imagine that the precise dependency string needs to be changed. Do we really want to revbump 1600 otherwise unchanged packages, just so the new depstring makes its way into local installed package databases?

In the past, the kde eclasses had similar functions, e.g., adding a minimal Qt version - and here the same problem arises.

And probably many more cases...
Comment 3 Andreas K. Hüttel archtester gentoo-dev 2021-08-05 19:12:49 UTC
(In reply to Michał Górny from comment #1)
> (In reply to Andreas K. Hüttel from comment #0)
> > As discussed on IRC... proposal for EAPI=9
> > 
> > An eclass can define a variable   
> > ECLASS_REVISION
> > assigned to a positive integer. If the variable is not defined, it defaults
> > to 0.
> 
> Nit: the default value doesn't fit in the defined value range.

Well, it is smaller than any assignable value. So assigning a first value always increases!
Comment 4 Ulrich Müller gentoo-dev 2021-08-05 19:36:44 UTC
(In reply to Michał Górny from comment #1)
> (In reply to Andreas K. Hüttel from comment #0)
> > As discussed on IRC... proposal for EAPI=9
> > 
> > An eclass can define a variable   
> > ECLASS_REVISION
> > assigned to a positive integer. If the variable is not defined, it defaults
> > to 0.
> 
> Nit: the default value doesn't fit in the defined value range.

I tend to agree with mgorny here. There's no good reason to disallow an assignment of 0 to the variable. That would also be consistent with ebuild revisions, where an explicit -r0 is allowed.

If we followed the language used elsewhere in the spec, we would say "unsigned integer".
Comment 5 Ulrich Müller gentoo-dev 2022-01-08 12:49:25 UTC
(In reply to Andreas K. Hüttel from comment #0)
> The package manager shall track the revisions of all eclasses used when
> sourcing an ebuild. From these it calculates a microrevision value and
> associates it with the ebuild and eventually with the installed package or
> the binary package. 
> 
> A possible calculation method for the microrevision would be the sum of all
> eclass revisions.

As discussed in #gentoo-pms, taking the _maximum_ of all eclass revisions makes more sense than the sum. For example, ECLASS_REVISION could be date-based, like ECLASS_REVISION="20220108". Alternatively, it could be a counter; an updated eclass would take the current global maximum and add 1 to it (should be trivial to write a script for this).

About indirect inherits, excluding them would be somewhat unsystematic. It could also lead to problems with inherit guards if an eclass is both indirectly and directly inherited. Maybe we could allow ECLASS_REVISION to be conditional instead (typically under the same variable that disables dependencies)?
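
A rough Python sketch of the maximum-based variant with a global counter, under stated assumptions: microrevision_max and bump_eclass are hypothetical names, and the "tree" is just a dict mapping eclass names to their current ECLASS_REVISION.

```python
from typing import Dict, Iterable


def microrevision_max(revisions: Iterable[int]) -> int:
    """Microrevision as the maximum of all inherited eclass revisions (0 if none)."""
    return max(revisions, default=0)


def bump_eclass(tree: Dict[str, int], eclass: str) -> Dict[str, int]:
    """Return a copy of the tree where `eclass` gets the global maximum plus 1."""
    updated = dict(tree)
    updated[eclass] = max(tree.values(), default=0) + 1
    return updated
```

Because the updated eclass ends up above every other revision in the tree, the maximum seen by any ebuild inheriting it necessarily increases. The same function also works unchanged with date-based values such as 20220108.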
Comment 6 Michael Orlitzky gentoo-dev 2022-02-05 13:11:20 UTC
(In reply to Andreas K. Hüttel from comment #0)
> 
> perl-module.eclass adds to each package a dependency on dev-lang/perl
> Now imagine that the precise dependency string needs to be changed. Do we
> really want to revbump 1600 packages without changes then, so the new
> depstring makes its way into local installed package databases?
> 

Couldn't this be automated with (say) pkgdev? If all we need to do is bump every consumer from -rN to -r(N+1), the code shouldn't be very complicated. The git diff would even be relatively readable with diff.renames=True.

A simple rename would commit the new revisions straight to stable, but since that's effectively what's on the table anyway... *shrug*

Name clashes would technically be possible, but exceedingly rare.
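
As a sketch of how small the core of such a tool could be, here is the filename step in Python. This is not pkgdev code or its API; bump_revision is a hypothetical helper, and it only computes the new name without touching git, keywords or manifests.

```python
import re

# Hypothetical helper, not part of pkgdev or any existing tool.
_EBUILD_RE = re.compile(r"^(?P<stem>.+?)(?:-r(?P<rev>\d+))?\.ebuild$")


def bump_revision(filename: str) -> str:
    """Return the ebuild filename with its revision raised from -rN to -r(N+1)."""
    match = _EBUILD_RE.match(filename)
    if match is None:
        raise ValueError(f"not an ebuild filename: {filename!r}")
    # A missing -rN suffix is equivalent to -r0.
    revision = int(match.group("rev") or 0)
    return f"{match.group('stem')}-r{revision + 1}.ebuild"
```

For example, bump_revision("perl-foo-1.2.ebuild") gives "perl-foo-1.2-r1.ebuild", and an existing -r3 becomes -r4.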
Comment 7 Michael Orlitzky gentoo-dev 2022-02-05 13:40:04 UTC
(In reply to Michael Orlitzky from comment #6)
> 
> Couldn't this be automated with (say) pkgdev?

I've forgotten to mention my motivation.

Both approaches require one-time manual intervention by the developer, but creating "real" revisions has two benefits. First, it keeps the PM a little bit simpler. But mainly, it would be faster for end users. Comparing filenames to decide "do I upgrade this?" is relatively easy, while sourcing ebuilds to figure out what eclasses (and at what revisions) they use is comparatively slow.
Comment 8 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-05 13:56:21 UTC
(In reply to Michael Orlitzky from comment #6)
> Couldn't this be automated with (say) pkgdev? If all we need to do is bump
> every consumer from -rN to -r(N+1), the code shouldn't be very complicated.
> The git diff would even be relatively readable with diff.renames=True.

What about ebuilds outside ::gentoo?

> Name clashes would technically be possible, but exceedingly rare.

How did you determine this?  And even if they're really rare, how do you handle them?
Comment 9 Michael Orlitzky gentoo-dev 2022-02-05 14:25:59 UTC
(In reply to Michał Górny from comment #8)
> (In reply to Michael Orlitzky from comment #6)
> > Couldn't this be automated with (say) pkgdev? If all we need to do is bump
> > every consumer from -rN to -r(N+1), the code shouldn't be very complicated.
> > The git diff would even be relatively readable with diff.renames=True.
> 
> What about ebuilds outside ::gentoo?

I don't have a good (short) answer for that. The bottom line is that I'd be OK with adding this to the list of downsides one adopts when deciding to work outside of the tree. Overlays already forego all tree-wide improvement commits. If the commit message was unique enough, overlay authors could track them and run the revision-bump tool themselves.


> > Name clashes would technically be possible, but exceedingly rare.
> 
> How did you determine this?  And even if they're really rare, how do you
> handle them?

I'm estimating based on what I've seen in the tree. To cause a conflict you'd need,

  * Two successive revisions of the same package;
  * The earlier revision using foo.eclass, the latter not; and
  * A change to foo.eclass that necessitates a revision bump

The developer would have to decide what to do to resolve the problem. You could bump both revisions (wastes a compile), or delete the old one (depends on stable keywords), or ignore that package entirely (doesn't help people using the old revision). Or what I guess is my preferred solution: sort it out with that package's maintainer before changing the eclass.

Regardless, at worst you get a pointless rebuild in those situations.
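
The clash condition listed above can be checked mechanically before any rename. A minimal Python sketch, assuming the revisions of a single package version are given as plain integers; find_clashes is a hypothetical name, not an existing tool.

```python
from typing import Iterable, List


def find_clashes(existing: Iterable[int], consumers: Iterable[int]) -> List[int]:
    """Revisions whose bump to N+1 would collide with an ebuild that stays put.

    existing:  all revisions of one package version present in the tree
    consumers: the subset of those revisions that inherit the changed eclass
    """
    existing_set = set(existing)
    consumer_set = set(consumers)
    return sorted(
        rev
        for rev in consumer_set
        # The target name is taken by an ebuild that is not itself bumped.
        if rev + 1 in existing_set and rev + 1 not in consumer_set
    )
```

With -r1 and -r2 in the tree and only -r1 using foo.eclass, find_clashes({1, 2}, {1}) reports [1]. If both revisions use the eclass, nothing is reported, since both move up (renaming the highest revision first).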
Comment 10 Michael Orlitzky gentoo-dev 2022-02-05 18:34:12 UTC
(In reply to Michael Orlitzky from comment #9)
> (In reply to Michał Górny from comment #8)
>  
> > What about ebuilds outside ::gentoo?
> 
> I don't have a good (short) answer for that.

I guess I'll mention it: this problem goes away if you revision the eclass as well. If you revision the eclass when you make the change, and then use a tool to update all of its consumers, you can leave the old eclass in the tree for a month to give overlays a chance to update (via the tool).

That's a small amount of extra work, but the main issue with it is that a lot of people seem vehemently opposed to seeing something like -r5 on an eclass.
Comment 11 Michael Orlitzky gentoo-dev 2022-02-22 15:22:00 UTC
> in-eapi-9

Before approving this, please do a pros/cons list comparing this to literally revisioning the eclass and using a tool to auto-update consumers.

Placing the burden on users to recompute this a million times is crazy. All consequences are known at commit-time, and a revbump tool is guaranteed to be simpler to implement and understand.
Comment 12 Ulrich Müller gentoo-dev 2022-02-22 21:27:38 UTC
(In reply to Michael Orlitzky from comment #11)

The whiteboard status follows approval by the Council: https://projects.gentoo.org/council/meeting-logs/20220213-summary.txt
Comment 13 Michael Orlitzky gentoo-dev 2022-02-22 21:31:44 UTC
(In reply to Ulrich Müller from comment #12)
> (In reply to Michael Orlitzky from comment #11)
> 
> The whiteboard status follows approval by the Council:
> https://projects.gentoo.org/council/meeting-logs/20220213-summary.txt

I know, but true approval still awaits a specification and implementation. As does an honest comparison; at the moment, there's not much to compare to.
Comment 14 Ulrich Müller gentoo-dev 2022-02-22 21:48:50 UTC
(In reply to Michael Orlitzky from comment #11)
> Before approving this, please do a pros/cons list comparing this to
> literally revisioning the eclass and using a tool to auto-update consumers.

Not sure if I understand this correctly. What do you mean by "revisioning" the eclass? Changing its name? And by what mechanism would updating these ebuilds trigger a package rebuild?
Comment 15 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-02-22 22:11:22 UTC
I have no objection to someone making the tooling to just revbump consumers and it'd be useful for other bits anyway. And it'd be simpler to implement.
Comment 16 Michael Orlitzky gentoo-dev 2022-02-22 22:24:04 UTC
(In reply to Ulrich Müller from comment #14)
> 
> Not sure if I understand this correctly. What do you mean by "revisioning"
> the eclass? Changing its name? And by what mechanism would updating these
> ebuilds trigger a package rebuild?

Yes. To be concrete, I'm talking about creating a new revision -r<N+1> of the eclass and leaving the old one -rN behind for about a month.

What I'm suggesting is that instead of spending the time writing package manager code to support this new feature in EAPI=9, we instead spend it adding (less) code to e.g. pkgdev to perform the straight-to-stable revbumps of all consumers.

This should be much simpler, and will benefit all existing EAPIs. It will almost certainly require less code. It supports overlays by leaving the old eclass revisions in the tree long enough for the overlay maintainers to run the tool themselves.

But most importantly, it performs the "what needs to be rebuilt?" calculation once, on the developer's machine, rather than repeatedly on users' machines during  dependency resolution.
Comment 17 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-22 22:49:40 UTC
(In reply to Michael Orlitzky from comment #16)
> [...] It supports overlays by leaving the old
> eclass revisions in the tree long enough for the overlay maintainers to run
> the tool themselves.

Let's be realistic and say that the vast majority of third-party repos will not update their ebuilds until the old eclass revision is removed and I file a bug telling them that stuff just blew up.
Comment 18 Michael Orlitzky gentoo-dev 2022-02-22 23:07:15 UTC
(In reply to Michał Górny from comment #17)
> 
> Let's be realistic and say that the vast majority of third-party repos will
> not update their ebuilds until the old eclass revision is removed and I file
> a bug telling them that stuff just blew up.

It's already standard practice to mark the old revision as @DEPRECATED, which should initiate the stuff-gon-blow-up warning automatically.
Comment 19 Ulrich Müller gentoo-dev 2022-02-23 04:36:29 UTC
(In reply to Michael Orlitzky from comment #16)
> (In reply to Ulrich Müller from comment #14)
> > 
> > Not sure if I understand this correctly. What do you mean by "revisioning"
> > the eclass? Changing its name? And by what mechanism would updating these
> > ebuilds trigger a package rebuild?
> 
> Yes. To be concrete, I'm talking about creating a new revision -r<N+1> of
> the eclass and leaving the old one -rN behind for about a month.

But there is no such thing as eclass revisions. Eclasses have only a name which isn't interpreted any further.

> What I'm suggesting is that instead of spending the time writing package
> manager code to support this new feature in EAPI=9, we instead spend it
> adding (less) code to e.g. pkgdev to perform the straight-to-stable revbumps
> of all consumers.

IIUC this proposal is precisely about avoiding that somewhat ugly scenario.

> This should be much simpler, and will benefit all existing EAPIs. It will
> almost certainly require less code. It supports overlays by leaving the old
> eclass revisions in the tree long enough for the overlay maintainers to run
> the tool themselves.
> 
> But most importantly, it performs the "what needs to be rebuilt?"
> calculation once, on the developer's machine, rather than repeatedly on
> users' machines during  dependency resolution.

Metadata will change in both scenarios, and calculation of the microrevision should be trivial (and therefore fast). So why do you expect eclass revisions to be significantly slower than ebuild revisions?

We may want to verify this however.
@dilfridge: Could you provide some numbers on dependency resolution time, once we have an implementation in Portage?

OTOH all those bumped ebuild revisions (plus changed manifests with rsync) would have to be downloaded by the user.
Comment 20 Michael Orlitzky gentoo-dev 2022-02-23 13:52:35 UTC
> 
> But there is no such thing as eclass revisions. Eclasses have only a name
> which isn't interpreted any further.

That's not an obstacle. The (true) revisions of the ebuilds themselves are what the package manager will notice. But I'll be more careful with my wording.


> > What I'm suggesting is that instead of spending the time writing package
> > manager code to support this new feature in EAPI=9, we instead spend it
> > adding (less) code to e.g. pkgdev to perform the straight-to-stable revbumps
> > of all consumers.
> 
> IIUC this proposal is precisely about avoiding that somewhat ugly scenario.

Yes, it's a hard problem. And I would rather have any solution than not. All I'm asking is for a fair comparison of the alternatives before we commit to one.


> Metadata will change in both scenarios, and calculation of the microrevision
> should be trivial (and therefore fast). So why do you expect eclass
> revisions to be significantly slower than ebuild revisions?

The package manager can easily see that a newer package revision should be pulled in by looking at the filename, especially if the older one has been removed.

With the ECLASS_REVISIONs, the package manager needs to know what eclasses are used by the installed version of the package, and which eclasses those eclasses use, and whether or not any of their ECLASS_REVISIONs have changed. Should it need to be done, that involves running a lot of bash code and is going to be slower.

Presumably EAPI=9 would allow eclass revision info in the metadata cache, but we shouldn't rely on the optional cache to make a feature usable. (If we can rely on a metadata cache, this feature isn't necessary!) For one, all developers use a git checkout, and many users find it preferable to rsync. This often requires metadata computation and/or generation on users' systems. Overlay users further do not benefit from a pre-generated cache. And in a perfect world, some day we might even have metadata (at least the cached subset of it) that is free of bash code, making a cache obsolete.


> OTOH all those bumped ebuild revisions (plus changed manifests with rsync)
> would have to be downloaded by the user.

This is a good point and something I hadn't thought of.
Comment 21 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-23 14:00:53 UTC
The PM already needs to get the complete list of inherited eclasses and check them for changes during dependency calculation.  Having to grab eclass revisions doesn't change a thing.
Comment 22 Michael Orlitzky gentoo-dev 2022-02-23 17:06:10 UTC
(In reply to Michał Górny from comment #21)
> The PM already needs to get the complete list of inherited eclasses and
> check them for changes during dependency calculation.  Having to grab eclass
> revisions doesn't change a thing.

"Already" is the wrong comparison. The PM wouldn't need to check for those changes if we disallowed them, instead requiring an -r<N+1> of the eclass and revbumps of consumers.

ECLASS_REVISION codifies the need for the PM to read through those eclasses, while creating a new eclass and using a revbump tool would allow us to avoid it.
Comment 23 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-23 20:17:46 UTC
No, it wouldn't.  The PM needs to process the eclasses in order to obtain other ebuild metadata.  Adding an additional metadata variable has negligible impact, provided that it is cached the same way as other vars.
Comment 24 Michael Orlitzky gentoo-dev 2022-02-24 02:32:28 UTC
(In reply to Michał Górny from comment #23)
> No, it wouldn't.  The PM needs to process the eclasses in order to obtain
> other ebuild metadata.  Adding an additional metadata variable has
> negligible impact, provided that it is cached the same way as other vars.

I'm either missing something obvious, doing a bad job explaining, or both.

The proposal in Comment 0 is,

> Together with changes to an eclass, this variable should be increased by 1
> whenever the outward-facing interface of the eclass changes, in particular
> the dependency strings it generates. It must never be decreased.

In contrast, I'm suggesting you create a new -r<N+1> eclass whenever you would have increased ECLASS_REVISION. So basically, you "revbump" the eclass whenever you want already-installed consumers to be rebuilt and pick up your eclass changes. (And then you use a tool to actually revbump the consumers, initiating those rebuilds.)

Under that plan, there should be no reason to re-source eclasses for already-installed packages. Any in-place metadata edits that occur in an eclass are "by definition" ones that should not affect installed packages.
Comment 25 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-24 07:43:56 UTC
I still don't see a difference.

Under ECLASS_REVISION, you get a complete cache regen of all consumers when ECLASS_REVISION changes.  Afterwards, the cache is up-to-date as everything remains unchanged.

Under your proposal, you get a complete cache regen of all consumers because they are revbumped.

There is no real difference.  Except that in case 1 we're talking about a big change in the metadata cache, and in case 2 we're talking about a huge change to everything.
Comment 26 Michael Orlitzky gentoo-dev 2022-02-24 13:03:09 UTC
(In reply to Michał Górny from comment #25)
> 
> There is no real difference.

With an up-to-date metadata cache generated using somebody else's CPU time, that's about right. But when no cached metadata exists during an update, using the filename to notify the PM of metadata changes has the potential to save a lot of time.
Comment 27 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2022-02-24 14:03:45 UTC
When PM has no local metadata cache, it needs to grab the metadata somewhere anyway.
Comment 28 Michael Orlitzky gentoo-dev 2022-02-24 23:24:56 UTC
(In reply to Michał Górny from comment #27)
> When PM has no local metadata cache, it needs to grab the metadata somewhere
> anyway.

Not always: a simple "is this installed?" query doesn't require metadata.

Even for the operations that do, the PM will have -- or at least can have, if it chooses to be efficient -- its own record of the metadata for the installed package. We already use ebuild revisions to signal important changes within the ebuild; if we also use eclass "revisions" to signal changes within eclasses, then we make it feasible for the PM to trust the metadata that it recorded when a package was installed.
Comment 29 eternalblue34 2022-08-08 18:06:32 UTC
(In reply to Michael Orlitzky from comment #28)
> (In reply to Michał Górny from comment #27)
> > When PM has no local metadata cache, it needs to grab the metadata somewhere
> > anyway.
> 
> Not always: a simple "is this installed?" query doesn't require metadata.

I'm a user, but I still agree.

> Even for the operations that do, the PM will have -- or at least can have,
> if it chooses to be efficient -- its own record of the metadata for the
> installed package. We already use ebuild revisions to signal important
> changes within the ebuild; if we also use eclass "revisions" to signal
> changes within eclasses, then we make it feasible for the PM to trust the
> metadata that it recorded when a package was installed.

That is going to be pretty neat