Bug 327809 - Let's make revdep-rebuild obsolete.
Summary: Let's make revdep-rebuild obsolete.
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Conceptual/Abstract Ideas
Hardware: All Linux
Importance: High enhancement with 18 votes
Assignee: Portage team
URL: http://www.skuggor.se/habari/gentoo-i...
Whiteboard:
Keywords:
Depends on: force-rebuild 234710
Blocks:
Reported: 2010-07-11 15:17 UTC by Spider
Modified: 2017-01-29 13:47 UTC
CC List: 28 users

See Also:
Package list:
Runtime testing required: ---


Description Spider 2010-07-11 15:17:51 UTC
While I've not been a Gentoo developer for a few years now, that doesn't mean I've completely stopped using Gentoo and thinking about some of its problem domains.

One of the more annoying issues I have is the somewhat frequent need for revdep-rebuild, especially on long-lived, slowly updated, maintenance-only systems. It would seem that to fix this, we need not just to scan for breakage after the fact and rebuild, but to preemptively work around it.

Library changes and dependency changes are a fact of life. There is no getting around it: no matter what we do or where we go, things change. Maintaining a static system isn't interesting for people using Gentoo anyway; there's Debian stable and RedHat/CentOS for that.

However, this doesn't mean that we can't do things to alleviate this situation, _without_ adding more requirements for developer QA.

A first step would of course be to implement and have a working --as-needed build environment. This is mostly a matter of developer buy-in rather than anything else. However, beyond this goes my suggestion:

Taking a hint from how rpm does it: after the src.rpm build (the install step), the list of files for an rpm is sent through "find-requires", which scans them for ELF binaries and .so files (using file/ldd/objdump) and extracts a list of symbol version requirements for glibc ("Version References" in objdump -p), .so-file dependencies and binary dependencies.

We can use a similar step just after src_install() in Gentoo, scanning $D for ELF files with scanelf/readelf to build a two-step list for our needs.

First, we build a pair list of the form "file dependency_file" for our package; this list gets shipped inside the installation path list. (Premature optimization: reduce it to only a list of dependencies. Personally, I think we could do more interesting things with a higher granularity of data.) Second, we also generate a backwards-resolved list of "depended-upon file => currently installed package owning it", to be stored for QA reasons.
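
A minimal sketch of those two lists, assuming the ELF scan has already produced a mapping from each installed file to the files it depends on. The function names, the sample paths and the package names are hypothetical and only serve as illustration; this is not existing portage code.

```python
def build_pair_list(needed_by_file):
    """Flatten {file: [dependency_file, ...]} into sorted (file, dependency_file) pairs."""
    return sorted(
        (path, dep) for path, deps in needed_by_file.items() for dep in deps
    )


def resolve_owners(pairs, owner_of):
    """Map every depended-upon file to the installed package that owns it.

    Files that no installed package owns map to None so they can be flagged later.
    """
    return {dep: owner_of.get(dep) for _, dep in pairs}


# Hypothetical scan result for one package image in $D.
needed_by_file = {
    "/usr/bin/vim": ["/lib64/libtinfo.so.5", "/lib64/libncurses.so.5"],
}
# Hypothetical "file -> owning package" index for the installed system.
owner_of = {
    "/lib64/libncurses.so.5": "sys-libs/ncurses",
    "/lib64/libtinfo.so.5": "sys-libs/ncurses",
}

pairs = build_pair_list(needed_by_file)
owners = resolve_owners(pairs, owner_of)
for path, dep in pairs:
    print(path, dep)
```

Keeping the full pairs rather than only the set of dependencies is what preserves the finer granularity mentioned above: the list can later be queried per file, not just per package.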

** Extra feature **
If we wish to provide an additional QA tool, we can then build an empty graph (system => current package) and map this graph to the dependencies we just extracted, and if any package listed as dependency is not in the graph, we fail QA tests due to hidden dependency.
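
One way this QA check could look, as a rough sketch. It assumes the "graph" can be approximated by a flat set made of the system set, the package's declared dependencies and the package itself; a real dependency graph would be transitive. All names and sample packages are hypothetical.

```python
def hidden_dependencies(package, system_set, declared_deps, scanned_owners):
    """Return owners found by the ELF scan that the package never declared.

    scanned_owners is the list of packages owning the files we link against;
    None entries (files owned by no package) are ignored here.
    """
    graph = set(system_set) | set(declared_deps) | {package}
    return sorted(
        owner
        for owner in set(scanned_owners)
        if owner is not None and owner not in graph
    )


# Hypothetical example: libfoo is linked against but never declared.
hidden = hidden_dependencies(
    package="app-editors/vim",
    system_set={"sys-libs/glibc"},
    declared_deps={"sys-libs/ncurses"},
    scanned_owners=["sys-libs/glibc", "sys-libs/ncurses", "dev-libs/libfoo", None],
)
if hidden:
    print("QA: hidden dependencies:", ", ".join(hidden))
```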
** Back to normal **

At this same time we also scan through our installed package, and extract the _provided_ ELF libraries ( and perhaps even function names ).

Now, what use does all this extra work give us? For both binary and source installation, we can check at a pre-install step whether our package (binary or source) is unsupportable by the current system: concatenate the list of installed libraries on our system with the list of libraries to be installed, and match this against the list of required libraries. If something is missing, we're in an inconsistent state and would break the to-be-installed package. Fall back to automatic resolution, or fail to operator.
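
The pre-install check can be sketched as plain set arithmetic. This assumes libraries are identified by soname and that the three lists are already available from the cached scan data; the function name and the sample sonames are hypothetical.

```python
def missing_libraries(installed_libs, incoming_libs, required_libs):
    """Return required libraries provided neither by the system nor by the package itself."""
    available = set(installed_libs) | set(incoming_libs)
    return sorted(set(required_libs) - available)


# Hypothetical data: the package ships libbar.so.2 and needs libfoo.so.1,
# which is not installed on this system.
missing = missing_libraries(
    installed_libs={"libc.so.6", "libncurses.so.5"},
    incoming_libs={"libbar.so.2"},
    required_libs={"libc.so.6", "libbar.so.2", "libfoo.so.1"},
)
if missing:
    print("refusing to merge, unresolved libraries:", ", ".join(missing))
```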

Likewise, in pkg_prerm() we scan the to-be-removed libraries (we have that cached since installation time), remove them from the global list of provided libraries, and compare that list with the concatenated list of total dependencies. If we find a mismatch, we would break things by removing this library, and can then fail neatly.
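
A sketch of the removal check under similar assumptions: the global list is kept as a count of providers per soname (so a library shipped by two packages survives the removal of one), and the requirements are those of the files that stay installed. The names and sample data are hypothetical.

```python
from collections import Counter


def broken_by_removal(provider_counts, libs_to_remove, requirements):
    """Return {consumer: [libraries it would lose]} if libs_to_remove were unmerged."""
    remaining = Counter(provider_counts)
    remaining.subtract(libs_to_remove)
    still_provided = {lib for lib, count in remaining.items() if count > 0}

    broken = {}
    for consumer, needed in requirements.items():
        lost = sorted(set(needed) - still_provided)
        if lost:
            broken[consumer] = lost
    return broken


# Hypothetical system state: one provider each for libfoo and libc.
provider_counts = {"libfoo.so.1": 1, "libc.so.6": 1}
requirements = {
    "/usr/bin/bar": {"libfoo.so.1", "libc.so.6"},
    "/usr/bin/baz": {"libc.so.6"},
}
broken = broken_by_removal(provider_counts, ["libfoo.so.1"], requirements)
for consumer, lost in broken.items():
    print(f"{consumer} would lose {', '.join(lost)}")
```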

For portage to protect against more subtle breakage, we can expand this to do the checks before we overwrite any libraries in pkg_preinst(): scan the current system's dependencies against the files that we are about to overwrite, then do a header comparison over this backwards-expanded list to make sure that we do not break any functions in existing programs or overwrite things with incompatible versions. This would then defend us against binary incompatibilities on systems.

So, the costs are incurred at build and installation time, and the result works like a global cache of our installed files. This leaves us with more tools to protect running systems, for both users and developers, against breakage, while also giving us the opportunity to improve QA tools and availability.

Some references:
rpmdeps.c
http://rpm.org/gitweb?p=rpm.git;a=blob;f=tools/rpmdeps.c;h=5f34c2bfd44156a59515a60ceba7954a458e07ab;hb=HEAD

rpm: find-requires php:
http://rpm.org/gitweb?p=rpm.git;a=blob;f=scripts/find-requires.php;h=4c064fcff8f1554a600889fc37d219c135a4c199;hb=HEAD
( same tree has mono, perl and python scripts)

And this is the rpm.org main elf-scanner for dependency generation/Resolution
http://rpm.org/gitweb?p=rpm.git;a=blob;f=autodeps/linux.req;h=cf60bd9ac5712c07853eb946885687e422a861de;hb=HEAD
Comment 1 Petteri Räty (RETIRED) gentoo-dev 2010-07-11 15:31:02 UTC
I didn't read through the whole post but it doesn't mention @preserved-rebuild from Portage 2.2 that already aims for this goal. Have you looked at that functionality?
Comment 2 Christian Ruppert (idl0r) archtester Gentoo Infrastructure gentoo-dev Security 2010-07-11 15:39:10 UTC
That wouldn't replace revdep-rebuild completely because revdep-rebuild also scans in /usr/local/.. etc. which is IMHO important.. at least for myself.

So in case portage shall completely replace revdep-rebuild we should consider such cases.

Although I'd like to get rid of revdep-rebuild if portage has an option to check the linking consistency (e.g. via cronjob) + an extra scan option to check /usr/local etc.
That could be taken from ld.so.conf and additionally from our env.d files.
Comment 3 Spider 2010-07-11 15:44:04 UTC
(In reply to comment #1)
> I didn't read through the whole post but it doesn't mention @preserved-rebuild
> from Portage 2.2 that already aims for this goal. Have you looked at that
> functionality?
> 

Yes, I have, and while that's a good start, it doesn't go all the way. 
Comment 4 Spider 2010-07-11 15:47:32 UTC
(In reply to comment #2)
> That wouldn't replace revdep-rebuild completely because revdep-rebuild also
> scans in /usr/local/.. etc. which is IMHO important.. at least for myself.

Of course not, revdep-rebuild will still be a possibility,  and would be a nice feature to have, however, we really shouldn't have to hit it as often as we do right now.   Making it obsolete is the goal, not removing it. 

> 
> So in case portage shall completely replace revdep-rebuild we should consider
> such cases.

Let's take one step at a time, automating smaller parts and solving sub-problems towards a tighter and more integrated system first. From this point, we could go on and start listing the functions each ELF application actually uses and compare them to the exported ones, along with doing a lot of other magic on top of it: things that revdep-rebuild cannot do, for example. But let's start with the basics.
 
> Although I'd like to get a rid of revdep-rebuild if portage has an option to
> check the linking consistence (e.g. via cronjob) + extra scan option to check
> /usr/local etc.
> That could be taken from ld.so.conf and additionally from our env.d files.

That'd be a complementary issue/util. You could also have such tools scan for installed files (usr/local, usr/lib, whatnot) that aren't in the global lists of libraries, check dependencies and integrity as "extras", and report daily if they break due to some upgrade. But all that is extra functionality on top of this.

Comment 5 solar (RETIRED) gentoo-dev 2010-07-11 16:38:03 UTC
ldd should probably be avoided for a few reasons on ELF systems. Mainly it does not work on $ROOT files when things in a SYSROOT are cross-compiled. Also calling ldd can trigger an ELF's constructor. 

As of today, however, we are already saving some of the info we need in NEEDED.ELF.2 or NEEDED files in the vdb. So for step 1 it should be easy to do a reverse lookup on providers of the NEEDED data and save those as HINTs in the xpak data. Running these kinds of checks in a pre_install() phase when EMERGE_FROM != binary might be a little overkill, as the linker probably would not have allowed us to build the pkg in the first place.

Function/symbol-level checking would be nice for a second phase or a more aggressive FEATURE.

I had never seen the RH's main elf-scanner before. Thanks for the link. I'm fond of how they additionally check python/perl/tcl requirements.

Spider thanks for posting this bug.
Comment 6 Spider 2010-07-11 16:50:53 UTC
(In reply to comment #5)
> ldd should probably be avoided for a few reasons on ELF systems.
<snip>
Good points and agreed. I'm mostly throwing this out from memory and some slight research I've been doing on how things work, used to work, or are done elsewhere. ldd can be replaced with objdump/readelf/scanelf as we see fit; it's not about how we do things as much as about why we do it. ;)

> 
> As of today however we are saving some of the info we already need in
> NEEDED.ELF.2 or NEEDED files in the vdb. So for step 1 it should be easy to do
> a reverse lookup on providers of the NEEDED data and save those as HINT's in
> the xpak data. 

Good to know. I didn't actually look through what current versions of portage/paludis do and don't save here; if we can reuse existing infrastructure, all the better. Also note that text files with file=>depend might not be optimal; an sqlite database or whatnot may be "better" but not as transparent. But for the concept, I'm sticking to files as it's easier to think around.

> Running these kinds of checks in a pre_install() phase when
> EMERGE_FROM != binary might be a little overkill as the linker probably would
> not have allowed us to build the pkg in the first place.

Actually, this would be wrong to assume. Consider for a moment a .so file that is built during compile, but erroneously not installed together with the package. Of course, this would be caught in QA by developers who test the application, but it'd also be a win to automate it.
(A setup like "package installs binary + library interface", where the library interface isn't complete but the binary is, may well exist in some cases too, and would not normally be found during testing.)


> I had never seen the RH's main elf-scanner before. Thanks for the link. I'm
> fond of how they additionally check python/perl/tcl requirements.

Aye, they do that, backwards resolving and intra-dependencies, which are then attached to the binary rpms, so a binary rpm first has the pre-defined "Requires" list from the .spec file, then additionally the scanned data. They don't do a backwards-resolve step afterwards (library/include => package), which might have been nice to have to ensure consistency. But it's still a bit more than we do, and far from the old days of "install package, binary is broken, try to figure out what libraries are missing".
 
> Spider thanks for posting this bug.

You're welcome.  

Comment 7 Diego Elio Pettenò (RETIRED) gentoo-dev 2010-07-12 01:01:23 UTC
I'll be blunt and I'll say that I don't see this happening anytime sooner than the end of the world. Why? Simply because we're Gentoo.

You say that preserved-rebuild doesn't go all the way; yet we still have trouble _supporting that_ at all.

You note that it needs --as-needed, and I think I'm the one who knows best how difficult it is to herd the Gentoo devs into actually making the effort needed for that to work. Maybe I'm being spiteful here, but I'd like to point out that Tester actually suggested I leave Gentoo for Fedora when I insisted that --as-needed is something for us to do.

I don't dislike the idea but

 a) it requires quite a bit of work that has to be done; we're barely finding people to keep Portage alive as far as I can see;
 b) it still requires developers to keep a clue and stop breaking stuff randomly just because they don't care enough.

Mind you, I wouldn't mind if we stopped doing idiotic things like committing a bunch of plugins to the tree just because we're the maintainers of the package using them without even compile-testing them. But that is still happening...

That said, good luck. If anything happens regarding this by next FOSDEM, I'll be offering a beer to all the Gentoo devs on CC on this bug (or, for those joining me in the non-drinking club, a coke, possibly diet; I'll get one of the latter myself).

/me counts the number of rewrites of revdep-rebuild since he joined, and the fact that it's _still_ in a split package from portage, that is not even in the system set (while a huge bunch of pointless stuff is), and sighs loudly.
Comment 8 Spider 2010-07-12 01:33:53 UTC
(In reply to comment #7)
> I'll be blunt and I'll say that I don't see this happening anytime sooner than
> the end of the world. Why? Simply because we're Gentoo.

Heh,  I think I recognise this attitude. ( Hi again, btw. Same old Spider here. Same old sarcastic git. ) 

 
> You note that it needs --as-needed, and I think I'm the one that knows best how
> difficult it is to herd the Gentoo devs into actually making the effort needed
> for that to work. Maybe I'm spiteful here but I'd like to point out Tester
> actually suggesting me to leave Gentoo for Fedora when I insisted that
> --as-needed is something for us to do.


In reality it doesn't _need_ it, however having it would cut down on headache by a minor ton.  Both features are good, this would be improved by --as-needed, but not necessarily strictly depend on it. 


>  a) it requires quite a bit of work that has to be done; we're barely finding
> people to keep Portage alive as far as I can see;

Good point, and always the case.

>  b) it still requires developers to keep a clue and stop breaking stuff
> randomly just because they don't care enough.

Yeah, though it may help by preventing them from shooting themselves in the foot. Maybe.

> That said, good luck. If anything happened regarding this by next FOSDEM, I'll
> be offering a beer to all the gentoo devs on CC on this bug (or for those
> joining me in the non-drinking club, a coke, diet eventually — I'll get one
> myself of the latter).

I don't really attend FOSDEM, but if I get this in I may make an exception only for the sake of said beer ;P

> 
> /me counts the number of rewrites of revdep-rebuild since he joined, and the
> fact that it's _still_ in a split package from portage, that is not even in the
> system set (while a huge bunch of pointless stuff is), and sighs loudly.

Thanks, I know all too well. Remember, I helped write the first piece of junk: find + awk + ldd + grep. Oh yes, a horrible mess.

But we're veering off topic.

Right now, I checked the DEPEND.ELF files, and while they have the information we want, it's not presented in quite as usable a form as we'd like (comma-separated lists are more space efficient, but harder to concatenate and sort on various terms). Still, that's not too much of a hassle.
Comment 9 solar (RETIRED) gentoo-dev 2010-07-12 01:59:21 UTC
(In reply to comment #8)

> Right now, I checked up the DEPEND.ELF files, and while it has the information
> we want, it's not present as quite as usable as we'd want it.  ( comma
> separated lists  are more space efficient, but harder to concatenate and sort
> on various terms ) still, that's not too much of a hassle.

A ',' and a ';' take the same number of characters in the entry. ';' is used rather than ',' because some file names contain a ',' (a few shared objects do), and ';' is not expected to occur in file names.

portage already offers a basic API for grabbing entries out of this file.

The format is ${arch};${obj};${soname};${rpath};${needed}
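
For illustration, a line in that format could be split up as in the sketch below. It uses plain string handling rather than the portage API mentioned above, and it assumes that ${needed} is a comma-separated list of sonames and ${rpath} a colon-separated list of directories; the sample line and its arch token are made up.

```python
from typing import List, NamedTuple


class NeededEntry(NamedTuple):
    arch: str
    obj: str
    soname: str
    rpath: List[str]
    needed: List[str]


def parse_needed_line(line):
    """Parse one '${arch};${obj};${soname};${rpath};${needed}' entry.

    Any fields after the fifth are ignored in this sketch.
    """
    fields = line.rstrip("\n").split(";")
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 ';'-separated fields: {line!r}")
    arch, obj, soname, rpath, needed = fields[:5]
    return NeededEntry(
        arch,
        obj,
        soname,
        [part for part in rpath.split(":") if part],   # assumed ':'-separated
        [name for name in needed.split(",") if name],  # assumed ','-separated
    )


# Made-up sample entry: an executable with no soname and no rpath.
entry = parse_needed_line("X86_64;/usr/bin/vim;;;libncurses.so.5,libtinfo.so.5\n")
print(entry.obj, "needs", entry.needed)
```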


As for portage. They are quite accepting of sane code. If it's less sane or CPU/MEM intensive then it can be tucked away behind a FEATURE=
Comment 10 michael@smith-li.com 2010-07-12 05:22:29 UTC
Any attempt to reduce breakage or shorten the amount of time it takes to update safely is good.

I'm responsible for the most recent full rewrite of revdep-rebuild (too many 'r's!), and also attempted a replacement that never really got off the ground (bug 184291 -- guess I need to hype my bug subjects a little more, considering the response that this bug garnered.) So let me know if I can help with anything.

Best of luck!
Comment 11 Spider 2010-07-12 06:30:34 UTC
(In reply to comment #9)
> (In reply to comment #8)
> 
> > Right now, I checked up the DEPEND.ELF files, and while it has the information
> > we want, it's not present as quite as usable as we'd want it.  ( comma
> > separated lists  are more space efficient, but harder to concatenate and sort
> > on various terms ) still, that's not too much of a hassle.
> 
> , vs ; takes the same number of chars in the entry.; is used vs , as some
> fnames will have a , (few shared objects) and ; is not a valid char in fnames.

I was more considering a format of one entry per line. So
/usr/bin/vim,/lib64/libncurses.so.5
/usr/bin/vim,/lib64/libtinfo.so.5 

And so on, mostly because it'd then be a cheap and simple matter of concatenating all lists and sorting by column 2, and it's done. All formats that aren't key:value require slightly more legwork, which may well be fine.
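
As a sketch of how cheap that is, assuming every package ships its list as text with one "consumer,dependency" entry per line (the nano entry is a made-up second package). Note that splitting on the first comma breaks for consumers whose path contains a ',', which is the objection raised above.

```python
from itertools import groupby


def merge_and_sort(*package_lists):
    """Concatenate per-package lists and sort the rows by column 2 (the dependency)."""
    rows = []
    for text in package_lists:
        for line in text.splitlines():
            line = line.strip()
            if line:
                consumer, dependency = line.split(",", 1)
                rows.append((consumer, dependency))
    return sorted(rows, key=lambda row: (row[1], row[0]))


def consumers_by_library(rows):
    """Group rows already sorted by dependency into {dependency: [consumers]}."""
    return {
        dependency: [consumer for consumer, _ in group]
        for dependency, group in groupby(rows, key=lambda row: row[1])
    }


vim_list = "/usr/bin/vim,/lib64/libncurses.so.5\n/usr/bin/vim,/lib64/libtinfo.so.5\n"
nano_list = "/usr/bin/nano,/lib64/libncurses.so.5\n"  # hypothetical second package

rows = merge_and_sort(vim_list, nano_list)
index = consumers_by_library(rows)
for library, consumers in index.items():
    print(library, "<-", ", ".join(consumers))
```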
 
> portage already offers a basic API for grabbing entries out of this file.
> 
> Format is as ${arch};${obj};${soname};${rpath};${needed}
I have to look at that more then, seems I'll have to get somewhat started on portage code again?
 
> 
> As for portage. They are quite accepting of sane code. If it's less sane or
> CPU/MEM intensive then it can be tucked away behind a FEATURE=

Too many things are hiding in FEATURE-land I think.  

Comment 12 Sebastian Luther (few) 2010-07-13 11:27:40 UTC
I'd like to discuss what the user's workflow currently looks like and what it looks like with your changes.

Let's assume you have libfoo-1 installed and you want to upgrade to libfoo-2, which has a backwards-incompatible ABI change.

a) Current situation with portage 2.1:

You run emerge -uDN world and get exactly one update, libfoo 1 -> 2.
You think fine, let emerge do its job. After half an hour of compiling it's done and all seems fine.
You try to start one of the 100 consumers of libfoo you have installed and see it moan about a missing .so file.
You run revdep-rebuild and if you're lucky you have the time to wait for the 100 rebuilds to finish. During this time the 100 consumers keep being broken. If that's not acceptable you have to downgrade libfoo again. I hope you have a binpkg...


b) Current situation with portage 2.2:

You run emerge -uDN world and get exactly one update, libfoo 1 -> 2.
You think fine, let emerge do its job. After half an hour of compiling it's done and tells you that it has some libraries preserved for you. All consumers of libfoo are still working.
You can now decide to either leave it and come back later or you do the 100 rebuilds right now.
If you're lucky you don't hit one of the issues preserve-libs has (bug 240323) and you're done.


c) Situation with your proposed changes as I understand them:

You run emerge -uDN world and get exactly one update, libfoo 1 -> 2.
You think fine, let emerge do its job. After half an hour of compiling it "stops" right before merging libfoo-2 to $ROOT.
It's not clear to me what you want to happen now. You said in comment 0: "Fall back to automatic resolution, or fail to operator"

The first one sounds like: Merge anyway and rebuild the 100 consumers right away.
Not sure about the second one. Does this mean you want to abort completely and have the user recompile libfoo-2 again (if the user doesn't build binpkgs)?

Please let me know if that's correct and what other ideas you have in mind for this case.


d) I'll now post what I would like the workflow to be.

You run emerge -uDN world and you're told that you have to do 1 update and 100 rebuilds. You either decide to have the time or you leave it for later. Optionally one could do library preservation in between, if someone manages to fix preserve-libs.

Have a look at bug 192319, especially at the end, for some ideas.

Summary)
* I think everyone agrees that a) is a really poor way to deal with the situation.

* If someone managed to fix b) we would win a lot already, even if it only handles link dependencies (not script interpreters for example).

* c) adds to a) the possibility to abort before damage is done and makes emerge do revdep-rebuild's job. I'm not sure if it's that much better that it warrants a lot of work.

* d) Imo that's the way to go, but this still needs someone to think it through and push it.

I hope you'll keep working on this topic, so we can finally kill revdep-rebuild.  
Comment 13 Spider 2010-07-13 18:17:32 UTC
(In reply to comment #12)
> I'd like to discuss what the user's workflow currently looks like and what it
> looks like with your changes.
Thank you, that's a very good and very welcome initiative.

> 
> Lets assume you have libfoo-1 installed and you want to upgrade to libfoo-2,
> which has an backwards incompatible ABI change.


> b) Current situation with portage 2.2:
> 
> All consumers of libfoo are still working.
>
> You can now decide to either leave it and come back later or you do the 100
> rebuilds right now.

This is not always true, because at some point something deeper in the stack _may_ be rebuilt against the new version, which can cause the whole chain to collapse (the double-symbol issue; see the libpng issues, where parts of an application stack link against 1.0 and others against 1.1).

Admitted corner case, but a real one to be considered.
> 
> c) Situation with your proposed changes as I understand them:
> 
> The first one sounds like: Merge anyway and rebuild the 100 consumers right
> away.
> Not sure about the second one. Does this mean you want to abort completely and
> have the user recompile libfoo-2 again (if the user doesn't build binpkgs)?

Here I would _prefer_ that we simply save state for libfoo-2 ( store a binpkg if need be, or simply don't autoclean it in the next time we get there ).

My base concept is to always "fault" back to the operator (what if the installation is from a binary package?). At this point we can explain that "It has been found that libfoo-2 will break <list> of applications, and we will not merge it until <list> can be rebuilt at the same time"
(a preemptive revdep-rebuild).

The other alternative is as you said "Just fix it for me".   Another approach would be a hybridization, to store a tag and abort, and the next time the user attempts to install libfoo-2 he will find a queue of dependencies to be rebuilt.  
 
> Please let me know if that's correct and what other ideas you have in mind for
> this case.

That would be correct. Frankly, my main issue is that we _ever_ leave a user's system in an inconsistent state: everything from removing libacl (doable; portage didn't even use to complain about it if you had the USE flag unset), to installing incompatible versions of libpng, to xul libraries with the same soname and different function descriptors causing all derived applications to break.

> d) I'll now post what I would like the workflow to be.
> 
> You run emerge -uDN world and you're told that you have to do 1 update and 100
> rebuilds. You either decide to have the time or you leave it for later.
> Optionally one could do library preservation in between, if someone manages to
> fix preserve-libs.
> 
> Have a look at bug 192319, especially at the end, for some ideas.
> 
> Summary)
> * I think every one agrees that a) is a really poor way to deal with the
> situation.
> 
> * If someone would manage to fix b) we would win a lot already, even if it only handles link dependencies (not script interpreters for example).
> 
> * c) adds to a) the possibility to abort before damage is done and makes emerge
> do revdep-rebuild's job. I'm not sure if it's that much better that it warrants a lot of work.
> 
> * d) Imo that's the way to go, but this still needs someone to think it through
> and push it.
> 
> I hope you'll keep working on this topic, so we can finally kill
> revdep-rebuild.  
> 

Well, currently we store a list of all ELF objects _that have dependencies_ in the build-info. We could simply expand this list to contain _all_ ELF objects, whether they depend on something or not.

This would then mean that we can do a preemptive scan of dependencies/depended files during either dependency resolution, or pre-merge. We could then compare the .so-lists as we merge with things in place to note non-incremental things that would affect the tree, and either assume that our developers will add such metadata to the portage tree (information that would _never_ be complete) or that the user would have to fix this at emerge time for his local system.

Even if we just start designing around ELF and ELF-dependencies, we can obviously also change this  (with different scanners) to work on python/perl/php ( probably ruby too) without too many hassles. bourne would be a bit rougher to do though.

Given that unless we work exclusively on binary content, we _cannot_ warn a user about possible breakage in the dependency-tracking phase, this leaves only pre-merge phase as the possibility.  (for larger queued merges, we may well simply "stall" it and move on to the next,  queueing the user interaction up until later).


Also, I find the idea of consciously "leaving behind" parts of packages completely abhorrent, as it invites partial dependency chains to break in even more interesting ways later on.

Usability-wise, I'd say the better way would be to postpone the merge and, when the user opts to resume, automatically fix the problems for them. This would also make certain that developers didn't get a chance to push abnormally bad packages into the tree without proper testing.
Comment 14 Zac Medico gentoo-dev 2010-07-14 04:16:49 UTC
(In reply to comment #13)
> Given that unless we work exclusively on binary content, we _cannot_ warn a
> user about possible breakage in the dependency-tracking phase, this leaves only
> pre-merge phase as the possibility.  (for larger queued merges, we may well
> simply "stall" it and move on to the next,  queueing the user interaction up
> until later).

In order to implement bug 192319, all that will be needed is a way to detect ABI changes during upgrades, just before pkg_preinst. That will give us all the data that we need to fold back into ABI_SLOTS and abi-slot-deps. As mentioned in bug 192319, we can also add a phase before pkg_preinst that ebuilds may use to perform specialized ABI change detection.
Comment 15 michael@smith-li.com 2010-07-17 02:23:19 UTC
(In reply to comment #14)
> In order to implement bug 192319, all that will be needed is a way to detect
> ABI changes during upgrades, just before pkg_preinst.

You make it sound so simple. What happens if there's an ABI change that doesn't actually break backwards compatibility? Wouldn't that cause false positives?
Comment 16 Zac Medico gentoo-dev 2010-07-17 05:24:55 UTC
(In reply to comment #15)
> You make it sound so simple. What happens if there's an ABI change that doesn't
> actually break backwards compatibility? Wouldn't that cause false positives?

So, we'll need to be able to analyze the changes and separate the backward compatible ones from the incompatible ones. For backward compatible changes, there's no need to bump ABI_SLOTS in the providing package. The worst case is that the person who does the version bump will have to do some backward compatibility tests before they decide to bump ABI_SLOTS. Those compatibility tests might be integrated into the ABI change detection phase, in order to eliminate false positives automatically, making things easier for the person who does the version bump.
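
A very rough sketch of such a separation, assuming the only inputs are the soname and the set of exported symbol names of the old and new library. Changed function signatures or struct layouts behind an unchanged symbol name are not detected by this, so it can only rule out some false positives. The function name and the sample symbols are hypothetical.

```python
def classify_abi_change(old_soname, new_soname, old_symbols, new_symbols):
    """Classify an upgrade as 'incompatible', 'compatible-addition' or 'unchanged'.

    Returns (verdict, sorted list of symbols that disappeared).
    """
    old_symbols, new_symbols = set(old_symbols), set(new_symbols)
    removed = sorted(old_symbols - new_symbols)
    if old_soname != new_soname or removed:
        # Consumers linked against the old soname, or using a removed
        # symbol, would need a rebuild.
        return "incompatible", removed
    if new_symbols - old_symbols:
        # Only new symbols appeared: existing consumers keep working.
        return "compatible-addition", []
    return "unchanged", []


# Hypothetical libfoo upgrade that only adds a function.
verdict, removed = classify_abi_change(
    "libfoo.so.1",
    "libfoo.so.1",
    {"foo_init", "foo_run"},
    {"foo_init", "foo_run", "foo_run_async"},
)
print(verdict, removed)
```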
Comment 17 Kai Krakow 2010-07-20 08:37:02 UTC
Maybe a first step in this direction would be to have something similar to RDEPEND, like NEEDS_REBUILD. Some packages emit post-installation notes about rebuilding specific packages (which do not break on a binary level, but may not work anyway), e.g. if you use some features (USE flags). It would be nice to catch those too, and emerging the package should auto-select the NEEDS_REBUILD packages for rebuilding if installed. Maybe a command line switch should be added to enable this feature, and later, if it proves to work, it could be made the default.

This may be something the original idea of this report could build upon, and it should not be too difficult to implement. I've had this idea for a year or two.
Comment 18 Zac Medico gentoo-dev 2010-07-20 13:02:39 UTC
(In reply to comment #17)
> Maybe, a first step into this direction would be to have something similar to
> RDEPEND, like NEEDS_REBUILD.

This NEEDS_REBUILD is essentially the same as what was initially suggested to solve bug 192319; however, it has the disadvantage that an ABI-providing package has to be aware of all reverse dependencies. The ABI_SLOTS/abi-slot-deps approach solves this problem by requiring the reverse dependencies to specify abi-slot-deps, allowing the package manager to detect when an ABI_SLOTS bump requires them to be rebuilt.
Comment 19 Kai Krakow 2010-07-20 13:15:14 UTC
(In reply to comment #18)
> This NEEDS_REBUILD is essentially the same as what was initially suggested to
> solve bug 192319, [...]

Thanks for the link - very interesting idea. But I suppose care should be taken to not make portage too complex - or it will lose even more developers. Will this idea be infrastructure to build upon? Or does portage provide infrastructure to build upon?

Comment 20 Zac Medico gentoo-dev 2010-07-20 13:22:51 UTC
(In reply to comment #19)
> Thanks for the link - very interesting idea. But I suppose care should be taken
> to not make portage too complex - or it will lose even more developers. Will
> this idea be infrastructure to build upon? Or does portage provide
> infrastructure to build upon?

When all requirements are considered, the ABI_SLOTS/abi-slot-deps approach is really the simplest and most natural solution to the problem. This model is a variation of "slot operator dependencies" which were proposed for inclusion in EAPI 4 (bug 273625). The NEEDS_REBUILD approach simply isn't feasible because it places the knowledge of reverse dependencies in the wrong location.
Comment 21 Kai Krakow 2010-07-23 20:40:53 UTC
(In reply to comment #20)
> When all requirements are considered, the ABI_SLOTS/abi-slot-deps approach is
> really the simplest and most natural solution to the problem. This model is a
> variation of "slot operator dependencies" which were proposed for inclusion in
> EAPI 4 (bug 273625). The NEEDS_REBUILD approach simply isn't feasible because
> it places the knowledge of reverse dependencies in the wrong location.

What my original intention was: Some ebuilds complain when they were built against another version of a lib than they are linked to at runtime. Such examples include proftpd, which loudly complains about openssl on the console during service startup (but runs), and rmagick, which just complains about imagemagick and refuses to work, although the linked lib is probably ABI compatible. I suggest having a trigger to rebuild such ebuilds.

But as you wrote: NEEDS_REBUILD would be in the wrong direction. So how about moving it to the right place, in this example into proftpd and rmagick ebuilds - and let's call it REBUILD_TRIGGERED_BY or something.

In summary, rmagick would include a REBUILD_TRIGGERED_BY="media-gfx/imagemagick". Question is: Should it trigger a rebuild only on version bumps, or always when imagemagick is rebuilt (eg newuse). Maybe that would need some flags to be incorporated (eg on-version-bumped, on-use-change, ...). A first implementation should just always trigger a rebuild and either add these triggered ebuilds to a set or just pull them into the current build list during dep calculation.
Comment 22 Jacob Godserv 2010-07-23 21:05:57 UTC
(In reply to comment #21)
> In summary, rmagick would include a
> REBUILD_TRIGGERED_BY="media-gfx/imagemagick". Question is: Should it trigger a
> rebuild only on version bumps, or always when imagemagick is rebuilt (eg
> newuse). Maybe that would need some flags to be incorporated (eg
> on-version-bumped, on-use-change, ...). A first implementation should just
> always trigger a rebuild and either add these triggered ebuilds to a set or
> just pull them into the current build list during dep calculation.

You could use the same dependency syntax already defined by portage to determine for which builds this package should get rebuilt.
Comment 23 Zac Medico gentoo-dev 2010-07-23 21:16:54 UTC
(In reply to comment #21)
> But as you wrote: NEEDS_REBUILD would be in the wrong direction. So how about
> moving it to the right place, in this example into proftpd and rmagick ebuilds
> - and let's call it REBUILD_TRIGGERED_BY or something.

Your REBUILD_TRIGGERED_BY idea is essentially equivalent to the ABI_SLOTS/abi-slot-deps approach. Instead of reinventing it, you should probably go read my comments on bug 192319.
Comment 24 michael@smith-li.com 2010-07-24 03:28:35 UTC
(In reply to comment #23)
> Your REBUILD_TRIGGERED_BY idea is essentially equivalent to the
> ABI_SLOTS/abi-slot-deps approach. Instead of reinventing it, you should
> probably go read my comments on bug 192319.

Well, and more to the point, REBUILD_TRIGGERED_BY smacks of hard-coding reverse dependencies into ebuilds, which requires some prescience and is definitely to be avoided. Maybe I didn't understand Zac's comments in bug 192319 before, but I had thought ABI_SLOTS were just a workaround for the (hopefully rare) cases where a soname's version numbering doesn't provide enough/correct compatibility info.
Comment 25 Zac Medico gentoo-dev 2010-07-24 04:31:45 UTC
(In reply to comment #24)
> I had thought ABI_SLOTS were just a workaround for the (hopefully rare) cases
> where a soname's version numbering doesn't provide enough/correct compatibility
> info.

The intention is for it to work with _any_ kind of backward-incompatible ABI change, regardless of the underlying ABI details. Maybe an soname doesn't make sense for some ABIs. It's not really necessary for us to limit our concept of an ABI here. For example, it doesn't have to have anything to do with ELF. Basically, it could be _any_ kind of backward-incompatible change that forces rebuild of reverse dependencies. It's a completely generic solution. This is why we should allow ebuilds/eclasses to define their own phase which runs before pkg_preinst, in order to implement specialized backward-incompatible ABI change detection. This specialized phase could analyze a bytecode interface, or any other kind of binary interface that you can imagine. Of course, the package manager will most likely include built-in backward-incompatible ABI change detection for ELF format, since this is an extremely common case and we don't necessarily want to offload this to an eclass.
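As a rough illustration of the built-in ELF case mentioned above, a detection step could compare the sonames provided by the installed version with those provided by the new image. This is only a sketch under assumptions: the function names and the example sonames are hypothetical, and a real implementation would extract the sonames from the ELF files rather than take them as lists.

```python
def removed_sonames(old_sonames, new_sonames):
    """Return the sonames that the old version provided and the new one lacks."""
    return sorted(set(old_sonames) - set(new_sonames))


def abi_changed(old_sonames, new_sonames):
    # Assumption: a soname that disappears counts as a
    # backward-incompatible change for reverse dependencies.
    return bool(removed_sonames(old_sonames, new_sonames))


# Hypothetical sonames, for illustration only.
old = ["libMagickCore.so.3", "libMagickWand.so.3"]
new = ["libMagickCore.so.4", "libMagickWand.so.4"]
print(removed_sonames(old, new))
print(abi_changed(old, new))
```

A specialized phase for a non-ELF interface would replace the soname lists with whatever identifies that interface; the comparison step itself stays the same.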
Comment 26 Kai Krakow 2010-07-26 20:46:45 UTC
(In reply to comment #23)
> Your REBUILD_TRIGGERED_BY idea is essentially equivalent to the
> ABI_SLOTS/abi-slot-deps approach. Instead of reinventing it, you should
> probably go read my comments on bug 192319.

Well, neither revdep-rebuild nor @preserved-rebuild detects that RMagick needs to be rebuilt after bumping the imagemagick version (only sometimes, when the ABI really changes). And personally I think - without having looked into the inner workings - that RMagick would not even really need to be rebuilt - it's just picky about being built and run against the exact same version - because the ABI didn't change. I conclude this from cases when other software linked against imagemagick still runs and works while RMagick refuses to do so. Would an ABI_SLOT really detect this?

I think this is similar to what I've seen with ProFTPd and OpenSSL. If I understand ABI_SLOT correctly, one would need to bump ABI_SLOT to trigger rebuilding. But that would mean it triggers a lot of packages which would not really need rebuilding.

I just wish there were a mechanism which picks up such corner cases by looking into hints within the ebuild - and which reuses already existing methods like sets or the dependency calculator.
Comment 27 Zac Medico gentoo-dev 2010-07-26 22:30:17 UTC
(In reply to comment #26)
> I conclude this from cases when other software linked against
> imagemagick still runs and works while RMagick refuses to do so. Would an
> ABI_SLOT really detect this?

ABI_SLOTS is a form of expression that is decoupled from the detection mechanism, so you're really asking a question about the detection mechanism. In this case, it appears that RMagick relies on an "interface" (or abi slot) that most other reverse dependencies of imagemagick do not rely on. We could express this in ABI_SLOTS of the imagemagick ebuild with something like ABI_SLOTS="rmagick-imagemagick:6.6", and as an abi-slot-dep in the RMagick ebuild with something like RDEPEND="media-gfx/imagemagick{rmagick-imagemagick}". This would indicate that whenever the rmagick-imagemagick abi slot is bumped in the imagemagick ebuild (by a developer), the RMagick ebuild would need to be rebuilt. The package manager can detect this because it has recorded RDEPEND="media-gfx/imagemagick{rmagick-imagemagick:6.6}" in the /var/db/pkg entry for RMagick that was generated when RMagick was built. When imagemagick is upgraded to a version that provides a different abi slot, such as rmagick-imagemagick:6.7, the rmagick-imagemagick:6.6 dependency in the RMagick /var/db/pkg entry becomes unsatisfied, which forces the package manager's dependency resolver to trigger a rebuild. Now, I have named the rmagick-imagemagick abi slot "rmagick-imagemagick" just because I assume that this particular abi slot is only relevant to RMagick. If it turns out that other packages are using the same interface that "rmagick-imagemagick" represents (requiring rebuild at the same time), then naturally we'll want to choose a more generic name for this abi slot.

Note that the ABI_SLOTS bump in the imagemagick ebuild is not a fully automated process. The abi slot bump needs to be done by an ebuild developer. This is what I mean when I say that the ABI_SLOTS expression is decoupled from the detection mechanism. The ebuild developer who does the abi slot bump would determine that it is necessary by either manual testing (test if RMagick still works after imagemagick is upgraded) or by a specialized preinst phase that is designed to detect the symptom(s) of a backward-incompatible change in this particular abi slot. This specialized phase could conceivably include separate tests for any number of different abi slots that are provided by the same package.
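The record-then-compare logic described above can be sketched as follows. This is a minimal illustration and not portage code: the `{name}` / `{name:value}` syntax is the proposal from this comment, and the helper names (`parse_abi_slots`, `record_dep`, `is_satisfied`) are hypothetical.

```python
import re

# Proposed (not implemented) syntax: cat/pkg{slot-name} or cat/pkg{slot-name:value}
ABI_DEP_RE = re.compile(
    r"^(?P<pkg>[^{}]+)\{(?P<name>[^:{}]+)(?::(?P<value>[^{}]+))?\}$"
)


def parse_abi_slots(abi_slots):
    """Turn 'name:value name2:value2' into a dict of slot name -> value."""
    return dict(item.split(":", 1) for item in abi_slots.split())


def record_dep(dep, provider_abi_slots):
    """At build time, pin the dep to the abi slot value the provider has now."""
    m = ABI_DEP_RE.match(dep)
    value = parse_abi_slots(provider_abi_slots)[m.group("name")]
    return "%s{%s:%s}" % (m.group("pkg"), m.group("name"), value)


def is_satisfied(recorded_dep, provider_abi_slots):
    """A recorded dep is satisfied only while the provider keeps the same value."""
    m = ABI_DEP_RE.match(recorded_dep)
    current = parse_abi_slots(provider_abi_slots).get(m.group("name"))
    return current == m.group("value")


recorded = record_dep(
    "media-gfx/imagemagick{rmagick-imagemagick}", "rmagick-imagemagick:6.6"
)
print(recorded)
print(is_satisfied(recorded, "rmagick-imagemagick:6.6"))
print(is_satisfied(recorded, "rmagick-imagemagick:6.7"))
```

The unsatisfied result for 6.7 is the point at which the dependency resolver would schedule a rebuild of RMagick.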

> I think this is similar to what I've seen with ProFTPd and OpenSSL. If I
> understand ABI_SLOT correctly, one would need to bump ABI_SLOT to trigger
> rebuilding. But that would mean it triggers a lot of packages which would not
> really need rebuilding.

No, as explained above, you'd have a different abi slot identifier for each interface. If you bumped the "rmagick-imagemagick" abi slot, it would only trigger rebuild of reverse dependencies that specify that particular abi slot with something like RDEPEND="media-gfx/imagemagick{rmagick-imagemagick}"

> I just wished there was a mechanism which picks up such corner cases by looking
> into hints within the ebuild - and to reutilize already existing methods like
> sets or the dependency calculator.

The ABI_SLOTS/abi-slot-deps model accounts for all of these cases.
Comment 28 Kai Krakow 2010-07-26 22:51:41 UTC
(In reply to comment #27)
> ABI_SLOTS is a form of expression that is decoupled from the detection mechanism,
[...]
> The ABI_SLOTS/abi-slot-deps model accounts for all of these cases.

Very well explained. Now I really like this approach. Thank you.

Comment 29 michael@smith-li.com 2010-07-27 02:05:58 UTC
(In reply to comment #27)
> We could express this in
> ABI_SLOTS of the imagemagick ebuild with something like
> ABI_SLOTS="rmagick-imagemagick:6.6"...

Maybe I'm beating a dead horse, but wouldn't it be more natural to assume an ABI_SLOT consists of a consecutive range of versions? If media-gfx/imagemagick-6.4, 6.5 and 6.6 fulfill ABI_SLOTS="rmagick-imagemagick:6.4" then it will be very tempting for a neophyte down the road to express this in imagemagick ebuilds as "rmagick-${PN}:${PV}", which would cause unnecessary pain for all. But doesn't it make more sense to express these things as a range, or at least have an optional syntax for it?

> ...and as an abi-slot-dep in the RMagick
> ebuild with something like
> RDEPEND="media-gfx/imagemagick{rmagick-imagemagick}"...

I gather RDEPEND="imagemagick? >=media-gfx/imagemagick-6.5{rmagick-imagemagick}" would be valid syntax as well, meaning "if the imagemagick useflag is set, runtime-depend upon imagemagick 6.5 or greater and mark the rmagick-imagemagick ABI_SLOT from that in the vdb for this package."

Is that right?
Comment 30 Zac Medico gentoo-dev 2010-07-27 06:08:24 UTC
(In reply to comment #29)
> (In reply to comment #27)
> > We could express this in
> > ABI_SLOTS of the imagemagick ebuild with something like
> > ABI_SLOTS="rmagick-imagemagick:6.6"...
> 
> Maybe I'm beating a dead horse, but wouldn't it be more natural to assume an
> ABI_SLOT consists of a consecutive range of versions? If
> media-gfx/imagemagick-6.4, 6.5 and 6.6 fulfill
> ABI_SLOTS="rmagick-imagemagick:6.4" then it will be very tempting for a
> neophyte down the road to express this in imagemagick ebuilds as
> "rmagick-${PN}:${PV}", which would cause unnecessary pain for all. But doesn't
> it make more sense to express these things as a range, or at least have an
> optional syntax for it?

Within the context of ABI_SLOTS, a range doesn't make sense to me. Can you give a usage example for clarification?

> ...and as an abi-slot-dep in the RMagick
> > ebuild with something like
> > RDEPEND="media-gfx/imagemagick{rmagick-imagemagick}"...
> 
> I gather RDEPEND="imagemagick?
> >=media-gfx/imagemagick-6.5{rmagick-imagemagick}" would be valid syntax as
> well, meaning "if the imagemagick useflag is set, runtime-depend upon
> imagemagick 6.5 or greater and mark the rmagick-imagemagick ABI_SLOT from that
> in the vdb for this package."
> Is that right?

Right.
Comment 31 michael@smith-li.com 2010-07-28 04:49:40 UTC
(In reply to comment #30)
> Within the context of ABI_SLOTS, a range doesn't make sense to me. Can you give a
> usage example for clarification?

I guess not. I talked myself out of the range thing thinking about an example.
Comment 32 Zac Medico gentoo-dev 2011-05-11 17:45:10 UTC
It's worth mentioning that portage-2.2.0_alpha31 has the following new options that can be used as a substitute for revdep-rebuild in some cases. Since ebuilds don't provide any ABI metadata at this time, these options trigger rebuilds of reverse dependencies when dependencies are rebuilt (due to --newuse or whatnot) or upgraded/downgraded (version changes):

    --rebuild-if-unbuilt [ y | n ]
      Rebuild packages when dependencies that are used at both build-time and
      run-time are built.

    --rebuild-if-new-rev [ y | n ]
      Rebuild packages when dependencies that are used at both build-time and
      run-time are built, if the dependency is not already installed with the
      same version and revision.

    --rebuild-if-new-ver [ y | n ]
      Rebuild packages when dependencies that are used at both build-time and
      run-time are built, if the dependency is not already installed with the
      same version. Revision numbers are ignored.

These options will be included in portage-2.1.10, which will be released in approximately one month.
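The difference between the three options can be sketched as a small decision function. This is a simplified illustration and not portage's implementation: the mode strings, the function names and the revision parsing are assumptions, and it only models a dependency that is used at both build-time and run-time and is about to be built.

```python
def split_revision(version):
    """Split '1.2.3-r4' into ('1.2.3', 'r4'); a missing revision counts as 'r0'."""
    base, sep, rev = version.rpartition("-r")
    if sep and rev.isdigit():
        return base, "r" + rev
    return version, "r0"


def should_rebuild_parent(mode, installed, candidate):
    """Decide whether a reverse dependency is rebuilt when its dependency is built.

    installed: version of the dependency currently installed, or None.
    candidate: version of the dependency that is about to be built.
    """
    if mode == "if-unbuilt":
        # Any build of the dependency triggers a rebuild.
        return True
    if installed is None:
        return True
    inst_ver, inst_rev = split_revision(installed)
    cand_ver, cand_rev = split_revision(candidate)
    if mode == "if-new-rev":
        # Version and revision must both match to skip the rebuild.
        return (inst_ver, inst_rev) != (cand_ver, cand_rev)
    if mode == "if-new-ver":
        # Revision numbers are ignored.
        return inst_ver != cand_ver
    raise ValueError("unknown mode: %r" % mode)


print(should_rebuild_parent("if-unbuilt", "6.6.2.5", "6.6.2.5"))
print(should_rebuild_parent("if-new-rev", "6.6.2.5", "6.6.2.5-r1"))
print(should_rebuild_parent("if-new-ver", "6.6.2.5", "6.6.2.5-r1"))
```

A same-version rebuild (for example due to --newuse) triggers a rebuild only in the first mode, and a revision bump triggers one in the first two.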
Comment 33 James Broadhead 2011-05-11 17:58:30 UTC
^ Which of those will be defaults? Will the rebuilds be trigger-able by ebuild writers?
Comment 34 Zac Medico gentoo-dev 2011-05-11 18:03:14 UTC
(In reply to comment #33)
> ^ Which of those will be defaults? Will the rebuilds be trigger-able by ebuild
> writers?

They won't be enabled by default, but you can add them to EMERGE_DEFAULT_OPTS in make.conf. They won't be trigger-able by ebuild writers, since anything like that would require that ebuilds provide ABI metadata as discussed in bug 192319.
Comment 35 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2014-02-17 06:18:41 UTC
is this not fixed by the @preserved-rebuild stuff?
Comment 36 Brian Dolbec gentoo-dev 2014-02-17 07:30:45 UTC
No, @preserved-rebuild does not solve all issues with emerge related breakage unless all ebuilds are EAPI 5 and have PROPERLY filled out ABI subslots.

But since the toolchain is still EAPI 0 (I know it's changing), among others... the --rebuild-* options are a way to preemptively fix breakage that the user knows will occur for certain pkgs. I've changed catalyst's update-seed operation due to dependencies of gcc being updated without gcc being rebuilt. It caused delayed breakage. While the initial breakage occurred in the stage1 generation, the problem did not show up until stage2 mid-run.