Bug 943143 - sys-apps/portage: --keep-going=y loses blockers after failures
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - External Interaction
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Duplicates: 943196
Depends on:
Blocks: 373807 943308
Reported: 2024-11-09 14:13 UTC by Michał Górny
Modified: 2024-12-21 06:55 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
console output (output.txt,29.69 KB, text/plain)
2024-11-09 14:16 UTC, Michał Górny
cbindgen.build.log (cbindgen.build.log,250.67 KB, text/x-log)
2024-11-10 13:44 UTC, ditto

Description Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2024-11-09 14:13:01 UTC
Long story short, today's upgrade included dev-lang/rust replacements that involved a dev-lang/rust:stable blocker.  Emerge initially included the "uninstall" in its depgraph, but after a failure, --keep-going=y dropped it and proceeded without the uninstall, effectively installing the new dev-lang/rust versions without unmerging the old one and causing quite a mess.

Will paste the (huge) output in the next comment.  Unfortunately, this is a one-off case, so I won't be able to reproduce it or test any fixes.
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2024-11-09 14:16:35 UTC
Created attachment 908305 [details]
console output
Comment 2 ditto 2024-11-10 13:44:50 UTC
Created attachment 908390 [details]
cbindgen.build.log

Hello. I stumbled upon this bug myself today. The build for cbindgen and a few other packages fails after a previous failed attempt at merging multiple rust versions. In this attempt, rust merged successfully, but because the previous install was kept, it now fails to resolve deps, cf.:
 = note: candidate #1: /usr/lib/rust/1.82.0/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-384f118d1e67506a.so
 = note: candidate #2: /usr/lib/rust/1.82.0/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-f96040c24237408e.so
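
For anyone wanting to check whether a system is affected, here is a rough sketch (not part of Portage or any Rust tooling) that groups library file names by crate and reports crates present with more than one hash. The helper name `find_conflicting_libs` and the file name pattern (a 16-digit hex hash before `.so` or `.rlib`) are assumptions inferred from the file names quoted in this bug.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the paths quoted in this bug:
#   lib<crate>-<16 hex digits>.so  or  lib<crate>-<16 hex digits>.rlib
_HASHED_LIB = re.compile(
    r"^(?P<crate>lib[\w+-]+?)-(?P<hash>[0-9a-f]{16})\.(?P<ext>so|rlib)$"
)


def find_conflicting_libs(filenames):
    """Return {(crate, ext): [hashes]} for crates that appear with 2+ hashes.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(set)
    for name in filenames:
        match = _HASHED_LIB.match(name)
        if match:
            groups[(match["crate"], match["ext"])].add(match["hash"])
    # Only crates with more than one distinct hash indicate a conflict.
    return {key: sorted(hashes) for key, hashes in groups.items() if len(hashes) > 1}
```

It could be fed `os.listdir()` of the rustlib directory named in the error above, e.g. /usr/lib/rust/1.82.0/lib/rustlib/x86_64-unknown-linux-gnu/lib; a non-empty result would suggest that two installs are sharing that path.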
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-11-10 20:33:01 UTC
*** Bug 943196 has been marked as a duplicate of this bug. ***
Comment 4 Larry the Git Cow gentoo-dev 2024-11-12 09:09:34 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=34b74faa06a90bf9d4f62ecfca746b380d60517a

commit 34b74faa06a90bf9d4f62ecfca746b380d60517a
Author:     Matt Jolly <kangie@gentoo.org>
AuthorDate: 2024-11-12 03:07:51 +0000
Commit:     Matt Jolly <kangie@gentoo.org>
CommitDate: 2024-11-12 09:07:42 +0000

    rust.eclass: revert simplified dependency simplification
    
    The simplified dependency specification for cases where no RUST_MAX_SLOT
    is set is the desired end state, however the edge case where portage
    drops blockers with `--keep-going` has an unfortunate interaction where
    both packages are installed simultaneously, e.g. dev-lang/rust-1.82.0:stable
    and dev-lang/rust-1.82.0:1.82.0, and there's no easy way for end users
    to resolve that as the legacy (though masked) ebuilds will meet the simple
    Rust dependency.
    
    Both packages install rlibs with different hashes in them to the same
    path (as shown below) resulting in failures when a package attempts
    to link against an rlib and finds two.
    
    1.82.0:      .../x86_64-unknown-linux-gnu/lib/libunwind-fc4fe814489209c6.rlib
    1.82.0-r100: .../x86_64-unknown-linux-gnu/lib/libunwind-ab65e6747cbe4a5a.rlib
    
    Bug: https://bugs.gentoo.org/943143
    Bug: https://bugs.gentoo.org/943206
    Signed-off-by: Matt Jolly <kangie@gentoo.org>

 eclass/rust.eclass | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-11-30 01:18:50 UTC
In _save_resume_list, we do:
```
        mtimedb["resume"]["mergelist"] = [
            list(x)
            for x in self._mergelist
            if isinstance(x, Package) and x.operation == "merge"
        ]
```

I think we need to check for x.operation being "uninstall"?
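
To illustrate what that filter does, here is a toy model rather than Portage code. `Task` is a hypothetical stand-in for `Package`, and the package names are placeholders. It only shows that any entry whose operation is "uninstall" never reaches the saved list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a Portage Package task."""

    cpv: str
    operation: str  # "merge" or "uninstall"


def save_resume_list(mergelist):
    """Mimic the filter quoted above: keep only merge operations."""
    return [t for t in mergelist if isinstance(t, Task) and t.operation == "merge"]


# Placeholder names, not real packages.
mergelist = [
    Task("cat/new-1", "merge"),
    Task("cat/old-1", "uninstall"),
    Task("cat/consumer-1", "merge"),
]

resume = save_resume_list(mergelist)
dropped = [t for t in mergelist if t not in resume]
```

In this model `dropped` holds exactly the uninstall task, so whatever consumes the resume list has to rebuild it from the blockers.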
Comment 6 Zac Medico gentoo-dev 2024-11-30 20:28:24 UTC
(In reply to Sam James from comment #5)
> In _save_resume_list, we do:
> ```
>         mtimedb["resume"]["mergelist"] = [
>             list(x)
>             for x in self._mergelist
>             if isinstance(x, Package) and x.operation == "merge"
>         ]
> ```
> 
> I think we need to check for x.operation being "uninstall"?

In theory, the uninstall operations are supposed to be implied and regenerated automatically by the depgraph _loadResumeCommand method when it calls self.altlist().
Comment 7 Zac Medico gentoo-dev 2024-11-30 21:41:07 UTC
(In reply to Michał Górny from comment #1)
> Created attachment 908305 [details]
> console output

The uninstall didn't run in this case because glycin-loaders failed after the dev-lang/rust install but before the uninstall task executed. We need to execute uninstall here despite the glycin-loaders failure, before --keep-going tries to resume.
Comment 8 Zac Medico gentoo-dev 2024-11-30 21:58:02 UTC
We need to keep track of pending uninstalls and make the Scheduler._keep_scheduling method return True until the pending uninstalls are completed, despite package build failures.
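
A toy sketch of that idea, assuming a much simpler scheduler than the real one. Everything here is hypothetical: the `run` function, the `unblocks` mapping (a merged package name mapped to the uninstall it makes pending) and the placeholder task names. After a failure the loop only drains uninstalls and defers further merges.

```python
from collections import deque


def run(tasks, unblocks, fails, track_uninstalls):
    """Toy scheduler loop, not Portage code.

    tasks: list of (name, operation) tuples in merge order.
    unblocks: dict mapping a merged name to the uninstall it makes pending.
    fails: set of names whose build fails.
    """
    pkg_queue = deque(tasks)
    failed, done, deferred = [], [], []
    pending_uninstalls = set()

    def keep_scheduling():
        # Without tracking, any failure stops scheduling immediately.
        pending = track_uninstalls and pending_uninstalls
        return bool(pkg_queue and not (failed and not pending))

    while keep_scheduling():
        name, op = pkg_queue.popleft()
        if op == "uninstall":
            pending_uninstalls.discard(name)
            done.append((name, op))
        elif failed:
            # After a failure, leave remaining merges for the resume run.
            deferred.append((name, op))
        elif name in fails:
            failed.append(name)
        else:
            done.append((name, op))
            if name in unblocks:
                pending_uninstalls.add(unblocks[name])
    return done, failed


tasks = [("new", "merge"), ("loader", "merge"), ("old", "uninstall")]
without_tracking = run(tasks, {"new": "old"}, {"loader"}, track_uninstalls=False)
with_tracking = run(tasks, {"new": "old"}, {"loader"}, track_uninstalls=True)
```

With tracking disabled, the run stops right after the failed build and "old" stays installed next to "new", which matches the symptom reported here. With tracking enabled, the uninstall still executes before the loop ends.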
Comment 9 Zac Medico gentoo-dev 2024-11-30 22:08:36 UTC
If we add a Scheduler._pending_uninstalls attribute then we can do something like this to keep scheduling:

--- a/lib/_emerge/Scheduler.py
+++ b/lib/_emerge/Scheduler.py
@@ -1789,9 +1789,12 @@ class Scheduler(PollScheduler):
     def _keep_scheduling(self):
         return bool(
             not self._terminated.is_set()
             and self._pkg_queue
-            and not (self._failed_pkgs and not self._build_opts.fetchonly)
+            and not (
+                self._failed_pkgs
+                and not (self._build_opts.fetchonly or self._pending_uninstalls)
+            )
         )
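
For comparison, the old and new conditions can be written as plain functions and checked side by side. This is only a sketch: the function names and the use of bare values in place of `self._terminated.is_set()`, `self._pkg_queue` and friends are simplifications for illustration.

```python
def keep_scheduling_old(terminated, pkg_queue, failed_pkgs, fetchonly):
    """Condition before the proposed change."""
    return bool(
        not terminated
        and pkg_queue
        and not (failed_pkgs and not fetchonly)
    )


def keep_scheduling_new(terminated, pkg_queue, failed_pkgs, fetchonly, pending_uninstalls):
    """Condition after the proposed change (pending_uninstalls is the new input)."""
    return bool(
        not terminated
        and pkg_queue
        and not (failed_pkgs and not (fetchonly or pending_uninstalls))
    )


# One build has failed, the queue is non-empty, and an uninstall is pending.
state = dict(terminated=False, pkg_queue=["task"], failed_pkgs=["pkg"], fetchonly=False)
old_result = keep_scheduling_old(**state)
new_result = keep_scheduling_new(**state, pending_uninstalls=["old"])
```

Here `old_result` is False and `new_result` is True. Note that the new condition still requires a non-empty queue, so in this form it presumably relies on the pending uninstall tasks still being queued.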