Bug 728986 - Gentoo Prefix bootstrap fails on Stage 3 due to circular dependencies (virtual/libcrypt-1-r1:0 & dev-lang/python-3.7.7-r2:3.7)
Status: RESOLVED FIXED
Alias: None
Product: Gentoo/Alt
Classification: Unclassified
Component: Prefix Support
Hardware: All Linux
Importance: Normal normal with 1 vote
Assignee: Gentoo Prefix
Reported: 2020-06-21 10:03 UTC by Sammy Pfeiffer
Modified: 2020-06-30 07:42 UTC

Attachments
Overlay to break the circular dependency during the install (libcryptbootstrap.tar, 20.00 KB, application/x-tar)
2020-06-22 19:53 UTC, devourer

Description Sammy Pfeiffer 2020-06-21 10:03:02 UTC
I can't confirm whether this also happens on x86 (testing there is prevented by https://bugs.gentoo.org/722784), but it does happen on amd64.

Error:

 * Error: circular dependencies:

(virtual/libcrypt-1-r1:0/1::gentoo, ebuild scheduled for merge) depends on
 (sys-libs/glibc-2.31-r5:2.2/2.2::gentoo, ebuild scheduled for merge) (runtime)
  (dev-lang/python-3.7.7-r2:3.7/3.7m::gentoo, ebuild scheduled for merge) (buildtime)
   (virtual/libcrypt-1-r1:0/1::gentoo, ebuild scheduled for merge) (buildtime_slot_op)

 * Note that circular dependencies can often be avoided by temporarily
 * disabling USE flags that trigger optional dependencies.

The following USE changes are necessary to proceed:
 (see "package.use" in the portage(5) man page for more details)
# required by sys-apps/portage-2.3.101-r2::gentoo[python_targets_python3_7,-build]
# required by app-admin/perl-cleaner-2.28::gentoo
# required by dev-lang/perl-5.30.3-r1::gentoo
# required by virtual/perl-Data-Dumper-2.174.0::gentoo
>=dev-lang/python-3.7.7-r2:3.7 ssl



Full log here: https://dev.azure.com/12719821/e566c963-8f77-4f01-b7bc-ae2d91b1334f/_apis/build/builds/2188/logs/41

To reproduce this exact issue inside the Docker image:
docker pull awesomebytes/gentoo_prefix_ci_stage3:2188
docker run -it awesomebytes/gentoo_prefix_ci_stage3:2188 /bin/bash

(with EPREFIX being /tmp/gentoo)
Comment 1 Fabian Groffen gentoo-dev 2020-06-21 10:35:01 UTC
It seems that

pkg_setup() {
    # see bug 682570
    [[ -z ${BOOTSTRAP_RAP} ]] && python-any-r1_pkg_setup
}

is no longer enough.
Comment 2 Fabian Groffen gentoo-dev 2020-06-21 10:37:56 UTC
This is complicated, python deps are added in BDEPEND, seems we can only force this by merging one of the packages without deps.

@heroxbd: this is a RAP thing, do you see a way to work around this problem?
Comment 3 devourer 2020-06-22 19:53:28 UTC
Created attachment 645764
Overlay to break the circular dependency during the install

While the devs sort out a clean solution, you can get past this particular blocker by changing the dependencies of virtual/libcrypt so that the cycle is broken during the install.
My solution was to create a local repository (attached here) and to change the RDEPEND of virtual/libcrypt from
  elibc_glibc? ( sys-libs/glibc[crypt(+),static-libs(+)?] )
to
  !bootstrap? ( elibc_glibc? ( sys-libs/glibc[crypt(+),static-libs(+)?] ) )

Bear in mind that I have no idea if this has other side effects.
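
The RDEPEND change described above can be sketched as a plain text substitution. This is only an illustration, assuming the dependency string appears in the ebuild exactly as quoted; the helper name wrap_glibc_dep and the sample RDEPEND line are hypothetical, and the contents of the attached overlay are not reproduced here.

```sh
# Hypothetical helper: wrap the glibc dependency of virtual/libcrypt in a
# "!bootstrap?" conditional, as described in this comment.
# Reads ebuild text on stdin and writes the modified text to stdout.
wrap_glibc_dep() {
    # "&" in the replacement stands for the whole matched dependency group.
    sed 's|elibc_glibc? ( sys-libs/glibc\[crypt(+),static-libs(+)?] )|!bootstrap? ( & )|'
}

# Example input: an assumed RDEPEND line built from the string quoted above.
printf '%s\n' 'RDEPEND="elibc_glibc? ( sys-libs/glibc[crypt(+),static-libs(+)?] )"' | wrap_glibc_dep
```

Lines that do not contain the exact dependency string pass through unchanged, so the sketch is safe to run over a whole ebuild.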

I won't pretend to understand much of the bootstrap script, but I also had to edit its USE flags so that ssl is actually enabled. On my machine, the script seems to attempt the merge a second time if the first attempt fails (here, because the ssl flag is missing for python), and that second attempt tries to install into ${EPREFIX}/tmp/.
Comment 4 anb 2020-06-25 05:32:46 UTC
Hi,

I hit the same issue. Using attachment 645764 provided by devourer@noot-noot.org, I got things rolling, but then the bootstrap stopped on a file collision error:

-----
 * Messages for package sys-apps/coreutils-8.32-r1:

 * Package 'sys-apps/coreutils-8.32-r1' has internal collisions between
 * non-identical files (located in separate directories in the
 * installation image (${D}) corresponding to merged directories in the
 * target filesystem (${ROOT})):
 *
 * 	/home/test/.gentoo/tmp/usr/bin/basename
 * 		/home/test/.gentoo/tmp/bin/basename
 * 		/home/test/.gentoo/tmp/usr/bin/basename
 * 			Differences: type, mode
 *

...

 *
 * 	/home/test/.gentoo/tmp/usr/bin/yes
 * 		/home/test/.gentoo/tmp/bin/yes
 * 		/home/test/.gentoo/tmp/usr/bin/yes
 * 			Differences: type, mode
 *
 * Package 'sys-apps/coreutils-8.32-r1' NOT merged due to internal
 * collisions between non-identical files. If necessary, refer to your
 * elog messages for the whole content of the above message.
-----

Note that ${EPREFIX}/tmp/bin is a symlink to ${EPREFIX}/tmp/usr/bin.
Comment 5 devourer 2020-06-25 07:18:37 UTC
Hi anb,

This is also the error I had after applying the overlay. In my case, this was caused by Portage's behavior when failing to merge. I was fairly unclear in my previous message. I'll try to explain what happened so that you can see if you have the same issue:

While trying to merge the packages a first time, Portage will complain about a missing USE flag (ssl) for Python, indicating that it is necessary to proceed (see the second error in the message of this bug report).

This does not actually stop Portage: shortly afterwards, Portage will attempt to merge all these packages again, but this time it will target the system in ${EPREFIX}/tmp. I have no idea why or how it changes the target system, but the first attempt was doing the correct thing and targeting the system in ${EPREFIX}.

Thus, if you clear the missing USE flag error, it will proceed in the first attempt and successfully merge everything.

I did try writing to a package.use file to clear that particular error, but it didn't work (I don't remember if I tried doing it in both ${EPREFIX} and ${EPREFIX}/tmp, nor which one I did try it in).

The solution I ended up with is to edit bootstrap-prefix.sh, find the "-ssl" keyword in there (it's among many other disabled flags), and remove the minus sign so that it is actually activated.
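
The edit described above can be sketched with sed. This is a minimal illustration, not the actual change to the script: the helper name enable_ssl_flag and the sample USE line are made up, and the sketch assumes the flag appears as a standalone "-ssl" token.

```sh
# Hypothetical helper: turn a disabled "-ssl" USE flag into an enabled "ssl".
# Only a standalone "-ssl" token is touched; flags such as "-libressl" are not.
# Reads text on stdin and writes the modified text to stdout.
enable_ssl_flag() {
    sed -E 's/(^|[[:space:]"])-ssl([[:space:]"]|$)/\1ssl\2/g'
}

# Example on a made-up USE line (not copied from bootstrap-prefix.sh):
printf '%s\n' 'USE="-acl -ssl -pam"' | enable_ssl_flag
```

Working on a copy of the script first and diffing the result makes it easy to confirm that only the intended flag changed.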
Comment 6 anb 2020-06-25 15:06:57 UTC
Hi devourer,

Thanks for the details. I've been able to bootstrap the prefix now. Here are the places where I applied workarounds:

- coreutils complained about a file collision because ${EPREFIX}/tmp/bin was a symlink to ${EPREFIX}/tmp/usr/bin. Removing the symlink and copying ${EPREFIX}/tmp/usr/bin over fixed it.

- rsync failed because ${EPREFIX}/tmp/etc/init.d/rsyncd had a shebang of "/sbin/openrc-run" while openrc was not installed. I created a dummy script at "${EPREFIX}/tmp/sbin/openrc-run" with the following content:

---
#!/bin/bash
echo "$@"
---
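
The first workaround in the list above (replacing the symlink with a copy) could look roughly like the following. This is a sketch under the assumption that bin is a symlink to usr/bin inside the temporary root; the helper name unlink_tmp_bin is hypothetical.

```sh
# Hypothetical helper: replace the bin -> usr/bin symlink inside a bootstrap
# tmp root with a real directory holding a copy of usr/bin.
# $1 is the tmp root, e.g. "${EPREFIX}/tmp".
unlink_tmp_bin() {
    tmp_root=$1
    # Do nothing unless bin is actually a symlink.
    [ -L "${tmp_root}/bin" ] || return 0
    # Remove only the symlink itself, then copy the real directory in its place.
    rm "${tmp_root}/bin" &&
        cp -a "${tmp_root}/usr/bin" "${tmp_root}/bin"
}

# Intended use (not executed here):
#   unlink_tmp_bin "${EPREFIX}/tmp"
```
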

I think these are corner cases that only happen when bootstrapping a prefix, and the hacks landed in "${EPREFIX}/tmp", which won't affect the real prefix later. It would be good to know the steps in each bootstrap stage, and the purpose of using "${EPREFIX}/tmp", as I often find it confusing when things exist in both locations (${EPREFIX} vs ${EPREFIX}/tmp).
Comment 7 Fabian Groffen gentoo-dev 2020-06-25 15:30:03 UTC
(In reply to anb from comment #6)
> I think these are corner cases that only happen when bootstrapping a prefix,
> and the hacks landed in "${EPREFIX}/tmp", which won't affect the real prefix
> later. It would be good to know the steps in each bootstrap stage, and the
> purpose of using "${EPREFIX}/tmp", as I often find it confusing when things
> exist in both locations (${EPREFIX} vs ${EPREFIX}/tmp).

stage1: install bare tools, sufficient to install and run portage, into /tmp
stage2: install more tools without dependencies using portage into /tmp, to build a full system
stage3: build @system in /, using the somewhat proper/sane tools in /tmp

Since some of the steps in stage3 sometimes pick up parts from the host system, and a minimal USE-flag combination is used in stage3 to avoid cycles and extra dependencies, an emerge -e @system is performed afterwards to ensure all packages are installed and re-installed properly.

As a result, everything in /tmp is considered unusable cruft. It may be compiled for a different architecture (32-bit instead of 64-bit), and it probably depends on host libs (e.g. /usr/lib/curses.so). Also, stuff installed in /tmp may not have been installed by Portage, so there is no administration of owned files there; hence it's really a messy place from which to pull ourselves out of the mud. The bootstrap therefore rm -Rf's /tmp as soon as it can and continues solely with the sane(r) tools from /.
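
The stage layout above can be summarized in a small sketch. It assumes that "/tmp" and "/" in the description are relative to ${EPREFIX}; the helper name stage_target is hypothetical and not part of the bootstrap script.

```sh
# Hypothetical helper: print the directory a given bootstrap stage installs into.
# Assumes "/tmp" and "/" in the description above are relative to ${EPREFIX}.
stage_target() {
    case $1 in
        stage1|stage2) printf '%s\n' "${EPREFIX}/tmp" ;;  # temporary tools
        stage3)        printf '%s\n' "${EPREFIX}" ;;      # the real prefix
        *)             return 1 ;;                        # unknown stage
    esac
}

EPREFIX=${EPREFIX:-/tmp/gentoo}   # example value taken from the bug description
for stage in stage1 stage2 stage3; do
    printf '%s -> %s\n' "$stage" "$(stage_target "$stage")"
done
```
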
Comment 8 devourer 2020-06-25 15:36:08 UTC
My understanding is that the script starts by compiling a few tools for a sane environment (to avoid having to depend on those available on the platform) such as wget, bash, and a few others (stage 1), then it somehow creates a very minimal Gentoo install in ${EPREFIX}/tmp (stage 2), which is then used to create the complete system in ${EPREFIX} (stage 3).

This would make having collisions (and, by implication, installation of software) in ${EPREFIX}/tmp during stage 3 something that should not happen. Are you sure that your system in ${EPREFIX} works fine? If you, for example, move ${EPREFIX}/tmp to ${EPREFIX}/tmp_back without changing anything else, does it still work as intended?
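
The check suggested above can be sketched as a small wrapper. The helper name run_without_tmp is hypothetical, and moving the directory back afterwards is an addition of this sketch rather than part of the suggestion.

```sh
# Hypothetical helper: run a command while ${EPREFIX}/tmp is moved aside, to
# see whether the prefix still works without it.
# Usage: run_without_tmp <eprefix> <command> [args...]
run_without_tmp() {
    eprefix=$1
    shift
    mv "${eprefix}/tmp" "${eprefix}/tmp_back" || return 1
    "$@"
    status=$?
    # Restoring the directory afterwards is an addition of this sketch.
    mv "${eprefix}/tmp_back" "${eprefix}/tmp"
    return "$status"
}

# Intended use (not executed here; the command is only an example):
#   run_without_tmp "${EPREFIX}" "${EPREFIX}/bin/bash" -c 'echo ok'
```

The wrapper returns the exit status of the command, so a non-zero result hints that something in ${EPREFIX} still relies on ${EPREFIX}/tmp.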
Comment 9 Benda Xu gentoo-dev 2020-06-29 23:48:55 UTC
(In reply to Fabian Groffen from comment #2)
> This is complicated, python deps are added in BDEPEND, seems we can only
> force this by merging one of the packages without deps.
> 
> @heroxbd: this is a RAP thing, do you see a way to work around this problem?

It turned out that the solution is surprisingly simple :)
Comment 10 Larry the Git Cow gentoo-dev 2020-06-29 23:53:03 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/proj/prefix.git/commit/?id=e77fd01734f21ec2e9c985c28ba4eb30c1b2bc9d

commit e77fd01734f21ec2e9c985c28ba4eb30c1b2bc9d
Author:     Benda Xu <heroxbd@gentoo.org>
AuthorDate: 2020-06-29 23:34:25 +0000
Commit:     Benda Xu <heroxbd@gentoo.org>
CommitDate: 2020-06-29 23:52:25 +0000

    scripts/bootstrap-prefix.sh: do not skip USE=ssl in stage3.
    
    USE=-ssl has been introduced in d830d32f64280bb 10 years ago to
    simplify bootstrap logic, when cryptography was not crucial.
    
    Now the assumptions do not hold anymore and USE=-ssl causes more
    cursed situations than it cures.
    
    This results in cleaner and more correct code.  As a by-product, it
    fixes Bug 728986.
    
    This has been tested on prefix-standalone, call for more tests.
    
    Reported-By: Sammy Pfeiffer, devourer, anb
    Closes: https://bugs.gentoo.org/728986
    
    Signed-off-by: Benda Xu <heroxbd@gentoo.org>

 scripts/bootstrap-prefix.sh | 4 ----
 1 file changed, 4 deletions(-)
Comment 11 Benda Xu gentoo-dev 2020-06-30 00:26:30 UTC
(In reply to devourer from comment #5)
 
> The solution I ended up with is to edit bootstrap-prefix.sh, find the "-ssl"
> keyword in there (it's among many other disabled flags), and remove the
> minus sign so that it is actually activated.

devourer, you have figured out the solution ahead of me :) Kudos, bro!
Comment 12 Sammy Pfeiffer 2020-06-30 05:31:13 UTC
The CI for amd64 is bootstrapping correctly again.

Thank you for your work!

(x86 still bugged)
Comment 13 Michael Haubenwallner (RETIRED) gentoo-dev 2020-06-30 07:42:31 UTC
(In reply to Larry the Git Cow from comment #10)
>     This has been tested on prefix-standalone, call for more tests.

FWIW, I've retriggered these CI builds:
https://dev.azure.com/ssi-gentoo/prefix-ci/_build?definitionId=8