Created attachment 589924 [details]
build.log

Something goes wrong with the build of libcrypto:

/usr/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: ../../libcrypto.so: file not recognized: file truncated

It sounds like it actually installed on some people's systems, which obviously causes a mess.
Yes, it installed an invalid libcrypto on my system. First 40 bytes were null. `file` reported it as "data"... Broke a lot of stuff :(
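For anyone who wants to check an installed library for this kind of corruption, here is a minimal sketch. It only tests whether a file starts with the ELF magic bytes, which a file beginning with null bytes would fail. The function name looks_like_elf is made up for illustration, and the check says nothing about whether the rest of the file is intact.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Portage or OpenSSL): succeed only if the
# file begins with the ELF magic bytes 0x7f 'E' 'L' 'F'.
looks_like_elf() {
    local magic
    # od prints the first four bytes as hex without offsets; tr strips spacing.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Example use (the path is an assumption, adjust for your system):
#   looks_like_elf /usr/lib64/libcrypto.so.1.0.0 || echo "not an ELF file"
```

A library whose header was zeroed out fails this check immediately, which matches `file` reporting it as "data".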
I see src_compile() already has a -j1 for emake, and there has been no progress on fixing it upstream (right?).

But -j1 is not present in src_install(), and src_install() has serious hacks like bug #665130. I guess parallel install is not supported upstream either. I suggest adding -j1 to src_install() as well.
(In reply to Sergei Trofimovich from comment #2)
> I see src_compile() already has a -j1 for emake and no progress of fixing it
> upstream (right?)
>
> But -j1 is not present in src_install() and src_install() has serious hacks
> like bug #665130. I guess it's not supported upstream either. I suggest
> adding -j1 to src_install() as well.

Also https://github.com/openssl/openssl/issues/7679
Looks like the simplest way to reproduce the failure is to use:

MAKEOPTS=-j emerge -v1 =dev-libs/openssl-1.0.2t
@Luke-Jr: Your comment seems unrelated: when compilation fails due to this issue, emerge won't merge the package to disk.
Upstream doesn't care (not supported) and I don't have a reliable reproducer. I don't want to restrict to -j1 because it seems to work with multiple jobs most of the time. I'll limit jobs to 6 for now.
(In reply to Thomas Deutschmann from comment #6)
> I don't want to restrict to -j1 because it seems to work with multiple jobs
> most of the time. I'll limit jobs to 6 for now.

Do IO-bound operations even benefit from parallelisation?

Seems if it breaks with any -j, then it's broken for all -j2+ and when it "seems to work", you're just lucking out...
(In reply to Thomas Deutschmann from comment #6)
> Upstream doesn't care (not supported)

Why does Gentoo use it at all then? Is there much benefit from parallelism?

> and I don't have a reliable reproducer

You may want to try MAKEOPTS=-j and FEATURES=ccache. ccache reduces the variance of the compile step and stresses the build system harder.

Given it's a race, you are unlikely to get a reliable reproducer and would have to look at the faulty logs and maybe insert artificial delays to find the underlying cause.

It looks like the soname symlink is created after the actual link is attempted. In Matt's attached log, the libcrypto symlink is created after (or at the same time as) the attempt to link it into libgost.so:

+ x86_64-pc-linux-gnu-gcc -m32 -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -O2 -march=broadwell -pipe -fno-strict-aliasing -Wa,--noexecstack -shared -Wl,-Bsymbolic -Wl,-soname=libgost.so -o libgost.so -Wl,--whole-archive e_gost_err.o gost2001_keyx.o gost2001.o gost89.o gost94_keyx.o gost_ameth.o gost_asn1.o gost_crypt.o gost_ctl.o gost_eng.o gosthash.o gost_keywrap.o gost_md.o gost_params.o gost_pmeth.o gost_sign.o -Wl,--no-whole-archive -L../.. -lcrypto -ldl -lz
/usr/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: ../../libcrypto.so: file not recognized: file truncated
+ rm -f libcrypto.so
+ ln -s libcrypto.so.1.0.0 libcrypto.so

Thus it's probably a missing dependency of libgost.so on libcrypto.so, and both race on reads and writes of libcrypto.so.

> I don't want to restrict to -j1 because it seems to work with multiple jobs
> most of the time. I'll limit jobs to 6 for now.

That sounds dangerous.
The build system will sometimes have a chance to link against the system's library instead of the just-built one, or to use partially written files.

Ideally openssl's build system should do atomic renames so that partially written files are never exposed. Say, through 'gcc -o libsoo.so.tmp && mv libsoo.so.tmp libsoo.so'. That would make a multi-process make less confused about when the output is ready. But that is an invasive upstream change.

If upstream is unwilling to take it and nobody steps up to make parallel builds work reliably, at least in Gentoo, we are bound to get reports like these from time to time.
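To illustrate the atomic-rename idea from the comment above, here is a minimal sketch, not anything OpenSSL's build system actually does. The helper name link_atomically and the temporary-file naming are assumptions made for the example. The temporary file lives in the same directory as the target so that mv is a plain rename on one filesystem.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command that writes its output to a temporary
# path, then rename the result into place only if the command succeeded.
# Usage: link_atomically OUTPUT COMMAND [ARGS...]
# The temporary path is appended as the LAST argument of COMMAND.
link_atomically() {
    local out=$1
    shift
    local tmp="${out}.tmp.$$"
    if "$@" "$tmp"; then
        # A rename within one directory replaces the target in a single step,
        # so readers see either the old file or the complete new one.
        mv -f -- "$tmp" "$out"
    else
        # Never leave a partial file behind on failure.
        rm -f -- "$tmp"
        return 1
    fi
}

# Example use (object file names are placeholders):
#   link_atomically libsoo.so gcc -shared foo.o bar.o -o
```

With this pattern a concurrent linker that opens libsoo.so never observes a half-written file. It does not fix a missing dependency in the Makefile, though: a job could still pick up a stale or system copy of the library.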
The problem is not new in the openssl-1.0.2 series.

Restricting to -j1 for everyone would have a dramatic impact on build time (2min vs 10min on my systems).

I expect that "-l" is triggering the problem. We are now filtering -l and will limit to -j6 max.

If we still get reports, we have to find another solution (and maybe we will end up with -j1). But for now I'll try to keep the parallel build, which we have used for 1-2 years now.
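As a rough illustration of what "filter -l and cap at -j6" means for a MAKEOPTS string, here is a sketch in plain shell. It is not the code from the ebuild commits. The function name cap_makeopts is invented, and only the -j, --jobs, -l and --load-average spellings are handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print MAKEOPTS with load-average options removed and
# the job count capped. Usage: cap_makeopts MAX "$MAKEOPTS"
cap_makeopts() {
    local max=$1 jobs="" out="" tok next
    shift
    # Re-split the remaining arguments into individual words.
    set -- $*
    while [ $# -gt 0 ]; do
        tok=$1
        shift
        case $tok in
            -j|--jobs|-l|--load-average)
                # These may carry their value in the following word.
                next=""
                case ${1-} in
                    ''|*[!0-9.]*) ;;
                    *) next=$1; shift ;;
                esac
                case $tok in
                    # A bare -j means "unlimited", so treat it as the cap.
                    -j|--jobs) jobs=${next:-$max} ;;
                esac
                ;;
            -j*) jobs=${tok#-j} ;;
            --jobs=*) jobs=${tok#--jobs=} ;;
            -l*|--load-average=*) ;;  # load-average limits are dropped
            *) out="${out:+$out }$tok" ;;
        esac
    done
    if [ -n "$jobs" ]; then
        # Anything that is not a plain integer falls back to the cap.
        case $jobs in *[!0-9]*) jobs=$max ;; esac
        if [ "$jobs" -gt "$max" ]; then jobs=$max; fi
        out="${out:+$out }-j$jobs"
    fi
    printf '%s\n' "$out"
}

# Example use:
#   MAKEOPTS=$(cap_makeopts 6 "$MAKEOPTS")
```

Capping the job count only narrows the window for a race like the one described in this bug. It cannot remove it.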
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=2f736482adecae6176bafb64906996c06bade0a3

commit 2f736482adecae6176bafb64906996c06bade0a3
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 19:47:46 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:28:00 +0000

    dev-libs/openssl: limit parallel jobs

    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3201627815cc92ff5f4396c288354fd3acfcd7c3

commit 3201627815cc92ff5f4396c288354fd3acfcd7c3
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:27:47 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:28:02 +0000

    dev-libs/openssl-compat: limit parallel jobs

    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
I don't use -l, only -j64.

I had no problems until 1.0.2t, which produces a corrupt libcrypto every time (except when I forced -j1).
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=90be70aa64277dd6fe31c6dea00f7f6c913057ac

commit 90be70aa64277dd6fe31c6dea00f7f6c913057ac
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:37:37 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:37:37 +0000

    dev-libs/openssl: filter load average

    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 3 +++
 1 file changed, 3 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3558a38befebeadab55ef698eb900b625838408d

commit 3558a38befebeadab55ef698eb900b625838408d
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:36:50 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:36:50 +0000

    dev-libs/openssl-compat: filter load average

    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 3 +++
 1 file changed, 3 insertions(+)
(In reply to Luke-Jr from comment #11)
> I don't use -l, only -j64
>
> I had no problems until 1.0.2t, which produces a corrupt libcrypto every
> time (except when I forced -j1).

Obviously, the ebuild completes for you. However, the error reported here aborts the build, so emerge stops. So it looks like you have a different problem. In that case, please file a separate bug.
(In reply to Luke-Jr from comment #11)
> I don't use -l, only -j64
>
> I had no problems until 1.0.2t, which produces a corrupt libcrypto every
> time (except when I forced -j1).

FWIW, while I have different symptoms than Luke, I also had no problems until 1.0.2t.
Created attachment 589954 [details]
dev-libs:openssl-1.0.2t:20190915-211511.log.gz

I don't see the problem being fixed with -j6 with all above fixes pulled:

make -j6 all
...
...
ld: ../../libcrypto.so: file not recognized: file truncated
collect2: error: ld returned 1 exit status
make[3]: *** [../../Makefile.shared:167: link_o.gnu] Error 1
Created attachment 589956 [details]
MAKEOPTS-j2-dev-libs:openssl-1.0.2t:20190915-213228.log.gz

Here is a MAKEOPTS=-j2 build failure for you.
Limiting the jobs to some arbitrary number (other than 1) makes no sense at all. Please revert, or switch it to -j1.
It wasn't an arbitrary number. But the trigger is FEATURES=ccache (from the second run on). Currently bisecting: it does not happen with 1.0.2r but does with 1.0.2s-r1.
Found the problem: I killed our patch set when I created 1.0.2s* and synchronized EAPI=7 logic. Incoming revert + fix.
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=77f41cb32418c535b2e948e4bd29d4647b6c99c0

commit 77f41cb32418c535b2e948e4bd29d4647b6c99c0
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-16 00:03:38 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:18 +0000

    dev-libs/openssl-compat: restore Gentoo patch set

    Patch set for 1.0.2x series were longer applied when ebuilds
    were bumped to EAPI=7 and unified.

    Fixes a039f65 ("dev-libs/openssl: bump to EAPI 7")
    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 .../{openssl-compat-1.0.2s.ebuild => openssl-compat-1.0.2s-r1.ebuild} | 4 +---
 .../{openssl-compat-1.0.2t.ebuild => openssl-compat-1.0.2t-r1.ebuild} | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a21ed49cf57d9a8111876fb49cdd6fc6afb8bd90

commit a21ed49cf57d9a8111876fb49cdd6fc6afb8bd90
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-16 00:01:28 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:17 +0000

    dev-libs/openssl: restore Gentoo patch set

    Patch set for 1.0.2x series were longer applied when ebuilds
    were bumped to EAPI=7 and unified.

    Fixes a039f65 ("dev-libs/openssl: bump to EAPI 7")
    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 .../openssl/{openssl-1.0.2s-r1.ebuild => openssl-1.0.2s-r2.ebuild} | 4 +---
 dev-libs/openssl/{openssl-1.0.2t.ebuild => openssl-1.0.2t-r1.ebuild} | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4e8a7b0b3b8b702333424151528e2822488c74ad

commit 4e8a7b0b3b8b702333424151528e2822488c74ad
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:50:28 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:13 +0000

    Revert "dev-libs/openssl: limit parallel jobs"

    This reverts commit 2f736482adecae6176bafb64906996c06bade0a3.

    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5a2ebf9296293621d4a74c1090b5a6087b8a86d4

commit 5a2ebf9296293621d4a74c1090b5a6087b8a86d4
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:50:08 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:08 +0000

    Revert "dev-libs/openssl-compat: limit parallel jobs"

    This reverts commit 3201627815cc92ff5f4396c288354fd3acfcd7c3.

    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=73567ce3bf59198b1c2fe19aa59d70fed4c8a13a

commit 73567ce3bf59198b1c2fe19aa59d70fed4c8a13a
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:49:52 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:04 +0000

    Revert "dev-libs/openssl-compat: filter load average"

    This reverts commit 3558a38befebeadab55ef698eb900b625838408d.

    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 3 ---
 1 file changed, 3 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6c4711b3b966c2446f2f937d5bfd39d607060b78

commit 6c4711b3b966c2446f2f937d5bfd39d607060b78
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:49:36 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:00 +0000

    Revert "dev-libs/openssl: filter load average"

    This reverts commit 90be70aa64277dd6fe31c6dea00f7f6c913057ac.

    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 3 ---
 1 file changed, 3 deletions(-)