Bug 694512

Summary: dev-libs/openssl-1.0.2t with FEATURES ccache: parallel build failure: ld: ../../libcrypto.so: file not recognized: file truncated
Product: Gentoo Linux
Reporter: Matt Turner <mattst88>
Component: Current packages
Assignee: Gentoo's Team for Core System packages <base-system>
Status: RESOLVED FIXED
Severity: normal
CC: luke-jr+gentoobugs, slyfox
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
URL: https://github.com/openssl/openssl/issues/9904
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 694162    
Attachments: build.log
dev-libs:openssl-1.0.2t:20190915-211511.log.gz
MAKEOPTS-j2-dev-libs:openssl-1.0.2t:20190915-213228.log.gz

Description Matt Turner gentoo-dev 2019-09-15 16:57:27 UTC
Created attachment 589924 [details]
build.log

Something goes wrong with the build of libcrypto:

/usr/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: ../../libcrypto.so: file not recognized: file truncated


It sounds like it actually installed on some people's systems, which obviously causes a mess.
Comment 1 Luke-Jr 2019-09-15 17:01:27 UTC
Yes, it installed an invalid libcrypto on my system. First 40 bytes were null. `file` reported it as "data"...

Broke a lot of stuff :(
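
A quick way to check for the symptom described above is to look at the first four bytes of the library, which should be the ELF magic (0x7f 'E' 'L' 'F'). The sketch below is illustrative only: `is_elf` is a hypothetical helper name, and the two scratch files stand in for a broken and a healthy library.

```shell
#!/bin/sh
# is_elf FILE
# Returns 0 if FILE starts with the ELF magic bytes (0x7f 'E' 'L' 'F').
is_elf() {
    # Read the first four bytes, print them as hex, strip the whitespace.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Scratch files: one zero-filled like the broken library, one with a valid magic.
tmpdir=$(mktemp -d)
dd if=/dev/zero of="$tmpdir/broken.so" bs=40 count=1 2>/dev/null
printf '\177ELF' > "$tmpdir/ok.so"

for f in "$tmpdir/broken.so" "$tmpdir/ok.so"; do
    if is_elf "$f"; then
        echo "$f: ELF header present"
    else
        echo "$f: not an ELF file"
    fi
done
rm -rf "$tmpdir"
```

This only inspects the header, so it catches a zero-filled or empty file but not a library that is truncated further in.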
Comment 2 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 17:17:13 UTC
I see src_compile() already has a -j1 for emake and no progress of fixing it upstream (right?)

But -j1 is not present in src_install() and src_install() has serious hacks like bug #665130. I guess it's not supported upstream either. I suggest adding -j1 to src_install() as well.
Comment 3 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 17:27:11 UTC
(In reply to Sergei Trofimovich from comment #2)
> I see src_compile() already has a -j1 for emake and no progress of fixing it
> upstream (right?)
> 
> But -j1 is not present in src_install() and src_install() has serious hacks
> like bug #665130. I guess it's not supported upstream either. I suggest
> adding -j1 to src_install() as well.

Also https://github.com/openssl/openssl/issues/7679
Comment 4 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 17:58:20 UTC
Looks like the simplest way to reproduce the failure is to use:
    MAKEOPTS=-j emerge -v1 =dev-libs/openssl-1.0.2t
Comment 5 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 19:02:41 UTC
@ Luke-Jr: Your comment seems unrelated: when compilation fails due to that issue, emerge won't merge the package to disk.
Comment 6 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 19:32:48 UTC
Upstream doesn't care (not supported) and I don't have a reliable reproducer.

I don't want to restrict to -j1 because it seems to work with multiple jobs most of the time. I'll limit jobs to 6 for now.
Comment 7 Luke-Jr 2019-09-15 19:57:40 UTC
(In reply to Thomas Deutschmann from comment #6)
> I don't want to restrict to -j1 because it seems to work with multiple jobs
> most of the time. I'll limit jobs to 6 for now.

Do IO-bound operations even benefit from parallelisation?

Seems if it breaks with any -j, then it's broken for all -j2+ and when it "seems to work", you're just lucking out...
Comment 8 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 20:07:13 UTC
(In reply to Thomas Deutschmann from comment #6)
> Upstream doesn't care (not supported)

Why does Gentoo use it at all then? Is there much benefit from parallelism?

> and I don't have a reliable reproducer

You may want to try MAKEOPTS=-j and FEATURES=ccache. It will decrease the variance of the compile step and stress the build system better.

Given it's a race, you are unlikely to get a reliable reproducer and would have to look at the faulty logs and maybe insert artificial delays to find the underlying cause.
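
Since the failure is a race, one way to catch a faulty log is to repeat the build until it fails and keep one log per attempt. This is a minimal sketch, assuming a hypothetical helper `run_until_failure`; the `flaky` function only simulates a build that fails on its third run.

```shell
#!/bin/sh
# run_until_failure MAX LOGDIR CMD [ARGS...]
# Runs CMD up to MAX times, writing one log per attempt. Stops at the first
# failing attempt, prints the path of its log and returns 0.
# Returns 1 if no attempt failed.
run_until_failure() {
    max=$1
    logdir=$2
    shift 2
    mkdir -p "$logdir"
    i=1
    while [ "$i" -le "$max" ]; do
        log="$logdir/attempt-$i.log"
        if ! "$@" > "$log" 2>&1; then
            echo "$log"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Stand-in for a racy build: succeeds twice, then fails.
# The run count is kept in the file passed as $1.
flaky() {
    n=$(cat "$1")
    n=$((n + 1))
    echo "$n" > "$1"
    [ "$n" -lt 3 ]
}

work=$(mktemp -d)
echo 0 > "$work/count"
run_until_failure 10 "$work/logs" flaky "$work/count"
```

In practice CMD would be the actual build invocation, for example the emerge command from comment 4.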

It looks like the soname symlink is created after the actual link is attempted. In Matt's attached log, the libcrypto.so symlink is created after (or at the same time as) the attempt to link it into libgost.so:

+ x86_64-pc-linux-gnu-gcc -m32 -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DRC4_ASM -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -O2 -march=broadwell -pipe -fno-strict-aliasing -Wa,--noexecstack -shared -Wl,-Bsymbolic -Wl,-soname=libgost.so -o libgost.so -Wl,--whole-archive e_gost_err.o gost2001_keyx.o gost2001.o gost89.o gost94_keyx.o gost_ameth.o gost_asn1.o gost_crypt.o gost_ctl.o gost_eng.o gosthash.o gost_keywrap.o gost_md.o gost_params.o gost_pmeth.o gost_sign.o -Wl,--no-whole-archive -L../.. -lcrypto -ldl -lz
/usr/lib/gcc/x86_64-pc-linux-gnu/8.3.0/../../../../x86_64-pc-linux-gnu/bin/ld: ../../libcrypto.so: file not recognized: file truncated
+ rm -f libcrypto.so
+ ln -s libcrypto.so.1.0.0 libcrypto.so

Thus it's probably a missing dependency of libgost.so on libcrypto.so and both race on read/write to libcrypto.so.

> I don't want to restrict to -j1 because it seems to work with multiple jobs
> most of the time. I'll limit jobs to 6 for now.

That sounds dangerous. The build system will sometimes have a chance to link against the system's library instead of the just-built one, or to use partially written files.

Ideally, openssl's build system should use atomic renames so that partially written files are never exposed, say through 'gcc -o libsoo.so.tmp && mv libsoo.so.tmp libsoo.so'. That would make a multi-process make less confused about when the output is ready.

But that is an invasive upstream change. If upstream is unwilling to take it and nobody steps up to make parallel builds work reliably, at least in Gentoo, we are bound to get reports like these from time to time.
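
The write-to-temporary-then-rename idea above can be sketched as a small wrapper. This is an illustration, not openssl's build code: `atomic_write` and `fake_link` are hypothetical names, and `fake_link` stands in for the real link step so the example runs without a compiler.

```shell
#!/bin/sh
# atomic_write TARGET CMD [ARGS...]
# Runs CMD with a temporary path appended as its last argument, then renames
# the result over TARGET only if CMD succeeded. Readers therefore see either
# the old TARGET or the complete new one, never a half-written file.
atomic_write() {
    target=$1
    shift
    tmp="$target.tmp.$$"
    if "$@" "$tmp"; then
        # A rename within one filesystem replaces the target atomically.
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Stand-in for the link step: writes its output in two separate steps,
# so there is a moment when the file is incomplete.
fake_link() {
    printf 'first half\n' > "$1"
    printf 'second half\n' >> "$1"
}

workdir=$(mktemp -d)
atomic_write "$workdir/libfoo.so" fake_link
cat "$workdir/libfoo.so"
rm -rf "$workdir"
```

With a real linker the call could look like `atomic_write libfoo.so gcc -shared foo.o -o`, where the trailing `-o` receives the temporary path.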
Comment 9 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 20:24:28 UTC
The problem is not new in the openssl-1.0.2 series. Restricting to -j1 for everyone would have a dramatic impact on build time (2min vs 10min on my systems). I expect that "-l" is triggering the problem. We are now filtering -l and will limit to -j6 max.

If we still get reports, we have to find another solution (and maybe we will end up with -j1). But for now I'll try to keep the parallel build, which we have used for 1-2 years now.
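
The workaround described in this comment (drop the load-average limit and cap the job count) could look roughly like the sketch below. It is not the code from the ebuild commit: `limit_makeopts` is a hypothetical helper, and it assumes options are written in joined form (`-jN`, `-lN`, `--jobs=N`, `--load-average=N`) or as a bare `-j`.

```shell
#!/bin/sh
# limit_makeopts MAX OPTS...
# Prints OPTS with any load-average option removed and the job count
# capped at MAX. Assumes joined option forms, as stated above.
limit_makeopts() {
    max=$1
    shift
    out=
    for opt in "$@"; do
        case $opt in
            -l*|--load-average*|--max-load*)
                continue ;;                 # drop the load-average limit
            -j|--jobs)
                opt="-j$max" ;;             # bare -j means unlimited jobs
            -j*|--jobs=*)
                n=${opt#-j}
                n=${n#--jobs=}
                case $n in
                    ''|*[!0-9]*) ;;         # not a plain number: leave as-is
                    *) if [ "$n" -gt "$max" ]; then opt="-j$max"; fi ;;
                esac ;;
        esac
        out="${out:+$out }$opt"
    done
    printf '%s\n' "$out"
}

opts="-j64 -l8 -k"
# Unquoted on purpose so the string splits into separate options.
limit_makeopts 6 $opts    # prints: -j6 -k
```

A cap like this only narrows the window for a race; it does not add the missing ordering between the targets.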
Comment 10 Larry the Git Cow gentoo-dev 2019-09-15 20:28:12 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=2f736482adecae6176bafb64906996c06bade0a3

commit 2f736482adecae6176bafb64906996c06bade0a3
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 19:47:46 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:28:00 +0000

    dev-libs/openssl: limit parallel jobs
    
    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3201627815cc92ff5f4396c288354fd3acfcd7c3

commit 3201627815cc92ff5f4396c288354fd3acfcd7c3
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:27:47 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:28:02 +0000

    dev-libs/openssl-compat: limit parallel jobs
    
    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
Comment 11 Luke-Jr 2019-09-15 20:36:56 UTC
I don't use -l, only -j64

I had no problems until 1.0.2t, which produces a corrupt libcrypto every time (except when I forced -j1).
Comment 12 Larry the Git Cow gentoo-dev 2019-09-15 20:37:53 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=90be70aa64277dd6fe31c6dea00f7f6c913057ac

commit 90be70aa64277dd6fe31c6dea00f7f6c913057ac
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:37:37 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:37:37 +0000

    dev-libs/openssl: filter load average
    
    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 3 +++
 1 file changed, 3 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3558a38befebeadab55ef698eb900b625838408d

commit 3558a38befebeadab55ef698eb900b625838408d
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 20:36:50 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-15 20:36:50 +0000

    dev-libs/openssl-compat: filter load average
    
    Bug: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 3 +++
 1 file changed, 3 insertions(+)
Comment 13 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 20:39:41 UTC
(In reply to Luke-Jr from comment #11)
> I don't use -l, only -j64
> 
> I had no problems until 1.0.2t, which produces a corrupt libcrypto every
> time (except when I forced -j1).

Obviously, the ebuild is passing for you. However, the failure reported here raises an error and emerge stops. So it looks like you have a different problem. In that case, please file a separate bug.
Comment 14 Matt Turner gentoo-dev 2019-09-15 20:40:31 UTC
(In reply to Luke-Jr from comment #11)
> I don't use -l, only -j64
> 
> I had no problems until 1.0.2t, which produces a corrupt libcrypto every
> time (except when I forced -j1).

FWIW, while I have different symptoms than Luke, I also had no problems until 1.0.2t.
Comment 15 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 21:18:54 UTC
Created attachment 589954 [details]
dev-libs:openssl-1.0.2t:20190915-211511.log.gz

I don't see the problem fixed with -j6, even with all of the above fixes pulled:

make -j6 all
...
... ld: ../../libcrypto.so: file not recognized: file truncated
collect2: error: ld returned 1 exit status
make[3]: *** [../../Makefile.shared:167: link_o.gnu] Error 1
Comment 16 Sergei Trofimovich (RETIRED) gentoo-dev 2019-09-15 21:36:04 UTC
Created attachment 589956 [details]
MAKEOPTS-j2-dev-libs:openssl-1.0.2t:20190915-213228.log.gz

Here is a MAKEOPTS=-j2 build failure for you.
Comment 17 Mike Gilbert gentoo-dev 2019-09-15 23:01:27 UTC
Limiting the jobs to some arbitrary number (other than 1) makes no sense at all. Please revert, or switch it to -j1.
Comment 18 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 23:11:44 UTC
It wasn't an arbitrary number.

But the trigger is FEATURES=ccache (second and later runs). Currently bisecting: it does not happen with 1.0.2r but does with 1.0.2s-r1.
Comment 19 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-15 23:48:48 UTC
Found the problem: I killed our patch set when I created 1.0.2s* and synchronized EAPI=7 logic.

Incoming revert + fix.
Comment 20 Larry the Git Cow gentoo-dev 2019-09-16 00:06:30 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=77f41cb32418c535b2e948e4bd29d4647b6c99c0

commit 77f41cb32418c535b2e948e4bd29d4647b6c99c0
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-16 00:03:38 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:18 +0000

    dev-libs/openssl-compat: restore Gentoo patch set
    
    Patch set for 1.0.2x series were longer applied when ebuilds were
    bumped to EAPI=7 and unified.
    
    Fixes a039f65 ("dev-libs/openssl: bump to EAPI 7")
    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 .../{openssl-compat-1.0.2s.ebuild => openssl-compat-1.0.2s-r1.ebuild} | 4 +---
 .../{openssl-compat-1.0.2t.ebuild => openssl-compat-1.0.2t-r1.ebuild} | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a21ed49cf57d9a8111876fb49cdd6fc6afb8bd90

commit a21ed49cf57d9a8111876fb49cdd6fc6afb8bd90
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-16 00:01:28 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:17 +0000

    dev-libs/openssl: restore Gentoo patch set
    
    Patch set for 1.0.2x series were longer applied when ebuilds were
    bumped to EAPI=7 and unified.
    
    Fixes a039f65 ("dev-libs/openssl: bump to EAPI 7")
    Closes: https://bugs.gentoo.org/694512
    Package-Manager: Portage-2.3.76, Repoman-2.3.17
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 .../openssl/{openssl-1.0.2s-r1.ebuild => openssl-1.0.2s-r2.ebuild}    | 4 +---
 dev-libs/openssl/{openssl-1.0.2t.ebuild => openssl-1.0.2t-r1.ebuild}  | 4 +---
 2 files changed, 2 insertions(+), 6 deletions(-)

Additionally, it has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4e8a7b0b3b8b702333424151528e2822488c74ad

commit 4e8a7b0b3b8b702333424151528e2822488c74ad
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:50:28 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:13 +0000

    Revert "dev-libs/openssl: limit parallel jobs"
    
    This reverts commit 2f736482adecae6176bafb64906996c06bade0a3.
    
    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5a2ebf9296293621d4a74c1090b5a6087b8a86d4

commit 5a2ebf9296293621d4a74c1090b5a6087b8a86d4
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:50:08 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:08 +0000

    Revert "dev-libs/openssl-compat: limit parallel jobs"
    
    This reverts commit 3201627815cc92ff5f4396c288354fd3acfcd7c3.
    
    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=73567ce3bf59198b1c2fe19aa59d70fed4c8a13a

commit 73567ce3bf59198b1c2fe19aa59d70fed4c8a13a
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:49:52 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:04 +0000

    Revert "dev-libs/openssl-compat: filter load average"
    
    This reverts commit 3558a38befebeadab55ef698eb900b625838408d.
    
    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl-compat/openssl-compat-1.0.2t.ebuild | 3 ---
 1 file changed, 3 deletions(-)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6c4711b3b966c2446f2f937d5bfd39d607060b78

commit 6c4711b3b966c2446f2f937d5bfd39d607060b78
Author:     Thomas Deutschmann <whissi@gentoo.org>
AuthorDate: 2019-09-15 23:49:36 +0000
Commit:     Thomas Deutschmann <whissi@gentoo.org>
CommitDate: 2019-09-16 00:06:00 +0000

    Revert "dev-libs/openssl: filter load average"
    
    This reverts commit 90be70aa64277dd6fe31c6dea00f7f6c913057ac.
    
    Bug: https://bugs.gentoo.org/694512
    Signed-off-by: Thomas Deutschmann <whissi@gentoo.org>

 dev-libs/openssl/openssl-1.0.2t.ebuild | 3 ---
 1 file changed, 3 deletions(-)