Bug 892549 - sys-devel/binutils: After RAP toolchain migration, ld failed to perform some link when building
Summary: sys-devel/binutils: After RAP toolchain migration, ld failed to perform some ...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo/Alt
Classification: Unclassified
Component: Prefix Support
Hardware: All Linux
Importance: Highest major
Assignee: Gentoo Prefix
URL:
Whiteboard:
Keywords: PullRequest
Depends on: 912676
Blocks:
Reported: 2023-01-30 02:54 UTC by Yiyang Wu
Modified: 2024-01-06 11:48 UTC
CC List: 5 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info output (emerge-info.txt, 6.33 KB, text/plain)
2023-01-30 03:05 UTC, Yiyang Wu
tgbugs binutils build.log (binutils-prefix-build.log, 613.78 KB, text/x-log)
2023-01-31 22:58 UTC, Tom Gillespie
tgbugs gentoo/tmp/bin/emerge --infos (binutils-emerge-info, 4.70 KB, text/plain)
2023-01-31 23:01 UTC, Tom Gillespie
bug_MWE (reproduce_ld.tar.gz, 13.38 KB, application/gzip)
2023-09-24 09:21 UTC, Yiyang Wu

Description Yiyang Wu 2023-01-30 02:54:38 UTC
I migrated my prefix according to https://www.gentoo.org/support/news-items/2023-01-28-rap-prefix-sysroot.html

Actually, I had just started a rolling system update with `emerge --ask --verbose --update --newuse --backtrack=1000 -b --oneshot --keep-going --oneshot -j 8 @world`, and things broke with the glibc upgrade. I then checked the news, used its last suggestion to manually prefixify the GNU ld script, and proceeded with the migration without any issue.

However, after the migration succeeded, some compilations using the bfd linker in user space started failing, and I no longer understand the linker's behavior. Take a simple program that uses cblas functions:

```c
#include <stdio.h>
#include <cblas.h>
int main()
{
        const double x[3] = {1,2,3};
        const double y[3] = {1,-2,3};
        double result = cblas_ddot(3, x, 1, y, 1);
        printf("%lf\n", result);
        return 0;
}
```

Running `gcc -fuse-ld=bfd foo.c -lcblas`, I got:
```
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: warning: libgfortran.so.5, needed by /opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so, not found (try using -rpath or -rpath-link)
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: warning: libquadmath.so.0, needed by /opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so, not found (try using -rpath or -rpath-link)
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_stop_string@GFORTRAN_8'
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_st_write@GFORTRAN_8'
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_string_len_trim@GFORTRAN_8'
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_transfer_character_write@GFORTRAN_8'
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_transfer_integer_write@GFORTRAN_8'
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld.bfd: /opt/gentoo/usr/lib64/libblas.so.3: undefined reference to `_gfortran_st_write_done@GFORTRAN_8'
collect2: error: ld returned 1 exit status
```

With mold, lld, or gold there is no issue. If I run `ln -sn "${EPREFIX}" "${EPREFIX}${EPREFIX}"`, then bfd also links successfully.

Reproducible: Always

Steps to Reproduce:
1. Migrate a RAP
2. emerge virtual/blas
3. compile a simple program with `-lcblas`
Comment 1 Yiyang Wu 2023-01-30 03:05:22 UTC
Created attachment 849437 [details]
emerge --info output
Comment 2 James Le Cuirot gentoo-dev 2023-01-30 22:44:10 UTC
I don't know what the cause is yet, but I have reproduced it. I think it affects the libraries installed with gcc, but seemingly not libstdc++.
Comment 3 James Le Cuirot gentoo-dev 2023-01-30 22:48:39 UTC
Aha!

% strace -f -e trace=file gcc -fuse-ld=bfd foo.c -lcblas 2>&1 | fgrep ld.so.conf
[pid  1436] openat(AT_FDCWD, "/mnt/prefix/mnt/prefix/usr/mnt/prefix/etc/ld.so.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  1436] openat(AT_FDCWD, "/mnt/prefix/mnt/prefix/etc/ld.so.conf", O_RDONLY) = -1 ENOENT (No such file or directory)

This is similar to a cross problem I saw recently. I'll keep looking.
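
The doubled path in the second trace line is what you get when a sysroot is prepended to a path that already carries the prefix. A minimal sketch of that arithmetic, assuming the linker joins its sysroot and the configured path by plain concatenation; `with_sysroot` is a hypothetical helper for illustration, not a binutils function:

```python
def with_sysroot(sysroot: str, path: str) -> str:
    """Prepend a sysroot to an absolute path (illustrative helper only)."""
    return sysroot.rstrip("/") + "/" + path.lstrip("/")


PREFIX = "/mnt/prefix"

# A configured path that has already been prefixified.
prefixified = PREFIX + "/etc/ld.so.conf"

# Sysroot applied on top of an already prefixed path: double prefix.
print(with_sysroot(PREFIX, prefixified))

# Sysroot applied to the unprefixed path: the file that actually exists.
print(with_sysroot(PREFIX, "/etc/ld.so.conf"))
```

Under this assumption, only one of the two layers (the sysroot or the prefixified path) may add the prefix.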
Comment 4 James Le Cuirot gentoo-dev 2023-01-30 23:32:17 UTC
I think the problem is in binutils. I'll look some more tomorrow.
Comment 5 Tom Gillespie 2023-01-31 22:58:20 UTC
I'm seeing a similar issue when bootstrapping a new prefix.
Comment 6 Tom Gillespie 2023-01-31 22:58:59 UTC
Created attachment 849582 [details]
tgbugs binutils build.log
Comment 7 Tom Gillespie 2023-01-31 23:01:32 UTC
Created attachment 849584 [details]
tgbugs gentoo/tmp/bin/emerge --infos
Comment 8 James Le Cuirot gentoo-dev 2023-01-31 23:10:41 UTC
I've made some progress, but it's not straightforward.

The first problem is this hack in the prefix profile. It should be removed.

https://gitweb.gentoo.org/repo/gentoo.git/tree/profiles/features/prefix/standalone/profile.bashrc?id=edf7231cee8509dcc346c3c21891ccb6fbd69602#n18

The second problem is that the paths in files like /mnt/prefix/etc/ld.so.conf.d/05gcc-x86_64-pc-linux-gnu.conf now need to be unprefixed. This fixes things when building, but then ldconfig picks up the wrong libraries when generating the cache.

Before:

> % ldconfig; ldconfig -p | fgrep fortran                                     
>         libgfortran.so.5 (libc6,x86-64) => /mnt/prefix/usr/lib/gcc/x86_64-pc-linux-gnu/12/libgfortran.so.5
>         libgfortran.so (libc6,x86-64) => /mnt/prefix/usr/lib/gcc/x86_64-pc-linux-gnu/12/libgfortran.so

After:

> % ldconfig; ldconfig -p | fgrep fortran                                     
>         libgfortran.so.5 (libc6,x86-64) => /usr/lib/gcc/x86_64-pc-linux-gnu/12/libgfortran.so.5
>         libgfortran.so (libc6,x86-64) => /usr/lib/gcc/x86_64-pc-linux-gnu/12/libgfortran.so
Comment 9 James Le Cuirot gentoo-dev 2023-02-01 23:42:08 UTC
I've been thinking about whether the paths in these ld.so.conf files should be prefixed or not, and hence, which direction we should go in fixing this. I believe they should be unprefixed, otherwise the files will never make sense when cross-compiling. That presumably means a change to glibc, but that doesn't have any concept of a sysroot.
Comment 10 Benda Xu gentoo-dev 2023-02-02 05:30:40 UTC
(In reply to James Le Cuirot from comment #9)
> I've been thinking about whether the paths in these ld.so.conf files should
> be prefixed or not, and hence, which direction should go in fixing this. I
> believe they should be unprefixed, otherwise the files will never make sense
> when cross-compiling. That presumably means a change to glibc, but that
> doesn't have any concept of a sysroot.

The fundamental inconsistency comes from runtime loading (glibc) and build-time linking (binutils), which coincidentally refer to the same ld.so.conf files. They have to interpret the config file in the same way in order for a system to function.
Comment 11 Benda Xu gentoo-dev 2023-02-02 05:32:32 UTC
(In reply to Benda Xu from comment #10)
> (In reply to James Le Cuirot from comment #9)
> > I've been thinking about whether the paths in these ld.so.conf files should
> > be prefixed or not, and hence, which direction should go in fixing this. I
> > believe they should be unprefixed, otherwise the files will never make sense
> > when cross-compiling. That presumably means a change to glibc, but that
> > doesn't have any concept of a sysroot.
> 
> The fundamental inconsistency comes from the runtime loading (glibc) and
> build-time linking (binutils), which coincidentally refer to the same
> ld.so.conf files.  They have to interpret the config file in the same way in
> order for a system to function.

Note that the above is only true for Prefix, not cross-compiling. In the latter, runtime loading is not prefixed but build-time linking is prefixed.
Comment 12 Benda Xu gentoo-dev 2023-02-02 05:59:10 UTC
(In reply to Benda Xu from comment #11)
> (In reply to Benda Xu from comment #10)
> > (In reply to James Le Cuirot from comment #9)
> > > I've been thinking about whether the paths in these ld.so.conf files should
> > > be prefixed or not, and hence, which direction should go in fixing this. I
> > > believe they should be unprefixed, otherwise the files will never make sense
> > > when cross-compiling. That presumably means a change to glibc, but that
> > > doesn't have any concept of a sysroot.
> > 
> > The fundamental inconsistency comes from the runtime loading (glibc) and
> > build-time linking (binutils), which coincidentally refer to the same
> > ld.so.conf files.  They have to interpret the config file in the same way in
> > order for a system to function.
> 
> Note that the above is only true for Prefix, not cross-compiling.   In the
> latter, runtime loading is not prefixed but build-time linking is
> prefixed.

If we want both cross-compiling and Prefix, as this round of patches aims to achieve, there are 3 levels:

1. Cross-compile a Gentoo vanilla from a Prefix (BROOT)

glibc: vanilla
binutils: search for libraries from ESYSROOT=BROOT/usr/CHOST

2. Cross-compile a Prefix (EPREFIX) from a vanilla

glibc: prefixed with EPREFIX
binutils: search for libraries from ESYSROOT=/usr/CHOST/EPREFIX, because BROOT=/

3. Cross-compile a Prefix_1 (EPREFIX) from a Prefix_0 (BROOT).

glibc: prefixed with EPREFIX
binutils: link from ESYSROOT=BROOT/usr/CHOST/EPREFIX
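
The three cases above follow one pattern for where binutils should search. A minimal sketch, assuming ESYSROOT is the plain concatenation of BROOT, /usr/CHOST, and EPREFIX as written in the list; the `esysroot` helper and the example CHOST and prefix paths are made up for illustration:

```python
def esysroot(broot: str, chost: str, eprefix: str) -> str:
    """Compose ESYSROOT as BROOT/usr/CHOST/EPREFIX (empty parts are skipped)."""
    parts = [broot.strip("/"), "usr", chost, eprefix.strip("/")]
    return "/" + "/".join(p for p in parts if p)


CHOST = "aarch64-unknown-linux-gnu"  # example target, chosen for illustration

# 1. Vanilla target built from a Prefix: BROOT set, EPREFIX empty.
print(esysroot("/opt/gentoo", CHOST, ""))
# 2. Prefix target built from a vanilla host: BROOT is /.
print(esysroot("/", CHOST, "/home/user/gentoo"))
# 3. Prefix target built from another Prefix.
print(esysroot("/opt/gentoo", CHOST, "/home/user/gentoo"))
```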


If ld.so.conf is not prefixed, and glibc is hacked to automatically inject EPREFIX during runtime loading, the solution would be clean. But we lose the ability to load 3rd party libraries outside the EPREFIX. One way to fix this is to introduce a special grammar (like binutils) of "=/usr/" in ld.so.conf to mean "Prefix me!".
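
The proposed "=" marker could work roughly as follows. This is only a sketch of the idea: glibc does not define such a syntax for ld.so.conf today, and `resolve_entry` is a hypothetical name.

```python
def resolve_entry(entry: str, eprefix: str) -> str:
    """Expand a leading '=' to EPREFIX; leave every other entry untouched."""
    if entry.startswith("="):
        return eprefix.rstrip("/") + entry[1:]
    return entry


EPREFIX = "/opt/gentoo"
# The second entry stands for a 3rd party library directory outside the prefix.
entries = ["=/usr/lib64", "/opt/vendor/lib"]
print([resolve_entry(e, EPREFIX) for e in entries])
```

Entries without the marker stay absolute, which is what would keep libraries outside the EPREFIX loadable.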

Another way is to decouple the file ld.so.conf into runtime and build-time (automatically generated from the former) versions for glibc and binutils separately.
Comment 13 Benda Xu gentoo-dev 2023-02-02 06:14:15 UTC
Continuing down the path of decoupling ld.so.conf: it needs the least upstream change, since in Gentoo ld.so.conf is completely controlled by env-update and a few eselect modules. The development can happen entirely within Gentoo.
Comment 14 sinxccc 2023-02-02 14:33:10 UTC
Hi, I would like to see if there is any workaround for this issue to finish a bootstrap of Prefix on linux while it is open? Thanks!
Comment 15 Yiyang Wu 2023-02-02 14:46:43 UTC
(In reply to sinxccc from comment #14)
> Hi, I would like to see if there is any workaround for this issue to finish
> a bootstrap of Prefix on linux while it is open? Thanks!

Try 
```bash
mkdir -p "${EPREFIX}${EPREFIX%/*}"
ln -sn "${EPREFIX}" "${EPREFIX}${EPREFIX}"
```
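
To see why the symlink helps, here is a small sketch that replays the two commands in a throwaway directory and checks that a double-prefixed path then resolves. The temporary directory stands in for a real prefix, and `double_prefix_resolves` is a hypothetical helper; nothing here touches an actual installation.

```python
import os
import tempfile


def double_prefix_resolves(eprefix: str) -> bool:
    """Create the ${EPREFIX}${EPREFIX} symlink and test a double-prefixed path."""
    # Equivalent of: mkdir -p "${EPREFIX}${EPREFIX%/*}"
    os.makedirs(eprefix + os.path.dirname(eprefix), exist_ok=True)
    # Equivalent of: ln -sn "${EPREFIX}" "${EPREFIX}${EPREFIX}"
    os.symlink(eprefix, eprefix + eprefix)
    return os.path.exists(eprefix + eprefix + "/etc/ld.so.conf")


with tempfile.TemporaryDirectory() as tmp:
    eprefix = os.path.realpath(tmp)  # throwaway stand-in for a real prefix
    os.makedirs(os.path.join(eprefix, "etc"))
    with open(os.path.join(eprefix, "etc", "ld.so.conf"), "w") as conf:
        conf.write("/usr/lib64\n")
    print(double_prefix_resolves(eprefix))
```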
Comment 16 James Le Cuirot gentoo-dev 2023-02-03 23:26:47 UTC
(In reply to Benda Xu from comment #12)
>
> If ld.so.conf is not prefixed, and glibc is hacked to automatically inject
> EPREFIX during runtime loading, the solution would be clean.  But we lose
> the ability to load 3rd party libraries outside the EPREFIX.  One way to fix
> this is to introduce a special grammar (like binutils) of "=/usr/" in
> ld.so.conf to mean "Prefix me!".

I had this idea too. I know it's a bit more work, but it's my preferred solution. I'll see if I can knock something up over the weekend. I believe such changes would be upstreamable. It's not like using ld.so.conf for both runtime and build-time is a Gentoo-specific thing.

> Another way is to decouple the file ld.so.conf into runtime and build-time
> (automatically generated from the former) versions for glibc and binutils
> separately.

I think that would still need some patching. How else would we tell ld to look in a different place?
Comment 17 Benda Xu gentoo-dev 2023-02-04 03:45:03 UTC
(In reply to James Le Cuirot from comment #16)
> (In reply to Benda Xu from comment #12)
> >
> > If ld.so.conf is not prefixed, and glibc is hacked to automatically inject
> > EPREFIX during runtime loading, the solution would be clean.  But we lose
> > the ability to load 3rd party libraries outside the EPREFIX.  One way to fix
> > this is to introduce a special grammar (like binutils) of "=/usr/" in
> > ld.so.conf to mean "Prefix me!".
> 
> I had this idea too. I know it's a bit more work, but it's my preferred
> solution. I'll see if I can knock something up over the weekend. I believe
> such changes would be upstreamable. It's not like using ld.so.conf for both
> runtime and build-time is a Gentoo-specific thing.

I can imagine that is an "allow glibc to run on CBUILD just like CHOST" feature.  That will be far more overreaching to allow glibc to be relocatable.

> > Another way is to decouple the file ld.so.conf into runtime and build-time
> > (automatically generated from the former) versions for glibc and binutils
> > separately.
> 
> I think that would still need some patching. How else would we tell ld to
> look in a different place?

It (surprisingly) has long been a feature of ld since 2006 [1]. Now it is handled by ld/ldelf.c[2].

Universally, ld first looks for /usr/etc/ld.so.conf and falls back to /etc/ld.so.conf.  So writing a separate /usr/etc/ (could be symlinked from /etc/ld/) will just work.


1. https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=dfcffada0bf3f6dfd1ba336fb1647694c55d4f22
2. https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldelf.c;h=2e27cf48a816dc78bd76d2f0185a601d2edfb392;hb=ef8f08ca13f6c111cc549a3e13be5c5e2d95ca82#l910
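
The lookup order described above can be sketched as follows. This models only the order stated in this comment (the /usr/etc copy first, then /etc), not the actual code in ld/ldelf.c; `pick_ld_so_conf` and its `exists` parameter are hypothetical:

```python
from typing import Callable, Optional


def pick_ld_so_conf(exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first ld.so.conf candidate that exists, in the stated order."""
    for candidate in ("/usr/etc/ld.so.conf", "/etc/ld.so.conf"):
        if exists(candidate):
            return candidate
    return None


# Only the runtime file exists: the lookup falls back to it.
print(pick_ld_so_conf({"/etc/ld.so.conf"}.__contains__))
# A separate build-time file exists too: it takes precedence.
print(pick_ld_so_conf({"/usr/etc/ld.so.conf", "/etc/ld.so.conf"}.__contains__))
```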
Comment 18 James Le Cuirot gentoo-dev 2023-02-04 09:27:47 UTC
(In reply to Benda Xu from comment #17)
> 
> I can imagine that is an "allow glibc to run on CBUILD just like CHOST"
> feature.  That will be far more overreaching to allow glibc to be
> relocatable.

Yes, I think I see your point. Although you can use ld.so.conf for both simultaneously, that's probably not upstream's intention.

> It (surprisingly) has long been a feature of ld since 2006 [1]. Now it is
> handled by ld/ldelf.c[2].
> 
> Universally, ld first looks for /usr/etc/ld.so.conf and falls back to
> /etc/ld.so.conf.  So writing a separate /usr/etc/ (could be symlinked from
> /etc/ld/) will just work.

I had noticed that, but /usr/etc seems so weird, I thought it was a bug more than a feature. XD I guess we could go this way if you're happy with it. Maybe it would only need to be applied to prefix.
Comment 19 James Le Cuirot gentoo-dev 2023-02-04 18:08:27 UTC
(In reply to Benda Xu from comment #13)
> Continuing down the path of decoupling ld.so.conf, it needs the least
> upstream change: in Gentoo ld.so.conf is completely controlled by env-update
> and (few) eselect.  The development can happen entirely in Gentoo ourselves.

For the record, /etc/ld.so.conf is written by env-update, but /etc/ld.so.conf.d/05gcc-${CHOST}.conf is written by gcc-config. The other files in there typically come from eselect. Unfortunately, I think we'll have to fix this on a case by case basis, although such files are rare. Anyone who manually adds custom entries there is probably playing with fire anyway.
Comment 20 James Le Cuirot gentoo-dev 2023-02-05 23:23:07 UTC
Progress report.

Before rushing in and fixing this, I wanted to check how it would (or wouldn't) affect musl and other linkers. musl doesn't use ld.so.conf at all, but it would still be used by bfd when building. Other linkers, such as lld and mold, do not use ld.so.conf either. I haven't checked gold yet, but I suspect it shares code with bfd.

It's interesting that other linkers do not use ld.so.conf. This means that eselect-blas does not change which blas library these linkers use at build time, only which is used at runtime. The linkers would always use sci-libs/lapack.
Comment 21 Yiyang Wu 2023-02-06 02:37:30 UTC
(In reply to James Le Cuirot from comment #20)
> Progress report.
> 
> Before rushing in and fixing this, I wanted to check how it would (or
> wouldn't) affect musl and other linkers. musl doesn't use ld.so.conf at all,
> but it would still be used by bfd when building. Other linkers, such as lld
> and mold, do not use ld.so.conf either. I haven't checked gold yet, but I
> suspect it shares code with bfd.

(In reply to Yiyang Wu from comment #0)
> Using mold, lld, gold there's no issue.

In my test gold works normally.

> 
> It's interesting that other linkers do not use ld.so.conf. This means that
> eselect-blas does not change which blas library these linkers use at build
> time, only which is used at runtime. The linkers would always use
> sci-libs/lapack.

In my case `gcc -fuse-ld={mold,gold,lld} -lcblas` links to the correct cblas implementation.
Comment 22 James Le Cuirot gentoo-dev 2023-02-06 23:04:31 UTC
(In reply to Yiyang Wu from comment #21)
>
> In my test gold works normally.

You're right, I can confirm that gold works. Strange that it behaves differently, despite being part of binutils.

> > It's interesting that other linkers do not use ld.so.conf. This means that
> > eselect-blas does not change which blas library these linkers use at build
> > time, only which is used at runtime. The linkers would always use
> > sci-libs/lapack.
> 
> In my case `gcc -fuse-ld={mold,gold,lld} -lcblas` links to the correct cblas
> implementation.

Are you sure about that? ldd will show the right one, as chosen at runtime, but my own testing shows that it always uses /usr/lib/libcblas.so at build time.

$ strace -f -e trace=file gcc -fuse-ld=bfd foo.c -lcblas -o blas 2>&1 | egrep "libc?blas"
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/libcblas.so", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/libcblas.a", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so", O_RDONLY) = 8
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so", O_RDONLY) = 9
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/libblas.so.3", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 21099] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/32/libblas.so.3", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 21099] openat(AT_FDCWD, "/usr/lib64/blas/blis/libblas.so.3", O_RDONLY) = 20

$ ldd blas | fgrep blas
        libcblas.so.3 => /usr/lib64/blas/blis/libcblas.so.3 (0x00007ff3523df000)

$ strace -f -e trace=file gcc -fuse-ld=lld foo.c -lcblas -o blas 2>&1 | egrep "libc?blas"
[pid 21114] access("/usr/lib/gcc/x86_64-pc-linux-gnu/12/libcblas.so", F_OK) = -1 ENOENT (No such file or directory)
[pid 21114] access("/usr/lib/gcc/x86_64-pc-linux-gnu/12/libcblas.a", F_OK) = -1 ENOENT (No such file or directory)
[pid 21114] access("/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so", F_OK) = 0
[pid 21114] openat(AT_FDCWD, "/usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../lib64/libcblas.so", O_RDONLY|O_CLOEXEC) = 3

$ ldd blas | fgrep blas                                                                  
        libcblas.so.3 => /usr/lib64/blas/blis/libcblas.so.3 (0x00007fdb702f5000)
Comment 23 James Le Cuirot gentoo-dev 2023-02-06 23:11:19 UTC
Just to add to that point, if I uninstall lapack, but keep blis, this happens:

$ gcc -I/usr/include/blis -fuse-ld=lld foo.c -lcblas -o blas
ld.lld: error: unable to find library -lcblas
collect2: error: ld returned 1 exit status

I now realise that eselect-blas is only supposed to affect runtime rather than build time though. That's why sci-libs/lapack appears twice in the dependencies of virtual/blas, the first being unconditional.
Comment 24 Yiyang Wu 2023-02-07 03:11:38 UTC
(In reply to James Le Cuirot from comment #22)
> (In reply to Yiyang Wu from comment #21)
> > In my case `gcc -fuse-ld={mold,gold,lld} -lcblas` links to the correct cblas
> > implementation.
> 
> Are you sure about that? ldd will show the right one, as chosen at runtime,
> but my own testing shows that it always uses /usr/lib/libcblas.so at build
> time.

You're right, ldd shows the right one, but at build time the other linkers do not look at the desired libcblas.
Comment 25 Larry the Git Cow gentoo-dev 2023-02-11 22:28:12 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=80d8703c52f23ca672a0e690f9daa4aff6520ee1

commit 80d8703c52f23ca672a0e690f9daa4aff6520ee1
Author:     James Le Cuirot <chewi@gentoo.org>
AuthorDate: 2023-02-09 23:04:15 +0000
Commit:     James Le Cuirot <chewi@gentoo.org>
CommitDate: 2023-02-11 22:27:24 +0000

    profiles: Don't prefixify /etc/ld.so.conf path in binutils
    
    Now that the compiler's sysroot is being respected, prefixifying the
    path to /etc/ld.so.conf results in a double prefix.
    
    Bug: https://bugs.gentoo.org/892549
    Signed-off-by: James Le Cuirot <chewi@gentoo.org>

 profiles/features/prefix/standalone/profile.bashrc | 10 ----------
 1 file changed, 10 deletions(-)
Comment 26 James Le Cuirot gentoo-dev 2023-02-11 22:30:18 UTC
Note that the above does not fix the issue. It only deals with part of it, although we may disable the bfd ld.so.conf feature altogether in the end. See https://github.com/gentoo/binutils-gdb/pull/4 for the real fix, which we can hopefully land soon.
Comment 27 Benda Xu gentoo-dev 2023-02-12 01:12:34 UTC
(In reply to James Le Cuirot from comment #23)
> Just to add to that point, if I uninstall lapack, but keep blis, this
> happens:
> 
> $ gcc -I/usr/include/blis -fuse-ld=lld foo.c -lcblas -o blas
> ld.lld: error: unable to find library -lcblas
> collect2: error: ld returned 1 exit status
> 
> I now realise that eselect-blas is only supposed to affect runtime rather
> than build time though. That's why sci-libs/lapack appears twice in the
> dependencies of virtual/blas, the first being unconditional.

Exactly.  The ABIs are interchangeable.
Comment 28 Larry the Git Cow gentoo-dev 2023-02-22 22:26:33 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=07d598347c2a311c91eacd4303e0517cf0a127c3

commit 07d598347c2a311c91eacd4303e0517cf0a127c3
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-02-22 22:22:46 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-02-22 22:26:22 +0000

    sys-devel/binutils: apply linker search path fixes from Chewi for prefix
    
    Quoting Chewi on the PR for posterity:
    """
    The first of these changes fixes two related issues with prefixed and crossdev environments.
    The prefix issue is detailed in Gentoo bug #892549. The crossdev issue can be reproduced by trying something like:
    
    USE="-python icu" aarch64-unknown-linux-gnu-emerge libxml2
    
    The second of these changes is not essential, but it does make bfd's behaviour
    in this area more consistent with the other linkers, which have not experienced these issues at all.
    
    I'm not sure what upstream will make of these changes, particularly the second one,
    but it is interesting that even gold does not behave the same way as bfd here.
    
    Perhaps we can give them some exposure in Gentoo for a while before seeing what they think.
    The second change would not be submitted upstream as-is because fully removing the ld.so.conf feature is a much bigger diff.
    """
    
    This patch is, for now, only applied for prefix. It should be safe
    on other systems but the issue is more pressing on prefix given a recent
    migration.
    
    Bug: https://bugs.gentoo.org/892549
    Thanks-to: James Le Cuirot <chewi@gentoo.org>
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-devel/binutils/binutils-2.40-r2.ebuild         | 503 +++++++++++++++++++++
 sys-devel/binutils/binutils-9999.ebuild            |  12 +-
 .../files/binutils-2.40-linker-search-path.patch   |  74 +++
 3 files changed, 584 insertions(+), 5 deletions(-)
Comment 29 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-02-22 22:27:26 UTC
If you're hitting this issue, please try emerge --sync in a few hours, then emerge -v1 sys-devel/binutils.
Comment 30 James Le Cuirot gentoo-dev 2023-02-22 22:41:03 UTC
Thanks, Sam, this had been weighing on my mind. If we get good feedback here, I'll update the news item.
Comment 31 Yiyang Wu 2023-02-23 04:10:13 UTC
(In reply to Sam James from comment #29)
> If you're hitting this issue, please try emerge --sync in a few hours, then
> emerge -v1 sys-devel/binutils.

And don't forget to run `eselect binutils set x86_64-pc-linux-gnu-2.40` if the default is still set to an older version (as in one of my prefixes). I got stuck for quite a while :( before realizing I was still using binutils 2.39.
Comment 32 Yiyang Wu 2023-02-23 04:18:17 UTC
Now with 07d598347c2a311c91eacd4303e0517cf0a127c3, ld.bfd is able to link my example without needing the double-prefix symlink hack.

But I think this issue also affects gdb, which is part of the binutils-gdb project.

Using sys-devel/gdb-13.1::gentoo, without the double-prefix symlink hack, I ran:

```
gcc -O0 -ggdb foo.c -o foo -lcblas
gdb foo
GNU gdb (Gentoo 13.1 vanilla) 13.1
.......
(gdb) run
Starting program: /tmp/foo 
warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.
warning: Could not load shared library symbols for 9 libraries, e.g. /opt/gentoo/usr/lib64/blas/openblas/libcblas.so.3.
Use the "info sharedlibrary" command to see the complete listing.
Do you need "set solib-search-path" or "set sysroot"?
```

I need to use either the double-prefix symlink hack or `set sysroot /` in gdb to work around this issue.
Comment 33 Larry the Git Cow gentoo-dev 2023-03-01 00:52:10 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/proj/prefix.git/commit/?id=f4bae8f7128a0a7977d4cf765f21301a2275f32e

commit f4bae8f7128a0a7977d4cf765f21301a2275f32e
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-03-01 00:51:07 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-03-01 00:51:07 +0000

    sys-devel/binutils: add 2.40(-r2)
    
    Bug: https://bugs.gentoo.org/895240
    Bug: https://bugs.gentoo.org/892549
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-devel/binutils/Manifest                        |   2 +
 sys-devel/binutils/binutils-2.40-r2.ebuild         | 509 +++++++++++++++++++++
 .../files/binutils-2.40-linker-search-path.patch   |  74 +++
 3 files changed, 585 insertions(+)
Comment 34 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-03-01 00:54:33 UTC
Let's deal with the gdb issue in bug 896008.
Comment 35 Yiyang Wu 2023-07-01 04:54:17 UTC
This change has introduced another issue. I attached a tarball containing files to reproduce it. Running make with the default binutils (using the bfd linker) results in:

warning: libfoo.so, needed by libbar/libbar.so, not found (try using -rpath or -rpath-link)
libbar/libbar.so: undefined reference to `funca(int)'

Specifying CXXFLAGS=-fuse-ld=gold avoids the problem.

So, when exe uses symbols from libbar.so but does not depend on libfoo.so, and libbar.so depends on libfoo.so, ld.bfd tries to find libfoo.so when linking exe. But if libfoo.so is not in a standard path and is not explicitly specified with -L, it cannot be found, and ld.bfd throws `libbar.so: undefined reference to <symbol in libfoo used by libbar>` when linking exe.

This issue was very rare in the past, because the path of libfoo could be found in ld.so.conf, in LD_LIBRARY_PATH, or even in the rpath of libbar, so ld.bfd would ultimately find the location of libfoo. But now that we have patched out ldelf_check_ld_so_conf, which is one step of ldelf_handle_dt_needed in ld/ldelf.c, ld.bfd loses the ability to locate libraries listed in ld.so.conf.

I guess the fundamental cause is in ldelf_handle_dt_needed: ld.bfd tracks down the DT_NEEDED entries of all explicitly linked shared objects, like the runtime linker does [1]. This was not an issue before we changed anything, when libfoo.so was in the ld.so.conf search path (because it is needed at runtime). But after the binutils patch [2], problems occur when linking to libraries that are not in standard locations but are listed in ld.so.conf. Other linkers do not have the problem, since they do not track down the DT_NEEDED entries of libbar.so at all.

So we have this remaining issue, and it blocks some packages. For example, some ROCm packages link to libamd_comgr, while libamd_comgr links to libLLVM, which is not in a standard path but in /usr/lib/llvm/SLOT/lib64. So when linking to libamd_comgr, ld cannot find libLLVM and emits undefined references for the symbols in libLLVM used by libamd_comgr.

[1] http://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldelf.c;h=f9a6819366f1ac634103bedd32844ed1868591be;hb=HEAD#l1026
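
The difference in behaviour described above can be modelled in a few lines. This is a toy model, not how any real linker is implemented: the library names come from the attached reproducer, while `missing_libraries`, its parameters, and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List, Set


def missing_libraries(
    direct: List[str],
    needed: Dict[str, List[str]],
    findable: Set[str],
    follow_needed: bool,
) -> List[str]:
    """List libraries the link step looks for but cannot locate."""
    missing, seen, queue = [], set(), list(direct)
    while queue:
        lib = queue.pop(0)
        if lib in seen:
            continue
        seen.add(lib)
        if lib not in findable:
            missing.append(lib)
            continue
        if follow_needed:
            # bfd-like behaviour: also chase the DT_NEEDED entries of each library.
            queue.extend(needed.get(lib, []))
    return missing


needed = {"libbar.so": ["libfoo.so"]}  # libbar.so depends on libfoo.so
findable = {"libbar.so"}               # libfoo.so is not in any searched path

print(missing_libraries(["libbar.so"], needed, findable, follow_needed=True))
print(missing_libraries(["libbar.so"], needed, findable, follow_needed=False))
```

In this model, adding libfoo.so to `findable` (as an ld.so.conf search path effectively did before the patch) makes the bfd-like case succeed as well.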
Comment 36 Benda Xu gentoo-dev 2023-07-01 04:59:49 UTC
Let's reopen this bug until the said regression is understood.
Comment 37 Yiyang Wu 2023-07-01 05:00:02 UTC
(In reply to Yiyang Wu from comment #35)
> but after the binutils patch [2],

[2] https://gitweb.gentoo.org/repo/proj/prefix.git/commit/?id=f4bae8f7128a0a7977d4cf765f21301a2275f32e
Comment 38 Yiyang Wu 2023-07-01 09:08:02 UTC
After some study, my opinion is:

The other linkers do not have to behave exactly the same as the bfd linker [1]. Currently, Gentoo uses the bfd linker to build packages by default, so we need to make sure it is functional.

Previously we decided to patch ld.bfd so that it behaves more like the others. It turns out that if we want to go this way, we have to do more: ld.bfd relies on ld.so.conf to find libraries needed at runtime. If this part is patched out, maybe we also need to patch out the entire logic of checking DT_NEEDED entries in ld.bfd. Otherwise, https://bugs.gentoo.org/892549#c35 occurs.

As I understand it, the RAP migration turns RAP into a cross-compile-for-itself setup, and then ESYSROOT=SYSROOT/EPREFIX becomes double prefixed. Consider the general case of cross building: libraries (in DEPEND) should be found in ESYSROOT, and the output under ESYSROOT should then be copied to EPREFIX on the target system. So if standalone RAP is really cross-compiling itself, then things should be compiled and installed to /prefix/prefix, and then *copied to /prefix on the target system* (which is actually the same machine). From this point of view, the existence of the double-prefix symlink becomes a bit more reasonable: the symlink serves as the "copy" process.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=10238
Comment 39 James Le Cuirot gentoo-dev 2023-07-01 09:16:43 UTC
I haven't had a chance to really look at this yet, but I think the case you describe is just the kind we were trying to fix under prefix. In that case though, it was failing to find libstdc++. In this case, I think it's some other library. This begs the question of how do the other linkers find this library if it's not via /etc/ld.so.conf. I'll have a closer look later.
Comment 40 Yiyang Wu 2023-07-01 09:59:58 UTC
(In reply to James Le Cuirot from comment #39)
> This begs the question of how do the other linkers find this
> library if it's not via /etc/ld.so.conf. I'll have a closer look later.

The other linkers simply don't need to find this library, because the executable does not use any symbol from it (libfoo). It is ld.bfd that does extra work when linking exe against libbar: it searches for the link dependencies not of exe, but of libbar.
Comment 41 Arsen Arsenović gentoo-dev 2023-07-01 18:54:33 UTC
Note that the ld.so.conf reading code is correct and hence should never have been removed. By now I've swapped out the context on what the proper fix was for the problem that led to its removal, so I apologize for the lack of rationale. I seem to remember a --with-sysroot!=/ for something whose sysroot was very much /, though.
Comment 42 James Le Cuirot gentoo-dev 2023-07-01 19:49:23 UTC
(In reply to Yiyang Wu from comment #40)
> Other linkers simply don't need to find this library, because the executable
> does not use any symbol from this library (libfoo). It is ld.bfd doing extra
> work when linking exe against libbar: searching for link dependency not of
> exe, but of libbar.

You're right, I had observed but forgotten that the other linkers do not do this extra work.

A symlink would serve as a workaround, but it seemed like a hack at the time. I'll consider it if we can't find a better way forwards.

We could address this as necessary in any packages, but I get the sense you're wanting this to work in general, not just for packages.

I'll see how feasible it would be to stub out the rest of the behaviour. Having bfd behave even more like the other linkers does not seem like a bad goal.
Comment 43 James Le Cuirot gentoo-dev 2023-07-02 21:49:52 UTC
Progress so far. It was fairly easy to stub out the part where it traverses through sub-dependent libraries, but then it complains of unresolved symbols later. Still investigating.
Comment 44 Benda Xu gentoo-dev 2023-09-24 02:29:01 UTC
Hi James,

This bug is breaking all the default installations of Prefix. I expect that we can fix it in a timely manner.

Yours,
Benda
Comment 45 James Le Cuirot gentoo-dev 2023-09-24 08:43:38 UTC
Sorry, it's been hard to put the time in, although I didn't think it was happening frequently. Which package are you seeing it with? I'll give it another look today.
Comment 46 Yiyang Wu 2023-09-24 09:21:15 UTC
(In reply to James Le Cuirot from comment #45)
> Sorry, it's been hard to put the time in, although I didn't think it was
> happening frequently. Which package are you seeing it with? I'll give it
> another look today.

My observation of the issue is at https://bugs.gentoo.org/892549#c35:

> I attached a tarball containing files for reproducing.

Sorry I forgot the attachment. Will attach it right away.

As far as I know, in ::gentoo, sci-libs/caffe2[cuda] suffers. Any package that links to libamd_comgr provided by dev-libs/rocm-comgr::gentoo suffers as well. I also suppose that any second-order reverse dependency of libraries in non-standard locations like <prefix>/usr/lib64, including llvm, breaks.

> It (surprisingly) has long been a feature of ld since 2006. Now it is handled by ld/ldelf.c.

Also, as I understand it, https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=07d598347c2a311c91eacd4303e0517cf0a127c3 removes the feature, which breaks some odd software that depends on it. It is not a big issue in ::gentoo, I guess, but overlays have such situations, like https://bugs.gentoo.org/553382 or sci-libs/openfoam[1], which I personally maintain. I would say those packages should not rely on the feature, but this is the current status.
Comment 47 Yiyang Wu 2023-09-24 09:21:59 UTC
Created attachment 871235 [details]
bug_MWE
Comment 48 James Le Cuirot gentoo-dev 2023-09-24 21:10:06 UTC
The tarball really helps, thank you. I don't think there are many real instances of this. There aren't that many libraries that get installed to subdirectories, and those that are are almost always direct dependencies. The only ones I could find on my own system involve LLVM.

I did look at this a little today. Unfortunately, it seems I didn't make any notes about this earlier, or if I did, I lost them. I'll continue investigating this avenue some more, but I think I may end up taking a different route like generating a second ld.so.conf that's only used for linking. I need to go back over my IRC logs to remind myself if there were any downsides to that approach.
Comment 49 James Le Cuirot gentoo-dev 2023-09-27 22:07:08 UTC
I'm now leaning towards the /usr/etc/ld.so.conf solution. If it were only the GCC libraries we needed to worry about then we could keep the first of my two binutils patches and just make this file blank. For LLVM to work too, we need to copy the entries from /etc/ld.so.conf sans prefix. That file is written by env-update (part of Portage) so the fix would be implemented there. Before I wrote my binutils patches, I prepared a similar fix for gcc-config, which writes /etc/ld.so.conf.d/05gcc-*.conf, but that probably isn't needed now.
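
As a rough illustration of "copy the entries sans prefix", here is a minimal Python sketch. It is not the Portage implementation; the function name and the example prefix /home/user/gentoo are assumptions made up for this example:

```python
def unprefix_entries(lines, eprefix):
    """Return ld.so.conf lines with a leading EPREFIX stripped from each path.

    Hypothetical helper: lines that do not start with the prefix (for
    example "include" directives) are passed through unchanged.
    """
    eprefix = eprefix.rstrip("/")
    result = []
    for line in lines:
        entry = line.strip()
        if eprefix and (entry == eprefix or entry.startswith(eprefix + "/")):
            # Drop the prefix; a bare prefix collapses to the root directory.
            entry = entry[len(eprefix):] or "/"
        result.append(entry)
    return result


prefixed = [
    "/home/user/gentoo/usr/lib64",
    "/home/user/gentoo/usr/lib/llvm/16/lib64",
]
print(unprefix_entries(prefixed, "/home/user/gentoo"))
```

The prefixed file stays in place for ldconfig and the runtime linker; only the copy intended for bfd would carry the stripped paths.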
Comment 50 James Le Cuirot gentoo-dev 2023-10-01 09:53:01 UTC
I've submitted a change to Portage to make it write ${EPREFIX}/usr/etc/ld.so.conf as part of env-update. This seems to fix the issue, but you need to remove the second of the two binutils fixes first. I'd appreciate it if you could try this out. The file is only written when it thinks it's needed, so you may also need to add a dummy line to ${EPREFIX}/etc/ld.so.conf.
Comment 51 James Le Cuirot gentoo-dev 2023-10-01 09:59:16 UTC
Come to think of it, it would have been really easy to get bfd to just not add the sysroot to these paths. That almost seems like a better idea, but not changing the behaviour wins out in my mind.
Comment 52 Larry the Git Cow gentoo-dev 2023-10-01 14:09:05 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6010348df47c9b5bb8e2f3305b35f82f789aca36

commit 6010348df47c9b5bb8e2f3305b35f82f789aca36
Author:     James Le Cuirot <chewi@gentoo.org>
AuthorDate: 2023-10-01 14:07:46 +0000
Commit:     James Le Cuirot <chewi@gentoo.org>
CommitDate: 2023-10-01 14:08:05 +0000

    sys-devel/binutils: Drop ld.so.conf prefix patch and enable -L patch for cross
    
    The ld.so.conf prefix patch didn't work in all the cases we needed it to. We'll
    fix the issue with /usr/etc/ld.so.conf via env-update instead.
    
    The -L patch was previously only applied to prefixed systems, but it's needed to
    fix crossdev environments too. We should probably just take it into the general
    patchset.
    
    Bug: https://bugs.gentoo.org/892549
    Signed-off-by: James Le Cuirot <chewi@gentoo.org>

 ...tils-2.40-r8.ebuild => binutils-2.40-r9.ebuild} |  4 ++-
 ...tils-2.41-r1.ebuild => binutils-2.41-r2.ebuild} |  4 ++-
 sys-devel/binutils/binutils-9999.ebuild            |  4 ++-
 .../files/binutils-2.40-linker-search-path.patch   | 36 ----------------------
 4 files changed, 9 insertions(+), 39 deletions(-)
Comment 53 Larry the Git Cow gentoo-dev 2023-10-02 21:40:18 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=8008e209d900dc988217ce3721292ba895cd0494

commit 8008e209d900dc988217ce3721292ba895cd0494
Author:     James Le Cuirot <chewi@gentoo.org>
AuthorDate: 2023-10-01 09:32:33 +0000
Commit:     James Le Cuirot <chewi@gentoo.org>
CommitDate: 2023-10-02 21:38:18 +0000

    env-update: Write /usr/etc/ld.so.conf to fix bfd in some obscure cases
    
    This is only needed on prefixed systems. bfd currently reads
    ${EPREFIX}/etc/ld.so.conf and adds the prefix to these paths, but these
    paths are already prefixed. We need them to stay prefixed for ldconfig
    and the runtime linker. bfd will use ${EPREFIX}/usr/etc/ld.so.conf
    instead if that is present, so we can write the unprefixed paths there.
    
    Other linkers do not use these files at all. We tried to patch bfd to
    not use them either, as it shouldn't really be necessary, but that
    broke some cases, so we are trying this safer approach instead.
    
    env-update does not write the files under /etc/ld.so.conf.d, but we
    shouldn't need to handle these in any case, as all known instances are
    not affected by this issue.
    
    Bug: https://bugs.gentoo.org/892549
    Closes: https://github.com/gentoo/portage/pull/1105
    Signed-off-by: James Le Cuirot <chewi@gentoo.org>

 NEWS                           |  3 +++
 lib/portage/util/env_update.py | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)
Comment 54 James Le Cuirot gentoo-dev 2023-10-04 20:06:45 UTC
The Portage change has already gone into 3.0.52, so I think we can consider this fixed now.
Comment 55 Benda Xu gentoo-dev 2023-11-20 04:41:42 UTC
Thank you very much, James :)
Comment 56 Yiyang Wu 2023-12-11 09:49:15 UTC
One question though:

You don't receive that fix unless LDPATH gets changed:

The block for generating /usr/etc/ld.so.conf

```
+        if eprefix:
+            # ldconfig needs ld.so.conf paths to be prefixed, but the bfd linker
+            # needs them unprefixed, so write an alternative ld.so.conf file for
+            # the latter. Other linkers do not use these files. See ldelf.c in
+            # binutils for precise bfd behavior, as well as bug #892549.
+            ldsoconf_path = os.path.join(eroot, "usr", "etc", "ld.so.conf")
```

is under this if clause:

```
if oldld != newld:
```
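
A condensed Python sketch of that control flow, to show why an unchanged LDPATH skips the new file. This is a simplification based on the two excerpts above, not the real env_update.py; the function name and the returned file list are hypothetical:

```python
def files_rewritten(oldld, newld, eprefix):
    """Return the ld.so.conf files that would be rewritten (simplified model)."""
    written = []
    if oldld != newld:
        # The regular file is only rewritten when the path list changed ...
        written.append("etc/ld.so.conf")
        if eprefix:
            # ... and the unprefixed copy for bfd sits inside the same branch.
            written.append("usr/etc/ld.so.conf")
    return written


paths = ["/home/user/gentoo/usr/lib64"]
print(files_rewritten(paths, paths, "/home/user/gentoo"))
print(files_rewritten(paths, paths + ["/home/user/gentoo/opt/lib"], "/home/user/gentoo"))
```

With identical old and new lists nothing is written at all, so an existing prefix only gets usr/etc/ld.so.conf after something changes LDPATH.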
Comment 57 Yiyang Wu 2023-12-11 09:51:26 UTC
(In reply to Yiyang Wu from comment #56)
> One question though:
> 
> You don't receive that fix unless LDPATH gets changed:
> 
> is under this if clause:
> 
> ```
> if oldld != newld:
> ```

Maybe we should update the NEWS to tell users that they have to manually modify LDPATH and trigger an env-update?
Comment 58 James Le Cuirot gentoo-dev 2023-12-17 21:43:51 UTC
Could have sworn I'd replied to this. I don't recall what I thought at the time, but my thinking now is that the latest issue and fix is too obscure to warrant a news item. You just seem very good at hitting these issues.
Comment 59 Yiyang Wu 2023-12-18 03:29:09 UTC
(In reply to James Le Cuirot from comment #58)
> Could have sworn I'd replied to this. I don't recall what I thought at the
> time, but my thinking now is that the latest issue and fix is too obscure to
> warrant a news item.

Sure. I have to thank you a lot for fixing all these issues! And hopefully people encountering the same issue can find and read through this bug ticket and get the solution.

> You just seem very good at hitting these issues.

No, it's just that I'm "lucky enough" to encounter these corner cases.
Comment 60 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-12-18 03:37:01 UTC
We could add a minor note in the portage NEWS file (in portage.git).
Comment 61 Yiyang Wu 2023-12-20 09:36:03 UTC
Sorry, I have to report yet another issue, which was discovered previously at https://bugs.gentoo.org/892549#c3 but might have been forgotten:

Currently, ld.bfd (sys-devel/binutils-2.41-r2) is actually reading "${EPREFIX}/${EPREFIX}/usr/etc/ld.so.conf".

I did a GDB trace, and it seems to be caused by ld/ldelf.c[1], already mentioned in https://bugs.gentoo.org/892549#c17. In that line of code, ld_sysroot is "${EPREFIX}" and prefix is "${EPREFIX}/usr", thus giving "${EPREFIX}/${EPREFIX}/usr/etc/ld.so.conf".

Sorry for rediscovering this so late. I forgot to remove the double-prefix link hack, so this bug did not reveal itself to me until now.

1. https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=ld/ldelf.c;h=2e27cf48a816dc78bd76d2f0185a601d2edfb392;hb=ef8f08ca13f6c111cc549a3e13be5c5e2d95ca82#l910
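
The path composition can be reproduced with a tiny Python sketch, assuming the file name is formed by plain string concatenation of ld_sysroot, prefix and "/etc/ld.so.conf" as described above. The function name and the example prefix /home/user/gentoo are made up for illustration:

```python
def ldsoconf_candidate(ld_sysroot, prefix):
    """Join sysroot, configure-time prefix and the fixed suffix (assumed behaviour)."""
    return ld_sysroot + prefix + "/etc/ld.so.conf"


eprefix = "/home/user/gentoo"  # hypothetical EPREFIX
# Both values already contain EPREFIX, so it shows up twice in the result.
print(ldsoconf_candidate(eprefix, eprefix + "/usr"))
```

The result only exists on disk while the double-prefix symlink hack is still in place, which is why the problem stayed hidden.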
Comment 62 James Le Cuirot gentoo-dev 2023-12-23 10:53:56 UTC
(In reply to Yiyang Wu from comment #61)
> Currently (sys-devel/binutils-2.41-r2) are reading
> "${EPREFIX}/${EPREFIX}/usr/etc/ld.so.conf", actually. 

I hate to admit it, but I think you're right. I don't know how I missed this before, maybe I only tested cross-compiling. What a pain. I'll have a look.
Comment 63 James Le Cuirot gentoo-dev 2023-12-23 12:19:59 UTC
I've had a think. Finding $prefix/etc/ld.so.conf is literally the only thing this is used for. Although "prefix" is a variable here, it is effectively hardcoded at build time via configure, genscripts.sh, and elf.em. For a Gentoo build, it will only ever be ${EPREFIX}/usr. Including ${EPREFIX} is unhelpful, not just because of the double prefix, but also because it means you cannot use this linker against some other prefix. Better to rely on the sysroot, which is dynamic, as we have been trying to do. I think it would therefore make sense to hardcode $prefix to /usr in elf.em. I'll give this a try.
Comment 64 James Le Cuirot gentoo-dev 2023-12-23 13:17:09 UTC
I must remember to consider prefix-guest here though. That uses the host's libc and therefore does not set a sysroot. We don't want to use the host's other libraries, so we don't want to load the host's /usr/etc/ld.so.conf, and we should still include ${EPREFIX} in this case.
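
To spell out the two cases, here is a minimal Python sketch of the decision as described in the last two comments. It is only a reading of the idea, not the content of the eventual patch; the function names and the example prefix are hypothetical:

```python
def effective_prefix(ld_sysroot, eprefix):
    """Pick the prefix used to look up etc/ld.so.conf (assumed logic)."""
    if ld_sysroot:
        # A sysroot is set (RAP or cross): rely on it and use plain /usr.
        return "/usr"
    # No sysroot (prefix-guest): keep EPREFIX so the host's file is not read.
    return eprefix + "/usr"


def ldsoconf_path(ld_sysroot, eprefix):
    return ld_sysroot + effective_prefix(ld_sysroot, eprefix) + "/etc/ld.so.conf"


eprefix = "/home/user/gentoo"  # hypothetical EPREFIX
print(ldsoconf_path(eprefix, eprefix))  # RAP: sysroot equals EPREFIX
print(ldsoconf_path("", eprefix))       # prefix-guest: no sysroot
```

Under these assumptions both cases resolve to the same single-prefixed file, and a linker pointed at a different sysroot would follow that sysroot instead.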
Comment 65 James Le Cuirot gentoo-dev 2023-12-23 14:56:01 UTC
Please try out https://github.com/gentoo/gentoo/pull/34446.
Comment 66 Yiyang Wu 2023-12-23 16:24:34 UTC
(In reply to James Le Cuirot from comment #65)
> Please try out https://github.com/gentoo/gentoo/pull/34446.

Confirmed this fixes the double-prefix issue for finding <prefix>/usr/etc/ld.so.conf. Thank you very much for the careful analysis and quick response!
Comment 67 Larry the Git Cow gentoo-dev 2024-01-06 11:48:21 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5d9341ed5b240e838abea81a582717aa92381dc6

commit 5d9341ed5b240e838abea81a582717aa92381dc6
Author:     James Le Cuirot <chewi@gentoo.org>
AuthorDate: 2023-12-23 14:50:47 +0000
Commit:     James Le Cuirot <chewi@gentoo.org>
CommitDate: 2024-01-06 11:47:39 +0000

    sys-devel/binutils: Add conditional patch to fix ld.bfd prefix handling
    
    As before, this may make it into our patchset once it's been proven to work. Our
    track record here hasn't been great so far!
    
    Closes: https://bugs.gentoo.org/892549
    Closes: https://github.com/gentoo/gentoo/pull/34446
    Bug: https://github.com/gentoo/binutils-gdb/pull/5
    Signed-off-by: James Le Cuirot <chewi@gentoo.org>

 sys-devel/binutils/binutils-2.41-r4.ebuild         | 534 +++++++++++++++++++++
 sys-devel/binutils/binutils-9999.ebuild            |   9 +-
 .../files/binutils-2.41-linker-prefix.patch        |  56 +++
 3 files changed, 597 insertions(+), 2 deletions(-)