Bug 4411 - sys-devel/gcc uses static libs in /usr/lib before it will use a dynamic lib in /lib
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High blocker
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Duplicates: 9389 9502 9588 9685
Depends on:
Blocks:
 
Reported: 2002-07-01 18:39 UTC by Daniel Robbins (RETIRED)
Modified: 2015-12-08 21:06 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
More appropriate patch to the specs file (gcc-3.1.diff,785 bytes, patch)
2002-07-01 19:41 UTC, Martin Schlemmer (RETIRED)
Sample C file that breaks dlopen() (breakDlopen.c,425 bytes, text/plain)
2009-10-06 10:23 UTC, Ivan

Description Daniel Robbins (RETIRED) gentoo-dev 2002-07-01 18:39:10 UTC
Using gcc-3.1-r7, everything links to libncurses.a and libreadline.a (in
/usr/lib) rather than using the shared libraries in /lib.  This means that
/usr/bin/less gets static ncurses, doubling its size.  The quick hack solution
to this problem is to move lib{ncurses|readline}.a to /tmp when compiling things
like less and bc.

The intermediate solution is to edit
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.1/specs and apply the following fix:

under "link:", just before "%{rdynamic:-export-dynamic}", add "-L/lib"

This adds a default -L/lib option to all linking calls, forcing /lib to be
searched first.  This solution could work.  But a better solution may be to find
out why gcc-3.1-r7 (and probably the other gcc3 ebuilds) do this and patch the
gcc sources themselves.
Comment 1 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-01 19:41:06 UTC
Created attachment 1909 [details, diff]
More appropriate patch to the specs file

This one should be more appropriate, as it only tells the linker to use
"-L/lib" if we are linking dynamically.
Comment 2 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-01 19:45:44 UTC
I also think I have solved the .la problem (bug #4351), but more on that later
on.  I'll test this fix with a bootstrap && emerge system && emerge gnome to
check that it is working fine.  If so, I'll do a masked -r8 of gcc-3.1.

Anyhow, off to bed now, as it is late.
Comment 3 Daniel Robbins (RETIRED) gentoo-dev 2002-07-02 16:59:16 UTC
Note that using the -L/lib hack I initially mentioned (the first hack from
Azarah), allows bc-1.06-r3 to build correctly.
Comment 4 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-03 17:18:11 UTC
OK, -r8 is up for grabs (masked).  I am busy with a bootstrap, and will
subsequently test gnome2, and maybe kde (if i can get it downloaded at work).
Comment 5 Brandon Low (RETIRED) gentoo-dev 2002-07-05 00:52:53 UTC
this is azarah's, he has a fix in portage gcc-3.1-r8, test it...
Comment 6 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-06 10:34:55 UTC
Ok, all the previous "fixes" caused ncurses, among other things, not to link
properly if an older version was already installed.  The new sed rule is:

# sed -e "s:%{L\*} %(link_libgcc):%{L\*} -L/lib %(link_libgcc):" specs

and it tests fine so far.  Anybody who merged -r8, please update CVS and remerge.
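
To see what that substitution does, here is a minimal sketch that runs the same sed expression on a made-up line. The sample string is only an assumption for illustration; the real "link" spec in gcc-3.1 is longer and differs in detail.

```sh
# Hypothetical fragment of a specs "link" line; the real gcc-3.1 specs differ.
sample='%{static:-static} %{L*} %(link_libgcc) %o'

# Same substitution as the rule above, fed through a pipe instead of
# being run against the specs file.
patched=$(printf '%s\n' "$sample" |
  sed -e "s:%{L\*} %(link_libgcc):%{L\*} -L/lib %(link_libgcc):")

printf '%s\n' "$patched"
```

The "-L/lib" lands after the user's own -L options (%{L*}), so user-supplied directories are still searched before /lib.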

Comment 7 Brandon Low (RETIRED) gentoo-dev 2002-07-14 17:54:36 UTC
Azarah, gcc-3.1-r8 kills gnome-vfs, I'm working on hacking it up, but if you
have thoughts I'd love to hear them
Comment 8 Brandon Low (RETIRED) gentoo-dev 2002-07-14 18:48:17 UTC
gnome-vfs and kbear... possibly others appear to have the location
/usr/lib/libstdc++.la hardcoded somehow, still investigating.
Comment 9 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-14 19:23:09 UTC
Version 1.0.5 and version 2.0 of gnome-vfs work just fine here.  Check that
you do not have stray .la files from gcc-3.1 in /usr/lib.

Not sure about kbear though.
Comment 10 Brandon Low (RETIRED) gentoo-dev 2002-07-14 23:25:34 UTC
hrm... I don't have any extra .la files...

this is c00k00
Comment 11 Brandon Low (RETIRED) gentoo-dev 2002-07-23 13:05:04 UTC
Martin, 

  Daniel brought to my attention a problem with the way that gcc-3.1-r8 solves
this problem.  There are times in large packages when libtool needs to link a
newly compiled library against another library just compiled in the same
package.  In this situation, the fix that you have, with sed adding -L /lib to
the libtool script, ends up breaking: it grabs the old library in /lib before
it grabs the new library in the source directory.  We probably need to actually
change the sources of gcc so that its default search order is changed.
Unfortunately, I'm going to be flakier than a french pastry for the next month
or so and can't be counted on to work on this issue, so I'm re-assigning the
bug to you (since you had already been doing the work on it anyway).

thanks...

--Brandon
Comment 12 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-23 14:11:44 UTC
Are there examples?

Btw .. I did check out the sources .... the lib order seems fine :(

Below is sort of where I reversed it (incorrectly) to try and see if it would
make a difference (just so that anybody who wants to try it before me will not
have to figure out where to look).  The prefixes are defined at the top
of ${S}/gcc/gcc.c ....

--------------------snip-----------------------------------------
--- gcc/gcc.c.orig      Wed Jul  3 22:02:54 2002
+++ gcc/gcc.c   Wed Jul  3 22:03:12 2002
@@ -5922,10 +5922,10 @@
                      NULL, PREFIX_PRIORITY_LAST, 0, NULL);
        }
 
-      add_prefix (&startfile_prefixes, standard_startfile_prefix_1,
-                 "BINUTILS", PREFIX_PRIORITY_LAST, 0, NULL);
       add_prefix (&startfile_prefixes, standard_startfile_prefix_2,
                  "BINUTILS", PREFIX_PRIORITY_LAST, 0, NULL);
+      add_prefix (&startfile_prefixes, standard_startfile_prefix_1,
+                 "BINUTILS", PREFIX_PRIORITY_LAST, 0, NULL);
 #if 0 /* Can cause surprises, and one can use -B./ instead.  */
       add_prefix (&startfile_prefixes, "./", NULL,
                  PREFIX_PRIORITY_LAST, 1, NULL);
------------------------------------------------------------------------
Comment 13 Brandon Low (RETIRED) gentoo-dev 2002-07-23 15:20:30 UTC
the example that Daniel and MJC have run across is with evms-1.1.0-pre4 I
believe... it tries to link against an old library version in /lib before one
that was just created in SOURCEDIR
Comment 14 Martin Schlemmer (RETIRED) gentoo-dev 2002-07-23 15:30:18 UTC
Seems they were still using gcc-3.1-r7, and not -r8 which should be the
fixed version.

BTW: evms compiled fine here twice with gcc-3.1.1 (which is also fixed version,
     for both the libdir bug, and the .la bug that -r7 fixed for most cases,
     and -r8, and gcc-3.1.1 seems to fix for all).


Comment 15 Daniel Robbins (RETIRED) gentoo-dev 2002-08-05 11:12:14 UTC
Fixed in 3.1-r8 and likely 3.1.1, 3.2_pre
Comment 16 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-25 15:33:50 UTC
OK, this needs to be opened again, as we are getting this again.

Firstly, I will go back into the cause of this.  My initial thought back
then was that gcc3 changed its handling of library search paths.  It probably
does so.  I now however feel that this is not the primary cause, as
Grant has a 1.4 system with gcc-2.95.3 with the same issues.  So it
may rather be the way ld (the newer binutils that the 1.4 and -gcc3
profiles use) now works that is causing this.

The recent pam borking was a good testing ground that I think brought
out the real issue behind this.

To recap:  We have critical system libs split between /lib and /usr/lib.
The dynamic libs (.so) goes into /lib, as some critical binaries that are
needed for bootup might be linked to them.  Then we have the static libs (.a)
in /usr/lib, as to not clutter /.

What happens with the newer ld (I am only saying what basic tests
point to, and have not really checked what changed between ld 2.11 and 2.12/13),
is that it seems that the search order is no longer

  "/lib:/usr/lib:<user defined>"

but rather:

  "</lib and /usr/lib with user defined as influenced by -L's>"

or something to that effect.

Thus, when say bc compiles small tests to try and detect libreadline, it
finds the .a in /usr/lib *before* the .so in /lib.  As it is not being
linked statically, it fails to link, and bc does not detect the presence of
readline.
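
The effect can be modelled with a small sketch.  This is a toy lookup in a scratch directory, not ld itself, and the helper name find_lib is made up for illustration.  It assumes the usual behaviour that each directory is visited in order and that, inside one directory, the shared object is tried before the static archive.

```sh
# Scratch tree standing in for / (nothing outside it is touched).
root=$(mktemp -d)
mkdir -p "$root/lib" "$root/usr/lib"
: > "$root/lib/libreadline.so"        # empty stand-in for the shared lib
: > "$root/usr/lib/libreadline.a"     # empty stand-in for the static lib

# find_lib NAME DIR...: visit the directories in order; within each one
# prefer lib<NAME>.so, then lib<NAME>.a.  Hypothetical helper, not part of ld.
find_lib() {
  name=$1; shift
  for dir in "$@"; do
    for ext in so a; do
      if [ -e "$dir/lib$name.$ext" ]; then
        printf '%s\n' "$dir/lib$name.$ext"
        return 0
      fi
    done
  done
  return 1
}

usr_first=$(find_lib readline "$root/usr/lib" "$root/lib")
lib_first=$(find_lib readline "$root/lib" "$root/usr/lib")

# Print both results with the scratch prefix stripped.
printf '%s\n' "${usr_first#"$root"}" "${lib_first#"$root"}"
rm -rf "$root"
```

With /usr/lib visited first the static archive wins, because the directory that holds the .so is never reached.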

With the recent pam issues, we have a "fixed" gcc, but libtool changes
library search paths and calls things a bit differently, so once again
the static libpam in /usr/lib is found before the .so in /lib.  Things
like gdm then cannot load the pam modules (they need dynamic code to use
libdl to load them), and auth fails.  The same goes for many other things
that broke in the recent pam borkage.

To "fix" libtool the same way as gcc was "fixed", is a one liner:

-------------------------------cut------------------------------------------
--- ltmain.sh.no_gcc_lib_searchpath_fix	2002-10-24 23:07:37.000000000 +0200
+++ ltmain.sh	2002-10-24 23:07:56.000000000 +0200
@@ -751,7 +751,7 @@
     finalize_shlibpath=
     convenience=
     old_convenience=
-    deplibs=
+    deplibs="-L/lib"
     old_deplibs=
     compiler_flags=
     linker_flags=
------------------------------------------------------------------------------

This, like the gcc fix, is however not the proper way to do it.

The fix is actually real simple ... It was in front of me all the time, but I
failed to "see" it.  What does glibc do ?  You have libc, libpthread, etc
in /lib, with the static versions (if present) in /usr/lib, but it just
works fine ... why ?

Simple.  It has unversioned symlinks in /usr/lib for all the dynamic libs
in /lib.  For example:

-------------------------------------cut-------------------------------------
nosferatu root # ls -l /lib/libc.*
lrwxrwxrwx    1 root     root           13 Oct 23 11:22 /lib/libc.so.6 -> libc-2.3.1.so
nosferatu root # ls -l /usr/lib/libc.*
-rw-r--r--    1 root     root      2516390 Oct 22 23:15 /usr/lib/libc.a
-rw-r--r--    1 root     root          178 Oct 22 23:15 /usr/lib/libc.so
nosferatu root # cat /usr/lib/libc.so
/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a )
nosferatu root # ls -l /lib/libpthread*
-rwxr-xr-x    1 root     root        86381 Oct 22 23:15 /lib/libpthread-0.10.so
lrwxrwxrwx    1 root     root           18 Oct 23 11:22 /lib/libpthread.so.0 -> libpthread-0.10.so
nosferatu root # ls -l /usr/lib/libpthread*
-rw-r--r--    1 root     root        94496 Oct 22 23:15 /usr/lib/libpthread.a
lrwxrwxrwx    1 root     root           25 Oct 23 10:40 /usr/lib/libpthread.so -> ../../lib/libpthread.so.0
nosferatu root # 
-----------------------------------------------------------------------------

So we do not have to move all the static libs to /lib, but just add the
"compat" symlinks for the dynamic versions in /usr/lib.

I will start to get all the affected packages fixed, and then after a bit
we can drop the gcc "fix", as it is possible that it can do some breakage.
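
A minimal sketch of such a "compat" symlink, built in a scratch tree so nothing on the real system is touched.  The empty files are stand-ins for the real libraries, and the link target has the same relative shape as the libpthread entry in the listing above.

```sh
root=$(mktemp -d)
mkdir -p "$root/lib" "$root/usr/lib"
: > "$root/lib/libpthread.so.0"       # stand-in for the shared library
: > "$root/usr/lib/libpthread.a"      # stand-in for the static archive

# Relative link from /usr/lib back into /lib.
ln -s ../../lib/libpthread.so.0 "$root/usr/lib/libpthread.so"

link_target=$(readlink "$root/usr/lib/libpthread.so")
# [ -e ] follows the link, so this checks that the target really exists.
if [ -e "$root/usr/lib/libpthread.so" ]; then resolves=yes; else resolves=no; fi

printf '%s -> %s (resolves: %s)\n' libpthread.so "$link_target" "$resolves"
rm -rf "$root"
```

Because the target is relative, the link keeps pointing inside the same tree if the tree is mounted or copied somewhere else.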
Comment 17 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-25 17:06:37 UTC
*** Bug 9389 has been marked as a duplicate of this bug. ***
Comment 18 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-25 17:41:03 UTC
*** Bug 9502 has been marked as a duplicate of this bug. ***
Comment 19 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-25 18:02:59 UTC
*** Bug 9588 has been marked as a duplicate of this bug. ***
Comment 20 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-26 09:50:13 UTC
The solution I decided on, is to rather use linker scripts, as symlinks can
be fragile.  I implemented a function in eutils.eclass:

--------------------------------cut-------------------------------------
gen_usr_ldscript() {

        # Just make sure it exists
        dodir /usr/lib
        
        cat > ${D}/usr/lib/$1 <<"END_LDSCRIPT"
/* GNU ld script
   Because Gentoo have critical dynamic libraries
   in /lib, and the static versions in /usr/lib, we
   need to have a "fake" dynamic lib in /usr/lib,
   otherwise we run into linking problems.
   See bug #4411 on http://bugs.gentoo.org/ for
   more info.  */
GROUP ( /lib/libxxx )
END_LDSCRIPT

        dosed "s:libxxx:$1:" /usr/lib/$1
}
------------------------------------------------------------------------

For say ncurses, which has /lib/libncurses.so*, and /usr/lib/libncurses.a,
you then do in src_install() after the libs have been moved in place:

   gen_usr_ldscript libncurses.so

which will create a linker script /usr/lib/libncurses.so, pointing to the
unversioned symlink /lib/libncurses.so.  I thought it would be better to not
have it versioned, as ldconfig 'should' maintain the unversioned symlink.
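
To try the idea outside of Portage, here is a hedged standalone sketch.  The name gen_usr_ldscript_sketch is made up; it replaces the dodir and dosed helpers (which only exist inside an ebuild) with plain mkdir and heredoc expansion, and it writes into a scratch directory standing in for ${D}.

```sh
D=$(mktemp -d)   # scratch stand-in for Portage's image directory

# Hypothetical standalone variant of the eclass function above.
gen_usr_ldscript_sketch() {
  mkdir -p "$D/usr/lib"
  # Unquoted heredoc delimiter, so $1 is expanded into the GROUP line.
  cat > "$D/usr/lib/$1" <<END_LDSCRIPT
/* GNU ld script: see bug #4411 */
GROUP ( /lib/$1 )
END_LDSCRIPT
}

gen_usr_ldscript_sketch libncurses.so
script_body=$(cat "$D/usr/lib/libncurses.so")
printf '%s\n' "$script_body"
rm -rf "$D"
```

The result is a two-line text file whose GROUP command names the library in /lib.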
Comment 21 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-26 09:57:50 UTC
OK, here is what I have updated so far:

  sys-apps/e2fsprogs
  sys-libs/ncurses
  sys-libs/pam
  sys-libs/pwdb
  sys-libs/readline

I need some feedback on more that have dynamic libs in /lib, and static
libs in /usr/lib.
Comment 22 Martin Schlemmer (RETIRED) gentoo-dev 2002-10-29 15:16:31 UTC
*** Bug 9685 has been marked as a duplicate of this bug. ***
Comment 23 Paul Tötterman 2003-07-17 05:24:52 UTC
Does gentoo use gcc 3.1 anywhere at the moment? If not, this could be marked won't fix
Comment 24 Zhen Lin 2003-07-17 05:58:48 UTC
The last Gentoo to use 3.1 is 1.3

All in favour of WONTFIX press commit...
Comment 25 Martin Schlemmer (RETIRED) gentoo-dev 2003-07-24 23:20:38 UTC
This bug is rather a marker.  It is (or should be) fixed, but it may be that
some new package, or a new version of a package, can break again.  Thus I
leave it open.
Comment 26 Andrew Cooks (RETIRED) gentoo-dev 2003-11-24 07:54:10 UTC
I suppose it's good to keep this as a note/reminder, but was it really a blocker in the first place? Wouldn't it be better to mark this bug as REMIND?
Comment 27 Alexander Gabert (RETIRED) gentoo-dev 2004-03-05 03:07:46 UTC
hi

do we have still occurrences of this bug in the wild?

if not, can we set it to CANTFIX

this way it can be researched when looking for the bug at the bugzilla system without letting it open and up for grabs by the gcc bug hunters

thanks,

Alex
Comment 28 SpanKY gentoo-dev 2004-06-07 18:43:05 UTC
closing ... markers can be closed :)
Comment 29 Jan Rychter 2006-01-09 08:10:59 UTC
Gentlemen,

Placing ASCII text instead of ELF binaries in .so files in /usr/lib isn't a good solution. I've just run into problems when my Common Lisp environment was trying to load /usr/lib/libz.so and complained about it not containing an ELF header.

Not everyone programs in C and not everyone loads libraries the same way -- I believe in a sane system programs should be able to expect that .so files in /usr/lib will be shared libraries.

Why not make them symlinks?
Comment 30 Preston Crow 2006-02-01 10:31:23 UTC
Yup, I just got bitten by this with a tcl/tk application that was dynamically loading libz.so.  (Or should I say, trying to load libz.so.)
Comment 31 Moises Silva 2006-02-03 10:52:20 UTC
I have a problem emerging konsole: it complains about libhistory.so not having a proper ELF header. I have made a symlink to the real libhistory, which will hopefully work, but that is not a solution, though.

(In reply to comment #30)
> Yup, I just got bitten by this with a tcl/tk application that was dynamically
> loading libz.so.  (Or should I say, trying to load libz.so.)
> 

Comment 32 Simon Stelling (RETIRED) gentoo-dev 2006-03-25 10:25:35 UTC
I'd love to understand the reasoning behind choosing ld scripts over symlinks too... I often have to find out whether a library is 32bit or 64bit in scripts, and file -L only says "/usr/lib/libz.so: ASCII C program text", making it pretty hard to find out what bitness I will end up with when using said file. Symlinks would avoid this problem.
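
One possible workaround in a script, sketched under the assumption that the linker script has the simple one-line GROUP form shown in comment 20: pull the first path out of the GROUP command and inspect that file instead.  The helper name ldscript_target is made up, and the sample script is written to a scratch directory.

```sh
tmp=$(mktemp -d)
# Sample linker script in the form generated by gen_usr_ldscript.
cat > "$tmp/libz.so" <<'EOF'
/* GNU ld script */
GROUP ( /lib/libz.so )
EOF

# Print the first path listed inside GROUP ( ... ).  Hypothetical helper;
# it does not handle multi-line or nested linker script commands.
ldscript_target() {
  sed -n 's:^GROUP *( *\([^ )]*\).*:\1:p' "$1" | head -n 1
}

target=$(ldscript_target "$tmp/libz.so")
printf '%s\n' "$target"
rm -rf "$tmp"
```

The printed path can then be handed to file -L to learn the bitness of the real library.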
Comment 33 Kevin F. Quinn (RETIRED) gentoo-dev 2007-01-10 10:49:39 UTC
Looking at other distros, SuSE put:

shared lib in /lib
static archive in /usr/lib
symlink /lib to static archive in /usr/lib
symlink /usr/lib to shared lib in /lib

e.g.
$ ls -l /usr/lib/libacl*
-rw-r--r--  1 root root 74142 2004-06-30 19:36 /usr/lib/libacl.a
-rw-r--r--  1 root root   789 2004-06-30 19:36 /usr/lib/libacl.la
lrwxrwxrwx  1 root root    14 2006-05-30 16:48 /usr/lib/libacl.so -> /lib/libacl.so
$ ls -l /lib/libacl*
lrwxrwxrwx  1 root root    17 2006-05-30 16:48 /lib/libacl.a -> /usr/lib/libacl.a
lrwxrwxrwx  1 root root    18 2006-05-30 16:48 /lib/libacl.la -> /usr/lib/libacl.la
lrwxrwxrwx  1 root root    11 2006-05-30 16:48 /lib/libacl.so -> libacl.so.1
lrwxrwxrwx  1 root root    15 2006-05-25 11:07 /lib/libacl.so.1 -> libacl.so.1.1.0
-rw-r--r--  1 root root 43632 2004-06-30 19:36 /lib/libacl.so.1.1.0

Anyone care to list what Debian, RedHat do?
Comment 34 SpanKY gentoo-dev 2007-01-10 13:22:50 UTC
to be honest, i dont think we even care anymore ... our current situation doesnt have any drawbacks as far as i know

shared lib in /lib
static archive in /usr/lib
libtool script in /usr/lib
linker script in /usr/lib that points to /lib

unlike symlink solutions, ours works great with cross-compilers
Comment 35 Svein Ove Aas 2007-06-13 16:36:19 UTC
The linker-script solution is fine for languages and situations where you use ld to link, but it breaks dlopen(3), and therefore both programs trying to use them as plugins and many non-C languages. That includes tcl/tk, common lisp, haskell (ghci), and probably more I haven't used.

The current situation is very frustrating; as it is, I have to ask people to beware of Gentoo because of it.
Comment 36 SpanKY gentoo-dev 2007-06-13 21:03:14 UTC
what are you talking about ?  linker scripts do not break dlopen()
Comment 37 Svein Ove Aas 2007-06-13 23:19:16 UTC
No?
They certainly break ghci. I thought I'd tracked it to the dlopen call, but I guess I'll have to keep looking.
Comment 38 SpanKY gentoo-dev 2007-06-13 23:53:47 UTC
the default search path for the loader is /lib then /usr/lib

so if your application is changing the search order (via broken LD_* env vars or silly RUNPATH ELF DT tags), then yes, a linker script in /usr/lib would throw an error

however, this is hardly a problem specific to Gentoo as you'll find other distributions utilizing linker scripts in /usr/lib as well
Comment 39 Ivan 2009-10-06 10:21:54 UTC
(In reply to comment #36)
> what are you talking about ?  linker scripts do not break dlopen()
> 

Yes they do: if you try and dlopen("libpcre.so", _) then it will fail.  The problem is that whilst /lib is searched before /usr/lib, there is no libpcre.so in /lib, just libpcre.so.0{,.0.1} (which people don't know to look for when writing their code).  So the first "valid" file the dynamic linker finds when you dlopen() is the linker script in /usr/lib, and since that isn't a valid ELF file dlopen() fails.  After this comment I'll attach a sample C file that uses dlopen() based on the example in the man page for dlopen; note that it works if explicitly asked for "libpcre.so.0".

If we're going to keep with these linker scripts in /usr/lib, can we at least consider having symlinks in /lib for libpcre, etc. like there are for most other libraries?  Otherwise we'll keep having problems with interpreted languages that need to open the libs at runtime rather than using the static linker (note that Template Haskell in GHC uses ghci; thus even building packages that use Template Haskell and require linking to libpcre - for example dev-haskell/highlighting-kate in the Haskell overlay - fail to work).
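
A quick way to tell the two kinds of file apart from a script is to look at the first four bytes.  This is only a sketch: the files below are stand-ins in a scratch directory (the "real" one holds nothing but the ELF magic and is not a loadable library), the GROUP path is an assumed example, and is_elf is a made-up helper name.

```sh
tmp=$(mktemp -d)
printf '\177ELF' > "$tmp/real.so"                       # just the ELF magic
printf 'GROUP ( /lib/libpcre.so.0 )\n' > "$tmp/script.so"

# is_elf FILE: true if FILE starts with the bytes 0x7f 'E' 'L' 'F'.
is_elf() {
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$magic" = 7f454c46 ]
}

if is_elf "$tmp/real.so";   then real_kind=elf;   else real_kind=text;   fi
if is_elf "$tmp/script.so"; then script_kind=elf; else script_kind=text; fi

printf 'real.so: %s\nscript.so: %s\n' "$real_kind" "$script_kind"
rm -rf "$tmp"
```

A file of the second kind is what the dynamic loader rejects when dlopen() is pointed at it.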
Comment 40 Ivan 2009-10-06 10:23:40 UTC
Created attachment 206233 [details]
Sample C file that breaks dlopen()

Compile this using "$ gcc -rdynamic -o breakDlopen breakDlopen.c -ldl" and then try to run it.

Based off the example in the dlopen man page.
Comment 41 Mikael Magnusson 2010-10-13 11:11:11 UTC
This bug (or rather, the fix for it) also affects fontforge when trying to extract a font from a pdf file. Replacing the libz.so script with a symlink to /lib/libz.so.1 makes it work. A quick grep on the fontforge source shows a couple of other places dlopening libz as well.
Comment 42 SpanKY gentoo-dev 2010-10-13 14:15:20 UTC
new issues -> new bugs.  any package that does dlopen("libfoo.so") without the version info like ".so.X" is broken.
Comment 43 Laurent Parenteau 2010-11-19 19:01:10 UTC
I've opened a new bug for the dlopen() issue : https://bugs.gentoo.org/show_bug.cgi?id=346095
Comment 44 Raffaello D. Di Napoli 2011-03-12 21:48:15 UTC
The ld scripts don’t work if cross-compiling using crossdev, to a custom ROOT (with its own make.conf, and so on). These are the custom ROOT’s LDFLAGS:

  -Wl,--verbose -L${ROOT}lib -L${ROOT}usr/lib


This is what the correct search order should be, for custom-ROOT=/crossbuild, cross-CHOST=/usr/arch-mach-sys, parent-CHOST=/ :

(from -L options)
1. /crossbuild/lib/lib*.so
2. /crossbuild/lib/lib*.a
3. /crossbuild/usr/lib/lib*.so
4. /crossbuild/usr/lib/lib*.a
(from --with-sysroot configure option)
5. /usr/arch-mach-sys/lib/lib*.so
6. /usr/arch-mach-sys/lib/lib*.a
7. /usr/arch-mach-sys/usr/lib/lib*.so
8. /usr/arch-mach-sys/usr/lib/lib*.a


This is how it succeeds for a non-script, then fails for a script; notice the search order is disrupted by the ld script:

attempt to open /crossbuild/lib/libICE.so failed
attempt to open /crossbuild/lib/libICE.a failed
attempt to open /crossbuild/usr/lib/libICE.so succeeded
-lICE (/crossbuild/usr/lib/libICE.so)
attempt to open /crossbuild/lib/libuuid.so failed
attempt to open /crossbuild/lib/libuuid.a failed
attempt to open /crossbuild/usr/lib/libuuid.so succeeded
opened script file /crossbuild/usr/lib/libuuid.so
opened script file /crossbuild/usr/lib/libuuid.so
attempt to open /usr/arch-mach-sys/lib/libuuid.so.1 failed
attempt to open /lib/libuuid.so.1 succeeded
/usr/libexec/gcc/arch-mach-sys/ld: skipping incompatible /lib/libuuid.so.1 when searching for /lib/libuuid.so.1
/usr/libexec/gcc/arch-mach-sys/ld: cannot find /lib/libuuid.so.1

Since the script specifies an absolute /lib/libuuid.so.1, ld ignores my -L flags, and jumps to its sysroot, except the package is not there, and so it skips to the host’s root, which of course contains an incompatible binary.

I don’t understand why the script file /crossbuild/usr/lib/libuuid.so is opened twice, but I wouldn’t care if it was working properly.

Relative symlinks, on the other hand, work okay. That’s what I’m using now; I’m just not sure if they work in the general case, or just this tri-level weirdness I have to work with.
Comment 45 SpanKY gentoo-dev 2011-03-13 20:10:33 UTC
you're just doing it wrong
Comment 46 SpanKY gentoo-dev 2015-12-08 21:06:31 UTC
see bug 290974 for an explanation/discussion why dlopening libfoo.so is broken