Bug 90744 - Principle of fix_libtool_files.sh is flawed, alternative libstdc++ methodology suggested
Bug#: 90744
Product: Gentoo Linux
Version: unspecified
Platform: All
OS/Version: Linux
Status: RESOLVED
Severity: normal
Priority: P2
Resolution: FIXED
Assigned To: toolchain@gentoo.org
Reported By: jamie@shareable.org
Component: Development
Opened: 2005-04-28 11:04 0000
The problems that people have seen with fix_libtool_files only really affect
programs linked with libstdc++.
I want to explain why there's a bug in what fix_libtool_files.sh does, and
therefore why it should be deprecated and replaced.
The .la files encode the libstdc++ version (or more precisely, libstdc++.la
is put in a version-specific directory) because binary compatibility can
change between some versions of libstdc++.
fix_libtool_files.sh is not safe for that reason: just updating the version
referenced in .la files is not binary-safe. Often it works but there's no
guarantee of it working, and runtime failures (including subtle runtime errors)
are possible.
Here's an example of what can go wrong: Suppose libgtk+.so is built with
libstdc++-3.3.4, and then gcc is updated to 3.4.3, removing the old version.
People get link time errors when linking with libgtk+: can't find
...../3.3.4/libstdc++.so. So fix_libtool_files.sh is run, and it changes the
path references in libgtk+.la to refer to ...../3.4.3/libstdc++.so.
That usually works. But if the installed binary of libgtk+.so was built
against libstdc++-3.3.4, then it may result in run time errors to build a
program with gcc-3.4.3, linked with libstdc++-3.4.3 and that version of
libgtk+.so that was built against an older version.
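To make the rewrite concrete, here is a minimal sketch of the kind of
substitution described above. It is not the real fix_libtool_files.sh; the
function name rewrite_la_version and the /example/... path used with it are
assumptions made purely for illustration. Note that it only changes text in the
.la file: the installed libgtk+.so binary is untouched, which is exactly why
the result is not guaranteed to be binary-safe.

```shell
#!/bin/sh
# Hypothetical illustration only: NOT the real fix_libtool_files.sh.
# rewrite_la_version FILE OLD_VERSION NEW_VERSION
# Prints FILE with "/OLD_VERSION/" path components on the dependency_libs
# line replaced by "/NEW_VERSION/". The file itself is not modified.
rewrite_la_version() {
    la_file=$1
    old=$2
    new=$3
    # Escape dots so that "3.3.4" is matched literally by sed.
    old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
    sed "/^dependency_libs=/ s|/${old_re}/|/${new}/|g" "$la_file"
}
```

For example, a line such as
dependency_libs=' /example/gcc/3.3.4/libstdc++.la -lm' would come out
referring to /example/gcc/3.4.3/libstdc++.la when called with 3.3.4 and 3.4.3.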
Now, I'm not saying there are ABI differences between those specific versions,
3.3.4 and 3.4.3, only that in general there can be, and they can result in
subtle bugs that may or may not show up at link time.
That's why just changing the version numbers in .la files, as
fix_libtool_files.sh does, isn't safe even when it _appears_ to work.
If changing those version numbers were safe, then there would be no reason for
the libstdc++.la files to exist in a gcc-version-specific directory in the
first place.
So the _real_ bug here is that Portage's dependency tracking doesn't maintain
compatible library versions when updating gcc while libraries that depend on
old libstdc++ versions are installed. Nor does Portage know that it has to
recompile all the libraries (such as gtk+ and pspell) which depend on
libstdc++, in the correct order, when switching to a new GCC version with a
new libstdc++.
The problem is exactly the same as it is for other shared libraries which
depend on each other's ABI versions, and which have to be updated in the right
order to avoid run-time or link-time bugs.
Which raises the question: why is libstdc++ not treated the same as other
libraries which have regular ABI changes, by putting it into /usr/lib with a
varying major SONAME, and letting normal package dependencies ensure the
multiple versions stay around as long as any other installed binary is
dependent on them?
Reproducible: Always
> Now, I'm not saying there's ABI differences between those specific versions 3.3.4 and 3.4.3.
The C++ ABI of gcc 3.3.x and 3.4.x does differ. You have to rebuild when
switching.
I've just done "emerge -auv world --deep". (The "--deep" is just to confirm
that the maximum dependency checking still doesn't rebuild everything needed).
It tries to update libgtk+ on a system which has recently moved from gcc-3.3.4
to gcc-3.4.3, and it fails to compile because libgtk+ references libtiff.la,
which references the 3.3.4/libstdc++.la.
What portage _should_ logically be doing is rebuilding libtiff first, on the
grounds that the currently installed libtiff development libraries aren't
necessarily binary compatible with the libstdc++ that the compiler will now
link with.
Portage doesn't rebuild libtiff first, even with --deep. If it did, then gtk+
would build fine too. This is confirmed by rebuilding libtiff explicitly, and
then gtk+ builds fine after it.
Portage would be able to rebuild everything without the libtool/libstdc++
errors if it tracked library dependencies at this level of detail when
rebuilding.
So, if Portage tracked the library dependencies more closely, and rebuilt
prerequisite libraries even when their package version hasn't changed but their
binary prerequisite library SONAMEs _have_ changed to an incompatible value,
this would simultaneously fix the problems people see with libstdc++ versions,
do the right thing with respect to the shared-lib ABI versions, and remove any
need for fix_libtool_files.sh.
I think. (There's probably something I haven't thought of).
There's still another rather subtle problem:
When libtiff.so.3.7.1 is rebuilt, it produces a new shared library which is ABI
compatible with the current compiler's libstdc++. But there may be programs
installed which are linked to libtiff.so.3.7.1 and also with the _old_ version
of libstdc++. This is difficult to solve with SONAMEs alone, because it really
needs both differently-compiled versions of libtiff.so.3.7.1 installed at the
same time, even though those have the same SONAME, to be guaranteed that all
programs are binary compatible with each other. Such a puzzle; I don't have
any good ideas for handling that.
> I've just done "emerge -auv world --deep".
`revdep-rebuild --soname libstdc++.so.5` is the better choice.
fix_libtool_files.sh is indeed buggy (Bug 71265), and that's surely one of the
reasons gcc-3.4 is not yet stable on x86, but I don't see why
- emerge libtool
- fix_libtool_files.sh 3.3.4
- revdep-rebuild --soname libstdc++.so.5
shouldn't solve your problems. Forwarding you to the toolchain guys...
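For context, the last step looks for installed binaries that still list the
old libstdc++ SONAME as a dependency. A rough way to check a single binary by
hand, assuming binutils' readelf is available, is sketched below. The helper
name needs_soname and the path in the usage comment are made up for
illustration; this is not how revdep-rebuild itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper, not part of revdep-rebuild.
# needs_soname SONAME
# Reads `readelf -d` output on stdin and succeeds (exit status 0) if SONAME
# appears among the NEEDED entries of the dynamic section.
needs_soname() {
    grep -F "(NEEDED)" | grep -F -q "Shared library: [$1]"
}

# Usage (the path is a placeholder):
#   readelf -d /path/to/some/binary | needs_soname libstdc++.so.5 \
#       && echo "still linked against the old libstdc++"
```

A binary that passes this check was built against the old libstdc++ and is a
candidate for rebuilding, whatever its .la files say.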
I agree with Jamie; just changing the references in .la files is risky, and
pretty much guaranteed to fail every now and then for the reasons Jamie states.
The way I see it, things boil down to two approaches:
A) Do not remove the /usr/lib/gcc[-lib] directories
B) Rebuild any packages depending on libraries that have been deleted.
(A) I think is the most reliable, as it means packages continue to work even
though the compiler that built them has been removed. However it'd need some
kind of REMOVE_PROTECT functionality in portage, and perhaps some kind of mop-up
script to check for usage and remove if unused.
(B) I'll attach a script I use to locate brokenness in .la files.
Created an attachment (id=69951)
Scans library directory for libtool archives that depend on non-existent stuff
Puts the list of stuff it finds to stdout and progress to stderr, e.g.:
$ ./find-broken-la.sh > broken-las.txt
Scans all the directories in /etc/ld.so.conf, picks out 'dependency_libs=' and
checks it for each .la file.
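The attachment itself is not reproduced here. As a rough idea of what such a
scan might look like, here is a minimal sketch written from the description
above; it is not the attached find-broken-la.sh, and the function name
list_broken_la is an assumption. It only checks dependencies that are absolute
paths to other .la files.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the attached find-broken-la.sh.
# list_broken_la DIR...
# For every .la file under each DIR, read its dependency_libs='...' line and
# report (on stdout) each referenced .la file that does not exist.
# Progress goes to stderr.
list_broken_la() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -name '*.la' -type f | while IFS= read -r la; do
            echo "checking $la" >&2
            # Extract the quoted value of dependency_libs.
            deps=$(sed -n "s/^dependency_libs='\(.*\)'.*/\1/p" "$la")
            for dep in $deps; do
                case $dep in
                    /*.la) [ -e "$dep" ] || echo "$la: missing $dep" ;;
                esac
            done
        done
    done
}

# Usage, taking directories from /etc/ld.so.conf (comment and include lines
# are skipped in this sketch):
#   list_broken_la $(grep -v -e '^#' -e '^include' /etc/ld.so.conf) > broken-las.txt
```

Each package owning a reported .la file is then a candidate for a rebuild
under approach (B).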
Although this is not directly connected to this bug, I simply want to ask: are
there any plans to remove the .la files from Gentoo? They have already been
purged from most other mainstream distributions.
I don't know if this is the case with libc, but isn't it enough simply to have
different SONAME tags in the library, which will trigger the rebuild?
This issue always comes up with compiler upgrades. Are there any
drawbacks/benefits from removing/keeping the la files around? If there is no
real downside, I say we eliminate them, otherwise, this bug should be closed as
WONTFIX.
(In reply to comment #7)
> This issue always comes up with compiler upgrades. Are there any
> drawbacks/benefits from removing/keeping the la files around? If there is no
> real downside, I say we eliminate them, otherwise, this bug should be closed as
> WONTFIX.
Well, better late than never: I have nuked all the .la files on my system.
Some information: this is an x86 system, which is why it's /usr/lib; obviously
it will be /usr/lib64 on amd64, and so on. Moreover, I don't use KDE;
xfce4-svn, gnome and e17 are what I have, although I do have 2-3 apps using qt
and kdelibs.
First I ran something like
find /usr/lib -name '*.la' -delete
After that I made a very hackish patch for Portage (of course this should be
refined, as it removes all the .la files unconditionally):
--- bin/prepall.orig 2006-04-04 01:00:26.000000000 +0200
+++ bin/prepall 2006-04-07 22:22:21.000000000 +0200
@@ -23,13 +23,14 @@
         chmod +x "${j}"
     done
 
-    for j in "${i}"/*.a "${i}"/*.la ; do
+    for j in "${i}"/*.a ; do
         [[ ! -e ${j} ]] && continue
         [[ -L ${j} ]] && continue
         [[ ! -x ${j} ]] && continue
         echo "removing executable bit: /${j/${D}/}"
         chmod -x "${j}"
     done
+    find "${i}" -name '*.la' -print0 | xargs -0 /bin/rm -f
 done
 
 # When installing static libraries into /usr/lib and shared libraries into
From the code before this:
for i in "${D}"opt/*/lib{,32,64} \
         "${D}"lib{,32,64} \
         "${D}"usr/lib{,32,64} \
         "${D}"usr/X11R6/lib{,32,64} ; do
one can see that this will not remove the .la files in the qt and kde
directories, but someone should verify whether this will work, as I have read
somewhere that kde uses these files for loading plugins and libraries,
although I am not sure.
Now, of around 1500 packages on my system, some 5-6 failed the emerge -e world
rebuild because of the missing .la files: subversion (I have made a patch; it
was a problem in the configure script), openjade (again a problem in the way
the dependencies are supplied, which should be easy to patch), and some of the
gstreamer plugins (I haven't looked at those yet).
So it seems that this operation should be fairly safe...
Nuking all .la files isn't going to happen.
We're going to be nuking just libstdc++.la.
Well, are there any reasons behind this? They create more problems than they
are worth; I think that leaving only those which are really necessary is a
better solution.
because you'll break things by removing the .la files
go ahead and try to compile a big gui package statically using libtool and
watch it blow up
Well, I didn't know about this. Does Gentoo support the static USE flag for
such packages?
*** Bug 130423 has been marked as a duplicate of this bug. ***
For gcc-4.1.1 we have removed libstdc++.la, so eventually we'll be able to get
rid of fix_libtool_files.sh. It will just take some time.
(In reply to comment #17)
> For gcc-4.1.1 we have removed libstdc++.la, so eventually we'll be able to get
> rid of fix_libtool_files.sh. It will just take some time.
>
Shouldn't this be more of a resolved later then? It would be nice to have the
ebuild die with collision-protect early in the process to be able to disable
it. I don't think many of our users realize that you can continue the build
with FEATURES="-collision-protect" ebuild <pkg> merge.
Nuking libstdc++.la seems to introduce its own problems. For example, in order
to build kdelibs, I had to keep libstdc++.la and manually change the directory
referenced in the file to point to 4.1.1, because fix_libtool_files.sh didn't
work. kdelibs finally built after this change, but I still can't get
inkscape-0.44 to compile, and I suspect that this nuking choice has caused some
others of my packages to break.
This is a pretty major switch from earlier versions of gcc, and I think it
should be made abundantly clear to the user prior to upgrading. A wiki entry on
this subject would also be nice, and I can finally rest a little more easily
knowing that I must wait for updated packages to shed their dependency on
libstdc++.la.
I've included links to the errors that I've gotten for various packages since,
or at least the ones I suspect have something to do with libstdc++.la.
In this case the problem is solely in fix_libtool_files.sh.
The resolution is simply to remove the references to libstdc++.la from the
.la files on your system; it has nothing to do with inkscape or kdelibs.
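One way to do that by hand is sketched below, assuming the references appear
as path entries on the dependency_libs line. The function name
strip_libstdcxx_la and the /example/... path are made up for illustration, and
the sketch prints the result instead of editing the file, so the change can be
reviewed before anything is overwritten.

```shell
#!/bin/sh
# Hypothetical helper, not an official Gentoo tool.
# strip_libstdcxx_la FILE
# Prints FILE with every entry ending in /libstdc++.la (and the spaces in
# front of it) removed from the dependency_libs line. FILE is not modified.
strip_libstdcxx_la() {
    sed "/^dependency_libs=/ s| *[^ ']*/libstdc++\.la||g" "$1"
}

# Usage (placeholder path): compare before replacing anything.
#   strip_libstdcxx_la /path/to/libfoo.la | diff /path/to/libfoo.la -
```

With an input line of dependency_libs=' -lm /example/gcc/3.4.6/libstdc++.la -lz'
this prints dependency_libs=' -lm -lz'.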
*** Bug 142381 has been marked as a duplicate of this bug. ***
*** Bug 156793 has been marked as a duplicate of this bug. ***