Bug 926120 - Figure out proper handling for LTO static libraries
Summary: Figure out proper handling for LTO static libraries
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords: PullRequest
Duplicates: 940538 940540 944810
Depends on:
Blocks: 955567 876430 883419 900519 924183 924360 927994 938858
Reported: 2024-03-04 03:42 UTC by Sam James
Modified: 2025-05-08 05:16 UTC
CC List: 11 users

See Also:
Package list:
Runtime testing required: ---


Description Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-03-04 03:42:10 UTC
I kept thinking we had a proper bug for this, but we didn't AFAIK. I shoehorned it into bug 866422 for a while.

With static libraries and -flto, bitcode is included in the static archive (.a) and then when an application links using them later on, they're glued together and treated as the whole program. It allows more optimisation, but it's very brittle.

The archives aren't compatible across GCC versions, even minor versions (depends on if a commit which changes certain internals gets backported). The interface version isn't always updated for such commits which causes a segfault/ICE instead of a nicer error for mismatched versions.

We don't generally bother with static libraries and we can take the loss on optimising them, especially given it also often blows up memory usage with LTO too.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-03-04 03:43:48 UTC
Fedora has two approaches:
1) For GCC, it checks that -ffat-lto-objects was used (they warn if not - https://fedoraproject.org/wiki/Changes/LTOBuildImprovements), and strips the bitcode from the static libraries using https://src.fedoraproject.org/rpms/redhat-rpm-config/blob/rawhide/f/brp-strip-lto

2) For LLVM, they try to convert the bitcode to ELF (https://src.fedoraproject.org/rpms/redhat-rpm-config/blob/rawhide/f/brp-llvm-compile-lto-elf).
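The two cases differ because, by default, Clang's LTO objects are not ELF files with extra sections but raw LLVM bitcode files. A minimal sketch of telling them apart follows, assuming the plain bitcode format, which starts with the magic bytes `BC` 0xC0 0xDE. The helper name `is_llvm_bitcode` is hypothetical and is not taken from Fedora's script; it checks a single object file, so archive members would have to be extracted first.

```shell
# Succeeds if the file starts with the LLVM bitcode magic: 'B' 'C' 0xC0 0xDE.
# Hypothetical helper for illustration; checks one object file, not an archive.
is_llvm_bitcode() {
	[ "$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')" = "4243c0de" ]
}
```

An object for which this check succeeds has no regular machine code to fall back on, which is why stripping alone is not enough for it.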
Comment 2 Eli Schwartz gentoo-dev 2024-03-04 03:46:56 UTC
So naively, a paired set of functions?

In src_configure:

lto-guarantee-fat() {
	if tc-is-lto; then
		append-cflags $(test-flags-CC -ffat-lto-objects)
		append-cxxflags $(test-flags-CXX -ffat-lto-objects)
	fi
}


in src_install, manually strip with -R .gnu.lto_* -R .gnu.debuglto_* -N __gnu_lto_v1

Or hook this up to portage's global stripping routine. See bug 920745.
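As a rough sketch, the manual strip step could be wrapped like this. The flags are the ones quoted in this comment; the function name `strip_lto_sections` and the `STRIP` override are assumptions made for illustration. This only makes sense for fat objects: stripping a slim LTO object would leave nothing usable behind.

```shell
# Sketch only: remove GCC's LTO sections and marker symbol from the given
# static archives, using the flags quoted in the comment above.
# STRIP is an assumed override hook (e.g. for a cross-prefixed strip).
strip_lto_sections() {
	# Nothing to do without arguments.
	[ "$#" -gt 0 ] || return 0
	# Patterns are quoted so the shell does not expand them; strip itself
	# treats the trailing '*' as a wildcard for section names.
	"${STRIP:-strip}" \
		-R '.gnu.lto_*' \
		-R '.gnu.debuglto_*' \
		-N __gnu_lto_v1 \
		"$@"
}
```

It would be called with the installed archives as arguments, for example `strip_lto_sections path/to/libfoo.a`.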
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-03-04 03:49:48 UTC
(In reply to Eli Schwartz from comment #2)

I suppose this isn't that bad... especially given not _that_ many packages even install static libraries.

We need to see if we can make the same approach work for LLVM too, ideally. I wouldn't want to have to do two approaches...
Comment 5 Arsen Arsenović gentoo-dev 2024-03-04 18:57:48 UTC
I was and potentially am still partial to post-processing static libraries by compiling them (or their contents) with '-x lto'.  I haven't experimented with this at all, however, so it is entirely possible that it's not possible for some reason.

The reason for me holding on to it is because it'd be less error-prone (maintainers wouldn't have to know ahead of time whether static libraries are being installed) and maybe faster.

If someone plays around with this, please consider it.
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-03-05 08:56:33 UTC
(In reply to Arsen Arsenović from comment #5)
> I was and potentially am still partial to post-processing static libraries
> by compiling them (or their contents) with '-x lto'.
> [...]

> The reason for me holding on to it is because it'd be less error-prone
> (maintainers wouldn't have to know ahead of time whether static libraries
> are being installed) and maybe faster.

I don't think it's less error prone because if we were to do it, it'd have to be in Portage (so PM-specific) and we'd also have to figure out a way for ebuilds to control it...

> 
> If someone plays around with this, please consider it.

But I haven't given up on the idea and I will keep your preference in mind, of course
Comment 7 Arsen Arsenović gentoo-dev 2024-03-05 10:36:33 UTC
(In reply to Sam James from comment #6)
> (In reply to Arsen Arsenović from comment #5)
> > I was and potentially am still partial to post-processing static libraries
> > by compiling them (or their contents) with '-x lto'.
> > [...]
> 
> > The reason for me holding on to it is because it'd be less error-prone
> > (maintainers wouldn't have to know ahead of time whether static libraries
> > are being installed) and maybe faster.
> 
> I don't think it's less error prone because if we were to do it, it'd have
> to be in Portage (so PM-specific) and we'd also have to figure out a way for
> ebuilds to control it...

But the default is, IMO, far better.  Packages that fail to build in that way are likely exceptions rather than the rule.

But yes, the rest stands..

> > 
> > If someone plays around with this, please consider it.
> 
> But I haven't given up on the idea and I will keep your preference in mind,
> of course

Thanks :-)
Comment 8 Eli Schwartz gentoo-dev 2024-03-27 21:36:27 UTC
This came up during the 23.0 profile migration as well.

The global USE default now enables zstd, which therefore toggles the gcc IUSE and enables zstd-compressed LTO bytecode.

GCC promptly emits an ICE when linking against existing static archives that were built with a GCC that was configured to use zlib instead of zstd.

This is a general problem if you rebuild the compiler with changed USE...
Comment 9 Eli Schwartz gentoo-dev 2024-03-27 21:42:52 UTC
find /usr/lib* -name '*.a' | xargs equery belongs | sort -u

38 packages on my system. It is not a pleasant idea when doing a profile migration and --emptytree if you have to build all of those with LTO disabled before migrating, then rebuild them again with LTO to taste.
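To narrow that list down to archives that actually carry GCC LTO data, something like the following could be used. It is only a sketch: the function names are hypothetical, and it assumes that `readelf` is available and that the `.gnu.lto_` / `.gnu.debuglto_` section-name prefixes mentioned earlier in this bug are a sufficient indicator.

```shell
# Succeeds if the section listing on stdin mentions GCC LTO sections.
has_gcc_lto_sections() {
	grep -q -e '\.gnu\.lto_' -e '\.gnu\.debuglto_'
}

# Print every static archive below the given directories whose members
# still contain GCC LTO sections, judged by readelf's section headers.
list_lto_archives() {
	find "$@" -name '*.a' -type f 2>/dev/null | while IFS= read -r archive; do
		if readelf -S -W "$archive" 2>/dev/null | has_gcc_lto_sections; then
			printf '%s\n' "$archive"
		fi
	done
}
```

For example, `list_lto_archives /usr/lib* | xargs equery belongs | sort -u` would list only the owning packages that are affected.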
Comment 10 Arsen Arsenović gentoo-dev 2024-03-28 14:44:26 UTC
jakub has also suggested the fat-lto-object approach.  maybe we should make a 'dot-a.eclass' for adding + processing installed lto objects and object archives, plus a QA check that detects packages forgetting to use them.  this seems PM-independent but still solid enough
Comment 11 Arniiiii 2024-07-31 09:18:40 UTC Comment hidden (offtopic)
Comment 12 Arniiiii 2024-07-31 09:52:10 UTC Comment hidden (offtopic)
Comment 13 Arniiiii 2024-07-31 09:57:36 UTC Comment hidden (offtopic)
Comment 14 Arniiiii 2024-07-31 10:00:13 UTC Comment hidden (offtopic)
Comment 15 Arsen Arsenović gentoo-dev 2024-07-31 10:16:47 UTC
(In reply to Arniii from comment #11)
> (In reply to Arsen Arsenović from comment #10)
> > jakub has also suggested the fat-lto-object approach.  maybe we should make
> > a 'dot-a.eclass' for adding + processing installed lto objects and object
> > archives, plus a QA check that detects packages forgetting to use them. 
> > this seems PM-independent but still solid enough
> 
> Using fat-lto-objects just make it that if there's code that fails to
> understand static libs built with lto, it will try to use the part of the
> static lib that is normal. 
> 
> Is it bad? Yes, because it's saying "we don't care about performance gain
> from compiling libs with lto" . 
No, we don't care about compiling _static_ libraries with LTO.  We very much care about performance gain from LTO in various other bits of code.  I suspect most library interfaces wouldn't get much benefit from IPA over them anyway, though (but I have no data to back that up). 

> AFAIK shared libraries don't have lto at all, so static libs are the only
> approach for lto gain AFAIU.
They do, internally.

> (In reply to Eli Schwartz from comment #8)
> > This came up during the 23.0 profile migration as well.
> > 
> > The global USE default now enables zstd, which therefore toggles the gcc
> > IUSE and enables zstd-compressed LTO bytecode.
> > 
> > GCC promptly emits an ICE when linking against existing static archives that
> > were built with a GCC that was configured to use zlib instead of zstd.
> > 
> > This is a general problem if you rebuild the compiler with changed USE...
> 
> I believe it's a problem of an end user. If they want, they use or zlib, or
> zstd, or they set `-flto-compression-level=0` . 
I disagree with that mentality.  There are two valid solutions, and neither involves ICEs when compiling stuff that uses static libraries:

1) Strip LTO code from static libraries
2) Teach portage to track this dependency

(1) seems far more favorable because there are so few static libraries, but (2) is useful for other things also.

> (In reply to Sam James from comment #0)
> > I kept thinking we had a proper bug for this, but we didn't AFAIK. I
> > shoehorned it into bug 866422 for a while.
> 
> > The archives aren't compatible across GCC versions, even minor versions
> > (depends on if a commit which changes certain internals gets backported). The
> > interface version isn't always updated for such commits which causes a
> > segfault/ICE instead of a nicer error for mismatched versions.
> > 
> 
> 
> Maybe create a set of packages that creates static libs ( or add those to
> @preserved_rebuild ), or get any mechanism to rebuild all static libs if gcc
> version is updated or changed the gcc's ---zlib--- zstd use flag
We can't express that dependency today, I think.

> > We don't generally bother with static libraries and we can take the loss on
> > optimising them, especially given it also often blows up memory usage with
> > LTO too.
> 
> I would like not to take the loss, if it's possible. 
> About memory usage: there's a lot of notes about sometimes massive RAM usage
> for  when -flto , it's the problem of an end user, they understand what
> drawbacks are and agree to them. 
> Actually, the only big problem with massive RAM usage for me was QEMU (
> https://bugs.gentoo.org/883419 )( a guy at discord (Zen) with 512 GB of RAM
> said he could compile it with static-user and support for aarch64 ) and
> chromium packages ( require at least 128GB of RAM if lto + -ggdb3 ) .
> Nowadays, for all other packages the problem with big RAM usage is
> insignificant.

It is possible that it failed because the build system (ninja) ran too many processes, since it lacks jobserver support.  I ran into this too, but I don't recall the details.

> The actual problems are:
> 1. Recently I've seen a guy at gentoo's discord which was trying to use
> static libs and was on llvm profile. He compiled libcxx with static libs. He
> got both static libs and shared libs of libcxx. Somehow when compiling clang
> after this he got a lot of complains from compiler about adding `-fPIC` . I
> guess it should be investigated because it's maybe problem not the packages
> that create LTO static libs, but the packages that use the LTO static libs.
We don't have a Discord.  I've heard there's a Gentoo-themed community on Discord though.

Usage of PIC is unrelated to LTO.  Now, I don't know what the error was, but it was likely indicative of not using PIC but trying to build a shared library, which fails for obvious reasons.

> 2. GCC's static libs built with LTO are not understandable by Clang and vice
> versa: Clang's static libs build with LTO or LTO-thin are not understandable
> by GCC. I remember when I tried llvm profile one year ago there were enough
> of such problems. I recall at least next bug: https://bugs.gentoo.org/913040
Indeed, more reasons not to LTO static libs.

(In reply to Arniii from comment #12)
> 3. Figure out how to account what packages creates static libs and what
> packages depend on them, and if triggered ( gcc is updated or enabled zstd
> USE flag for gcc ), make mechanism of how to rebuild static libs and then
> rebuild packages that used them.

That'd work, but someone has to implement it.

(In reply to Arniii from comment #13)
> (In reply to Arniii from comment #12)
> I believe this can be solved by adding a metadata entry like if there was
> USE flag `static-libs` , there's going to be "STATIC_LIBS_LTO_COMPILER: gcc
> -llvm" or something like. Then we could figure out what packages should be
> rebuild if there was lto static lib built with gcc or llvm and make
> high-level entry, if it uses gcc and static libs, expect all packages that
> expected to have static libs to be compiled like "STATIC_LIBS_LTO_COMPILER:
> -gcc -llvm" or "STATIC_LIBS_LTO_COMPILER: gcc -llvm"
It'd be better not to special-case this instance; we have other examples (GHC, perl maybe?, emacs native-comp) where we'd want/need to propagate "computed" compatibility information.
Comment 16 Arsen Arsenović gentoo-dev 2024-07-31 10:17:27 UTC
(In reply to Arniii from comment #14)
> I believe we need such approach because only in a distant future there's
> going to be gcc-llvm lto compatibility or interoperability, so we need such
> approach.

LTO bitcode is compiler internals, so that's not happening.
Comment 17 Arniiiii 2024-07-31 10:53:30 UTC Comment hidden (offtopic)
Comment 18 Eli Schwartz gentoo-dev 2024-07-31 13:04:16 UTC
(In reply to Arniii from comment #11)
> Is it bad? Yes, because it's saying "we don't care about performance gain
> from compiling libs with lto" . 
> 
> AFAIK shared libraries don't have lto at all, so static libs are the only
> approach for lto gain AFAIU.


They have intra-artifact LTO, but not inter-artifact LTO.


(In reply to Arniii from comment #11)
> I would like not to take the loss, if it's possible. 


We are not declaring that fundamental problems are just "the user did something wrong", based purely on the fact that a *single* user would rather keep the bug in order to gain highly dubious LTO benefits.

As far as I am concerned there are exactly two valid options here:

- get future versions of portage to handle static libraries by stripping LTO for safety and QA reasons.

- offer a portage option to disable this safety feature, so that the future version of portage behaves exactly like the current version of portage in the event that people prefer the status quo and don't want to "take the loss".
Comment 19 Arniiiii 2024-07-31 13:12:02 UTC Comment hidden (offtopic)
Comment 20 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-31 13:15:10 UTC Comment hidden (offtopic)
Comment 21 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-31 13:16:13 UTC Comment hidden (offtopic)
Comment 22 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-07-31 13:18:51 UTC Comment hidden (offtopic)
Comment 23 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-10-01 00:41:07 UTC
*** Bug 940540 has been marked as a duplicate of this bug. ***
Comment 24 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-10-01 00:41:20 UTC
*** Bug 940538 has been marked as a duplicate of this bug. ***
Comment 25 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-11-27 06:52:50 UTC
*** Bug 944810 has been marked as a duplicate of this bug. ***
Comment 26 Larry the Git Cow gentoo-dev 2025-05-04 10:15:23 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/toolchain/binutils-patches.git/commit/?id=dbbd30a6cf0ba45b316f11a01f25428230fe2768

commit dbbd30a6cf0ba45b316f11a01f25428230fe2768
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2025-05-04 10:14:41 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2025-05-04 10:14:41 +0000

    9999: add H.J.'s patch for strip --plugin
    
    Bug: https://sourceware.org/PR21479
    Bug: https://bugs.gentoo.org/866422
    Bug: https://bugs.gentoo.org/926120
    Signed-off-by: Sam James <sam@gentoo.org>

 9999/0006-strip-lto-plugin.patch | 751 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 751 insertions(+)
Comment 27 Larry the Git Cow gentoo-dev 2025-05-06 08:46:02 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f4e964aa2024f09da289f71768cc9b09c1ab09cf

commit f4e964aa2024f09da289f71768cc9b09c1ab09cf
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2025-05-01 18:30:56 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2025-05-06 08:45:06 +0000

    dot-a.eclass: new eclass for handling LTO in static archives
    
    Introduce a new eclass with utility functions for handling LTO bytecode
    (or internal representation, IR) inside static archives (.a files).
    
    Static libraries when built with LTO will contain LTO bytecode which is
    not portable across compiler versions or compiler vendors. To avoid pessimising
    the library and always filtering LTO, we can build it with -ffat-lto-objects
    instead, which builds some components twice. The installed part will then
    have the LTO contents stripped out, leaving the regular objects in the
    static archive.
    
    It's not feasible to make these work otherwise, as we'd need tracking
    for whether a library was built by a specific compiler and its version,
    and that compatibility can vary based on other factors (e.g. with gcc,
    sys-devel/gcc[zstd] controls if it supports zstd compression for LTO). We
    also discourage static libraries anyway.
    
    Provide two functions:
    
    * lto-guarantee-fat
    
      If LTO is currently enabled (as determined by `tc-is-lto`, added in
      2aea6c3ff2181ad96187e456a3307609fd288d4c), add `-ffat-lto-objects`
      to CFLAGS and CXXFLAGS if supported.
    
      This guarantees that produced archives are "fat" (contain both IR
      and regular object files) for later pruning.
    
    * strip-lto-bytecode
    
      Process a given static archive (.a file) and remove its IR component,
      leaving a regular object.
    
    This approach is also taken by Fedora, openSUSE, and Debian/Ubuntu. An
    honourable mention to `lto-rebuild` which fulfilled the same task for many
    in the LTO overlay too.
    
    We did consider an alternative approach where we'd relink objects using
    the driver in src_install (or some hook afterwards), but this would be
    more brittle, as we'd need to extract the right arguments to use (see
    e.g. the recent Wireshark issues in fad8ff8a45afc83559f8df695cf96dfec51d3e8a
    for how this can be subtle) and not PM-agnostic given we don't have portable
    hooks right now (and even if we did, suspect they wouldn't work in a way
    that facilitated this). It's also not clear if such an approach would've
    worked for Clang.
    
    The tests have been quite helpful in debugging this and making sure things
    work as expected. They  both make sure the eclass does what it ought to,
    but also try to capture the expected interaction with the toolchain (which is
    why we have the skips depending on tooling & versions) to allow us to test
    workarounds and make sure we understand the interactions fully: it made
    it easy to test e.g. a patch to Binutils to make strip have plugin/LTO
    integration (PR21479).
    
    All of this wasn't worth pursuing until H. J. Lu's patches for Binutils
    landed, which they have now in binutils-2.44 [0], which made bfd's handling
    of mixed objects much more robust.
    
    [0] https://inbox.sourceware.org/binutils/20250112220244.597636-1-hjl.tools@gmail.com/
    
    Bug: https://bugs.gentoo.org/926120
    Thanks-to: Arsen Arsenović <arsen@gentoo.org>
    Co-authored-by: Eli Schwartz <eschwartz@gentoo.org>
    Signed-off-by: Sam James <sam@gentoo.org>

 eclass/dot-a.eclass   | 117 ++++++++++++
 eclass/tests/dot-a.sh | 511 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 628 insertions(+)
Comment 28 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-05-06 08:48:05 UTC
Done. Some more packages will need to be ported to use the eclass.
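For illustration, porting a package might look roughly like the fragment below. The function names `lto-guarantee-fat` and `strip-lto-bytecode` come from the commit message above; calling `strip-lto-bytecode` without arguments to process the whole install image is an assumption that should be checked against the eclass documentation. The stubs at the top are hypothetical stand-ins so the fragment can be sourced with bash outside Portage; a real ebuild would not contain them.

```shell
# Hypothetical stand-ins so this fragment runs outside Portage. In a real
# ebuild these come from the package manager and from dot-a.eclass.
if ! declare -F inherit >/dev/null; then
	inherit() { :; }
	econf() { echo "econf"; }
	default() { echo "default"; }
	lto-guarantee-fat() { echo "lto-guarantee-fat"; }
	strip-lto-bytecode() { echo "strip-lto-bytecode"; }
fi

inherit dot-a

src_configure() {
	# Ask for fat LTO objects before configuring, so that the archives
	# contain regular object code alongside the LTO bytecode.
	lto-guarantee-fat
	econf
}

src_install() {
	default
	# Drop the LTO bytecode from the installed static archives
	# (assumed here to default to the whole install image).
	strip-lto-bytecode
}
```

The ordering is the important part: the flag has to be set before the build, and the stripping has to happen after the files are installed.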
Comment 29 zyxhere 2025-05-06 10:21:47 UTC
Does this mean that the default flags can have -flto now?
Or are there any concerns still with LTO other than increasing build times?
Comment 30 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-05-06 11:57:26 UTC
(In reply to zyxhere from comment #29)
> Does this mean that the default flags can have -flto now?
> Or are there any concerns still with LTO other than increasing build times?

First, need to port more packages to use this eclass and test it more (per https://bugs.gentoo.org/926120#c28).

Then need to test more, and ideally have keen users test with LTO and also make sure to run test suites.

Then hopefully can look at making the binhost start to use LTO.

We can go from there.
Comment 31 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-05-07 14:19:48 UTC
Filed bug 955567 for tracking default plans.
Comment 32 Larry the Git Cow gentoo-dev 2025-05-08 05:16:54 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/binhost.git/commit/?id=057c5eccd298d01c781cf2bd1a0a7550113de537

commit 057c5eccd298d01c781cf2bd1a0a7550113de537
Author:     Eli Schwartz <eschwartz@gentoo.org>
AuthorDate: 2025-05-08 05:14:02 +0000
Commit:     Eli Schwartz <eschwartz@gentoo.org>
CommitDate: 2025-05-08 05:14:16 +0000

    lto: remove nolto masking for static archive exception
    
    We now have dot-a.eclass to handle this. Removing the exception means we
    will now start building flex with LTO, as it gets caught by our opt-in
    for sys-devel/*
    
    Bug: https://bugs.gentoo.org/926120
    Signed-off-by: Eli Schwartz <eschwartz@gentoo.org>

 builders/dola/gnome-23/portage/package.env/lto      | 5 -----
 builders/dola/kde-23/portage/package.env/lto        | 5 -----
 builders/dola/server-23/portage/package.env/lto     | 5 -----
 builders/milou/gnome-23/portage/package.env/lto     | 5 -----
 builders/milou/gnome-v3-23/portage/package.env/lto  | 5 -----
 builders/milou/kde-23/portage/package.env/lto       | 5 -----
 builders/milou/kde-v3-23/portage/package.env/lto    | 5 -----
 builders/milou/openrc-23/portage/package.env/lto    | 5 -----
 builders/milou/openrc-v3-23/portage/package.env/lto | 5 -----
 builders/milou/server-23/portage/package.env/lto    | 5 -----
 builders/milou/server-v3-23/portage/package.env/lto | 5 -----
 11 files changed, 55 deletions(-)