Bug 821955 - LLVM eclass needs a way to query library dependencies for linked llvm versions
Summary: LLVM eclass needs a way to query library dependencies for linked llvm versions
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Eclasses
Hardware: All Linux
Importance: Normal normal
Assignee: LLVM support project
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 797298
Reported: 2021-11-05 15:20 UTC by Sebastian Parborg
Modified: 2024-02-10 15:00 UTC (History)
CC List: 6 users

See Also:
Package list:
Runtime testing required: ---


Description Sebastian Parborg 2021-11-05 15:20:51 UTC
Currently there is no way to query or ensure that an ebuild's library dependencies use the same shared llvm library version as its other dependencies.

This leads to issues in, for example, media-gfx/blender: if media-libs/osl is built with a different version of llvm than media-libs/mesa, blender will crash on startup.

To remedy this we need to either statically link llvm into media-libs/osl (which doesn't seem to be an option, as we don't want static libraries), or implement a way for ebuilds to enforce that all dependencies use the same llvm version.

There is of course a third option, the one we currently use, where we simply pray that users will not get their llvm versions mixed in this manner. But so far this is failing regularly, as not all libraries support the latest llvm versions that are available in portage.
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-06 14:50:03 UTC
I don't think this is technically possible.  The best solution I can think of is binding specific packages to a single LLVM version (and optionally slotting them to allow other versions), and then making sure all your deps are bound to the same LLVM version.
Comment 2 Sebastian Parborg 2021-11-07 09:52:26 UTC
You mean it is not technically possible right now or that it would be impossible to add to portage?

I would think that, at least in this case, it should be possible, as OSL is a direct dependency of Blender.

Wouldn't it be possible to record/store which llvm slot a package is using and be able to query it? (Kinda like the python version use flags I guess)

Restricting the llvm version would work, so I could do that. However, I don't feel that is a good long-term solution. Let's take the current state of portage as an example.

LLVM 13 is marked as unstable, but I might get OSL patched to work with it.
If I then force OSL to only use LLVM 13, all users on the stable version will not be able to use blender with the OSL USE flag. Of course the reverse is true as well if we enforce that only the stable LLVM version is usable.

Just to reiterate: In this case we only need to make sure that mesa (if the user is running opensource GPU drivers) and OSL are using the same LLVM slot.
I can already do this heavy handedly in the blender ebuild by using `ldd` in the ebuild and calling `die` if the llvm versions don't match.
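
A minimal sketch of what such an `ldd`-based check could look like. The function names are hypothetical, and it assumes the shared library shows up in `ldd` output under a name starting with `libLLVM` (for example `libLLVM-13.so`). It only inspects the text that `ldd` prints, so a library that is loaded later via `dlopen` would not be seen.

```shell
# Hypothetical helper: print the distinct libLLVM sonames named in
# ldd-style output read from stdin (one soname per line, sorted).
llvm_sonames() {
    awk '$1 ~ /^libLLVM/ { print $1 }' | sort -u
}

# Hypothetical check: succeed if the ldd-style output on stdin names
# at most one distinct libLLVM soname, fail if versions are mixed.
llvm_is_consistent() {
    [ "$(llvm_sonames | grep -c '')" -le 1 ]
}

# Possible use inside an ebuild phase (the file paths are placeholders):
#   ldd /path/to/blender /path/to/mesa-driver.so | llvm_is_consistent \
#       || die "dependencies are linked against different llvm versions"
```

Because the helpers read from stdin, they can be tried on saved `ldd` output without having the packages installed.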

Because it is already possible to determine this mismatch this way, I would think that it should be possible to do this special dependency check before starting to emerge the package.

I think having a solution like that would bring far fewer problems down the line than if we simply don't support multiple llvm versions. Especially since we don't allow static linking.
Comment 3 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-07 10:08:46 UTC
Write a PMS spec for it, get it approved, wait for EAPI 9, then we can talk.
Comment 4 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-07 10:21:11 UTC
I'm sorry if my last comment sounded harsh.  I only meant to point out how hard it is to actually get it done.
Comment 5 Sebastian Parborg 2021-11-07 10:56:20 UTC
I don't really know what a "PMS spec" is. Google fails me (I'm getting paint color related search results).

If it would mean that we get a better way to equery packages for things like this, I wouldn't be against putting some work into it.

From your point of view, what is the hard part?

- Writing the code to make it work
- Getting it approved
- Both

At least to me, IN THEORY, it shouldn't be impossible. But you have much more experience than me in these matters so I would like to talk with you a bit so I can get a clear picture of what the pain points or impossible parts of this would be.
Comment 6 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-07 11:03:24 UTC
The hardest part is actually figuring out a design that is generic (i.e. doesn't hardcode LLVM-specific stuff) and works in all scenarios, including cross-ROOT installs, building and installing binary packages and so on.  To be honest, I don't even know where to start.

In the end, I'll probably end up unslotting LLVM.  Today it seems to cause more issues than it actually solves.  The only real advantage it has is providing old clang versions for testing but that's not strictly a problem for Gentoo to solve.
Comment 7 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-07 11:07:20 UTC
PMS is https://wiki.gentoo.org/wiki/Project:Package_Manager_Specification

Even if you figure out a good and working design, it may be rejected as "too hard to describe".  Then there's a matter of implementing it for Portage (which will probably fall on you), and maybe PkgCore (unless you figure out a design that is "optional").

Oh, and let's not forget that you actually need to be concerned by two scenarios: a. when the package is already installed (+ possibility of rebuilding it), and b. when it's being installed as part of the current batch.
Comment 8 Sebastian Parborg 2021-11-07 11:23:07 UTC
Ah, if you are considering unslotting llvm, then I guess it is not really worth spending time on this.

If it were to stay slotted, I guess the easy way out would be to essentially copy the python targets use flags. So we would have `llvm_targets_13` or something.

However I do agree with you that allowing different llvm slots does seem to give us more issues than it solves. Especially when you need to trigger rebuilds as you pointed out.
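
A rough sketch of the flag idea, not an existing eclass API. The `llvm_targets_` prefix comes from the example above, the function names are hypothetical, and each package's enabled flags are assumed to be available as a plain list of words. The first function accepts a flag list only if it selects exactly one target, and the second compares the targets of two packages.

```shell
# Hypothetical: print the slot selected by the single llvm_targets_*
# flag among the arguments; fail unless exactly one such flag is given.
llvm_target_of() {
    slot="" n=0
    for flag in "$@"; do
        case ${flag} in
            llvm_targets_*) slot=${flag#llvm_targets_}; n=$((n + 1)) ;;
        esac
    done
    [ "${n}" -eq 1 ] && printf '%s\n' "${slot}"
}

# Hypothetical: succeed only if two space-separated flag lists select
# the same single target ($1 and $2 are left unquoted on purpose so
# that each list is split into separate words).
same_llvm_target() {
    a=$(llvm_target_of $1) && b=$(llvm_target_of $2) && [ "${a}" = "${b}" ]
}

# Example:
#   same_llvm_target "osl llvm_targets_13" "llvm llvm_targets_13"
```

A check like this only covers packages whose flags are visible to the caller, so it says nothing about dependencies further down the tree.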

Perhaps unslotting llvm and having a hard requirement of creating a nice system to handle issues like these before it can be slotted again would be a good thing to do?
Comment 9 Sebastian Parborg 2021-11-18 16:54:08 UTC
Any ETA on when unslotting of llvm will happen?

I'm asking because, if it will take long, perhaps we should just mask the osl USE flag on the blender ebuild.

I'll update the blender ebuild for the upcoming 3.0 release so I could do it then.
Comment 10 Emily Rowlands 2021-11-18 16:59:17 UTC
This was discussed on the mailing list and Michał decided not to go ahead with the proposal.

https://archives.gentoo.org/gentoo-dev/message/fe9fed2791ddd4a609d37beeacf19406
Comment 11 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-18 17:03:40 UTC
Exactly.  There was significant opposition to losing this feature, and in the end it's something that's to Gentoo's advantage over other distros.

Nevertheless, we need to figure out some way of addressing this.  I think it's a tad late for that now but maybe I should keep new LLVM releases masked for some time while revdeps are being updated.  I could use some help figuring out which packages are actually broken like this, to keep an eye on them.
Comment 12 Sebastian Parborg 2021-11-18 18:07:36 UTC
Hmm, in the case of Blender it is actually only the OSL and Mesa combo that currently has this issue.

However I don't really know if simply masking will help in this case.
I have known about this for a long time, and this is why I even patched OSL to support llvm 11 and 12.

But even with patches, the end user can still end up with mixed llvm versions, as we currently can't tell if a user needs to rebuild packages if, for example, the Mesa version they use was recompiled with another llvm version.

So even if I make sure all the deps for Blender can be built with the llvm versions that are in portage, there is no guarantee that the end user will end up with the correct combination.

Anyways, thanks for the quick responses both of you!

I'm here if you need me to test things out or if you want my input on anything :)
Comment 13 Matt Turner gentoo-dev 2021-11-18 18:37:02 UTC
Short of adding an LLVM-equivalent of PYTHON_SINGLE_TARGET to packages that depend on LLVM, I don't have any good ideas about solving this.

Mesa's dependence on LLVM provides a particularly common place for a second LLVM version to appear in a process. We *could* statically link LLVM into Mesa to avoid this, but it wouldn't solve the problem in general, and I have concerns about doing that anyway.

This is, unfortunately, a problem with using LLVM.
Comment 14 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-11-18 19:18:43 UTC
The problem with using flags is that it only works when you depend on both packages directly, or add meaningful flags to interim dependencies.
Comment 15 Matt Turner gentoo-dev 2021-11-18 19:44:57 UTC
(In reply to Michał Górny from comment #14)
> The problem with using flags is that it only works when you depend on both
> packages directly, or add meaningful flags to interim dependencies.

Ah, you are right.

Yeah, seems kind of intractable with currently available infrastructure.
Comment 16 Sebastian Parborg 2022-11-10 12:43:37 UTC
Is there any light at the end of the tunnel for this?

As llvm 15 was marked stable, I'm now again getting crash bug reports because of llvm version mismatches:
https://bugs.gentoo.org/880671

If this doesn't get a proper solution, I don't see any other way than static linking to solve this.
Comment 17 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-02-05 23:46:47 UTC
See https://marc.info/?l=gentoo-dev&m=170715274130875&w=2.
Comment 18 Larry the Git Cow gentoo-dev 2024-02-10 10:47:29 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5618de75aec49009489efb560a89e014fd060524

commit 5618de75aec49009489efb560a89e014fd060524
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2024-02-05 19:29:36 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2024-02-10 10:47:23 +0000

    llvm-r1.eclass: Initial version
    
    Bug: https://bugs.gentoo.org/923228
    Bug: https://bugs.gentoo.org/880671
    Closes: https://bugs.gentoo.org/821955
    Closes: https://bugs.gentoo.org/919150
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 eclass/llvm-r1.eclass   | 250 ++++++++++++++++++++++++++++++++++++++++++++++++
 eclass/tests/llvm-r1.sh | 151 +++++++++++++++++++++++++++++
 2 files changed, 401 insertions(+)
Comment 19 Sebastian Parborg 2024-02-10 15:00:19 UTC
I just wanted to say thanks for implementing this! Should make it easier to prevent crashes. :)