Created attachment 471384
emerge --info llvm

I am using media-libs/mesa-17.0.5, sys-devel/llvm-3.9.1-r1 and media-libs/vulkan-loader-1.0.42.0. When I try to run any Vulkan demo (such as vulkaninfo), I get the following error:

===========
VULKAN INFO
===========

Vulkan API Version: 1.0.42

: CommandLine Error: Option 'asan-instrument-assembly' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

This has been mentioned both in our forums and upstream:
- https://forums.gentoo.org/viewtopic-p-8006010.html#8006010
- https://lists.freedesktop.org/archives/mesa-dev/2016-October/130765.html
Created attachment 471386
emerge --info mesa
The following quote is from upstream support: https://patchwork.freedesktop.org/patch/125361/

> This was reported/mentioned by Dave back in Feb [1]. The conclusion is
> - do not use BUILD_SHARED_LIBS [2].
>
> [1] https://lists.freedesktop.org/archives/mesa-dev/2016-February/107242.html
> [2] https://lists.freedesktop.org/archives/mesa-dev/2016-February/107306.html
Actually, it's not a bug with media-libs/vulkan-loader but with sys-devel/llvm.
(In reply to EoD from comment #3)
> Actually, it's not a bug with media-libs/vulkan-loader but with
> sys-devel/llvm.

Does anyone know why llvm and clang are (by Gentoo default) compiled with BUILD_SHARED_LIBS rather than with LLVM_BUILD_LLVM_DYLIB? From what I've seen, BUILD_SHARED_LIBS is not generally recommended (https://releases.llvm.org/5.0.0/docs/CMake.html), and it leads to problems like bug 617154.

When LLVM is configured with LLVM_BUILD_LLVM_DYLIB, media-libs/mesa-17.3.0_rc2 installs successfully (provided it is compiled with --disable-llvm-shared-libs), and the runtime error mentioned in bug 617154 no longer occurs.

I'm guessing that building with BUILD_SHARED_LIBS might be required by other packages? I tried a few, such as dev-libs/libclc, but they build cleanly as well. Any ideas? It seems all of this could be sorted out if there were appropriate USE flags.
Linking the dylib requires humongous amounts of memory. Bugs should be fixed, not worked around.
Also, I won't be able to fix this unless someone provides me with a simple way to reproduce it. I should point out that I don't have any Vulkan-compatible hardware, so the mesa+vulkan-loader route is impossible for me.
Created attachment 505594
Simple test case

Ok, I've been able to produce an easy test case for this.
Long story short, LLVM, Mesa and glibc are all doing very stupid things.

The LLVM command-line parser is based on global static variables that add themselves to the global command-line parser instance via their constructors. As it is impossible to destroy global objects, those options are never removed from the parser, even if the library providing them is unloaded. They just become invalid pointers...

Mesa attempts to load and *unload* random plugins, which is generally a very bad idea. Once a library is loaded, it should stay loaded. Unloading can provoke a lot of undefined behavior, which is what you are seeing right now.

Finally, glibc is stupid enough to actually attempt unloading of libraries. However, it has some cheap hacks to try to make things less broken, which result in some of the LLVM libraries being unloaded and some not. That makes things even more broken. For comparison, in musl dlclose() is a no-op because unloading a library is a very bad idea.

As I suspected, the 'single dylib' is not a solution but a cheap workaround for the problem. It just shoves all the symbols into one library, forcing glibc to either keep them all or unload them along with the parser instance and hope that loading it again won't result in the instance still being alive with invalid contents. In our case, the former happens, and that's why things don't explode horribly. Of course, the same problem will still happen if another library using the LLVM command-line library is loaded, say, libclang. Or some random LLVM plugin.

The correct solution would be to rewrite the LLVM option parser so that it does not rely on completely undefined behavior. However, I seriously doubt LLVM upstream will want to replace their 'bright' solution that magically loads options into the parser (except when it explodes). In any case, I can't think of a good replacement that's likely to fly.

So I think our solution space is limited to hacking the libraries providing command-line options so that they are never unloaded.
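To make the failure mode described above concrete, here is a minimal C++ sketch of a self-registering option that is never deregistered. It is a simplified model, not LLVM's actual code: `OptionRegistry`, `globalRegistry`, `Option`, `simulateLoadAndUnload` and `secondLoadFails` are hypothetical names, and a "library load followed by unload" is simulated by constructing and destroying an object rather than by calling dlopen()/dlclose().

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the process-wide option parser.
// Only the duplicate-name check is modeled here.
class OptionRegistry {
public:
    void add(const std::string& name) {
        // std::set::insert reports whether the name was new.
        if (!names_.insert(name).second)
            throw std::runtime_error("Option '" + name +
                                     "' registered more than once!");
    }
    bool contains(const std::string& name) const {
        return names_.count(name) != 0;
    }
    std::size_t size() const { return names_.size(); }

private:
    std::set<std::string> names_;
};

// One registry for the whole process, outliving any simulated library.
OptionRegistry& globalRegistry() {
    static OptionRegistry registry;
    return registry;
}

// Self-registering option: the constructor adds it to the registry,
// but nothing ever removes the entry again.
struct Option {
    explicit Option(const std::string& name) {
        globalRegistry().add(name);
        assert(globalRegistry().contains(name));
    }
};

// Simulates loading and then unloading a library that defines one option:
// the object goes away, the registry entry stays behind.
void simulateLoadAndUnload() {
    Option opt("asan-instrument-assembly");
}

// Returns true if loading the same simulated library a second time is
// rejected as a duplicate registration.
bool secondLoadFails() {
    simulateLoadAndUnload();
    try {
        simulateLoadAndUnload();
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Called once in a fresh process, `secondLoadFails()` returns `true`: the registry outlives the simulated library and still holds the stale entry when the option registers itself again. Keeping the library resident avoids the second registration altogether.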
Supposedly STB_GNU_UNIQUE linkage does that. Which means we probably need to inject it somehow into cl::opt.
Created attachment 505862
Attempted-unsuccessful patch

The good news is that after spending most of the day on this, I was able to make it inject a proper STB_GNU_UNIQUE symbol into the libraries. The bad news is that it doesn't prevent ld.so from unloading them, as the documentation claims it would.

I'd appreciate some more help here.
Created attachment 506574
Sketchy turn off linker unloading patch

(In reply to Michał Górny from comment #9)
> Created attachment 505862
> Attempted-unsuccessful patch
>
> Good news is that after spending most of the day on this, I was able to make
> it inject a proper STB_GNU_UNIQUE symbol to the libraries. Bad news is that
> it doesn't prevent ld.so from unloading it as documentation claims it would.
>
> I'd appreciate some more help here.

Thanks for all your efforts, Michał. Agreed, throwing memory at a problem is a pretty poor solution. I had the same linker problem with your patch and couldn't get around it.

I recompiled llvm (without the patch) with the CMake variable CMAKE_SHARED_LINKER_FLAGS set to include "-Wl,-z,nodelete". This got rid of the runtime error, but I'm not sure whether doing this has significant ramifications (or even whether this is the best place to set the flag).
Thanks for your patch, Ross. It didn't occur to me that there could be a straightforward linker flag to do that. I've modified your patch to pass it only on Linux and pushed that for upstream review. Let's see how that goes.
This is now merged upstream, and -9999 should no longer suffer from the issue. I'm going to wait some time in case it causes any more issues, and then request backporting to the 5.0 branch, followed by a backport to 4.0.1. I'm sorry, but 3.9 is no longer supported in Gentoo, so please don't expect it there.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3f7b543082fa986823d08a1df44eb8bf634f10df

commit 3f7b543082fa986823d08a1df44eb8bf634f10df
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-12-01 16:23:42 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-12-01 23:40:43 +0000

    sys-devel/llvm: Backport unloading prevention fix

    Bug: https://bugs.gentoo.org/617154

 ...Wl-z-nodelete-on-Linux-to-prevent-unloadi.patch | 56 +++++++++++++++++
 ...Wl-z-nodelete-on-Linux-to-prevent-unloadi.patch | 71 ++++++++++++++++++++++
 .../{llvm-4.0.1.ebuild => llvm-4.0.1-r1.ebuild}    |  4 ++
 sys-devel/llvm/llvm-5.0.1_rc2.ebuild               |  4 ++
 sys-devel/llvm/llvm-5.0.9999.ebuild                |  4 ++
 5 files changed, 139 insertions(+)
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=849e14c28cd1743b443c7d8b41b61c1b4a56f2fa

commit 849e14c28cd1743b443c7d8b41b61c1b4a56f2fa
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-12-20 19:16:28 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-12-20 20:58:40 +0000

    sys-devel/llvm: Bump to 5.0.1 (final)

    Closes: https://bugs.gentoo.org/617154
    Closes: https://bugs.gentoo.org/636840

 sys-devel/llvm/Manifest                                |  3 ++-
 .../llvm/{llvm-5.0.1_rc2.ebuild => llvm-5.0.1.ebuild}  | 16 ++++++++--------
 2 files changed, 10 insertions(+), 9 deletions(-)