Bug 681806 - sys-libs/libomp must be built with clang/clang++ for use cuda to be fully installed
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Bernard Cafarelli
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-26 17:51 UTC by Robert
Modified: 2020-02-15 16:09 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Patch for libomp-9.0.0 that correctly handles cuda libraries (libomp-9.0.0.patch,1.16 KB, patch)
2019-12-03 14:51 UTC, Robert
Updated Patch for libomp-9.0.0 that correctly handles cuda libraries (libomp-9.0.0.patch,1.16 KB, patch)
2019-12-03 14:55 UTC, Robert
sys-libs/libomp-10.0.0_rc1-r1.ebuild (libomp-10.0.0_rc1-r1.ebuild,3.18 KB, text/plain)
2020-02-15 16:09 UTC, justXi

Description Robert 2019-03-26 17:51:13 UTC
If one compiles sys-libs/libomp with the cuda USE flag, one must also build it with the matching clang compiler so that all the generally wanted files get built (specifically /usr/lib64/libomptarget-nvptx-sm_35.bc, which provides the bitcode for the builtin instructions on the GPU).  The ebuild should emit a warning that tells the user how to compile the package with clang using Portage environments.

Additionally, libomptarget-nvptx-sm_35.bc gets installed by default to a location that is not on what clang thinks is Gentoo's library path.  Perhaps a warning could be emitted for this too, or the file could be put in one of the directories that clang searches.

See also: https://www.hahnjo.de/blog/2018/10/08/clang-7.0-openmp-offloading-nvidia.html
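
The per-package Portage environment mentioned above could look roughly like the sketch below. It is an illustration, not the fix that was applied: the env file name "compiler-clang" and the helper name write_clang_env are hypothetical, and it assumes package.env is a plain file rather than a directory.

```shell
#!/bin/sh
# Sketch: create a Portage per-package environment that builds
# sys-libs/libomp with clang/clang++ instead of the default compiler.
# "compiler-clang" is an arbitrary (hypothetical) file name.
write_clang_env() {
    # Optional first argument: alternate config root (empty = live system).
    configroot="${1:-}"
    mkdir -p "${configroot}/etc/portage/env" || return 1

    # Variables set here apply only to packages that reference this file.
    cat > "${configroot}/etc/portage/env/compiler-clang" <<'EOF'
CC="clang"
CXX="clang++"
EOF

    # Assumption: package.env is a file. If it is a directory on your
    # system, put this line in a file inside that directory instead.
    printf '%s\n' 'sys-libs/libomp compiler-clang' \
        >> "${configroot}/etc/portage/package.env"
}
```

Running `write_clang_env` as root with no argument writes to /etc/portage; passing a directory writes below that directory, which is useful for trying it out first.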
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2019-03-26 20:22:05 UTC
I don't have a problem forcing clang as that's what we do already for compiler-rt.  Do we only need with USE=cuda?  Could you expand a bit on the paths, maybe provide a patch?
Comment 2 Robert 2019-03-26 20:54:27 UTC
(In reply to Michał Górny from comment #1)
> I don't have a problem forcing clang as that's what we do already for
> compiler-rt.  Do we only need with USE=cuda?  Could you expand a bit on the
> paths, maybe provide a patch?

There may be other gpu devices that need equivalents of bc files, but the cuda one is the one I know about, and the one I need, and the only one I can test.

I could toss together a patch in a few weeks.  I'd need to do some digging to figure out where clang is searching for the device bitcode library.  How far back would you want the patches applied?  I think the feature was added in llvm7.x
Comment 3 Robert 2019-12-02 22:21:06 UTC
I finally figured out the path clang is searching for libomptarget.  It is looking in $(llvm-config --prefix)/lib64, so CMAKE_INSTALL_PREFIX needs to be set to $(llvm-config --prefix) for libomp.

I now have a working ebuild for cuda offloading with libomp.  I can make a patch in the next few days.
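
As a rough sketch of the prefix change described in this comment (not the attached patch itself), the CMake argument could be derived as follows. The function names are hypothetical, and /usr/lib/llvm/9 in the usage note is only an example value.

```shell
#!/bin/sh
# Sketch: compute the install prefix under which clang looks for the
# libomptarget device bitcode, and build the matching CMake argument.

libomp_install_prefix() {
    # An explicit argument overrides the lookup; otherwise ask the
    # llvm-config found in PATH.
    if [ -n "${1:-}" ]; then
        printf '%s\n' "$1"
    else
        llvm-config --prefix
    fi
}

libomp_cmake_args() {
    prefix="$(libomp_install_prefix "${1:-}")" || return 1
    # The comment above reports clang searching ${prefix}/lib64, so the
    # package is installed under the same prefix.
    printf '%s\n' "-DCMAKE_INSTALL_PREFIX=${prefix}"
}
```

For example, `libomp_cmake_args /usr/lib/llvm/9` prints `-DCMAKE_INSTALL_PREFIX=/usr/lib/llvm/9`. In an ebuild the same value would presumably be passed through mycmakeargs.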
Comment 4 Robert 2019-12-03 14:51:03 UTC
Created attachment 598324
Patch for libomp-9.0.0 that correctly handles cuda libraries

I've only made a patch for the version of libomp that I use.  Others should follow similarly.  Please let me know if you need help testing.
Comment 5 Robert 2019-12-03 14:55:32 UTC
Created attachment 598326
Updated Patch for libomp-9.0.0 that correctly handles cuda libraries

Sorry the previous patch was inverted.
Comment 6 justXi 2019-12-21 14:05:12 UTC
Thanks for the patch. 
I had the same problem and the patch solved it.

Currently I use libomp-9.0.0 and nvidia-cuda-toolkit-10.1.243-r1. This CUDA version needs GCC 8.x. Furthermore, as far as I understand, CUDA 10.1.x is the latest version supported by llvm/clang 9.
Comment 7 justXi 2019-12-30 13:37:14 UTC
I think a complete, current list for NVPTX_COMPUTE_CAPABILITIES (https://developer.nvidia.com/cuda-gpus) should include "35,50,52,60,61,70,75" if it is enabled by the cuda USE flag. Or what about a new USE_EXPAND like "NVPTX_TARGETS" with SM35, ..., SM75?

libomp-10.* seems to be getting support for AMDGCN, but this is currently experimental and builds only if the right tools are installed.

https://github.com/llvm/llvm-project/blob/master/openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt

There is LIBOMPTARGET_AMDGCN_GFXLIST initialized with "gfx700 gfx701 gfx801 gfx803 gfx900".
Comment 8 Robert 2019-12-30 18:16:17 UTC
justXi, I think a new USE_EXPAND option is the right idea here.  However, if we are going to do that, we should make some corresponding changes to sys-devel/clang to set the default PTX target; specifically, we should add an ebuild variable to set `CLANG_OPENMP_NVPTX_DEFAULT_ARCH` in the cmake configuration.
Comment 9 justXi 2020-01-02 13:52:12 UTC
So we would get:

NVPTX_DEFAULT_TARGET=^^ ( SM35 ... SM75 )
affects "CLANG_OPENMP_NVPTX_DEFAULT_ARCH"

NVPTX_TARGETS=|| ( SM35 ... SM75 )
affects "LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES"
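
To illustrate how the two proposed variables might map onto the CMake options named above, here is a minimal sketch. The helper names are hypothetical. It assumes that LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES takes a comma-separated list such as "35,70" (the form quoted in comment 7) and that CLANG_OPENMP_NVPTX_DEFAULT_ARCH takes a value of the form "sm_35".

```shell
#!/bin/sh
# Sketch: translate NVPTX_TARGETS / NVPTX_DEFAULT_TARGET style flags
# (SM35, SM70, ...) into values for the CMake options discussed above.

# "SM35 SM70" -> "35,70" for LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES.
nvptx_capabilities() {
    list=""
    for flag in "$@"; do
        case "$flag" in
            [Ss][Mm][0-9]*) cap="${flag#??}" ;;   # strip the "SM" prefix
            *) echo "unknown target: $flag" >&2; return 1 ;;
        esac
        list="${list:+${list},}${cap}"
    done
    printf '%s\n' "$list"
}

# "SM35" -> "sm_35" for CLANG_OPENMP_NVPTX_DEFAULT_ARCH (assumed format).
nvptx_default_arch() {
    case "$1" in
        [Ss][Mm][0-9]*) printf 'sm_%s\n' "${1#??}" ;;
        *) echo "unknown target: $1" >&2; return 1 ;;
    esac
}
```

For instance, `nvptx_capabilities SM35 SM70 SM75` yields `35,70,75` and `nvptx_default_arch SM35` yields `sm_35`. An ebuild would more likely loop over `use` checks than positional arguments, but the string handling would be similar.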
Comment 10 justXi 2020-02-15 16:09:58 UTC
Created attachment 613952
sys-libs/libomp-10.0.0_rc1-r1.ebuild