Hi there,

I have the nvptx target in my make.conf and nvidia-cuda-toolkit installed. I use the current clang (clang-runtime-19.1.4, sys-libs/libomp-19.1.4::gentoo), yet whenever I try to build a program with

clang++ -fopenmp -fopenmp-targets=nvptx64

I get the following reply:

clang++: error: no library 'libomptarget-nvptx-sm_75.bc' found in the default clang lib directory or in LIBRARY_PATH; use '--libomptarget-nvptx-bc-path' to specify nvptx bitcode library
Process terminated with status 1 (0 minute(s), 0 second(s))

Reproducible: Always

sys-devel/clang-19.1.4:19/19.1::gentoo USE="extra (pie) static-analyzer xml -debug -doc (-ieee-long-double) -test -verify-sig" ABI_X86="(64) -32 (-x32)" LLVM_TARGETS="(AArch64) (AMDGPU) (ARM) (AVR) (BPF) (Hexagon) (Lanai) (LoongArch) (MSP430) (Mips) (NVPTX) (PowerPC) (RISCV) (Sparc) (SystemZ) (VE) (WebAssembly) (X86) (XCore) -ARC -CSKY -DirectX -M68k -SPIRV -Xtensa" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

 * IMPORTANT: 28 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

localhost /home/benni/projects/MdspanOpenmptest2 # emerge --info Portage 3.0.66.1 (python 3.12.7-final-0, default/linux/amd64/23.0/desktop/plasma, gcc-14, glibc-2.40-r5, 6.11.6-gentoo-x86_64 x86_64) ================================================================= System uname: Linux-6.11.6-gentoo-x86_64-x86_64-AMD_Ryzen_9_3900X_12-Core_Processor-with-glibc2.40 KiB Mem: 32792468 total, 12461760 free KiB Swap: 31249404 total, 31249148 free Timestamp of repository gentoo: Thu, 28 Nov 2024 03:00:00 +0000 Head commit of repository gentoo: 52a163bf41cc8777712637518d9fded9511df07d Timestamp of repository escpr2: Thu, 31 Oct 2024 18:34:26 +0000 Head commit of repository escpr2: f2c923b8c651f1d14744975f879c2c78f6e5f8f9 sh bash 5.2_p37 ld GNU ld (Gentoo 2.43 p3) 2.43.1 app-misc/pax-utils: 1.3.8::gentoo app-shells/bash: 5.2_p37::gentoo dev-build/autoconf: 2.13-r8::gentoo, 2.72-r1::gentoo dev-build/automake: 1.17-r1::gentoo dev-build/cmake: 3.31.0::gentoo dev-build/libtool: 2.5.4::gentoo dev-build/make: 4.4.1-r100::gentoo dev-build/meson: 1.6.0::gentoo dev-java/java-config: 2.3.4::gentoo dev-lang/perl: 5.40.0::gentoo dev-lang/python: 3.12.7_p1::gentoo, 3.13.0::gentoo dev-lang/rust-bin: 1.81.0-r100::gentoo, 1.82.0-r100::gentoo sys-apps/baselayout: 2.17::gentoo sys-apps/openrc: 0.55.1::gentoo sys-apps/sandbox: 2.40::gentoo sys-devel/binutils: 2.43-r2::gentoo sys-devel/binutils-config: 5.5.2::gentoo sys-devel/clang: 18.1.8-r6::gentoo, 19.1.4::gentoo sys-devel/gcc: 13.3.1_p20241115::gentoo, 14.2.1_p20241116::gentoo sys-devel/gcc-config: 2.11::gentoo sys-devel/lld: 19.1.4::gentoo sys-devel/llvm: 18.1.8-r6::gentoo, 19.1.4::gentoo sys-kernel/linux-headers: 6.11::gentoo (virtual/os-headers) sys-libs/glibc: 2.40-r5::gentoo Repositories: gentoo location: /var/db/repos/gentoo sync-type: rsync sync-uri: rsync://rsync.gentoo.org/gentoo-portage priority: -1000 volatile: False sync-rsync-verify-jobs: 1 sync-rsync-verify-metamanifest: yes sync-rsync-extra-opts: sync-rsync-verify-max-age: 3 
escpr2 location: /var/db/repos/escpr2 sync-type: git sync-uri: https://github.com/gentoo-mirror/escpr2.git masters: gentoo volatile: False Binary Repositories: gentoobinhost priority: 9999 sync-uri: https://distfiles.gentoo.org/releases/amd64/binpackages/23.0/x86-64 ACCEPT_KEYWORDS="amd64 ~amd64" ACCEPT_LICENSE="@FREE @BINARY-REDISTRIBUTABLE" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=native -O3 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c" CXXFLAGS="-march=native -O3 -pipe" DISTDIR="/var/cache/distfiles" ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME" FCFLAGS="-march=native -O3 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance binpkg-request-signature buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles getbinpkg ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-march=native -O3 -pipe" GENTOO_MIRRORS="http://distfiles.gentoo.org" LANG="de_DE.utf8" LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs" LEX="flex" PKGDIR="/var/cache/binpkgs" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats 
--human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" SHELL="/bin/bash" USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cet clang compiler-rt contrib crypt cuda cudnn cups dbus declarative dri dts dvd dvdr dvi elogind encode eps exif fits flac fortran gdbm gdbui gif go gphoto2 gpm graphite gtk gui iconv icu ipv6 jit jpeg kde kf6compat kwallet lcms libcxx libnotify libtirpc llvm lm-sensors lto mad mng modules-sign mp3 mp4 mpeg multilib ncurses networkmanager nls nvenc objc objc++ ogg ompt opencv opengl openmp pam pango pcre pdf pipewire plasma png policykit postscript ppds pulseaudio qml qt5 qt6 raw readline rust screencast sdl seccomp semantic-desktop sound spell ssl startup-notification svg test-rust tiff tpm truetype udev udisks uefi unicode upower usb vorbis vulkan wayland wcs widgets wxwidgets x264 xattr xcb xft xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gcc_12" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx avx2 f16c fma3 pclmul popcnt rdrand sha sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" GUILE_SINGLE_TARGET="3-0" GUILE_TARGETS="3-0" INPUT_DEVICES="libinput" KERNEL="linux" L10N="de" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" 
LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-2" POSTGRES_TARGETS="postgres16" PYTHON_SINGLE_TARGET="python3_12" PYTHON_TARGETS="python3_12" RUBY_TARGETS="ruby32" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account" Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, MAKEOPTS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS sys-libs/libomp-19.1.4:0/19.1::gentoo USE="ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB sys-devel/clang-runtime-19.1.4:19::gentoo USE="compiler-rt libcxx openmp sanitize" ABI_X86="32 (64) (-x32)" 0 KiB
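To check whether the bitcode file named in the error exists anywhere on the system, a small helper like the following can be used. This is only a sketch: the function name find_nvptx_bc and the directories passed to it are illustrative assumptions, not part of clang or Portage. Only the file name pattern libomptarget-nvptx-<arch>.bc and the --libomptarget-nvptx-bc-path option are taken from the error message in the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of clang or Portage): search the given
# directories for the NVPTX device bitcode library that clang asks for.
# Usage: find_nvptx_bc <arch> <dir>...
find_nvptx_bc() {
    arch=$1
    shift
    for dir in "$@"; do
        # Skip directories that do not exist instead of failing.
        [ -d "$dir" ] || continue
        # Print every matching file below this directory, one per line.
        find "$dir" -type f -name "libomptarget-nvptx-${arch}.bc" 2>/dev/null
    done
}
```

For example, find_nvptx_bc sm_75 /usr/lib/llvm /usr/lib64 (both directories are guesses) prints one path per match and nothing if the file is absent. If a path is printed, its location can be handed to clang through --libomptarget-nvptx-bc-path, as the error message suggests.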
*** Bug 945266 has been marked as a duplicate of this bug. ***
I want to note that libomp-18 had an offload USE flag for this. Installing libomp-18 would, however, require the changes shown below. I do not want to switch back to clang/openmp 18, and I also do not want the 32-bit ABI... Unfortunately, libomp-19 no longer checks for the offload flag and does not seem to build or install the files clang needs to offload a program via OpenMP to an NVIDIA GPU. I have searched for the files, but could not find any that I could point clang to...

[ebuild UD ] sys-libs/libomp-18.1.8:0/18.1::gentoo [19.1.4:0/19.1::gentoo] USE="offload%* ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" LLVM_TARGETS="-AMDGPU% -NVPTX%" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB
[nomerge ] mail-client/claws-mail-4.3.0-r2::gentoo USE="dbus gnutls imap libcanberra libnotify networkmanager nls notification oauth pdf pgp spell startup-notification svg -archive -bogofilter -calendar -clamav -debug -doc -ldap -litehtml -nntp -perl -python -rss -session -sieve -smime -spam-report -spamassassin -valgrind -webkit -xface" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11"
[nomerge ] net-misc/networkmanager-1.48.10-r1::gentoo USE="bluetooth concheck elogind gtk-doc introspection modemmanager nss (policykit) ppp tools wext wifi -audit -connection-sharing -debug -dhclient -dhcpcd -gnutls -iptables -iwd -libedit -nftables -ofono -ovs -psl -resolvconf (-selinux) -syslog -systemd -teamd -test -vala" ABI_X86="(64) -32 (-x32)"
[nomerge ] dev-libs/newt-0.52.24::gentoo USE="gpm nls -tcl" PYTHON_TARGETS="python3_12 -python3_10 -python3_11 -python3_13"
[binary R ] sys-libs/gpm-1.20.7-r6-7::gentoo USE="(-selinux)" ABI_X86="32* (64) (-x32)" 200 KiB
[ebuild R ] sys-devel/llvm-18.1.8-r6:18/18.1::gentoo USE="binutils-plugin libffi ncurses xml zstd -debug -debuginfod -doc -exegesis -libedit -test -verify-sig -z3" ABI_X86="32* (64) (-x32)" LLVM_TARGETS="(AArch64) (AMDGPU) (ARM) 
(AVR) (BPF) (Hexagon) (Lanai) (LoongArch) (MSP430) (Mips) (NVPTX) (PowerPC) (RISCV) (Sparc) (SystemZ) (VE) (WebAssembly) (X86) (XCore) -ARC -CSKY -DirectX -M68k -SPIRV -Xtensa" 0 KiB [ebuild R ] sys-libs/ncurses-6.5_p20241109:0/6::gentoo USE="gpm stack-realign (tinfo) -ada (-cxx) -debug -doc -minimal -profile (-split-usr) -static-libs -test -trace -verify-sig" ABI_X86="32* (64) (-x32)" 0 KiB Total: 4 packages (1 downgrade, 3 reinstalls, 1 binary), Size of downloads: 200 KiB * Error: circular dependencies: (sys-libs/gpm-1.20.7-r6-7:0/0::gentoo, binary scheduled for merge) depends on (sys-libs/ncurses-6.5_p20241109:0/6::gentoo, ebuild scheduled for merge) (runtime_slot_op) (sys-libs/gpm-1.20.7-r6-7:0/0::gentoo, binary scheduled for merge) (buildtime) It might be possible to break this cycle by applying the following change: - sys-libs/ncurses-6.5_p20241109 (Change USE: -gpm) Note that this change can be reverted, once the package has been installed. !!! Multiple package instances within a single package slot have been pulled !!! into the dependency graph, resulting in a slot conflict: sys-libs/libomp:0 (sys-libs/libomp-18.1.8:0/18.1::gentoo, ebuild scheduled for merge) USE="offload ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" LLVM_TARGETS="-AMDGPU -NVPTX" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" pulled in by =sys-libs/libomp-18.1.8 (Argument) (sys-libs/libomp-19.1.4:0/19.1::gentoo, installed) USE="ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" pulled in by >=sys-libs/libomp-19.1.4[abi_x86_32(-)?,abi_x86_64(-)?,abi_x86_x32(-)?,abi_mips_n32(-)?,abi_mips_n64(-)?,abi_mips_o32(-)?,abi_s390_32(-)?,abi_s390_64(-)?] 
required by (sys-devel/clang-runtime-19.1.4:19/19::gentoo, installed) USE="compiler-rt libcxx openmp sanitize" ABI_X86="32 (64) (-x32)" ^^ ^^^^^^ It may be possible to solve this problem by using package.mask to prevent one of those packages from being selected. However, it is also possible that conflicting dependencies exist such that they are impossible to satisfy simultaneously. If such a conflict exists in the dependencies of two different packages, then those packages can not be installed simultaneously. For more information, see MASKED PACKAGES section in the emerge man page or refer to the Gentoo Handbook. The following USE changes are necessary to proceed: (see "package.use" in the portage(5) man page for more details) # required by sys-devel/llvm-18.1.8-r6::gentoo[libffi] # required by sys-libs/libomp-18.1.8::gentoo[offload] # required by @selected # required by @world (argument) >=dev-libs/libffi-3.4.6-r2 abi_x86_32 # required by sys-libs/libomp-18.1.8::gentoo[offload] # required by @selected # required by @world (argument) >=sys-devel/llvm-18.1.8-r6:18 abi_x86_32 # required by sys-devel/llvm-18.1.8-r6::gentoo[ncurses] # required by sys-libs/libomp-18.1.8::gentoo[offload] # required by @selected # required by @world (argument) >=sys-libs/ncurses-6.5_p20241109 abi_x86_32 # required by sys-devel/llvm-18.1.8-r6::gentoo[xml] # required by sys-libs/libomp-18.1.8::gentoo[offload] # required by @selected # required by @world (argument) >=dev-libs/libxml2-2.13.5 abi_x86_32 # required by dev-libs/libxml2-2.13.5::gentoo[icu] # required by sys-devel/llvm-18.1.8-r6::gentoo[xml] # required by sys-libs/libomp-18.1.8::gentoo[offload] # required by @selected # required by @world (argument) >=dev-libs/icu-76.1-r1 abi_x86_32
In this related bug, somebody provided an updated ebuild that tries to handle the CUDA libraries, but I guess it targets an old version, so the paths, the CUDA compute capabilities and so on would have to be updated, assuming that approach is still correct. It may also be necessary to update the path variable or to place these files somewhere clang can find them:
https://681806.bugs.gentoo.org/attachment.cgi?id=598326
USE=offload was removed from libomp-19 back in April 2024 in [1]. I don't know whether the situation has improved since then in a way that would make it easier. Either way, we mostly just need someone to figure out a way to make it work.

Sounds like another case of would-be-easier if LLVM were a single big package rather than split into clang, libomp, etc. (except maybe some specific runtime components). That is rather involved, though, and not everyone wants it to happen, so pushing for *that* solution may be difficult.

[1] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=81b1f0d8cf5e
Author: Michał Górny <mgorny@gentoo.org>
Date: Sat Apr 27 06:42:21 2024

sys-libs/libomp: Remove offloading support

Upstream split offload into a separate component, that does not work when built standalone, and building via runtimes is entirely broken.
(In reply to Ionen Wolkens from comment #4)
> (except maybe some specific runtime components)

Well, I said that with an idea in mind that we had sometimes considered, but libomp is technically one of those runtime components, so that wouldn't work.
Thanks for the information. Perhaps one could couple the offload flag with clang in the ebuild (e.g. offload? (clang) or so)?

I am now trying to build clang from source. Unfortunately, upstream is not very user-friendly when it comes to the correct parameters for offloading. I hope that something like this will work:

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm

The problem is that gcc on Gentoo also seems to have difficulties with offloading. I suspect that for offloading in gcc I would have to install crossdev and then nvptx-tools. Unfortunately, emerging that also fails on my system. And the Gentoo package site says:

https://packages.gentoo.org/packages/sys-devel/nvptx-tools

Version 20240326 is available upstream. Please consider updating! It seems that version 20240326 is available upstream, while the latest version in the Gentoo tree is 0_pre20230122.

This package is masked and could be removed soon! The mask comment indicates that this package is scheduled for removal from our package repository. Please review the mask information below for more details.

On my system, the emerge of that package (after unmasking) fails. So this is a bit inconvenient if you want to program your GPU...
So that was it: I had written crossdev -stable -t nvptx. Apparently, crossdev -t nvptx seems to work. I hope that I will then have both clang and gcc able to offload. But I still find it a bit inconvenient, for a system like Gentoo, to have to go for CMake and install a compiler suite oneself, or to go for masked packages. GPU offloading is not that new in the testing branch...
No, crossdev fails:

home/benni/projects/MdspanOpenmptest2 # crossdev -t nvptx
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 * crossdev version: 20240921
 * Host Portage ARCH: amd64
 * Host Portage System: x86_64-pc-linux-gnu (i686-pc-linux-gnu x86_64-pc-linux-gnu)
 * Target Portage ARCH: *
 * Target System: nvptx
 * Stage: 3 (C compiler & libc)
 * USE=multilib: no
 * Target ABIs: default
 * binutils: nvptx-tools-[latest]
 * gcc: gcc-[latest]
 * headers: linux-headers-[latest]
 * libc: newlib-[latest]
 * CROSSDEV_OVERLAY: /var/db/repos/escpr2
 * PORT_LOGDIR: /var/log/portage
 * PORTAGE_CONFIGROOT: /
 * Portage flags:
 _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ -
 * leaving sys-kernel/linux-headers in /var/db/repos/escpr2
 * leaving sys-libs/newlib in /var/db/repos/escpr2
 * leaving sys-devel/nvptx-tools in /var/db/repos/escpr2
 * leaving sys-devel/gcc in /var/db/repos/escpr2
 * leaving dev-debug/gdb in /var/db/repos/escpr2
 * leaving metadata/layout.conf alone in /var/db/repos/escpr2
 _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ - ~ - _ -
 * Log: /var/log/portage/cross-nvptx-nvptx-tools.log
 * Emerging cross-nvptx-tools ... [ ok ]
 * Log: /var/log/portage/cross-nvptx-gcc-stage1.log
 * Emerging cross-gcc-stage1 ...
 * error: gcc failed :(
 *
 * If you file a bug, please attach the following logfiles:
 * /var/log/portage/cross-nvptx-info.log
 * /var/log/portage/cross-nvptx-gcc-stage1.log.xz
 * /var/tmp/portage/cross-nvptx/gcc*/temp/gcc-config.logs.tar.xz

So I guess I am stuck with clang and have to hope it compiles for offloading...
I suspect that is why the package was hard masked and in for removal. So currently, with offloading removed from libomp, Gentoo has no real GPU offloading support. This is from the log of crossdev and gcc:

ake[3]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/libcc1'
make[1]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build'
Checking multilib configuration for libgcc...
mkdir -p -- nvptx/libgcc
Configuring in nvptx/libgcc
configure: creating cache ./config.cache
checking build system type... x86_64-pc-linux-gnu
checking host system type... nvptx-unknown-none
checking for --enable-version-specific-runtime-libs... no
checking for a BSD-compatible install... /usr/lib/portage/python3.12/ebuild-helpers/xattr/install -c
checking for gawk... gawk
checking for nvptx-ar... nvptx-ar
checking for nvptx-lipo... nvptx-lipo
checking for nvptx-nm... /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/nm
checking for nvptx-ranlib... nvptx-ranlib
checking for nvptx-strip... nvptx-strip
checking whether ln -s works... yes
checking for nvptx-gcc... /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/xgcc -B/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/ -B/usr/nvptx/bin/ -B/usr/nvptx/lib/ -isystem /usr/nvptx/include -is>
checking for suffix of object files... configure: error: in `/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/nvptx/libgcc':
configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details
make[1]: *** [Makefile:12426: configure-target-libgcc] Error 1
make[1]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build'
make[1]: *** Waiting for unfinished jobs.... 
make[2]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools' x86_64-pc-linux-gnu-g++ -O2 -pipe -fPIE -fno-exceptions -fno-rtti -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/../libcody -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++> -MMD -MP -MF resolver.d -c -o resolver.o /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/resolver.cc make[2]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools' make[2]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools' x86_64-pc-linux-gnu-g++ -O2 -pipe -fPIE -fno-exceptions -fno-rtti -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/../libcody -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++> -MMD -MP -MF server.d -c -o server.o /var/tmp/
As for the decision to remove offloading from libomp, I really do not understand it. Those who need that flag are people who write their own applications. They do not really care whether they additionally have to build clang, as it is most likely already on their system. So one could provide it as a USE flag and couple "offload" to a dependency on the clang suite. When you write GPU applications, you do not need to save that much disk space. Also, with offloading removed, you have to build clang from source anyway...
(In reply to Benjamin Schulz from comment #10)
> As for the decision to remove offloading from libomp, I really do not
> understand it.

Did you read what I said and the commit message I linked? The decision was made because it was broken with how things are set up right now, and someone needs to figure out how to make it work with our ebuilds again (which may or may not be difficult, depending on whether things have improved since then). It wasn't removed because we think it shouldn't be there.
Hi, I have now tried to compile clang with OpenMP offloading from source. This is what I got from make:

For x86_64 builtins preferring x86_64/floatundixf.S to floatundixf.c
-- Looking for __GLIBC__
-- Looking for __GLIBC__ - found
-- Performing Test HAS_THREAD_LOCAL
-- Performing Test HAS_THREAD_LOCAL - Success
-- Builtin supported architectures: i386;x86_64
-- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan"
-- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan"
-- check-shadowcallstack does nothing.
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):
  /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say)

-- Configuring incomplete, errors occurred!
make[2]: *** [runtimes/CMakeFiles/runtimes-configure.dir/build.make:76: runtimes/runtimes-stamps/runtimes-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:238731: runtimes/CMakeFiles/runtimes-configure.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

Since gcc offloading support is also broken, I currently have no real option to test code that offloads to the GPU, which is a bit... sad...
> I suspect that is why the package was hard masked and in for removal.

sys-devel/nvptx-tools isn't masked for removal (it just shouldn't be emerged unless via crossdev). Can you file a new bug for the crossdev failure you had? Thanks.
Unsurprisingly, its build system is still broken. Actually, it's even more broken than it was originally -- looks like someone's been trying to copy parts of standalone build logic from openmp and then maintain it without actually testing anything. I'm going to try some time to look into fixing it over the weekend.
Hi, thanks. Regarding the build system: I made two attempts. The first used the wrong parameters; it did not include

clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload"

So that time clang got installed manually, but without offloading support. The second attempt, with the runtimes for offloading enabled, failed. After that, I unmerged clang and tried to install clang 18 with offloading support, which also broke. These three attempts apparently did something to the system. Or was it an emerge --sync that updated something in the ebuild? Anyway, I now get this error when I try to install clang:

-- Performing Test C_WCOMMENT_ALLOWS_LINE_WRAP - Failed
-- Performing Test C_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG
-- Performing Test C_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG - Failed
-- Performing Test CXX_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG
-- Performing Test CXX_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG - Success
-- Performing Test LINKER_SUPPORTS_COLOR_DIAGNOSTICS
-- Performing Test LINKER_SUPPORTS_COLOR_DIAGNOSTICS - Failed
-- Looking for os_signpost_interval_begin
-- Looking for os_signpost_interval_begin - not found
-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB
-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB - Success
-- Performing Test HAVE_CXX_ATOMICS64_WITHOUT_LIB
-- Performing Test HAVE_CXX_ATOMICS64_WITHOUT_LIB - Success
-- Performing Test LLVM_HAS_ATOMICS
-- Performing Test LLVM_HAS_ATOMICS - Success
-- Found Python3: /usr/bin/python3.12 (found version "3.12.7") found components: Interpreter
CMake Error at CMakeLists.txt:126 (message):
  llvm-gtest not found. Please install llvm-gtest or disable tests with
  -DLLVM_INCLUDE_TESTS=OFF

-- Configuring incomplete, errors occurred! 
 * ERROR: sys-devel/clang-19.1.4::gentoo failed (configure phase):
 * cmake failed
 *
 * Call stack:
 * ebuild.sh, line 136: Called src_configure
 * environment, line 4076: Called multili

Of course I have USE="-test" set, so the build ignores that flag. And it does not matter: even if I switch the flag on, the ebuild still aborts with this error. I filed a separate bug on this:
https://bugs.gentoo.org/945316

Others have run into this occasionally as well, I guess:
https://www.reddit.com/r/Gentoo/comments/1cpm32d/emerging_clang_fails_cannot_find_llvmgtest/

It now turns out that I first had to remove the version I had installed manually from source, by running xargs rm < install_manifest.txt in the build directory. After that, I could re-install clang.

While it is good to be reminded to remove a program that I compiled from source before overwriting it with another version from Portage, it is quite strange that a manual install of clang via cmake would basically "turn a Portage USE flag on" for that program and make the build ignore the parameters it is given...

I see the makefiles are pretty large, so fixing this will take a lot of time, I guess. Thank you for trying to help, especially since gcc offload is currently also broken on my system...
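The manual cleanup described above (xargs rm < install_manifest.txt) deletes whatever the manifest lists without showing it first. A slightly more cautious variant is sketched below. The function name remove_from_manifest and its --dry-run switch are assumptions made for illustration; install_manifest.txt is the file CMake writes into the build directory during the install step.

```shell
#!/bin/sh
# Hypothetical helper: remove the files listed in a CMake install manifest.
# Usage: remove_from_manifest <manifest> [--dry-run]
remove_from_manifest() {
    manifest=$1
    mode=${2:-}
    [ -f "$manifest" ] || { echo "no manifest: $manifest" >&2; return 1; }
    # Read one path per line; IFS= and -r keep spaces and backslashes intact.
    # The extra test handles a last line without a trailing newline.
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue
        if [ "$mode" = "--dry-run" ]; then
            echo "would remove: $path"
        elif [ -e "$path" ] || [ -L "$path" ]; then
            rm -f -- "$path"
        fi
    done < "$manifest"
}
```

Running remove_from_manifest install_manifest.txt --dry-run in the build directory first shows what would be deleted; the same call without --dry-run then removes the files. Empty directories are left behind in both cases.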
I am currently rebuilding my gcc with the offload patches from sam. If somebody has working patches for clang/llvm that I can test, let me know. I see there is some ongoing work:
https://github.com/llvm/llvm-project/pull/118173

However, I suspect I would need some short instructions on what exactly to do. I am not too well versed in Gentoo's system internals. Once an offload compiler is there, I can test it myself with a short C++ file. For the curious, here are plenty of examples for offloading:
https://enccs.github.io/openmp-gpu/target/
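A short test file of the kind mentioned above might look like the following sketch. The script writes a small C++ program and, when asked, compiles it with the flags from the original report. The function names, the file name offload_check.cpp and the derived output name are assumptions for illustration; omp_get_num_devices() and omp_is_initial_device() are standard OpenMP runtime routines.

```shell
#!/bin/sh
# Write a minimal OpenMP target-offload test program to the given file.
write_offload_test() {
    cat > "$1" <<'EOF'
#include <omp.h>
#include <cstdio>

int main() {
    int on_host = 1;  // stays 1 if the target region falls back to the host
    std::printf("visible devices: %d\n", omp_get_num_devices());
    #pragma omp target map(tofrom: on_host)
    { on_host = omp_is_initial_device(); }
    std::printf("target region ran on %s\n", on_host ? "host" : "device");
    return 0;
}
EOF
}

# Compile the test with the flags from the report. This step only succeeds
# once clang can find its NVPTX offload runtime.
build_offload_test() {
    clang++ -fopenmp -fopenmp-targets=nvptx64 "$1" -o "${1%.cpp}"
}
```

Calling write_offload_test offload_check.cpp followed by build_offload_test offload_check.cpp would produce ./offload_check. If the program reports that the target region ran on the host, the build worked but the region was not offloaded.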
USE="-go" crossdev -t nvptx-none finished without error. This is a somewhat annoying problem, since the only way this would work regularly is to put the USE flag -go into a config file for the offload compiler, and then the Gentoo dev teams would have to check periodically whether that is still valid. I will now test some code. Let's see...
OpenMP offloading with gcc now works for me. Thanks to sam for doing this work. If I want to test offloading with clang now, do I simply need to copy the makefiles from
https://github.com/llvm/llvm-project/pull/118173
into the clang source directory and then compile it from source with these parameters?

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm
Apparently not. The changed makefiles produce the following error:

100%] Built target omptarget.rtl.cuda
[100%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o
/home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found
   15 | #include <ffi.h>
      |          ^~~~~~~
1 error generated.
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:345458: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [runtimes/CMakeFiles/runtimes-build.dir/build.make:76: runtimes/runtimes-stamps/runtimes-build] Error 2
make[1]: *** [CMakeFiles/Makefile2:238827: runtimes/CMakeFiles/runtimes-build.dir/all] Error 2

with these parameters:

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm
ffi.h is in offload/plugins-nextgen/host/dynamic_ffi/ffi.h

The makefiles, however, just have lines like

target_include_directories(${target_name} PUBLIC ${common_dir}/include)

or

target_include_directories(omptarget.rtl.host PRIVATE dynamic_ffi)
For what it's worth: if I also copy ffi.h from offload/plugins-nextgen/host/dynamic_ffi into the folder offload/plugins-nextgen/host and change its include from <> to "", then it compiles, but then of course it links wrongly. That is also certainly not a solution to the CMake problem. I do not know why the CMakeLists.txt in the host folder does not find the header when it is told PRIVATE dynamic_ffi in the target_include_directories directive...
If I add the line

target_include_directories(omptarget.rtl.host PRIVATE dynamic_ffi)

outside the if clause in plugins-nextgen/host/CMakeLists.txt, then it compiles without a source change, but it still has this linker error:

100%] Linking CXX shared library /home/benni/projects/clang/build/lib/libomptarget.so
/usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: /home/benni/projects/clang/build/lib/libomptarget.rtl.host.a(rtl.cpp.o): in function `llvm::omp::target::plugin::GenELF64PluginTy::initImpl()':
rtl.cpp:(.text._ZN4llvm3omp6target6plugin16GenELF64PluginTy8initImplEv[_ZN4llvm3omp6target6plugin16GenELF64PluginTy8initImplEv]+0x15): undefined reference to `ffi_init()'

So I guess another path is wrong, and once that is corrected it would build...
Just for your information: when I make a clean clone with git clone https://github.com/llvm/llvm-project.git, create a build directory inside its /llvm-project folder, and then run cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libc;lld;lldb;polly;pstl;openmp;flang;libclc;compiler-rt;bolt" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm I get the following error: -- Setting LIBC_NAMESPACE namespace to '__llvm_libc_20_0_0_git' c++: error: unknown command line option »--print-resource-dir«; did you mean »--print-search-dirs«? c++: severe failure: no input files compilation terminated. -- Set COMPILER_RESOURCE_DIR to /usr/lib/gcc/x86_64-pc-linux-gnu/14/ using --print-search-dirs CMake Error at /home/benni/projects/clang/llvm-project/libc/cmake/modules/LLVMLibCArchitectures.cmake:92 (message): libc build: could not read compiler target info from: Has the command to create the build changed now? I sometimes wonder whether this is an assessment center project from Apple or AMD... "Here, let's hand them a few CMakeLists.txt and *.cpp files and then see who can fix that in the shortest amount of time..."
hm, these settings: cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;compiler-rt;libclc;lld;openmp" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;openmp;offload;compiler-rt" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" ../llvm would give me output like this: behavior and not rely on setting a policy to OLD. Call Stack (most recent call first): /home/benni/projects/clang/llvm-project/clang/CMakeLists.txt:7 (include) -- Clang version: 20.0.0git -- Found Python3: /usr/bin/python3.13 (found version "3.13.1") found components: Interpreter -- libclc target 'amdgcn--' is enabled -- device: tahiti ( pitcairn;verde;oland;hainan;bonaire;kabini;kaveri;hawaii;mullins;tonga;tongapro;iceland;carrizo;fiji;stoney;polaris10;polaris11;gfx602;gfx705;gfx805;gfx900;gfx902;gfx904;gfx906;gfx908;gfx909;gfx90a;gfx90c;gfx940;gfx941;gfx942;gfx1010;gfx1011;gfx1012;gfx1013;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151;gfx1152;gfx1153;gfx1200;gfx1201 ) -- libclc target 'amdgcn--amdhsa' is enabled -- device: none ( ) -- libclc target 'amdgcn-mesa-mesa3d' is enabled -- device: tahiti ( pitcairn;verde;oland;hainan;bonaire;kabini;kaveri;hawaii;mullins;tonga;tongapro;iceland;carrizo;fiji;stoney;polaris10;polaris11;gfx602;gfx705;gfx805;gfx900;gfx902;gfx904;gfx906;gfx908;gfx909;gfx90a;gfx90c;gfx940;gfx941;gfx942;gfx1010;gfx1011;gfx1012;gfx1013;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151;gfx1152;gfx1153;gfx1200;gfx1201 ) -- libclc target 'clspv--' is enabled -- device: none ( ) -- libclc target 'clspv64--' is enabled -- device: none ( ) -- libclc target 'nvptx--' is enabled -- device: none ( ) -- libclc target 'nvptx--nvidiacl' is enabled -- device: none ( ) -- libclc target 'nvptx64--' is enabled -- device: none ( ) -- libclc target 'nvptx64--nvidiacl' is enabled -- device: none ( ) -- libclc target 'r600--' is 
enabled -- device: cedar ( palm;sumo;sumo2;redwood;juniper ) -- device: cypress ( hemlock ) -- device: barts ( turks;caicos ) -- device: cayman ( aruba ) CMake Error at /usr/share/cmake/Modules/ExternalProject.cmake:2959 (add_custom_target): add_custom_target cannot create target "builtins" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/compiler-rt/lib/builtins". See documentation for policy CMP0002 for more details. Call Stack (most recent call first): cmake/modules/LLVMExternalProjectUtils.cmake:363 (ExternalProject_Add) runtimes/CMakeLists.txt:90 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:166 (builtin_default_target) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "compiler-rt" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/compiler-rt". See documentation for policy CMP0002 for more details. Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "install-compiler-rt" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/compiler-rt". See documentation for policy CMP0002 for more details. 
Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "install-compiler-rt-stripped" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/compiler-rt". See documentation for policy CMP0002 for more details. Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "check-openmp" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/openmp". See documentation for policy CMP0002 for more details. Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "check-compiler-rt" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/compiler-rt/test". See documentation for policy CMP0002 for more details. 
Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target) -- Registering ExampleIRTransforms as a pass plugin (static build: OFF) -- Registering Bye as a pass plugin (static build: OFF) -- LLVM FileCheck Found: /usr/lib/llvm/19/bin/FileCheck -- Google Benchmark version: v0.0.0, normalized to 0.0.0 -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile -- Performing Test HAVE_POSIX_REGEX -- success -- Performing Test HAVE_STEADY_CLOCK -- success -- Performing Test HAVE_PTHREAD_AFFINITY -- success -- Configuring incomplete, errors occurred!
This one: cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libclc;lld" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;openmp;offload;compiler-rt" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" ../llvm would finish configuring without errors. But does this build an OpenMP that can offload, now that I have removed openmp from "Projects"? When I add openmp to the projects as well, I get this: -- device: barts ( turks;caicos ) -- device: cayman ( aruba ) CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target): add_custom_target cannot create target "check-openmp" because another target with the same name already exists. The existing target is a custom target created in source directory "/home/benni/projects/clang/llvm-project/openmp". See documentation for policy CMP0002 for more details. Call Stack (most recent call first): runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add) runtimes/CMakeLists.txt:554 (runtime_default_target)
configure works, but the build fails with the errors from before: -- Performing Test COMPILER_RT_TARGET_HAS_UNAME - Success -- Performing Test HAS_THREAD_LOCAL -- Performing Test HAS_THREAD_LOCAL - Success -- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan" -- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan" -- check-shadowcallstack does nothing. -- Performing Test OPENMP_HAVE_ONEAPI_COMPILER -- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed [ 98%] Built target builtins.opt.tahiti-amdgcn-mesa-mesa3d [ 98%] Generating tahiti-amdgcn-mesa-mesa3d.bc -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message): LIBOMP: 128-bit quad precision functionality requested but not available Call Stack (most recent call first): /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say) Am I doing or configuring something wrong?
this one here seems to build: cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libclc;lld;compiler-rt;openmp" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;offload" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=75 -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm But can this offload?
no it does not build. it ends with the error: Built target runtimes-clobber [100%] Built target runtimes-configure [100%] Performing build step for 'runtimes' [ 0%] Built target merge_runtime_commands [ 0%] Built target unwind_shared_objects [ 0%] Built target unwind_shared [ 7%] Built target unwind_static_objects [ 7%] Built target unwind_static [ 7%] Built target generate-cxxabi-headers [ 7%] Building CXX object libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/cxa_aux_runtime.cpp.o /home/benni/projects/clang/llvm-project/libcxxabi/src/cxa_aux_runtime.cpp:13:10: fatal error: 'exception' file not found 13 | #include <exception> | ^~~~~~~~~~~ 1 error generated. make[5]: *** [libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/build.make:79: libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/cxa_aux_runtime.cpp.o] Fehler 1 make[4]: *** [CMakeFiles/Makefile2:323095: libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/all] Fehler 2 make[3]: *** [Makefile:136: all] Fehler 2 make[2]: *** [runtim and if i remove the target libcxx, then it ends with the error that I had before: [ 52%] Built target omptarget.rtl.amdgpu [ 52%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/src/rtl.cpp.o [ 52%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/dynamic_cuda/cuda.cpp.o [ 52%] Linking CXX static library /home/benni/projects/clang/llvm-project/build/lib/libomptarget.rtl.cuda.a [ 52%] Built target omptarget.rtl.cuda [ 52%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o /home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found 15 | #include <ffi.h> | ^~~~~~~ 1 error generated. 
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Fehler 1 make[4]: *** [CMakeFiles/Makefile2:322917: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Fehler 2 make[3]: *** [Makefile:136: all] Fehler 2 make[2]: *** [runtimes/CMakeFiles/runtimes-build.dir/build.make:76: runtimes/runtimes-stamps/runtimes-build] Fehler 2 make[1]: *** [CMakeFiles/Makefile2:195563: runtime
mgorny's already submitted other fixes for stuff like the ffi issue.
Thanks Sam, I tried a clean GitHub pull, since I thought the fix had been submitted. This has now built: cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm (Without runtimes, it appears that -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" suffices to build the NVPTX target.) The folder CustomClang/share/clc/ now has the kind of .bc files that clang kept asking for. I will test it later today. Apparently the documentation also says that a component that can be built either as a runtime or as a project should be listed in either Projects or Runtimes in the cmake command, not both. But some packages, for example libcxx, I can now only put in Runtimes... And it does not work if I build the offload runtime. So I still do not know whether this can really offload, but I now have files like nvptx64.bc, which looks similar to the file it wanted...
It appears that I have that fix from mgorny. And unfortunately it seems I need the offload runtime. For example, this builds: # cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx" -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm But it does not give me the correct file. Clang wants the file from this build step: [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_75.bc.out Apparently I do not get this file by only writing -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"; I also need to set -DLLVM_ENABLE_RUNTIMES="offload". Yet the entire command, after building that file, breaks somewhere later with an error. # cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx;offload" -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm leads to...
96%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/src/rtl.cpp.o [ 96%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o /home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found 15 | #include <ffi.h> | ^~~~~~~ [ 96%] Building CXX object offload/plugins-nextgen/amdgpu/CMakeFiles/omptarget.rtl.amdgpu.dir/dynamic_hsa/hsa.cpp.o [ 96%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/dynamic_cuda/cuda.cpp.o [ 96%] Packaging LLVM offloading binary libomptarget-amdgpu-gfx1012.bc.out [100%] Embedding LLVM offloading binary in devicertl-amdgpu-gfx942.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_37.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_35.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_50.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_52.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_53.bc.out [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_60.bc.out [....] [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_70.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_62.bc.out [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_72.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_62.o [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_72.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_80.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_80.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_86.bc.out [100%] Embedding LLVM offloadi [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_86.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_90.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_90.o 1 error generated. 
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Fehler 1 make[4]: *** [CMakeFiles/Makefile2:324787: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Fehler 2 make[4]: *** Es wird auf noch nicht beendete Prozesse gewartet … [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_89.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_89.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_87.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_87.o [100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_75.bc.out [100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_75.o [100%] Linking CXX static library /home/benni/projects/clang/llvm-project/build/lib/libomptarget.devicertl.a [100%] Built target omptarget.devicertl [100%] Linking CXX At least it is not the file for my gpu where it breaks down.... it generates an out file for my gpu. it breaks somewhere else...
So that was it, I had missed the latest fix from mgorny. Now it builds with cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx" -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm ; make I will test it later today...
Oh no, I forgot again to enable the offload runtime, so the previous build only succeeded without it. I have now re-checked that I do indeed have all the recent changes from mgorny. As soon as I add the offload runtime to the CMake command, the build fails with nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found 15 | #include <ffi.h> | ^~~~~~~ [ 96%] Built target libc.src.stdio.scanf_core.scanf_main
Hm, if I try to force-compile a cpp file with the clang generated by cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx" -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm ; and pass CustomClang/share/clc/nvptx64--.bc as the offload bitcode file, then it at least compiles, but I get a link-time problem: linking module '/CustomClang/share/clc/nvptx64--.bc': Linking two modules of different target triples: '/CustomClang/share/clc/nvptx64--.bc' is 'nvptx64-unknown-unknown' whereas '/home/benni/projects/openmptestnew/openmpoffloatest/main.cpp' is 'nvptx64-nvidia-cuda' The other nvptx files do not work either. So I guess I need the offload runtime in the build command to get the right file, and that runtime is currently broken because of libffi. But there are many amd-something.bc files in that folder. It almost looks as if the nvptx code were packed into this nvptx64--.bc file but could not be linked or something...
If I add the parameter -DFFI_INCLUDE_DIR=/usr/lib64/libffi/include, then it builds. The full command is cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx;offload" -DFFI_INCLUDE_DIR=/usr/lib64/libffi/include ../llvm Testing it later...
A test example compiles with offloading, but if I try to run it, it says: ./a.out: error while loading shared libraries: libomptarget.so.20.0git: cannot open shared object file: No such file or directory In the above command openmp was built as a project, not as a runtime. I guess that is why libomptarget.so.20.0git is not in the install manifest. If I remove openmp from the projects and build it as a runtime instead, then I get this: -- Found system-installed LLVM 20.0.0git with headers in /home/benni/projects/clang/llvm-project/llvm/include;/home/benni/projects/clang/llvm-project/build/include -- Clang-tidy tests are enabled. -- Performing Test OPENMP_HAVE_ONEAPI_COMPILER -- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message): LIBOMP: 128-bit quad precision functionality requested but not available Call Stack (most recent call first): /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say)
By the way, the command with which I compiled my example was this: clang++ -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda ./main3.cpp ./main.cpp -lm -lstdc++ Here is at least some documentation: https://openmp.llvm.org/SupportAndFAQ.html It indeed says that offload AND openmp should both be built as runtimes, which I can't do because of the "LIBOMP: 128-bit quad precision functionality requested but not available" error that then appears in the CMake configure stage...
this may be related to the quad precision support problem for the omp target https://reviews.llvm.org/D64289
Today's LLVM 20.x reintroduces sys-libs/llvm-offload for this. Will do 19.x later.
See https://wiki.gentoo.org/wiki/Project:LLVM/Testing_new_LLVM_versions.
Hi, I found time to test your ebuilds. I just wanted to give the following feedback:
The clang 20.0.0.0.9999 ebuild compiles, but apparently it does not install the necessary symlinks, so the new clang 20 cannot be found by typing clang++ and must be invoked as /usr/lib/llvm/20/bin/clang.
If I emerge llvm-offload, the following code:
    #include <stdio.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif
    int main() {
        int num_devices = omp_get_num_devices();
        printf("Number of available devices %d\n", num_devices);
        #pragma omp target
        {
            if (omp_is_initial_device()) {
                printf("Running on host\n");
            } else {
                printf("This code is running on the target device.\n");
                int nteams = omp_get_num_teams();
                int nthreads = omp_get_num_threads();
                printf("Running on device with %d teams in total and %d threads in each team\n", nteams, nthreads);
            }
        }
    }
compiles on my system with
    /usr/lib/llvm/20/bin/clang -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda ./main3.cpp ./main.cpp -lm -lstdc++
and when I run it, the program reports that it is running on the device. So that is successful. Thank you for making this possible.
Interestingly, in contrast to Sam's gcc crossdev solution, clang has access not only to numerical functions from libc, but also to printf. gcc was a bit limited in this respect. This is amazing, since the GPU has only limited access to shared memory and printf needs to be able to write to stdout. But with clang, this has limits, too.
So it is not very surprising that if we modify the above code, replacing printf with an include of iostream and std::cout << "Text", we get the following errors:
    ./main.cpp:23:7: warning: type 'ostream' (aka 'basic_ostream<char>') is not trivially copyable and not guaranteed to be mapped correctly [-Wopenmp-mapping]
    nvlink error : Undefined reference to '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
    nvlink error : Undefined reference to 'strlen' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
    /usr/lib/llvm/20/bin/clang-nvlink-wrapper: error: 'nvlink' failed
With mathematical functions, fortunately, there appears to be no problem. But from that, we can see that it really works on the target GPU.
For optimized loops, it would be nice to be able to use the Polly optimizer with it. Currently, Gentoo's clang has no Polly support.
And, well, the ebuild has these lines:
    cmake_src_configure
    if [[ -z ${gpus} ]]; then
        # clang requires libomptarget.devicertl.a, but it can be empty
        > "${BUILD_DIR}"/libomptarget.devicertl.a || die
    fi
While this may work nicely on Gentoo, I generally want to write code that can run on as many devices and operating systems as possible. (Therefore, using a standard like OpenMP is a good choice, since it does not depend on a specific accelerator.) Now imagine I am on another platform. The clang manual in the git repository nowhere says that one has to create an empty file called libomptarget.devicertl.a to satisfy the compiler.
I guess things like that should be in the build system, e.g. in a CMakeLists.txt. And, well, it is still a bit irritating that if I build openmp as a runtime with clang, I get this: CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message): LIBOMP: 128-bit quad precision functionality requested but not available Call Stack (most recent call first): which one apparently does not get when building it as a project. How is this solved in gentoo's libopenmp? It apparently has something to do with hardware limits, where one processor supports quad precision and the consumer CPU only plain double... But otherwise, well, I now get code that works on the target with clang. So, success. Thanks for the good work. I can now continue with my mathematical code. Thanks. With best regards, Benjamin
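The observation above (printf works inside a target region, std::cout does not link on the device) suggests a simple workaround: keep only plain scalars inside the target region and do all iostream or string formatting on the host afterwards. The following is a minimal sketch of that pattern, not something taken from the ebuild or from upstream. The names OffloadResult, sum_squares and describe are hypothetical, and the stub for omp_is_initial_device is only an assumption that lets the file build without -fopenmp.

```cpp
#include <cassert>
#include <string>
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: stub so the sketch also builds without -fopenmp;
// in that case the pragma is ignored and everything runs on the host.
static int omp_is_initial_device() { return 1; }
#endif

// Hypothetical helper type, not part of any OpenMP API.
struct OffloadResult {
    long long sum;     // value computed inside the target region
    bool ran_on_host;  // true if the region fell back to the host
};

// Computes 0^2 + 1^2 + ... + (n-1)^2 inside a target region.
// Only plain scalars are mapped; no stream object crosses to the device.
// The loop is deliberately serial: this is about I/O, not performance.
OffloadResult sum_squares(int n) {
    assert(n >= 0);
    long long sum = 0;
    int on_host = 1;
    #pragma omp target map(to: n) map(tofrom: sum) map(from: on_host)
    {
        on_host = omp_is_initial_device();
        for (int i = 0; i < n; ++i) {
            sum += static_cast<long long>(i) * i;
        }
    }
    return {sum, on_host != 0};
}

// Formats the result on the host, where the full C++ standard library
// (std::string, iostream) is available.
std::string describe(const OffloadResult& r) {
    return "sum=" + std::to_string(r.sum) +
           (r.ran_on_host ? " (host)" : " (device)");
}
```

From a host-side main one would then write something like std::cout << describe(sum_squares(1000)) << '\n'; and compile with the same flags as in the thread (-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda). Whether the region really ran on the GPU shows up in the "(device)" suffix.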
(In reply to Benjamin Schulz from comment #41) > Hi, i found time to test your ebuilds. > > I just wanted to give the following feedback: > > > this clang 20.0.0.0.9999 ebuild compiles but apparently, it does not install > the necessary symlinks. > > so the new version 20 clang can not be found by typing clang++, but must be > reached by /usr/lib/llvm/20/bin/clang. You have to run . /etc/profile afterwards as it adds a new dir to PATH. > > If I emerge llvm-offload > > [...] > and running it, the program returns that it is running on the device. > > So that is sucessful. Thank you for making this possible, > \o/ > Interestingly, in contrast to Sam's gcc crossdev solution, clang has access > not only to numerical functions from libc, but also to printf. > > gcc was a bit limited in this respect. This is amazing, since the gpu has > only a small shared memory access and printf needs to be able to modify > stdout. But with clang, this has limits, too. So it is not very surprising > that if we modify the above code and replace printf by an include of > iostream and std::cout<< > > "Text" then, we get the following errors: > > ./main.cpp:23:7: warning: type 'ostream' (aka 'basic_ostream<char>') is not > trivially copyable and not guaranteed to be mapped correctly > [-Wopenmp-mapping] > nvlink error : Undefined reference to > '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_ > l' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin' > nvlink error : Undefined reference to 'strlen' in > '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin' > /usr/lib/llvm/20/bin/clang-nvlink-wrapper: error: 'nvlink' failed > Interesting indeed! > With mathematical functions, there appears no problem, fortunately. But from > that, we can see that t really works on the target gpu. > > For optimized loops, it would be just nice if one would be able to use the > optimizer polly with it. Currently, gentoo's clang has no polly support. 
> > And, well, > > the ebuild has this line: > > > cmake_src_configure > > if [[ -z ${gpus} ]]; then > # clang requires libomptarget.devicertl.a, but it can be empty > > "${BUILD_DIR}"/libomptarget.devicertl.a || die > fi > > while this may work nicely on gentoo, I generally want to write code that > can run on most devices and operating systems possible. > > (Therefore, using a standard like openmp is a good choice, since it does not > depend on a specific acellerator.) Yeah, one of the interesting things about the GCC offloading at least is it can support both at once and choose at runtime which to use. I am not sure if the LLVM one does or not (not saying it doesn't, just that I don't know). > > Now imagine, I am on another platform. The git manual for clang does nowhere > write that one has to create an empty file called > > libomptarget.devicertl.a > > to satisfy the compiler. > > > I guess, thing like that should be in the build-system, like in a > cmakelists.txt > AFAIK upstream are planning on ditching this entirely (see https://github.com/llvm/llvm-project/pull/119091). > > And, well it is still a bit irritating that if i build openmp as a runtime > with clang, that I get then this here: > > CMake Error at > /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils. > cmake:26 (message): > LIBOMP: 128-bit quad precision functionality requested but not available > Call Stack (most recent call first): > > which One apparently does not get when building it as a project. > How is this solved in gentoo's libopenmp? Since it has apparently something > to do with hardware bounds, where the processor supports quad precision and > the consumer cpu just plain double... LLVM's CMake is a mess. You'd have to look at the build log and copy the arguments and go from there. At a glance, -DLIBOMP_USE_QUAD_PRECISION=OFF may work. 
But looking at https://github.com/llvm/llvm-project/blob/f0297ae552e1e5aacafc1ed43968041994dc8a6e/openmp/runtime/cmake/config-ix.cmake#L241, it depends on if building with GCC or Clang, maybe?
>Yeah, one of the interesting things about the GCC offloading at least is it can
>support both at once and choose at runtime which to use. I am not sure if the
>LLVM one does or not (not saying it doesn't, just that I don't know).
Well, OpenMP code generally runs on the host. Only the code within a #pragma omp target device(number) {region} construct is uploaded, at runtime, to the target device with the given number. What would be interesting is whether one can link against two different offload targets. Say, two different GPUs from separate manufacturers, e.g. one onboard chip with much memory shared between GPU and CPU, and one separate GPU on PCI without that. Then one could use both GPUs, or, if a GPU fails in a data center, replace it with a newer and possibly different GPU and let the application use the newly installed graphics card at runtime. Unfortunately, with only one GPU, I cannot test whether that is possible.
> You have to run . /etc/profile afterwards as it adds a new dir to PATH.
Thanks. Yes, that works. I had just read the manual https://wiki.gentoo.org/wiki/Clang which is unfortunately silent about this.
>LLVM's CMake is a mess. You'd have to look at the build log and copy the
>arguments and go from there. At a glance, -DLIBOMP_USE_QUAD_PRECISION=OFF may
>work.
Ah, thanks. That may be it... Yes, clang's build system is really a mess. But Clang's documentation should also be updated... Best regards, Benjamin
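To make the device(number) remark concrete, here is a small sketch that asks the OpenMP runtime how many devices it sees and then runs one tiny target region per device number, recording whether each region really executed off the host. It only illustrates the device clause and the default host fallback. It does not answer whether one binary can carry images for GPUs from two vendors, and the helper names runs_on_device and probe_devices are hypothetical. The stubs are an assumption so the file also builds without -fopenmp.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: stubs so the sketch also builds without -fopenmp
// (no offload devices, everything runs on the host).
static int omp_get_num_devices() { return 0; }
static int omp_is_initial_device() { return 1; }
#endif

// Runs a tiny target region on the device with number dev_num and
// returns true if it executed on that device, false if it fell back
// to the host (the default behaviour when offloading is not possible).
bool runs_on_device(int dev_num) {
    assert(dev_num >= 0);
    int on_host = 1;
    #pragma omp target device(dev_num) map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host == 0;
}

// One entry per device number the runtime reports (0 .. n-1).
std::vector<bool> probe_devices() {
    const int n = omp_get_num_devices();
    std::vector<bool> offloaded;
    for (int d = 0; d < n; ++d) {
        offloaded.push_back(runs_on_device(d));
    }
    return offloaded;
}
```

On a machine with a single supported GPU one would expect probe_devices() to return a single entry; on a machine without any offload device the vector is simply empty and nothing is offloaded.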
(In reply to Benjamin Schulz from comment #43) > > >Yeah, one of the interesting things about the GCC offloading at least is it can >support both at once and choose at runtime which to use. I am not sure if the >LLVM one does or not (not saying it doesn't, just that I don't know). > > > Well, openmp code generally runs on the host. Only code within an > > #pragma omp target {region} device(number) > > uploads instructions inside {region} to the target device with a given > number at runtime. > > What would be interesting is if one can link to two different offload > targets. > > Say, two different gpu from separate manufacturers, e.g. one onboard chip > with much shared memory between gpu and cüu, and one separate gpu on pci > without that. So that one could populate both gpu's, or, if a gpu fails in a > computer center, replace it with a newer and possibly different gpu, and > then, at runtime, let the application use the newly installed graphics card. > > Unfortunately, with only one gpu, I can not test if that is possible. > I think GCC supports that at least: "No hardware-vendor libraries (like CUDA or ROCm) are required for compilation. And when run: if the hardware library is not available and/or no suitable offload device is available, host fallback is done. Compiling with codegeneration both Nvidia (nvptx) and AMD GPUs in the same binary is supported; whether that program then run on the host, on Nvidia GPUs, or AMD GPUs – or on Nvidia and AMD GPUs is decided at run time. " from https://gcc.gnu.org/wiki/Offloading.
(In reply to Benjamin Schulz from comment #43) > Thanks. Yes that works. I just had read the manual > > https://wiki.gentoo.org/wiki/Clang > > which is unfortunately silent about this. > Please edit it ;)
>I think GCC supports that at least: That is interesting. Not only for data centers, where GPUs often fail because of heavy usage and their sheer number, but also for consumers with onboard GPUs, which could then be used for something useful while the main GPU is doing something else.
commit d9d53c78c0f6fa892498d313724d7dbfc7043401 Author: Michał Górny <mgorny@gentoo.org> Date: Tue Dec 17 22:34:38 2024 +0100 llvm-runtimes/offload: Add 19.1.6 Signed-off-by: Michał Górny <mgorny@gentoo.org>