Bug 945265 - llvm-runtimes/openmp-19.1.4 does not seem to build omptargets for nvidia
Summary: llvm-runtimes/openmp-19.1.4 does not seem to build omptargets for nvidia
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: LLVM support project
URL:
Whiteboard:
Keywords:
Duplicates: 945266
Depends on:
Blocks:
 
Reported: 2024-11-28 18:55 UTC by Benjamin Schulz
Modified: 2024-12-17 22:30 UTC
CC List: 5 users

See Also:
Package list:
Runtime testing required: ---


Description Benjamin Schulz 2024-11-28 18:55:22 UTC
Hi there, I have the nvptx target in my make.conf and nvidia-cuda-toolkit installed. I use the current clang (clang-runtime-19.1.4 with sys-libs/libomp-19.1.4::gentoo), yet whenever I try to build a program with

clang++ -fopenmp -fopenmp-targets=nvptx64

I get the following error:

clang++: error: no library 'libomptarget-nvptx-sm_75.bc' found in the default clang lib directory or in LIBRARY_PATH; use '--libomptarget-nvptx-bc-path' to specify nvptx bitcode library
Process terminated with status 1 (0 minute(s), 0 second(s))
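
The error names a device bitcode file that clang expects in its lib directory or in LIBRARY_PATH. A minimal sketch for checking whether such a file exists anywhere below a given directory follows. The helper names `find_nvptx_bc` and `have_nvptx_bc` are hypothetical, and which directory to search is an assumption that depends on the install.

```shell
# Hypothetical helper: list every copy of the NVPTX device bitcode library
# for one GPU architecture below a directory.
# $1 = directory to search, $2 = architecture such as sm_75.
find_nvptx_bc() (
    dir=$1
    arch=$2
    # The file name pattern is taken from the clang error message.
    find "$dir" -type f -name "libomptarget-nvptx-${arch}.bc" 2>/dev/null
)

# Succeeds only if at least one matching file was found.
have_nvptx_bc() {
    [ -n "$(find_nvptx_bc "$1" "$2")" ]
}
```

For example, `find_nvptx_bc /usr/lib/llvm sm_75` (the search root is an assumption). If a file turns up, its location can be handed to clang with the `--libomptarget-nvptx-bc-path` option named in the error message. If nothing is found, the file was most likely never built or installed.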

Reproducible: Always




sys-devel/clang-19.1.4:19/19.1::gentoo  USE="extra (pie) static-analyzer xml -debug -doc (-ieee-long-double) -test -verify-sig" ABI_X86="(64) -32 (-x32)" LLVM_TARGETS="(AArch64) (AMDGPU) (ARM) (AVR) (BPF) (Hexagon) (Lanai) (LoongArch) (MSP430) (Mips) (NVPTX) (PowerPC) (RISCV) (Sparc) (SystemZ) (VE) (WebAssembly) (X86) (XCore) -ARC -CSKY -DirectX -M68k -SPIRV -Xtensa" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB

Total: 1 package (1 reinstall), Size of downloads: 0 KiB

 * IMPORTANT: 28 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

localhost /home/benni/projects/MdspanOpenmptest2 # emerge --info
Portage 3.0.66.1 (python 3.12.7-final-0, default/linux/amd64/23.0/desktop/plasma, gcc-14, glibc-2.40-r5, 6.11.6-gentoo-x86_64 x86_64)
=================================================================
System uname: Linux-6.11.6-gentoo-x86_64-x86_64-AMD_Ryzen_9_3900X_12-Core_Processor-with-glibc2.40
KiB Mem:    32792468 total,  12461760 free
KiB Swap:   31249404 total,  31249148 free
Timestamp of repository gentoo: Thu, 28 Nov 2024 03:00:00 +0000
Head commit of repository gentoo: 52a163bf41cc8777712637518d9fded9511df07d
Timestamp of repository escpr2: Thu, 31 Oct 2024 18:34:26 +0000
Head commit of repository escpr2: f2c923b8c651f1d14744975f879c2c78f6e5f8f9

sh bash 5.2_p37
ld GNU ld (Gentoo 2.43 p3) 2.43.1
app-misc/pax-utils:        1.3.8::gentoo
app-shells/bash:           5.2_p37::gentoo
dev-build/autoconf:        2.13-r8::gentoo, 2.72-r1::gentoo
dev-build/automake:        1.17-r1::gentoo
dev-build/cmake:           3.31.0::gentoo
dev-build/libtool:         2.5.4::gentoo
dev-build/make:            4.4.1-r100::gentoo
dev-build/meson:           1.6.0::gentoo
dev-java/java-config:      2.3.4::gentoo
dev-lang/perl:             5.40.0::gentoo
dev-lang/python:           3.12.7_p1::gentoo, 3.13.0::gentoo
dev-lang/rust-bin:         1.81.0-r100::gentoo, 1.82.0-r100::gentoo
sys-apps/baselayout:       2.17::gentoo
sys-apps/openrc:           0.55.1::gentoo
sys-apps/sandbox:          2.40::gentoo
sys-devel/binutils:        2.43-r2::gentoo
sys-devel/binutils-config: 5.5.2::gentoo
sys-devel/clang:           18.1.8-r6::gentoo, 19.1.4::gentoo
sys-devel/gcc:             13.3.1_p20241115::gentoo, 14.2.1_p20241116::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-devel/lld:             19.1.4::gentoo
sys-devel/llvm:            18.1.8-r6::gentoo, 19.1.4::gentoo
sys-kernel/linux-headers:  6.11::gentoo (virtual/os-headers)
sys-libs/glibc:            2.40-r5::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    volatile: False
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-extra-opts: 
    sync-rsync-verify-max-age: 3

escpr2
    location: /var/db/repos/escpr2
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/escpr2.git
    masters: gentoo
    volatile: False

Binary Repositories:

gentoobinhost
    priority: 9999
    sync-uri: https://distfiles.gentoo.org/releases/amd64/binpackages/23.0/x86-64

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="@FREE @BINARY-REDISTRIBUTABLE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O3 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O3 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-march=native -O3 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance binpkg-request-signature buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles getbinpkg ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O3 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="de_DE.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs"
LEX="flex"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi activities alsa amd64 bluetooth branding bzip2 cairo cdda cdr cet clang compiler-rt contrib crypt cuda cudnn cups dbus declarative dri dts dvd dvdr dvi elogind encode eps exif fits flac fortran gdbm gdbui gif go gphoto2 gpm graphite gtk gui iconv icu ipv6 jit jpeg kde kf6compat kwallet lcms libcxx libnotify libtirpc llvm lm-sensors lto mad mng modules-sign mp3 mp4 mpeg multilib ncurses networkmanager nls nvenc objc objc++ ogg ompt opencv opengl openmp pam pango pcre pdf pipewire plasma png policykit postscript ppds pulseaudio qml qt5 qt6 raw readline rust screencast sdl seccomp semantic-desktop sound spell ssl startup-notification svg test-rust tiff tpm truetype udev udisks uefi unicode upower usb vorbis vulkan wayland wcs widgets wxwidgets x264 xattr xcb xft xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gcc_12" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx avx2 f16c fma3 pclmul popcnt rdrand sha sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" GUILE_SINGLE_TARGET="3-0" GUILE_TARGETS="3-0" INPUT_DEVICES="libinput" KERNEL="linux" L10N="de" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-2" POSTGRES_TARGETS="postgres16" 
PYTHON_SINGLE_TARGET="python3_12" PYTHON_TARGETS="python3_12" RUBY_TARGETS="ruby32" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, MAKEOPTS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS


 sys-libs/libomp-19.1.4:0/19.1::gentoo  USE="ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB

sys-devel/clang-runtime-19.1.4:19::gentoo  USE="compiler-rt libcxx openmp sanitize" ABI_X86="32 (64) (-x32)" 0 KiB
Comment 1 Mike Gilbert gentoo-dev 2024-11-28 20:42:08 UTC
*** Bug 945266 has been marked as a duplicate of this bug. ***
Comment 2 Benjamin Schulz 2024-11-28 22:18:24 UTC
I want to note that libomp-18 had an offload USE flag for this. Installing libomp-18 would, however, require the changes shown in the emerge output below.

However, I do not want to switch back to clang/openmp 18, and I also do not want the 32-bit ABI.

Unfortunately, libomp-19 no longer checks for the offload flag and does not seem to build or install the files needed for offloading, so clang cannot offload a program via OpenMP to an NVIDIA GPU. I have searched for the files, but could not find any that I could point clang to.



[ebuild     UD ] sys-libs/libomp-18.1.8:0/18.1::gentoo [19.1.4:0/19.1::gentoo] USE="offload%* ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" LLVM_TARGETS="-AMDGPU% -NVPTX%" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB
[nomerge       ] mail-client/claws-mail-4.3.0-r2::gentoo  USE="dbus gnutls imap libcanberra libnotify networkmanager nls notification oauth pdf pgp spell startup-notification svg -archive -bogofilter -calendar -clamav -debug -doc -ldap -litehtml -nntp -perl -python -rss -session -sieve -smime -spam-report -spamassassin -valgrind -webkit -xface" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11" 
[nomerge       ]  net-misc/networkmanager-1.48.10-r1::gentoo  USE="bluetooth concheck elogind gtk-doc introspection modemmanager nss (policykit) ppp tools wext wifi -audit -connection-sharing -debug -dhclient -dhcpcd -gnutls -iptables -iwd -libedit -nftables -ofono -ovs -psl -resolvconf (-selinux) -syslog -systemd -teamd -test -vala" ABI_X86="(64) -32 (-x32)" 
[nomerge       ]   dev-libs/newt-0.52.24::gentoo  USE="gpm nls -tcl" PYTHON_TARGETS="python3_12 -python3_10 -python3_11 -python3_13" 
[binary   R    ]    sys-libs/gpm-1.20.7-r6-7::gentoo  USE="(-selinux)" ABI_X86="32* (64) (-x32)" 200 KiB
[ebuild   R    ] sys-devel/llvm-18.1.8-r6:18/18.1::gentoo  USE="binutils-plugin libffi ncurses xml zstd -debug -debuginfod -doc -exegesis -libedit -test -verify-sig -z3" ABI_X86="32* (64) (-x32)" LLVM_TARGETS="(AArch64) (AMDGPU) (ARM) (AVR) (BPF) (Hexagon) (Lanai) (LoongArch) (MSP430) (Mips) (NVPTX) (PowerPC) (RISCV) (Sparc) (SystemZ) (VE) (WebAssembly) (X86) (XCore) -ARC -CSKY -DirectX -M68k -SPIRV -Xtensa" 0 KiB
[ebuild   R    ]  sys-libs/ncurses-6.5_p20241109:0/6::gentoo  USE="gpm stack-realign (tinfo) -ada (-cxx) -debug -doc -minimal -profile (-split-usr) -static-libs -test -trace -verify-sig" ABI_X86="32* (64) (-x32)" 0 KiB

Total: 4 packages (1 downgrade, 3 reinstalls, 1 binary), Size of downloads: 200 KiB

 * Error: circular dependencies:

(sys-libs/gpm-1.20.7-r6-7:0/0::gentoo, binary scheduled for merge) depends on
 (sys-libs/ncurses-6.5_p20241109:0/6::gentoo, ebuild scheduled for merge) (runtime_slot_op)
  (sys-libs/gpm-1.20.7-r6-7:0/0::gentoo, binary scheduled for merge) (buildtime)

It might be possible to break this cycle
by applying the following change:
- sys-libs/ncurses-6.5_p20241109 (Change USE: -gpm)

Note that this change can be reverted, once the package has been installed.

!!! Multiple package instances within a single package slot have been pulled
!!! into the dependency graph, resulting in a slot conflict:

sys-libs/libomp:0

  (sys-libs/libomp-18.1.8:0/18.1::gentoo, ebuild scheduled for merge) USE="offload ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" LLVM_TARGETS="-AMDGPU -NVPTX" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" pulled in by
    =sys-libs/libomp-18.1.8 (Argument)

  (sys-libs/libomp-19.1.4:0/19.1::gentoo, installed) USE="ompt -debug -gdb-plugin -hwloc -test -verify-sig" ABI_X86="32 (64) (-x32)" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" pulled in by
    >=sys-libs/libomp-19.1.4[abi_x86_32(-)?,abi_x86_64(-)?,abi_x86_x32(-)?,abi_mips_n32(-)?,abi_mips_n64(-)?,abi_mips_o32(-)?,abi_s390_32(-)?,abi_s390_64(-)?] required by (sys-devel/clang-runtime-19.1.4:19/19::gentoo, installed) USE="compiler-rt libcxx openmp sanitize" ABI_X86="32 (64) (-x32)"
    ^^                ^^^^^^                                                                                                                                                                                                                                                                                                                                                                                                                                                                      


It may be possible to solve this problem by using package.mask to
prevent one of those packages from being selected. However, it is also
possible that conflicting dependencies exist such that they are
impossible to satisfy simultaneously.  If such a conflict exists in
the dependencies of two different packages, then those packages can
not be installed simultaneously.

For more information, see MASKED PACKAGES section in the emerge man
page or refer to the Gentoo Handbook.


The following USE changes are necessary to proceed:
 (see "package.use" in the portage(5) man page for more details)
# required by sys-devel/llvm-18.1.8-r6::gentoo[libffi]
# required by sys-libs/libomp-18.1.8::gentoo[offload]
# required by @selected
# required by @world (argument)
>=dev-libs/libffi-3.4.6-r2 abi_x86_32
# required by sys-libs/libomp-18.1.8::gentoo[offload]
# required by @selected
# required by @world (argument)
>=sys-devel/llvm-18.1.8-r6:18 abi_x86_32
# required by sys-devel/llvm-18.1.8-r6::gentoo[ncurses]
# required by sys-libs/libomp-18.1.8::gentoo[offload]
# required by @selected
# required by @world (argument)
>=sys-libs/ncurses-6.5_p20241109 abi_x86_32
# required by sys-devel/llvm-18.1.8-r6::gentoo[xml]
# required by sys-libs/libomp-18.1.8::gentoo[offload]
# required by @selected
# required by @world (argument)
>=dev-libs/libxml2-2.13.5 abi_x86_32
# required by dev-libs/libxml2-2.13.5::gentoo[icu]
# required by sys-devel/llvm-18.1.8-r6::gentoo[xml]
# required by sys-libs/libomp-18.1.8::gentoo[offload]
# required by @selected
# required by @world (argument)
>=dev-libs/icu-76.1-r1 abi_x86_32
Comment 3 Benjamin Schulz 2024-11-28 22:26:08 UTC
In this related bug, somebody provided an updated ebuild that tries to handle the CUDA libraries, but I guess it is for an old version, so the paths and the values for the CUDA compute capabilities would have to be updated, if that solution is still correct at all.

It may also be that one has to update the path variable or place these files somewhere clang can find them.

https://681806.bugs.gentoo.org/attachment.cgi?id=598326
Comment 4 Ionen Wolkens gentoo-dev 2024-11-28 23:04:03 UTC
USE=offload was removed from libomp-19 back in April 2024 in [1], I don't know if the situation improved since then that would make it easier. Either way mostly just need someone to figure out a way to make it work.

Sounds like another case of would-be-easier if LLVM was a single big package rather than split into clang,libomp,etc... (except maybe some specific runtime components), albeit that's rather involved and not everyone wants this to happen, pushing for *that* solution may be difficult.

[1] https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=81b1f0d8cf5e
Author: Michał Górny <mgorny@gentoo.org>
Date:   Sat Apr 27 06:42:21 2024

    sys-libs/libomp: Remove offloading support
    
    Upstream split offload into a separate component, that does not work
    when built standalone, and building via runtimes is entirely broken.
Comment 5 Ionen Wolkens gentoo-dev 2024-11-28 23:11:10 UTC
(In reply to Ionen Wolkens from comment #4)
> (except maybe some specific runtime components)
well, I said that thinking of what we were sometime thinking w/ that, but libomp is technically one so that wouldn't work
Comment 6 Benjamin Schulz 2024-11-29 00:13:36 UTC
Thanks for the information. Perhaps one could couple the offload flag with clang in the ebuild (e.g. offload? (clang) or so)?

I am now trying to build clang from source. Unfortunately, upstream is not very user-friendly when it comes to the correct parameters for offloading. I hope that something like this:

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm

 will work.
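
A sketch only: the same configure step wrapped in a small function, so that the install prefix and the GPU architecture list become arguments. The flags are copied from the command above and are not confirmed to produce a working offload build. The function name `configure_offload_build` and the `CMAKE` override variable are hypothetical.

```shell
# Hypothetical wrapper around the configure step quoted above.
# $1 = install prefix, remaining arguments = GPU architectures (e.g. sm_75).
# CMAKE may be set to another command (e.g. echo) for a dry run.
configure_offload_build() (
    prefix=$1
    shift
    # Join the architectures with ';', the list separator CMake expects.
    archs=$(IFS=';'; printf '%s' "$*")
    "${CMAKE:-cmake}" -G "Unix Makefiles" \
        -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
        -DLIBOMPTARGET_DEVICE_ARCHITECTURES="$archs" \
        -DCMAKE_BUILD_TYPE="Release" \
        -DLIBOMP_ARCH="x86_64" \
        -DCMAKE_INSTALL_PREFIX="$prefix" \
        -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" \
        -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" \
        ../llvm-project/llvm
)
```

Setting `CMAKE=echo` before calling `configure_offload_build "$HOME/llvm-offload" sm_75` prints the command instead of running it. Quoting the prefix also keeps the call intact if the path contains spaces.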

The problem is that gcc on Gentoo also seems to have difficulties with offloading.

I suspect that for offloading in gcc, I would have to install crossdev and then install nvptx-tools.

Unfortunately, emerging that also fails on my system. In addition, the Gentoo packages site says:
https://packages.gentoo.org/packages/sys-devel/nvptx-tools

 Version 20240326 is available upstream. Please consider updating!
It seems that version 20240326 is available upstream, while the latest version in the Gentoo tree is 0_pre20230122.

This package is masked and could be removed soon!
The mask comment indicates that this package is scheduled for removal from our package repository.
Please review the mask information below for more details.

On my system, emerging that package (after unmasking it) fails.

So this is a bit inconvenient if you want to program your GPU.
Comment 7 Benjamin Schulz 2024-11-29 00:18:40 UTC
So that was the problem: I had written crossdev -stable  -t nvptx. Apparently,

crossdev  -t nvptx

seems to work. I hope that I will then have a clang and a gcc that can offload. But I still find it a bit inconvenient, on a system like Gentoo, to have to use CMake and install a compiler suite by oneself, or to resort to masked packages.

GPU offloading is not that new in the testing branch.
Comment 8 Benjamin Schulz 2024-11-29 00:19:46 UTC
no, crossdev fails:

home/benni/projects/MdspanOpenmptest2 # crossdev  -t nvptx
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 * crossdev version:      20240921
 * Host Portage ARCH:     amd64
 * Host Portage System:   x86_64-pc-linux-gnu (i686-pc-linux-gnu x86_64-pc-linux-gnu)
 * Target Portage ARCH:   *
 * Target System:         nvptx
 * Stage:                 3 (C compiler & libc)
 * USE=multilib:          no
 * Target ABIs:           default

 * binutils:              nvptx-tools-[latest]
 * gcc:                   gcc-[latest]
 * headers:               linux-headers-[latest]
 * libc:                  newlib-[latest]

 * CROSSDEV_OVERLAY:      /var/db/repos/escpr2
 * PORT_LOGDIR:           /var/log/portage
 * PORTAGE_CONFIGROOT:    /
 * Portage flags:         
  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  
 * leaving sys-kernel/linux-headers in /var/db/repos/escpr2
 * leaving sys-libs/newlib in /var/db/repos/escpr2
 * leaving sys-devel/nvptx-tools in /var/db/repos/escpr2
 * leaving sys-devel/gcc in /var/db/repos/escpr2
 * leaving dev-debug/gdb in /var/db/repos/escpr2
 * leaving metadata/layout.conf alone in /var/db/repos/escpr2
  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  ~  -  _  -  
 * Log: /var/log/portage/cross-nvptx-nvptx-tools.log
 * Emerging cross-nvptx-tools ...                                                                                                                                                                                                     [ ok ]
 * Log: /var/log/portage/cross-nvptx-gcc-stage1.log
 * Emerging cross-gcc-stage1 ...

 * error: gcc failed :(
 * 
 * If you file a bug, please attach the following logfiles:
 * /var/log/portage/cross-nvptx-info.log
 * /var/log/portage/cross-nvptx-gcc-stage1.log.xz
 * /var/tmp/portage/cross-nvptx/gcc*/temp/gcc-config.logs.tar.xz


So I guess I am stuck with clang and can only hope that it compiles with offloading support.
Comment 9 Benjamin Schulz 2024-11-29 00:23:01 UTC
The output below is from the crossdev gcc log.

I suspect that is why the package was hard masked and scheduled for removal.

So currently, with offloading removed from libomp, Gentoo has no real GPU offloading support.

make[3]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/libcc1'
make[1]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build'
Checking multilib configuration for libgcc...
mkdir -p -- nvptx/libgcc
Configuring in nvptx/libgcc
configure: creating cache ./config.cache
checking build system type... x86_64-pc-linux-gnu
checking host system type... nvptx-unknown-none
checking for --enable-version-specific-runtime-libs... no
checking for a BSD-compatible install... /usr/lib/portage/python3.12/ebuild-helpers/xattr/install -c
checking for gawk... gawk
checking for nvptx-ar... nvptx-ar
checking for nvptx-lipo... nvptx-lipo
checking for nvptx-nm... /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/nm
checking for nvptx-ranlib... nvptx-ranlib
checking for nvptx-strip... nvptx-strip
checking whether ln -s works... yes
checking for nvptx-gcc... /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/xgcc -B/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/./gcc/ -B/usr/nvptx/bin/ -B/usr/nvptx/lib/ -isystem /usr/nvptx/include -is>
checking for suffix of object files... configure: error: in `/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/nvptx/libgcc':
configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details
make[1]: *** [Makefile:12426: configure-target-libgcc] Error 1
make[1]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build'
make[1]: *** Waiting for unfinished jobs....
make[2]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools'
x86_64-pc-linux-gnu-g++ -O2 -pipe -fPIE -fno-exceptions -fno-rtti -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/../libcody -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++>
  -MMD -MP -MF resolver.d -c -o resolver.o /var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/resolver.cc
make[2]: Leaving directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools'
make[2]: Entering directory '/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/build/c++tools'
x86_64-pc-linux-gnu-g++ -O2 -pipe -fPIE -fno-exceptions -fno-rtti -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++tools/../libcody -I/var/tmp/portage/cross-nvptx/gcc-14.2.1_p20241116/work/gcc-14-20241116/c++>
  -MMD -MP -MF server.d -c -o server.o /var/tmp/
Comment 10 Benjamin Schulz 2024-11-29 00:33:41 UTC
As for the decision to remove offloading from libomp, I really do not understand it.

Those who need to turn that flag on are people who write their own applications.

They do not really care whether they additionally have to build clang, as it is most likely already on their system.

So one could keep it as a USE flag and make "offload" depend on the clang suite.

When you write GPU applications, you do not need to save that much space on your hard drive.

Also, with offloading removed, you have to install clang from source anyway.
Comment 11 Ionen Wolkens gentoo-dev 2024-11-29 01:56:35 UTC
(In reply to Benjamin Schulz from comment #10)
> As for the decision to remove offloading from libomp, I really do not
> understand it.
Did you read what I said and the commit message I linked? The decision was because it was broken with how things are setup right now, and there's a need for someone to figure out how to make it work with our ebuilds again (which may or may not be difficult depending on if things improved or not since then).

It wasn't removed because we think it shouldn't be there.
Comment 12 Benjamin Schulz 2024-11-29 06:42:09 UTC
Hi, I have now tried to compile clang from source with OpenMP offloading. This is what I got from make:

For x86_64 builtins preferring x86_64/floatundixf.S to floatundixf.c
-- Looking for __GLIBC__
-- Looking for __GLIBC__ - found
-- Performing Test HAS_THREAD_LOCAL
-- Performing Test HAS_THREAD_LOCAL - Success
-- Builtin supported architectures: i386;x86_64
-- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan"
-- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan"
-- check-shadowcallstack does nothing.
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):
  /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say)


-- Configuring incomplete, errors occurred!
make[2]: *** [runtimes/CMakeFiles/runtimes-configure.dir/build.make:76: runtimes/runtimes-stamps/runtimes-configure] Error 1
make[1]: *** [CMakeFiles/Makefile2:238731: runtimes/CMakeFiles/runtimes-configure.dir/all] Error 2
make: *** [Makefile:156: all] Error 2


Since gcc offloading support also breaks, I currently have no real option for testing code that offloads to the GPU, which is a bit sad.
Comment 13 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-11-29 07:04:01 UTC
> I suspect that is why the package was hard masked and in for removal. 

sys-devel/nvptx-tools isn't masked for removal (it just shouldn't be emerged unless via crossdev).

Can you file a new bug for the crossdev failure you had? Thanks.
Comment 14 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2024-11-29 15:54:41 UTC
Unsurprisingly, its build system is still broken.  Actually, it's even more broken than it was originally -- looks like someone's been trying to copy parts of standalone build logic from openmp and then maintain it without actually testing anything.  I'm going to try some time to look into fixing it over the weekend.
Comment 15 Benjamin Schulz 2024-11-29 18:05:22 UTC
Hi thanks.

Regarding the build system:


I made two attempts. The first used the wrong parameters; it did not include
clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload"

So that time clang got installed manually, but without offloading support. The second attempt, with the runtimes needed for offloading, failed.

After that, I unmerged clang and tried to install clang 18 with offloading support, which also broke.

These three attempts apparently did something to the system. Or was it an emerge --sync that updated something in the ebuild?

Anyway, I now get this error when I try to install clang:

-- Performing Test C_WCOMMENT_ALLOWS_LINE_WRAP - Failed
-- Performing Test C_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG
-- Performing Test C_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG - Failed
-- Performing Test CXX_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG
-- Performing Test CXX_SUPPORTS_CTAD_MAYBE_UNSPPORTED_FLAG - Success
-- Performing Test LINKER_SUPPORTS_COLOR_DIAGNOSTICS
-- Performing Test LINKER_SUPPORTS_COLOR_DIAGNOSTICS - Failed
-- Looking for os_signpost_interval_begin
-- Looking for os_signpost_interval_begin - not found
-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB
-- Performing Test HAVE_CXX_ATOMICS_WITHOUT_LIB - Success
-- Performing Test HAVE_CXX_ATOMICS64_WITHOUT_LIB
-- Performing Test HAVE_CXX_ATOMICS64_WITHOUT_LIB - Success
-- Performing Test LLVM_HAS_ATOMICS
-- Performing Test LLVM_HAS_ATOMICS - Success
-- Found Python3: /usr/bin/python3.12 (found version "3.12.7") found components: Interpreter
CMake Error at CMakeLists.txt:126 (message):
  llvm-gtest not found.  Please install llvm-gtest or disable tests with
  -DLLVM_INCLUDE_TESTS=OFF


-- Configuring incomplete, errors occurred!
 * ERROR: sys-devel/clang-19.1.4::gentoo failed (configure phase):
 *   cmake failed
 * 
 * Call stack:
 *     ebuild.sh, line  136:  Called src_configure
 *   environment, line 4076:  Called multili

I have USE="-test" set, of course, so the build ignores that flag.

Switching the flag on makes no difference; the ebuild still aborts with this error.

I filed a separate bug on this: https://bugs.gentoo.org/945316

Others have occasionally run into this too, so I guess it happens from time to time:
https://www.reddit.com/r/Gentoo/comments/1cpm32d/emerging_clang_fails_cannot_find_llvmgtest/

It turns out that I first had to remove the manually installed version by running

xargs rm < install_manifest.txt

in the build directory.

Then I could reinstall clang.
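
A slightly more careful variant of that cleanup, as a sketch: plain `xargs` splits on whitespace, so a path containing a space would be broken into pieces. The function name `remove_manifest_files` and the `DRY_RUN` switch are hypothetical.

```shell
# Hypothetical helper: remove the files listed in a CMake install manifest.
# $1 = path to install_manifest.txt; set DRY_RUN=1 to only print the paths.
remove_manifest_files() (
    manifest=$1
    # Read line by line so paths containing spaces survive; the extra test
    # handles a last line without a trailing newline.
    while IFS= read -r path || [ -n "$path" ]; do
        [ -n "$path" ] || continue
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf '%s\n' "$path"
        else
            rm -f -- "$path"
        fi
    done < "$manifest"
)
```

Running it once with `DRY_RUN=1` shows what would be deleted before anything is actually removed. Directories created by the install are left in place.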

While it is good to be reminded to remove a program that I compiled from source before overwriting it with another version from Portage, it is quite strange that a manual cmake install of clang would basically "turn a Portage USE flag on" for that program and make the build ignore the parameters it was given.


I see the makefiles are pretty large, so I guess fixing them will take a lot of time.

Thank you for trying to help, especially since gcc offloading is also broken on my system currently.
Comment 16 Benjamin Schulz 2024-12-01 18:33:23 UTC
I am currently rebuilding my gcc with the offload patches from Sam.


If somebody has working patches for clang/llvm that I can test, let me know.

I see there is some ongoing work:

https://github.com/llvm/llvm-project/pull/118173

However, I suspect I would need short instructions on what exactly to do.

I am not too well versed in Gentoo's system internals. Once an offload-capable compiler is there, I can test it myself with a short C++ file.

For the curious: here are enough examples for offloading.
https://enccs.github.io/openmp-gpu/target/
Comment 17 Benjamin Schulz 2024-12-01 21:41:34 UTC Comment hidden (obsolete)
Comment 18 Benjamin Schulz 2024-12-02 23:27:02 UTC
OpenMP offloading with gcc works for me now.

Thanks to Sam for doing this work.

If I want to test offloading with clang now, do I simply need to copy the CMake files from

https://github.com/llvm/llvm-project/pull/118173

into the clang source directory and then compile it from source with these parameters?

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm
Comment 19 Benjamin Schulz 2024-12-03 15:53:15 UTC
Apparently not. The changed CMake files produce the following error:

[100%] Built target omptarget.rtl.cuda
[100%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o
/home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found
   15 | #include <ffi.h>
      |          ^~~~~~~
1 error generated.
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:345458: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [runtimes/CMakeFiles/runtimes-build.dir/build.make:76: runtimes/runtimes-stamps/runtimes-build] Error 2
make[1]: *** [CMakeFiles/Makefile2:238827: runtimes/CMakeFiles/runtimes-build.dir/all] Error 2


with these parameters:

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_70;sm_75;sm_80" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;cross-project-tests;libc;lld;lldb;polly;pstl" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm
Comment 20 Benjamin Schulz 2024-12-03 16:08:58 UTC
ffi.h is in offload/plugins-nextgen/host/dynamic_ffi/ffi.h

The CMake files, however, just have lines like

 target_include_directories(${target_name} PUBLIC ${common_dir}/include)
or
  target_include_directories(omptarget.rtl.host PRIVATE dynamic_ffi)
Comment 21 Benjamin Schulz 2024-12-03 18:20:33 UTC
For what it's worth, if I also copy ffi.h from offload/plugins-nextgen/host/dynamic_ffi into the folder offload/plugins-nextgen/host and change its include from <> to "", then it compiles, but then of course it links wrongly...


Also, that is certainly not a solution to the CMake problem.

I do not know why the CMakeLists.txt in the host folder does not find the header when it is told PRIVATE dynamic_ffi in the target_include_directories directive...
Comment 22 Benjamin Schulz 2024-12-03 19:35:18 UTC
If I add the line

target_include_directories(omptarget.rtl.host PRIVATE dynamic_ffi)

outside the if clause in plugins-nextgen/host/CMakeLists.txt, then it compiles without a source change, but it still fails with this linker error:

[100%] Linking CXX shared library /home/benni/projects/clang/build/lib/libomptarget.so
/usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: /home/benni/projects/clang/build/lib/libomptarget.rtl.host.a(rtl.cpp.o): in function `llvm::omp::target::plugin::GenELF64PluginTy::initImpl()':
rtl.cpp:(.text._ZN4llvm3omp6target6plugin16GenELF64PluginTy8initImplEv[_ZN4llvm3omp6target6plugin16GenELF64PluginTy8initImplEv]+0x15): undefined reference to `ffi_init()'


So I guess another path is wrong, and if that is corrected it would build...
Comment 23 Benjamin Schulz 2024-12-04 20:31:09 UTC
Just for your information:
When I make a clean git clone with

git clone https://github.com/llvm/llvm-project.git

and create a build directory inside its llvm-project folder, and then run

cmake -G "Unix Makefiles" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" -DCMAKE_BUILD_TYPE="Release" -DLIBOMP_ARCH="x86_64" -DCMAKE_INSTALL_PREFIX=${INSTALL_PREFIX} -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libc;lld;lldb;polly;pstl;openmp;flang;libclc;compiler-rt;bolt" -DLLVM_ENABLE_RUNTIMES="compiler-rt;libc;libcxx;libcxxabi;libunwind;openmp;offload" ../llvm-project/llvm



I get the following error:

-- Setting LIBC_NAMESPACE namespace to '__llvm_libc_20_0_0_git'
c++: error: unrecognized command-line option '--print-resource-dir'; did you mean '--print-search-dirs'?
c++: fatal error: no input files
compilation terminated.
-- Set COMPILER_RESOURCE_DIR to /usr/lib/gcc/x86_64-pc-linux-gnu/14/ using --print-search-dirs
CMake Error at /home/benni/projects/clang/llvm-project/libc/cmake/modules/LLVMLibCArchitectures.cmake:92 (message):
  libc build: could not read compiler target info from:

Has the command to configure the build changed?

I sometimes wonder whether this is an assessment-center exercise from Apple or AMD... "Here, let's give them a few CMakeLists.txt and *.cpp files and see who can fix them in the shortest amount of time..."
Comment 24 Benjamin Schulz 2024-12-06 23:27:48 UTC
Hm, these settings:

cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;compiler-rt;libclc;lld;openmp" \
-DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;openmp;offload;compiler-rt" \
-DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
-DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" ../llvm


would give me output like this:

 behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
  /home/benni/projects/clang/llvm-project/clang/CMakeLists.txt:7 (include)


-- Clang version: 20.0.0git
-- Found Python3: /usr/bin/python3.13 (found version "3.13.1") found components: Interpreter
-- libclc target 'amdgcn--' is enabled
--   device: tahiti ( pitcairn;verde;oland;hainan;bonaire;kabini;kaveri;hawaii;mullins;tonga;tongapro;iceland;carrizo;fiji;stoney;polaris10;polaris11;gfx602;gfx705;gfx805;gfx900;gfx902;gfx904;gfx906;gfx908;gfx909;gfx90a;gfx90c;gfx940;gfx941;gfx942;gfx1010;gfx1011;gfx1012;gfx1013;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151;gfx1152;gfx1153;gfx1200;gfx1201 )
-- libclc target 'amdgcn--amdhsa' is enabled
--   device: none (  )
-- libclc target 'amdgcn-mesa-mesa3d' is enabled
--   device: tahiti ( pitcairn;verde;oland;hainan;bonaire;kabini;kaveri;hawaii;mullins;tonga;tongapro;iceland;carrizo;fiji;stoney;polaris10;polaris11;gfx602;gfx705;gfx805;gfx900;gfx902;gfx904;gfx906;gfx908;gfx909;gfx90a;gfx90c;gfx940;gfx941;gfx942;gfx1010;gfx1011;gfx1012;gfx1013;gfx1030;gfx1031;gfx1032;gfx1033;gfx1034;gfx1035;gfx1036;gfx1100;gfx1101;gfx1102;gfx1103;gfx1150;gfx1151;gfx1152;gfx1153;gfx1200;gfx1201 )
-- libclc target 'clspv--' is enabled
--   device: none (  )
-- libclc target 'clspv64--' is enabled
--   device: none (  )
-- libclc target 'nvptx--' is enabled
--   device: none (  )
-- libclc target 'nvptx--nvidiacl' is enabled
--   device: none (  )
-- libclc target 'nvptx64--' is enabled
--   device: none (  )
-- libclc target 'nvptx64--nvidiacl' is enabled
--   device: none (  )
-- libclc target 'r600--' is enabled
--   device: cedar ( palm;sumo;sumo2;redwood;juniper )
--   device: cypress ( hemlock )
--   device: barts ( turks;caicos )
--   device: cayman ( aruba )
CMake Error at /usr/share/cmake/Modules/ExternalProject.cmake:2959 (add_custom_target):
  add_custom_target cannot create target "builtins" because another target
  with the same name already exists.  The existing target is a custom target
  created in source directory
  "/home/benni/projects/clang/llvm-project/compiler-rt/lib/builtins".  See
  documentation for policy CMP0002 for more details.
Call Stack (most recent call first):
  cmake/modules/LLVMExternalProjectUtils.cmake:363 (ExternalProject_Add)
  runtimes/CMakeLists.txt:90 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:166 (builtin_default_target)


CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "compiler-rt" because another target
  with the same name already exists.  The existing target is a custom target
  created in source directory
  "/home/benni/projects/clang/llvm-project/compiler-rt".  See documentation
  for policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)


CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "install-compiler-rt" because
  another target with the same name already exists.  The existing target is a
  custom target created in source directory
  "/home/benni/projects/clang/llvm-project/compiler-rt".  See documentation
  for policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)


CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "install-compiler-rt-stripped"
  because another target with the same name already exists.  The existing
  target is a custom target created in source directory
  "/home/benni/projects/clang/llvm-project/compiler-rt".  See documentation
  for policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)


CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "check-openmp" because another
  target with the same name already exists.  The existing target is a custom
  target created in source directory
  "/home/benni/projects/clang/llvm-project/openmp".  See documentation for
  policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)


CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "check-compiler-rt" because another
  target with the same name already exists.  The existing target is a custom
  target created in source directory
  "/home/benni/projects/clang/llvm-project/compiler-rt/test".  See
  documentation for policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)


-- Registering ExampleIRTransforms as a pass plugin (static build: OFF)
-- Registering Bye as a pass plugin (static build: OFF)
-- LLVM FileCheck Found: /usr/lib/llvm/19/bin/FileCheck
-- Google Benchmark version: v0.0.0, normalized to 0.0.0
-- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile
-- Performing Test HAVE_POSIX_REGEX -- success
-- Performing Test HAVE_STEADY_CLOCK -- success
-- Performing Test HAVE_PTHREAD_AFFINITY -- success
-- Configuring incomplete, errors occurred!
Comment 25 Benjamin Schulz 2024-12-06 23:33:26 UTC
this here:

cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libclc;lld" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;openmp;offload;compiler-rt" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" ../llvm

would finish configure without errors. But does this build an OpenMP that can offload, now that I have removed openmp from "Projects"?

When I add openmp into the projects, I get this:

--   device: barts ( turks;caicos )
--   device: cayman ( aruba )
CMake Error at cmake/modules/LLVMExternalProjectUtils.cmake:453 (add_custom_target):
  add_custom_target cannot create target "check-openmp" because another
  target with the same name already exists.  The existing target is a custom
  target created in source directory
  "/home/benni/projects/clang/llvm-project/openmp".  See documentation for
  policy CMP0002 for more details.
Call Stack (most recent call first):
  runtimes/CMakeLists.txt:261 (llvm_ExternalProject_Add)
  runtimes/CMakeLists.txt:554 (runtime_default_target)
Comment 26 Benjamin Schulz 2024-12-07 00:52:13 UTC
configure works, but the build fails with the errors from before:

-- Performing Test COMPILER_RT_TARGET_HAS_UNAME - Success
-- Performing Test HAS_THREAD_LOCAL
-- Performing Test HAS_THREAD_LOCAL - Success
-- Generated Sanitizer SUPPORTED_TOOLS list on "Linux" is "asan;lsan;hwasan;msan;tsan;ubsan"
-- sanitizer_common tests on "Linux" will run against "asan;lsan;hwasan;msan;tsan;ubsan"
-- check-shadowcallstack does nothing.
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed
[ 98%] Built target builtins.opt.tahiti-amdgcn-mesa-mesa3d
[ 98%] Generating tahiti-amdgcn-mesa-mesa3d.bc
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):
  /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say)

Am I doing or configuring something wrong?
Comment 27 Benjamin Schulz 2024-12-07 00:57:40 UTC
this one here seems to build:

cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;libclc;lld;compiler-rt;openmp" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;offload" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLIBOMPTARGET_DEVICE_ARCHITECTURES="sm_75" -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=75 -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm 

But can this offload?
Comment 28 Benjamin Schulz 2024-12-07 02:20:19 UTC
No, it does not build. It ends with this error:

Built target runtimes-clobber
[100%] Built target runtimes-configure
[100%] Performing build step for 'runtimes'
[  0%] Built target merge_runtime_commands
[  0%] Built target unwind_shared_objects
[  0%] Built target unwind_shared
[  7%] Built target unwind_static_objects
[  7%] Built target unwind_static
[  7%] Built target generate-cxxabi-headers
[  7%] Building CXX object libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/cxa_aux_runtime.cpp.o
/home/benni/projects/clang/llvm-project/libcxxabi/src/cxa_aux_runtime.cpp:13:10: fatal error: 'exception' file not found
   13 | #include <exception>
      |          ^~~~~~~~~~~
1 error generated.
make[5]: *** [libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/build.make:79: libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/cxa_aux_runtime.cpp.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:323095: libcxxabi/src/CMakeFiles/cxxabi_shared_objects.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [runtim


And if I remove the target libcxx, then it ends with the error that I had before:

[ 52%] Built target omptarget.rtl.amdgpu
[ 52%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/src/rtl.cpp.o
[ 52%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/dynamic_cuda/cuda.cpp.o
[ 52%] Linking CXX static library /home/benni/projects/clang/llvm-project/build/lib/libomptarget.rtl.cuda.a
[ 52%] Built target omptarget.rtl.cuda
[ 52%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o
/home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found
   15 | #include <ffi.h>
      |          ^~~~~~~
1 error generated.
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:322917: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Error 2
make[3]: *** [Makefile:136: all] Error 2
make[2]: *** [runtimes/CMakeFiles/runtimes-build.dir/build.make:76: runtimes/runtimes-stamps/runtimes-build] Error 2
make[1]: *** [CMakeFiles/Makefile2:195563: runtime
Comment 29 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-12-07 02:22:35 UTC
mgorny's already submitted other fixes for stuff like the ffi issue.
Comment 30 Benjamin Schulz 2024-12-07 04:14:27 UTC
Thanks Sam,

I tried a clean GitHub pull, since I thought it had been submitted.

This has now built:


cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX"  -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm 

Without runtimes, it appears that

-DLLVM_TARGETS_TO_BUILD="X86;NVPTX"

suffices to build the NVPTX target.

The folder

CustomClang/share/clc/

now has .bc files of the kind clang kept asking for.


I will test it later today.

Apparently the documentation also says that a component that can be either a runtime or a project should be listed in either Projects or Runtimes in the cmake command, not in both.
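
The either-Projects-or-Runtimes rule can be checked before running cmake. A minimal sketch, assuming a POSIX shell; list_overlap is a made-up helper name, and the two arguments are the semicolon-separated values that would be passed to -DLLVM_ENABLE_PROJECTS and -DLLVM_ENABLE_RUNTIMES:

```shell
#!/bin/sh
# Hypothetical helper: print every name that appears in both
# semicolon-separated lists, one per line, in the order of the first list.
list_overlap() {
    projects=$1
    runtimes=$2
    old_ifs=$IFS
    IFS=';'                      # split the lists on semicolons
    for p in $projects; do
        for r in $runtimes; do
            if [ "$p" = "$r" ]; then
                printf '%s\n' "$p"
            fi
        done
    done
    IFS=$old_ifs                 # restore the caller's field separator
}
```

Fed with the two lists from comment 24, this prints compiler-rt and openmp, which seems consistent with the duplicate "compiler-rt" and "check-openmp" target errors shown there. With the lists from comment 25 it prints nothing.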

But some packages I can now put only in Runtimes, for example libcxx... And it does not work if I build the offload runtime.

So I still do not know if this can really offload, but I now have files like nvptx64.bc. And that looks similar to a file it wanted...
Comment 31 Benjamin Schulz 2024-12-07 04:45:44 UTC
It appears that I have that fix from mgorny.

And unfortunately it seems I need that offload runtime.

For example, this here builds:

# cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" 
-DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx"  -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm 

But it does not give me the correct file.

 Clang wants:
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_75.bc.out



And apparently I get this file not just by passing
-DLLVM_TARGETS_TO_BUILD="X86;NVPTX"

but I also need to set
-DLLVM_ENABLE_RUNTIMES="offload"


Yet the entire command, after building that file, breaks somewhere later with an error.


# cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx;offload"  -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm 


leads to...


[ 96%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/src/rtl.cpp.o
[ 96%] Building CXX object offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o
/home/benni/projects/clang/llvm-project/offload/plugins-nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found
   15 | #include <ffi.h>
      |          ^~~~~~~
[ 96%] Building CXX object offload/plugins-nextgen/amdgpu/CMakeFiles/omptarget.rtl.amdgpu.dir/dynamic_hsa/hsa.cpp.o
[ 96%] Building CXX object offload/plugins-nextgen/cuda/CMakeFiles/omptarget.rtl.cuda.dir/dynamic_cuda/cuda.cpp.o
[ 96%] Packaging LLVM offloading binary libomptarget-amdgpu-gfx1012.bc.out

[100%] Embedding LLVM offloading binary in devicertl-amdgpu-gfx942.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_37.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_35.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_50.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_52.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_53.bc.out
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_60.bc.out
[....]
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_70.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_62.bc.out
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_72.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_62.o
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_72.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_80.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_80.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_86.bc.out
[100%] Embedding LLVM offloadi
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_86.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_90.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_90.o
1 error generated.
make[5]: *** [offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/build.make:79: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/src/rtl.cpp.o] Error 1
make[4]: *** [CMakeFiles/Makefile2:324787: offload/plugins-nextgen/host/CMakeFiles/omptarget.rtl.host.dir/all] Error 2
make[4]: *** Waiting for unfinished jobs....
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_89.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_89.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_87.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_87.o
[100%] Packaging LLVM offloading binary libomptarget-nvptx-sm_75.bc.out
[100%] Embedding LLVM offloading binary in devicertl-nvptx-sm_75.o
[100%] Linking CXX static library /home/benni/projects/clang/llvm-project/build/lib/libomptarget.devicertl.a
[100%] Built target omptarget.devicertl
[100%] Linking CXX 



At least it does not break on the file for my GPU... it generates an .out file for my GPU and breaks somewhere else...
Comment 32 Benjamin Schulz 2024-12-07 04:54:09 UTC
So that was it: I had missed the last fix from mgorny. Now it builds with

 cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx"  -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm ;

make

I will test it later during the day...
Comment 33 Benjamin Schulz 2024-12-07 05:07:50 UTC
Oh no, I forgot again to set the offload runtime...

Without it, the build succeeds.

I have now re-checked that I do indeed have all the recent changes from mgorny...

As soon as I add the offload runtime to CMake, the build fails with


nextgen/host/src/rtl.cpp:15:10: fatal error: 'ffi.h' file not found
   15 | #include <ffi.h>
      |          ^~~~~~~
[ 96%] Built target libc.src.stdio.scanf_core.scanf_main
Comment 34 Benjamin Schulz 2024-12-07 07:21:26 UTC
Hm, if I try to forcibly compile a cpp file with the clang generated by

 cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx"  -DCMAKE_INSTALL_PREFIX=/CustomClang ../llvm ;

and use an offload target file CustomClang/share/clc/nvptx64--.bc

then this apparently at least compiles, but I get a link-time problem:

linking module '/CustomClang/share/clc/nvptx64--.bc': 

Linking two modules of different target triples: '/CustomClang/share/clc/nvptx64--.bc' is 'nvptx64-unknown-unknown' whereas 
'/home/benni/projects/openmptestnew/openmpoffloatest/main.cpp' is 'nvptx64-nvidia-cuda'

The other nvptx files do not work either.

So I guess I need the offload runtime in the build command to get that file, and that runtime is currently broken because of libffi.

But there are many amd-something.bc files in that folder. It almost looks as if the nvptx code was packed into these nvptx64--.bc files but could not be linked, or something...
Comment 35 Benjamin Schulz 2024-12-07 09:34:15 UTC
If I add the parameter:

-DFFI_INCLUDE_DIR=/usr/lib64/libffi/include

then it builds.

The full command is

cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;clang-tools-extra;compiler-rt;libclc;lld;lldb;openmp;polly;pstl"  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" -DLLVM_ENABLE_RUNTIMES="libc;libunwind;libcxxabi;libcxx;offload" -DFFI_INCLUDE_DIR=/usr/lib64/libffi/include ../llvm 

Testing it later...
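
Where ffi.h lives differs between systems, so the value for -DFFI_INCLUDE_DIR can be looked up instead of hard-coded. A minimal sketch, assuming a POSIX shell; find_header_dir is a made-up helper name, and apart from /usr/lib64/libffi/include (the path from this comment) the candidate directories are only guesses:

```shell
#!/bin/sh
# Hypothetical helper: print the first candidate directory that contains
# the given header, or return 1 if none of them does.
# Usage: find_header_dir <header> <dir>...
find_header_dir() {
    header=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$header" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, ffi_dir=$(find_header_dir ffi.h /usr/include /usr/lib64/libffi/include) and then -DFFI_INCLUDE_DIR="$ffi_dir" on the cmake command line.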
Comment 36 Benjamin Schulz 2024-12-07 10:09:05 UTC
A test example compiles with offloading, but if I try to run it, it says:

./a.out: error while loading shared libraries: libomptarget.so.20.0git: cannot open shared object file: No such file or directory

In the above command, openmp was built as a project, not as a runtime. I guess that is why libomptarget.so.20.0git is not in the install manifest.
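
The guess about the install manifest can be checked directly. A minimal sketch, assuming a POSIX shell and the install_manifest.txt that CMake writes into the build directory after an install; manifest_lists is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the manifest entries whose file name starts
# with the given library name; returns non-zero if there is no such entry.
# Usage: manifest_lists <manifest> <library-name>
manifest_lists() {
    manifest=$1
    name=$2
    # -F treats the name as a fixed string, so the dots are literal.
    grep -F -- "/$name" "$manifest"
}
```

For example, manifest_lists install_manifest.txt libomptarget.so || echo "libomptarget.so was not installed". If the library does turn up but under a prefix such as /CustomClang, the dynamic loader also needs to know that directory, for example through LD_LIBRARY_PATH.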


If I remove openmp from Projects and build it as a runtime instead, then I get this:


-- Found system-installed LLVM 20.0.0git with headers in /home/benni/projects/clang/llvm-project/llvm/include;/home/benni/projects/clang/llvm-project/build/include
-- Clang-tidy tests are enabled.
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER
-- Performing Test OPENMP_HAVE_ONEAPI_COMPILER - Failed
CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):
  /home/benni/projects/clang/llvm-project/openmp/runtime/CMakeLists.txt:286 (libomp_error_say)
Comment 37 Benjamin Schulz 2024-12-07 10:47:25 UTC
The command with which I compiled my example was this, by the way:

clang++  -O3 -fopenmp  -fopenmp-targets=nvptx64-nvidia-cuda ./main3.cpp ./main.cpp -lm -lstdc++ 

Here is at least some documentation:

https://openmp.llvm.org/SupportAndFAQ.html

It indeed says that offload AND openmp should both be built as runtimes, which I cannot do because of the

  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):

error that then appears in the cmake configure stage...
Comment 38 Benjamin Schulz 2024-12-07 11:49:40 UTC
This may be related to the quad precision support problem for the OpenMP target: https://reviews.llvm.org/D64289
Comment 39 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2024-12-07 16:48:30 UTC
Today's LLVM 20.x reintroduces sys-libs/llvm-offload for this.  Will do 19.x later.
Comment 41 Benjamin Schulz 2024-12-08 02:31:36 UTC
Hi, I found time to test your ebuilds.

I just wanted to give the following feedback:


The clang 20.0.0.0.9999 ebuild compiles, but apparently it does not install the necessary symlinks.

So the new version 20 clang cannot be found by typing clang++, but must be reached via /usr/lib/llvm/20/bin/clang.

If I emerge llvm-offload,

the following code:


#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#endif

int main()
{
  int num_devices = omp_get_num_devices();
  printf("Number of available devices %d\n", num_devices);

  #pragma omp target
  {
    if (omp_is_initial_device()) {
      printf("Running on host\n");
    } else {
      printf("This code is running on the target device.\n");
      int nteams= omp_get_num_teams();
      int nthreads= omp_get_num_threads();
      printf("Running on device with %d teams in total and %d threads in each team\n",nteams,nthreads);
    }
  }

}

compiles on my system with

/usr/lib/llvm/20/bin/clang  -O3 -fopenmp  -fopenmp-targets=nvptx64-nvidia-cuda ./main3.cpp ./main.cpp -lm -lstdc++  


and when I run it, the program reports that it is running on the device.
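
Since the test program prints "Running on host" when the target region falls back to the CPU, its output can be classified automatically. A minimal sketch, assuming a POSIX shell and the message texts of the program above; offload_status is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: read the test program's output on stdin and print
# "host", "device" or "unknown" depending on which message it contains.
offload_status() {
    out=$(cat)
    case $out in
        *"Running on host"*)   echo host ;;
        *"Running on device"*) echo device ;;
        *)                     echo unknown ;;
    esac
}
```

Usage would be ./a.out | offload_status. Independently of this sketch, the OpenMP environment variable OMP_TARGET_OFFLOAD=MANDATORY asks the runtime to fail instead of silently falling back to the host, which can make such a check stricter.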

So that is successful. Thank you for making this possible.

Interestingly, in contrast to Sam's gcc crossdev solution, clang has access not only to numerical functions from libc, but also to printf.

gcc was a bit limited in this respect. This is amazing, since the GPU has only limited shared memory access and printf needs to be able to modify stdout. But with clang, this has limits too. So it is not very surprising that if we modify the above code, include iostream and replace printf with std::cout << "Text", we get the following errors:

./main.cpp:23:7: warning: type 'ostream' (aka 'basic_ostream<char>') is not trivially copyable and not guaranteed to be mapped correctly [-Wopenmp-mapping]
nvlink error   : Undefined reference to '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
nvlink error   : Undefined reference to 'strlen' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
/usr/lib/llvm/20/bin/clang-nvlink-wrapper: error: 'nvlink' failed

With mathematical functions, fortunately, there appears to be no problem. But from that, we can see that it really works on the target GPU.

For optimized loops, it would be nice to be able to use the Polly optimizer with it. Currently, Gentoo's clang has no Polly support.

And, well,

the ebuild has these lines:


	cmake_src_configure

	if [[ -z ${gpus} ]]; then
		# clang requires libomptarget.devicertl.a, but it can be empty
		> "${BUILD_DIR}"/libomptarget.devicertl.a || die
	fi

While this may work nicely on Gentoo, I generally want to write code that can run on as many devices and operating systems as possible.

(Therefore, using a standard like OpenMP is a good choice, since it does not depend on a specific accelerator.)

Now imagine I am on another platform. The clang documentation in git nowhere says that one has to create an empty file called

libomptarget.devicertl.a

to satisfy the compiler.


I guess things like that should be in the build system, for example in a CMakeLists.txt.
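
Outside Portage, the step in the ebuild excerpt above boils down to creating a zero-byte file in the build directory. A minimal sketch, assuming a POSIX shell; make_empty_stub is a made-up helper name, BUILD_DIR has to point at your own build directory, and whether other platforms or LLVM versions need this stub at all is untested here:

```shell
#!/bin/sh
# Hypothetical helper: create (or truncate) an empty placeholder file,
# mirroring the redirect used in the ebuild excerpt but without "die".
# Usage: make_empty_stub <path>
make_empty_stub() {
    stub=$1
    # ": > file" creates the file if needed and truncates it to zero bytes.
    : > "$stub" || return 1
}
```

For example, make_empty_stub "$BUILD_DIR/libomptarget.devicertl.a" after the configure step.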


And, well, it is still a bit irritating that if I build openmp as a runtime with clang, I get this:

CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
  LIBOMP: 128-bit quad precision functionality requested but not available
Call Stack (most recent call first):

which one apparently does not get when building it as a project.
How is this solved in Gentoo's libopenmp? My guess is that it has something to do with hardware limits, where some processors support quad precision and a consumer CPU just plain double...


But otherwise, well, I now get code that works on the target with clang.

So, success.

Thanks for the good work.

I can now continue on my mathematical code.

Thanks.

With best regards,

Benjamin
Comment 42 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-12-08 02:37:53 UTC
(In reply to Benjamin Schulz from comment #41)
> Hi, i found time to test your ebuilds.
> 
> I just wanted to give the following feedback:
> 
> 
> this clang 20.0.0.0.9999 ebuild compiles but apparently, it does not install
> the necessary symlinks.
> 
> so the new version 20 clang can not be found by typing clang++, but must be
> reached by  /usr/lib/llvm/20/bin/clang.

You have to run . /etc/profile afterwards as it adds a new dir to PATH.
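
Whether the new directory is already on PATH can be checked before re-sourcing the profile. A minimal sketch, assuming a POSIX shell; path_has_dir is a made-up helper name, and /usr/lib/llvm/20/bin is the directory mentioned in the quoted comment:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the directory is an entry of the given
# colon-separated list (defaults to the current PATH).
# Usage: path_has_dir <dir> [list]
path_has_dir() {
    dir=$1
    list=${2-$PATH}
    # Surrounding both sides with colons makes first and last entries match too.
    case ":$list:" in
        *":$dir:"*) return 0 ;;
        *)          return 1 ;;
    esac
}
```

For example, path_has_dir /usr/lib/llvm/20/bin || . /etc/profile.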

> 
> If I emerge llvm-offload
> 
> [...]
> and running it, the program returns that it is running on the device.
> 
> So that is successful. Thank you for making this possible,
> 

\o/

> Interestingly, in contrast to Sam's gcc crossdev solution, clang has access
> not only to numerical functions from libc, but also to printf.
> 
> gcc was a bit limited in this respect. This is amazing, since the gpu has
> only a small shared memory access and printf needs to be able to modify
> stdout. But with clang, this has limits, too. So it is not very surprising
> that if we modify the above code and replace printf by an include of
> iostream and std::cout<< 
> 
> "Text" then, we get the following errors:
> 
> ./main.cpp:23:7: warning: type 'ostream' (aka 'basic_ostream<char>') is not trivially copyable and not guaranteed to be mapped correctly [-Wopenmp-mapping]
> nvlink error   : Undefined reference to '_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
> nvlink error   : Undefined reference to 'strlen' in '/tmp/main-ac8a7a-nvptx64-nvidia-cuda-sm_75-25a839-a15769.cubin'
> /usr/lib/llvm/20/bin/clang-nvlink-wrapper: error: 'nvlink' failed
> 

Interesting indeed!
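
For reference, a minimal sketch of the printf case described in the quoted text. The function name sum_on_target is hypothetical and not taken from the original test program. It assumes a build along the lines of the command in the initial report (clang++ -fopenmp -fopenmp-targets=nvptx64); without -fopenmp the pragma is ignored and everything simply runs on the host.

```cpp
#include <cstdio>

// Hypothetical helper: sums 0..n-1 inside an OpenMP target region and
// prints the result from within that region.
// With offloading enabled, the region (including the printf call) runs on
// the device; without -fopenmp the pragma is ignored and it runs on the host.
int sum_on_target(int n) {
    int sum = 0;
    #pragma omp target map(tofrom: sum)
    {
        for (int i = 0; i < n; ++i) {
            sum += i;
        }
        std::printf("target region computed sum = %d\n", sum);
    }
    return sum;
}
```

Replacing the std::printf call with std::cout << "Text" is the change that, according to the report above, produced the nvlink "Undefined reference" errors.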

> With mathematical functions, there appears to be no problem, fortunately. But from
> that, we can see that it really works on the target gpu.
> 
> For optimized loops, it would be nice if one were able to use the
> Polly optimizer with it. Currently, gentoo's clang has no polly support.
> 



> And, well, 
> 
> the ebuild has this line:
> 
> 
> 	cmake_src_configure
> 
> 	if [[ -z ${gpus} ]]; then
> 		# clang requires libomptarget.devicertl.a, but it can be empty
> 		> "${BUILD_DIR}"/libomptarget.devicertl.a || die
> 	fi
> 
> while this may work nicely on gentoo, I generally want to write code that
> can run on most devices and operating systems possible.
> 
> (Therefore, using a standard like openmp is a good choice, since it does not
> depend on a specific accelerator.)

Yeah, one of the interesting things about the GCC offloading at least is it can support both at once and choose at runtime which to use. I am not sure if the LLVM one does or not (not saying it doesn't, just that I don't know).

> 
> Now imagine I am on another platform. The git manual for clang nowhere
> states that one has to create an empty file called
> 
> libomptarget.devicertl.a
> 
> to satisfy the compiler.
> 
> 
> I guess things like that should be in the build system, e.g. in a
> CMakeLists.txt
> 

AFAIK upstream are planning on ditching this entirely (see https://github.com/llvm/llvm-project/pull/119091).

> 
> And, well, it is still a bit irritating that if I build openmp as a runtime
> with clang, I then get this:
> 
> CMake Error at /home/benni/projects/clang/llvm-project/openmp/runtime/cmake/LibompUtils.cmake:26 (message):
>   LIBOMP: 128-bit quad precision functionality requested but not available
> Call Stack (most recent call first):
> 
> which one apparently does not get when building it as a project.
> How is this solved in Gentoo's libopenmp? It apparently has something to do
> with hardware limits, where some processors support quad precision and
> a consumer cpu just plain double...

LLVM's CMake is a mess. You'd have to look at the build log and copy the arguments and go from there. At a glance, -DLIBOMP_USE_QUAD_PRECISION=OFF may work. But looking at https://github.com/llvm/llvm-project/blob/f0297ae552e1e5aacafc1ed43968041994dc8a6e/openmp/runtime/cmake/config-ix.cmake#L241, it depends on if building with GCC or Clang, maybe?
Comment 43 Benjamin Schulz 2024-12-08 04:17:03 UTC
> Yeah, one of the interesting things about the GCC offloading at least is it can support both at once and choose at runtime which to use. I am not sure if the LLVM one does or not (not saying it doesn't, just that I don't know).


Well, openmp code generally runs on the host. Only code within a

#pragma omp target device(number)
{region}

construct is offloaded: the instructions inside {region} are uploaded to the target device with the given number at runtime.
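
A minimal sketch of such a target region with a device clause. The function name scale_on_device and its parameters are hypothetical; the map clause is an assumption added so that the array is copied to the device and back. If no such device is available and offloading is not mandatory, the runtime normally falls back to the host.

```cpp
// Hypothetical helper: doubles every element of `data` inside a target
// region that is bound to the device with the given number.
// Without -fopenmp the pragma is ignored and the loop runs on the host.
void scale_on_device(double* data, int n, int device) {
    // device(...) selects the target device; map(tofrom: ...) copies the
    // array section to the device before the region and back afterwards.
    #pragma omp target device(device) map(tofrom: data[0:n])
    {
        for (int i = 0; i < n; ++i) {
            data[i] *= 2.0;
        }
    }
}
```

Calling scale_on_device(values, count, 0) would request device number 0; everything outside the region still executes on the host.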

What would be interesting is if one can link to two different offload targets.

Say, two different gpus from separate manufacturers, e.g. one onboard chip with much memory shared between gpu and cpu, and one separate gpu on pci without that. One could then populate both gpus or, if a gpu fails in a computer center, replace it with a newer and possibly different gpu and then, at runtime, let the application use the newly installed graphics card.

Unfortunately, with only one gpu, I can not test if that is possible.


> You have to run . /etc/profile afterwards as it adds a new dir to PATH.

Thanks. Yes, that works. I had just read the manual

https://wiki.gentoo.org/wiki/Clang

which is unfortunately silent about this.


> LLVM's CMake is a mess. You'd have to look at the build log and copy the arguments and go from there. At a glance, -DLIBOMP_USE_QUAD_PRECISION=OFF may work.

Ah thanks. That may be it...

Yes, clang's build system is really a mess. But also Clang's documentation should be updated...

Best regards,

Benjamin
Comment 44 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-12-08 04:39:27 UTC
(In reply to Benjamin Schulz from comment #43)
> 
> > Yeah, one of the interesting things about the GCC offloading at least is it can support both at once and choose at runtime which to use. I am not sure if the LLVM one does or not (not saying it doesn't, just that I don't know).
> 
> 
> Well, openmp code generally runs on the host. Only code within a
> 
> #pragma omp target device(number)
> {region}
> 
> construct is offloaded: the instructions inside {region} are uploaded to the
> target device with the given number at runtime.
> 
> What would be interesting is if one can link to two different offload
> targets.
> 
> Say, two different gpus from separate manufacturers, e.g. one onboard chip
> with much memory shared between gpu and cpu, and one separate gpu on pci
> without that. One could then populate both gpus or, if a gpu fails in a
> computer center, replace it with a newer and possibly different gpu and
> then, at runtime, let the application use the newly installed graphics card.
> 
> Unfortunately, with only one gpu, I can not test if that is possible.
> 

I think GCC supports that at least:
"No hardware-vendor libraries (like CUDA or ROCm) are required for compilation. And when run: if the hardware library is not available and/or no suitable offload device is available, host fallback is done. Compiling with codegeneration both Nvidia (nvptx) and AMD GPUs in the same binary is supported; whether that program then run on the host, on Nvidia GPUs, or AMD GPUs – or on Nvidia and AMD GPUs is decided at run time. "

from https://gcc.gnu.org/wiki/Offloading.
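
To illustrate what "decided at run time" can look like from the program's side, here is a minimal sketch that asks the OpenMP runtime how many offload devices it sees and launches one target region per device. It only uses the standard OpenMP API calls omp_get_num_devices and omp_is_initial_device; the function name count_offloaded_regions and the host-only stand-ins are hypothetical. It makes no claim about whether a given compiler can embed code for several vendors in one binary.

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: host-only stand-ins so the sketch also builds without -fopenmp.
static int omp_get_num_devices() { return 0; }
static int omp_is_initial_device() { return 1; }
#endif

// Hypothetical helper: runs one target region per device reported by the
// runtime and returns how many of those regions really ran off the host.
int count_offloaded_regions() {
    const int num_devices = omp_get_num_devices();
    int offloaded = 0;
    for (int dev = 0; dev < num_devices; ++dev) {
        int on_host = 1;
        // The region reports whether it executed on the host (fallback)
        // or on the selected device.
        #pragma omp target device(dev) map(from: on_host)
        {
            on_host = omp_is_initial_device();
        }
        std::printf("device %d: region ran on the %s\n",
                    dev, on_host ? "host" : "device");
        if (!on_host) {
            ++offloaded;
        }
    }
    return offloaded;
}
```

With no usable offload device the loop body never runs (or the regions fall back to the host) and the function returns 0.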
Comment 45 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-12-08 04:39:40 UTC
(In reply to Benjamin Schulz from comment #43)
> Thanks. Yes that works. I just had read the manual 
> 
> https://wiki.gentoo.org/wiki/Clang
> 
> which is unfortunately silent about this.
> 

Please edit it ;)
Comment 46 Benjamin Schulz 2024-12-08 06:31:01 UTC
>I think GCC supports that at least:

That is interesting. Not only for computer centers, where gpus often fail due to heavy usage and their sheer number, but also for consumers who have onboard gpus, which could then be used for something useful while the main gpu is doing something else.
Comment 47 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-12-17 22:30:41 UTC
commit d9d53c78c0f6fa892498d313724d7dbfc7043401
Author: Michał Górny <mgorny@gentoo.org>
Date:   Tue Dec 17 22:34:38 2024 +0100

    llvm-runtimes/offload: Add 19.1.6

    Signed-off-by: Michał Górny <mgorny@gentoo.org>