After bumping to ROCm version 4.3 (sys-devel/llvm-roc-4.3.0), sci-libs/rocBLAS-4.3 (in development, awaiting merge) fails to build with the following errors:

[45/276] /opt/gentoo/usr/lib/hip/bin/hipcc -DBUILD_WITH_TENSILE=1 -DROCBLAS_INTERNAL_API -DROCM_USE_FLOAT16 -DTENSILE_DEFAULT_SERIALIZATION -DTENSILE_MSGPACK=1 -DTENSILE_USE_HIP -DUSE_TENSILE_HOST -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Drocblas_EXPORTS -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/include/internal -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/include/internal -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas3/Tensile -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/virtualenv/lib/python3.9/site-packages/Tensile/Source/lib/include -march=native -mtune=native -O2 -pipe -D__HIP_HCC_COMPAT_MODE__=1 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-unused-command-line-argument -mf16c -Werror=vla -xhip --hip-device-lib-path=/opt/gentoo/usr/lib/amdgcn/bitcode --offload-arch=gfx906:xnack- -std=c++14 -MD -MT library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o -MF library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o.d -o library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o -c /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/rocblas_nrm2_strided_batched_ex.cpp
FAILED: library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o
/opt/gentoo/usr/lib/hip/bin/hipcc -DBUILD_WITH_TENSILE=1 -DROCBLAS_INTERNAL_API -DROCM_USE_FLOAT16 -DTENSILE_DEFAULT_SERIALIZATION -DTENSILE_MSGPACK=1 -DTENSILE_USE_HIP -DUSE_TENSILE_HOST -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -Drocblas_EXPORTS -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/include/internal -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/include/internal -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas3/Tensile -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/include -I/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-4.3.0_build/virtualenv/lib/python3.9/site-packages/Tensile/Source/lib/include -march=native -mtune=native -O2 -pipe -D__HIP_HCC_COMPAT_MODE__=1 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-unused-command-line-argument -mf16c -Werror=vla -xhip --hip-device-lib-path=/opt/gentoo/usr/lib/amdgcn/bitcode --offload-arch=gfx906:xnack- -std=c++14 -MD -MT library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o -MF library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o.d -o library/src/CMakeFiles/rocblas.dir/blas_ex/rocblas_nrm2_strided_batched_ex.cpp.o -c /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/rocblas_nrm2_strided_batched_ex.cpp
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/rocblas_nrm2_strided_batched_ex.cpp:5:
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_reduction_impl.hpp:11:
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_reduction_template.hpp:7:
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/fetch_template.hpp:34:17: error: reference to __host__ function 'norm<float>' in __host__ __device__ function
    return std::norm(A);
                ^
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_nrm2.hpp:15:17: note: called by 'operator()<float>'
        return {fetch_abs2(x)};
                ^
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/reduction_strided_batched.hpp:248:19: note: called by 'rocblas_reduction_strided_batched_kernel_part1<512, rocblas_fetch_nrm2<float>, rocblas_reduce_sum, const float *const *, float>'
        tmp[tx] = FETCH{}(x[tid * incx], tid);
                  ^
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/include/g++-v11/complex:1870:5: note: 'norm<float>' declared here
    norm(_Tp __x)
    ^
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/rocblas_nrm2_strided_batched_ex.cpp:5:
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_reduction_impl.hpp:11:
In file included from /tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_reduction_template.hpp:7:
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/fetch_template.hpp:34:17: error: reference to __host__ function 'norm<double>' in __host__ __device__ function
    return std::norm(A);
                ^
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/rocblas_nrm2.hpp:15:17: note: called by 'operator()<double>'
        return {fetch_abs2(x)};
                ^
/tmp/portage/sci-libs/rocBLAS-4.3.0/work/rocBLAS-rocm-4.3.0/library/src/blas_ex/../blas1/reduction_strided_batched.hpp:248:19: note: called by 'rocblas_reduction_strided_batched_kernel_part1<512, rocblas_fetch_nrm2<double>, rocblas_reduce_sum, const double *const *, double>'
        tmp[tx] = FETCH{}(x[tid * incx], tid);
                  ^
/opt/gentoo/usr/lib/gcc/x86_64-pc-linux-gnu/11.2.0/include/g++-v11/complex:1870:5: note: 'norm<double>' declared here
    norm(_Tp __x)
    ^
2 errors generated when compiling for gfx906.

Similar errors follow for other source files.
Reproducible: Always

Steps to Reproduce:
1. emerge '=sys-devel/llvm-roc-4.3.0'
2. emerge '=sci-libs/rocBLAS-4.3.0'

After investigation, the cause turns out to be the missing ROCM_PATH environment variable, which makes clang search for the HIP headers incorrectly. As a result, hip_runtime.h and the CUDA wrapper headers for the standard <complex> library are included in the wrong way, so clang resolves std::norm to the declaration in GCC's <complex>, which is a __host__-only function with no implementation for GPU devices.

From 4.0.0 to 4.1.0, sys-devel/llvm-roc applied llvm-roc-4.0.0-hip-location.patch, which replaced the code that searches for the HIP runtime through $ROCM_PATH with a fixed HIP installation location. From 4.2.0 on, sys-devel/llvm-roc dropped this patch.

Meanwhile, before hip-4.1.0-r1, ROCM_PATH was set in an environment file (env.d/99-hip), so things worked even without llvm-roc-4.0.0-hip-location.patch. From hip-4.1.0-r1 on, ROCM_PATH was removed from the environment and written directly into hipvars.pm instead.

Together, these two changes make llvm-roc-4.2 and llvm-roc-4.3 search the include directories incorrectly.

Restoring llvm-roc-4.0.0-hip-location.patch is strongly suggested.
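The interaction between the two changes can be summarised as a small decision table. The following is only an illustrative sketch in Python, not code from llvm-roc, hip, or clang: the names `hip_headers_resolved`, `SCENARIOS`, and `summarize` are hypothetical, and the rows simply enumerate the four combinations of the two conditions described in this report (not every combination was necessarily installable at the same time).

```python
# Illustrative model only: the names below are hypothetical and are not part
# of clang, hipcc, or any ROCm tool.


def hip_headers_resolved(has_hip_location_patch: bool, rocm_path_in_env: bool) -> bool:
    """Model whether clang is expected to locate the HIP runtime correctly.

    Per the analysis in this report, either condition is sufficient:
      * hip-location.patch hard-codes the HIP installation location, or
      * ROCM_PATH is exported (formerly through env.d/99-hip).
    """
    return has_hip_location_patch or rocm_path_in_env


# Each row: (description, patch applied, ROCM_PATH in environment).
SCENARIOS = [
    ("llvm-roc 4.0.0-4.1.0, hip before 4.1.0-r1", True, True),
    ("llvm-roc 4.0.0-4.1.0, hip 4.1.0-r1 or later", True, False),
    ("llvm-roc 4.2.0+, hip before 4.1.0-r1", False, True),
    ("llvm-roc 4.2.0+, hip 4.1.0-r1 or later", False, False),
]


def summarize(scenarios=SCENARIOS):
    """Map each scenario description to the modelled outcome."""
    return {desc: hip_headers_resolved(patch, env) for desc, patch, env in scenarios}


if __name__ == "__main__":
    for desc, ok in summarize().items():
        print(f"{desc}: {'ok' if ok else 'broken include search'}")
```

Only the last row, where the patch is gone and ROCM_PATH is no longer exported, is modelled as broken, which matches the llvm-roc-4.2 and llvm-roc-4.3 behaviour reported here.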
Created attachment 734872
build.log

I chose asm_lite as the Tensile library set to compile, because the default, "asm_full", takes a long time and fills build.log with tens of thousands of lines.
Created attachment 734875
temp/environment
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e78aa10a00b855cc9ab96fb36d1cebec991530ac

commit e78aa10a00b855cc9ab96fb36d1cebec991530ac
Author:     YiyangWu <xgreenlandforwyy@gmail.com>
AuthorDate: 2021-08-21 11:00:55 +0000
Commit:     Benda Xu <heroxbd@gentoo.org>
CommitDate: 2021-08-26 12:38:58 +0000

    sys-devel/llvm-roc: add hip-location.patch back

    Clang from llvm-roc-4.3.0 throws error during compilation of rocm packages for GPU devices (e.g. rocBLAS). The missing of $ROCM_PATH and deprecation of hip-location.patch together causes in this situation. This commit update the hip-location.patch so it can be used again.

    Closes: https://bugs.gentoo.org/809392
    Closes: https://github.com/gentoo/gentoo/pull/22060
    Package-Manager: Portage-3.0.20, Repoman-3.0.3
    Signed-off-by: Yiyang Wu <xgreenlandforwyy@gmail.com>
    Signed-off-by: Benda Xu <heroxbd@gentoo.org>

 .../files/llvm-roc-4.3.0-hip-location.patch        | 189 +++++++++++++++++++++
 ...m-roc-4.3.0.ebuild => llvm-roc-4.3.0-r1.ebuild} |   1 +
 2 files changed, 190 insertions(+)