Bug 879907 - sci-libs/rocBLAS-5.1.3 does not respect AMDGPU_TARGETS and fails to install with an AMD Zen4 CPU.
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Science Related Packages
 
Reported: 2022-11-05 21:15 UTC by Nick Wallingford
Modified: 2022-12-23 21:48 UTC

Attachments
build log (sci-libs:rocBLAS-5.1.3:20221105-195922.log, 6.94 KB, text/plain)
2022-11-06 22:21 UTC, Nick Wallingford
emerge --info (emerge_info, 6.89 KB, text/plain)
2022-11-07 04:28 UTC, Nick Wallingford
build log (sci-libs_rocBLAS-5.1.3_20221107-042344.log, 5.64 KB, text/plain)
2022-11-07 04:29 UTC, Nick Wallingford
build.log with ROCM_TARGET_LST trick (sci-libs:rocBLAS-5.1.3:20221112-204448.log.gz, 332.80 KB, application/gzip)
2022-11-12 21:43 UTC, Nick Wallingford

Description Nick Wallingford 2022-11-05 21:15:53 UTC
I have an AMD Ryzen 7950X CPU and an AMD Radeon 6800XT GPU. The CPU includes a small integrated GPU (APU), an RDNA2 part not entirely dissimilar from the 6800XT. The output of rocm_agent_enumerator (from dev-util/rocminfo) shows three agents: gfx000, gfx1030, and gfx1036. gfx000 is presumably just the CPU, gfx1030 is the 6800XT, and gfx1036 is the APU.

I want rocBLAS to run on my GPU. I have no particular need to run it on the APU (though it would be nice), so I set AMDGPU_TARGETS="gfx1030" in make.conf to tell ROCm to target my GPU and ignore the APU.

When I try to compile rocBLAS I get this error message:

-- Check for working CXX compiler: /usr/bin/hipcc
-- Check for working CXX compiler: /usr/bin/hipcc - broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCXXCompiler.cmake:62 (message):
  The C++ compiler

    "/usr/bin/hipcc"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /var/tmp/portage/sci-libs/rocBLAS-5.1.3/work/rocBLAS-rocm-5.1.3_build/CMakeFiles/CMakeTmp
    
    Run Build Command(s):/usr/bin/ninja cmTC_a64ff && [1/2] Building CXX object CMakeFiles/cmTC_a64ff.dir/testCXXCompiler.cxx.o
    FAILED: CMakeFiles/cmTC_a64ff.dir/testCXXCompiler.cxx.o 
    /usr/bin/hipcc    -O2 -march=native -pipe -Wl,-O1 -Wl,--as-needed -o CMakeFiles/cmTC_a64ff.dir/testCXXCompiler.cxx.o -c /var/tmp/portage/sci-libs/rocBLAS-5.1.3/work/rocBLAS-rocm-5.1.3_build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    clang-14: warning: -Wl,-O1: 'linker' input unused [-Wunused-command-line-argument]
    clang-14: warning: -Wl,--as-needed: 'linker' input unused [-Wunused-command-line-argument]
    clang-14: error: invalid target ID 'gfx1036'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
    ninja: build stopped: subcommand failed.

Very well; gfx1036 support doesn't exist yet in llvm:14, but it does in llvm:15. I made local copies of the dev-util/hip, dev-util/Tensile, dev-libs/rocm-comgr, dev-libs/rocm-device-libs, and dev-libs/rocr-runtime ebuilds and set the max llvm version to 15. I get a similar error message:

    Run Build Command(s):/usr/bin/ninja cmTC_f6a41 && [1/2] Building CXX object CMakeFiles/cmTC_f6a41.dir/testCXXCompiler.cxx.o
    FAILED: CMakeFiles/cmTC_f6a41.dir/testCXXCompiler.cxx.o 
    /usr/bin/hipcc    -O2 -march=native -pipe -Wl,-O1 -Wl,--as-needed -o CMakeFiles/cmTC_f6a41.dir/testCXXCompiler.cxx.o -c /var/tmp/portage/sci-libs/rocBLAS-5.1.3/work/rocBLAS-rocm-5.1.3_build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    clang-15: warning: -Wl,-O1: 'linker' input unused [-Wunused-command-line-argument]
    clang-15: warning: -Wl,--as-needed: 'linker' input unused [-Wunused-command-line-argument]
    clang-15: error: cannot find ROCm device library for gfx1036; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
    ninja: build stopped: subcommand failed.
 
CMake probably shouldn't test the compiler by building for a target that I won't be using. Alternatively, the ROCm stack could be bumped to 5.3.1; gfx1036 support was added in 5.2.0.

Reproducible: Always
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-11-06 06:07:57 UTC
Please attach the full build.log and emerge --info.
Comment 2 Yiyang Wu 2022-11-06 06:18:44 UTC
hipcc automatically detects all GPU architectures present on the system and compiles for each of them. Your Zen4 CPU contains a gfx1036 GPU, which is not covered by the ROCm 5.1.3 release. Although AMDGPU_TARGETS=gfx1030 is specified, it has no effect on CMake's test of the CXX compiler, so hipcc fails at the very first stage.

Possible solutions:

1. Upgrade the ROCm toolchain

2. Find a way to pass the --offload-arch=${AMDGPU_TARGETS} flag to CMake's test CXX command
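
As a rough illustration of option 2, the sketch below only shows how a target list could be mapped to --offload-arch flags. The helper name offload_arch_flags is hypothetical, accepting both space- and semicolon-separated lists is an assumption, and whether CMake's compiler test would actually pick up such flags is not verified in this bug.

```shell
#!/bin/sh
# Hypothetical helper (not part of hipcc, CMake, or any eclass):
# turn a target list such as "gfx1030" or "gfx1030;gfx906"
# into one --offload-arch flag per target.
offload_arch_flags() {
    # Assumption: the list may be space- or semicolon-separated.
    for target in $(printf '%s' "$1" | tr ';' ' '); do
        printf '%s ' "--offload-arch=$target"
    done
}

# Fall back to the reporter's GPU if AMDGPU_TARGETS is unset.
AMDGPU_TARGETS="${AMDGPU_TARGETS:-gfx1030}"
flags=$(offload_arch_flags "$AMDGPU_TARGETS")
echo "$flags"
```

The resulting string would still have to reach the hipcc command line that CMake uses for its compiler check, which is the open part of this option.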
Comment 3 Nick Wallingford 2022-11-06 22:21:53 UTC
Created attachment 828203 [details]
build log
Comment 4 Nick Wallingford 2022-11-06 22:54:17 UTC
I've since run emerge -e world with LTO, -fomg-optimize, etc. I'll set off a new one tonight with sane CFLAGS, re-confirm the issue, and post emerge --info in the morning. If the build log is meaningfully different I'll post a new copy of that, but I don't expect it to change.

Honestly it would probably be easiest to just version bump ROCm. The portage tree is on 5.1.3 and upstream is 5.3.1.
Comment 5 Nick Wallingford 2022-11-07 04:28:59 UTC
Created attachment 828207 [details]
emerge --info
Comment 6 Nick Wallingford 2022-11-07 04:29:39 UTC
Created attachment 828209 [details]
build log
Comment 7 Yiyang Wu 2022-11-11 04:01:24 UTC
A possible workaround is to trick your system into believing that only the 6800XT is present.

hipcc uses rocm_agent_enumerator to detect the GPUs present. You can override its detection with a self-specified target list:

Run `echo gfx1030 > /tmp/amdgpu.list`, then `export ROCM_TARGET_LST=/tmp/amdgpu.list`. Then run hipcc or emerge a HIP-based package and see whether CMake stops complaining that hipcc is unable to compile.
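
Put together, the override might look like the following sketch. The file location is arbitrary (the TMPDIR fallback is an addition for illustration), and the final check runs only if rocm_agent_enumerator is installed.

```shell
#!/bin/sh
# Write the single desired target to a list file.
# The path is illustrative; any readable file should do.
target_list="${TMPDIR:-/tmp}/amdgpu.list"
echo gfx1030 > "$target_list"

# Point rocm_agent_enumerator at the list so it stops probing the hardware.
export ROCM_TARGET_LST="$target_list"

# Optional sanity check: gfx1036 should no longer be reported.
if command -v rocm_agent_enumerator >/dev/null 2>&1; then
    rocm_agent_enumerator
fi
```

The export only affects the current shell, so hipcc or emerge has to be started from that same shell for the override to apply.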
Comment 8 Nick Wallingford 2022-11-12 21:40:50 UTC
Progress! With the ROCM_TARGET_LST trick it now fails with the following error message:

FAILED: Tensile/lib/CMakeFiles/TensileHost.dir/source/hip/HipSolutionAdapter.cpp.o 
/usr/bin/hipcc -DTENSILE_DEFAULT_SERIALIZATION -DTENSILE_MSGPACK=1 -DTENSILE_USE_HIP -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/usr/share/Tensile/Source/lib/include  -O2 -march=native -pipe -ftree-vectorize -D__HIP_HCC_COMPAT_MODE__=1 -fPIC -Wno-unused-command-line-argument -Wno-unused-result -x hip --hip-device-lib-path=/usr/lib/amdgcn/bitcode --offload-arch=gfx1030 -std=c++14 -MD -MT Tensile/lib/CMakeFiles/TensileHost.dir/source/hip/HipSolutionAdapter.cpp.o -MF Tensile/lib/CMakeFiles/TensileHost.dir/source/hip/HipSolutionAdapter.cpp.o.d -o Tensile/lib/CMakeFiles/TensileHost.dir/source/hip/HipSolutionAdapter.cpp.o -c /usr/share/Tensile/Source/lib/source/hip/HipSolutionAdapter.cpp
In file included from /usr/share/Tensile/Source/lib/source/hip/HipSolutionAdapter.cpp:33:
In file included from /usr/share/Tensile/Source/lib/include/Tensile/EmbeddedData.hpp:30:
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/memory:77:
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/shared_ptr.h:53:
/usr/lib/gcc/x86_64-pc-linux-gnu/12/include/g++-v12/bits/shared_ptr_base.h:196:22: error: use of undeclared identifier 'noinline'; did you mean 'inline'?
      __attribute__((__noinline__))
                     ^
/usr/include/hip/amd_detail/host_defines.h:50:37: note: expanded from macro '__noinline__'
#define __noinline__ __attribute__((noinline))
                                    ^
Comment 9 Nick Wallingford 2022-11-12 21:43:04 UTC
Created attachment 832201 [details]
build.log with ROCM_TARGET_LST trick
Comment 10 Yiyang Wu 2022-11-13 04:14:12 UTC
(In reply to Nick Wallingford from comment #9)
> Created attachment 832201 [details]
> build.log with ROCM_TARGET_LST trick

So the trick solves this bug.

The remaining failure is a duplicate of https://bugs.gentoo.org/857126.

A fix for it is currently pending in https://github.com/gentoo/gentoo/pull/28144
Comment 11 Nick Wallingford 2022-11-13 23:04:55 UTC
Good to hear. Of course, if ROCm were updated to 5.3.1, both of these would be unnecessary; 5.3.1 targets llvm-15, and 5.3.1 supports gfx1036.
Comment 12 Nick Wallingford 2022-12-23 21:48:01 UTC
The Portage tree has now been updated to ROCm 5.3.3, which fixes this issue.