Even with https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7475e8ff0742ead7b45edea3ed7e79c394e23958, packages like rccl and composable-kernel hard-code the ABI of gfx908 to gfx908:xnack-. For example, composable-kernel invokes:

/opt/gentoo/usr/bin/hipcc -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/tmp/portage/sci-libs/composable-kernel-5.7.1-r1/work/composable_kernel-rocm-5.7.1/library/include -I/tmp/portage/sci-libs/composable-kernel-5.7.1-r1/work/composable_kernel-rocm-5.7.1/include -O2 -pipe -march=znver2 -DNDEBUG -std=c++17 -fPIC -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wsign-compare -Wno-extra-semi-stmt -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Wsign-compare -Wno-extra-semi-stmt -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -x hip --offload-arch=gfx908:xnack- --offload-arch=gfx90a:xnack+ -MD -MT library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/quantization/CMakeFiles/device_quantization_instance.dir/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp.o -c /tmp/portage/sci-libs/composable-kernel-5.7.1-r1/work/composable_kernel-rocm-5.7.1/library/src/tensor_operation_instance/gpu/quantization/conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp

Reproducible: Always
composable-kernel-6.1.1 does not recognize gfx1031. include/ck/ck.hpp should be modified so that the gfx1030 branch also covers gfx1031:

#elif defined(__gfx1030__) || defined(__gfx1031__) // for GPU code
#define CK_BUFFER_RESOURCE_3RD_DWORD 0x31014000
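
The proposed condition can be checked on the host with a minimal sketch. This is not the actual contents of include/ck/ck.hpp: only the gfx1030/gfx1031 branch and its value come from this report. The manual definition of __gfx1031__ stands in for what the compiler would predefine when building for gfx1031, and the fallback value and the helper names buffer_resource_3rd_dword and self_check are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Host-side simulation (assumption): when building device code for gfx1031
// the compiler is expected to predefine this macro itself.
#ifndef __gfx1031__
#define __gfx1031__ 1
#endif

// Branch quoted in this report, written as a standalone #if for the sketch.
#if defined(__gfx1030__) || defined(__gfx1031__) // for GPU code
#define CK_BUFFER_RESOURCE_3RD_DWORD 0x31014000
#else
// Hypothetical fallback so the sketch still compiles for other targets.
#define CK_BUFFER_RESOURCE_3RD_DWORD -1
#endif

// Hypothetical helper, not part of composable-kernel: exposes the selected
// macro value as a typed constant.
constexpr std::int64_t buffer_resource_3rd_dword()
{
    return CK_BUFFER_RESOURCE_3RD_DWORD;
}

// With __gfx1031__ defined, the gfx1030 value must be selected.
static_assert(buffer_resource_3rd_dword() == 0x31014000,
              "gfx1031 should take the gfx1030 branch");

// Hypothetical runtime check mirroring the compile-time one.
inline void self_check()
{
    assert(buffer_resource_3rd_dword() == 0x31014000);
}
```

Without the added defined(__gfx1031__) term, the same check would fall through to the fallback branch in this sketch, which corresponds to the "does not recognize gfx1031" symptom.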