Summary: | sci-libs/caffe2-2.2.1-r1: nvcc fatal : Unsupported gpu architecture 'compute_35' | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | look |
Component: | Current packages | Assignee: | Tupone Alfredo <tupone> |
Status: | UNCONFIRMED --- | ||
Severity: | normal | CC: | bugzilla, look, negril.nx+gentoo, parona |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: |
The build log.
The emerge --info output
The build environment.
This is the build log after setting TORCH_CUDA_ARCH_LIST and using GCC12
The new (3rd) build.log. |
Description
look
2024-04-04 14:23:01 UTC
Created attachment 889403 [details]
The build log.
Created attachment 889404 [details]
The emerge --info output
Created attachment 889405 [details]
The build environment.
*** Bug 928579 has been marked as a duplicate of this bug. ***
> nvcc fatal : Unsupported gpu architecture 'compute_35'
> WARNING: caffe2 is being built with its default CUDA compute capabilities: 3.5 and 7.0.
> These may not be optimal for your GPU.
>
> To configure caffe2 with the CUDA compute capability that is optimal for your GPU,
> set TORCH_CUDA_ARCH_LIST in your make.conf, and re-emerge caffe2.
> For example, to use CUDA capability 7.5 & 3.5, add: TORCH_CUDA_ARCH_LIST=7.5 3.5
> For a Maxwell model GPU, an example value would be: TORCH_CUDA_ARCH_LIST=Maxwell
>
> You can look up your GPU's CUDA compute capability at https://developer.nvidia.com/cuda-gpus
> or by running /opt/cuda/extras/demo_suite/deviceQuery | grep 'CUDA Capability'
Nevertheless nvidia-cuda-toolkit-12 only supports 5.0+.
I added TORCH_CUDA_ARCH_LIST="6.1" and I still get the same error. I've tried switching CFLAGS to "-march=native -O2 -pipe" and using GCC12 & GCC13, and it fails in both cases even though arch=compute_61. I honestly don't have any other ideas and the error is not descriptive.
Can you add the build.log for TORCH_CUDA_ARCH_LIST="6.1"?
Created attachment 889534 [details]
This is the build log after setting TORCH_CUDA_ARCH_LIST and using GCC12
So I got a similar error in https://bugs.gentoo.org/928605 with media-libs/opencv and I was able to fix it by using GCC12 (as well as disabling the sandbox) based on this Reddit post: https://www.reddit.com/r/Gentoo/comments/1arlsfi/cuda_gcc_too_recent/. However, I have not been able to fix sci-libs/caffe2 with the same approach.
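For reference, the mismatch discussed above (a default capability of 3.5 against a toolkit that, per the comment above, only supports 5.0+) can be checked before starting a long build. The following is a minimal sketch, not part of the ebuild: the helper name `unsupported_archs` is hypothetical, the 5.0 minimum is taken from the comment above rather than from NVIDIA documentation, and the handling of `;` separators and a `+PTX` suffix is an assumption about the accepted TORCH_CUDA_ARCH_LIST syntax. Named values such as "Maxwell" are skipped rather than resolved.

```python
def unsupported_archs(arch_list, minimum=(5, 0)):
    """Return the numeric entries of a TORCH_CUDA_ARCH_LIST-style string
    that are lower than `minimum` (assumed here to be 5.0)."""
    rejected = []
    # Assumption: entries are separated by spaces or semicolons.
    for entry in arch_list.replace(";", " ").split():
        # Assumption: an optional "+PTX" suffix may follow the version.
        version = entry.split("+", 1)[0]
        try:
            major, minor = (int(part) for part in version.split("."))
        except ValueError:
            # Named architectures (e.g. "Maxwell") are not resolved here.
            continue
        if (major, minor) < minimum:
            rejected.append(entry)
    return rejected


print(unsupported_archs("7.5 3.5"))  # ['3.5']
print(unsupported_archs("6.1"))      # []
```

An empty result for "6.1" matches what the thread reports: setting the variable to a supported value alone did not make the 'compute_35' error go away, so the default was evidently still being picked up somewhere.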
The CC="gcc-12" CXX="g++-12" is causing:
> /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib64/libprotobuf.so.23.3.0: undefined reference to `std::ios_base_library_init()@GLIBCXX_3.4.32'
> /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib64/libprotobuf.so.23.3.0: undefined reference to `__cxa_call_terminate@CXXABI_1.3.15'
Meaning protobuf was compiled with gcc-13.
You don't _ever_ need to set CC or CXX to make cuda work. For most cuda related packages you only need to set the cuda host compiler and the arch. See https://wiki.gentoo.org/wiki/User:Negril/CUDA.
For caffe2 you need to add for now:
> export TORCH_CUDA_ARCH_LIST="6.1"
Making the full env file:
> CUDA_VERBOSE="false"
> CUDAHOSTCXX="/usr/x86_64-pc-linux-gnu/gcc-bin/12"
> TORCH_CUDA_ARCH_LIST="6.1"
Created attachment 889965 [details]
The new (3rd) build.log.
Hi. So I rebuilt world and now the error I'm getting when building caffe2 is:
CMake Error in torch/CMakeLists.txt:
Imported target "pybind::pybind11" includes non-existent path
"/include"
in its INTERFACE_INCLUDE_DIRECTORIES. Possible reasons include:
* The path was deleted, renamed, or moved to another location.
* An install or uninstall procedure did not complete successfully.
* The installation package was faulty and references files it does not
provide.
CMake Error in torch/CMakeLists.txt:
Imported target "pybind::pybind11" includes non-existent path
"/include"
in its INTERFACE_INCLUDE_DIRECTORIES. Possible reasons include:
* The path was deleted, renamed, or moved to another location.
* An install or uninstall procedure did not complete successfully.
* The installation package was faulty and references files it does not
provide.
CMake Error in torch/CMakeLists.txt:
Imported target "pybind::pybind11" includes non-existent path
"/include"
in its INTERFACE_INCLUDE_DIRECTORIES. Possible reasons include:
* The path was deleted, renamed, or moved to another location.
* An install or uninstall procedure did not complete successfully.
* The installation package was faulty and references files it does not
provide.
CMake Error in functorch/CMakeLists.txt:
Imported target "pybind::pybind11" includes non-existent path
"/include"
in its INTERFACE_INCLUDE_DIRECTORIES. Possible reasons include:
* The path was deleted, renamed, or moved to another location.
* An install or uninstall procedure did not complete successfully.
* The installation package was faulty and references files it does not
provide.
|