[1214/3992] cd /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir && /usr/bin/cmake -E make_directory /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=Gentoo -D generated_file:STRING=/var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o -D generated_cubin_file:STRING=/var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o.cubin.txt -P /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//gloo_cuda_generated_cuda_private.cu.o.Gentoo.cmake
FAILED: third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir/gloo_cuda_generated_cuda_private.cu.o
cd /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir && /usr/bin/cmake -E make_directory /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=Gentoo -D generated_file:STRING=/var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o -D generated_cubin_file:STRING=/var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o.cubin.txt -P /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//gloo_cuda_generated_cuda_private.cu.o.Gentoo.cmake
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/g++-v10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/g++-v10/chrono:473:154:   required from here
/usr/lib/gcc/x86_64-pc-linux-gnu/10.3.0/include/g++-v10/chrono:428:27: internal compiler error: Segmentation fault
  428 |  _S_gcd(intmax_t __m, intmax_t __n) noexcept
      |  ^~~~~~
0x7f62e1a63bdf ???
        /var/tmp/portage/sys-libs/glibc-2.33/work/glibc-2.33/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x7f62e1a4e80c __libc_start_main
        ../csu/libc-start.c:332
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://bugs.gentoo.org/> for instructions.
CMake Error at gloo_cuda_generated_cuda_private.cu.o.Gentoo.cmake:278 (message):
  Error generating file
  /var/tmp/portage/sci-libs/pytorch-1.7.1-r2/work/pytorch-1.7.1_build/third_party/gloo/gloo/CMakeFiles/gloo_cuda.dir//./gloo_cuda_generated_cuda_private.cu.o

Reproducible: Always

Actual Results:
After updating cuda-toolkit to dev-util/nvidia-cuda-toolkit-11.3.0-r1 and cudnn to dev-libs/cudnn-8.2.1.32, pytorch fails to (re-)compile.

gcc (Gentoo 10.3.0 p1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Created attachment 717672: emerge --info
Created attachment 717675: build.log
It turned out to be a GCC 10.3 problem, as mentioned here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100102#c20

It can be fixed by applying the following patch to gcc:
https://gcc.gnu.org/git/?p=gcc.git;a=patch;h=f1feb74046e0feb0596b93bbb822fae02940a90e;hp=23fa1e7eab7680ae0488b4c8802b0bcd8f78425d
Thanks for the report and analysis, Frederik!

@toolchain team, I am assigning this to you. This is a gcc bug that has been fixed in the upstream repository. It affects sci-libs/pytorch in the science overlay.
(In reply to Frederik Pfautsch from comment #0)
> gcc (Gentoo 10.3.0 p1) 10.3.0

Can you check whether gcc-10.3.0-r1 (unstable gcc) works for you? We already backported the fix in bug #792705.
(In reply to Sergei Trofimovich from comment #5)
> (In reply to Frederik Pfautsch from comment #0)
> > gcc (Gentoo 10.3.0 p1) 10.3.0
>
> Can you have a check if gcc-10.3.0-r1 works for you? (unstable gcc) We
> already backported the fix in bug #792705.

I am sorry for not replying for so long. pytorch is now throwing another error (with both gcc versions), which I haven't had time to investigate yet. I think r1 fixes this particular problem, since it usually appeared earlier in the build. I saw that r1 got marked stable, so I think this particular bug can be closed now. I will report back once I have found the other problem and can confirm whether this bug is solved.
Let's close as FIXED then. Feel free to file a new bug if you get more details on the current crash.