After I updated to dev-util/nvidia-cuda-toolkit-12.4.0, sci-libs/caffe2-2.2.1-r1 fails to rebuild. Downgrading to dev-util/nvidia-cuda-toolkit-12.3.2 results in a successful rebuild.

Reproducible: Always

Steps to Reproduce:
System GCC:
sys-devel/gcc-12.3.1_p20240209
sys-devel/gcc-13.2.1_p20240210

$ gcc-config -c
x86_64-pc-linux-gnu-13

System nvidia driver:
x11-drivers/nvidia-drivers-535.171.04
(This compile failure is also reproducible with x11-drivers/nvidia-drivers-550.67)

Actual Results:
Section of failed compile message:

/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/IListRef_inl.h:171:13: warning: possibly dangling reference to a temporary [-Wdangling-reference]
  171 | const auto& ivalue = (*it).get();
      | ^~~~~~
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/IListRef_inl.h:171:33: note: the temporary was destroyed at the end of the full expression ‘(& it)->c10::impl::ListIterator<std::optional<at::Tensor>, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue> > >::operator*().c10::impl::ListElementReference<std::optional<at::Tensor>, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue> > >::get()’
  171 | const auto& ivalue = (*it).get();
      | ~~~~~~~~~~~^~
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/boxing/impl/boxing.h: At global scope:
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/boxing/impl/boxing.h:41:104: error: expected primary-expression before ‘>’ token
   41 | struct has_ivalue_to<T, guts::void_t<decltype(std::declval<IValue>().to<T>())>>
      | ^
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/boxing/impl/boxing.h:41:107: error: expected primary-expression before ‘)’ token
   41 | struct has_ivalue_to<T, guts::void_t<decltype(std::declval<IValue>().to<T>())>>
      | ^
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/dispatch/DispatchKeyExtractor.h: In lambda function:
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/dispatch/DispatchKeyExtractor.h:154:24: warning: possibly dangling reference to a temporary [-Wdangling-reference]
  154 | for (const at::Tensor& tensor : ivalue.toTensorList()) {
      | ^~~~~~
/var/tmp/portage/sci-libs/caffe2-2.2.1-r1/work/pytorch-2.2.1/aten/src/ATen/core/dispatch/DispatchKeyExtractor.h:154:53: note: the temporary was destroyed at the end of the full expression ‘__for_begin .c10::impl::ListIterator<at::Tensor, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue> > >::operator*().c10::impl::ListElementReference<at::Tensor, __gnu_cxx::__normal_iterator<c10::IValue*, std::vector<c10::IValue> > >::operator std::conditional_t<true, const at::Tensor&, at::Tensor>()’
  154 | for (const at::Tensor& tensor : ivalue.toTensorList()) {
      | ^
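Until a fixed ebuild is available, the downgrade workaround from this report can be made to stick by masking the newer toolkit. A minimal sketch, assuming /etc/portage/package.mask is a directory (on some systems it is a single file, in which case the atom would be appended to that file instead). The helper name mask_atom and the file name "cuda" are made up for illustration.

```shell
# Hypothetical helper: append a package atom to a mask file, once.
# $1: path to the mask file (assumed to live in a package.mask directory)
# $2: the atom to mask
mask_atom() {
    mask_file=$1
    atom=$2
    mkdir -p "$(dirname "$mask_file")" || return 1
    # Only append if the exact line is not already present.
    grep -qxF -- "$atom" "$mask_file" 2>/dev/null \
        || printf '%s\n' "$atom" >> "$mask_file"
}
```

Run as root, this would look like: mask_atom /etc/portage/package.mask/cuda '>=dev-util/nvidia-cuda-toolkit-12.4.0', followed by re-emerging the toolkit and caffe2.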
Created attachment 889168 [details] emerge --info
Created attachment 889169 [details] emerge --pqv
Created attachment 889175 [details]
head of build.log

The build.log is too large (7.8MB). I can only truncate it and paste the head and the tail section of the log. Hope that's sufficient.
Created attachment 889176 [details] tail of build.log
Upstream issue, with patches: https://github.com/pytorch/pytorch/issues/122169

These two patches fix the build for me:
https://github.com/pytorch/pytorch/commit/2a440348958b3f0a2b09458bd76fe5959b371c0c.patch
https://gitlab.archlinux.org/archlinux/packaging/packages/python-pytorch/-/blob/main/python-pytorch-fix-cuda-12_4.patch?ref_type=heads
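For anyone who needs the patches before an ebuild carries them: Portage's user-patch mechanism picks up files from /etc/portage/patches/<category>/<package>/ for ebuilds that call eapply_user (I am assuming the caffe2 ebuild does, through the default src_prepare path). A rough sketch follows; install_user_patches is a hypothetical helper, and it expects the patch files to have been downloaded beforehand (for the GitLab link, save the raw file rather than the HTML blob page).

```shell
# Hypothetical helper: copy already-downloaded patch files into a
# Portage user-patch directory.
# $1: patch root (normally /etc/portage/patches)
# $2: category/package, e.g. sci-libs/caffe2
# $3...: one or more local patch files
install_user_patches() {
    if [ "$#" -lt 3 ]; then
        echo "usage: install_user_patches ROOT CATEGORY/PACKAGE PATCH..." >&2
        return 2
    fi
    patch_root=$1
    pkg=$2
    shift 2
    dest="$patch_root/$pkg"
    mkdir -p "$dest" || return 1
    for p in "$@"; do
        cp -- "$p" "$dest/" || return 1
    done
}
```

Run as root, for example: install_user_patches /etc/portage/patches sci-libs/caffe2 ./*.patch, then emerge --oneshot sci-libs/caffe2. The build log should mention the user patches being applied if the mechanism is active for this ebuild.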
(In reply to znjameswu from comment #3)
> The build.log is too large (7.8MB). I can only truncate it and paste the
> head and the tail section of the log. Hope that's sufficient

In the future, please compress the log using gzip to make it small enough to attach.
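For reference, a sketch of compressing the log before attaching it. It assumes the log sits at the usual Portage location under the package's temp directory (here that would be /var/tmp/portage/sci-libs/caffe2-2.2.1-r1/temp/build.log); compress_log is a made-up helper name.

```shell
# Hypothetical helper: write a gzip-compressed copy next to the log,
# leaving the original file untouched.
# $1: path to the build log
compress_log() {
    log=$1
    [ -f "$log" ] || { echo "no such file: $log" >&2; return 1; }
    # -9 selects the highest compression level, -c writes to stdout.
    gzip -9 -c -- "$log" > "$log.gz"
}
```

Example: compress_log /var/tmp/portage/sci-libs/caffe2-2.2.1-r1/temp/build.log, then attach the resulting build.log.gz. Compiler logs are highly repetitive, so they usually shrink a lot.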
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e74afa2c5f42fbc0ff31574a7796c7ee39727e9f

commit e74afa2c5f42fbc0ff31574a7796c7ee39727e9f
Author:     Alfredo Tupone <tupone@gentoo.org>
AuthorDate: 2024-04-04 09:19:35 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2024-04-04 09:20:03 +0000

    sci-libs/caffe2: add 2.2.2

    Closes: https://bugs.gentoo.org/928339
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 sci-libs/caffe2/Manifest            |   1 +
 sci-libs/caffe2/caffe2-2.2.2.ebuild | 269 ++++++++++++++++++++++++++++++++++++
 2 files changed, 270 insertions(+)
This bug has not been fixed by caffe2-2.2.2 and the two patches I linked earlier are still required.
(In reply to Ștefan Talpalaru from comment #8)
> This bug has not been fixed by caffe2-2.2.2 and the two patches I linked
> earlier are still required.

I know. I put in a blocker and will wait for upstream, unless they take too long.
Thank you.
> I put a blocking, wait for upstream, unless they 'll take too long.

Upstream fixed it in 2.3.0, but the ebuild was not updated to remove the dep version limitation in "<dev-util/nvidia-cuda-toolkit-12.4.0:=[profiler]".
Correction: it was a partial fix. They missed a patch - https://github.com/pytorch/pytorch/issues/122169