Bug 914572 - sci-libs/caffe2[cuda] does not install files properly
Summary: sci-libs/caffe2[cuda] does not install files properly
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Tupone Alfredo
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-23 13:31 UTC by Yiyang Wu
Modified: 2023-12-25 12:56 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
build.log (caffe2-fail.log.xz,103.47 KB, application/x-xz)
2023-09-23 13:32 UTC, Yiyang Wu
emerge --info (emerge-info.log,7.46 KB, text/x-log)
2023-09-23 13:33 UTC, Yiyang Wu

Description Yiyang Wu 2023-09-23 13:31:23 UTC
The nvfuser.so, functorch.so and libnvfuser_codegen.so are not properly installed in ${ED}:

tmp
└── portage
    └── sci-libs
        └── caffe2-2.0.1-r4
            └── work
                └── pytorch-2.0.1
                    ├── functorch
                    │   └── functorch.so
                    ├── nvfuser
                    │   └── nvfuser.so
                    ├── third_party
                    │   └── nvfuser
                    └── torch
                        └── lib
                            └── libnvfuser_codegen.so

This layout violates the FHS and causes an "installation outside prefix" error on Gentoo Prefix systems.
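
A rough way to spot such files before merging is to walk the image directory and list everything that sits outside the expected prefix. The sketch below is only an illustration, not Portage's own QA check; the function name `files_outside_prefix` and the example prefix are hypothetical.

```python
import os


def files_outside_prefix(image_dir, prefix):
    """Return image-relative paths of files that do not live under `prefix`.

    `image_dir` plays the role of ${D}; `prefix` is the offset prefix
    relative to it, e.g. "opt/gentoo" (hypothetical example value).
    """
    prefix_parts = [p for p in prefix.split("/") if p]
    offenders = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), image_dir)
            parts = rel.split(os.sep)
            # A file is fine only if its leading components equal the prefix.
            if parts[:len(prefix_parts)] != prefix_parts:
                offenders.append(rel)
    return sorted(offenders)
```

With the tree above, every path starting with `tmp/portage` would be reported, since none of them begins with the prefix.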

Also, the missing nvfuser.so causes sci-libs/pytorch to fail to build:

2023-09-23 21:28:37,453 root INFO building 'nvfuser._C' extension
2023-09-23 21:28:37,454 root INFO x86_64-pc-linux-gnu-gcc -shared -fuse-ld=gold -O2 -pipe -march=znver2 -DNDEBUG -L/opt/gentoo/usr/lib64 -o /tmp/portage/sci-libs/pytorch-2.0.1-r1/work/pytorch-2.0.1_python3.11/build/lib.linux-x86_64-cpython-311/nvfuser/_C.cpython-311-x86_64-linux-gnu.so
x86_64-pc-linux-gnu-gcc: fatal error: no input files

This happens because setup.py tries to copy ${S}/torch/nvfuser/nvfuser.so, fails, falls back to compiling the extension, and then cannot find any source files.
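
The copy-then-fall-back behaviour described here can be sketched as follows. This is a simplified illustration, not the actual PyTorch setup.py code; `stage_prebuilt_extension` and its arguments are hypothetical names.

```python
import os
import shutil


def stage_prebuilt_extension(prebuilt, target):
    """Copy a prebuilt shared object into place if it exists.

    Returns True when the copy happened. False means the caller would
    have to compile the extension from source instead, which is the
    step that fails here because the compiler gets no input files.
    """
    if not os.path.isfile(prebuilt):
        return False
    os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
    shutil.copy2(prebuilt, target)
    return True
```

In this picture, a missing nvfuser.so makes the function return False, and the build then attempts a compile step that has nothing to compile.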

Reproducible: Always
Comment 1 Yiyang Wu 2023-09-23 13:32:45 UTC
Created attachment 871192 [details]
build.log
Comment 2 Yiyang Wu 2023-09-23 13:33:19 UTC
Created attachment 871193 [details]
emerge --info
Comment 3 Ștefan Talpalaru 2023-10-05 18:25:38 UTC
Fixed in my overlay: https://github.com/stefantalpalaru/gentoo-overlay
Comment 4 Larry the Git Cow gentoo-dev 2023-12-01 05:53:23 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6066231bec8d7a83aff48ff16eb28e44eadd6ef4

commit 6066231bec8d7a83aff48ff16eb28e44eadd6ef4
Author:     Alfredo Tupone <tupone@gentoo.org>
AuthorDate: 2023-12-01 05:52:18 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2023-12-01 05:53:04 +0000

    sci-libs/caffe2: install nvfuser and functorch files
    
    Closes: https://bugs.gentoo.org/914572
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 sci-libs/caffe2/caffe2-2.0.1-r5.ebuild             | 210 +++++++++++++++++++++
 sci-libs/caffe2/files/caffe2-2.0.1-cudaExtra.patch |  28 +++
 2 files changed, 238 insertions(+)
Comment 5 Jiezhe Wang 2023-12-06 01:56:59 UTC
The issue with this patch is that it will additionally install an '__init__.py' file in '/usr/lib64', whereas the original installation location for this file should be '/usr/lib/python*/site-packages/nvfuser'.
Comment 6 Larry the Git Cow gentoo-dev 2023-12-06 19:49:33 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a30ecc1b69628e20faa989943ff1b0dda32d9d69

commit a30ecc1b69628e20faa989943ff1b0dda32d9d69
Author:     Alfredo Tupone <tupone@gentoo.org>
AuthorDate: 2023-12-06 19:48:46 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2023-12-06 19:49:16 +0000

    sci-libs/caffe2: install nvfuser python module
    
    Bug: https://bugs.gentoo.org/914572
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 sci-libs/caffe2/caffe2-2.1.1.ebuild | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
Comment 7 Tupone Alfredo gentoo-dev 2023-12-06 19:50:54 UTC
(In reply to Jiezhe Wang from comment #5)
> The issue with this patch is that it will additionally install an
> '__init__.py' file in '/usr/lib64', whereas the original installation
> location for this file should be '/usr/lib/python*/site-packages/nvfuser'.

Can you test whether that is fixed in 2.1.1 and, if not, tell me what I need to do?
pytorch 2.1.1 is not yet ready, though.
Comment 8 Yiyang Wu 2023-12-07 04:01:07 UTC
(In reply to Tupone Alfredo from comment #7)
> (In reply to Jiezhe Wang from comment #5)
> > The issue with this patch is that it will additionally install an
> > '__init__.py' file in '/usr/lib64', whereas the original installation
> > location for this file should be '/usr/lib/python*/site-packages/nvfuser'.
> 
> Can you test whether that is fixed in 2.1.1 and, if not, tell me what I need to do?
> pytorch 2.1.1 is not yet ready, though.

Thank you! I will try this out over the weekend.
Comment 9 Yiyang Wu 2023-12-11 11:27:14 UTC
I confirm that the fixed caffe2-2.0.1-r5 works well. pytorch also installs smoothly.

pytorch didn't install /usr/lib64/__init__.py (or maybe I did not follow what you were talking about?)
Comment 10 Tupone Alfredo gentoo-dev 2023-12-11 13:52:23 UTC
However, I don't know what to do with nvfuser.
I need to install _C.cpython-311-x86_64-linux-gnu.so inside the nvfuser python module, but I cannot find out how to build it.
Comment 11 Jiezhe Wang 2023-12-21 06:06:35 UTC
(In reply to Tupone Alfredo from comment #10)
> However, I don't know what to do with nvfuser.
> I need to install _C.cpython-311-x86_64-linux-gnu.so inside the nvfuser
> python module, but I cannot find out how to build it.

The _C.so file is built as nvfuser.so by caffe2, as shown in comment #0.
My temporary solution is to install nvfuser.so with caffe2 and then, when installing pytorch, move `nvfuser.so` and `third_party/nvfuser/python/__init__.py` to the top-level `nvfuser` directory. This should be the state after caffe2 has been built. nvfuser would then be installed automatically.
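
The file shuffling in this workaround could look roughly like the sketch below. It is not the ebuild code; `stage_nvfuser_module` is a hypothetical helper, the location of the built `nvfuser.so` is left to the caller, and the files are copied rather than moved so the originals stay in place.

```python
import os
import shutil


def stage_nvfuser_module(source_root, built_lib):
    """Place nvfuser.so and its __init__.py in <source_root>/nvfuser.

    `built_lib` is the path to the nvfuser.so built by caffe2
    (its location is an assumption supplied by the caller).
    """
    module_dir = os.path.join(source_root, "nvfuser")
    os.makedirs(module_dir, exist_ok=True)
    init_py = os.path.join(
        source_root, "third_party", "nvfuser", "python", "__init__.py"
    )
    shutil.copy2(built_lib, os.path.join(module_dir, "nvfuser.so"))
    shutil.copy2(init_py, os.path.join(module_dir, "__init__.py"))
    return module_dir
```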
Comment 12 Larry the Git Cow gentoo-dev 2023-12-23 16:05:27 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=60f42ca58ef409f2a08c17e6c62fd53b4cc5b87d

commit 60f42ca58ef409f2a08c17e6c62fd53b4cc5b87d
Author:     Alfredo Tupone <tupone@gentoo.org>
AuthorDate: 2023-12-23 16:04:31 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2023-12-23 16:05:09 +0000

    sci-libs/caffe2: fix nvfuser python module
    
    Bug: https://bugs.gentoo.org/914572
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 sci-libs/caffe2/{caffe2-2.1.1-r5.ebuild => caffe2-2.1.1-r6.ebuild} | 1 +
 1 file changed, 1 insertion(+)
Comment 13 Tupone Alfredo gentoo-dev 2023-12-23 16:38:32 UTC
I hope I didn't break eprefix.
Comment 14 Martin Rott 2023-12-24 08:50:59 UTC
Apparently there's a file collision related to this...
Not sure how to solve it, though...

sci-libs/caffe2-2.1.1-r6:

Detected file collision(s):
 * 
 *      /usr/lib/python3.10/site-packages/nvfuser/__init__.py
 *      /usr/lib/python3.10/site-packages/nvfuser/__pycache__/__init__.cpython-310.pyc
 *      /usr/lib/python3.10/site-packages/nvfuser/__pycache__/__init__.cpython-310.opt-1.pyc
 *      /usr/lib/python3.10/site-packages/nvfuser/__pycache__/__init__.cpython-310.opt-2.pyc



portageq owners / /usr/lib/python3.10/site-packages/nvfuser/__init__.py
sci-libs/pytorch-2.1.1
        /usr/lib/python3.10/site-packages/nvfuser/__init__.py
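
A file collision of this kind is an overlap between the paths a new package wants to install and the paths already owned by another package. A minimal sketch of that idea follows, using a hypothetical `find_collisions` helper and hand-written ownership data rather than Portage's real database; the `/usr/lib64` path in the example is illustrative only.

```python
def find_collisions(new_package, new_files, owners):
    """Map each colliding path to the other package that already owns it.

    `owners` maps an installed path to the name of the owning package.
    Paths owned by `new_package` itself (an upgrade) are not collisions.
    """
    collisions = {}
    for path in new_files:
        owner = owners.get(path)
        if owner is not None and owner != new_package:
            collisions[path] = owner
    return collisions


# Hand-written example data (illustrative paths, not a real package database).
owners = {
    "/usr/lib/python3.10/site-packages/nvfuser/__init__.py": "sci-libs/pytorch",
}
new_files = [
    "/usr/lib/python3.10/site-packages/nvfuser/__init__.py",
    "/usr/lib64/libnvfuser_codegen.so",
]
print(find_collisions("sci-libs/caffe2", new_files, owners))
```

Once only one of the two packages installs the file, the overlap disappears.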
Comment 15 Larry the Git Cow gentoo-dev 2023-12-24 11:48:06 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c884f73010d4d4459e9c9fb257a1d7d4467c7b70

commit c884f73010d4d4459e9c9fb257a1d7d4467c7b70
Author:     Alfredo Tupone <tupone@gentoo.org>
AuthorDate: 2023-12-24 11:47:13 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2023-12-24 11:47:48 +0000

    sci-libs/pytorch: nvfuser installed in caffe2
    
    Bug: https://bugs.gentoo.org/914572
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 sci-libs/pytorch/{pytorch-2.1.1.ebuild => pytorch-2.1.1-r1.ebuild} | 2 --
 1 file changed, 2 deletions(-)
Comment 16 Tupone Alfredo gentoo-dev 2023-12-24 11:50:51 UTC
If you still have a conflict, remove pytorch before emerging caffe2, and reinstall it afterwards.

Thanks for the report.

If/when everything looks ok, I'll bump to pytorch 2.1.2
Comment 17 Martin Rott 2023-12-24 14:43:53 UTC
Seems to work fine now... I tried rebuilding both pytorch and caffe2...

Verifying ebuild manifests
>>> Emerging (1 of 2) sci-libs/caffe2-2.1.1-r6::gentoo
>>> Installing (1 of 2) sci-libs/caffe2-2.1.1-r6::gentoo
>>> Recording sci-libs/caffe2 in "world" favorites file...
>>> Completed (1 of 2) sci-libs/caffe2-2.1.1-r6::gentoo
>>> Emerging (2 of 2) sci-libs/pytorch-2.1.1-r1::gentoo
>>> Installing (2 of 2) sci-libs/pytorch-2.1.1-r1::gentoo
>>> Completed (2 of 2) sci-libs/pytorch-2.1.1-r1::gentoo
>>> Jobs: 2 of 2 complete  

                        
Also Merry Christmas :)