921129 – sci-libs/caffe2 and pytorch are using reference (netlib) blas; should allow to use other implementations

Bug 921129 - sci-libs/caffe2 and pytorch are using reference (netlib) blas; should allow to use other implementations


Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	AMD64 Linux

Importance:	Normal normal
Assignee:	Gentoo Linux bug wranglers

URL:
Whiteboard:
Keywords:	PullRequest

Depends on:
Blocks:

Reported:	2023-12-31 14:07 UTC by Sv. Lockal
Modified:	2023-12-31 16:23 UTC
CC List:	0 users

See Also:	https://github.com/gentoo/gentoo/pull/34584
Package list:
Runtime testing required:	---

Description Sv. Lockal 2023-12-31 14:07:02 UTC

In sci-libs/caffe2-2.1.2, because RPATH is set to $ORIGIN in /usr/lib64/libtorch_cpu.so (part of caffe2), pytorch ignores the selected provider and always uses the slowest, reference implementation of BLAS and LAPACK.
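
One way to check the RPATH claim is to read the dynamic section of the library with `readelf -d`. The following is a minimal sketch, not part of the original report: the helper name `rpath_entries` is hypothetical, and it assumes the usual binutils output format in which RPATH and RUNPATH values are printed inside square brackets.

```shell
# Hypothetical helper: extract RPATH/RUNPATH values from `readelf -d`
# output supplied on stdin. Lines of interest look like:
#   0x... (RUNPATH)   Library runpath: [$ORIGIN]
rpath_entries() {
    sed -n -E 's/.*\((RPATH|RUNPATH)\).*\[(.*)\].*/\2/p'
}

# Only inspect the library if it and readelf are actually present.
lib=/usr/lib64/libtorch_cpu.so
if [ -e "$lib" ] && command -v readelf >/dev/null 2>&1; then
    readelf -d "$lib" | rpath_entries
fi
```

If this prints the library's own directory (written as the literal token `$ORIGIN`), the dynamic loader will look next to libtorch_cpu.so, that is in /usr/lib64, for its dependencies.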

After `eselect blas set openblas`, here is how resolution works for libtorch_cpu.so:

ldd /usr/lib64/libtorch_cpu.so | grep liblapack
        liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00007f8934800000)

Here is how resolution works for other (correct) libraries:

ldd /usr/lib/python3.11/site-packages/numpy/linalg/lapack_lite.cpython-311-x86_64-linux-gnu.so | grep liblapack
        liblapack.so.3 => /usr/lib64/lapack/openblas/liblapack.so.3 (0x00007f886bc00000)
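
The two `ldd` checks above can be wrapped in a small script to compare libraries quickly. This is a sketch under assumptions: the function names `resolved_path` and `provider_of` are hypothetical, and the provider is guessed purely from the `/lapack/<provider>/` directory layout visible in the output above.

```shell
# Hypothetical helper: print the path ldd resolved for a given soname.
# Reads ldd output on stdin; lines look like
#   libfoo.so.1 => /path/to/libfoo.so.1 (0x...)
resolved_path() {
    awk -v soname="$1" '$1 == soname && $2 == "=>" { print $3 }'
}

# Hypothetical helper: guess the provider from the resolved path.
# A path like /usr/lib64/lapack/<provider>/liblapack.so.3 yields <provider>;
# anything else is reported as "default" (the location where the
# reference library was found in this report).
provider_of() {
    case "$1" in
        */lapack/*/liblapack.so*)
            p=${1#*/lapack/}
            printf '%s\n' "${p%%/*}"
            ;;
        *)
            printf 'default\n'
            ;;
    esac
}

# Only run against the real library if it exists on this system.
lib=/usr/lib64/libtorch_cpu.so
if [ -e "$lib" ]; then
    provider_of "$(ldd "$lib" | resolved_path liblapack.so.3)"
fi
```

With the behaviour described in this bug, the script would report `default` for libtorch_cpu.so even after switching to openblas.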

As a result, on Zen 4, the resulting binary is 253 times slower at matrix multiplication than the build from PyPI, and 321 times slower than it could be after extra customizations (I will attach benchmarks to the PR).

To solve the issue, sci-libs/caffe2 should allow specifying the BLAS/LAPACK provider via https://wiki.gentoo.org/wiki/Blas-lapack-switch, or building with MKL.

Comment 1 Larry the Git Cow 2023-12-31 16:23:56 UTC

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=53a5bc45ce81fd5e1cc56536f408d1eea1c7f537

commit 53a5bc45ce81fd5e1cc56536f408d1eea1c7f537
Author:     Sv. Lockal <lockalsash@gmail.com>
AuthorDate: 2023-12-31 15:07:04 +0000
Commit:     Alfredo Tupone <tupone@gentoo.org>
CommitDate: 2023-12-31 16:23:21 +0000

    sci-libs/caffe2: add support of blas/lapack providers, including mkl
    
    Closes: https://bugs.gentoo.org/921129
    Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
    Closes: https://github.com/gentoo/gentoo/pull/34584
    Signed-off-by: Alfredo Tupone <tupone@gentoo.org>

 profiles/features/musl/package.use.mask               |  3 +++
 .../{caffe2-2.1.2.ebuild => caffe2-2.1.2-r1.ebuild}   | 19 +++++++++++++------
 .../caffe2/files/caffe2-2.1.2-fix-openmp-link.patch   | 15 +++++++++++++++
 sci-libs/caffe2/files/caffe2-2.1.2-fix-rpath.patch    | 12 ++++++++++++
 sci-libs/caffe2/metadata.xml                          |  1 +
 5 files changed, 44 insertions(+), 6 deletions(-)