Bug 764206 - >=sci-libs/openblas-0.3.9-r1 introduced instability: Error DSYTRF in openblas
Summary: >=sci-libs/openblas-0.3.9-r1 introduced instability: Error DSYTRF in openblas
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Science Related Packages
URL: https://github.com/xianyi/OpenBLAS/is...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-06 22:55 UTC by Carlo Nervi
Modified: 2021-01-10 18:58 UTC (History)
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Attachments
oblas.patch (oblas.patch,582 bytes, patch)
2021-01-07 13:31 UTC, Aisha Tammy

Description Carlo Nervi 2021-01-06 22:55:51 UTC
The whole story is described here: https://github.com/xianyi/OpenBLAS/issues/3054

In summary, I tested openblas with Quantum ESPRESSO (QE) versions 6.4.1, 6.5 and 6.6. QE is a free program for solid-state chemistry calculations.
After 'emerge =openblas-0.3.9-r1', QE works correctly.
The 0.3.13.dev version (as of 06/01/2021) is also OK.
Switching between openblas dynamic libraries with LD_LIBRARY_PATH works as well.

After updating to the latest stable openblas-0.3.12-r1, QE stops after some time with the crash message: Error DSYTRF in openblas
The crash then appears with every openblas version loaded through LD_LIBRARY_PATH, including 0.3.9-r1.

Re-emerging 0.3.9-r1 solves the problem, and 0.3.13.dev works properly again as well.
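
The LD_LIBRARY_PATH switching mentioned above can be sketched as a small shell helper. This is only an illustration: the function name `with_openblas`, the library directory and the QE input file name are assumptions and do not come from the report.

```shell
# Hypothetical helper: run one command with an alternative OpenBLAS
# directory searched first by the dynamic linker.
# Usage: with_openblas <libdir> <command> [args...]
with_openblas() {
    libdir=$1
    shift
    # Prepend libdir; keep any existing LD_LIBRARY_PATH entries after it.
    # The assignment applies only to the environment of the command run.
    LD_LIBRARY_PATH="${libdir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" "$@"
}

# Example with made-up paths: run QE's pw.x against a locally built OpenBLAS.
#   with_openblas "$HOME/OpenBLAS-develop/lib" pw.x -i scf.in
#
# Check which BLAS library the dynamic linker would actually resolve:
#   with_openblas "$HOME/OpenBLAS-develop/lib" ldd "$(command -v pw.x)"
```

Scoping the variable to a single command keeps the rest of the shell session on the system-installed library, which makes it easier to compare versions side by side.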
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-01-07 11:45:08 UTC
If .13 works, we can stable that?
Comment 2 Carlo Nervi 2021-01-07 11:58:16 UTC
(In reply to Sam James from comment #1)
> If .13 works, we can stable that?
Hi, I did not perform an exhaustive test on 0.3.13.dev. This version is under development, so I don't think it can be stabilized. It passed all the tests, as 0.3.12-r1 did.

My point is that all the openblas versions I mentioned apparently work correctly, but the command:
emerge =openblas-0.3.12-r1
introduces instability, whereas the command:
emerge =openblas-0.3.9-r1
restores full functionality, at least in QE.
Since the openblas sources are apparently not the cause (see the link in the first message), my understanding, with my limited knowledge, is that the problem probably lies in the ebuild configuration. However, I do not have the skills to investigate how the ebuilds drive the compilation (although I can provide a working config for openblas).
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-01-07 12:10:48 UTC
(In reply to Carlo Nervi from comment #2)
> (In reply to Sam James from comment #1)
> > If .13 works, we can stable that?
> Hi, I did not perform an exhaustive test on 0.3.13.dev. This version is
> under development, so I don't think it can be stabilized. It passed all the
> tests, as the 0.3.12-r1 did.

Are you sure? It's in Gentoo and it was released before: https://github.com/xianyi/OpenBLAS/releases/tag/v0.3.13.

So, does the 0.3.13 ebuild work?

> 
> My point is that all the openblas versions I mentioned are working
> apparently in a correct way, but the command:
> emerge =openblas-0.3.12-r1
> introduces instability, whereas the command:
> emerge =openblas-0.3.9-r1 restores the whole functionality, at least in QE
> program.

Okay, this is helpful. Is it possible for you to try 0.3.10 too? (again, the ebuild)

> Since openblas sources are apparently not the cause (see the link in the
> first message), to my limited knowledge and to my understanding, the
> problems are probably within the ebuild configuration. But I do not have
> enough skill to investigate the ebuilds schema of compiling (although I may
> give a working config for openblas).

They hopefully aren't _too_ complicated but obviously we'll work with you on figuring this out: https://gitweb.gentoo.org/repo/gentoo.git/tree/sci-libs/openblas/openblas-0.3.13.ebuild
Comment 4 Carlo Nervi 2021-01-07 13:02:00 UTC
Okay, I did the test.
After 'emerge =openblas-0.3.10', everything works, including 0.3.13.dev (loaded by setting LD_LIBRARY_PATH).

After 'emerge =openblas-0.3.13', neither .13 nor .13.dev works anymore (DSYTRF error at the same point in the calculation).

After 'emerge =openblas-0.3.10' again, .13.dev, .13 and .10 are all fully working again!
I have no idea what could be causing this...
Comment 5 Aisha Tammy 2021-01-07 13:31:05 UTC
I'm very busy for the next few weeks and can't devote time to Gentoo right now.

A small hypothesis: try the attached patch. If it doesn't work, I'll come back to this in a few weeks.


diff --git a/sci-libs/openblas/openblas-0.3.13.ebuild b/sci-libs/openblas/openblas-0.3.13.ebuild
index 1c5dedff184..e7d0e7e93c3 100644
--- a/sci-libs/openblas/openblas-0.3.13.ebuild
+++ b/sci-libs/openblas/openblas-0.3.13.ebuild
@@ -86,9 +86,6 @@ pkg_setup() {
 		       NO_AFFINITY=1 \
 		       TARGET=GENERIC
 
-	export NUM_PARALLEL=${OPENBLAS_NPARALLEL:-8} \
-	       NUM_THREADS=${OPENBLAS_NTHREAD:-64}
-
 	# setting OPENBLAS_TARGET to override auto detection
 	# in case the toolchain is not enough to detect
 	# https://github.com/xianyi/OpenBLAS/blob/develop/TargetList.txt
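
The two lines removed by the patch rely on the shell's `${VAR:-default}` expansion: the build variable takes the user's value when it is set and non-empty, and the hard-coded default otherwise. A minimal sketch of that behaviour follows. The function name `resolve_build_limits` is made up for illustration; the variable names and the defaults 8 and 64 are the ones shown in the diff.

```shell
# Mirrors the removed lines: fall back to 8 and 64 unless the user
# provides OPENBLAS_NPARALLEL / OPENBLAS_NTHREAD.
resolve_build_limits() {
    NUM_PARALLEL=${OPENBLAS_NPARALLEL:-8}
    NUM_THREADS=${OPENBLAS_NTHREAD:-64}
    printf 'NUM_PARALLEL=%s NUM_THREADS=%s\n' "$NUM_PARALLEL" "$NUM_THREADS"
}

# With neither variable set, the defaults apply:
#   resolve_build_limits
# With an override for one of them:
#   OPENBLAS_NPARALLEL=2 resolve_build_limits
```

With the patch applied, neither variable is exported by the ebuild, so OpenBLAS's own build system would choose these limits itself.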
Comment 6 Aisha Tammy 2021-01-07 13:31:21 UTC
Created attachment 681634
oblas.patch
Comment 7 Larry the Git Cow gentoo-dev 2021-01-09 02:47:22 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c0b8101f6e9d3d61702cf0011c092e090e30aa78

commit c0b8101f6e9d3d61702cf0011c092e090e30aa78
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-01-09 02:44:44 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-01-09 02:46:04 +0000

    sci-libs/openblas: don't set relapack by default
    
    This is experimental upstream and shouldn't be
    enabled by default.
    
    URL: https://github.com/xianyi/OpenBLAS/issues/3054
    Bug: https://bugs.gentoo.org/764206
    Package-Manager: Portage-3.0.12, Repoman-3.0.2
    Signed-off-by: Sam James <sam@gentoo.org>

 profiles/base/package.use.stable.mask       | 5 +++++
 sci-libs/openblas/openblas-0.3.12-r1.ebuild | 2 +-
 sci-libs/openblas/openblas-0.3.13.ebuild    | 2 +-
 3 files changed, 7 insertions(+), 2 deletions(-)
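
On a system still using an ebuild revision from before this commit, a similar effect could presumably be obtained locally by disabling the flag in package.use. The sketch below assumes the USE flag is named `relapack` (inferred from the commit title, not confirmed in this thread). The helper name, the file name `openblas-relapack` and the choice to pass the directory as an argument are illustrative only.

```shell
# Hypothetical helper: append a "<atom> -<flag>" line to a file inside a
# package.use-style directory. On a real system the directory would be
# /etc/portage/package.use (when it is a directory), and root is required.
disable_use_flag() {
    use_dir=$1
    atom=$2
    flag=$3
    mkdir -p "$use_dir" || return 1
    printf '%s -%s\n' "$atom" "$flag" >> "$use_dir/openblas-relapack"
}

# Example (as root), followed by a rebuild of the package:
#   disable_use_flag /etc/portage/package.use sci-libs/openblas relapack
#   emerge --oneshot sci-libs/openblas
```

Taking the directory as a parameter lets the helper be tried against a scratch directory before touching /etc/portage.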
Comment 8 Carlo Nervi 2021-01-10 18:58:59 UTC
Hey guys, thanks a lot.
The bug is solved!