Bug 817680 - =sci-libs/sundials-5.8.0[int64,superlumt]: test_sunlinsol_superlumt fails with "COLAMD failed at line 58 in file get_perm_c.c"
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Science Related Packages
URL:
Whiteboard:
Keywords: PullRequest, TESTFAILURE
Depends on:
Blocks:
 
Reported: 2021-10-11 02:47 UTC by Alex Fan
Modified: 2021-11-28 20:52 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge.info,5.60 KB, text/plain)
2021-10-11 02:48 UTC, Alex Fan
test output (LastTest.log,320.45 KB, text/plain)
2021-10-11 02:48 UTC, Alex Fan
build log (sundials-5.8.0:20211010-133432.log.gz,89.20 KB, application/gzip)
2021-10-11 02:50 UTC, Alex Fan

Description Alex Fan archtester 2021-10-11 02:47:32 UTC
tested on sifive unmatched

The following tests FAILED:
	13 - test_nvector_mpi_4_1000_0 (Failed)
	14 - test_fnvector_parallel_mod_4 (Failed)
	15 - test_nvector_mpimanyvector_parallel1_4_1000_200_0 (Failed)
	16 - test_nvector_mpimanyvector_parallel2_4_200_1000_0 (Failed)
	18 - test_fnvector_mpimanyvector_mod_4 (Failed)
	19 - test_nvector_mpiplusx_4_1000_9 (Failed)
	21 - test_fnvector_mpiplusx_mod_4 (Failed)
	26 - test_nvector_parhyp_4_1000_0 (Failed)
	83 - test_sunlinsol_spgmr_parallel_100_1_1_50_1e-3_0 (Failed)
	84 - test_sunlinsol_spgmr_parallel_100_1_2_50_1e-3_0 (Failed)
	85 - test_sunlinsol_spgmr_parallel_100_2_1_50_1e-3_0 (Failed)
	86 - test_sunlinsol_spgmr_parallel_100_2_2_50_1e-3_0 (Failed)
	87 - test_sunlinsol_spfgmr_parallel_100_1_50_1e-3_0 (Failed)
	88 - test_sunlinsol_spfgmr_parallel_100_2_50_1e-3_0 (Failed)
	89 - test_sunlinsol_spbcgs_parallel_100_1_50_1e-3_0 (Failed)
	90 - test_sunlinsol_spbcgs_parallel_100_2_50_1e-3_0 (Failed)
	91 - test_sunlinsol_sptfqmr_parallel_100_1_50_1e-3_0 (Failed)
	92 - test_sunlinsol_sptfqmr_parallel_100_2_50_1e-3_0 (Failed)
	106 - test_sunlinsol_superlumt_300_0_1_0 (Failed)
	107 - test_sunlinsol_superlumt_300_1_1_0 (Failed)


Reproducible: Always
Comment 1 Alex Fan archtester 2021-10-11 02:48:21 UTC
Created attachment 744411
emerge --info
Comment 2 Alex Fan archtester 2021-10-11 02:48:47 UTC
Created attachment 744414
test output
Comment 3 Alex Fan archtester 2021-10-11 02:50:31 UTC
Created attachment 744417
build log

Compressed due to the size limit; please use zcat to view it.
Comment 4 Alex Fan archtester 2021-10-11 02:56:05 UTC
This upstream issue may be relevant: https://github.com/LLNL/sundials/issues/26
Comment 5 Alex Fan archtester 2021-10-11 11:13:16 UTC
It seems most of the failures occur simply because my testing machine is overloaded and there are not enough cores/slots for openmpi, which gave an explicit warning:
 "There are not enough slots available in the system to satisfy the 4 slots that were requested by the application:"

Rerunning the tests manually with --oversubscribe resolved most of the failures.
Comment 6 Alex Fan archtester 2021-10-11 11:16:36 UTC
The only test failure left now is test_sunlinsol_superlumt. The error message is raised from file SRC/get_perm_c.c in sci-libs/superlu.
Comment 7 Alex Fan archtester 2021-10-12 05:48:53 UTC
Looks like the problem with test_sunlinsol_superlumt is integer sizes. Sundials always uses an 8-byte integer as its index type,
> if(HAS_${INT64_TYPE_NOSPACE} EQUAL "8") 

but superlu_mt is compiled with int_t defaulting to int, which is 4 bytes on riscv64 lp64. Setting -D_LONGINT will solve the problem:

> #ifdef _LONGINT
> typedef long long int int_t;
> #define IFMT "%lld"
> #else
> typedef int int_t; /* default */
> #define IFMT "%8d"
> #endif
Comment 8 Marek Szuba (RETIRED) archtester gentoo-dev 2021-10-12 20:21:41 UTC
(In reply to Alex Fan from comment #7)

> Sundials always uses an 8-byte integer as its index type,

Not quite - the size of indices is controlled by the CMake option SUNDIALS_INDEX_SIZE, which defaults to 64 in our case but could be set to 32.

> but superlu_mt is compiled with int_t defaulting to int, which is 4 bytes on
> riscv64 lp64. Setting -D_LONGINT will solve the problem

...or emerge sci-libs/superlu_mt with USE=int64.

I've got an idea on how to solve this but will probably not test it until tomorrow; my systems are fully loaded at present.
Comment 9 Marek Szuba (RETIRED) archtester gentoo-dev 2021-10-12 21:23:10 UTC
Reproduced the error on amd64.
Comment 10 Marek Szuba (RETIRED) archtester gentoo-dev 2021-10-12 22:27:47 UTC
Posting this manually because something seems to be wrong with the post-commit hook on g.g.o:


commit dd47d1b3ec10aa6779430d1c1142e64d462ee707
Author: Marek Szuba
Date:   Tue Oct 12 21:55:29 2021 +0000

    sci-libs/sundials: implement USE=int64
    
    What it already does:
     * allows changing the index size from 64 (default) to 32 bits
     * if USE=superlumt is set makes sure the state of USE=int64 in
       sci-libs/superlu_mt is the same as here, so that the assumption
       about index-type compatibility made in sunlinsol_superlumt.h is
       correct
     * allows test_sunlinsol_superlumt_300_0_1_0 and
        test_sunlinsol_superlumt_300_1_1_0 to pass *if USE=-int64*
    
    What still needs work:
     * getting the two superlumt tests to pass for USE=int64 - they still
       fail with "COLAMD failed at line 58 in file get_perm_c.c"
    
    Bug: https://bugs.gentoo.org/817680
    Signed-off-by: Marek Szuba <marecki@gentoo.org>
Comment 11 Larry the Git Cow gentoo-dev 2021-11-28 20:52:37 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b25061d0cf0d41213c90ebab0845ed3f50403eef

commit b25061d0cf0d41213c90ebab0845ed3f50403eef
Author:     Marek Szuba <marecki@gentoo.org>
AuthorDate: 2021-11-28 20:48:56 +0000
Commit:     Marek Szuba <marecki@gentoo.org>
CommitDate: 2021-11-28 20:48:56 +0000

    sci-libs/superlu_mt: apply Alex's PREDEFS patch
    
    New revision because it changes runtime behaviour for USE=int64 users.
    
    Closes: https://bugs.gentoo.org/817680
    Closes: https://github.com/gentoo/gentoo/pull/23063
    Signed-off-by: Marek Szuba <marecki@gentoo.org>

 sci-libs/superlu_mt/superlu_mt-3.1-r1.ebuild | 105 +++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)