Bug 614464 - sci-libs/scipy can only be built with MAKEOPTS=-j1
Summary: sci-libs/scipy can only be built with MAKEOPTS=-j1
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Science Related Packages
URL: https://github.com/scipy/scipy/issues...
Whiteboard:
Keywords:
Duplicates: 631298 634858 635872 639778 646328 658864 676640 701642 704226
Depends on:
Blocks:
 
Reported: 2017-04-01 23:02 UTC by Andrés Becerra Sandoval
Modified: 2020-08-18 11:39 UTC (History)
CC List: 16 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info output (info.txt,5.46 KB, text/plain)
2017-04-01 23:03 UTC, Andrés Becerra Sandoval
full build log (scipy-0.18.1:20170520-092201.log.bz2,87.57 KB, application/x-bzip2)
2017-05-20 10:24 UTC, François Bissey

Description Andrés Becerra Sandoval 2017-04-01 23:02:21 UTC
Compilation aborts at:


usr/lib64/python3.6/site-packages/numpy/distutils/system_info.py:1510: UserWarning: 
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.
  warnings.warn(AtlasNotFoundError.__doc__)
Running from scipy source directory.
/usr/lib64/python3.6/site-packages/numpy/distutils/system_info.py:620: UserWarning: Specified path /usr/lib64/python3.6/site-packages/numpy/__init__.py/include/python3.6m is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
/usr/lib64/python3.6/site-packages/numpy/distutils/system_info.py:1608: UserWarning: 
    Atlas (http://math-atlas.sourceforge.net/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [atlas]) or by setting
    the ATLAS environment variable.
  warnings.warn(AtlasNotFoundError.__doc__)
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
"object of type 'type' has no len()" in evaluating 'len(list)' (available names: [])
error: Command "/usr/bin/gfortran -Wall -g -Wl,-O1 -shared /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/src.linux-x86_64-3.6/scipy/fftpack/_fftpackmodule.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/scipy/fftpack/src/zfft.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/scipy/fftpack/src/drfft.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/scipy/fftpack/src/zrfft.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/scipy/fftpack/src/zfftnd.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/src.linux-x86_64-3.6/scipy/fftpack/src/dct.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/src.linux-x86_64-3.6/scipy/fftpack/src/dst.o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6/var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/src.linux-x86_64-3.6/fortranobject.o -L/usr/lib64 -L/var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/temp.linux-x86_64-3.6 -ldfftpack -lfftpack -lpython3.6m -lgfortran -o /var/tmp/portage/sci-libs/scipy-0.18.1/work/scipy-0.18.1-python3_6/build/lib/scipy/fftpack/_fftpack.cpython-36m-x86_64-linux-gnu.so" failed with exit status 1
Comment 1 Andrés Becerra Sandoval 2017-04-01 23:03:55 UTC
Created attachment 468914
emerge --info output
Comment 2 Andrés Becerra Sandoval 2017-04-06 15:55:50 UTC
This seems to be caused by not having python-3.5; on another box with python-3.5 and python-3.6 installed, scipy merges correctly.

Maybe I should close the report!
Comment 3 François Bissey 2017-05-20 10:05:33 UTC
Got hit. I don't think having to install python 3.5 for scipy to install is an acceptable solution. It is certainly not sustainable when you consider that one day 3.5 will be removed from the tree. We had better have a proper solution by then for 3.6 and potentially later versions of python.
Comment 4 Justin Lecher (RETIRED) gentoo-dev 2017-05-20 10:19:49 UTC
The message about atlas is standard and has nothing to do with the build failure. I see it myself using atlas on all supported python ABIs and it works fine.

Could you please add a full build.log?
Comment 5 Justin Lecher (RETIRED) gentoo-dev 2017-05-20 10:20:57 UTC
Your emerge --info output doesn't show that you have the science overlay installed. How do you use atlas?
Comment 6 François Bissey 2017-05-20 10:24:51 UTC
Created attachment 473460
full build log
Comment 7 François Bissey 2017-05-20 10:28:53 UTC
I have attached a full log. I use the science overlay; the use of openblas should be a giveaway. I have just enabled python 3.6 and was proceeding to rebuild everything python related.
Comment 8 Justin Lecher (RETIRED) gentoo-dev 2017-05-20 10:29:56 UTC
I cannot see the actual error that gfortran throws. It just errors. Any chance you can run the command by hand and see more?
Comment 9 François Bissey 2017-05-20 10:32:36 UTC
(In reply to Justin Lecher from comment #8)
> I cannot see the actual error which gfortran throws. it just errors. Any
> chance you can run the command by hand and see more?

I actually thought of that, and the command did not fail. I have restarted the build to see whether a file is actually produced when the command is run from inside portage. Waiting for the result.
Comment 10 François Bissey 2017-05-20 10:51:29 UTC
OK, so just after the ebuild failure, if I inspect the folder where the file should have been produced, it is not present. But if I execute the command manually, it doesn't fail and the file is produced in the right place.

Is there some parallelism in the build with python 3.6? I seem to remember something about scipy introducing some parallel building. In that case we may have a race condition.
Comment 11 Justin Lecher (RETIRED) gentoo-dev 2017-05-20 10:52:35 UTC
Could you try some serial build?
Comment 12 François Bissey 2017-05-20 10:55:06 UTC
Started, may take some time... I will let you know the result ASAP.
Comment 13 François Bissey 2017-05-20 11:12:20 UTC
Success with `MAKEOPTS=-j1`! We do have a race condition.
Comment 14 Hendrik v. Raven 2017-05-29 14:33:30 UTC
I can reproduce the full error. I failed to rebuild scipy-0.18.1 multiple times with only python2_7 and python3_6. Previous builds where python3_5 was enabled as well passed. After a (lengthy) serial build I can confirm that it does not occur with MAKEOPTS="-j1". The environment is almost identical to Andrés's: openblas from the science overlay as the blas/cblas provider, in combination with the reference lapack.
Comment 15 François Bissey 2017-06-28 21:58:28 UTC
Still present in 0.19.1.
Comment 16 Benda Xu gentoo-dev 2017-09-22 12:42:00 UTC
*** Bug 631298 has been marked as a duplicate of this bug. ***
Comment 17 Benda Xu gentoo-dev 2017-09-22 12:52:43 UTC
What is the version of your dev-python/numpy?
Comment 18 Hendrik v. Raven 2017-09-22 13:39:33 UTC
I can trigger it using numpy-1.13.1, trying to build scipy-0.19.1 (vs. openblas-0.2.19, if that's interesting).
Comment 19 François Bissey 2017-09-22 20:41:12 UTC
The problem is independent of blas. python 3.6 (and maybe 3.5) is able to do parallel builds, and an object is not compiled in time. I don't know how python3 manages target dependencies, but obviously something is broken upstream here.
Comment 20 younky.yang 2017-11-12 09:09:34 UTC
Even with MAKEOPTS=-j1, the error is still the same. The only difference from the bug report here may be that I use gcc 7.2.
Comment 21 François Bissey 2017-11-12 09:12:37 UTC
OK, that's weird because the cause of the problem is definitely an object not being compiled yet that is being used in a linking command.

Could you post the full python3 build log?
Comment 22 Benda Xu gentoo-dev 2017-11-20 01:02:30 UTC
Anyone forwarded it upstream?
Comment 23 Benda Xu gentoo-dev 2017-11-20 01:02:48 UTC
*** Bug 634858 has been marked as a duplicate of this bug. ***
Comment 24 Benda Xu gentoo-dev 2017-11-20 01:07:39 UTC
Spack has hard-coded it to be a serial build https://github.com/spack/spack/pull/3275/commits/607520e938aa5268dfbdbb52b95a93630af333cc
Comment 25 François Bissey 2017-11-20 01:17:27 UTC
I added a comment upstream. I may have a deeper look but no promises.
Comment 26 François Bissey 2017-11-20 02:03:01 UTC
Just repeating some observations I made upstream. numpy's distutils can compile C/C++ code in parallel but not fortran code. This is rather explicit in their distutils.
All three instances (the one reported by spack and at least two of the reports here) occur when gfortran is used for linking.

This particular report is about a case where the source is mixed fortran/C, so you could expect trouble. The other two are pure fortran.

In all cases it appears that some parallel processing is done in a context where it shouldn't be, and the problem is more likely to be in numpy or in some subtle way the sources are prepared.
Comment 27 Benda Xu gentoo-dev 2017-11-20 02:18:44 UTC
(In reply to François Bissey from comment #26)
> 
> In all case it appears some parallel processing is done in a context it
> shouldn't and the problem is more likely to be in numpy or some subtle way
> the sources are prepared.

The problem is very likely to be in numpy.  Although I forgot about the details, my package.env says:

  # Fatal Python error: Couldn't create autoTLSkey mapping
  dev-python/numpy one-make.conf

  one-make.conf contains MAKEOPTS="-j1"

It has been re-added by me at least 3 times in the past 5+ years.
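
The package.env arrangement quoted in this comment can be sketched as follows. This is only a minimal sketch, assuming the usual Portage layout in which files under /etc/portage/env/ hold per-package settings and /etc/portage/package.env is a plain file mapping package atoms to them. The file name one-make.conf is the one quoted above; the CONFIG_DIR variable and the scratch directory are assumptions added so the script can be tried without touching a live system, and the sci-libs/scipy line is added here because it is the package this bug is about.

```shell
#!/usr/bin/env bash

# Assumption: write into a scratch directory rather than the real /etc/portage.
# Export CONFIG_DIR=/etc/portage (as root) to apply the settings for real.
CONFIG_DIR="${CONFIG_DIR:-$(mktemp -d)}"

mkdir -p "${CONFIG_DIR}/env"

# Per-package environment file that forces a serial build.
printf 'MAKEOPTS="-j1"\n' > "${CONFIG_DIR}/env/one-make.conf"

# Map the affected packages to that environment file.
cat >> "${CONFIG_DIR}/package.env" <<'EOF'
dev-python/numpy one-make.conf
sci-libs/scipy one-make.conf
EOF

echo "wrote ${CONFIG_DIR}/env/one-make.conf and ${CONFIG_DIR}/package.env"
```

If package.env is a directory on the target system rather than a single file, the two mapping lines would go into a file inside that directory instead.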
Comment 28 François Bissey 2017-11-20 03:17:35 UTC
Just re-reading numpy distutils. I believe they must parallelize fortran and c source because of the way fortran is implemented. Only explicit f90 sources are serialized.
Comment 29 François Bissey 2017-11-20 09:12:37 UTC
I was hoping that something like
--- /usr/lib64/python2.7/site-packages/numpy/distutils/ccompiler.py	2017-09-29 13:31:46.000000000 +1300
+++ /usr/lib64/python3.6/site-packages/numpy/distutils/ccompiler.py	2017-11-20 20:02:36.411464265 +1300
@@ -337,6 +337,7 @@
         pool = multiprocessing.pool.ThreadPool(jobs)
         pool.map(single_compile, build_items)
         pool.close()
+        pool.join()
     else:
         # build serial
         for o in build_items:

would be sufficient, but it appears not. Strangely enough, it moves the error to another file I hadn't seen before
error: Command "/usr/bin/gfortran -Wall -g -Wl,-O1 -Wl,--as-needed -shared /dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/src.linux-x86_64-3.6/scipy/integrate/_test_odeint_bandedmodule.o /dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/src.linux-x86_64-3.6/scipy/integrate/fortranobject.o /dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/scipy/integrate/tests/banded5x5.o /dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/src.linux-x86_64-3.6/scipy/integrate/_test_odeint_banded-f2pywrappers.o -L/usr/lib64 -L/usr/lib64 -L/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6 -lodepack -lmach -lopenblas_threads -lreflapack -lopenblas_threads -lpython3.6m -lgfortran -o /dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/lib/scipy/integrate/_test_odeint_banded.cpython-36m-x86_64-linux-gnu.so" failed with exit status 1

One of the problems here is that we are dealing with an undocumented python feature
http://lucasb.eyer.be/snips/python-thread-pool.html and there may be a bug in its behavior, or possibly some functionality missing for .join() to work.
Comment 30 François Bissey 2017-11-21 22:42:16 UTC
The point of failure seems random. I can repeat the build and have it break at a different file. I have patched numpy's distutils to enforce serialization and I can still get a failed build.
In all cases the actual error, when you search for it higher up than the last message about the failed compilation, is of the kind (file will vary):
/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/scipy/special/cdf_wrappers.o: file not recognized: File truncated
collect2: error: ld returned 1 exit status
/dev/shm/portage/sci-libs/scipy-0.19.1/work/scipy-0.19.1-python3_6/build/temp.linux-x86_64-3.6/scipy/special/cdf_wrappers.o: file not recognized: File truncated
collect2: error: ld returned 1 exit status

Which is indicative of an unfinished compilation. The fact that you still get a failure when you disable the threading of the build from numpy's distutils means that it is probably the wrong place to search.
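
The "File truncated" symptom is what one would expect when a consumer reads an object file while its producer is still writing it. The following is a minimal, self-contained sketch of that kind of race in plain bash. It is not the numpy or distutils code path; slow_compile and link_step are hypothetical stand-ins for a compiler and a linker, and the one-second pause is an assumption chosen to make the window easy to hit.

```shell
#!/usr/bin/env bash

workdir="$(mktemp -d)"

# Hypothetical stand-in for a compiler: writes its output in two steps
# with a pause in between, so a half-written file exists for a while.
slow_compile() {
    local out="$1"
    printf 'HEADER\n' > "${out}"
    sleep 1
    printf 'BODY\n' >> "${out}"
}

# Hypothetical stand-in for a linker: accepts the object only if complete.
link_step() {
    local obj="$1"
    if [[ -f "${obj}" ]] && grep -q '^BODY$' "${obj}"; then
        echo "linked"
    else
        echo "file truncated"
    fi
}

# Racy ordering: start the compile in the background and link at once.
slow_compile "${workdir}/racy.o" &
racy_result="$(link_step "${workdir}/racy.o")"
wait

# Correct ordering: wait for the producer before consuming its output.
slow_compile "${workdir}/safe.o" &
wait
safe_result="$(link_step "${workdir}/safe.o")"

echo "racy: ${racy_result}"
echo "safe: ${safe_result}"
```

The only difference between the two halves is where `wait` sits relative to the link step, which is the kind of ordering a build driver has to guarantee before it hands objects to the linker.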
Comment 31 François Bissey 2017-11-22 21:58:20 UTC
I think what we are hitting at is that https://github.com/numpy/numpy/issues/7139 is not really fixed.
Comment 32 François Bissey 2017-11-22 23:00:35 UTC
Correction: the fact that you still get problems when you disable threading in numpy means that the issue is in python-3.5+ distutils, which enables parallel building. I don't think it is a numpy problem anymore; this is a python problem.

Interestingly, passing "-j1 --parallel $(makeopts_jobs)", that is, disabling parallelism in python's distutils while enabling numpy's, still leads to failures. So neither implementation seems good enough to deal with scipy.
Comment 33 François Bissey 2017-11-23 00:50:30 UTC
And I think all I have done needs re-doing. I have been testing with python3.6, but the ebuild doesn't know how to deal with python3.6, only 3.5.

	distutils-r1_python_compile \
		$(usex python_targets_python3_5 "" "-j $(makeopts_jobs)") \
		${SCIPY_FCONFIG}

So parallelism must have taken default values from the number of cores.
Comment 34 François Bissey 2017-11-23 01:23:16 UTC
Unrelated but this is a QA problem. The logic enabling parallel building is not sound.

If python3.5 is enabled in PYTHON_TARGETS, the builds of all the python versions will receive the "-j $(makeopts_jobs)", including python2.7 and 3.4 if they are enabled. On the other hand if python3.5 is disabled in PYTHON_TARGETS no builds should receive the option, including python3.6.

This may explain why there are reports of failed builds with MAKEOPTS=-j1: those builds may join the random pool, and I have just been incredibly lucky.

Things are still random when adjusted for python 3.6, so my conclusion on brokenness still seems to stand.
Comment 35 François Bissey 2017-11-23 04:29:17 UTC
This would be much better for selecting parallel building
--- /usr/portage/sci-libs/scipy/scipy-0.19.1.ebuild	2017-07-05 05:43:02.000000000 +1200
+++ scipy-0.19.1.ebuild	2017-11-23 17:24:52.840105829 +1300
@@ -102,9 +102,16 @@
 }
 
 python_compile() {
+	is_python_35_plus(){
+		[[ ${EPYTHON} == python3.[^0-4] ]]
+	}
+	local parallel_build=""
+	if is_python_35_plus ; then
+		parallel_build="-j $(makeopts_jobs)"
+	fi
 	${EPYTHON} tools/cythonize.py || die
 	distutils-r1_python_compile \
-		$(usex python_targets_python3_5 "" "-j $(makeopts_jobs)") \
+		${parallel_build} \
 		${SCIPY_FCONFIG}
 }
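
To see which interpreters the pattern in is_python_35_plus selects, the test can be tried outside Portage. This is a small sketch, assuming EPYTHON takes values such as python2.7 or python3.6 as in this thread; the loop, the sample values and the `matched` variable are illustrative only. One thing it makes visible is that the bracket expression matches exactly one character, so a two-digit minor version such as python3.10 would not be selected by this pattern.

```shell
#!/usr/bin/env bash

# Same test as in the patch above, but taking the interpreter name as an
# argument instead of reading the EPYTHON variable set by the eclass.
is_python_35_plus() {
    [[ "$1" == python3.[^0-4] ]]
}

matched=""
for impl in python2.7 python3.4 python3.5 python3.6 python3.10; do
    if is_python_35_plus "${impl}"; then
        matched+="${impl} "
        echo "${impl}: parallel flag would be added"
    else
        echo "${impl}: no parallel flag"
    fi
done
```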
Comment 36 François Bissey 2017-11-23 23:53:41 UTC
I don't think we can fix this. This is a problem from python3.5 onward. numpy bravely tried to fix their own distutils to behave with python3.5+'s, but I don't think they succeeded.

To add insult to injury, numpy's distutils parallelism is supposed to work with any python, including python2.7 and 3.4, and it does for those versions. This means passing "--parallel $(makeopts_jobs)" works for those versions of python, but for 3.5 and 3.6 the best thing at this stage is to pass nothing.
Comment 37 matoro archtester 2017-11-30 00:32:17 UTC
I just ran into this problem myself.  Will hardcoding MAKEOPTS=-j1 disable both python and numpy parallelization?  And if so, will doing so guarantee a successful build?  IMO if so it should either just be hardcoded in the ebuild or added as a gentoo patch.  Going to try with a user patch now.
Comment 38 François Bissey 2017-11-30 00:37:10 UTC
Just removing the bit "-j $(makeopts_jobs)" altogether is the better solution. Although I'll admit to using the following in a private overlay
python_compile() {
	is_python_34_under(){
		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
	}
	local parallel_build=""
	if is_python_34_under ; then
		parallel_build="--parallel $(makeopts_jobs)"
	fi
	${EPYTHON} tools/cythonize.py || die
	distutils-r1_python_compile \
		${parallel_build} \
		${SCIPY_FCONFIG}
}
Comment 39 Benda Xu gentoo-dev 2017-11-30 02:51:08 UTC
*** Bug 635872 has been marked as a duplicate of this bug. ***
Comment 40 matoro archtester 2017-11-30 04:34:11 UTC
(In reply to François Bissey from comment #38)
> Just removing the bit "-j $(makeopts_jobs)" altogether is the better
> solution. Although I'll admit to use the following in a private overlay
> python_compile() {
> 	is_python_34_under(){
> 		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
> 	}
> 	local parallel_build=""
> 	if is_python_34_under ; then
> 		parallel_build="--parallel $(makeopts_jobs)"
> 	fi
> 	${EPYTHON} tools/cythonize.py || die
> 	distutils-r1_python_compile \
> 		${parallel_build} \
> 		${SCIPY_FCONFIG}
> }

Is it possible to add this override via /etc/portage/package.env without going through a local overlay?  I have tried a few things and portage does not seem to be picking up the override for the function.
Comment 41 François Bissey 2017-11-30 04:47:34 UTC
(In reply to matoro from comment #40)
> (In reply to François Bissey from comment #38)
> > Just removing the bit "-j $(makeopts_jobs)" altogether is the better
> > solution. Although I'll admit to use the following in a private overlay
> > python_compile() {
> > 	is_python_34_under(){
> > 		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
> > 	}
> > 	local parallel_build=""
> > 	if is_python_34_under ; then
> > 		parallel_build="--parallel $(makeopts_jobs)"
> > 	fi
> > 	${EPYTHON} tools/cythonize.py || die
> > 	distutils-r1_python_compile \
> > 		${parallel_build} \
> > 		${SCIPY_FCONFIG}
> > }
> 
> Is it possible to add this override via /etc/portage/package.env without
> going through a local overlay?  I have tried a few things and portage does
> not seem to be picking up the override for the function.

Not that I know of. A simpler and equally effective alteration is
python_compile() {
	${EPYTHON} tools/cythonize.py || die
	distutils-r1_python_compile \
		${SCIPY_FCONFIG}
}
You don't have any parallelism with any python but the loss seems to be negligible as far as I can measure it.
Comment 42 matoro archtester 2017-11-30 20:14:04 UTC
(In reply to François Bissey from comment #41)
> (In reply to matoro from comment #40)
> > (In reply to François Bissey from comment #38)
> > > Just removing the bit "-j $(makeopts_jobs)" altogether is the better
> > > solution. Although I'll admit to use the following in a private overlay
> > > python_compile() {
> > > 	is_python_34_under(){
> > > 		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
> > > 	}
> > > 	local parallel_build=""
> > > 	if is_python_34_under ; then
> > > 		parallel_build="--parallel $(makeopts_jobs)"
> > > 	fi
> > > 	${EPYTHON} tools/cythonize.py || die
> > > 	distutils-r1_python_compile \
> > > 		${parallel_build} \
> > > 		${SCIPY_FCONFIG}
> > > }
> > 
> > Is it possible to add this override via /etc/portage/package.env without
> > going through a local overlay?  I have tried a few things and portage does
> > not seem to be picking up the override for the function.
> 
> Not that I know of. A simpler and equally effective alteration is
> python_compile() {
> 	${EPYTHON} tools/cythonize.py || die
> 	distutils-r1_python_compile \
> 		${SCIPY_FCONFIG}
> }
> You don't have any parallelism with any python but the loss seems to be
> negligible as far as I can measure it.

This works great, including on =sci-libs/scipy-1.0.0 .  Is this grounds for changing the ebuild in the official tree?  IMO a functional build is more important than features advertised by upstream.
Comment 43 François Bissey 2017-11-30 20:17:21 UTC
(In reply to matoro from comment #42)
> (In reply to François Bissey from comment #41)
> > (In reply to matoro from comment #40)
> > > (In reply to François Bissey from comment #38)
> > > > Just removing the bit "-j $(makeopts_jobs)" altogether is the better
> > > > solution. Although I'll admit to use the following in a private overlay
> > > > python_compile() {
> > > > 	is_python_34_under(){
> > > > 		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
> > > > 	}
> > > > 	local parallel_build=""
> > > > 	if is_python_34_under ; then
> > > > 		parallel_build="--parallel $(makeopts_jobs)"
> > > > 	fi
> > > > 	${EPYTHON} tools/cythonize.py || die
> > > > 	distutils-r1_python_compile \
> > > > 		${parallel_build} \
> > > > 		${SCIPY_FCONFIG}
> > > > }
> > > 
> > > Is it possible to add this override via /etc/portage/package.env without
> > > going through a local overlay?  I have tried a few things and portage does
> > > not seem to be picking up the override for the function.
> > 
> > Not that I know of. A simpler and equally effective alteration is
> > python_compile() {
> > 	${EPYTHON} tools/cythonize.py || die
> > 	distutils-r1_python_compile \
> > 		${SCIPY_FCONFIG}
> > }
> > You don't have any parallelism with any python but the loss seems to be
> > negligible as far as I can measure it.
> 
> This works great, including on =sci-libs/scipy-1.0.0 .  Is this grounds for
> changing the ebuild in the official tree?  IMO a functional build is more
> important than features advertised by upstream.

I think it is. I experimented with all sorts of combinations and python-3.5+'s distutils cannot cope with scipy. End of story.
However, I am not in charge of this ship.
Comment 44 matoro archtester 2017-11-30 20:44:26 UTC
(In reply to François Bissey from comment #43)
> (In reply to matoro from comment #42)
> > (In reply to François Bissey from comment #41)
> > > (In reply to matoro from comment #40)
> > > > (In reply to François Bissey from comment #38)
> > > > > Just removing the bit "-j $(makeopts_jobs)" altogether is the better
> > > > > solution. Although I'll admit to use the following in a private overlay
> > > > > python_compile() {
> > > > > 	is_python_34_under(){
> > > > > 		[[ ${EPYTHON} == python3.[0-4] || ${EPYTHON} == python2.7 ]]
> > > > > 	}
> > > > > 	local parallel_build=""
> > > > > 	if is_python_34_under ; then
> > > > > 		parallel_build="--parallel $(makeopts_jobs)"
> > > > > 	fi
> > > > > 	${EPYTHON} tools/cythonize.py || die
> > > > > 	distutils-r1_python_compile \
> > > > > 		${parallel_build} \
> > > > > 		${SCIPY_FCONFIG}
> > > > > }
> > > > 
> > > > Is it possible to add this override via /etc/portage/package.env without
> > > > going through a local overlay?  I have tried a few things and portage does
> > > > not seem to be picking up the override for the function.
> > > 
> > > Not that I know of. A simpler and equally effective alteration is
> > > python_compile() {
> > > 	${EPYTHON} tools/cythonize.py || die
> > > 	distutils-r1_python_compile \
> > > 		${SCIPY_FCONFIG}
> > > }
> > > You don't have any parallelism with any python but the loss seems to be
> > > negligible as far as I can measure it.
> > 
> > This works great, including on =sci-libs/scipy-1.0.0 .  Is this grounds for
> > changing the ebuild in the official tree?  IMO a functional build is more
> > important than features advertised by upstream.
> 
> I think it is. I experimented all sorts of combinations and python-3.5+'s
> distutils cannot cope with scipy. End of story.
> However, I am not in charge of this ship.

Well according to https://devmanual.gentoo.org/general-concepts/ebuild-revisions/index.html fixing a build issue does not warrant a new revision, and according to https://devmanual.gentoo.org/ebuild-maintenance/index.html#touching-other-developers-ebuilds we should give the dev time to fix it.  I've sent an email with the appropriate patch, if nothing is forthcoming I will submit a pull request myself.
Comment 45 François Bissey 2017-11-30 20:49:11 UTC
If you do, check if the "multiprocessing" eclass is still needed. I am fairly sure it is only included to get makeopts_jobs().
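
For readers unfamiliar with it, makeopts_jobs comes from the multiprocessing eclass and turns the user's MAKEOPTS into a job count. The sketch below is not that implementation; it is a hypothetical helper, jobs_from_makeopts, that only illustrates the idea of reading a job count out of a MAKEOPTS-style string. It assumes just the `-jN` and `-j N` spellings and falls back to 1 when neither is present.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not the eclass function): print the job count found
# in a MAKEOPTS-style string, or 1 when no "-j" option with a number exists.
jobs_from_makeopts() {
    local opts="$1"
    # Keep the regex in a variable so bash does not try to parse it as syntax.
    local re='(^|[[:space:]])-j[[:space:]]*([0-9]+)'
    if [[ "${opts}" =~ ${re} ]]; then
        echo "${BASH_REMATCH[2]}"
    else
        echo 1
    fi
}

jobs_from_makeopts "-j5"
jobs_from_makeopts "-j 8 -l 7.5"
jobs_from_makeopts "-l 4"
```

Once the ebuild no longer passes any "-j" argument, nothing in it needs such a count, which is why the eclass inherit may become unnecessary.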
Comment 46 Benda Xu gentoo-dev 2017-12-01 05:21:51 UTC
(In reply to François Bissey from comment #41)

> Not that I know of. A simpler and equally effective alteration is
> python_compile() {
> 	${EPYTHON} tools/cythonize.py || die
> 	distutils-r1_python_compile \
> 		${SCIPY_FCONFIG}
> }
> You don't have any parallelism with any python but the loss seems to be
> negligible as far as I can measure it.

So, in short there is no hope to support parallel build of scipy on distutils and we just use the serialized build unconditionally.  Is that correct?
Comment 47 François Bissey 2017-12-01 06:23:18 UTC
(In reply to Benda Xu from comment #46)
> (In reply to François Bissey from comment #41)
> 
> > Not that I know of. A simpler and equally effective alteration is
> > python_compile() {
> > 	${EPYTHON} tools/cythonize.py || die
> > 	distutils-r1_python_compile \
> > 		${SCIPY_FCONFIG}
> > }
> > You don't have any parallelism with any python but the loss seems to be
> > negligible as far as I can measure it.
> 
> So, in short there is no hope to support parallel build of scipy on
> distutils and we just use the serialized build unconditionally.  Is that
> correct?

I cannot fix python 3.5+ distutils. numpy's performs better, but python's cannot be shut off without patching.
So yes: serialize unconditionally, or use numpy parallelism with python <3.5, which is too complicated in my opinion.
Comment 48 Denis Descheneaux 2017-12-14 07:21:56 UTC
GRAPHITE="-floop-interchange -floop-strip-mine -floop-block -fgraphite-identity"
CFLAGS="-march=sandybridge -O2 -ftree-vectorize ${GRAPHITE} -pipe"
CXXFLAGS="${CFLAGS}"

USE="-doc -sparse {-test}"
PYTHON_TARGETS="python3_6 -python2_7 -python3_4 -python3_5"

sci-libs/scipy-1.0.0 fails for me with -j5

builds successfully when package.env / MAKEOPTS="-j1"
Comment 49 Mathy Vanvoorden 2018-06-25 20:58:27 UTC
Was bitten by this with scipy-1.1.0 when switching to python 3.6, using MAKEOPTS=-j1 did the trick for me.

See also Bug 658864 someone else reported, which is a duplicate.
Comment 50 Thomas Deutschmann (RETIRED) gentoo-dev 2018-06-25 22:03:43 UTC
*** Bug 658864 has been marked as a duplicate of this bug. ***
Comment 51 Thomas Deutschmann (RETIRED) gentoo-dev 2018-06-25 22:05:12 UTC
I can confirm that using MAKEOPTS=-j1 fixes the problem for x86.
Comment 52 Pacho Ramos gentoo-dev 2018-06-26 12:41:44 UTC
*** Bug 646328 has been marked as a duplicate of this bug. ***
Comment 53 Pacho Ramos gentoo-dev 2018-06-27 21:04:43 UTC
[master dea904d8d71a] sci-libs/scipy: Parallel build fails (#614464)
 3 files changed, 7 insertions(+), 4 deletions(-)
Comment 54 Benda Xu gentoo-dev 2020-01-27 04:15:13 UTC
*** Bug 701642 has been marked as a duplicate of this bug. ***
Comment 55 Benda Xu gentoo-dev 2020-01-27 04:15:48 UTC
*** Bug 704226 has been marked as a duplicate of this bug. ***
Comment 56 Larry the Git Cow gentoo-dev 2020-01-27 04:59:51 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=809afd4a24533311ced5ecfb2f022b539a3b6dd2

commit 809afd4a24533311ced5ecfb2f022b539a3b6dd2
Author:     Benda Xu <heroxbd@gentoo.org>
AuthorDate: 2020-01-27 04:54:55 +0000
Commit:     Benda Xu <heroxbd@gentoo.org>
CommitDate: 2020-01-27 04:59:38 +0000

    sci-libs/scipy: disable parallel build completely.
    
      After 4 years discussion and debugging, we conclude that Python 3 is
      deeply broken in parallel builds for anything involving compiling of
      C/C++/fortran code.  The problem is universal, regardless how
      dev-python/numpy is built.
    
      Numpy and scipy upstream cannot do anything about this.  We bite the
      bullet and disable parallel build of scipy completely.
    
      Thanks to all who have contributed to this heroic marathon
      debugging.  We regret that only a workaround can be provided at this
      moment.
    
    Credit: Andrés Becerra Sandoval, Hendrik v. Raven, younky.yang@yahoo.com
    Credit: matoro, Denis Descheneaux, Mathy Vanvoorden, email200202@yahoo.com
    Credit: jon R-B, Anton Kochkov, Jonas Stein, edes, David Duchesne
    Credit: thulle, Mathy Vanvoorden, Sasha Medvedev, rtgiskard@gmail.com
    Credit: Lukasz Ligowski, Zentoo, Jouni Kosonen, Neil, Harris Landgarten
    Credit: Markus Oehme, Andreas Proteus
    Suggested-By: François Bissey,  Arfrever Frehtes Taifersar Arahesis
    Reference: https://github.com/numpy/numpy/issues/13080
    Reference: https://github.com/scipy/scipy/issues/7112
    Closes: https://bugs.gentoo.org/614464
    Package-Manager: Portage-2.3.79, Repoman-2.3.18
    Signed-off-by: Benda Xu <heroxbd@gentoo.org>

 sci-libs/scipy/scipy-1.4.1.ebuild | 2 ++
 1 file changed, 2 insertions(+)
Comment 57 Pacho Ramos gentoo-dev 2020-05-01 17:08:33 UTC
*** Bug 639778 has been marked as a duplicate of this bug. ***
Comment 58 Thomas R. (TRauMa) 2020-05-05 22:23:59 UTC
The stable 1.1.0 ebuild is still missing this fix.
Comment 59 Andrew Ammerlaan gentoo-dev 2020-08-18 11:39:37 UTC
*** Bug 676640 has been marked as a duplicate of this bug. ***