This should be safe since we are either running configure in an external build directory or in a separate copy of the sources.

A possible danger would be if configure writes files to ${S}, but that would be a very broken configure program indeed.

If you think that will be a problem, maybe we can have someone tinderbox it.

Index: multilib-minimal.eclass
===================================================================
RCS file: /var/cvsroot/gentoo-x86/eclass/multilib-minimal.eclass,v
retrieving revision 1.5
diff -u -r1.5 multilib-minimal.eclass
--- multilib-minimal.eclass	28 Jun 2013 12:42:48 -0000	1.5
+++ multilib-minimal.eclass	15 Sep 2013 22:56:30 -0000
@@ -47,7 +47,7 @@
 		popd >/dev/null || die
 	}
 
-	multilib_foreach_abi multilib-minimal_abi_src_configure
+	multilib_parallel_foreach_abi multilib-minimal_abi_src_configure
 }
 
 multilib-minimal_src_compile() {
This was already tested and there was breakage.
Can you (or axs) elaborate? I'm wondering if it is an isolated break, or a pattern.
Right now they seem isolated. But I haven't done anywhere close to a full audit of multilib-minimal inheritors; i tested 5 or 6 (including one that I converted and haven't submitted yet), and two failed. Since one of them was that one I didn't submit yet, and I haven't figured out how to make it not fail yet with parallel ./configure's, I figured it would make sense to wait for now. I *think* the failure relates to the environment not being completely independent between the two ./configure's, but I haven't debugged it yet to confirm.
If you give me some specific ebuilds to look at, I would be happy to do some investigation myself.
Created attachment 358818 [details]
workaround needed for environment reset after parallel configure

So the only one that I can duplicate right now is my attempt at adapting sys-libs/readline. The issue there is that it has a two-part build system, where the libs are configured and built and then rlfe is configured and built. The problem is actually with this second ./configure run, and as I guessed before it seems to have to do with the environment (specifically the effects of append-cppflags and append-ldflags) not being preserved between the parallelized src_configure and the non-parallelized src_compile. When src_configure is not parallelized everything is fine.

I *am* able to make it work by checking whether the modifications to CPPFLAGS exist, and re-running append-cppflags/append-ldflags if they don't. I've attached the example ebuild for readline (with this workaround).

Thoughts?
Oh my, that is ugly. Maybe we should split that rlfe build into a separate package?
The flags are "disappearing" due to the sub-shells created by multiprocessing.eclass. To avoid this, set the variables outside of the multilib function.

src_configure() {
	# do this in src_compile due to the sub-econf needing the same environment

	# fix implicit decls with widechar funcs
	append-cppflags -D_GNU_SOURCE
	# http://lists.gnu.org/archive/html/bug-readline/2010-07/msg00013.html
	append-cppflags -Dxrealloc=_rl_realloc -Dxmalloc=_rl_malloc -Dxfree=_rl_free

	# This is for rlfe, but we need to make sure LDFLAGS doesn't change
	# so we can re-use the config cache file between the two.
	append-ldflags -L.

	multilib-minimal_src_configure
}

multilib_src_configure() {
	ECONF_SOURCE="${S}" econf \
		--cache-file="${BUILD_DIR}"/config.cache \
		--with-curses \
		$(use_enable static-libs static)

#	if ! tc-is-cross-compiler; then
#	fi
}
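The effect described above can be reproduced outside of portage. The following is a minimal sketch, assuming a hypothetical helper `append_demo_flag` that stands in for append-ldflags; it does not use multiprocessing.eclass itself, only a plain background subshell, which is the mechanism the comment blames for the lost flags.

```bash
#!/bin/bash
# Hypothetical stand-in for an eclass helper such as append-ldflags.
append_demo_flag() {
    LDFLAGS="${LDFLAGS:+${LDFLAGS} }$1"
}

LDFLAGS="-Wl,-O1"

# Case 1: the append happens inside a background subshell, the way a
# parallel per-ABI job would run. The parent never sees the change.
( append_demo_flag -L. ) &
wait
after_subshell=${LDFLAGS}

# Case 2: the append happens in the parent before any job is forked,
# so the parent keeps it and every later child inherits it.
append_demo_flag -L.
after_parent=${LDFLAGS}
inherited=$( printf '%s' "${LDFLAGS}" )

echo "after subshell: ${after_subshell}"
echo "after parent:   ${after_parent}"
echo "seen by child:  ${inherited}"
```

This is why moving the appends into src_configure, ahead of the per-ABI dispatch, keeps them visible to the later src_compile phase.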
Created attachment 358836 [details]
Working readline ebuild

Here's an example of a functional readline ebuild that works with parallel configure. Feel free to steal it.
Unfortunately, I have found at least one other package (media-libs/libvpx) which exports environment variables within multilib_src_configure; these are bound to get lost in the subshell. Perhaps we could provide some method for the ebuild maintainer to certify that their code is subshell-safe and run parallel configure at that point?
(In reply to Mike Gilbert from comment #8)
> Created attachment 358836 [details]
> Working readline ebuild
> 
> Here's an example of a functional readline ebuild that works with parallel
> configure. Feel free to steal it.

Nice -- I guess I should've actually checked examples/rlfe/configure.in to see if it even did anything with the detection of those libs that are symlinked, rather than assuming the original ebuild did it that way to avoid failure.

I can also confirm that, when both econfs are run in multilib_src_configure, calling append-{cpp,ld}flags only in multilib_src_configure sets the values in the config, so emake loads them appropriately rather than needing them in the environment at src_compile time.

Note, however, that if *not* using multilib_parallel_foreach_abi for multilib-minimal_src_configure, then doing this ends up double-appending for the second ABI:

---Quote: with parallel--
x86_64-pc-linux-gnu-gcc -Wl,-O1 -Wl,--as-needed -L. -o rlfe rlfe.o pty.o -lreadline -lhistory -lcurses
>>> Source compiled.

---Quote: without parallel--
x86_64-pc-linux-gnu-gcc -Wl,-O1 -Wl,--as-needed -L. -L. -o rlfe rlfe.o pty.o -lreadline -lhistory -lcurses
>>> Source compiled.

(note the double -L. ; and CPPFLAGS are doubled as well)

I wonder if it would make sense to adjust multilib-minimal.eclass or the multilib_foreach_abi call so that it runs multilib_src_configure in a subshell even if it isn't parallel, so that identical behaviour occurs?
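The double-append can be illustrated with a small stand-alone sketch. Everything here is hypothetical: `demo_configure` stands in for a multilib_src_configure that appends a flag, and the two loops stand in for a same-shell ABI loop versus one that runs each ABI in its own subshell. It is not the eclass code.

```bash
#!/bin/bash
# Hypothetical per-ABI configure step that appends a flag, standing in
# for a multilib_src_configure that calls append-ldflags.
demo_configure() {
    LDFLAGS="${LDFLAGS} -L."
    echo "${ABI}:${LDFLAGS}"
}

run_sequential() {
    LDFLAGS="-Wl,-O1"
    for ABI in amd64 x86; do
        demo_configure          # same shell: changes accumulate
    done
}

run_isolated() {
    LDFLAGS="-Wl,-O1"
    for ABI in amd64 x86; do
        ( demo_configure )      # subshell: changes are discarded afterwards
    done
}

sequential_out=$(run_sequential)
isolated_out=$(run_isolated)

printf '%s\n' "sequential:" "${sequential_out}" "isolated:" "${isolated_out}"
```

In the sequential run the second ABI sees `-L.` twice, matching the "without parallel" quote; in the isolated run each ABI starts from the same flags.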
(In reply to Ian Stakenvicius from comment #10)
> Note, however, that if *not* using multilib_parallel_foreach_abi for
> multilib-minimal_src_configure, then doing this ends up double-appending for
> the second ABI:
> 

Right. Appending flags in multilib_src_configure is just plain wrong for that very reason. Unfortunately it is probably quite common.
(In reply to Mike Gilbert from comment #11)

Actually, you can append flags in multilib_src_configure if you are careful and reduce the scope of the variables.

For example, we do this in several python ebuilds when we want different CFLAGS for python2 vs python3:

python_compile() {
	if [[ ${EPYTHON} != python3* ]]; then
		local CFLAGS=${CFLAGS}
		append-cflags -fno-strict-aliasing
	fi
	distutils-r1_python_compile
}

CFLAGS is declared as a local variable, and then we append the flags we want.
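The scoping trick can be checked in isolation. This is a minimal sketch, assuming a hypothetical `append_demo_cflags` in place of append-cflags and a hypothetical `demo_compile` in place of python_compile; it relies only on bash's dynamic scoping of `local` variables.

```bash
#!/bin/bash
# Hypothetical stand-in for append-cflags: modifies whichever CFLAGS is
# currently in scope (bash uses dynamic scoping for locals).
append_demo_cflags() {
    CFLAGS="${CFLAGS} $*"
}

# Hypothetical stand-in for python_compile.
demo_compile() {
    if [ "$1" != python3 ]; then
        # Shadow the global with a local copy; the append below only
        # touches the copy and vanishes when the function returns.
        local CFLAGS="${CFLAGS}"
        append_demo_cflags -fno-strict-aliasing
    fi
    seen_inside=${CFLAGS}
}

CFLAGS="-O2"

demo_compile python2
seen_py2=${seen_inside}
after_py2=${CFLAGS}

demo_compile python3
seen_py3=${seen_inside}

echo "python2 build saw: ${seen_py2}"
echo "global afterwards: ${after_py2}"
echo "python3 build saw: ${seen_py3}"
```

Because the extra flag never reaches the global CFLAGS, running the function for a second implementation (or a second ABI) in the same shell starts from the unmodified value.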
(In reply to Mike Gilbert from comment #12)
> (In reply to Mike Gilbert from comment #11)
> 
> Actually, you can append flags in multilib_src_configure if you are careful
> and reduce the scope of the variables.
> 
> For example, we do this in several python ebuilds when we want different
> CFLAGS for python2 vs python3:
> 
> python_compile() {
> 	if [[ ${EPYTHON} != python3* ]]; then
> 		local CFLAGS=${CFLAGS}
> 		append-cflags -fno-strict-aliasing
> 	fi
> 	distutils-r1_python_compile
> }
> 
> CFLAGS is declared as local var, and then we append the flags we want.

(In reply to Mike Gilbert from comment #11)
> (In reply to Ian Stakenvicius from comment #10)
> > Note, however, that if *not* using multilib_parallel_foreach_abi for
> > multilib-minimal_src_configure, then doing this ends up double-appending for
> > the second ABI:
> > 
> 
> Right. Appending flags in multilib_src_configure is just plain wrong for that
> very reason. Unfortunately it is probably quite common.

Sweet, so we don't ever have a reason to force each src_configure to run in isolation.

Regarding multilib_src_configure for libvpx, I'm not seeing any issues with my tests. Current (non-parallel) has the same values for AR, CC, LD, etc. during src_configure, as does parallel with all the tc-exports being called again in src_compile, as does parallel with the original ebuild. Although I don't see anything in ./configure to do this, it must copy the values of these vars (or emake uses the toolchain's values directly somehow, such that it all just works); did you see a failure on that one?
Looking closer at the log file, I think my libvpx problem is unrelated.

>>> Compiling source in /tmp/portage/media-libs/libvpx-1.2.0_pre20130625/work/libvpx-1.2.0_pre20130625 ...
 * x86: running multilib-minimal_abi_src_compile
make -j6 verbose=yes GEN_EXAMPLES=
Makefile:29: *** Recursive variable `PATH' references itself (eventually). Stop.
 * ERROR: media-libs/libvpx-1.2.0_pre20130625::gentoo failed (compile phase):
 * emake failed
If you want to rely on earlier ABIs setting some variables for later ABIs, this is simply wrong. If you need anything set for sub-builds, you need to set it before multibuild_foreach_* is called.

That said, I will probably work on having more separation between environments running multibuild. If it goes well, it will be possible to e.g. 'append-flags' for an ABI and expect later phases for that ABI to see those changes.
See #487478.

TLDR of the above: In principle, simultaneous parallel executions of portage's econf() seem to have race conditions. For occult reasons this does not seem to be a problem in practice. The bug describes (in prose) a fairly easy recipe one could follow to correct the putative problem, without disturbing too much plumbing.
I'd be willing to make parallel configure an optional thing controlled by a variable, but it does not seem right to me to make this default behavior. Ebuild writers can test parallel configure and then enable it, if it works.

Patches welcome.
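One way such an opt-in switch could look is sketched below. All names are hypothetical (`DEMO_PARALLEL_CONFIGURE`, `DEMO_ABIS`, `demo_foreach_abi`, `demo_configure`); this is not the eclass implementation and it omits error handling, so a failing background job would go unnoticed here.

```bash
#!/bin/bash
# Hypothetical dispatcher: runs a per-ABI command either in parallel
# background subshells or serially in the current shell, depending on
# a variable the ebuild author would set after testing.
demo_foreach_abi() {
    local cmd=$1 abi
    for abi in ${DEMO_ABIS}; do
        if [ "${DEMO_PARALLEL_CONFIGURE:-no}" = yes ]; then
            ( ABI=${abi}; "${cmd}" ) &
        else
            ABI=${abi}
            "${cmd}"
        fi
    done
    wait    # no-op in the serial case
}

# Hypothetical configure step: records which ABI it ran for.
demo_configure() {
    echo "configured ${ABI}" > "${DEMO_WORKDIR}/${ABI}.log"
}

# Run the dispatcher in the given mode and print the collected logs.
run_mode() {
    DEMO_PARALLEL_CONFIGURE=$1
    rm -f "${DEMO_WORKDIR}"/*.log
    demo_foreach_abi demo_configure
    cat "${DEMO_WORKDIR}"/*.log
}

DEMO_WORKDIR=$(mktemp -d)
DEMO_ABIS="amd64 x86"

parallel_out=$(run_mode yes)
serial_out=$(run_mode no)
rm -rf "${DEMO_WORKDIR}"

printf '%s\n' "parallel:" "${parallel_out}" "serial:" "${serial_out}"
```

Defaulting the switch to `no` matches the position in the comment above: the serial path stays the default and the parallel path is only taken when explicitly enabled.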
(In reply to Julian Ospald (hasufell) from comment #17)
> I'd be willing to make parallel configure an optional thing controlled by a
> variable, but it does not seem right to me to make this default behavior.
> Ebuild writers can test parallel configure and then enable it, if it works.
> 
> patches welcome

http://git.overlays.gentoo.org/gitweb/?p=user/gmt.git;a=blob;f=eclass/multilib-minimal.eclass has an implementation of this, but also some cruft. If the basic approach looks acceptable I'd be happy to dust it off and make a nice incremental patch out of it.
(In reply to Greg Turner from comment #18)
> http://git.overlays.gentoo.org/gitweb/?p=user/gmt.git;a=blob;f=eclass/multilib-minimal.eclass
> has an implementation of this, but also some cruft.
> If the basic approach looks acceptable I'd be happy to dust it off and make
> a nice incremental patch out of it.

Too much cruft. It doesn't make sense to run phases other than configure (and sometimes maybe tests) in parallel. Just focus on configure since that's what really gives you a boost.
(In reply to Michał Górny from comment #19)
> (In reply to Greg Turner from comment #18)
> > http://git.overlays.gentoo.org/gitweb/?p=user/gmt.git;a=blob;f=eclass/multilib-minimal.eclass
> > has an implementation of this, but also some cruft.
> > If the basic approach looks acceptable I'd be happy to dust it off and make
> > a nice incremental patch out of it.
> 
> Too much cruft. It doesn't make sense to run phases other than configure
> (and sometimes maybe tests) in parallel. Just focus on configure since
> that's what really gives you a boost.

Agreed about src_install being overkill... However, src_compile has been working very nicely in my tree, as often as not. For huge ebuilds with make -j problems, it can be quite a win.

As for src_test -- lord knows, some of them take forever and a day, but, in practice, I've found a lot of them need a bunch of work to parallelize... they tend to run amok in "${S}", or grab some global resource (hard-coded TCP sockets, for example)... surmountable issues, but non-trivial to solve correctly, as you often have to grok some confusing home-grown framework to figure out what the hell is going on. No doubt, all manifestations of QA's role as the red-headed stepchild of the software world :)

Let me see if dropping src_install support and tidying up the grossly neglected in-source documentation doesn't pare it down to a comfortable size for you... been meaning to do that anyhow... should be able to take a crack at it tomorrow or Tuesday.
OK, refactored patches are in #493214 and: https://493214.bugs.gentoo.org/attachment.cgi?id=364520 has the particular patch in question. hth!
I completely forgot if there was something we felt missing from the parallel phase patch? Should we apply it?
Greg's patch? It was missing simplicity and lack of non-useful features :).
I'd probably rather go for parallelizing the phases and have a way of disabling it, in case it actually causes breakage.
As for default src_compile(), src_test() and src_install(), I have an idea to reuse make job server to optimally run them in parallel (i.e. without spawning NxN jobs). I will try to write a working patch tonight and send you.
(In reply to Michał Górny from comment #25) > As for default src_compile(), src_test() and src_install(), I have an idea > to reuse make job server to optimally run them in parallel (i.e. without > spawning NxN jobs). I will try to write a working patch tonight and send you. I think the most benefit actually comes from parallelizing src_configure. I have a lot of packages where compilation takes less time than running "./configure", at least on my CPU.
I guess I will do my own tinderbox run this weekend or so and just add parallel src_configure.
I've tested all multilib-minimal ebuilds, and cmake-multilib ebuilds after applying the patch making them use multilib-minimal, and found no new breakages except for sys-devel/llvm. I've fixed llvm yesterday, and now I've committed the eclass change.

/var/cvsroot/gentoo-x86/eclass/multilib-minimal.eclass,v  <--  multilib-minimal.eclass
new revision: 1.9; previous revision: 1.8