Summary: | sys-apps/portage: running econf in parallel will race to update config.{sub,guess} | ||
---|---|---|---|
Product: | Portage Development | Reporter: | Greg Turner <gmturner007> |
Component: | Core - Configuration | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bobo, floppym, gmturner007 |
Priority: | Normal | Keywords: | InVCS |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=485046 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Greg Turner
2013-10-09 21:29:25 UTC
Good catch. I think your proposed solution makes sense. If you want to attempt an implementation, I would be happy to review/test it. (In reply to Mike Gilbert from comment #1) > Good catch. I think your proposed solution makes sense. If you want to > attempt an implementation, I would be happy to review/test it. Sure, I think it will be fairly easy... pretty busy though, give me a few days. Is here anything to do for dev-portage? Otherwise please reassign. (In reply to Sebastian Luther (few) from comment #3) > Is here anything to do for dev-portage? Otherwise please reassign. Yes; if/when Greg provides an implementation it will be a patch against the Portage source code and dev-portage will need to review it. I'm just volunteering to do an initial review because it is an issue I would like to see fixed. Forgive me thinking out loud here, but, returning to this, I think I see a bug in my proposed solution. The problem is with the "_portage_econfed_something_" marker -- it's based on an implicit assumption (admittedly, almost always a correct one) that the first configure will occur in a parent directory of any subsequent econf invocation. An example case where this would not be the case is that of an ebuild exploiting the in-source build feature of multibuild for multiple ABI's in parallel (pretty easy to do with multilib-minimal, for example); this would run configure in parallel from a separate BUILD_DIR for each ABI. I can fix the bug by renaming "${T}/_portage_econfed_something_" to the (anyhow more descriptive) name "${T}/_config_sub_guess_substituted_", and keeping a list of (root?) directories where portage has already executed the recursive substitution heuristic in that file (it would still only be touched while holding the dir-lock). This makes for a potentially considerably more complicated solution, however, as now I'm faced with questions like, whether to do directory/subdirectory containment analysis, worry about complications due to symbolic links, and so forth. One way to avoid all that mess would be to a flat list of realpath's in there, not attempting to decide if A is a subdirectory of B at all. An even simpler approach would be to simply do the substitution for all of ${WORKDIR} on the first invocation... although I'm not 110% sure how reliable the assumption that everything is in ${WORKDIR} actually is. I guess, for a first crack at it, I'll go with the flat list of absolute real-paths. Once a directory already tests as "done" I can stop iterating through sub-directories, to hopefully preserve some of the performance benefit I was hoping to realize in my original take on the problem (technically, we no longer achieve any O(# of dirs) reduction, in this revised proposal, for revisited configure directories, but, anyhow, the point of this was never to enhance performance but to prevent races). Actually, this makes me think about a second assumption I've made in both my original and revised proposal -- that inappropriate config.{sub,guess} files do not somehow "regenerate" in-between econf invocations. My inclination is (it would seem) to cling to that assumption, but if anyone has any reason to question it, please let me know! (In reply to Greg Turner from comment #5) > The problem is with the "_portage_econfed_something_" marker -- it's based > on an implicit assumption (admittedly, almost always a correct one) that the > first configure will occur in a parent directory of any subsequent econf > invocation. Oh, no it isn't :) Should have read the code again -- it already uses ${WORKDIR}, not ${ECONF_SOURCE} to iterate. So my original approach seems OK. Sorry for the noise. Hopefully I can churn this out pretty quickly, and do some basic testing in over the next few days, guys. Presumably I should post my results to portage-devel as well as this bug? (In reply to Greg Turner from comment #5) > Forgive me thinking out loud here, but [snip] > this makes me think about a second assumption I've made in [my proposal] > -- that inappropriate config.{sub,guess} files > do not somehow "regenerate" in-between econf invocations. My inclination is > (it would seem) to cling to that assumption, but if anyone has any reason to > question it, please let me know! This part still stands, however. Technically, my proposal changes the semantics of econf, insofar as, right now, econf will do the right thing even if bad config.{sub,guess} files regenerate in-between econf invocations, but with the proposed modifications, it no longer would. If that's ever going to be a problem, then something will need to be tweaked. it's fairly easy to do: - leverage `flock` and a subshell when updating the two files - use `cp;mv` to do an automatic rewrite no need to hack anything else (In reply to SpanKY from comment #8) > it's fairly easy to do: > - leverage `flock` and a subshell when updating the two files by this you mean config.{sub,guess}? Note that we shuffle through these recursively in all of ${WORKDIR}... > - use `cp;mv` to do an automatic rewrite here you mean the shebang replacement, correct? > no need to hack anything else Any existing flock dependencies in portage? Concerned this might screw up some obscure prefix bootstrap somewhere. This still doesn't protect against config.{sub,guess} replacement racing with the configure script itself... but..., not sure how afraid of that we should be. What about just using cp;mv there, as well? They're sourced, after all.... Can anyone think of a reason that simply using "mv"-replacement for all the manipulations in econf wouldn't solve the entire problem? I'm not sure I can, actually... should be fixed in git now http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commitdiff;h=8d4d077d5775c4f1dac724d5e6fbb7cf14db1920 Closing, this is resolved. *** Bug 516892 has been marked as a duplicate of this bug. *** |