It would be nice if portage provided a jobserver to each ebuild phase via the MAKEFLAGS environment variable, and ensured that any slots consumed by a given ebuild phase are automatically reclaimed after the phase completes. Slots can be allocated using emerge --jobs and --load-average arguments.

References:
https://www.gnu.org/software/make/manual/html_node/POSIX-Jobserver.html
http://make.mad-scientist.net/papers/jobserver-implementation/
https://github.com/rust-lang/cargo/pull/4110
https://github.com/alexcrichton/jobserver-rs
I second this request.
This will also help with bugs like #737098. Thank you!
+1

clarification:
- current emerge may run parallel ebuilds
- each ebuild may use GNU make
- top-level GNU make of each ebuild may create a jobserver
- child GNU make(s) may use the created jobserver via the extended MAKEFLAGS

problem:
- there can be multiple GNU make jobserver in parallel
- these jobservers can overuse given cpu resources

goal:
- let emerge create a master GNU make jobserver
- each parallel ebuild's top-level GNU make should then request jobs from it as a client

unanswered:
- no bash-ish jobserver example for easy emerge or prescript integration?
- no simple/plain GNU make jobserver command?
- sharing pipes between parallel ebuilds possible? sandbox?

GNU make procedure:
$ cat Makefile
all:
	echo $$MAKEFLAGS
$ make -j63
echo $MAKEFLAGS
-j63 --jobserver-auth=3,4
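For anything other than make that wants to join the jobserver (emerge itself, for instance), the first step is pulling the --jobserver-auth value out of MAKEFLAGS as shown in the output above. Here is a rough sketch of that; parse_jobserver_auth is a made-up helper name, not something portage provides. It assumes the two forms described in the GNU make manual (a pair of inherited fds, or fifo:PATH as of make 4.4) and that only the last occurrence matters.

```python
import re
from typing import Optional, Tuple, Union

# Matches both forms described in the GNU make manual:
#   --jobserver-auth=R,W        (inherited pipe file descriptors)
#   --jobserver-auth=fifo:PATH  (named pipe, GNU make 4.4 and later)
_AUTH_RE = re.compile(
    r"--jobserver-auth=(?:fifo:(?P<path>\S+)|(?P<r>-?\d+),(?P<w>-?\d+))"
)


def parse_jobserver_auth(makeflags: str) -> Optional[Union[str, Tuple[int, int]]]:
    """Return a fifo path, a (read_fd, write_fd) pair, or None.

    Hypothetical helper for illustration only.
    """
    found = None
    for match in _AUTH_RE.finditer(makeflags):
        found = match  # assumption: when the option repeats, the last one wins
    if found is None:
        return None
    if found.group("path") is not None:
        return found.group("path")
    return int(found.group("r")), int(found.group("w"))


print(parse_jobserver_auth("-j63 --jobserver-auth=3,4"))
print(parse_jobserver_auth("-j8 --jobserver-auth=fifo:/tmp/jobserver"))
print(parse_jobserver_auth("-j8"))
```

The fd-pair form only works if the descriptors are actually inherited by the child, which is the part that gets awkward across ebuild phases and the sandbox.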
(In reply to and from comment #3)
> problem:
> - there can be multiple GNU make jobserver in parallel
> - these jobservers can overuse given cpu resources

Note that you can already mitigate the over-provisioning issue without a central portage-provided jobserver by using make's --load-average option. Multiple parallel emerges will then implicitly communicate via the load average and restrict their task creation based on it. For example, on an 8-core system, you could set:

MAKEOPTS="--jobs 8 --load-average 9"
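To illustrate the kind of decision --load-average implies, here is a simplified sketch. It is not GNU make's actual heuristic, which is more elaborate; may_start_job and its parameters are hypothetical names. The rule modelled is the documented one: no new job is started while other jobs are running and the load average is at least the limit.

```python
import os
from typing import Optional


def may_start_job(load_limit: float, running_jobs: int,
                  load: Optional[float] = None) -> bool:
    """Decide whether one more job may start under a load-average limit."""
    if running_jobs == 0:
        return True  # always allow progress when nothing else is running
    if load is None:
        load = os.getloadavg()[0]  # 1-minute load average (Unix only)
    return load < load_limit


print(may_start_job(9.0, running_jobs=3, load=7.5))   # below the limit
print(may_start_job(9.0, running_jobs=3, load=9.0))   # at the limit
print(may_start_job(9.0, running_jobs=0, load=12.0))  # nothing running yet
```

Since the load average lags behind the real number of runnable processes, this throttles after the fact, whereas jobserver tokens cap parallelism up front.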
The work in make 4.4 (not yet released) will make this much easier as we don't have to worry about fd inheritance at all: https://github.com/ninja-build/ninja/issues/1139#issuecomment-1223785608. Of course, we still need to implement a basic jobserver, but marxin's example is a solid basis to start on.
There's now a PR for this: https://github.com/gentoo/portage/pull/913 from syu!
https://lore.kernel.org/all/20240404111613.2574424-1-martin@geanix.com/
It would be useful to have a similar fetch job server as I've noted in bug 425682 comment 11.
(In reply to Sam James from comment #7)
> https://lore.kernel.org/all/20240404111613.2574424-1-martin@geanix.com/

This one uses a fifo-based approach that relies on cooperation and aligns with what I had in mind in comment #0.

(In reply to Sam James from comment #6)
> There's now a PR for this: https://github.com/gentoo/portage/pull/913 from
> syu!

This one uses wrappers to intercept calls to programs, which seems to assume that the program callers are basically uncooperative. It's a much different job server pattern than I had in mind. It seems backwards in a way for programs to be spawned *before* the corresponding jobs have been allocated.
(In reply to Zac Medico from comment #9)
> (In reply to Sam James from comment #7)
> > https://lore.kernel.org/all/20240404111613.2574424-1-martin@geanix.com/
>
> This one uses a fifo based approach that relies on cooperation and aligns
> with what I had in mind in comment #0.

There's a python implementation of a GNU make jobserver using a named pipe here:

https://lore.kernel.org/all/20240404111613.2574424-6-martin@geanix.com/

It has this tiny main:

+def main(path, user, group, mode, jobs):
+    """Setup a fifo to use as jobserver shared between builds."""
+    try:
+        path.unlink(missing_ok=True)
+        os.mkfifo(path)
+        shutil.chown(path, user, group)
+        os.chmod(path, mode)
+    except (FileNotFoundError, PermissionError) as exc:
+        raise SystemExit(f"failed to create fifo: {path}: {exc.strerror}")
+
+    print(f"jobserver: {path}: {jobs} jobs")
+    fifo = os.open(path, os.O_RDWR)
+    os.write(fifo, b"+" * jobs)
+
+    print("jobserver: ready; waiting indefinitely")
+    signal.signal(signal.SIGTERM, signal_handler)
+    signal.signal(signal.SIGINT, signal_handler)
+    resumed.wait()
+
+    print("jobserver: exiting")
+    path.unlink()
+    os.close(fifo)
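The main above is only the server half. A cooperating client would read one byte per job and write the same byte back when the job finishes. Below is a minimal sketch of that client side, assuming a fifo seeded with "+" tokens like the one above. job_token is a hypothetical helper, and the throwaway fifo in a temporary directory just stands in for the real server so the example runs on its own (Linux behaviour for O_RDWR on a fifo is assumed, as in the quoted code).

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def job_token(fifo_fd: int):
    """Hold one jobserver token for the duration of the with-block."""
    token = os.read(fifo_fd, 1)      # blocks until a token is available
    try:
        yield token
    finally:
        os.write(fifo_fd, token)     # return the same byte, even on error


# Demo setup standing in for the server: a throwaway fifo seeded with 2 tokens.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "jobserver.fifo")   # hypothetical location
os.mkfifo(path)
fifo = os.open(path, os.O_RDWR)                  # O_RDWR avoids blocking on open
os.write(fifo, b"+" * 2)

with job_token(fifo) as first:
    with job_token(fifo) as second:
        held = first + second                    # both tokens are checked out here

# Drain without blocking to count what came back.
os.set_blocking(fifo, False)
returned = os.read(fifo, 16)
print(held, returned)

os.close(fifo)
os.unlink(path)
os.rmdir(workdir)
```

The try/finally covers ordinary exceptions, but a client killed with SIGKILL never writes its token back, which is the leak scenario a central fifo cannot detect by itself.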
As I commented on the portage PR implementing the "uncooperative jobs protocol": It seems a bit unfortunate to not share an implementation with the GNU make jobserver protocol for the purpose of interoperating with other programs that expect a GNU make jobserver. For example, gcc's -flto=jobserver will check for a GNU make jobserver in order to run the link-time lto1 processes in parallel. It is unlikely GCC will implement support for "the gentoo jobserver". It would also mean that all Make-based packages would get seamless support for all build edges without verifying every program you want to wrap and making a shellscript symlink. A wrapper could still be useful for packages that don't natively support the GNU make jobserver.
I think it would in general be a tremendous mistake to avoid the significant benefits of allowing make, ninja, and various other well-behaved programs to seamlessly interoperate and have first-class support for tracking running jobs. If we want wrappers for uncooperative programs, that can still be opt-in by individual packages that need it, and the wrapper should speak the GNU Make jobserver protocol.
(In reply to Sam James from comment #6) > There's now a PR for this: https://github.com/gentoo/portage/pull/913 from > syu! I've opened a separate bug 937165 to track the idea of a job sandbox that uses compiler wrappers to enforce MAKEOPTS. This idea is only slightly related to GNU make jobserver integration. In theory the job sandbox could probably allocate jobs from a GNU make jobserver.
(In reply to Zac Medico from comment #0)
> It would be nice if portage provided a jobserver to each ebuild phase via
> the MAKEFLAGS environment variable, and ensured that any slots consumed by a
> given ebuild phase are automatically reclaimed after the phase completes.

What I've suggested is to create a sort of GNU make jobserver proxy for each ebuild phase, as a guard against token leakage. It appears that this kind of GNU make jobserver proxy is not feasible because there is no way to dynamically migrate tokens from a central jobserver. This arises because we would need to poll for the ability to write to the pipe in order to detect a corresponding read, but the poll for write ability will always succeed in practice for a buffered pipe that has plenty of unused buffer capacity. So, our ebuild phases will have to communicate directly with a central jobserver fifo.

If we have a central service outside of emerge like the one mentioned in comment #10, then it could possibly leak job tokens, especially if jobs are interrupted. However, it's possible that leaks are negligible. One could always restart the central jobserver in order to reset the tokens.

For the central jobserver, since python seems like overkill, we might implement it in bash and then exec sleep infinity to hold the fifo open like this:

exec {fifo}<>"${FIFO}"
eval "printf '+%.0s' {1..${JOBS}}" >&${fifo}
exec sleep infinity
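The poll limitation described above is easy to reproduce. This small sketch uses an anonymous pipe as a stand-in for the jobserver pipe (an assumption for the demo; a fifo should behave the same way) and checks writability before and after a simulated client takes a token.

```python
import os
import select

read_fd, write_fd = os.pipe()
os.write(write_fd, b"+" * 4)          # seed four tokens, as a jobserver would


def writable(fd: int) -> bool:
    """Report whether select() considers fd writable right now."""
    return bool(select.select([], [fd], [], 0)[1])


before = writable(write_fd)           # writable before any token is taken

os.read(read_fd, 1)                   # a client takes one token

after = writable(write_fd)            # still writable, so nothing was signalled

print(before, after)

os.close(read_fd)
os.close(write_fd)
```

Both checks report the pipe as writable, so a proxy watching for writability gets no signal that a token was consumed and has no trigger for fetching another one from the central jobserver.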
*** Bug 941551 has been marked as a duplicate of this bug. ***
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=6f958be3cfa5054541ab3db0bcff73a496ecd0be

commit 6f958be3cfa5054541ab3db0bcff73a496ecd0be
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-08-05 05:43:31 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-12-28 03:11:33 +0000

    Pass through MAKEFLAGS and exclude from environment.bz2

    Allow the MAKEFLAGS environment variable to pass through,
    in case a centralized GNU Make POSIX Jobserver is available.
    In order to prevent persistence of this variable in
    environment.bz2, exclude it when the __save_ebuild_env
    --exclude-init-phases argument is given.

    Ultimately we may want to add support for portage to parse
    MAKEFLAGS and use it to allocate job tokens in various
    circumstances. For example, emerge could allocate a job
    token for each job started for emerge --jobs. This would
    remove a job token from the pool that is available to
    nested make calls, but is reasonable because nested make
    calls will execute jobs serially when no jobserver tokens
    remain.

    Bug: https://bugs.gentoo.org/692576
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 NEWS                                                   |  2 ++
 bin/save-ebuild-env.sh                                 |  3 +++
 lib/portage/__init__.py                                |  3 ++-
 lib/portage/package/ebuild/_config/special_env_vars.py |  3 ++-
 lib/portage/package/ebuild/doebuild.py                 |  5 +++--
 man/make.conf.5                                        | 10 +++++++++-
 6 files changed, 21 insertions(+), 5 deletions(-)