744103 – sys-apps/portage: could wrap compilers to avoid exponential job growth

Status:	CONFIRMED

Alias:	None

Product:	Portage Development
Classification:	Unclassified
Component:	Conceptual/Abstract Ideas
Hardware:	All Linux

Importance:	Normal normal
Assignee:	Portage team

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	184128

Reported:	2020-09-22 14:01 UTC by Michał Górny
Modified:	2020-09-23 03:19 UTC
CC List:	2 users

See Also:	692576
Package list:
Runtime testing required:	---

Description Michał Górny archtester

2020-09-22 14:01:06 UTC

Right now, if you use 'emerge --jobs' (or otherwise run emerge in parallel), the effective number of make (ninja, etc.) jobs spawned multiplies (up to emerge jobs * make jobs).  This is likely to be inefficient and to eat lots of memory.

I think a simple way to work around that would be to have 'locking' wrappers around CC, CXX and other commands known to consume lots of memory and to be run in parallel.  These wrappers would block whenever the total number of invocations would exceed the permitted number (defaulting to -j from MAKEOPTS).

Comment 1 Michał Górny archtester

2020-09-22 16:29:57 UTC

Well, I see three options for implementing this.

An obvious solution would be to use POSIX named semaphores.  They have exactly the semantics we need and should be efficient.  However, they do not release locks (post the semaphore) if the process crashes, so we could end up with subsequent jobs blocked permanently.  Adding a 'reaper' for this would probably make it more complex than the alternatives.
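A rough sketch of the crash problem, not a proposed implementation. It uses Python's multiprocessing.Semaphore (backed by a POSIX semaphore on Linux) as a stand-in for a named semaphore; the function name slot_leaked_after_crash is made up for illustration, and os.fork makes it UNIX-only.

```python
import multiprocessing
import os


def slot_leaked_after_crash(timeout=0.5):
    """Return True if a slot stays taken after its holder dies."""
    # A single slot stands in for the shared job counter.
    sem = multiprocessing.Semaphore(1)

    pid = os.fork()
    if pid == 0:
        # Child: take the slot, then die without posting it back,
        # the way a crashed compiler wrapper would.
        sem.acquire()
        os._exit(1)

    os.waitpid(pid, 0)  # the holder is gone now

    # A live waiter still cannot get the slot, since nothing posted it.
    got_slot = sem.acquire(timeout=timeout)
    return not got_slot
```

If this returns True, the slot was lost together with the crashed holder, which is the case a 'reaper' would have to detect and repair.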

A trivial option would be to use lockfiles.  Basically, we create a lockfile for each allowed job and try to lock them non-blocking in a waiting loop.  The disadvantage is that polling introduces arbitrary delays while waiting for a lock, and I don't see a good way to keep lock acquisition fast and CPU utilization low without adding complexity.
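A minimal sketch of the lockfile variant, assuming flock(2)-style locks. The directory layout, the slot-N.lock file names, the function names and the poll interval are all hypothetical; nothing here is existing Portage code.

```python
import fcntl
import os
import time


def try_acquire_slot(lock_dir, max_jobs):
    """Try each of the max_jobs lockfiles once; return an fd or None."""
    os.makedirs(lock_dir, exist_ok=True)
    for i in range(max_jobs):
        path = os.path.join(lock_dir, f"slot-{i}.lock")  # hypothetical naming
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Non-blocking exclusive lock: fails at once if the slot is taken.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return fd
        except BlockingIOError:
            os.close(fd)
    return None


def acquire_slot(lock_dir, max_jobs, poll_interval=0.05):
    """Waiting loop: poll until some slot is free, then return its fd."""
    while True:
        fd = try_acquire_slot(lock_dir, max_jobs)
        if fd is not None:
            return fd
        # This sleep is the arbitrary delay: a shorter interval reacts
        # faster but burns more CPU.
        time.sleep(poll_interval)


def release_slot(fd):
    """Closing the descriptor drops the lock."""
    os.close(fd)
```

A wrapper would call acquire_slot() before running the real compiler and release_slot() afterwards. Unlike a semaphore, the kernel drops the lock when the holder exits, so a crashed wrapper does not leak its slot.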

Finally, we could use a client-server layout, with the server being started by the first emerge process and its clients requesting semaphore locks via a UNIX socket.  This has the advantage that we can use socket connections to release resources automatically, but it might be more complex than the other solutions.
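A minimal sketch of the client-server variant, using threads and a stream socket. The one-byte "grant" protocol and the names serve_tokens and request_slot are assumptions made for illustration; how the first emerge process would start and stop the server is left out.

```python
import socket
import threading


def serve_tokens(path, max_jobs):
    """Start a slot server on a UNIX socket; return the listening socket."""
    slots = threading.BoundedSemaphore(max_jobs)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen()

    def handle(conn):
        with conn:
            slots.acquire()  # wait for a free slot
            try:
                conn.sendall(b"1")  # grant it
                # recv() returns b"" on EOF, i.e. once the client closes
                # the connection, exits or crashes.
                while conn.recv(1):
                    pass
            except OSError:
                pass  # peer vanished; treat it like EOF
            finally:
                slots.release()

    def accept_loop():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server


def request_slot(path, timeout=None):
    """Connect and wait for a grant; keep the socket open while working."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        if client.recv(1) != b"1":
            raise ConnectionError("server closed without granting a slot")
    except BaseException:
        client.close()
        raise
    client.settimeout(None)
    return client
```

A wrapper would hold the socket returned by request_slot() for as long as the compiler runs. Closing it, or simply dying, gives the slot back with no extra cleanup, which is the automatic release mentioned above.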