Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 914876 - sys-apps/portage: eliminate unsafe os.fork() usage
Summary: sys-apps/portage: eliminate unsafe os.fork() usage
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: Portage team
URL: https://github.com/python/cpython/iss...
Whiteboard:
Keywords:
Depends on: 915099 915119 915123 915136 915834 915896 915903 916106 916108 916112 916139 916141 916142 916231 916235 916240 916242 916245 916247 916248 916566 916601 923841 923847 924273 924313 924416
Blocks:
  Show dependency tree
 
Reported: 2023-09-28 05:01 UTC by Zac Medico
Modified: 2024-03-04 05:13 UTC (History)
1 user (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Zac Medico gentoo-dev 2023-09-28 05:01:47 UTC
In https://github.com/python/cpython/issues/84559 they've indicated that 'spawn' will be the default multiprocessing start method in python 3.14, since 'fork' is unsafe in threaded processes. We could try using 'forkserver', but 'spawn' is nice in the sense that it is cross-platform.

The portage unit tests currently emit this DeprecationWarning, but only when you run them all at together, so that a fork is triggered while a thread is running in the background:

> lib/portage/tests/util/futures/asyncio/test_event_loop_in_fork.py: 1 warning
>   /usr/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=57534) is multi-threaded, use of fork() may lead to deadlocks in the child.
>     self.pid = os.fork()
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-09-28 05:09:09 UTC
We started to hit problems with this in the past as macOS changed first: bug 758230.
Comment 2 Zac Medico gentoo-dev 2023-09-28 05:47:27 UTC

The portage.getpid() implementation will need to adjust to the post-fork world:

lib/portage/__init__.py:multiprocessing.util.register_after_fork(_ForkWatcher, _ForkWatcher.hook)

The multiprocessing.util.register_after_fork function is undocumented, so while we're still using forks it would be better to use os.register_at_fork:

https://docs.python.org/3/library/os.html#os.register_at_fork

(In reply to Sam James from comment #1)
> We started to hit problems with this in the past as macOS changed first: bug
> 758230.

Ah yes, prefix has explicitly set the multiprocessing start method to fork:

> commit 13b2f0d1edd4f70c3bf776503cdc505ccf04f2d3
> Author: Fabian Groffen <grobian@gentoo.org>
> Date:   2021-01-04 13:06:25 +0100
> 
>     lib/portage: add more definite fix for macOS multiprocessing default
>     
>     force 'fork' method for as long as we get pickle errors whilst using
>     'spawn'
>     
>     Bug: https://bugs.gentoo.org/758230
>     Signed-off-by: Fabian Groffen <grobian@gentoo.org>
> 
> diff --git a/lib/portage/__init__.py b/lib/portage/__init__.py
> index a5cc8fb08..ec1b50107 100644
> --- a/lib/portage/__init__.py
> +++ b/lib/portage/__init__.py
> @@ -21,0 +22,2 @@ try:
> +       # PREFIX LOCAL
> +       import multiprocessing
> @@ -40,0 +43,10 @@ except ImportError as e:
> +# BEGIN PREFIX LOCAL
> +# for bug #758230, on macOS the default was switched from fork to spawn,
> +# the latter causing issues because all kinds of things can't be
> +# pickled, so force fork mode for now
> +try:
> +       multiprocessing.set_start_method('fork')
> +except RuntimeError:
> +       pass
> +# END PREFIX LOCAL
Comment 3 Zac Medico gentoo-dev 2023-09-29 05:17:06 UTC
In order to test the multiprocessing spawn start method, I ran lib/portage/tests/emerge/test_simple.py with this added to the top of bin/emerge and bin/ebuild:

> import multiprocessing
> multiprocessing.set_start_method("spawn", force=True)

Since ForkProcess uses the instance method self._bootstrap as the multiprocessing.Process target function, the spawn start method tries to pickle the whole ForkProcess instance, which refers back to the Scheduler instance. The Scheduler instance has a threading.RLock instance that fails to pickle like this:

> Traceback (most recent call last):
>   File "/usr/lib/python3.12/asyncio/events.py", line 84, in _run
>     self._context.run(self._callback, *self._args)
>   File "portage/lib/_emerge/EbuildPhase.py", line 591, in <lambda>
>     lambda future: future.cancelled() or future.result()
>                                          ^^^^^^^^^^^^^^^
>   File "portage/lib/_emerge/EbuildPhase.py", line 604, in _soname_deps_qa
>     all_provides = await self.scheduler.run_in_executor(
>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3.12/asyncio/base_events.py", line 840, in run_in_executor
>     executor.submit(func, *args), loop=self)
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "portage/lib/portage/util/futures/executor/fork.py", line 46, in submit
>     self._schedule()
>   File "portage/lib/portage/util/futures/executor/fork.py", line 62, in _schedule
>     proc.start()
>   File "portage/lib/_emerge/AsynchronousTask.py", line 34, in start
>     self._start()
>   File "portage/lib/portage/util/_async/AsyncFunction.py", line 40, in _start
>     ForkProcess._start(self)
>   File "portage/lib/_emerge/SpawnProcess.py", line 125, in _start
>     retval = self._spawn(self.args, **kwargs)
>              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "portage/lib/portage/util/_async/ForkProcess.py", line 46, in _spawn
>     self._proc.start()
>   File "/usr/lib/python3.12/multiprocessing/process.py", line 121, in start
>     self._popen = self._Popen(self)
>                   ^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3.12/multiprocessing/context.py", line 224, in _Popen
>     return _default_context.get_context().Process._Popen(process_obj)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3.12/multiprocessing/context.py", line 289, in _Popen
>     return Popen(process_obj)
>            ^^^^^^^^^^^^^^^^^^
>   File "/usr/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 32, in __init__
>     super().__init__(process_obj)
>   File "/usr/lib/python3.12/multiprocessing/popen_fork.py", line 19, in __init__
>     self._launch(process_obj)
>   File "/usr/lib/python3.12/multiprocessing/popen_spawn_posix.py", line 47, in _launch
>     reduction.dump(process_obj, fp)
>   File "/usr/lib/python3.12/multiprocessing/reduction.py", line 60, in dump
>     ForkingPickler(file, protocol).dump(obj)
> TypeError: cannot pickle '_thread.RLock' object

In order to avoid this, ForkProcess._bootstrap can be converted to a staticmethod. Also, _bootstrap tries to execute another instance method, self._run, so that needs to be changed to a callable that contains no self reference.
Comment 4 Zac Medico gentoo-dev 2023-09-29 05:33:16 UTC
PEP 703 proposes to make the Global Interpreter Lock optional in python 3.13, so threading may eventually become a more convenient alternative to multiprocessing.
Comment 5 Larry the Git Cow gentoo-dev 2023-10-15 22:02:49 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=3e38ae92bdd5b057352a2bcb044fb587b15b25f3

commit 3e38ae92bdd5b057352a2bcb044fb587b15b25f3
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2023-10-15 19:21:44 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2023-10-15 21:38:08 +0000

    MergeProcess: Eliminate target arguments that reference self
    
    This improves compatibility with the multiprocessing spawn
    start method, by eliminating this error:
    
    AttributeError: Can't pickle local object 'MergeProcess._start.<locals>.<lambda>'
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/dbapi/_MergeProcess.py | 40 +++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 20 deletions(-)
Comment 6 Larry the Git Cow gentoo-dev 2023-10-24 15:31:37 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=79e7024147faff511a69ac725ba63ba71a474aee

commit 79e7024147faff511a69ac725ba63ba71a474aee
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2023-10-24 03:46:19 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2023-10-24 04:03:46 +0000

    EbuildFetcher: multiprocessing spawn compat
    
    Tested with this script:
    
    import contextlib
    import multiprocessing
    import unittest
    
    from portage.tests.conftest import prepare_environment
    from portage.tests.ebuild.test_fetch import EbuildFetchTestCase
    
    if __name__ == "__main__":
        multiprocessing.set_start_method("spawn", force=True)
        with contextlib.contextmanager(prepare_environment.__wrapped__)():
            unittest.main()
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/_emerge/EbuildFetcher.py           | 12 ++++--------
 lib/portage/tests/ebuild/test_fetch.py | 24 ++++++++++++++----------
 2 files changed, 18 insertions(+), 18 deletions(-)
Comment 7 Larry the Git Cow gentoo-dev 2023-10-31 02:41:52 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=255c7d5c430d445ee43ceda2f6bf716c75710f23

commit 255c7d5c430d445ee43ceda2f6bf716c75710f23
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2023-10-31 02:01:49 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2023-10-31 02:06:18 +0000

    sqlite: multiprocessing spawn compat
    
    Override __getstate__ to omit unpicklable attributes, and
    regenerate the unpicklable attributes after unpickling.
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/cache/sqlite.py           | 15 ++++++++++++++-
 lib/portage/tests/dbapi/test_auxdb.py |  2 +-
 2 files changed, 15 insertions(+), 2 deletions(-)
Comment 8 Larry the Git Cow gentoo-dev 2024-02-05 06:44:25 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=cba5d579f170ee9616b1903dedc3597eafb1aee7

commit cba5d579f170ee9616b1903dedc3597eafb1aee7
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-02-05 06:38:52 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-02-05 06:43:11 +0000

    bin/fixpackages: multiprocessing spawn compat
    
    Use __main__ to avoid this RuntimeError:
    
    RuntimeError:
            An attempt has been made to start a new process before the
            current process has finished its bootstrapping phase.
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/fixpackages | 71 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 38 insertions(+), 33 deletions(-)
Comment 9 Larry the Git Cow gentoo-dev 2024-02-05 22:56:10 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=a5d92fafab93f62e5b28b97a443e0e06a368251d

commit a5d92fafab93f62e5b28b97a443e0e06a368251d
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-02-05 22:51:47 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-02-05 22:54:04 +0000

    tests/bin/setup_env.py: multiprocessing spawn compat
    
    Replace unpicklable local pre_exec function with spawn
    cwd argument.
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/tests/bin/setup_env.py | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
Comment 10 Larry the Git Cow gentoo-dev 2024-02-07 00:32:11 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=b419577c2e5219af9d10c9856449e23fa4d87b24

commit b419577c2e5219af9d10c9856449e23fa4d87b24
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-02-06 05:56:57 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-02-06 08:05:01 +0000

    actions: Add muliprocessing start-method to matrix
    
    Add multiprocessing start-method to matrix so that we are prepared
    for when python changes to default to spawn. Exclude all python
    versions except 3.12-dev for now.
    
    This adds a conditional ci step that patches the sources just for
    the spawn start-method. It patches all bin/* scripts with python
    shebangs, and also patches the test command in meson.build.
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 .github/workflows/ci.yml | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
Comment 11 Larry the Git Cow gentoo-dev 2024-02-07 15:32:00 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=ca8d46112c5b0f7ab9f298915b1cee5a39523173

commit ca8d46112c5b0f7ab9f298915b1cee5a39523173
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-02-07 04:33:16 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-02-07 15:31:30 +0000

    LockNonblockTestCase: Use multiprocessing.Process instead of os.fork()
    
    Since os.fork() is unsafe in threaded processes, use
    multiprocessing.Process instead. This way the fork will
    be eliminated when the default multiprocessing start
    method changes to "spawn".
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/tests/locks/test_lock_nonblock.py | 52 +++++++++++++--------------
 1 file changed, 26 insertions(+), 26 deletions(-)
Comment 12 Zac Medico gentoo-dev 2024-02-09 08:43:55 UTC
Since bug 923841 is fixed, all the unsafe os.fork() usage is gone except for one in lib/portage/util/_pty.py which only triggers for fbsd, and the result is cached so it doesn't happen more than once.

I'm not counting the os.fork() call as unsafe inside portage.process._exec because it's isolated inside a fresh child process that should not have any threads running.
Comment 13 Larry the Git Cow gentoo-dev 2024-02-13 07:52:41 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=3110ec376cbcb1f5b7fb82ba30ec958798bb32cf

commit 3110ec376cbcb1f5b7fb82ba30ec958798bb32cf
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-02-13 00:46:52 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-02-13 05:04:40 +0000

    actions: Fix interaction between start-method and pytest-xdist
    
    Use portage.test.TestCase setUp method to setup the multiprocessing
    start method if needed. It needs to be done relatively late in order
    to work with the pytest-xdist plugin due to execnet usage.
    
    Bug: https://bugs.gentoo.org/914876
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 .github/workflows/ci.yml                                |  2 +-
 lib/portage/tests/__init__.py                           | 12 +++++++++++-
 lib/portage/tests/env/config/test_PortageModulesFile.py |  3 ++-
 lib/portage/tests/news/test_NewsItem.py                 |  3 ++-
 lib/portage/tests/sets/files/test_config_file_set.py    |  3 ++-
 lib/portage/tests/sets/files/test_static_file_set.py    |  3 ++-
 lib/portage/tests/sets/shell/test_shell.py              |  4 ++--
 lib/portage/tests/util/futures/test_retry.py            |  1 +
 8 files changed, 23 insertions(+), 8 deletions(-)
Comment 14 Zac Medico gentoo-dev 2024-02-28 06:41:59 UTC
The CI runs for start-method spawn are now free of os.fork() DeprecationWarning messages, since this commit:

https://github.com/gentoo/portage/actions/runs/8076223928/job/22064346335

https://gitweb.gentoo.org/proj/portage.git/commit/?id=e882b1e956d50808a0143875a8ca35f2fc21f368

commit e882b1e956d50808a0143875a8ca35f2fc21f368
Author: Zac Medico <zmedico@gentoo.org>
Date:   2024-02-27 22:25:45 -0800

    UnshareNetTestCase: Initialize ABILITY_TO_UNSHARE in setUp
    
    Initialize ABILITY_TO_UNSHARE in setUp so that _unshare_validate
    uses the correct PORTAGE_MULTIPROCESSING_START_METHOD setup
    from super().setUp(), eliminating messages like this from CI
    runs for the multiprocessing start method spawn:
    
        /opt/hostedtoolcache/Python/3.12.2/x64/lib/python3.12/multiprocessing/popen_fork.py:66:
        DeprecationWarning: This process (pid=2886) is multi-threaded, use of fork() may lead to deadlocks in the child.
    
    The cause of these messages can be traced by patching python's
    multiprocessing popen_fork.py like this:
    
    --- /usr/lib/python3.12/multiprocessing/popen_fork.py
    +++ /usr/lib/python3.12/multiprocessing/popen_fork.py
    @@ -2,2 +2,3 @@
     import signal
    +import traceback
    
    @@ -62,2 +63,3 @@
         def _launch(self, process_obj):
    +        traceback.print_stack()
             code = 1
    
    Fixes: 3110ec376cbc ("actions: Fix interaction between start-method and pytest-xdist")
    Signed-off-by: Zac Medico <zmedico@gentoo.org>