Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 937740 - sys-apps/portage: Support coroutine exitfuncs for non-main loops that were instantiated for non-main threads
Summary: sys-apps/portage: Support coroutine exitfuncs for non-main loops that were in...
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Depends on: 933499
Blocks:
  Show dependency tree
 
Reported: 2024-08-11 01:49 UTC by Zac Medico
Modified: 2024-09-11 01:30 UTC (History)
1 user (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Zac Medico gentoo-dev 2024-08-11 01:49:49 UTC
In the wake of bug 937384 I tested to see if the socks5 proxy would shut down correctly when started via a non-main thread, and found that it did not, causing it to hang in the multiprocessing _exit_function atexit hook:

  /usr/lib/python3.12/multiprocessing/util.py(360)_exit_function()
-> p.join()
  /usr/lib/python3.12/multiprocessing/process.py(149)join()
-> res = self._popen.wait(timeout)
  /usr/lib/python3.12/multiprocessing/popen_fork.py(43)wait()
-> return self.poll(os.WNOHANG if timeout == 0.0 else 0)
  /usr/lib/python3.12/multiprocessing/popen_fork.py(27)poll()
-> pid, sts = os.waitpid(self.pid, flag)
Comment 1 Zac Medico gentoo-dev 2024-08-11 02:04:03 UTC
This is a variant of the script from Socks5ServerAtExitTestCase which triggers the hang because there is no "main" loop to call run_coroutine_exitfuncs:

import pdb
import signal
import threading

import portage
from portage.util import socks5
from portage.util._eventloop.global_event_loop import global_event_loop


def debug_signal(_signum, _frame):
    pdb.set_trace()


def main():
    socks5.get_socks5_proxy(portage.settings)
    global_event_loop().run_until_complete(socks5.proxy.ready())
    print(socks5.proxy._proc.pid, flush=True)


if __name__ == "__main__":
    signal.signal(signal.SIGUSR1, debug_signal)
    t = threading.Thread(target=main)
    t.start()
    t.join()
    print("exit", flush=True)
Comment 2 Zac Medico gentoo-dev 2024-08-11 02:32:28 UTC
We can change _coroutine_exithandlers to a dict with a separate list per thread, and then use that to decide which loop will run those coroutine exit handlers from the _thread_weakrefs_atexit hook.
Comment 3 Zac Medico gentoo-dev 2024-08-11 04:50:58 UTC
(In reply to Zac Medico from comment #2)
> We can change _coroutine_exithandlers to a dict with a separate list per
> thread, and then use that to decide which loop will run those coroutine exit
> handlers from the _thread_weakrefs_atexit hook.

Since _thread_weakrefs_atexit runs in the main thread, the _thread_weakrefs.loops mapping between threads and loops is practically irrelevant and this point, so we need to stop using threading.get_ident() in _safe_loop() while atexit hooks are running.
Comment 4 Larry the Git Cow gentoo-dev 2024-08-13 21:03:10 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=cb0c09d8cecbcc086786e3e2c7cdd8ffc023a48a

commit cb0c09d8cecbcc086786e3e2c7cdd8ffc023a48a
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-08-11 07:50:49 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-08-11 07:50:49 +0000

    Support coroutine exitfuncs for non-main loops
    
    Since an API consumer can cause loops to be instantiated
    for non-main threads, support coroutine exitfuncs for each
    loop. The included Socks5ServerAtExitThreadedTestCase calls
    get_socks5_proxy from a non-main thread, and demonstrates
    that coroutine exitfuncs for the resulting non-main loop
    will reliably stop the socks5 proxy via atexit hook.
    
    The _thread_weakrefs_atexit function will now make a
    temporary adjustment to _thread_weakrefs.loops so that a
    loop is associated with the current thread when it is
    closing. Also, the _get_running_loop function will now
    store weak references to all _AsyncioEventLoop instances
    it creates, since each has a _coroutine_exithandlers
    attribute that can be modified by atexit_register calls.
    
    Bug: https://bugs.gentoo.org/937740
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/process.py                            | 11 ++++--
 lib/portage/tests/util/test_socks5.py             | 38 +++++++++++++++------
 lib/portage/util/_eventloop/asyncio_event_loop.py | 15 ++-------
 lib/portage/util/futures/_asyncio/__init__.py     | 41 ++++++++++++++++++++---
 4 files changed, 76 insertions(+), 29 deletions(-)
Comment 5 Larry the Git Cow gentoo-dev 2024-08-19 14:49:13 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=e97acd3c62ff02eb41ff643e75eb5e07c27717f8

commit e97acd3c62ff02eb41ff643e75eb5e07c27717f8
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2024-08-18 14:59:46 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2024-08-18 15:46:45 +0000

    _wrap_loop: Prevent redundant AsyncioEventLoop instances
    
    Ultimately the loop arguments that necessitate the _wrap_loop
    function can be removed, because our aim since bug 761538 should
    be to eliminate them. Meanwhile, we don't want _wrap_loop to return
    redundant AsyncioEventLoop instances if we can easily prevent it.
    
    Therefore, use _safe_loop(create=False) to look up the AsyncioEventLoop
    instance associated with the current thread, and avoid creating
    redundant instances. This serves to guard against garbage collection
    of AsyncioEventLoop instances which may have _coroutine_exithandlers
    added by the atexit_register function since commit cb0c09d8cecb from
    bug 937740.
    
    If _safe_loop(create=False) fails to associate a loop with the current
    thread, raise an AssertionError for portage internal API consumers.
    It's not known whether external API consumers will trigger this case,
    so if it happens then emit a UserWarning and return a temporary
    AsyncioEventLoop instance.
    
    Fixes: cb0c09d8cecb ("Support coroutine exitfuncs for non-main loops")
    Bug: https://bugs.gentoo.org/938127
    Bug: https://bugs.gentoo.org/937740
    Bug: https://bugs.gentoo.org/761538
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/util/futures/_asyncio/__init__.py | 43 +++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 5 deletions(-)
Comment 6 Larry the Git Cow gentoo-dev 2024-09-11 01:30:59 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=02d0e00a1ba811b39140d10e17488f7fc3916534

commit 02d0e00a1ba811b39140d10e17488f7fc3916534
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2024-09-11 01:30:10 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-09-11 01:30:30 +0000

    sys-apps/portage: add 3.0.66
    
    Closes: https://bugs.gentoo.org/435066
    Closes: https://bugs.gentoo.org/907061
    Closes: https://bugs.gentoo.org/910560
    Closes: https://bugs.gentoo.org/933433
    Closes: https://bugs.gentoo.org/934220
    Closes: https://bugs.gentoo.org/934514
    Closes: https://bugs.gentoo.org/934784
    Closes: https://bugs.gentoo.org/935830
    Closes: https://bugs.gentoo.org/936273
    Closes: https://bugs.gentoo.org/937384
    Closes: https://bugs.gentoo.org/937485
    Closes: https://bugs.gentoo.org/937740
    Closes: https://bugs.gentoo.org/937888
    Closes: https://bugs.gentoo.org/937891
    Closes: https://bugs.gentoo.org/938127
    Closes: https://bugs.gentoo.org/933499
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-apps/portage/Manifest              |   1 +
 sys-apps/portage/portage-3.0.66.ebuild | 227 +++++++++++++++++++++++++++++++++
 2 files changed, 228 insertions(+)