Created attachment 617966 [details] emerge --info I have a prefix where portage hangs on pkg_postinst(). Happens with a few of the latest versions as of recently. Could be due to an old kernel, as this is with kernel 2.6.32-754.27.1.el6.x86_64 from Scientific Linux 6. The stack trace is shown below using Python 3.8, but I see the same with Python 3.6: #0 0x00007f7710c96a73 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84 #1 0x00007f77105bfa12 in select_epoll_poll_impl (self=0x7f770f989e10, maxevents=<optimized out>, timeout_obj=<optimized out>) at /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/src/debug/dev-lang/python-3.8.2/Python-3.8.2/Modules/selectmodule.c:1558 #2 select_epoll_poll (self=0x7f770f989e10, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/src/debug/dev-lang/python-3.8.2/Python-3.8.2/Modules/clinic/selectmodule.c.h:859 #3 0x00007f7710e107ec in method_vectorcall_FASTCALL_KEYWORDS (func=0x7f77105e8130, args=0x560804cb1ca8, nargsf=<optimized out>, kwnames=0x0) at Objects/descrobject.c:371 #4 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #5 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #6 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #7 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f77105f1660, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=2, kwnames=0x0, kwargs=0x560804e8d130, kwcount=0, kwstep=1, defs=0x7f77105f3178, defcount=1, kwdefs=0x0, closure=0x0, name=0x7f77105d36f0, qualname=0x7f77105f0620) at Python/ceval.c:4298 #8 0x00007f7710e06839 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #9 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #10 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #11 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #12 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>) at Objects/call.c:283 #13 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #14 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #15 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #16 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>) at Objects/call.c:283 #17 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #18 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #19 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #20 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=2, globals=<optimized out>) at Objects/call.c:283 #21 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #22 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #23 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #24 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=2, globals=<optimized out>) at Objects/call.c:283 #25 0x00007f7710e09efb in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=0x7f770c0f7f38, callable=0x7f770fce7ee0) at ./Include/cpython/abstract.h:127 #26 method_vectorcall (method=<optimized out>, args=0x7f770c0f7f40, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/classobject.c:60 #27 0x00007f7710de21c2 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #28 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #29 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3469 #30 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>) at Objects/call.c:283 #31 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #32 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #33 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #34 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f770f650a80, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x0, kwargs=0x560805027cb8, kwcount=0, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x7f770f64d930, qualname=0x7f770f64fc10) at Python/ceval.c:4298 #35 0x00007f7710e06839 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #36 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #37 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #38 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #39 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f770f6500e0, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x0, kwargs=0x560804786930, kwcount=0, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x7f771074a5f0, qualname=0x7f770f64dab0) at Python/ceval.c:4298 #40 0x00007f7710e06839 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #41 0x00007f7710ddcef3 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #42 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #43 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3486 #44 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f770f8ce190, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x7f770f8c3238, kwargs=0x5608047d4640, kwcount=1, kwstep=1, defs=0x7f770f6b3dd8, defcount=6, kwdefs=0x0, closure=0x0, name=0x7f770f8cf170, qualname=0x7f770f8cf170) at Python/ceval.c:4298 #45 0x00007f7710e06839 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #46 0x00007f7710ddc867 in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #47 call_function (kwnames=0x7f770f8c3220, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #48 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3515 #49 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f770f956710, globals=<optimized out>, locals=locals@entry=0x0, args=<optimized out>, argcount=1, kwnames=0x0, kwargs=0x560804448de8, kwcount=0, kwstep=1, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x7f770fcf20f0, qualname=0x7f770fcf20f0) at Python/ceval.c:4298 #50 0x00007f7710e06839 in _PyFunction_Vectorcall (func=<optimized out>, stack=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/call.c:435 #51 0x00007f7710ddcc6a in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) at ./Include/cpython/abstract.h:123 #52 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #53 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3500 #54 0x00007f7710dda883 in function_code_fastcall (co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>) at Objects/call.c:283 #55 0x00007f7710ddcc6a in _PyObject_Vectorcall (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>) --Type <RET> for more, q to quit, c to continue without paging-- at ./Include/cpython/abstract.h:123 #56 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x56080426ca80) at Python/ceval.c:4987 #57 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3500 #58 0x00007f7710ecb4e7 in _PyEval_EvalCodeWithName (_co=_co@entry=0x7f77107850e0, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40, args=args@entry=0x0, argcount=argcount@entry=0, kwnames=kwnames@entry=0x0, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:4298 #59 0x00007f7710ecb7cd in PyEval_EvalCodeEx (_co=_co@entry=0x7f77107850e0, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:4327 #60 0x00007f7710ecb7ff in PyEval_EvalCode (co=co@entry=0x7f77107850e0, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40) at Python/ceval.c:718 #61 0x00007f7710f015fa in run_eval_code_obj (co=co@entry=0x7f77107850e0, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40) at Python/pythonrun.c:1125 #62 0x00007f7710f016b5 in run_mod (mod=mod@entry=0x5608043078d0, filename=filename@entry=0x7f7710775ed0, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40, flags=flags@entry=0x7ffed8708298, arena=arena@entry=0x7f77108bccf0) at Python/pythonrun.c:1147 #63 0x00007f7710f040d2 in PyRun_FileExFlags (fp=fp@entry=0x5608042a3880, filename_str=filename_str@entry=0x7f7710776150 "/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge", start=start@entry=257, globals=globals@entry=0x7f77108c3c40, locals=locals@entry=0x7f77108c3c40, closeit=closeit@entry=1, flags=0x7ffed8708298) at Python/pythonrun.c:1063 #64 0x00007f7710f04240 in PyRun_SimpleFileExFlags (fp=fp@entry=0x5608042a3880, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffed8708298) at Python/pythonrun.c:428 #65 0x00007f7710f04721 in PyRun_AnyFileExFlags (fp=fp@entry=0x5608042a3880, filename=<optimized out>, closeit=closeit@entry=1, flags=flags@entry=0x7ffed8708298) at Python/pythonrun.c:86 #66 0x00007f7710f1edfd in pymain_run_file (cf=0x7ffed8708298, config=0x56080426ab00) at Modules/main.c:381 #67 pymain_run_python (exitcode=0x7ffed8708290) at Modules/main.c:565 #68 Py_RunMain () at Modules/main.c:644 #69 0x00007f7710f1f179 in pymain_main (args=args@entry=0x7ffed8708380) at Modules/main.c:674 #70 0x00007f7710f1f24a in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:698 #71 0x00007f7710bb7156 in __libc_start_main (main=0x560802aa4050 <main>, argc=5, argv=0x7ffed87084a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffed8708498) at ../csu/libc-start.c:295 #72 0x0000560802aa408a in _start () at ../sysdeps/x86_64/start.S:120 The call to epoll_wait is from this loop in Python: do { Py_BEGIN_ALLOW_THREADS errno = 0; nfds = epoll_wait(self->epfd, evs, maxevents, (int)ms); // this line Py_END_ALLOW_THREADS if (errno != EINTR) break; /* poll() was interrupted by a signal */ if (PyErr_CheckSignals()) goto error; if (timeout >= 0) { timeout = deadline - _PyTime_GetMonotonicClock(); if (timeout < 0) { nfds = 0; break; } ms = _PyTime_AsMilliseconds(timeout, _PyTime_ROUND_CEILING); /* retry epoll_wait() with the recomputed timeout */ } } while(1); Please let me know how else I can test/debug this. It would be nice to be able to fix it.
If it's anything like bug 711322, then the --load-average option might cause it to un-hang itself (it has a 30 second time that will wake it up). In cases like this it's also possible to send the main emerge process a SIGUSR1 signal in order to get a pdb prompt, then move up the stack to the scope of the Scheduler class and call self._schedule() to un-hang it.
I am already using --load-average, so the hang happens with or without it. Here is a stack trace from pdb in any case: >>> Completed installing dev-db/sqlite-3.31.1 into /tmp/portage/dev-db/sqlite-3.31.1/image * Final size of build directory: 28960 KiB (28.2 MiB) * Final size of installed tree: 2340 KiB ( 2.2 MiB) --Return-- > /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(30)debug_signal()->None -> pdb.set_trace() (Pdb) bt /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(53)<module>() -> retval = emerge_main() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py(1309)emerge_main() -> return run_action(emerge_config) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py(3374)run_action() -> retval = action_build(emerge_config, spinner=spinner) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py(564)action_build() -> retval = mergetask.merge() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1020)merge() -> rval = self._merge() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1414)_merge() -> self._main_loop() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1390)_main_loop() -> self._event_loop.run_until_complete(self._main_exit) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/util/_eventloop/asyncio_event_loop.py(127)_run_until_complete() -> return self._loop.run_until_complete(future) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(603)run_until_complete() -> self.run_forever() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(570)run_forever() -> self._run_once() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(1823)_run_once() -> event_list = self._selector.select(timeout) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/selectors.py(468)select() -> fd_event_list = self._selector.poll(timeout, max_ev) > /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(30)debug_signal()->None -> pdb.set_trace() for portage-3.2.84, the stack trace is --Return-- > /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(30)debug_signal()->None -> pdb.set_trace() (Pdb) (Pdb) bt /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(53)<module>() -> retval = emerge_main() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/main.py(1309)emerge_main() -> return run_action(emerge_config) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py(3358)run_action() -> retval = action_build(emerge_config, spinner=spinner) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/actions.py(563)action_build() -> retval = mergetask.merge() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1020)merge() -> rval = self._merge() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1414)_merge() -> self._main_loop() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/_emerge/Scheduler.py(1390)_main_loop() -> self._event_loop.run_until_complete(self._main_exit) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/site-packages/portage/util/_eventloop/asyncio_event_loop.py(140)_run_until_complete() -> return self._loop.run_until_complete(future) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(603)run_until_complete() -> self.run_forever() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(570)run_forever() -> self._run_once() /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/asyncio/base_events.py(1823)_run_once() -> event_list = self._selector.select(timeout) /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python3.8/selectors.py(468)select() -> fd_event_list = self._selector.poll(timeout, max_ev) > /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/lib/python-exec/python3.8/emerge(30)debug_signal()->None -> pdb.set_trace() (Pdb) I didn't manage to unhang portage-2.3.93 by calling self._schedule() either. Even calling just ebuild portage-2.3.93.ebuild install already causes the hang for me. All versions of portage currently in the tree show this behavior, so I wonder if the machine I'm running this on is not the problem...
Typically this type of hang indicates that the stdout/stderr file descriptors of the ebuild process have not reached EOF, which can be due to some process running in the background which has inherited these file descriptors. Does this issue affect all ebuilds? Do you have anything in a bashrc like ${EPREFIX}/etc/portage/bashrc or in ${EPREFIX}/etc/portage/env/* that might leave a process running in the background? Can you see any processes running in the background, using tools like ps? Typically I would use a command like `ps axf` to view the process tree, but it's possible that the process would be a child of pid 1 if it has daemonized itself. Over the years we have come up with 3 mechanisms to combat this kind issue: 1) The sys-apps/portage ebuilds have an ipc USE flag which enables an ipc call for the ebuild to indicate when a given phase has completed. The ipc USE flag is sometimes disabled for prefix installations, so you might try enabling it if possible. 2) FEATURES=cgroup (I wouldn't recommend this one since it's not really maintained and could have various issues like for example it is incompatible with systemd) 3) FEATURES=pid-sandbox (requires root privileges)
This is what I have in FEATURES: gentoo ~ $ portageq envvar FEATURES assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles force-prefix ipc-sandbox merge-sync multilib-strict network-sandbox parallel-fetch pid-sandbox preserve-libs protect-owned sfperms strict unknown-features-warn unmerge-logs unmerge-orphans unprivileged And how portage is configured: gentoo ~ $ emerge -pv1 --nodeps portage These are the packages that would be merged, in order: [ebuild U ] sys-apps/portage-2.3.94::gentoo [2.3.84-r1::gentoo] USE="(ipc) -apidoc% -build -doc -gentoo-dev (-native-extensions) -rsync-verify (-selinux) -xattr (-epydoc%)" PYTHON_TARGETS="python3_6 python3_7 python3_8 -pypy3 -python2_7" 0 KiB I rebooted the machine, and no suspicious process is running. At the point ebuild gets stuck, I see this process tree: 2720 pts/0 Ss 0:00 | \_ -bash 2767 pts/0 S 0:00 | \_ /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/bin/bash /home/jenkins/bin/gentoo 2768 pts/0 S 0:00 | \_ /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/bin/bash -l 2839 pts/0 S+ 0:01 | \_ /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/bin/python3.6 -b /cvmfs/sft.cern.ch/lcg/contrib/gento 7577 pts/0 S+ 0:00 | \_ /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/bin/bash /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/ 9317 pts/0 S+ 0:00 | \_ /cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/bin/bash /cvmfs/sft.cern.ch/lcg/contrib/gentoo/li 9332 pts/0 S+ 0:00 | \_ mkfifo -m 600 /tmp/portage/sys-apps/portage-2.3.94/temp/multijob.CuM3SW So, it seems that mkfifo might be involved. This is prefix, so I cannot try things that need root privileges. If I ctrl-C then to ebuild ... install again, then it thinks installation finished. After that, I can use ebuild ... qmerge to finish installing the package. I still don't know how this started to happen. The issue affects all ebuilds.
Killing the mkfifo points to portage getting stuck here: https://gitweb.gentoo.org/proj/portage.git/tree/bin/helper-functions.sh#n25
So, turns out that it's mkfifo that hangs on a call to open(..., O_RDONLY): $ strace mkfifo -m 600 /tmp/myfifo execve("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/bin/mkfifo", ["mkfifo", "-m", "600", "/tmp/myfifo"], [/* 45 vars */]) = 0 brk(0) = 0x55b786134000 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f213b2ec000 access("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=554196, ...}) = 0 mmap(NULL, 554196, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f213b264000 close(3) = 0 open("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\"\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=16585416, ...}) = 0 mmap(NULL, 1804472, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f213b0ab000 mprotect(0x7f213b0cb000, 1634304, PROT_NONE) = 0 mmap(0x7f213b0cb000, 1355776, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0x7f213b0cb000 mmap(0x7f213b216000, 274432, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16b000) = 0x7f213b216000 mmap(0x7f213b25a000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ae000) = 0x7f213b25a000 mmap(0x7f213b260000, 14520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f213b260000 close(3) = 0 mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f213b0a8000 arch_prctl(ARCH_SET_FS, 0x7f213b0a8700) = 0 mprotect(0x7f213b25a000, 16384, PROT_READ) = 0 mprotect(0x55b784f03000, 4096, PROT_READ) = 0 mprotect(0x7f213b314000, 4096, PROT_READ) = 0 munmap(0x7f213b264000, 554196) = 0 brk(0) = 0x55b786134000 brk(0x55b786155000) = 0x55b786155000 umask(0) = 022 umask(022) = 0 mknod("/tmp/myfifo", S_IFIFO|0600) = 0 open("/tmp/myfifo", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|0x200000 The mkfifo from the host does not hang (this is a prefix system). After downgrading coreutils to 8.31-r1, mkfifo stopped hanging, and strace now shows: $ strace mkfifo -m 600 /tmp/myfifo execve("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/usr/bin/mkfifo", ["mkfifo", "-m", "600", "/tmp/myfifo"], [/* 45 vars */]) = 0 brk(0) = 0x562b1ccbb000 mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb562343000 access("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=554196, ...}) = 0 mmap(NULL, 554196, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb5622bb000 close(3) = 0 open("/cvmfs/sft.cern.ch/lcg/contrib/gentoo/linux/x86_64/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\"\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=16585416, ...}) = 0 mmap(NULL, 1804472, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb562102000 mprotect(0x7fb562122000, 1634304, PROT_NONE) = 0 mmap(0x7fb562122000, 1355776, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x20000) = 0x7fb562122000 mmap(0x7fb56226d000, 274432, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16b000) = 0x7fb56226d000 mmap(0x7fb5622b1000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ae000) = 0x7fb5622b1000 mmap(0x7fb5622b7000, 14520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb5622b7000 close(3) = 0 mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb5620ff000 arch_prctl(ARCH_SET_FS, 0x7fb5620ff700) = 0 mprotect(0x7fb5622b1000, 16384, PROT_READ) = 0 mprotect(0x562b1c98f000, 4096, PROT_READ) = 0 mprotect(0x7fb56236b000, 4096, PROT_READ) = 0 munmap(0x7fb5622bb000, 554196) = 0 brk(0) = 0x562b1ccbb000 brk(0x562b1ccdc000) = 0x562b1ccdc000 umask(0) = 022 umask(022) = 0 mknod("/tmp/myfifo", S_IFIFO|0600) = 0 chmod("/tmp/myfifo", 0600) = 0 close(1) = 0 close(2) = 0 exit_group(0) = ? +++ exited with 0 +++ Portage now works fine after the downgrade.
In case anyone had the same problem, here is how I managed to fix it (on a gentoo-prefix). 1. Make a backup of your ${EPREFIX}/bin/mkfifo 2. Make a symbolic link of the mkfifo file in the host (typically at /usr/bin/mkfifo or /bin/mkfifo) to ${EPREFIX}/bin/mkfifo 3. Downgrade coreutils with emerge -v1 '=sys-apps/coreutils-8.30*' 4. Re-install coreutils with emerge sys-apps/coreutils I am not sure if step 3 is necessary.
Getting this in a non-prefix install.
...and rebuilding coreutils fixed it for me too. *shrug*
(In reply to Luke-Jr from comment #10) > ...and rebuilding coreutils fixed it for me too. *shrug* Sigh, spoke too soon. Only reproduces with the -m option Looks like 8.31 is using native lchmod, while 8.32 is using a (broken) substitute for some reason.
(In reply to Luke-Jr from comment #11) > Looks like 8.31 is using native lchmod, while 8.32 is using a (broken) substitute for some reason. Do you see this in the configure output? checking for lchmod... yes If not, it is probably using an implementation from gnulib as a fallback.
(In reply to Mike Gilbert from comment #12) > (In reply to Luke-Jr from comment #11) > > Looks like 8.31 is using native lchmod, while 8.32 is using a (broken) substitute for some reason. > > Do you see this in the configure output? > > checking for lchmod... yes > > If not, it is probably using an implementation from gnulib as a fallback. No, both 8.31 and 8.32 have "no" here. It does look like lchmod is coming from gnulib. In 8.31, it appears that it is #define'd to normal chmod...
Created attachment 670109 [details] build log
If you're still hitting this, please include your config.log (and ideally a fresh build.log & emerge --info), and share anything interesting about your machine.