Summary: >=sys-apps/portage-2.1.9/2.2._rc78 with USE=ipc: ebuild-ipc timed out if bashrcng shmfs-plugin is enabled.
Product: Gentoo Linux
Reporter: Marco Clocchiatti <ziapannocchia>
Component: [OLD] Core system
Assignee: Portage team <dev-portage>
Severity: normal
CC: disinbox, esigra, fabiano.francesconi, zeekec
Package list:
Runtime testing required: ---
Bug Depends on:
Description Marco Clocchiatti 2010-08-25 11:27:53 UTC
>=sys-apps/portage-126.96.36.199_rc68 : emerge hangs up if bashrcng is enabled.

bashrcng is from layman gechi overlay: http://gechi-overlay.sourceforge.net/
<source type="svn">https://gechi-overlay.svn.sourceforge.net/svnroot/gechi-overlay/overlay/</source>

Reproducible: Always

Steps to Reproduce:
1. layman -a gechi
2. emerge =app-portage/bashrcng-shmfs-1.2.6
3. eselect bashrcng set 1.1.4
4. eselect bashrcng enable shmfs
5. emerge any-atom

Actual Results:
emerge hangs up.

Expected Results:
emerge builds fine.
Comment 1 Marco Clocchiatti 2010-08-25 11:30:51 UTC
Created attachment 244521 [details] emerge --info
Comment 2 Marco Clocchiatti 2010-08-25 11:51:34 UTC
Created attachment 244527 [details] emerge data tail from emerge -f --nodeps sys-apps/portage >strace_stdout.txt 2>trace_stderror.txt
Comment 3 Marco Clocchiatti 2010-08-25 11:53:05 UTC
Created attachment 244529 [details] output from strace
Comment 4 Marco Clocchiatti 2010-08-25 11:54:58 UTC
Created attachment 244531 [details] ps aux|grep ebuild|grep -v grep sent after ctrl-c from previous strace command.
Comment 5 Marco Clocchiatti 2010-08-25 11:59:56 UTC
upstream related bug: http://sourceforge.net/tracker/?func=detail&aid=3052915&group_id=176946&atid=879268
Comment 6 Sebastian Luther (few) 2010-08-25 12:55:12 UTC
What exactly does this thing do?
Comment 7 Zac Medico 2010-08-25 15:01:58 UTC
Apparently bashrcng is corrupting $PORTAGE_BUILDDIR/.ipc_in, which is a fifo that emerge uses to listen for interprocess communication via the ebuild-ipc helper. So, how and why is bashrcng corrupting this fifo? I can guess that the "how" is that it is copying it to a separate filesystem, which would make it a new inode, and thus emerge would be listening at the old inode that's been lost. So, the main questions are why bashrcng does this and what alternatives there are to its current behavior.
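The guess above can be illustrated with a minimal Python sketch. It does not reproduce what bashrcng actually does (that is still an open question in this bug); it only assumes that the fifo path ends up pointing at a newly created fifo while a listener still holds the old one open. The function name and the temporary directory are hypothetical.

```python
import os
import tempfile


def fifo_replaced_under_listener():
    """Return (listener_inode, path_inode) after the fifo path is recreated."""
    with tempfile.TemporaryDirectory() as builddir:
        fifo_path = os.path.join(builddir, ".ipc_in")
        os.mkfifo(fifo_path)

        # The "listener" opens the fifo. O_NONBLOCK lets a read-only open
        # succeed even though no writer exists yet.
        listener_fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            # Assumption: stand in for "copied away and put back" by
            # removing the fifo and creating a fresh one at the same path.
            os.unlink(fifo_path)
            os.mkfifo(fifo_path)

            listener_inode = os.fstat(listener_fd).st_ino
            path_inode = os.stat(fifo_path).st_ino
            return listener_inode, path_inode
        finally:
            os.close(listener_fd)


if __name__ == "__main__":
    old, new = fifo_replaced_under_listener()
    print("listener still holds inode:", old)
    print("path now refers to inode:  ", new)
```

In this situation a helper that opens the path for writing talks to the new fifo, which has no reader, so a blocking open never returns.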
Comment 8 Maciej Mrozowski 2010-08-25 22:24:14 UTC
Well, portage-2.2_rc69 here, no bashrcng-shmfs installed, yet it hangs on the call to ebuild.sh prerm while executing the command:

/usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0

emerge --info
Comment 9 Maciej Mrozowski 2010-08-25 22:25:12 UTC
Upon CTRL+C:

>>> Installing (1 of 18) virtual/libiconv-0
 * checking 0 files for package collisions
>>> Safely unmerging already-installed instance...
^C

Exiting on signal 2
Traceback (most recent call last):
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 76, in <module>
    sys.exit(ebuild_ipc_main(sys.argv[1:]))
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 73, in ebuild_ipc_main
    return ebuild_ipc.communicate(args)
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 45, in communicate
    return self._communicate(args)
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 52, in _communicate
    output_file = open(self.ipc_in_fifo, 'wb')
KeyboardInterrupt
Comment 10 Zac Medico 2010-08-25 22:56:59 UTC
(In reply to comment #9)
> File "/usr/lib64/portage/bin/ebuild-ipc.py", line 52, in _communicate
> output_file = open(self.ipc_in_fifo, 'wb')
> KeyboardInterrupt

Does `lsof | grep .ipc_in` show the emerge process listening to $PORTAGE_BUILDDIR/.ipc_in, and does it say that the inode has been deleted or anything odd like that? The ipc is well tested in my stage builds and it works flawlessly here.
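For anyone who prefers a script over lsof, the same check can be sketched in Python. This is a Linux-only illustration that assumes a mounted /proc; the function name is hypothetical, and it only sees processes whose descriptors the caller is allowed to inspect (run it as root to see emerge).

```python
import os


def fifo_listeners(fifo_path):
    """Return PIDs that hold an open descriptor on the inode now at fifo_path."""
    target = os.stat(fifo_path)
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                # stat() follows the /proc link to the open file itself,
                # so the comparison is by inode, not by path name.
                st = os.stat(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                pids.append(int(entry))
                break
    return pids
```

If emerge is running but its PID is missing from the result for the fifo path, emerge is listening on a different inode than the one the path currently names.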
Comment 11 Igor Ulyanov 2010-08-28 14:05:09 UTC
> Does `lsof | grep .ipc_in` show the emerge process listening to
> $PORTAGE_BUILDDIR/.ipc_in, and does it say that the inode has been deleted or
> anything odd like that? The ipc is well tested in my stage builds and it works
> flawlessly here.
>

I have similar problems emerging net-libs/libwww-5.4.0-r7.

>>> Installing (2 of 23) net-libs/libwww-5.4.0-r7

hangs. lsof prints this:

lsof | grep .ipc_in
emerge 10943 root 5r FIFO 8,5 0t0 2242486 /var/tmp/binpkgs/net-libs/libwww-5.4.0-r7/.ipc_in
Comment 12 Zac Medico 2010-08-28 15:21:59 UTC
(In reply to comment #11)
> I have similar problems emerging net-libs/libwww-5.4.0-r7. >>> Installing (2 of
> 23) net-libs/libwww-5.4.0-r7 hangs.

On irc we found that it was the pkg_prerm phase that was hanging. When it was hung up we checked the content of /var/tmp/binpkgs/net-libs/libwww-5.4.0-r7/temp/environment and the pkg_prerm function was defined but it contained only a return statement and nothing more.

We removed /var/db/pkg/net-libs/libwww-5.4.0-r7/libwww-5.4.0-r7.ebuild in order to disable the pkg_prerm phase. After that the problem no longer occurred for the package, so now we don't know what was wrong with the environment of the old instance.
Comment 13 Zac Medico 2010-08-28 15:24:51 UTC
(In reply to comment #12)
We do know that the output of `ps | grep ebuild` looked like this when it was hung:

15850 pts/0 SN+ 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
15871 pts/0 SN+ 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
15884 pts/0 SN+ 0:00 /usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0
Comment 14 Fabiano Francesconi 2010-08-28 15:28:17 UTC
Is it just "shmfs-plugin" that's not working? Because I use bashrcng with bashrcng-lafilefixer plugin and it works like a charm.
Comment 15 Fabiano Francesconi 2010-08-28 15:32:25 UTC
(In reply to comment #13)
> (In reply to comment #12)
> We do know that the output of `ps | grep ebuild` looked like this when it was
> hung:
>
> 15850 pts/0 SN+ 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
> 15871 pts/0 SN+ 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
> 15884 pts/0 SN+ 0:00 /usr/bin/python2.6
> /usr/lib64/portage/bin/ebuild-ipc.py exit 0
>

By enabling shmfs plugin it hangs but my "ps aux | grep ebuild" output is quite different:

root 30462 0.2 0.0 12004 2364 pts/0 S+ 17:31 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh setup
root 30537 0.0 0.0 12136 1796 pts/0 S+ 17:31 0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh setup
root 30559 0.4 0.1 60056 6840 pts/0 S+ 17:31 0:00 /usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0

It seems to hang on "setup" instead of prerm
Comment 16 Zac Medico 2010-08-28 15:45:27 UTC
(In reply to comment #15)
> By enabling shmfs plugin it hangs but my "ps aux | grep ebuild" output is quite
> different:

Right, the people who aren't using bashrcng with shmfs plugin really have a separate issue and they should file a new bug.
Comment 17 Zac Medico 2010-09-11 18:14:05 UTC
Since bug 335777 (2.2_rc75 and 2.1.9) ebuild-ipc will time out after 40 seconds instead of hanging indefinitely.
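To illustrate the difference between hanging indefinitely and timing out, here is a minimal Python sketch of opening a fifo for writing with a deadline. It is not the code from bug 335777; the function name, the polling approach, and the `poll_interval` parameter are assumptions made for the example, and only the 40 second default comes from the comment above.

```python
import errno
import os
import time


def open_fifo_for_write(fifo_path, timeout=40.0, poll_interval=0.1):
    """Return a write descriptor for fifo_path, or None if no reader appears in time."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # With O_NONBLOCK, a write-only open of a fifo fails with ENXIO
            # when nobody has it open for reading, instead of blocking.
            return os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as e:
            if e.errno != errno.ENXIO:
                raise
        if time.monotonic() >= deadline:
            return None  # give up: no listener showed up
        time.sleep(poll_interval)
```

A plain `open(path, 'wb')`, as seen in the traceback in comment #9, blocks until a reader opens the same fifo, which is why a lost listener shows up as a hang.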
Comment 18 Zac Medico 2010-09-15 00:47:00 UTC
Since portage-188.8.131.52 and 2.2_rc82 the behavior may be a little different, as described in bug #336142, comment #25.
Comment 19 Zac Medico 2010-09-18 06:13:40 UTC
(In reply to comment #12)
> After that the problem no longer occurred for
> the package, so now we don't know what was wrong with the environment of the
> old instance.

Now I've found that in some rare cases bash unset statements can fail (seems like some sort of memory corruption). This can cause stale PORTAGE_BUILDDIR settings from /var/db/pkg/*/*/environment.bz2 to leak into the pkg_prerm environment and interfere with ebuild-ipc. There is a workaround for this issue in the following two commits:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=4319c6525684013f76cf4294d417a3250b690e34
http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=0d410db2aedbb5ec6c61a342e4bc56f825831ca5
Comment 20 Marco Clocchiatti 2010-09-18 13:46:39 UTC
The last workaround does not seem to work for me.
Comment 21 Zac Medico 2010-09-18 14:11:17 UTC
(In reply to comment #20)
> The last workaround does not seem to work for me.

Right, that was a separate issue from the bashrcng problem, but with similar symptoms.
Comment 22 Marco Clocchiatti 2010-10-17 10:35:51 UTC
sys-apps/portage-2.2_rc97 no longer seems to be affected by this bug on an amd64 Gentoo box, while it still is on another x86 Gentoo system of mine, which lives in a chroot on that amd64 box. Please ask me for any kind of documentation you need.
Comment 23 Zac Medico 2010-10-17 12:46:49 UTC
I guess we can consider this fixed by the ability to disable USE=ipc in the portage ebuild.
Comment 24 Marco Clocchiatti 2010-10-17 16:00:34 UTC
Just a few questions.

First: in my amd64 environment (the working one), the ipc USE flag is enabled, while in my embedded system the ipc USE flag is needed. What should I do (if possible) to have the ipc USE flag working inside the chroot? At this moment, a mount command from the amd64 environment shows these virtual directories enabled (/mnt/chroot/root32 is the chroot path):

/dev on /mnt/chroot/root32/dev type none (rw,bind)
/dev/pts on /mnt/chroot/root32/dev/pts type none (rw,bind)
/dev/shm on /mnt/chroot/root32/dev/shm type none (rw,bind)
/proc on /mnt/chroot/root32/proc type none (rw,bind)
/proc/bus/usb on /mnt/chroot/root32/proc/bus/usb type none (rw,bind)
/sys on /mnt/chroot/root32/sys type none (rw,bind)

A second question: the ipc feature is documented as: "Use inter-process communication between portage and running ebuilds."

What does it mean exactly? Is it useful just for commands such as genlop -c, or is it important for other portage operations (for example, to avoid possible corruption when two emerge instances are running at the same time)?
Comment 25 Zac Medico 2010-10-17 16:14:06 UTC
(In reply to comment #24)
> What should I do (if possible) to have the ipc USE flag working inside the
> chroot?

Either stop using bashrcng shmfs-plugin, or fix it so that it doesn't interfere with the $PORTAGE_BUILDDIR/.ipc_in fifo.

> A second question:
> the ipc feature is documented as: "Use inter-process communication between
> portage and running ebuilds."
>
> What does it mean exactly? Is it useful just for commands such as genlop -c, or
> is it important for other portage operations (for example, to avoid possible
> corruption when two emerge instances are running at the same time)?

I've added some documentation here:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=d2d2f43ca1908b2827790bf3e844a0d90f13fede

* There is a new ipc (inter-process communication) USE flag which is enabled by default. This allows portage to communicate with running ebuild processes, for things like best_version, has_version, and die calls in nested processes. This flag should remain enabled unless it is found to be incompatible with a specific profile or environment.
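As a rough picture of what "communicate with running ebuild processes" means at the fifo level, the following Python sketch passes one command from a helper to a listener through a fifo named .ipc_in. It is only an illustration: the function name, the use of a thread as the listener, and the raw byte message are assumptions, not portage's actual wire format.

```python
import os
import tempfile
import threading


def send_command_over_fifo(command=b"exit 0"):
    """Send one command through a fifo and return what the listener received."""
    with tempfile.TemporaryDirectory() as builddir:
        fifo_path = os.path.join(builddir, ".ipc_in")
        os.mkfifo(fifo_path)
        received = []

        def listener():
            # Plays the role of emerge: blocks until a writer opens the
            # fifo, then reads until the writer closes it.
            with open(fifo_path, "rb") as fifo:
                received.append(fifo.read())

        thread = threading.Thread(target=listener)
        thread.start()

        # Plays the role of the helper: this open blocks until the
        # listener has the same fifo open for reading.
        with open(fifo_path, "wb") as fifo:
            fifo.write(command)

        thread.join()
        return received[0]


if __name__ == "__main__":
    print(send_command_over_fifo())
```

Both sides must open the very same fifo; if the path is swapped for a different inode in between, the helper's open blocks, which matches the hang reported here.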
Comment 26 Zac Medico 2010-10-17 16:17:25 UTC
Also, it's worth noting that USE=ipc fixes bug 278895 and bug 315615.
Comment 27 Marco Clocchiatti 2010-10-31 21:00:18 UTC
Sorry, I'm stupid. ipc doesn't work when bashrcng-shmfs is enabled, neither in the main system nor in the chroot ...
Comment 28 Marco Clocchiatti 2010-11-29 10:32:06 UTC
sys-apps/portage-2.2_rc67 was deleted today from the portage tree. Please restore it because of this bug. No other version of sys-apps/portage-2.2* works with bashrcng-shmfs.
Comment 29 Zac Medico 2010-11-29 10:53:43 UTC
You can disable ipc like this:

mkdir /etc/portage/profile
echo "sys-apps/portage ipc" >> /etc/portage/profile/package.use.mask
Comment 30 Marco Clocchiatti 2010-11-29 10:55:56 UTC
I know, but I think it is not a good thing to work without ipc.
Comment 31 Zac Medico 2010-11-29 16:09:11 UTC
When the "ipc" USE flag is disabled, it makes portage behave like older portage (such as portage-2.1.8.x and portage-2.2_rc67). So, there's no point in using older releases when you can get the same behavior with latest portage.