Bug 334423

Summary: >=sys-apps/portage-2.1.9/2.2_rc78 with USE=ipc: ebuild-ipc timed out if bashrcng shmfs-plugin is enabled.
Product: Gentoo Linux Reporter: Marco Clocchiatti <ziapannocchia>
Component: [OLD] Core system Assignee: Portage team <dev-portage>
Status: CONFIRMED ---    
Severity: normal CC: disinbox, esigra, fabiano.francesconi, zeekec
Priority: High    
Version: unspecified   
Hardware: All   
OS: Linux   
URL: http://gechi-overlay.sourceforge.net/
See Also: https://bugs.gentoo.org/show_bug.cgi?id=524328
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 335925    
Attachments: emerge --info
emerge data
output from strace
ps aux|grep ebuild|grep -v grep

Description Marco Clocchiatti 2010-08-25 11:27:53 UTC
>=sys-apps/portage-2.2_rc68: emerge hangs if bashrcng is enabled.


bashrcng comes from the gechi overlay, available through layman:
http://gechi-overlay.sourceforge.net/
<source
type="svn">https://gechi-overlay.svn.sourceforge.net/svnroot/gechi-overlay/overlay/</source>


Reproducible: Always

Steps to Reproduce:
1. layman -a gechi
2. emerge =app-portage/bashrcng-shmfs-1.2.6
3. eselect bashrcng set 1.1.4
4. eselect bashrcng enable shmfs
5. emerge any-atom

Actual Results:  
emerge hangs up.

Expected Results:  
emerge builds fine.
Comment 1 Marco Clocchiatti 2010-08-25 11:30:51 UTC
Created attachment 244521 [details]
emerge --info
Comment 2 Marco Clocchiatti 2010-08-25 11:51:34 UTC
Created attachment 244527 [details]
emerge data

tail from emerge -f --nodeps sys-apps/portage >strace_stdout.txt 2>trace_stderror.txt
Comment 3 Marco Clocchiatti 2010-08-25 11:53:05 UTC
Created attachment 244529 [details]
output from strace
Comment 4 Marco Clocchiatti 2010-08-25 11:54:58 UTC
Created attachment 244531 [details]
ps aux|grep ebuild|grep -v grep

sent after ctrl-c from previous strace command.
Comment 5 Marco Clocchiatti 2010-08-25 11:59:56 UTC
upstream related bug:

http://sourceforge.net/tracker/?func=detail&aid=3052915&group_id=176946&atid=879268
Comment 6 Sebastian Luther (few) 2010-08-25 12:55:12 UTC
What exactly does this thing do?
Comment 7 Zac Medico gentoo-dev 2010-08-25 15:01:58 UTC
Apparently bashrcng is corrupting $PORTAGE_BUILDDIR/.ipc_in, which is a fifo that emerge uses to listen for interprocess communication via the ebuild-ipc helper. So, how and why is bashrcng corrupting this fifo? My guess at the "how" is that it copies the fifo to a separate filesystem, which gives it a new inode, so emerge is left listening at the old inode, which has been lost. So, the main questions are why bashrcng does this and what alternatives there are to its current behavior.
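
The suspected failure mode can be reproduced outside Portage. What follows is a minimal sketch, not Portage code: the name simulate_fifo_replacement and the scratch directory are hypothetical, and unlink plus mkfifo merely stands in for whatever bashrcng really does to the build directory. It shows that once the path points at a new inode, a writer that opens the path can no longer reach a listener that still holds the old one.

```python
import errno
import os
import tempfile


def simulate_fifo_replacement(build_dir):
    """Return (old_inode, new_inode, writer_would_block) for a replaced fifo."""
    fifo_path = os.path.join(build_dir, ".ipc_in")
    os.mkfifo(fifo_path)

    # The listener opens the read end (non-blocking so this demo never hangs).
    reader_fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
    old_inode = os.fstat(reader_fd).st_ino

    # Hypothetical stand-in for a plugin that recreates the fifo elsewhere.
    os.unlink(fifo_path)
    os.mkfifo(fifo_path)
    new_inode = os.stat(fifo_path).st_ino

    try:
        # A non-blocking write-open fails with ENXIO when nobody has the
        # read end of *this* fifo open; a blocking open would wait instead.
        writer_fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno != errno.ENXIO:
            raise
        writer_would_block = True
    else:
        os.close(writer_fd)
        writer_would_block = False
    finally:
        os.close(reader_fd)

    return old_inode, new_inode, writer_would_block


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        old, new, blocked = simulate_fifo_replacement(scratch)
        print("inode changed:", old != new, "writer would block:", blocked)
```

A blocking open of the same path for writing would simply wait for a reader that never arrives.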
Comment 8 Maciej Mrozowski gentoo-dev 2010-08-25 22:24:14 UTC
Well, portage-2.2_rc69 here, no bashrcng-shmfs installed, yet it hangs on the call to ebuild.sh prerm while executing this command:
/usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0

emerge --info
Comment 9 Maciej Mrozowski gentoo-dev 2010-08-25 22:25:12 UTC
Upon CTRL+C:

>>> Installing (1 of 18) virtual/libiconv-0
 * checking 0 files for package collisions
>>> Safely unmerging already-installed instance...
^C

Exiting on signal 2
Traceback (most recent call last):
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 76, in <module>
    sys.exit(ebuild_ipc_main(sys.argv[1:]))
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 73, in ebuild_ipc_main
    return ebuild_ipc.communicate(args)
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 45, in communicate
    return self._communicate(args)
  File "/usr/lib64/portage/bin/ebuild-ipc.py", line 52, in _communicate
    output_file = open(self.ipc_in_fifo, 'wb')
KeyboardInterrupt
Comment 10 Zac Medico gentoo-dev 2010-08-25 22:56:59 UTC
(In reply to comment #9)
>   File "/usr/lib64/portage/bin/ebuild-ipc.py", line 52, in _communicate
>     output_file = open(self.ipc_in_fifo, 'wb')
> KeyboardInterrupt

Does `lsof | grep .ipc_in` show the emerge process listening to $PORTAGE_BUILDDIR/.ipc_in, and does it say that the inode has been deleted or anything odd like that? The ipc is well tested in my stage builds and it works flawlessly here.
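
On Linux, roughly the same check can be scripted by comparing the inode at the path with the inodes behind each process's open file descriptors under /proc. This is a sketch under assumptions: find_fifo_holders is a hypothetical helper, it relies on the Linux /proc layout, and it only sees processes whose fd directories the caller is allowed to read (so it would need to run as root to inspect emerge).

```python
import os


def find_fifo_holders(fifo_path, proc_root="/proc"):
    """Return PIDs that hold an open descriptor on the inode at fifo_path."""
    target = os.stat(fifo_path)
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fd_names = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd_name in fd_names:
            try:
                # stat() follows the /proc link to the file that is open,
                # even if its path has since been unlinked.
                info = os.stat(os.path.join(fd_dir, fd_name))
            except OSError:
                continue
            if (info.st_dev, info.st_ino) == (target.st_dev, target.st_ino):
                holders.append(int(entry))
                break
    return holders
```

An empty result for $PORTAGE_BUILDDIR/.ipc_in while emerge is running would suggest that the fifo at that path is no longer the one emerge opened.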
Comment 11 Igor Ulyanov 2010-08-28 14:05:09 UTC
> Does `lsof | grep .ipc_in` show the emerge process listening to
> $PORTAGE_BUILDDIR/.ipc_in, and does it say that the inode has been deleted or
> anything odd like that? The ipc is well tested in my stage builds and it works
> flawlessly here.
> 

I have similar problems emerging net-libs/libwww-5.4.0-r7. >>> Installing (2 of 23) net-libs/libwww-5.4.0-r7 hangs.

lsof prints this:

lsof | grep .ipc_in
emerge    10943             root    5r     FIFO                8,5          0t0    2242486 /var/tmp/binpkgs/net-libs/libwww-5.4.0-r7/.ipc_in

Comment 12 Zac Medico gentoo-dev 2010-08-28 15:21:59 UTC
(In reply to comment #11)
> I have similar problems emerging net-libs/libwww-5.4.0-r7. >>> Installing (2 of
> 23) net-libs/libwww-5.4.0-r7 hangs.

On irc we found that it was the pkg_prerm phase that was hanging. When it was hung up we checked the content of /var/tmp/binpkgs/net-libs/libwww-5.4.0-r7/temp/environment and the pkg_prerm function was defined but it contained only a return statement and nothing more. We removed /var/db/pkg/net-libs/libwww-5.4.0-r7/libwww-5.4.0-r7.ebuild in order to disable the pkg_prerm phase. After that the problem no longer occurred for the package, so now we don't know what was wrong with the environment of the old instance.
Comment 13 Zac Medico gentoo-dev 2010-08-28 15:24:51 UTC
(In reply to comment #12)
We do know that the output of `ps | grep ebuild` looked like this when it was hung:

15850 pts/0    SN+    0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
15871 pts/0    SN+    0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
15884 pts/0    SN+    0:00 /usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0
Comment 14 Fabiano Francesconi 2010-08-28 15:28:17 UTC
Is it just "shmfs-plugin" that's not working? Because I use bashrcng with bashrcng-lafilefixer plugin and it works like a charm.
Comment 15 Fabiano Francesconi 2010-08-28 15:32:25 UTC
(In reply to comment #13)
> (In reply to comment #12)
> We do know that the output of `ps | grep ebuild` looked like this when it was
> hung:
> 
> 15850 pts/0    SN+    0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
> 15871 pts/0    SN+    0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh prerm
> 15884 pts/0    SN+    0:00 /usr/bin/python2.6
> /usr/lib64/portage/bin/ebuild-ipc.py exit 0
> 

By enabling shmfs plugin it hangs but my "ps aux | grep ebuild" output is quite different:

root     30462  0.2  0.0  12004  2364 pts/0    S+   17:31   0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh setup
root     30537  0.0  0.0  12136  1796 pts/0    S+   17:31   0:00 /bin/bash /usr/lib64/portage/bin/ebuild.sh setup
root     30559  0.4  0.1  60056  6840 pts/0    S+   17:31   0:00 /usr/bin/python2.6 /usr/lib64/portage/bin/ebuild-ipc.py exit 0

It seems to hang on "setup" instead of prerm
Comment 16 Zac Medico gentoo-dev 2010-08-28 15:45:27 UTC
(In reply to comment #15)
> By enabling shmfs plugin it hangs but my "ps aux | grep ebuild" output is quite
> different:

Right, the people who aren't using bashrcng with shmfs plugin really have a separate issue and they should file a new bug.
Comment 17 Zac Medico gentoo-dev 2010-09-11 18:14:05 UTC
Since bug 335777 (2.2_rc75 and 2.1.9) ebuild-ipc will time out after 40 seconds instead of hanging indefinitely.
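
One way to bound the wait on a fifo is to open it in non-blocking mode and retry until a deadline passes. The sketch below only illustrates that idea; it is not the change made for bug 335777, and open_fifo_for_write, its parameters, and the polling interval are hypothetical.

```python
import errno
import os
import time


def open_fifo_for_write(path, timeout=40.0, poll_interval=0.1):
    """Open a fifo for writing, giving up after `timeout` seconds.

    Returns a file descriptor, or raises TimeoutError if no reader
    opened the fifo before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            # ENXIO means the fifo currently has no reader; anything
            # else is a real error and is passed on to the caller.
            if exc.errno != errno.ENXIO:
                raise
        if time.monotonic() >= deadline:
            raise TimeoutError("no reader on %s after %.1f seconds" % (path, timeout))
        time.sleep(poll_interval)
```

The returned descriptor is still in non-blocking mode, so a caller that wants ordinary blocking writes would have to clear O_NONBLOCK afterwards.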
Comment 18 Zac Medico gentoo-dev 2010-09-15 00:47:00 UTC
Since portage-2.1.9.6 and 2.2_rc82 the behavior may be a little different, as described in bug #336142, comment #25.
Comment 19 Zac Medico gentoo-dev 2010-09-18 06:13:40 UTC
(In reply to comment #12)
> After that the problem no longer occurred for
> the package, so now we don't know what was wrong with the environment of the
> old instance.

Now I've found that in some rare cases bash unset statements can fail (seems like some sort of memory corruption). This can cause stale PORTAGE_BUILDDIR settings from /var/db/pkg/*/*/environment.bz2 to leak into the pkg_prerm environment and interfere with ebuild-ipc. There is a workaround for this issue in the following two commits:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=4319c6525684013f76cf4294d417a3250b690e34
http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=0d410db2aedbb5ec6c61a342e4bc56f825831ca5
Comment 20 Marco Clocchiatti 2010-09-18 13:46:39 UTC
The last workaround does not seem to work for me.
Comment 21 Zac Medico gentoo-dev 2010-09-18 14:11:17 UTC
(In reply to comment #20)
> The last workaround does not seem to work for me.

Right, that was a separate issue from the bashrcng problem, but with similar symptoms.
Comment 22 Marco Clocchiatti 2010-10-17 10:35:51 UTC
sys-apps/portage-2.2_rc97 no longer seems to be affected by this bug on an amd64 Gentoo box, but it still is on another x86 Gentoo box of mine, which lives in a chroot on that amd64 box.

Please ask me for any documentation you need.
Comment 23 Zac Medico gentoo-dev 2010-10-17 12:46:49 UTC
I guess we can consider this fixed by the ability to disable USE=ipc in the portage ebuild.
Comment 24 Marco Clocchiatti 2010-10-17 16:00:34 UTC
Just a few questions.
First:

In my amd64 environment (the working one), the ipc USE flag is enabled, while in my embedded system the ipc USE flag is needed.

What should I do (if possible) to get the ipc USE flag working inside the chroot?
At the moment, a mount command from the amd64 environment shows these virtual directories enabled (/mnt/chroot/root32 is the chroot path):

/dev on /mnt/chroot/root32/dev type none (rw,bind)
/dev/pts on /mnt/chroot/root32/dev/pts type none (rw,bind)
/dev/shm on /mnt/chroot/root32/dev/shm type none (rw,bind)
/proc on /mnt/chroot/root32/proc type none (rw,bind)
/proc/bus/usb on /mnt/chroot/root32/proc/bus/usb type none (rw,bind)
/sys on /mnt/chroot/root32/sys type none (rw,bind)

A second question:
The ipc feature is documented as: "Use inter-process communication between portage and running ebuilds."

What does that mean exactly? Is it useful just for commands such as genlop -c, or is it important for other portage operations (for example, to avoid possible corruption when two emerge instances are running at the same time)?
Comment 25 Zac Medico gentoo-dev 2010-10-17 16:14:06 UTC
(In reply to comment #24)
> what should I do (if possible) to have the ipc use flag working inside the
> chroot?

Either stop using the bashrcng shmfs-plugin, or fix it so that it doesn't interfere with the $PORTAGE_BUILDDIR/.ipc_in fifo.

> a second question:
> the ipc feature is documented as: "Use inter-process communication between
> portage and running ebuilds."
> 
> what does it mean exactly? is it useful just for commands such as genlop -c or
> is it important for other portage operations (for example to avoid possible
> corruption when two emerge instances are running at the same time?)

I've added some documentation here:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=d2d2f43ca1908b2827790bf3e844a0d90f13fede

* There is a new ipc (inter-process communication) USE flag which is enabled
  by default. This allows portage to communicate with running ebuild processes,
  for things like best_version, has_version, and die calls in nested processes.
  This flag should remain enabled unless it is found to be incompatible with a
  specific profile or environment.
Comment 26 Zac Medico gentoo-dev 2010-10-17 16:17:25 UTC
Also, it's worth noting that USE=ipc fixes bug 278895 and bug 315615.
Comment 27 Marco Clocchiatti 2010-10-31 21:00:18 UTC
sorry. I'm stupid.

ipc doesn't work when bashrcng-shmfs is enabled, neither in the main system nor in the chroot ...
Comment 28 Marco Clocchiatti 2010-11-29 10:32:06 UTC
sys-apps/portage-2.2_rc67 was deleted today from portage tree.

Please restore it because of this bug.
No other version of sys-apps/portage-2.2* works with bashrcng-shmfs.
Comment 29 Zac Medico gentoo-dev 2010-11-29 10:53:43 UTC
You can disable ipc like this:

mkdir -p /etc/portage/profile
echo "sys-apps/portage ipc" >> /etc/portage/profile/package.use.mask
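
Because `>>` appends a line on every run, repeating these commands leaves duplicate entries in package.use.mask. A small sketch of an idempotent variant follows; mask_use_flag is a hypothetical helper, not part of Portage, and it assumes package.use.mask is a plain file. The directory is passed in so the function can be tried on a scratch path before touching /etc/portage/profile.

```python
import os


def mask_use_flag(profile_dir, atom, flag):
    """Add '<atom> <flag>' to package.use.mask unless it is already there.

    Returns True if the file was modified, False otherwise.
    """
    os.makedirs(profile_dir, exist_ok=True)
    mask_path = os.path.join(profile_dir, "package.use.mask")
    entry = "%s %s" % (atom, flag)

    content = ""
    if os.path.exists(mask_path):
        with open(mask_path, encoding="utf-8") as handle:
            content = handle.read()

    # Compare whitespace-normalised lines so spacing differences don't matter.
    if any(line.split() == entry.split() for line in content.splitlines()):
        return False

    with open(mask_path, "a", encoding="utf-8") as handle:
        if content and not content.endswith("\n"):
            handle.write("\n")  # keep the new entry on its own line
        handle.write(entry + "\n")
    return True
```

Calling mask_use_flag("/etc/portage/profile", "sys-apps/portage", "ipc") as root would then have the same effect as the two commands above, however often it is run.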
Comment 30 Marco Clocchiatti 2010-11-29 10:55:56 UTC
I know, but I think it is not a good thing to work without ipc.
Comment 31 Zac Medico gentoo-dev 2010-11-29 16:09:11 UTC
When the "ipc" USE flag is disabled, it makes portage behave like older portage (such as portage-2.1.8.x and portage-2.2_rc67). So, there's no point in using older releases when you can get the same behavior with the latest portage.