After switching to the modular xorg, emerge fails in weird ways. For example, when I run emerge -Du world, it downloads a package and shows the downloaded file extracting to a temporary directory, but then it keeps extracting the contents over and over again, as if in a loop. It did this with gentoo-sources, for example. After a ctrl-c, reissuing emerge -Du world continues elsewhere in the download chain and then completes successfully, but it hangs when attempting to regenerate /etc/ld.so.cache. For instance, I've just attempted to emerge gettext during an emerge -Du world. This worked, except that it failed to update ld.so.cache, or at least it seemed to hang, though with some processor activity. I then thought to emerge systrace so I could try again and see what it was doing. This worked fine, but afterwards the emerge attempted to remove packages that had been updated. It's attempting to remove python 2.4, but is stuck and repeatedly printing !mtime obj /usr/lib/python2.4/encodings..... One of the cores is 100% active and there's really nothing I can do except kill the process.
Portage 2.1-r1 (default-linux/amd64/2006.0, gcc-3.4.6, glibc-2.3.6-r4, 2.6.16-gentoo-r9 x86_64) ================================================================= System uname: 2.6.16-gentoo-r9 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ Gentoo Base System version 1.6.15 app-admin/eselect-compiler: [Not Present] dev-lang/python: 2.4.2, 2.4.3-r1 dev-python/pycrypto: 2.0.1-r5 dev-util/ccache: [Not Present] dev-util/confcache: [Not Present] sys-apps/sandbox: 1.2.17 sys-devel/autoconf: 2.13, 2.59-r7 sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2 sys-devel/binutils: 2.16.1-r3 sys-devel/gcc-config: 1.3.13-r3 sys-devel/libtool: 1.5.22 virtual/os-headers: 2.6.11-r2 ACCEPT_KEYWORDS="amd64" AUTOCLEAN="yes" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -march=k8 -pipe" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config" CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo" CXXFLAGS="-O2 -march=k8 -pipe" DISTDIR="/usr/portage/distfiles" FEATURES="autoconfig distlocks metadata-transfer sandbox sfperms strict" GENTOO_MIRRORS="http://gentoo.ccccom.com ftp://gentoo.ccccom.com ftp://mirrors.tds.net/gentoo http://gentoo.mirrors.pair.com/" MAKEOPTS="-j4" PKGDIR="/usr/portage/packages" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="amd64 X aac alsa amazon arts artswrappersuid avi bash-completion cacheemu cdrom codecs cscope ctags cups curl cvs dhcp divx4linux dvd dvdr dvdread encode esd font-server freetype ftp gif gs gstreamer gtk gtk2 hal javascript jpeg kde libvisual mad mime mp3 mpeg mpeg4 mplayer network nptl nptlonly ogg oggvorbis opengl pcre pda pdflib perl png python 
qt qt3 readline rss samba ssl streamzap svg tcpd tiff truetype truetype-fonts type1 type1-fonts vim-pager vim-with-x vorbis x11 xine xmms xosd xprint xv xvid yahoo elibc_glibc input_devices_keyboard input_devices_mouse kernel_linux userland_GNU video_cards_radeon video_cards_vesa video_cards_fbdev" Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
(In reply to comment #0) > it hangs when attempting to regenerate /etc/ld.so.cache. How about if you run ldconfig yourself? Does it always work without hanging? Did you change anything else immediately before this problem started (updates to critical things such as glibc or the kernel)?
Ran ldconfig manually without a problem. Running emerge -Du world through strace and watching for a hang..
Well, it hung here: connect(4, {sa_family=AF_INET, sin_port=htons(465), sin_addr=inet_addr("66.249.83.111")}, 16) = 0 getsockname(4, {sa_family=AF_INET, sin_port=htons(33034), sin_addr=inet_addr("192.168.1.9")}, [34408872793866256]) = 0 close(4) = 0 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4 connect(4, {sa_family=AF_INET, sin_port=htons(465), sin_addr=inet_addr("66.249.83.109")}, 16) = 0 getsockname(4, {sa_family=AF_INET, sin_port=htons(33034), sin_addr=inet_addr("192.168.1.9")}, [34408872793866256]) = 0 close(4) = 0 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 4 connect(4, {sa_family=AF_INET, sin_port=htons(465), sin_addr=inet_addr("66.249.83.111")}, 16) = 0 But I'm not sure what step this was or if this is where it usually hangs. I started the emerge world again and it's stuck here: >>> Regenerating /etc/ld.so.cache... >>> net-fs/samba-3.0.22-r3 merged. There is some cpu activity but not much..
So, I just emerged gaim and again I'm stuck at: >>> Regenerating /etc/ld.so.cache... >>> net-im/gaim-1.5.0 merged. PID TT PPID USER %CPU TIME PRI WCHAN COMMAND 28823 pts/2 9382 root 0.0 00:00:01 24 sk_wai emerge Is there any way to figure out what emerge is doing and why?
If you add FEATURES="python-trace" and run it with --debug then it will produce extremely verbose output. For example: FEATURES="python-trace" emerge --debug gaim > debug.log 2>&1 It may be interesting to see what it's doing at the tail end where it seems to hang.
Hmm, do you have the elog mail module enabled?
So, it's stuck emerging gaim using the command in the previous comment. This is repeating itself in the log: /usr/lib64/python2.4/socket.py line=154 name=__init__ event=call locals={'_sock': None, 'type': 1, 'self': 'omitted', 'family': 2, 'proto': 6} /usr/lib64/python2.4/socket.py line=161 name=__init__ event=return value=None locals={'_sock': <socket object, fd=4, family=2, type=1, protocol=6>, 'type': 1, 'self': 'omitted', 'family': 2, 'proto': 6} <string> line=1 name=connect event=call locals={'self': <socket._socketobject object at 0x2b7821078530>, 'args': (('64.233.167.111', 465),)} <string> line=1 name=connect event=return value=None locals={'self': <socket._socketobject object at 0x2b7821078530>, 'args': (('64.233.167.111', 465),)} /usr/lib64/python2.4/smtplib.py line=331 name=getreply event=call locals={'self': <smtplib.SMTP instance at 0x2b782107f440>} /usr/lib64/python2.4/socket.py line=179 name=makefile event=call locals={'self': <socket._socketobject object at 0x2b7821078530>, 'bufsize': -1, 'mode': 'rb'} /usr/lib64/python2.4/socket.py line=204 name=__init__ event=call locals={'self': 'omitted', 'bufsize': -1, 'sock': <socket object, fd=4, family=2, type=1, protocol=6>, 'mode': 'rb'} /usr/lib64/python2.4/socket.py line=219 name=__init__ event=return value=None locals={'self': 'omitted', 'bufsize': 8192, 'sock': <socket object, fd=4, family=2, type=1, protocol=6>, 'mode': 'rb'} /usr/lib64/python2.4/socket.py line=184 name=makefile event=return value=<socket._fileobject object at 0x2b7820f3b9f0> locals={'self': <socket._socketobject object at 0x2b7821078530>, 'bufsize': -1, 'mode': 'rb'} /usr/lib64/python2.4/socket.py line=315 name=readline event=call locals={'self': <socket._fileobject object at 0x2b7820f3b9f0>, 'size': -1} And yes, I do have the elog mail module enabled. 
Config lines look like this: #PORTAGE_ELOG_CLASSES="warn error log" #PORTAGE_ELOG_SYSTEM="mail syslog" #PORTAGE_ELOG_MAILURI="shawvrana@gmail.com shawvrana:passowrd@smtp.gmail.com:465" And no, it's never worked for me, and I haven't looked into it yet. I removed the elog lines and was able to emerge gaim without any lookups/pausing. Thanks! Any idea what the problem is?
Looks like it hangs when trying to send the mail in smtplib.SMTP.sendmail, which is outside our control. You probably have to play around with your configuration a bit (make sure you understand how the port parameter is used) or add a few debug statements to /usr/lib/portage/pym/elog_modules/mod_mail.py to figure out what the actual problem is.
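One plausible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: port 465 conventionally expects a TLS handshake as soon as the connection opens, so a plain smtplib.SMTP client and the server can each end up waiting for the other to speak first. The sketch below reproduces that kind of stall against a local socket that accepts a connection but never sends a greeting, and bounds the wait with the timeout argument of smtplib.SMTP in current Python (the Python 2.4 used in this report did not have it). The helper names silent_server and try_connect are hypothetical and are not part of Portage.

```python
import smtplib
import socket
import threading


def silent_server():
    """Listen on a free local port; accept one client and never greet it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    done = threading.Event()

    def serve():
        conn, _ = listener.accept()
        done.wait(10)                    # hold the connection open, send nothing
        conn.close()
        listener.close()

    threading.Thread(target=serve, daemon=True).start()
    return port, done


def try_connect(port, timeout):
    """Return a short status string instead of blocking forever."""
    try:
        client = smtplib.SMTP("127.0.0.1", port, timeout=timeout)
    except (smtplib.SMTPException, OSError) as exc:
        # No greeting arrived within `timeout` seconds.
        return "failed: %s" % exc.__class__.__name__
    client.close()
    return "connected"


port, done = silent_server()
result = try_connect(port, timeout=1.0)
done.set()                               # let the server thread finish
print(result)
```

Without the timeout argument, the same call would sit in readline waiting for a greeting, which matches the last line of the python-trace output.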
I think I've managed to reproduce this bug by updating a library the MTA depends on (mysql). If this is the case, I think many people will hit this bug even when following our upgrade instructions, as the emerge hangs indefinitely during revdep-rebuild if the rebuild contains any elogging ebuild before the MTA is rebuilt. Is this something that can be fixed or worked around in code, or should it be noted in our documentation (such as the mysql upgrade guide)?
It seems like we should use an alarm signal to implement a timeout, as documented here: http://docs.python.org/lib/node546.html
In svn r5430 I've used an alarm signal to implement a 1 minute timeout.
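For readers unfamiliar with the technique, here is a minimal sketch of an alarm-signal timeout. It is not the code from r5430: the names SendTimeout, call_with_timeout and slow_send are hypothetical, and the demo uses a 1 second limit where the fix uses 1 minute. It relies on SIGALRM, so it assumes a Unix system and a call made from the main thread.

```python
import signal
import time


class SendTimeout(Exception):
    """Raised when the guarded call exceeds its time budget."""


def _on_alarm(signum, frame):
    # Raising here interrupts whatever blocking call is in progress.
    raise SendTimeout("operation timed out")


def call_with_timeout(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs); raise SendTimeout after `seconds`."""
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)                        # schedule SIGALRM
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def slow_send():
    time.sleep(5)    # stands in for a blocking smtplib call
    return "sent"


try:
    outcome = call_with_timeout(slow_send, 1)
except SendTimeout:
    outcome = "timed out"
print(outcome)
```

Cancelling the alarm and restoring the previous handler in the finally block keeps a late SIGALRM from firing into unrelated code after the guarded call returns.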
This has been released in 2.1.2_rc4-r4.