portage-2.0.51-r15 takes so long to emerge that it looks like the process has hung and may cause people to kill it. It progresses through the normal sort of emerge messages then prints out:
* Compiling python modules...
and doesn't do anything else for half an hour. This happens exactly the same for two different systems. I first saw this with 2.0.51-r14.
Note, this isn't really a bug report, just a report that something weird is going on.
Steps to Reproduce:
1. emerge -v portage
The log file shows:
1106758608: Started emerge on: Jan 26, 2005 08:56:48
1106758608: *** emerge --verbose portage
1106758609: >>> emerge (1 of 1) sys-apps/portage-2.0.51-r15 to /
1106758609: === (1 of 1) Cleaning
1106758610: === (1 of 1) Compiling/Merging
1106760704: === (1 of 1) Post-Build Cleaning
1106760720: >>> AUTOCLEAN: sys-apps/portage
1106760725: === Unmerging... (sys-apps/portage-2.0.51-r14)
1106760751: >>> unmerge success: sys-apps/portage-2.0.51-r14
1106760753: ::: completed emerge (1 of 1) sys-apps/portage-2.0.51-r15 to /
1106760753: *** Finished. Cleaning up...
1106760761: *** exiting successfully.
1106760762: *** terminating.
That is almost 36 minutes!
While it is sitting in the "Compiling python modules" state, top shows that
there is no CPU or memory being consumed, but the system is very sluggish. It
almost acts I/O bound, but there was no disk activity at the time either. The
load average was quite high though, around 3.
It should have used up lots of CPU and/or memory and/or disk I/O, and completed
the emerge quickly.
sabre root # emerge info
Portage 2.0.51-r15 (default-linux/x86/2004.3, gcc-3.3.5,
glibc-126.96.36.19940808-r1, 2.4.28-gentoo-r5 i686)
System uname: 2.4.28-gentoo-r5 i686 AMD Athlon(tm) XP 1700+
Gentoo Base System version 1.4.16
Python: dev-lang/python-2.2.3-r5,dev-lang/python-2.3.4 [2.3.4 (#1,
Dec 22 2004, 15:42:46)]
distcc 2.16 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
dev-lang/python: 2.2.3-r5, 2.3.4
CFLAGS="-O3 -march=athlon-xp -funroll-loops -pipe"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config
/usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3.3/env
/usr/kde/3.3/share/config /usr/kde/3.3/shutdown /usr/kde/3/share/config
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O3 -march=athlon-xp -funroll-loops -pipe"
FEATURES="autoaddcvs autoconfig ccache distcc distlocks sandbox severe sfperms
USE="x86 3dnow X alsa apache2 apm arts artswrappersuid avi berkdb bitmap-fonts
bonobo cdr cjk crypt cups curl dga directfb divx4linux doc dvb dvd dvdread emacs
encode esd ethereal f77 fam fastcgi fbcon flac font-server foomaticdb fortran gb
gd gdbm gif gnome gphoto2 gpm gtk gtk2 gtkhtml guile imagemagick imap imlib
innodb ipv6 java jpeg junit kde libg++ libwww live mad maildir mailwrapper
mikmod mmx motif mozilla mpeg mysql nas ncurses network nls odbc ofx oggvorbis
opengl oss pam pda pdflib perl png python qt quicktime quotes readline ruby
samba sasl sdl slang slp spell sse ssl svga tcltk tcpd tetex theora tiff
truetype truetype-fonts type1-fonts unicode usb wmf xml xml2 xmms xv xvid zeo zlib"
Unset: ASFLAGS, CBUILD, CTARGET, LDFLAGS
sabre root #
Did you check your dmesg? Make sure there weren't disk errors.
That part does use a lot of IO.
I just reran this emerge and checked in on top every once in a while.
It appears that the main action that happens during the "Compiling python modules..." phase is that chown runs for about 5-10 minutes and then chmod runs for another 20-25 minutes. During that time:
- there were no seek noises from my hard drives
- nothing was added to dmesg
- the only syslogs were recording cron jobs launching every 10 minutes
- top said that the CPU was 97% idle typically
- when chown was running, top said it consumed 1-4% of the CPU with a load average around 1.1
- when chmod was running, top said it consumed 0-4% of the CPU with a load average around 2.8
- the last time I saw chmod running, it had consumed about 8.5 CPU seconds over 20+ minutes of wall clock time
- top usually consumed the most CPU at around 2% the whole time
time emerge portage returned with
I'm running with reiser 3. Perhaps there are issues with changing file ownership and permissions with reiser, but I haven't noticed that any bulk chown or chmod commands took a long time before.
I will try again with strace.
As expected, strace shows a fork and wait on ebuild.sh, and then the strace output stops during the "Compiling python modules..." phase.
Oh well. Short of hooking up gdb, I'm not going to be able to give you any more info.
Curious that chown and chmod don't want to execute in a timely fashion.
Did you ever resolve this issue?
Did you try running chown and chmod yourself?
No, this was never resolved.
It looks like the recursive chown/chmod of /var/tmp/ccache (in the pkg_postinst section of the ebuild) takes an absurd amount of time to run. While it is running, it doesn't look like any CPU or I/O is being used. It would be interesting to find out if it acts the same way with other filesystems.
I ran the chown command from the ebuild manually to get the info above.
The ebuild no longer does these chowns. However, they do occur in portage itself. They should only ever happen when changing FEATURES="userpriv" on and off, however.
I just emerged portage-188.8.131.52-r1. It took less than a minute. This is a massive improvement.
Is there any sort of "go get a coffee" message printed by portage when it determines that it needs to run the chown/chmod?
Actually, nothing is printed. I should probably add that.