On a stable i686 system, I enabled the ~x86 keyword for sys-devel/gcc to get gcc 5.3.0, to see whether I can circumvent other build problems with it.

More than 2 hours after starting the emerge, I see 3 "genmodes" processes running, each at 66.7% CPU with 112 min of CPU time consumed, and nothing else seems to happen.

Reproducible: Always

# emerge --info
Portage 2.2.28 (python 2.7.10-final-0, default/linux/x86/13.0, gcc-4.9.3, glibc-2.22-r4, 3.18.6 i686)
=================================================================
System uname: Linux-3.18.6-i686-QEMU_Virtual_CPU_version_1.1.2-with-gentoo-2.2
KiB Mem: 2074868 total, 117496 free
KiB Swap: 0 total, 0 free
Timestamp of repository gentoo: Thu, 02 Jun 2016 06:45:01 +0000
sh bash 4.3_p42-r1
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
distcc 3.2rc1 i686-pc-linux-gnu [enabled]
app-shells/bash: 4.3_p42-r1::gentoo
dev-lang/perl: 5.20.2::gentoo
dev-lang/python: 2.7.10-r1::gentoo, 3.4.3-r1::gentoo
dev-util/cmake: 3.3.1-r1::gentoo
dev-util/pkgconfig: 0.28-r2::gentoo
sys-apps/baselayout: 2.2::gentoo
sys-apps/openrc: 0.19.1::gentoo
sys-apps/sandbox: 2.10-r1::gentoo
sys-devel/autoconf: 2.69::gentoo
sys-devel/automake: 1.11.6-r1::gentoo, 1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils: 2.25.1-r1::gentoo
sys-devel/gcc: 4.9.3::gentoo
sys-devel/gcc-config: 1.7.3::gentoo
sys-devel/libtool: 2.4.6::gentoo
sys-devel/make: 4.1-r1::gentoo
sys-kernel/linux-headers: 4.3::gentoo (virtual/os-headers)
sys-libs/glibc: 2.22-r4::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://gentoo32.petamem.com/gentoo-portage
    priority: -1000

Installed sets: @system
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="* -@EULA"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=i686 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.6/ext-active/ /etc/php/cgi-php5.6/ext-active/ /etc/php/cli-php5.6/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build=y"
FCFLAGS="-O2 -march=i686 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distcc distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-O2 -march=i686 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.UTF-8"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="berkdb bzip2 cxx emacs fpm ithreads ncurses pcre perl readline simplexml sse sse2 ssl ssse3 threads udev unicode x86 zlib" ABI_X86="32" CURL_SSL="gnutls" ELIBC="glibc" KERNEL="linux" LINGUAS="cs de en" NGINX_MODULES_HTTP="access autoindex charset fastcgi gzip gzip_static rewrite" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7" USERLAND="GNU" USE_PYTHON="2.7"
Unset: CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 436472: the gcc build log
It looks like you're using qemu software virtualization, which is quite slow and may very well take several hours to build gcc. You have also allocated a fairly small amount of memory, which could become a problem. I don't think the build is hung or looping; I think it just takes this long.
virsh list | wc -l
52

The underlying iron is quite big and these problems do not occur on other VMs. If I were to make an educated guess, I would attribute the problem to the missing global ~x86 rather than to anything else.

2GB on a 32bit system is a "fairly small amount of memory"? There is no swap, so if there was a problem with memory, it would trigger an OOM and not the usual swap thrashing.
Yes, 2GB is a small amount of memory when you set MAKEOPTS="-j3". Is hardware virt on or off? If it's off, is this on purpose?
Very well then, I replaced -j3 with -j1, let gcc compile and went to bed. Today, after almost 10 hours, it is still compiling:

20001 portage 20 0 2268 1068 948 R 100.0 0.1 536:11.78 build/genmddeps /var/tmp/portage/sys-devel/gcc-5.3.0/work/gcc-5.3.0/gcc/common.md /var/tmp/portage/s+

With 75% of the memory freely available:

# free
              total        used        free      shared  buff/cache   available
Mem:        2074868      576752      103428       13280     1394688     1455000

And of course hardware virtualization is on. Actually, lots of devices are paravirtualized.
I'm sorry that I've offended you. Good luck with your bug.
No offence taken. I just try to state the facts to find out what could and what could not be the reason for the observed behavior. So far, I have no clue. My next try will be to find the differences between this 32bit VM and other 32bit VMs that actually do not exhibit this problem.
Ok, the same happens now on a completely different 64bit system too. Here's a pstree excerpt:

 | |-sshd(10348)---bash(10350)---emerge(3953)---sandbox(14441)---ebuild.sh(14443)---ebuild.sh(14465)---emake(14484)---make(14487)---bash(14523)---make(14544)---bash(14581)---make(14688)---bash(10649)---make(10673)-+-bash(11127)---genhooks(11129)
 | |                                              |-bash(11135)---genhooks(11136)
 | |                                              |-bash(11142)---genhooks(11143)
 | |                                              |-bash(11144)---genmodes(11145)
 | |                                              |-bash(11148)---genmodes(11149)
 | |                                              |-bash(11162)---genmodes(11163)
 | |                                              |-bash(11183)---genmddeps(11184)
 | |                                              `-gengtype(11186)

# ps -u portage -l
F S   UID   PID  PPID  C PRI  NI ADDR    SZ WCHAN  TTY          TIME CMD
0 S   250 10649 14688  0  80   0 -    18358 wait   pts/0    00:00:00 bash
0 S   250 10673 10649  0  80   0 -    19520 poll_s pts/0    00:00:00 make
0 S   250 11127 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11129 11127 97  80   0 -     2148 -      pts/0    01:28:49 genhooks
0 S   250 11135 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11136 11135 97  80   0 -     2148 -      pts/0    01:28:50 genhooks
0 S   250 11142 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11143 11142 97  80   0 -     2148 -      pts/0    01:28:53 genhooks
0 S   250 11144 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11145 11144 97  80   0 -     2097 -      pts/0    01:28:43 genmodes
0 S   250 11148 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11149 11148 97  80   0 -     2097 -      pts/0    01:28:52 genmodes
0 S   250 11162 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11163 11162 97  80   0 -     2097 -      pts/0    01:28:55 genmodes
0 S   250 11183 10673  0  80   0 -    18325 wait   pts/0    00:00:00 bash
0 R   250 11184 11183 97  80   0 -     2095 -      pts/0    01:28:51 genmddeps
0 R   250 11186 10673 97  80   0 -     2129 -      pts/0    01:28:44 gengtype
4 S   250 14441  3953  0  80   0 -     1053 wait   pts/0    00:00:00 sandbox
0 S   250 14443 14441  0  80   0 -    21445 wait   pts/0    00:00:00 ebuild.sh
1 S   250 14465 14443  0  80   0 -    21481 wait   pts/0    00:00:00 ebuild.sh
0 S   250 14484 14465  0  80   0 -    18387 wait   pts/0    00:00:00 emake
0 S   250 14487 14484  0  80   0 -    17399 wait   pts/0    00:00:00 make
0 S   250 14523 14487  0  80   0 -    18323 wait   pts/0    00:00:00 bash
0 S   250 14544 14523  0  80   0 -    17408 wait   pts/0    00:00:00 make
0 S   250 14581 14544  0  80   0 -    18324 wait   pts/0    00:00:00 bash
0 S   250 14688 14581  0  80   0 -    17479 wait   pts/0    00:00:00 make
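A listing like this can be filtered mechanically to pick out only the helpers that are runnable and have already burned a lot of CPU time. The following is a minimal sketch, not part of the original report: the function name spinning and the 3600 second threshold are made up for illustration, and the stat output column is assumed to be available (it is a procps extension rather than POSIX).

```shell
#!/bin/sh
# Read "PID STATE TIME COMMAND" rows on stdin and print the PID, command
# and CPU seconds of every row that is runnable (state R) and has used at
# least $1 seconds of CPU time. The name "spinning" is hypothetical.
spinning() {
    awk -v min="$1" '
        $2 ~ /^R/ {
            # TIME has the form [DD-]HH:MM:SS; fold it into seconds.
            t = $3; days = 0
            if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
            split(t, p, ":")
            secs = days * 86400 + p[1] * 3600 + p[2] * 60 + p[3]
            if (secs >= min) print $1, $4, secs
        }'
}

# Usage: the "portage" user is taken from the listing above; the threshold
# is arbitrary.
# ps -u portage -o pid=,stat=,time=,comm= | spinning 3600
```

The PIDs it prints are the natural candidates for closer inspection.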
Nope. The reason for the 64bit machine behavior was a wrong CFLAGS setting: somehow, -march=core-avx2 got into these on the 64bit VM when gcc 4.9.3 was still in effect. The interesting part is that there was no error and no warning. gcc simply went on compiling, but evidently failed or did something weird, so that compiled C programs (e.g. conftest.c needed for apache2 compilation, temacs etc.) simply went into some endless loop.

After changing that "core-avx2" to "haswell", everything is working fine again. So the problem with the 64bit machine is solved, although I am quite puzzled about the silent behavior/failure of gcc 4.9.3.

Of course I inspected the 32bit machine for a similar error, but there the CFLAGS are -O2 -march=i686 and this has worked for years now. So it seems the cause is different, although by the phenotype of the error, I would also guess it's something with the helper binaries that gcc creates during emerge.
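One check that may help narrow down a case like this is whether the guest CPU actually advertises the instruction set that the -march value implies, since a QEMU virtual CPU does not necessarily expose every feature of the host. The sketch below assumes a Linux x86 guest with /proc/cpuinfo; the function name cpu_has_flag is made up for illustration, and a missing flag would only be a hint, not a confirmed cause of the looping described here.

```shell
#!/bin/sh
# Succeed if the CPU flag $1 (e.g. avx2) appears as a whole word on the
# first "flags" line of a cpuinfo file. The second argument defaults to
# /proc/cpuinfo and exists only so the check can be pointed at a copy.
cpu_has_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw -- "$flag"
}

# -march=core-avx2 enables AVX2 code generation, so check for that flag.
if cpu_has_flag avx2; then
    echo "avx2: advertised by this CPU"
else
    echo "avx2: not advertised by this CPU"
fi
```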
Well, the obvious thing to do would be to attach gdb to one of the spinning processes and produce a backtrace.
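A minimal sketch of that, assuming gdb is installed and you are allowed to attach to the process (typically as root or as the user owning it). The function name dump_bt and the GDB override variable are made up for illustration; -p, -batch and -ex are regular gdb options.

```shell
#!/bin/sh
# Attach to a running process, print a backtrace of every thread, detach.
#   -p PID   attach to the given process
#   -batch   exit once the commands have run
#   -ex CMD  run one gdb command
# GDB is a hypothetical override so the invocation can be inspected
# without gdb installed; by default the real gdb is used.
dump_bt() {
    "${GDB:-gdb}" -p "$1" -batch -ex 'thread apply all bt'
}

# Usage: take a few samples of one spinning helper, a few seconds apart.
# The PID is whatever ps reports for genmodes, genhooks etc.
# for i in 1 2 3; do dump_bt "$pid"; sleep 5; done
```

Comparing several samples shows whether the process is stuck in the same function each time. Without debug symbols the frames may only show addresses.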
We think this ticket can be closed, as there is not a single instance of gcc 5.3.0 (or lower) running here anymore, and we have neither the means nor the interest to investigate this any further.
(In reply to PetaMem R&D from comment #11)
> We think this ticket can be closed as there is no single 5.3.0 (or lower)
> instance of gcc running here anymore and we do not have the means - nor
> interest - to investigate this any further.

Ack, thanks & sorry for the delay.