Bug#: 233614
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo Science Related Packages <sci@gentoo.org>
Reporter: Alexandre Rostovtsev <tetromino@gmail.com>

Filename | Description | Type | Creator | Created | Size
blas-atlas-3.9.1-partial-build.log | Small piece of the build log | text/plain | Alexandre Rostovtsev | 2008-08-01 15:48 0000 | 646.72 KB
blas-atlas-3.9.1-timing.patch | proposed patch for infinite compile loop | patch | Markus Dittrich | 2008-08-02 15:33 0000 | 2.40 KB
blastest.sh | Shell script to unpack/patch/compile using the proposed patch from above. | text/plain | Grant Edwards | 2008-08-02 20:25 0000 | 563 bytes
blastest.out.gz | Gzipped text output from ebuild unpack, patch, ebuild compile | application/octet-stream | Grant Edwards | 2008-08-02 20:28 0000 | 48.41 KB
ebuild-blas_ps.out.gz | Gzipped text output showing "ps axf" format process tree snapshots at 10s intervals | text/plain | Grant Edwards | 2008-08-02 20:29 0000 | 26.00 KB


Description:   Opened: 2008-08-01 15:43 0000
After starting an emerge to update a couple of packages, I discovered that
blas-atlas-3.9.1 was taking over 15 hours to emerge (normally, it takes 3.5
hours). After looking at the build log, it looks like it's stuck in an infinite
loop while running drottest.

Normally, I would have attached a complete build log, but it's 500MB long:)

Note 1: the infinite loop during the build is reproducible.
Note 2: blas-atlas-3.8.0, 3.8.1 and 3.8.2 had emerged fine on this machine,
taking ~3.5 hours to compile.
Note 3: blas-atlas-3.9.1 had emerged correctly on a different machine (~amd64,
Q6600).

# emerge --info
Portage 2.2_rc5 (default/linux/x86/2008.0/desktop, gcc-4.3.1,
glibc-2.8_p20080602-r0, 2.6.25-gentoo-r6 i686)
=================================================================
System uname:
Linux-2.6.25-gentoo-r6-i686-Intel-R-_Pentium-R-_M_processor_1.60GHz-with-glibc2.0
Timestamp of tree: Thu, 31 Jul 2008 04:30:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632)
[disabled]
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p39
dev-java/java-config: 1.3.7, 2.1.6-r1
dev-lang/python:     2.4.4-r13, 2.5.2-r5
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache:     2.4-r7
dev-util/confcache:  0.4.2-r1
sys-apps/baselayout: 2.0.0
sys-apps/openrc:     0.2.5
sys-apps/sandbox:    1.2.18.1-r3
sys-devel/autoconf:  2.13, 2.62-r1
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2,
1.10.1-r1
sys-devel/binutils:  2.16.1-r3, 2.17-r2, 2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   2.2.4
virtual/os-headers:  2.6.25-r4
ACCEPT_KEYWORDS="x86 ~x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=pentium-m -O2 -pipe -frename-registers"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/NX/etc /usr/NX/home /usr/kde/3.5/env
/usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config
/var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/
/etc/eselect/postgresql /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release
/etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/
/etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo
/etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-march=pentium-m -O2 -pipe -frename-registers"
DISTDIR="/usr/portage/distfiles"
FEATURES="ccache distlocks parallel-fetch preserve-libs sandbox sfperms strict
unmerge-orphans userfetch userpriv"
GENTOO_MIRRORS="http://distfiles.gentoo.org
http://www.ibiblio.org/pub/Linux/distributions/gentoo"
LANG="en_US.utf8"
LDFLAGS="-Wl,--as-needed -Wl,-O1"
LINGUAS="C en POSIX ru"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --stats --timeout=180 --exclude=/distfiles
--exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"

------- Comment #1 From Alexandre Rostovtsev 2008-08-01 15:48:39 0000 -------
Created an attachment (id=161918) [edit]
Small piece of the build log

A small piece of the 500MB build log, showing the infinite loop.

------- Comment #2 From Grant Edwards 2008-08-01 23:51:10 0000 -------
I think I'm seeing something similar.  When I try to emerge
blas-atlas 3.9.1 the emerge runs for an hour or so, and then
starts to eat RAM.  I've got 1.5GB of RAM and 1.5GB of swap. At
some point during the blas-atlas emerge, RAM usage
skyrockets.  Soon RAM and swap are both 100% full and the
machine becomes unresponsive to the point where I have to press
the reset button.

Not good...

------- Comment #3 From Grant Edwards 2008-08-01 23:57:56 0000 -------
(In reply to comment #2)
> I think I'm seeing something similar.  When I try to emerge
> blas-atlas 3.9.1 the emerge runs for an hour or so, and then
> starts to eat RAM.

I should add that I've had blas-atlas installed on this machine
for years. Emerging previous versions didn't turn into DoS attacks.  ;)

------- Comment #4 From Grant Edwards 2008-08-02 00:18:28 0000 -------
It's far worse than an infinite loop: it's an infinite
recursion.

An infinite loop just wastes CPU time.  Infinite recursion will
kill a machine.

Based on what I could see from the logs, it's running a seemingly
infinite number of "make drottest" invocations, but the real problem is the
number of processes.  When swap usage started to rise, there
were over 2800 processes running and the count was climbing steadily.
All except about 90 of them were children of the "emerge".  All
the ones I could see were shells.

When I killed the emerge, the total number of processes dropped
back to 90, and my RAM usage returned to normal. Having an
ebuild fail to install a package is one thing. Having it kill a
machine is pretty bad.

Not sure how to troubleshoot this further...
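
One way to put a number on the runaway process count described above is to count every process descended from the emerge PID. The following is only a sketch: the function name count_descendants is made up for illustration, and it assumes input in "pid ppid" form, such as the output of "ps -e -o pid= -o ppid=".

```shell
# Count every process descended from the PID given as $1.
# Reads "pid ppid" pairs on stdin (hypothetical helper, not part of Portage).
count_descendants() {
    awk -v root="$1" '
        { parent[$1] = $2 }
        END {
            n = 0
            for (p in parent) {
                q = p
                # Walk up the parent chain until we hit root or run out of
                # known parents. The hop limit guards against bad input.
                for (hops = 0; hops < 100000 && (q in parent); hops++) {
                    q = parent[q]
                    if (q == root) { n++; break }
                }
            }
            print n
        }'
}
```

Typical use would be something like: ps -e -o pid= -o ppid= | count_descendants "$EMERGE_PID", where EMERGE_PID is whatever PID the running emerge has. A count that keeps climbing between runs matches the behaviour reported in this comment.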

------- Comment #5 From Wormo 2008-08-02 07:19:45 0000 -------
*** Bug 233674 has been marked as a duplicate of this bug. ***

------- Comment #6 From Markus Dittrich 2008-08-02 11:21:28 0000 -------
Thanks much for your bug report! I'll have a look at it.

Best,
Markus

------- Comment #7 From Markus Dittrich 2008-08-02 15:33:23 0000 -------
Created an attachment (id=161995) [edit]
proposed patch for infinite compile loop

Folks,

Please give the above patch a spin and let me know if it
fixes your issues.

Thanks,
Markus

------- Comment #8 From emerald 2008-08-02 18:06:44 0000 -------
The recursive make calls happen for me too, on ~amd64 Q9450,
so it's not an x86-only problem.

------- Comment #9 From Markus Dittrich 2008-08-02 18:44:43 0000 -------
(In reply to comment #8)
> The recursive make calls happen for me too, on ~amd64 Q9450,
> so it's not an x86-only problem.
> 

With or without the patch?

------- Comment #10 From emerald 2008-08-02 19:19:57 0000 -------
Without the patch it went into the recursion; with the patch it compiles and
installs fine.

------- Comment #11 From Grant Edwards 2008-08-02 20:23:54 0000 -------
I tried with the patch, and it still recurses infinitely. I'm
going to attempt to attach:

 * a shellscript I used to unpack, patch, compile.

 * the output from that shellscript containing output from
     ebuild unpack
     patch
     ebuild compile

 * snapshots of the ebuild process tree taken every 10 seconds
   or so until the ebuild had created a few hundred processes.

------- Comment #12 From Grant Edwards 2008-08-02 20:25:08 0000 -------
Created an attachment (id=162025) [edit]
Shell script to unpack/patch/compile using the proposed patch from above.

------- Comment #13 From Grant Edwards 2008-08-02 20:28:51 0000 -------
Created an attachment (id=162026) [edit]
Gzipped text output from ebuild unpack, patch, ebuild compile

------- Comment #14 From Grant Edwards 2008-08-02 20:29:50 0000 -------
Created an attachment (id=162028) [edit]
Gzipped text output showing "ps axf" format process tree snapshots at 10s
intervals
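
For anyone wanting to collect similar snapshots, a loop along the following lines would do it. This is a sketch, not the script used for the attachment: the function name snapshot_ps and the SNAPSHOT_CMD override are made up for illustration, and "ps axf" assumes the procps version of ps.

```shell
# Append COUNT timestamped process-tree snapshots to OUT, waiting
# INTERVAL seconds between them. Usage: snapshot_ps OUT COUNT INTERVAL
snapshot_ps() {
    out=$1
    count=$2
    interval=$3
    # Command used for each snapshot; SNAPSHOT_CMD is a hypothetical override.
    cmd=${SNAPSHOT_CMD:-ps axf}
    i=0
    while [ "$i" -lt "$count" ]; do
        {
            printf '=== snapshot %s at %s ===\n' "$i" "$(date -u '+%Y-%m-%d %H:%M:%S')"
            # Left unquoted on purpose so "ps axf" splits into command and flags.
            $cmd
        } >> "$out"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, snapshot_ps ebuild-ps.out 30 10 would record thirty snapshots ten seconds apart; the output file name here is arbitrary.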

------- Comment #15 From Grant Edwards 2008-08-02 20:55:24 0000 -------
(In reply to comment #11)
> I tried with the patch, and it still recurses infinitely. I'm
> going to attempt to attach:
> 
>  * a shellscript I used to unpack, patch, compile.

Just to be paranoid, I added an "ebuild <...> clean" before the
unpack, and I still got the unending recursion when the make got
to the point where it was trying to do a "make drottest".

This time I killed the ebuild once it had about 120 processes
running, and grepped the ebuild output for 'make drottest':

 # grep 'make drottest' blastest.out2
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
TST: make drottest urout=rot1_x0y0.c opt=" -X 4 1 -1 2 -3 -Y 4 1 -1 3 -2" 
TST: make drottest urout=rot1_x1y1.c opt="" 
TST: make drottest urout=rot4_x1y1.c opt="" 
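
The repetition in the grep output above is easier to see as a tally of each distinct invocation. A minimal sketch, assuming the log is a plain-text file like blastest.out2; the function name tally_drottest is made up for illustration:

```shell
# Print each distinct "make drottest" line from the log in $1, prefixed by
# how many times it occurs, most frequent first.
tally_drottest() {
    grep 'make drottest' "$1" | sort | uniq -c | sort -rn
}
```

Running tally_drottest blastest.out2 at two different points during the build and comparing the counts would show whether the same three tests keep being rerun.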

------- Comment #16 From Markus Dittrich 2008-08-02 22:54:36 0000 -------
(In reply to comment #15)
> (In reply to comment #11)
> > I tried with the patch, and it still recurses infinitely. I'm
> > going to attempt to attach:
> > 
> >  * a shellscript I used to unpack, patch, compile.
> 
> Just to be paranoid, I added an "ebuild <...> clean" before the
> unpack, and I still got the unending recursion when the make got
> to the point where it was trying to do a "make drottest".
> 
> This time I killed the ebuild once it had about 120 processes
> running, and grepped the ebuild output for 'make drottest':

Grant,

The way you apply the patch (via your script) won't work
properly. The ebuild's "unpack" stage does more than just
unpack the tarball and also runs atlas' configure stage.
Rather, in the 3.9.1 ebuild add the line

epatch "${FILESDIR}"/${P}-timing.patch

right after the other patch lines, move the timing patch itself
to the files directory, re-digest and then try re-emerging.

Best,
Markus
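
The ebuild edit described in this comment could also be scripted. The sketch below is only an illustration under stated assumptions: the function name add_timing_patch is made up, and it assumes the existing patch lines in the ebuild are single-line "epatch ..." calls. It prints the ebuild with the new line inserted right after the last of them and leaves the input unchanged if there are none.

```shell
# Print the ebuild in $1 with the timing-patch line added after the last
# existing single-line "epatch" call (hypothetical helper).
add_timing_patch() {
    awk '
        { line[NR] = $0 }
        /^[[:space:]]*epatch[[:space:]]/ { last = NR }
        END {
            for (i = 1; i <= NR; i++) {
                print line[i]
                if (i == last) {
                    # Reuse the indentation of the existing epatch line.
                    indent = line[i]
                    sub(/epatch.*/, "", indent)
                    print indent "epatch \"${FILESDIR}\"/${P}-timing.patch"
                }
            }
        }' "$1"
}
```

The output would be redirected to a new file and copied over the 3.9.1 ebuild. After that the steps are as described above: place blas-atlas-3.9.1-timing.patch in the ebuild's files directory, re-digest, and re-emerge.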

------- Comment #17 From Grant Edwards 2008-08-03 14:07:05 0000 -------
(In reply to comment #16)

> > >  * a shellscript I used to unpack, patch, compile.

> The way you apply the patch (via your script) won't work
> properly. The ebuild's "unpack" stage does more than just
> unpack the tarball and also runs atlas' configure stage.

Of course.  I should have realized that.  I modified the ebuild
so that the patch is applied before the configure operation,
and it installed fine.

Thanks!

------- Comment #18 From Markus Dittrich 2008-08-03 14:43:00 0000 -------
(In reply to comment #17)
> (In reply to comment #16)
> 
> > > >  * a shellscript I used to unpack, patch, compile.
> 
> > The way you apply the patch (via your script) won't work
> > properly. The ebuild's "unpack" stage does more than just
> > unpack the tarball and also runs atlas' configure stage.
> 
> Of course.  I should have realized that.  I modified the ebuild
> so that the patch is applied before the configure operation,
> and it installed fine.
> 
> Thanks!
> 

That's good news and thanks a lot for testing! I'll add the patch
to portage then. I believe these fixes will be in the next atlas 
(3.9.2) release.

Best,
Markus

------- Comment #19 From Markus Dittrich 2008-08-03 19:11:11 0000 -------
These patches are now in portage cvs.
Thanks to everybody for testing them.

Best,
Markus

------- Comment #20 From Juergen Rose 2008-08-06 13:22:09 0000 -------
I know it is a little bit off topic, but where can I find a description of
the meaning of the ${MY_PN}, ${PATCH_V}, ${DISTDIR} and ${P} macros?

------- Comment #21 From Jeffrey Gardner 2008-08-07 12:35:18 0000 -------
Some are here: http://devmanual.gentoo.org/ebuild-writing/variables/index.html
others are defined as needed.
