Bug#: 224099
Status: RESOLVED
Resolution: FIXED
Assigned To: Gentoo Linux High-Performance Clustering Team <hp-cluster@gentoo.org>
Reporter: Justin Bronder <jsbronder@gentoo.org>

Attachment: build.log (build.log-amd64-test-fail), text/plain, created by Santiago M. Mola on 2008-06-24 16:24 0000, 1.83 MB


Description:   Opened: 2008-05-29 14:21 0000
We'd like to make openmpi the default implementation in virtual/mpi, the sooner
the better as mpich is old enough to just cause problems.

Ebuild has a test suite.

Thanks.

------- Comment #1 From Christian Faulhammer 2008-05-29 15:59:47 0000 -------
With USE="fortran threads"


 /usr/bin/install -c -m 644 openmpi-mca-params.conf
/var/tmp/portage/sys-cluster/openmpi-1.2.6/image//etc/openmpi/openmpi-mca-params.conf
/usr/bin/install: cannot create regular file
`/var/tmp/portage/sys-cluster/openmpi-1.2.6/image//etc/openmpi/openmpi-mca-params.conf':
No such file or directory
make[3]: *** [install-data-local] Error 1
make[3]: *** Waiting for unfinished jobs....

------- Comment #2 From Markus Rothe 2008-05-29 17:20:56 0000 -------
I cannot confirm this behavior on ppc64. Waiting for another person to confirm
before marking stable.

------- Comment #3 From Justin Bronder 2008-05-29 21:32:19 0000 -------
(In reply to comment #1)
> With USE="fortran threads"
> 
> 
>  /usr/bin/install -c -m 644 openmpi-mca-params.conf
> /var/tmp/portage/sys-cluster/openmpi-1.2.6/image//etc/openmpi/openmpi-mca-params.conf
> /usr/bin/install: cannot create regular file
> `/var/tmp/portage/sys-cluster/openmpi-1.2.6/image//etc/openmpi/openmpi-mca-params.conf':
> No such file or directory
> make[3]: *** [install-data-local] Error 1
> make[3]: *** Waiting for unfinished jobs....
> 

I'm also unable to replicate this. Can you provide any more information?

------- Comment #4 From Donnie Berkholz 2008-05-29 22:33:55 0000 -------
Could be a parallel install problem, what's MAKEOPTS?

------- Comment #5 From Christian Faulhammer 2008-05-30 15:02:16 0000 -------
(In reply to comment #4)
> Could be a parallel install problem, what's MAKEOPTS?

 I am not able to reproduce this anymore.

------- Comment #6 From Markus Rothe 2008-05-30 18:16:55 0000 -------
Cool. ppc64 stable then.

------- Comment #7 From Friedrich Oslage 2008-05-31 11:58:33 0000 -------
I'm getting these test failures on sparc (USE="-threads"):

--> Testing atomic_cmpset
atomic_cmpset: atomic_cmpset.c:198: main: Assertion `volptr == newptr' failed.
./run_tests: line 8: 21517 Aborted                 $* $threads
    - 1 threads: Failed
atomic_cmpset: atomic_cmpset.c:198: main: Assertion `volptr == newptr' failed.
./run_tests: line 8: 21518 Aborted                 $* $threads
    - 2 threads: Failed
atomic_cmpset: atomic_cmpset.c:198: main: Assertion `volptr == newptr' failed.
./run_tests: line 8: 21519 Aborted                 $* $threads
    - 4 threads: Failed
atomic_cmpset: atomic_cmpset.c:198: main: Assertion `volptr == newptr' failed.
./run_tests: line 8: 21520 Aborted                 $* $threads
    - 5 threads: Failed
atomic_cmpset: atomic_cmpset.c:198: main: Assertion `volptr == newptr' failed.
./run_tests: line 8: 21521 Aborted                 $* $threads
    - 8 threads: Failed
FAIL: atomic_cmpset
--> Testing atomic_cmpset_noinline
    - 1 threads: Passed
    - 2 threads: Passed
    - 4 threads: Passed
    - 5 threads: Passed
    - 8 threads: Passed
PASS: atomic_cmpset_noinline
========================================================
1 of 8 tests failed
Please report to http://www.open-mpi.org/community/help/
========================================================

If I ignore them, non-mpi and mpi programs seem to work fine...

# emerge --info
Portage 2.1.4.4 (default-linux/sparc/sparc64/2007.0/server, gcc-4.1.2,
glibc-2.6.1-r0, 2.6.24-gentoo-r8 sparc64)
=================================================================
System uname: 2.6.24-gentoo-r8 sparc64 sun4u
Timestamp of tree: Sat, 31 May 2008 07:34:01 +0000
app-shells/bash:     3.2_p33
dev-lang/python:     2.4.4-r13
dev-python/pycrypto: 2.0.1-r6
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.2.18.1-r2
sys-devel/autoconf:  2.13, 2.61-r1
sys-devel/automake:  1.10.1
sys-devel/binutils:  2.18-r1
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.24
ACCEPT_KEYWORDS="sparc"
CBUILD="sparc-unknown-linux-gnu"
CFLAGS="-mcpu=ultrasparc -mtune=ultrasparc -mvis -Wa,-Av8plusa
-frename-registers -O2 -pipe"
CHOST="sparc-unknown-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo
/etc/udev/rules.d"
CXXFLAGS="-mcpu=ultrasparc -mtune=ultrasparc -mvis -Wa,-Av8plusa
-frename-registers -O2 -pipe"
DISTDIR="/tmp/distfiles"
FEATURES="collision-protect distlocks metadata-transfer parallel-fetch sandbox
strict test unmerge-orphans userfetch userpriv usersandbox"
GENTOO_MIRRORS="http://distfiles.gentoo.org
http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="de_DE.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS="en de"
MAKEOPTS="-j17"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress
--force --whole-file --delete --stats --timeout=180 --exclude=/distfiles
--exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="caps cli cracklib cups dri extensions fortran gdbm gpm hpn iconv ipv6
isdnlog ldap logrotate mailwrapper midi mudflap nls nothreads nptl nptlonly
openmp pcre ppds pppd qos reflection server session snmp sparc spl ssl symlink
tftp truetype unicode userlocales vim xml xorg" ELIBC="glibc"
INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LINGUAS="en de"
USERLAND="GNU"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL,
PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

------- Comment #8 From Christian Faulhammer 2008-05-31 13:48:10 0000 -------
x86 stable

------- Comment #9 From nixnut 2008-06-01 13:08:11 0000 -------
ppc stable

------- Comment #10 From Raúl Porcel 2008-06-10 18:19:33 0000 -------
alpha stable

------- Comment #11 From Ferris McCormick 2008-06-18 13:12:42 0000 -------
Sparc stable, all tests pass for me.  Also marking sys-cluster/torque-2.3.0-r1
stable on sparc because torque is a dependency for openmpi.

I don't see the failures mentioned in Comment 7; I wonder if this is USE flags
related?  (I build with USE="fortran heterogeneous pbs smp")

------- Comment #12 From Friedrich Oslage 2008-06-18 19:41:33 0000 -------
(In reply to comment #11)
> I don't see the failures mentioned in Comment 7; I wonder if this is USE flags
> related?  (I build with USE="fortran heterogeneous pbs smp")

More testing seems to show that my test failures were caused by the -ggdb
CFLAG. Can someone confirm this and change the ebuild to filter out that flag
if appropriate?
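
For illustration only, a minimal sketch of what filtering -ggdb out of CFLAGS
amounts to. Inside an ebuild this is normally done with filter-flags from
flag-o-matic.eclass; the loop below is a plain-shell stand-in so that it runs
outside Portage, and the sample CFLAGS value is an assumption, not a setting
taken from this report:

```shell
# ASSUMPTION: sample flags for illustration; not taken from the bug report.
CFLAGS="-O2 -pipe -ggdb"

# Rebuild CFLAGS word by word, skipping -ggdb.
# (In an ebuild: inherit flag-o-matic, then call `filter-flags -ggdb`.)
filtered=""
for flag in $CFLAGS; do
    if [ "$flag" != "-ggdb" ]; then
        filtered="${filtered:+$filtered }$flag"
    fi
done
CFLAGS=$filtered

echo "CFLAGS=$CFLAGS"
```

Whether the flag should be filtered at all depends on confirming that -ggdb
really triggers the atomic_cmpset failures on sparc.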

------- Comment #13 From Justin Bronder 2008-06-19 00:19:28 0000 -------
(In reply to comment #12)
> More testing seems to show that my test failures were caused by the -ggdb
> CFLAG. Can someone confirm this and change the ebuild to filter out that flag
> if appropriate?

All tests still pass for me with USE="fortran pbs romio smp" and -ggdb in
CFLAGS.  This is on amd64 though.

------- Comment #14 From Santiago M. Mola 2008-06-24 16:24:30 0000 -------
Created an attachment (id=158295)
build.log-amd64-test-fail

Tests fail on amd64 with USE="fortran" and the rest of use flags unset.

------- Comment #15 From Santiago M. Mola 2008-06-24 17:46:53 0000 -------
I haven't dug too deep, but I'd say the problem in amd64 is:
https://svn.open-mpi.org/trac/ompi/ticket/1350
https://svn.open-mpi.org/trac/ompi/ticket/1351

------- Comment #16 From Santiago M. Mola 2008-06-25 10:41:29 0000 -------
The ebuild uses '$(use_enable romio romio-io)' but it's 'io-romio'.
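
A minimal sketch of the difference, for illustration. The use_enable function
defined here is a simplified stand-in for the Portage helper of the same name
(an assumption made so the snippet runs outside an ebuild), and the USE value
is hypothetical:

```shell
# Simplified stand-in for Portage's use_enable helper (illustration only).
# ASSUMPTION: USE is a space-separated list of enabled flags.
use_enable() {
    # $1 = USE flag to check, $2 = configure option name (defaults to $1)
    case " ${USE} " in
        *" $1 "*) echo "--enable-${2:-$1}" ;;
        *)        echo "--disable-${2:-$1}" ;;
    esac
}

USE="fortran romio"   # hypothetical flag set

wrong=$(use_enable romio romio-io)   # name as written in the ebuild
right=$(use_enable romio io-romio)   # name as corrected in this comment

echo "as written: $wrong"
echo "corrected:  $right"
```

With romio absent from USE, the stand-in prints the matching --disable-
form instead.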

------- Comment #17 From Justin Bronder 2008-06-25 14:57:50 0000 -------
(In reply to comment #14)
> Created an attachment (id=158295)
> build.log-amd64-test-fail
> 
> Tests fail on amd64 with USE="fortran" and the rest of use flags unset.
> 

Just a guess, but were you on an SMP machine?  These tests will fail if you
don't have the locking built in, which is configured by the smp useflag.

If that's the case, I'm going to suggest that I should probably just remove the
option as I doubt that many people are really using openmpi on single processor
machines and really concerned about locking.

Also, thanks for spotting the romio issue, it's fixed in cvs.

------- Comment #18 From Santiago M. Mola 2008-06-25 15:47:46 0000 -------
(In reply to comment #17)
> 
> Just a guess, but were you on an SMP machine?  These tests will fail if you
> don't have the locking built in, which is configured by the smp useflag.
> 

Yes. That's it ;-)

------- Comment #19 From Santiago M. Mola 2008-07-01 14:49:18 0000 -------
Justin, any news?

------- Comment #20 From Justin Bronder 2008-07-08 14:37:07 0000 -------
(In reply to comment #19)
> Justin, any news?

Sorry for the delay, I was moving over the past week.  Anyway, as the test
fails due to an SMP machine not adding the smp USE flag, is that really
blocking stabilization?  I agree that the flag should be removed in the next
version, but I don't want to revbump and force everyone to stabilize again.

You've been around a lot longer than me, so I'll follow your advice on this.

Thanks.

------- Comment #21 From Santiago M. Mola 2008-07-08 16:45:37 0000 -------
I've just discussed this with other amd64 members and amd64 will wait until a
revbump is done addressing this. SMP is the rule these days for amd64 users and
USE="smp" isn't even in default profiles, so this is a bug for us.

Maybe this is less important for other arches, so I don't care that much if
they stabilize the new revision immediately or after the usual 30 days. But
anyway, it shouldn't be a problem for them to stabilize too since it's a minor
change.

Thanks ;)

------- Comment #22 From Justin Bronder 2008-07-08 22:25:00 0000 -------
(In reply to comment #21)
Fair enough, expect another stabilization bug shortly :)

amd64 is the last arch, so I'm just going to close this, thanks everyone for
your time.
