Bug 43690 - please try building lam-mpi
Summary: please try building lam-mpi
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All All
Importance: High minor
Assignee: Ferris McCormick (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-04 01:45 UTC by Patrick Kursawe (RETIRED)
Modified: 2004-04-16 08:05 UTC

See Also:
Package list:
Runtime testing required: ---


Description Patrick Kursawe (RETIRED) gentoo-dev 2004-03-04 01:45:38 UTC
Older versions were fine, but the newer ones lack the sparc keyword. Please test the latest version in portage.
Comment 1 Gustavo Zacarias (RETIRED) gentoo-dev 2004-03-04 06:01:27 UTC
Built fine for me, though I have no way / knowledge of how to test it.
Comment 2 Patrick Kursawe (RETIRED) gentoo-dev 2004-03-05 00:59:31 UTC
Thanks. Since it's considered ok for Irix and older versions seemed to work, I guess a successful build has to be enough :-)
Comment 3 Ferris McCormick (RETIRED) gentoo-dev 2004-03-11 07:50:56 UTC
I would like this re-opened (or a new one generated.)  I do use lam-mpi, and I
am in the process of testing lam-mpi-7.0.4 for sparc.

Preliminary indications are that this version works (at least with a "new enough"
kernel), but it is not automatic that lam-mpi-6.xxx OK ==> lam-mpi-7.xxx OK.

I can say that with a two-system configuration, each one with 2 CPUs,
lamtests-7.0.4 appears to do fine, but there is more work to do before blessing
it.

Thanks,
Ferris
Comment 4 Seemant Kulleen (RETIRED) gentoo-dev 2004-03-11 08:09:17 UTC
reopening for ferris
Comment 5 Ferris McCormick (RETIRED) gentoo-dev 2004-03-11 13:44:09 UTC
Thanks for reopening.  Here is what I know at this point.

1.  It builds fine, as previously mentioned, but the lam-mpi-7.0.4.ebuild
    should lose lines 54, 58, which are
	#need to correct the produced absolute symlink
	cd ${D}/usr/include
	rm mpi++.h
	ln -sf mpi2c++/mpi++.h mpi++.h
    The names have changed to mpicxx.h and mpi2cxx, but the mpicxx.h file no
    longer lives in mpi2cxx (it is in ${D}/usr/include).  So, the symbolic link
    is a link to nowhere, used by nothing.
2.  If you download lamtests-7.0.4 and build them, they run in a configuration
    defined by a bhosts file that (for me) looks like
      antaresia cpu=3
      lacewing cpu=3
    In fact, both hosts are dual-processor sparc64 systems, but some of the
    tests require 6 nodes to run.

    This is almost enough to say lam-mpi-7.0.4 is OK for sparc, because these
    tests are both thorough and stressful.

3.  The pyMPI application (MPI in python) builds and runs in a similar
    configuration (but with cpu=2).  It can use all four CPUs to generate
    fractals (which is its test program).

4.  I haven't tried xmtv (an application from the LAM people) yet.

5.  The DaSSF/lam-mpi interface (simulation application) has problems entirely
    unrelated to lam-mpi:
      When building an application, it manages to get a conflict between
      bits/stdio.h and gcc's stdio.h because of conflicting definitions for
      some functions (like getchar).  The problem is DaSSF's, but it is new
      as of 2 weeks ago, and it will take me a while to figure it out.
    I think I can fake it out with a /usr/local/include/bits/stdio.h which
    never makes the conflicting definitions.  If so, I can verify the
    lam-mpi part of DaSSF, and that will take me about as far as I can go
    (but for xmtv, which I'll do too).

So, right now lam-mpi-7.0.4 looks pretty robust; the ebuild contains an
obsolete fix-up.

I'll know everything I'm going to in a day or so.
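
The "link to nowhere" from item 1 above can be checked for mechanically after an install into an image directory. What follows is only a minimal sketch, assuming a POSIX shell with `find` and `mktemp`; the helper name `find_dangling_links` and the scratch directory layout are hypothetical and are not part of the lam-mpi ebuild.

```shell
#!/bin/sh
# List symbolic links under a directory whose targets do not exist.
# find_dangling_links is a hypothetical helper, not part of any ebuild.
find_dangling_links() {
    dir=$1
    # -type l selects symlinks; [ -e ] follows the link and fails
    # when the target is missing.
    find "$dir" -type l | while IFS= read -r link; do
        [ -e "$link" ] || printf '%s\n' "$link"
    done
}

# Demo in a scratch directory that mimics the stale link: mpicxx.h
# exists, but the old mpi2c++/mpi++.h target does not.
demo=$(mktemp -d)
mkdir -p "$demo/usr/include"
: > "$demo/usr/include/mpicxx.h"
ln -s mpi2c++/mpi++.h "$demo/usr/include/mpi++.h"
find_dangling_links "$demo"
rm -rf "$demo"
```

Pointing the same function at a real image directory would report any link the ebuild leaves dangling.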
Comment 6 Ferris McCormick (RETIRED) gentoo-dev 2004-03-12 11:49:45 UTC
This will summarize just about all the testing I know how to do.

1.  DaSSF (a package to help with parallel simulations):  There is a problem,
    new with the latest bits/stdio.h, which is unrelated to lam-mpi.  The
    problem needs a fix in DaSSF, not in lam-mpi.
    A band-aid is to put a local version of /usr/include/bits/stdio.h in DaSSF's
    include tree at (for me) ..../DaSSF-LAM/include/bits/stdio.h in order to
    disable the in-line definitions of getchar() & friends.

    With this in place, my demo simulation runs fine.  It requires 2 CPUs and
    is started via "mpirun schema-file" where, for example, one such file looks
    like
#############################
# mpirun schema for transportation simulation.  Use one CPU on this system, one CPU on remote.
c0 xport-GENERIC -submodel .su1,1.dml
c2 xport-GENERIC -submodel .su1,1.dml
##############################
     This example exercises the mpic++ portion of lam-mpi.

     (If you care, a DaSSF application looks a lot like a C++ program, but it
      isn't.  DaSSF converts it to C++ before it even knows about MPI, and
      that conversion needs fixing.  The user (me) doesn't know how parallelism
       is achieved.)

     For this to run, a system (DaSSF) designed to interface with mpich has to
     be able to interface with lam-mpi instead, so it demonstrates:
       - lam-mpi works;
       - lam-mpi conforms to standards.

2.   xmtv and such like interfaces do not work with lam-mpi-7.0.4 as configured
     by the ebuild.  They require a configuration flag '--with-trillium' which
     the current build does not support.  This is not a bug, but it is a
     possible enhancement if anyone cares.

3.   There appears to be one sparc keyworded package which can use lam-mpi:
     namely, the fast fourier transform suite at dev-libs/fftw; on sparc,
     this is version fftw-2.1.5-r1.  If you
      USE='mpi' emerge fftw
     it builds fine with lam-mpi, and its mpi tests all run in various 
     configurations with 2xSMP-2 as described above.  Typical test run is
     made like this:
         mpirun -s h -ssi rpi crtcp c0,2,3 fftw_mpi_test -v -t

4.   SPARC32.  All the above has taken place on Ultra systems.  What about
     an SS20?  lam-mpi builds on an SS20 SMP.  lamtests-7.0.4 builds on the SS20.

     Set up a hosts file that looks like this,
       dragonfly cpu=2
       antaresia cpu=2
       lacewing cpu=2
     on the ss20 and 'lamboot' from it, so you have a 3x2 network (and so the
     tests will have 6 different cpus, just like they want).

     Then, AS LONG AS YOU START THE TESTS FROM THE SS20, they run. (If you
     start them from an Ultra, they will do bad things to the ss20.)  I have
     not done more extensive testing with SS20 in the mix, and am probably not
     going to.
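
Since some of the lamtests want 6 CPUs, it can help to total the cpu= counts in a hosts file before running 'lamboot'. The following is a rough sketch only, assuming a POSIX shell with awk; the function name `count_cpus` is hypothetical, and treating a host line with no cpu= key as one CPU is an assumption about the hosts file format rather than something stated in this report.

```shell
#!/bin/sh
# Sum the cpu=N values in a LAM-style hosts file.
# count_cpus is a hypothetical helper; a host line without cpu= is
# counted as 1 CPU (an assumption, see the note above).
count_cpus() {
    awk '
        /^[ \t]*#/ { next }          # skip comment lines
        NF == 0    { next }          # skip blank lines
        {
            n = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^cpu=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}

# Demo with the 3x2 hosts file described above.
hosts=$(mktemp)
printf '%s\n' 'dragonfly cpu=2' 'antaresia cpu=2' 'lacewing cpu=2' > "$hosts"
count_cpus "$hosts"
rm -f "$hosts"
```

A total below what the tests expect is a hint to adjust the cpu= values, as was done with cpu=3 on the two dual-CPU hosts earlier.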

So, for my purposes, lam-mpi-7.0.4 works fine, and I have swapped the
ebuild-installed version in as a replacement for my locally built lam-mpi-7.0.

I would say that it's fine for sparc, but I don't know how anyone else is
using it.  And there can be compatibility issues upgrading from lam-mpi-6.xx
as thoroughly discussed on the lam-mpi mailing lists.  Note though the FAQs at
the lam-mpi.org website may fairly be summarized as:
  - Don't use lam-mpi-6.xxx
  - When you upgrade versions, you had best rebuild all your applications.
    lam-mpi remains source compatible but not binary compatible.

Summary:  Looks good to me.  There's not much more I can usefully do for this
request.

Regards,
Ferris 

Comment 7 Ferris McCormick (RETIRED) gentoo-dev 2004-04-16 08:05:11 UTC
Marked stable for sparc & closed.