mpich can be installed together with lam-mpi, but overwrites the following files:

/usr/bin/mpicc
/usr/bin/mpif77
/usr/bin/mpirun
/usr/include/mpi.h
/usr/include/mpiof.h
/usr/include/mpif.h
/usr/include/mpio.h

I think the packages should mutually exclude each other.
In some cases both may be wanted on the same system, as some programs are incompatible with one or the other. A couple of options:

1) Changing binary and header names. This may break some hard-coded stuff in various third-party programs.
2) A 'mpi-config / config-mpi' tool to switch between them using symlinks, installing to somewhere like /usr/{include,bin}/mpi/.

Thoughts?
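Option 2 could be sketched roughly as follows. This is only an illustration under assumptions: the per-implementation directory layout (<prefix>/bin/mpi/<impl>/) and the function name select_mpi are hypothetical, and the demo runs in a temporary directory rather than touching /usr.

```python
import os
import tempfile

# Tool names taken from the file list in the original report.
TOOLS = ("mpicc", "mpif77", "mpirun")


def select_mpi(prefix, impl):
    """Point <prefix>/bin/<tool> at <prefix>/bin/mpi/<impl>/<tool>.

    Hypothetical layout: each implementation installs its tools into
    its own directory and only the symlinks in <prefix>/bin change.
    """
    impl_dir = os.path.join(prefix, "bin", "mpi", impl)
    if not os.path.isdir(impl_dir):
        raise ValueError(f"{impl} is not installed under {impl_dir}")
    for tool in TOOLS:
        target = os.path.join(impl_dir, tool)
        if not os.path.exists(target):
            continue  # this implementation does not ship the tool
        link = os.path.join(prefix, "bin", tool)
        tmp = link + ".tmp"
        if os.path.lexists(tmp):
            os.remove(tmp)
        os.symlink(target, tmp)
        os.replace(tmp, link)  # swap the symlink in one step


def demo():
    """Install two fake implementations and switch between them."""
    with tempfile.TemporaryDirectory() as prefix:
        for impl in ("mpich", "lam-mpi"):
            impl_dir = os.path.join(prefix, "bin", "mpi", impl)
            os.makedirs(impl_dir)
            for tool in TOOLS:
                with open(os.path.join(impl_dir, tool), "w") as fh:
                    fh.write(f"#!/bin/sh\necho {impl} {tool}\n")
        mpicc = os.path.join(prefix, "bin", "mpicc")
        select_mpi(prefix, "mpich")
        first = os.path.relpath(os.readlink(mpicc), prefix)
        select_mpi(prefix, "lam-mpi")
        second = os.path.relpath(os.readlink(mpicc), prefix)
        return first, second


if __name__ == "__main__":
    print(demo())
```

The headers under /usr/include would need the same treatment; the sketch only covers the binaries.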
I'd vote for #2. There was a related discussion some time ago (actually on agreeing on a certain config-xxx naming pattern) and I think the general consensus was leaning towards accepting this approach (make such packages installable alongside each other and provide a config-/setup-/whatever- script to regulate their "default visibility"). BTW, we have a similar thing in the works in the sci domain, with all the BLAS implementations. If anybody is interested, you might want to look at #30453 (more specifically around comment #22; the bug has grown quite a bit by now).

On a side note, has anybody else developed a feeling that we need to come up with some kind of revamped virtual mechanism? It is starting to be overloaded, with the number of virtuals steadily approaching 100 for most profiles.

George
Made them blockers for now. Keeping this bug open pending a saner solution.
So, I'm thinking that we keep them in relatively standard paths with changed names, then have the "primary" be a symlink. That way people can still run both at the same time just by calling it something different if necessary. Thoughts?
Regarding comment #1, lam and mpich are implementations of a standard. Unless a program uses some "Extensions" (gosh I hate this word), the fact that it would require one rather than the other is in itself strange. Granted, the usage is different, but whoever bothers to use the MPI flag should be able to use lamboot rather than mpd or whatever. I use mpich for a set of reasons, and until now the only problem I had was that xmgrace needed fftw in slot 2, which required lam. Modifying the ebuild (and putting it in an overlay) to use mpich instead of lam fixed the problem, even if I do not usually do FFTs with xmgrace... So, I think that allowing only one to be installed at a time would be a good way to push upstream to make their programs not depend on a particular implementation of a well-defined standard.
That's not necessarily the only difference. Some implementations may have better performance in given areas, and some may implement more of a standard than others -- not all implementations are complete ones.
But since we are using Gentoo and compiling from source, and assuming that the difference is only in performance, as long as no program needs a part of the standard that is missing in one implementation but present in another, I would still think that allowing only one implementation at a time would be better. In any case, this is just my idea; maybe it would be possible to ask on gentoo-cluster or gentoo-science?
Personally, I think it is highly useful to have several MPI implementations simultaneously installed. There can be significant performance gains sometimes just from using a different MPI. Some are thread-safe while others are not, etc. This is especially true when you start looking at MPI implementations over high-speed interconnects such as InfiniBand.

This thread should extend further than just MPICH and LAM for P4. It would be nice to allow for simultaneous installations of MPICH (for P4), MPICH-GM (for Myrinet), and MVAPICH (for IB). Also LAM, LA-MPI, and soon OpenMPI. There needs to be a sane, consistent naming scheme (e.g. /usr/bin/mpicc-mpich-1.2.6) with the mpi-config tool setting the default symlinks. I've been using 'modules' (http://modules.sf.net/) recently to manage user environments with great success, but I suppose we can't assume that is installed.

Also, just to make things difficult: it is often necessary to compile your MPI application with the same compiler that the MPI was compiled with. This hits you when you've got MPI compiled by Intel and someone links to it with an app compiled with gcc -- stuff breaks (often only at run time -- blech). I'm not sure how bad the effect is with incremental versions of gcc, but this may be something to think about. Should you require an MPI upgrade every time the compiler is upgraded?
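To make the naming scheme concrete, here is a small sketch of building and parsing names of the form tool-implementation-version, as in /usr/bin/mpicc-mpich-1.2.6. The helper names are made up for illustration, and the parsing assumes the tool name and the version contain no hyphen, so that hyphenated implementation names such as lam-mpi or mpich-gm still round-trip.

```python
def versioned_name(tool, impl, version):
    """Build a name such as 'mpicc-mpich-1.2.6'."""
    return f"{tool}-{impl}-{version}"


def parse_versioned_name(name):
    """Split 'tool-impl-version' back into its three parts.

    Assumption: the tool name and the version contain no '-', so the
    first and last hyphens delimit the implementation name, which may
    itself contain hyphens (lam-mpi, mpich-gm).
    """
    tool, rest = name.split("-", 1)
    impl, version = rest.rsplit("-", 1)
    return tool, impl, version


if __name__ == "__main__":
    for impl, version in (("mpich", "1.2.6"), ("lam-mpi", "7.1.2")):
        name = versioned_name("mpicc", impl, version)
        print(name, parse_versioned_name(name))
```

A version string with a hyphenated revision suffix would break this simple split, so a real tool would probably want a stricter separator or a lookup table.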
(In reply to comment #8)
> Personally, I think it is highly useful to have several MPI implementations
> simultaneously installed. There can be significant performance gains
> sometimes just from using a different MPI. Some are thread-safe while others
> are not, etc.

Certainly.

> This thread should extend further than just MPICH and LAM for P4. It would be
> nice to allow for simultaneous installations of MPICH (for P4), MPICH-GM (for
> Myrinet), and MVAPICH (for IB). Also LAM, LA-MPI, and soon OpenMPI. There
> needs to be a sane, consistent naming scheme (e.g. /usr/bin/mpicc-mpich-1.2.6)
> with the mpi-config tool setting the default symlinks.

The reason just lam-mpi and mpich are mentioned is that those are the only two in portage. It's worth considering the solutions already present for other switching utilities such as gcc-config, opengl-update, blas-config, lapack-config, etc. and trying to do something similar here.
I've implemented a virtual/mpi for the moment.

eselect should be usable for the multiple-mpi stuff easily, esp. when we get more MPI implementations (OpenMPI, MPICH2, EMP).

I see a potential major problem in that MPI applications may have to be recompiled if mpi-config is used to switch MPI implementations. For example, LAM is selected, and an app is compiled against it using mpicc. Now that app will need to be recompiled if I switch MPI implementations away from LAM, due to LAM specific stuff that will fail on the others.

Also, we need to look at ebuilds in the tree that currently only work on one MPI implementation (hpl for example, that I'm looking at already).
(In reply to comment #10)
> I've implemented a virtual/mpi for the moment.
>
> eselect should be usable for the multiple-mpi stuff easily, esp. when we get
> more MPI implementations (OpenMPI, MPICH2, EMP).

I agree.

> I see a potential major problem in that MPI applications may have to be
> recompiled if mpi-config is used to switch MPI implementations. For example,
> LAM is selected, and an app is compiled against it using mpicc. Now that app
> will need to be recompiled if I switch MPI implementations away from LAM, due to
> LAM specific stuff that will fail on the others.

Yes, that's true. I'm not aware of a good solution to this. All I can think of is slotting all the apps by the MPI they compile against and installing them to some prefix.
Or the proper thing to do (TM) is to fix the packages that use MPI so that they work properly with any of the implementations... but that's hard to impossible. The other way is to have each specific application fooled into using the right MPI.
Don't think there's any ABI guarantee between MPI implems so that wouldn't really work.
(In reply to comment #13)
> Don't think there's any ABI guarantee between MPI implems so that wouldn't
> really work.

It goes even further than that: MPI implementations differ in their interpretation of the MPI standard, so there is no guarantee on the behavior of some of the functions across different implementations (re: process spawning and control as well as communication sockets).
Can't a solution be found and implemented now? The discussion has been carrying on for 2.5 years now... I am in the process of writing ebuilds for OpenCascade, Salome, OpenFOAM, Code-Aster, netgen, etc. Some of them require MPI, and this antagonism between the different MPIs is really annoying.

Why can't we simply have three separate flags: mpi, mpich and lam-mpi? The two MPI implementations would be installed in different directories with separate names, and the programs requiring a specific MPI implementation would use the one they need, while the others would use the one we consider the most standard and most efficient.

I know, this is NOT an elegant solution with a config tool or whatever, but an attempt to bring new ideas in the hope of a quick and reliable solution while waiting for an elegant one...

Daniel
That may be an option for you, but not for Gentoo. We use the best solution, even if it takes longer. (The time this bug has been open is not a reflection of that, it's just because nobody has been working on it.)
In the past I thought that allowing different implementations was not a good idea, but now I am becoming more pragmatic. An eselect option or an mpi-config would be welcome, as would the ebuilds Daniel Tourde is working on. Besides, even mpich, if I remember correctly, is moving towards daemon-based parallel processes with mpich2, so my aversion to lam-boot, lam-wipe, etc. will have to give way.

Cheers
I think it should be possible to have multiple implementations of MPI (mpich, mpich2, lam-mpi, openmpi, etc.). There should be a way to choose which compiler wrapper to use (or simply calling it with an absolute path: /usr/lib/mpi/openmpi/bin/mpicc). The wrapper should compile with -Wl,-R/usr/lib/mpi/openmpi/lib so that the resulting binaries know exactly where to find the shared libraries.

I can see some relation between this bug and a problem I'm facing. I use more than one compiler for my code: gcc's gfortran, pgi's pgf90 and intel's ifort. This is mainly for portability reasons. So I need 3 different versions of openmpi on my machine, one compiled with each compiler. Right now, I have:

/usr/lib/openmpi/1.1.4-pgi/
/usr/lib/openmpi/1.1.4-intel/
/usr/lib/openmpi/1.1.4-gcc/

and I compile using:

/usr/lib/openmpi/1.1.4-pgi/bin/mpif90
or
/usr/lib/openmpi/1.1.4-intel/bin/mpif90
or
/usr/lib/openmpi/1.1.4-gcc/bin/mpif90

This is a pain to install and maintain. The workaround I used to install multiple instances of the same package version is slotting (using r1, r2, and r3 as the ebuild revision). More info in this thread: http://forums.gentoo.org/viewtopic-p-3963358.html

Multiple virtual/mpi providers and multiple instances of the same library compiled with different compilers could be resolved by the same solution. Is this too complicated?
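As a rough illustration of what such a wrapper would do, the sketch below only assembles the compiler command line for a given MPI prefix. The function name, the include/lib subdirectories and the -lmpi library name are assumptions (real implementations link different library sets); only the -Wl,-R flag comes from the comment above.

```python
def wrapper_command(mpi_prefix, compiler, args):
    """Return the command a prefix-aware MPI wrapper might run.

    Assumptions: headers live in <prefix>/include, libraries in
    <prefix>/lib, and the MPI library is called libmpi.
    """
    include_dir = f"{mpi_prefix}/include"
    lib_dir = f"{mpi_prefix}/lib"
    return [
        compiler,
        f"-I{include_dir}",
        *args,
        f"-L{lib_dir}",
        f"-Wl,-R{lib_dir}",  # record the library path in the binary
        "-lmpi",
    ]


if __name__ == "__main__":
    cmd = wrapper_command("/usr/lib/mpi/openmpi", "gcc",
                          ["hello.c", "-o", "hello"])
    print(" ".join(cmd))
```

Because the run-time library path is embedded at link time, a binary built this way should keep finding its own MPI libraries even after the system default is switched.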
Portage can't handle slotting by compiler, because it can't deal with the same revision installed multiple times to separate slots. That's the limitation we hit last time.
(In reply to comment #19)
> Portage can't handle slotting by compiler, because it can't deal with the same
> revision installed multiple times to separate slots. That's the limitation we
> hit last time.

I've managed to install a package compiled with different compilers. What I did was to create separate ebuilds for the different compilers. I've done it for hdf5 and netcdf4. This could apply to openmpi also. The details are at: http://forums.gentoo.org/viewtopic-t-510409-highlight-.html
I was thinking today of ways that would allow us to do this. Here's what I came up with:

* Install packages to multiple ROOTs, one per compiler-mpi combination. This circumvents the one SLOT per package revision restriction.
* Add the appropriate directories of these ROOTs to LDPATH, PATH, MANPATH, etc. using something similar to gcc-config.
* This potentially allows for user-level installs in the future, with ROOTs in the user's home directory.

That's the core idea. We'll need to be careful with getting RPATH/RUNPATH set so they can run with the ROOT=/ system. We could also allow the eselect module to create groups of binary packages by setting appropriate PKGDIRs based on what was getting packaged.
Upon request for an example... Basically we set something like ROOT=/usr/lib/mpi/openmpi/1.2/gcc/4.1/ and emerge openmpi, and an arbitrary collection of mpi apps to there. We next set ROOT=/usr/lib/mpi/openmpi/1.2/icc/9.1/ and repeat. Then eselect-mpi shoves /usr/lib/mpi/icc-9.1.openmpi-1.2/usr/lib into LDPATH, ../bin into PATH, and we pray it works. By doing all the versions and package names as separate directories, we should be able to choose how finely grained we want this ROOT-based slotting to be. Do we want to separate gcc minor versions, or just gcc vs icc? Same for openmpi versions, etc.
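The path handling in this example might look roughly like the sketch below. It follows the first directory layout given above (mpi/version/compiler/version); the function names and the usr/bin, usr/lib, usr/share/man subdirectories are assumptions, and the code only computes strings without installing or exporting anything.

```python
import posixpath


def slot_root(base, mpi, mpi_version, compiler, compiler_version):
    """Build a ROOT such as /usr/lib/mpi/openmpi/1.2/gcc/4.1."""
    return posixpath.join(base, mpi, mpi_version, compiler, compiler_version)


def slot_environment(root, env):
    """Return a copy of env with the slot's directories prepended.

    Assumed layout inside each ROOT: usr/bin, usr/lib, usr/share/man.
    """
    additions = {
        "PATH": "usr/bin",
        "LDPATH": "usr/lib",
        "MANPATH": "usr/share/man",
    }
    new_env = dict(env)
    for var, subdir in additions.items():
        entry = posixpath.join(root, subdir)
        # Keep existing entries, dropping empties and any earlier copy.
        old = [p for p in env.get(var, "").split(":") if p and p != entry]
        new_env[var] = ":".join([entry] + old)
    return new_env


if __name__ == "__main__":
    root = slot_root("/usr/lib/mpi", "openmpi", "1.2", "gcc", "4.1")
    for var, value in sorted(slot_environment(root, {"PATH": "/usr/bin:/bin"}).items()):
        print(f"{var}={value}")
```

Switching to the icc build would then just mean recomputing the environment for a different ROOT.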
Looks promising to me.
FYI, OpenMPI 1.2 now has a neat feature:

--enable-orterun-prefix-by-default
    Make "orterun ..." behave exactly the same as "orterun --prefix $prefix" (where $prefix is the value given to --prefix in configure)

This makes it much easier to run applications without having to specify the LDPATH and all. Obviously, we still have to change the MANPATH accordingly in the eselect-mpi module.

As for the separation scheme, I would go with MPI minor versions and a simple icc vs gcc split. Tracking compiler minor versions seems like adding quite a level of complexity for the actual benefits (given that a system might be more sensitive to a compiler change than an MPI version change). This also means that eselect-mpi should check for the "presently selected" compiler.
Yet another case of a package which cannot be installed at the moment:

# emerge -uN world --pretend

These are the packages that would be merged, in order:

Calculating world dependencies... done!
[ebuild N ] sys-cluster/mpich-1.2.7_p1 USE="crypt -doc"
[ebuild U ] app-shells/tcsh-6.14-r33 [6.14-r5]
[ebuild N ] sys-cluster/lam-mpi-7.1.2 USE="crypt fortran -debug -pbs -xmpi"
[ebuild N ] virtual/mpi-1.0
[ebuild U ] app-emacs/htmlfontify-0.20-r1 [0.20]
[ebuild R ] sci-libs/fftw-2.1.5-r2 USE="mpi*"
[ebuild R ] net-ftp/yafc-1.1.1-r1
[ebuild U ] app-emulation/wine-0.9.35 [0.9.34]
[ebuild U ] dev-tcltk/bwidget-1.8.0 [1.7.0]
[ebuild NS ] sys-kernel/vanilla-sources-2.6.20.7 USE="-build -symlink"
[ebuild U ] media-gfx/imagemagick-6.3.3 [6.3.0.5-r1]
[ebuild R ] sci-biology/ncbi-tools-20061015-r1 USE="mpi*"
[ebuild R ] sci-biology/mrbayes-3.1.2 USE="mpi*"
[ebuild R ] sci-chemistry/gromacs-3.3.1-r1 USE="mpi*"
[blocks B ] sys-cluster/mpich (is blocking sys-cluster/lam-mpi-7.1.2)
[blocks B ] sys-cluster/lam-mpi (is blocking sys-cluster/mpich-1.2.7_p1)
#
# emerge --info
Portage 2.1.2.4 (default-linux/x86/2006.1/desktop, gcc-3.4.6, glibc-2.5-r1, 2.6.19 i686)
=================================================================
System uname: 2.6.19 i686 Intel(R) Xeon(TM) CPU 3.00GHz
Gentoo Base System release 1.12.10
Timestamp of tree: Mon, 23 Apr 2007 08:30:01 +0000
dev-java/java-config: 1.3.7, 2.0.31-r3
dev-lang/python: 2.4.4
dev-python/pycrypto: 2.0.1-r5
sys-apps/sandbox: 1.2.18.1
sys-devel/autoconf: 2.13, 2.61
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool: 1.5.23b
virtual/os-headers: 2.6.20-r2
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O3 -march=pentium4 -mmmx -msse -msse2 -msse3 -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib/mozilla/defaults/pref /usr/share/X11/xkb /usr/spool/PBS /var/bind /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-O2 -mcpu=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="FFmpeg X Xaw3d aalib acpi alsa apache2 avi berkdb bidi bitmap-fonts caca cairo cdr cli cracklib crypt cscope curl dba dbus divx divx4 divx4linux divx5 divx5linux dri dvd dvdr dvdread eds emacs emacs-w3 emboss encode f77 faad faad2 fam fame ffmpeg firefox flash fortran fvwm fvwm2 gb gd gdbm ggi gif gpm gstreamer gtk gtk2 gtkhtml hal i8x0 icc iconv imagemagick imlib imlib2 innodb ipv6 isdnlog ithreads java javascript jpeg kerberos lcms leim libg++ libwww live lzo mad mcal mesa midi mikmod mmx mmx2 motif mozilla mp3 mpeg mpi mule mysql ncurses network nls nptl nptlonly ogg opengl oss pam pcre pda pdflib perl plotutils plugin png ppds pppd pthread pthreads python qt qt3 qt4 qtx quicktime readline reflection rtc samba scp sdl server session slp spell spl sse sse2 sse3 ssl tcltk tcpd tetex theora thread threads tiff truetype truetype-fonts type1-fonts unicode usb v4l v4l2 vorbis win32 win32codecs winvidix wmf x86 xml xml2 xorg xosd xv xvid xvmc zeo zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
ELIBC="glibc"
INPUT_DEVICES="keyboard mouse evdev"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
USERLAND="GNU"
VIDEO_CARDS="radeon"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, LINGUAS, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
(In reply to comment #24)
> This makes it much easier to run applications without having to specify the
> LDPATH and all. Obviously, we still have to change the MANPATH accordingly in
> the eselect-mpi module.

Sure, but that's only one of the four MPIs in portage.

> As for the separation scheme, I would go with MPI minor versions and a simple
> icc vs gcc split. Tracking compiler minor versions seems like adding quite a
> level of complexity for the actual benefits (given that a system might be more
> sensitive to a compiler change than an MPI version change).

I have this idea for a plugin-based system to determine how to slot these things. Each plugin file would contain a command, whose response would determine that part of the ROOT slot. With the option to return the empty string "", this allows for slotting or not at any level -- compiler, compiler version, mpi version, etc.

> This also means that eselect-mpi should check for the "presently selected"
> compiler.

Yeah. My thought is that people who want this slotting will _first_ emerge eselect-mpi, then use that to manage installation of everything from there on out. This is in contrast to many other eselect modules, which only manage previously installed programs, and is more similar to crossdev.
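The plugin idea could be prototyped along these lines. This is a sketch under assumptions: plugins are modelled as plain Python callables rather than plugin files containing a command, and the plugin names and base directory are hypothetical. The one property taken from the comment is that an empty response means "do not slot at this level".

```python
import posixpath


def build_slot_root(base, plugins):
    """Join the non-empty plugin responses into a ROOT path.

    `plugins` is an ordered list of (name, probe) pairs. Each probe
    returns a string; an empty string drops that level from the path.
    """
    parts = []
    for _name, probe in plugins:
        value = probe().strip()
        if value:
            parts.append(value)
    return posixpath.join(base, *parts)


if __name__ == "__main__":
    # Hypothetical probes; real ones would query the selected compiler
    # and MPI instead of returning constants.
    plugins = [
        ("mpi", lambda: "openmpi"),
        ("mpi-version", lambda: "1.2"),
        ("compiler", lambda: "icc"),
        ("compiler-version", lambda: ""),  # not slotted at this level
    ]
    print(build_slot_root("/usr/lib/mpi", plugins))
```

Ordering the plugin list fixes the directory hierarchy, so swapping two entries is enough to slot by compiler first and by MPI second.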
*** Bug 199095 has been marked as a duplicate of this bug. ***
Another interesting development recently is MorphMPI, a wrapper around MPIs to preserve the ABI. See http://www.clustermonkey.net//content/view/213/32/ for a more detailed description.
I've done some work on this, see the science overlay. http://overlays.gentoo.org/proj/science/changeset/952
Hi, I don't see eselect-mpi in the default portage tree. Are you still working on pushing this forward? Thanks!
(In reply to comment #30)
> Hi, I don't see eselect-mpi in the default portage tree. Are you still working
> on pushing this forward? Thanks!

Slowly but surely. Right now I'd like to get more packages that depend on mpi ported to use the eclass in the science overlay. This process usually has a nasty habit of revealing bugs/necessary features required in the eclass. Any help would be appreciated (testing/using is help).
So, the bug has been around for 9 years and no response for 3 years; ping. :)
(In reply to Tom Wijsman (TomWij) from comment #32)
> So, the bug has been around for 9 years and no response for 3 years; ping. :)

I was thinking of this issue before, but for a different reason: I wanted to add sys-cluster/modules support and build different versions of a library (lapack/blas) with different compilers and mpi versions. The plan was to do that by changing $ROOT and $SLOT depending on the currently selected compiler and mpi.

One issue we came across was that not all packages support ROOT!=/, but this got better with the improved prefix support. A second issue is that the modules slot won't provide the normal SLOT (SLOT=0). Anyhow, modules support never got implemented as its use cases are relatively rare.

However, at this point in time, we could solve that issue without going through the modules hacks, by adding the different mpi versions as ABIs in multilib-build. (CC'ing multilib for comments.)

Besides the multilib build (adding tc-getMPICC etc.), we still need to figure out how to wrap mpirun.
(In reply to Tom Wijsman (TomWij) from comment #32)
> So, the bug has been around for 9 years and no response for 3 years; ping. :)

Well, empi and mpi.eclass are still in the science overlay and functional. They really just need someone with the time to champion them for inclusion in the main tree.