Bug 302621 - [Tracker] pkgs depending on hdf5 should use hdf5[mpi=] and mpi wrappers
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Cluster Team
URL:
Whiteboard:
Keywords: Tracker
Depends on: 288230 296790 301538 302056 302715 302719 303160 303391
Blocks:
Reported: 2010-01-28 12:31 UTC by Kacper Kowalik (Xarthisius) (RETIRED)
Modified: 2022-04-05 22:20 UTC

See Also:
Package list:
Runtime testing required: ---


Description Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2010-01-28 12:31:06 UTC
When hdf5 is compiled with MPI support, it bundles symbols from libmpi. Additionally, it tries to include "mpi.h", which may not be in a standard location (e.g. openmpi installs mpi.h to /usr/include/openmpi).

As a result packages depending on hdf5 may:
 * fail to configure, with error message 'no hdf5.h found'
 * fail to link -lhdf5, with undefined symbols from MPI lib

To fix:
 * use sci-libs/hdf5[mpi=] as a dependency
 * with USE="mpi" enforce mpi wrappers as a default compiler
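
A minimal sketch of how the two fix items could look in a dependent ebuild. The stubbed use() and the helper name select_mpi_compilers are assumptions made only so the fragment runs outside Portage; they are not taken from any existing ebuild.

```shell
# Stand-in for Portage's use() so the sketch runs outside an ebuild.
# Inside a real ebuild this stub must be removed; Portage provides use().
use() {
    case " ${USE:-} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

IUSE="mpi"
# [mpi=]: hdf5 must be built with mpi exactly when this package is.
RDEPEND="sci-libs/hdf5[mpi=]"
DEPEND="${RDEPEND}"

# Hypothetical helper, meant to be called before configure runs.
select_mpi_compilers() {
    if use mpi; then
        # Let the MPI wrappers supply the mpi.h include path and the MPI libs.
        export CC=mpicc CXX=mpicxx
    fi
}
```

With USE="mpi" the wrappers become the default compilers; otherwise CC and CXX are left untouched.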

Confirmed affected packages:
 * sci-libs/mathgl
 * dev-lang/gdl
 * octave
 * sci-libs/netcdf
Comment 1 Markus Dittrich (RETIRED) gentoo-dev 2010-02-05 04:34:36 UTC
Personally, I am not very happy with this solution
because this will force an mpi useflag on packages that don't
have anything to do with mpi apart from the fact that they
link against hdf5 (gdl and octave are examples). The mpi useflag
implies to me that a package explicitly uses mpi and not 
implicitly via a second party lib that may or may not be compiled
against mpi. It seems that the culprit is hdf5. 

Is there a chance that we can solve this by
adding a pkg-config file (or something similar) for hdf5 that pulls in the
proper libs/includes depending on how it has been compiled?

Thanks,
Markus
Comment 2 Markus Dittrich (RETIRED) gentoo-dev 2010-02-05 04:48:34 UTC
Sorry, I wasn't very precise. I meant to say

> Personally, I am not very happy with this solution

for packages which don't themselves depend on mpi.

Markus
Comment 3 Sébastien Fabbro (RETIRED) gentoo-dev 2010-02-05 05:06:59 UTC
I have to say this solution is the hard way. I don't think the problem is specific to hdf5; it applies to all packages depending on mpi. For example, fftw:2.1 also includes mpi.h.
Justin, could there be another way around it?

Instead of introducing an mpi use flag to packages which don't depend on it, we could also drop the mpi flag and replace "use mpi && export CC=mpi" with

has_version sci-libs/hdf5[mpi] && export CC=mpicc 
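
In context, the one-liner might sit in a small helper like the sketch below. The stubbed has_version, the FAKE_INSTALLED variable and the helper name pick_cc_for_hdf5 are assumptions so the fragment runs outside Portage; a real ebuild would call Portage's own has_version.

```shell
# Stand-in for Portage's has_version() so the sketch runs outside an ebuild.
# FAKE_INSTALLED is a hypothetical variable naming the one "installed" atom.
has_version() {
    [ "$1" = "${FAKE_INSTALLED:-}" ]
}

# Hypothetical helper: no mpi USE flag on this package; the compiler is
# chosen from how the installed hdf5 was built.
pick_cc_for_hdf5() {
    if has_version 'sci-libs/hdf5[mpi]'; then
        export CC=mpicc
    fi
}
```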
Comment 4 Justin Bronder (RETIRED) gentoo-dev 2010-02-05 16:13:52 UTC
(In reply to comment #3)
> I have to say this solution is the hard way. I don't think it is specific to
> hdf5 but all packages depending on mpi. I have fftw:2.1 also including mpi.h.
> Justin, could there be another way around it?
> 
> Instead of introducing a mpi use flag to packages which don't depend on it, we
> could also replace the use mpi && export CC=mpi by removing the mpi flag and
> use
> 
> has_version sci-libs/hdf5[mpi] && export CC=mpicc
I don't see any problem with this either.

Also mpich2 does distribute a pkg-config file, and the openmpi guys opened an enhancement bug to do the same.  Even without those, it shouldn't be too hard for hdf5 and similar to create their own pkg-config files using mpicc -showme and related.
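
A rough sketch of generating such a pkg-config file from the wrapper's flags. The function name write_hdf5_pc and the HDF5_VERSION variable are hypothetical, and the resulting file is a bare minimum rather than what hdf5 ships; -showme:compile and -showme:link are Open MPI wrapper options, and other MPI implementations use different ones.

```shell
# write_hdf5_pc CFLAGS LIBS
# Print a minimal pkg-config file for an MPI-enabled hdf5 on stdout.
# HDF5_VERSION is a hypothetical variable; "unknown" is used if it is unset.
write_hdf5_pc() {
    mpi_cflags=$1
    mpi_libs=$2
    cat <<EOF
Name: hdf5
Description: HDF5 library (built with MPI support)
Version: ${HDF5_VERSION:-unknown}
Cflags: ${mpi_cflags}
Libs: -lhdf5 ${mpi_libs}
EOF
}

# Usage with Open MPI's wrapper (requires mpicc on PATH):
#   write_hdf5_pc "$(mpicc -showme:compile)" "$(mpicc -showme:link)" > hdf5.pc
```

A dependent package could then pick up the MPI include path and libraries through pkg-config --cflags --libs hdf5, without knowing anything about mpi itself.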

Comment 5 Markus Dittrich (RETIRED) gentoo-dev 2010-02-05 23:00:03 UTC
(In reply to comment #3)
> has_version sci-libs/hdf5[mpi] && export CC=mpicc 
> 

This looks fine to me as well since we're not advertising
any (non-existing) mpi capability to the user. 

Thanks for the suggestion.

Markus

Comment 6 Steve Arnold archtester gentoo-dev 2010-02-27 03:19:29 UTC
This is somewhat better than adding a false USE flag, but it still places the responsibility in the wrong place, both package-wise and maintainer-wise.  Whatever the fix is, pkg-config or otherwise, it should really be provided by the package that uses mpi, i.e., hdf5 (not gdal).
Comment 7 Steve Arnold archtester gentoo-dev 2010-02-27 03:47:16 UTC
Apparently I missed the debate, but I have to say I'm surprised there are so many people agreeing to hide the problem by kluging several packages instead of fixing one package.  I don't see how this approach is efficient or productive, nor does it make sense from a systems engineering standpoint.

AFAIK, mpich2 has already fixed their build issues upstream (the bug we were arguing about a while back) and is not causing this problem anymore.  How about we get openmpi fixed instead of touching half a dozen packages that aren't actually broken?  Please?  With sugar on it?
Comment 8 Sébastien Fabbro (RETIRED) gentoo-dev 2010-02-28 03:46:49 UTC
The solution offered here is far from an ideal one. It's also not clear to me that mpich2 has fixed this issue upstream; it is only that we keep the mpi.h from mpich2 in the standard include directories.
It does not look like any solution is happening soon (although Kacper has been working on it), so while waiting for upstream or us to come up with a decent solution, I suggest applying the simple hack from comment #3. It is not very intrusive, and our sci-libs/gdal (bug #296790) users with openmpi would probably benefit from it.
Comment 9 Steve Arnold archtester gentoo-dev 2010-03-04 02:47:14 UTC
Since openmpi is broken without a patch to build the libs properly, I'd suggest the better workaround for now is "don't use openmpi" and use mpich2 instead.  The fix to mpich2 applied upstream was simply to properly link the mpi library at build time (we used to patch it ourselves until a couple of versions ago).  Applying a similar patch to openmpi would solve the issue on our end, and eliminate the need for workarounds or other dodgy kluges in all the other packages that depend on some library built against libmpi*.

I think this whole thing has gotten off track; the mpi wrappers are strictly for building mpi-aware applications, and should *not* be required to build gdal (which knows nothing about mpi) against some other lib that happens to be built with mpi support.  In fact, with mpich2, no wrappers are required at all to build other libraries with mpi, as long as mpi is built correctly.
Comment 10 Steve Arnold archtester gentoo-dev 2010-03-05 23:49:17 UTC
There are several libs in the openmpi build tree with undefined symbols, but this one appears to be the root cause of the build failures.  Maybe adding a USE flag (to openmpi) to disable f90 is the most appropriate workaround until it gets fixed.  Seems much easier than hacking half a dozen ebuilds with a questionable kluge (see bug #303160 for a case where the kluge didn't work).

ldd -r ...

./ompi/mpi/f90/.libs/libmpi_f90.so.0:
        linux-vdso.so.1 =>  (0x00007fff73bff000)
        libmpi.so.0 => /var/tmp/portage/sys-cluster/openmpi-1.4.1/work/openmpi-1.4.1/ompi/.libs/libmpi.so.0 (0x00007f00179b9000)
        libopen-rte.so.0 => /var/tmp/portage/sys-cluster/openmpi-1.4.1/work/openmpi-1.4.1/orte/.libs/libopen-rte.so.0 (0x00007f001776d000)
        libopen-pal.so.0 => /var/tmp/portage/sys-cluster/openmpi-1.4.1/work/openmpi-1.4.1/opal/.libs/libopen-pal.so.0 (0x00007f00174f9000)
        libdl.so.2 => /lib/libdl.so.2 (0x00007f00172b3000)
        libnsl.so.1 => /lib/libnsl.so.1 (0x00007f001709a000)
        libutil.so.1 => /lib/libutil.so.1 (0x00007f0016e97000)
        libgfortran.so.3 => /usr/lib/gcc/x86_64-pc-linux-gnu/4.4.3/libgfortran.so.3 (0x00007f0016baa000)
        libm.so.6 => /lib/libm.so.6 (0x00007f0016927000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007f0016710000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00007f00164f4000)
        libc.so.6 => /lib/libc.so.6 (0x00007f001619d000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0017e66000)
undefined symbol: mpi_testall_  (./ompi/mpi/f90/.libs/libmpi_f90.so.0)
undefined symbol: mpi_waitall_  (./ompi/mpi/f90/.libs/libmpi_f90.so.0)
undefined symbol: mpi_testsome_ (./ompi/mpi/f90/.libs/libmpi_f90.so.0)
undefined symbol: mpi_waitsome_ (./ompi/mpi/f90/.libs/libmpi_f90.so.0)
undefined symbol: mpi_comm_spawn_multiple_      (./ompi/mpi/f90/.libs/libmpi_f90.so.0)
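
To repeat this check on other libraries in the build tree, a small filter along these lines could reduce the ldd -r output to just the missing symbol names. The function name undefined_symbols is hypothetical.

```shell
# Read `ldd -r` output on stdin and print only the undefined symbol names,
# one per line. Lines that do not report an undefined symbol are dropped.
undefined_symbols() {
    sed -n 's/^undefined symbol: \([^[:space:]]*\).*/\1/p'
}

# Usage (2>&1 so the symbol lines are captured wherever ldd writes them):
#   ldd -r ./ompi/mpi/f90/.libs/libmpi_f90.so.0 2>&1 | undefined_symbols
```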
Comment 11 Steve Arnold archtester gentoo-dev 2010-03-28 16:41:22 UTC
An even easier workaround than comment #3 would be to unmerge openmpi and emerge mpich2 instead, since that works just fine, i.e., no build errors in 3rd-level packages.  And then no kluges are required at all...
Comment 12 Juergen Rose 2010-04-19 07:57:08 UTC
As noted in bug 310709, this solution does not work for me, because the currently installed sci-geosciences/mapnik-0.6.1-r3 requires boost-1.39.0, which requires openmpi.