Bug 828123 - sys-cluster/openmpi-4.1.2 - /.../coll_base_alltoall.c: error: invalid use of undefined type struct opal_convertor_master_t
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Cluster Team
URL: https://github.com/open-mpi/ompi/issu...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-12-04 09:13 UTC by Toralf Förster
Modified: 2024-07-12 05:55 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge-info.txt (17.02 KB, text/plain), 2021-12-04 09:13 UTC, Toralf Förster
emerge-history.txt (142.18 KB, text/plain), 2021-12-04 09:13 UTC, Toralf Förster
environment (155.65 KB, text/plain), 2021-12-04 09:13 UTC, Toralf Förster
etc.portage.tar.bz2 (18.37 KB, application/x-bzip), 2021-12-04 09:13 UTC, Toralf Förster
logs.tar.bz2 (206.31 KB, application/x-bzip), 2021-12-04 09:13 UTC, Toralf Förster
sys-cluster:openmpi-4.1.2:20211204-031308.log.bz2 (89.26 KB, application/x-bzip), 2021-12-04 09:13 UTC, Toralf Förster
temp.tar.bz2 (211.88 KB, application/x-bzip), 2021-12-04 09:13 UTC, Toralf Förster

Description Toralf Förster gentoo-dev 2021-12-04 09:13:23 UTC
Overly long lines were shortened:

                 from /var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/opal/datatype/opal_convertor.h:35,
                 from /var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/ompi/datatype/ompi_datatype.h:38,
                 from /var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/ompi/mca/coll/base/coll_base_alltoall.c:31:
/var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/ompi/mca/coll/base/coll_base_alltoall.c: In function ‘mca_coll_base_alltoall_intra_basic_inplace’:
/var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/ompi/mca/coll/base/coll_base_alltoall.c:81:85: error: invalid use of undefined type ‘struct opal_convertor_master_t’
   81 |         if( OPAL_UNLIKELY(opal_local_arch != ompi_proc->super.proc_convertor->master->remote_arch))  {
      |                                                                                     ^~
/var/tmp/portage/sys-cluster/openmpi-4.1.2/work/openmpi-4.1.2/opal/include/opal/prefetch.h:44:55: note: in definition of macro ‘OPAL_UNLIKELY’
   44 | #define OPAL_UNLIKELY(expression) __builtin_expect(!!(expression), 0)

  -------------------------------------------------------------------

  This is an unstable amd64 chroot image at a tinderbox (==build bot)
  name: 17.1_desktop-j4-20211202-110142

  -------------------------------------------------------------------

gcc-config -l:
 [1] x86_64-pc-linux-gnu-11.2.1 *
clang version 13.0.0
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/13/bin
/usr/lib/llvm/13
13.0.0
Python 3.9.9
Available Ruby profiles:
  [1]   ruby26 (with Rubygems)
  [2]   ruby27 (with Rubygems)
  [3]   ruby30 (with Rubygems) *
Available Rust versions:
  [1]   rust-bin-1.56.1 *
The following VMs are available for generation-2:
1)	OpenJDK 8.312_p07 [openjdk-8]
*)	AdoptOpenJDK 8.312_p07 [openjdk-bin-8]
Available Java Virtual Machines:
  [1]   openjdk-8 
  [2]   openjdk-bin-8  system-vm

The Glorious Glasgow Haskell Compilation System, version 8.10.4
php cli:
  [1]   php7.3
  [2]   php7.4
  [3]   php8.1 *

  HEAD of ::gentoo
commit c8e2fb3130882c282cf7027199f0c447cde2d5f0
Author: Repository mirror & CI <repomirrorci@gentoo.org>
Date:   Fri Dec 3 23:51:40 2021 +0000

    2021-12-03 23:51:39 UTC

emerge -qpvO sys-cluster/openmpi
[ebuild  N    ] sys-cluster/openmpi-4.1.2  USE="fortran heterogeneous ipv6 -cma (-cuda) -cxx -java -libompitrace -peruse -romio" ABI_X86="(64) -32 (-x32)" OPENMPI_FABRICS="-knem -ofed -psm" OPENMPI_OFED_FEATURES="-control-hdr-padding -dynamic-sl -rdmacm -udcm" OPENMPI_RM="-pbs -slurm"
Comment 1 Toralf Förster gentoo-dev 2021-12-04 09:13:24 UTC
Created attachment 757391
emerge-info.txt
Comment 2 Toralf Förster gentoo-dev 2021-12-04 09:13:26 UTC
Created attachment 757392
emerge-history.txt
Comment 3 Toralf Förster gentoo-dev 2021-12-04 09:13:27 UTC
Created attachment 757393
environment
Comment 4 Toralf Förster gentoo-dev 2021-12-04 09:13:29 UTC
Created attachment 757394
etc.portage.tar.bz2
Comment 5 Toralf Förster gentoo-dev 2021-12-04 09:13:30 UTC
Created attachment 757395
logs.tar.bz2
Comment 6 Toralf Förster gentoo-dev 2021-12-04 09:13:32 UTC
Created attachment 757396
sys-cluster:openmpi-4.1.2:20211204-031308.log.bz2
Comment 7 Toralf Förster gentoo-dev 2021-12-04 09:13:34 UTC
Created attachment 757397
temp.tar.bz2
Comment 8 Iade Gesso 2021-12-04 13:31:44 UTC
(In reply to Toralf Förster from comment #0)

Same here. I already filed a bug for this, https://bugs.gentoo.org/827810, but no one has helped us so far...

Iade
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-12-04 13:59:15 UTC
(In reply to Iade Gesso from comment #8)

You didn't reply to a request for more information?
Comment 10 Jeremy Stent 2022-01-01 19:43:03 UTC
It looks like upstream has fixed this by adding 

#include "opal/datatype/opal_convertor_internal.h"

to line 32 of 
ompi/ompi/mca/coll/base/coll_base_alltoall.c
ompi/ompi/mca/coll/base/coll_base_alltoallv.c
Comment 11 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-01-02 02:34:56 UTC
(In reply to Jeremy Stent from comment #10)
> It looks like upstream has fixed this by adding 
> 
> #include "opal/datatype/opal_convertor_internal.h"
> 
> to line 32 of 
> ompi/ompi/mca/coll/base/coll_base_alltoall.c
> ompi/ompi/mca/coll/base/coll_base_alltoallv.c

Was there a bug / commit you can link to?
Comment 12 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-01-02 02:39:57 UTC
(In reply to Sam James from comment #11)
> Was there a bug / commit you can link to?

It's https://github.com/open-mpi/ompi/commit/927e9aa97373dac652f9cba4813e6ee609ca2830 but it's not been backported.
Comment 13 Larry the Git Cow gentoo-dev 2022-01-02 03:05:58 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fab20bbcbf246d0868b8d70b02ced33972f7c137

commit fab20bbcbf246d0868b8d70b02ced33972f7c137
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-01-02 03:02:35 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-01-02 03:02:35 +0000

    sys-cluster/openmpi: add upstream patch for build failure
    
    Closes: https://bugs.gentoo.org/828123
    Signed-off-by: Sam James <sam@gentoo.org>

 .../files/openmpi-4.1.2-missing-includes.patch     | 32 ++++++++++++++++++++++
 sys-cluster/openmpi/openmpi-4.1.2.ebuild           |  6 +++-
 2 files changed, 37 insertions(+), 1 deletion(-)
Comment 14 Jeff Squyres 2022-01-02 17:17:03 UTC
I am one of the Open MPI maintainers.

You should not enable the heterogeneous functionality in Open MPI v4.0.x or v4.1.x -- it is currently known to be broken.  Specifically, you should *not* include --enable-heterogeneous when building the Open MPI package.

Unfortunately, it looks like we had a minor glitch in our README such that the "Do not use this functionality!" warning was accidentally located in the wrong section, so you had no realistic way of knowing this.  :-(

I just posted this upstream at https://github.com/open-mpi/ompi/issues/9697#issuecomment-1003746357, but wanted to make sure it was known downstream here, too.
Comment 15 Larry the Git Cow gentoo-dev 2022-01-03 00:22:01 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1e46e06ae70156fb4d4db508c727b1812e6a7aa4

commit 1e46e06ae70156fb4d4db508c727b1812e6a7aa4
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-01-03 00:20:38 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-01-03 00:21:48 +0000

    sys-cluster/openmpi: disable heterogeneous (unsupported, broken)
    
    Upstream have let us know (thank you!) that heterogeneous should
    _not_ be used for anything before 5.0.x (which is not out yet).
    
    We can look at restoring support in the future once it is ready
    upstream. Upstream documentation has been fixed to reflect this too.
    
    Closes: https://bugs.gentoo.org/828123
    Thanks-to: Jeff Squyres <jsquyres@cisco.com>
    Signed-off-by: Sam James <sam@gentoo.org>

 .../files/openmpi-4.1.2-missing-includes.patch     | 32 ----------------------
 sys-cluster/openmpi/openmpi-4.0.2-r1.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.3-r1.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.4-r1.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.5-r2.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.5-r3.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.6-r1.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.0.7.ebuild           |  6 ++--
 sys-cluster/openmpi/openmpi-4.1.1-r1.ebuild        |  6 ++--
 sys-cluster/openmpi/openmpi-4.1.2.ebuild           | 12 ++++----
 10 files changed, 30 insertions(+), 62 deletions(-)
Comment 16 Larry the Git Cow gentoo-dev 2024-07-12 05:55:26 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=496a4f0ce86f43da3fe77ffd6c9bef2e41cf3852

commit 496a4f0ce86f43da3fe77ffd6c9bef2e41cf3852
Author:     Eli Schwartz <eschwartz93@gmail.com>
AuthorDate: 2024-06-10 04:05:03 +0000
Commit:     Eli Schwartz <eschwartz@gentoo.org>
CommitDate: 2024-07-12 05:54:15 +0000

    sys-cluster/openmpi: add 5.0.3
    
    A bunch of upstream changes occurred. In particular:
    
    - openmpi drops ALL support for 32-bit, and errors out in ./configure if
      you try. This follows pmix. Rip out all the multilib-minimal
      scaffolding.
    
    - libompitrace "was incomplete and unmaintained" and is now removed from
      the sources
    
    - upstream now defaults to --disable-dlopen, and configuring with
      libltdl enabled externally returns errors saying a non libltdl header
      doesn't exist. Unclear if it actually supports this
    
    - a couple dependencies can now be configured --with-*=external instead
      of passing paths
    
    - libibverbs handling is gone upstream and no longer makes sense to
      configure via USE flags (or at all):
      https://github.com/open-mpi/ompi/commit/59c8ab6da4276ff398453a54910c6c0fb67a153c
    
    Delayed:
    - heterogeneous was broken in older versions, and its USE flag is
      supposed to be restored. But the upstream docs still suggest it is
      broken.
    
    Independent of upstream rework of pmix, we take the opportunity of a
    version bump to build against the system pmix, resolving a longstanding
    bug due to openmpi publicly shipping its own pmix installation that
    stomps all over the global system namespace. Temporarily drop keywords
    which the pmix package lacks.
    
    Bug: https://bugs.gentoo.org/828123
    Closes: https://bugs.gentoo.org/652432
    Closes: https://bugs.gentoo.org/927828
    Closes: https://bugs.gentoo.org/930362
    Signed-off-by: Eli Schwartz <eschwartz93@gmail.com>
    Signed-off-by: Eli Schwartz <eschwartz@gentoo.org>

 sys-cluster/openmpi/Manifest             |   1 +
 sys-cluster/openmpi/openmpi-5.0.3.ebuild | 141 +++++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+)