Bug 489976 - sys-cluster/slurm - slurmctld: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/ /usr/lib64/slurm/ undefined symbol: slurm_container_get_pids
Product: Gentoo Linux
Component: [OLD] Server
Hardware: All Linux
Assignee: Alexey Shvetsov
Reported: 2013-10-31 14:58 UTC by Olaf Leidinger
Modified: 2013-11-05 21:00 UTC

output during build using hardened toolchain (slurm.log.xz,32.02 KB, application/x-xz)
2013-10-31 22:21 UTC, Olaf Leidinger

Description Olaf Leidinger 2013-10-31 14:58:23 UTC
After configuring slurm, it won't run:

 slurmctld -Dvvvvv
slurmctld: pidfile not locked, assuming no running daemon
slurmctld: error: Configured MailProg is invalid
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: debug2: slurmdb_init() called
slurmctld: Accounting storage FileTxt plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: not enforcing associations and no list was given so we are giving a blank list
slurmctld: debug2: No Assoc usage file (/var/tmp/slurm/slurmd/assoc_usage) to recover
slurmctld: slurmctld version 2.6.3 started on cluster cluster
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: Munge cryptographic signature plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: Consumable Resources (CR) Node Selection plugin loaded with argument 4
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: preempt/none loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: debug3: Success.
slurmctld: Checkpoint plugin loaded: checkpoint/none
slurmctld: debug3: Trying to load plugin /usr/lib64/slurm/
slurmctld: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/ /usr/lib64/slurm/ undefined symbol: slurm_container_get_pids
slurmctld: error: Couldn't load specified plugin name for jobacct_gather/linux: Dlopen of plugin file failed
slurmctld: error: cannot create jobacct_gather context for jobacct_gather/linux
slurmctld: fatal: failed to initialize jobacct_gather plugin


 # slurmd -D
slurmd: error: plugin_load_from_file: dlopen(/usr/lib64/slurm/ /usr/lib64/slurm/ undefined symbol: drain_nodes
slurmd: error: Couldn't load specified plugin name for select/cons_res: Dlopen of plugin file failed
slurmd: fatal: Can't find plugin for select/cons_res
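
To see at a glance which symbols the loader could not resolve, the daemon output above can be filtered. This is only a minimal sketch, assuming the messages keep the "undefined symbol: NAME" form shown in the logs; the function name undefined_symbols is made up for illustration.

```sh
# Hypothetical helper: read slurmctld/slurmd output on stdin and print
# each distinct symbol named in an "undefined symbol: ..." message.
undefined_symbols() {
    sed -n 's/.*undefined symbol: \([A-Za-z_][A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Example usage (daemon flags as in the report):
#   slurmctld -Dvvvvv 2>&1 | undefined_symbols
```

Knowing the symbol names makes it easier to check which binary or library is expected to provide them.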

The FAQ lists this as an error caused by an old version:

Yet, this is a first time installation.

Reproducible: Always

Steps to Reproduce:
I tried slurm 2.5.[4,6] and even bumped the ebuild of 2.5.6 to 2.6.3. Same problem every time.

# emerge --info
Portage 2.2.7 (hardened/linux/amd64, gcc-4.8.1, glibc-2.15-r3, 3.11.6-hardened x86_64)
System uname: Linux-3.11.6-hardened-x86_64-Intel-R-_Xeon-R-_CPU_E5-2660_0_@_2.20GHz-with-gentoo-2.2
KiB Mem:    65978028 total,  52892228 free
KiB Swap:   20971516 total,  20971516 free
Timestamp of tree: Thu, 31 Oct 2013 11:00:01 +0000
ld GNU ld (GNU Binutils) 2.23.1
app-shells/bash:          4.2_p45
dev-java/java-config:     2.1.12-r1
dev-lang/python:          2.7.5-r3, 3.2.5-r3
dev-util/pkgconfig:       0.28
sys-apps/baselayout:      2.2
sys-apps/openrc:          0.11.8
sys-apps/sandbox:         2.6-r1
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.13.4
sys-devel/binutils:       2.23.1
sys-devel/gcc:            4.8.1-r1
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r4
sys-kernel/linux-headers: 3.9 (virtual/os-headers)
sys-libs/glibc:           2.15-r3
Repositories: gentoo science java local-lusi
CFLAGS="-O2 -pipe -march=native  -fomit-frame-pointer"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native  -fomit-frame-pointer"
FCFLAGS="-O2 -pipe -march=native  -fomit-frame-pointer -fdefault-integer-8"
FEATURES="assume-digests binpkg-logs buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms splitdebug strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe -march=native  -fomit-frame-pointer -fdefault-integer-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTDIR_OVERLAY="/var/lib/layman/science /var/lib/layman/java /usr/local/portage"
USE="acl amd64 avx berkdb blas bzip2 cairo cli cracklib crypt cxx dri fftw gdbm gmp hardened hdf5 iconv int64 ipv6 jbig jpeg jpeg2k justify kmod lapack ldap lzma lzo mmx modules mudflap multilib ncurses nls nptl openipmi openmp pam pax_kernel pcre pdf png postscript qt3support readline session slurm sqlite sse sse2 sse3 ssl svg tbb tcpd threads tiff truetype udev unicode urandom xattr xpm zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" DRACUT_MODULES="btrfs ssh-client" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" OPENMPI_FABRICS="ofed" OPENMPI_OFED_FEATURES="connectx-xrc control-hdr-padding dynamic-sl failover rdmacm" OPENMPI_RM="slurm" PHP_TARGETS="php5-5" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_2" RUBY_TARGETS="ruby19 ruby18" USERLAND="GNU" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Comment 1 Olaf Leidinger 2013-10-31 19:18:17 UTC
Update: This seems to be related to gcc-4.8, as the daemon starts when compiled with icc. Testing with older gcc versions is in progress.

I posted to the upstream mailing list about this:
Comment 2 Olaf Leidinger 2013-10-31 19:27:15 UTC
This seems to be a problem with the hardening... Building with gcc-4.5 doesn't work either. However, selecting a vanilla gcc yields success!
Comment 3 Olaf Leidinger 2013-10-31 22:21:21 UTC
Created attachment 362374
output during build using hardened toolchain
Comment 4 Magnus Granberg gentoo-dev 2013-11-01 15:12:51 UTC
Try linking with lazy binding instead of now.
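
Whether a given plugin was linked with immediate binding can be read from its dynamic section. With -z now, every undefined function symbol in the plugin has to resolve at dlopen time, whereas with lazy binding a function is only resolved on its first call. The following is a rough sketch, assuming binutils' readelf is available; binds_now is an illustrative name, and PLUGIN stands for the plugin path, which is truncated in the log above.

```sh
# Hypothetical helper: read `readelf -d` output on stdin and succeed
# (exit 0) if the object requests immediate binding, i.e. it was
# linked with -z now (BIND_NOW in FLAGS, or NOW in FLAGS_1).
binds_now() {
    grep -Eq 'BIND_NOW|\(FLAGS_1\).*[^A-Z_]NOW([^A-Z_]|$)'
}

# Example usage (PLUGIN is a placeholder for the plugin file):
#   readelf -d "$PLUGIN" | binds_now && echo "linked with -z now"
```

If the check succeeds for a plugin built with the hardened toolchain and fails for one built with vanilla gcc, that would be consistent with the binding mode being the difference.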
Comment 5 Alexey Shvetsov archtester gentoo-dev 2013-11-04 11:25:50 UTC
Try slurm 2.6.3. It seems to work for me with gcc-4.8.1.
Comment 6 Olaf Leidinger 2013-11-04 11:34:47 UTC
I assume you used the hardened toolchain? 
As stated in the bug report: "I tried slurm 2.5.[4,6] and even bumped the ebuild of 2.5.6 to 2.6.3. Same problem every time."
Comment 7 Daniel M. Weeks 2013-11-05 21:00:44 UTC
As stated by Olaf, this is a problem with Slurm 2.6.3 as well. I have tested this myself and it is a problem with the plugin loading due to the hardened toolchain. Updating the system toolchain to fix a problem like this hardly seems appropriate.

Adding -Wl,-z,lazy using append-ldflags and gcc-specs-now resolves the problem without requiring the user to update gcc. I would propose this as an interim fix until a patch for the affected components' Makefiles can be created.
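
The interim fix proposed above boils down to making sure -Wl,-z,lazy ends up in LDFLAGS when the compiler defaults to -z now. Outside of Portage, the flag handling can be sketched in plain shell. ensure_lazy_ldflags is a hypothetical stand-in for what append-ldflags would do inside the ebuild, not code taken from the ebuild itself.

```sh
# Hypothetical helper: print the given LDFLAGS string with
# -Wl,-z,lazy appended, unless that flag is already present.
ensure_lazy_ldflags() {
    flags=${1-}
    case " $flags " in
        *" -Wl,-z,lazy "*) printf '%s\n' "$flags" ;;
        *) printf '%s\n' "${flags:+$flags }-Wl,-z,lazy" ;;
    esac
}

# Example usage with the LDFLAGS from the emerge --info output:
#   LDFLAGS=$(ensure_lazy_ldflags "-Wl,-O1 -Wl,--as-needed")
#
# Inside the ebuild, the equivalent would presumably be a one-liner
# using the helpers named in the comment above:
#   gcc-specs-now && append-ldflags -Wl,-z,lazy
```

Guarding the flag with gcc-specs-now keeps the change limited to toolchains that actually default to immediate binding.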