Emerge of nvidia-cuda-sdk fails with "/opt/cuda/lib/libcudart.so: file not recognized: File format not recognized"

Reproducible: Always

Steps to Reproduce:
emerge nvidia-cuda-sdk on an amd64 system

emerge --info:
Portage 2.1.6.13 (default/linux/amd64/2008.0, gcc-4.1.2, glibc-2.6.1-r0, 2.6.27.21-0.1-xen x86_64)
=================================================================
System uname: Linux-2.6.27.21-0.1-xen-x86_64-Dual_Core_AMD_Opteron-tm-_Processor_275-with-glibc2.2.5
Timestamp of tree: Wed, 22 Jul 2009 13:15:02 +0000
distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [disabled]
app-shells/bash: 3.2_p33
dev-java/java-config: 1.3.7, 2.1.6-r1
dev-lang/python: 2.4.4-r13, 2.5.4-r2
dev-python/pycrypto: 2.0.1-r6
dev-util/cmake: 2.6.2-r1
sys-apps/baselayout: 1.12.9-r2
sys-apps/sandbox: 1.2.18.1-r2
sys-devel/autoconf: 2.13, 2.61-r2
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.1
sys-devel/binutils: 2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool: 1.5.26
virtual/os-headers: 2.6.27-r2
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=opteron -pipe -O2"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /var/spool/torque"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=opteron -pipe -O2"
DISTDIR="/usr/portage/distfiles"
FEATURES="collision-protect distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch usersync"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
LANG="POSIX"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/layman/zymeworks-overlay /usr/portage/local/layman/science"
SYNC="rsync://portage.lan.zymeworks.com/gentoo-portage"
USE="X acl amd64 bash-completion berkdb bzip2 cli cracklib crypt doc dri fortran gdbm glibc-omitfp gtk hashstyle iconv isdnlog jpeg kerberos latex ldap midi mmx mudflap multilib ncurses nptl nptlonly opengl openmp pam pcre pdf perl png pppd python readline reflection session snmp spl sse sse2 ssl sysfs tcpd tiff unicode vim-syntax wxwindows xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 199489: output of emerge nvidia-cuda-sdk
Which version of nvidia-cuda-toolkit are you using? Regardless of which version that is, please try reinstalling it and see whether this changes anything. If not, please include the output of '/opt/cuda/lib/libcudart.so'. If it is a symbolic link, please follow it (and any subsequent links) and include the `file` output for the actual file it is pointing to.
(In reply to comment #2)
> anything. If not, please include the output of '/opt/cuda/lib/libcudart.so'.

That was supposed to be: 'file /opt/cuda/lib/libcudart.so'
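The request above (follow the symbolic link chain, then run `file` on the real target) could be done roughly as follows. This is only a sketch: the helper name `resolve_lib` is made up for illustration, and it assumes a `readlink` that supports `-f`, as GNU coreutils does.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): print the real file that a
# possibly symlinked library path finally points to.
# Assumes `readlink -f` (GNU coreutils), which follows every link in the chain.
resolve_lib() {
    readlink -f "$1"
}

# Usage, with the path from this report:
#   file "$(resolve_lib /opt/cuda/lib/libcudart.so)"
```

Running `file` on the resolved path rather than on the link avoids getting only a "symbolic link to ..." line back.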
I've got nvidia-cuda-toolkit-2.2-r1. I've tried reinstalling it, but it didn't help. It looks like libcudart.so.2.2 is a binary file included in the distribution, so I suspect reinstallation won't help.

file /opt/cuda/lib/libcudart.so.2.2
/opt/cuda/lib/libcudart.so.2.2: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), stripped
I managed to get this to work by creating my own version of the nvidia-cuda-toolkit ebuild and using the binaries from the rhel4.7 version instead. Sounds like some kind of ABI problem perhaps?
(In reply to comment #5)
> I managed to get this to work by creating my own version of the
> nvidia-cuda-toolkit ebuild and using the binaries from the rhel4.7 version
> instead. Sounds like some kind of ABI problem perhaps?

What is the output of `file /opt/cuda/lib/libcudart.so.2.2` after you have installed the package using your new ebuild?
/opt/cuda/lib/libcudart.so.2.2: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), stripped
Is there any difference in terms of dependencies (`ldd /opt/cuda/lib/libcudart.so.2.2`)?
rhel4.7/lib/libcudart.so.2.2
linux-vdso.so.1 => (0x00007fff629fe000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f905a4bb000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f905a29f000)
librt.so.1 => /lib64/librt.so.1 (0x00007f905a095000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f9059d89000)
libm.so.6 => /lib64/libm.so.6 (0x00007f9059b33000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f905991a000)
libc.so.6 => /lib64/libc.so.6 (0x00007f90595c1000)
/lib64/ld-linux-x86-64.so.2 (0x00007f905a814000)

and the one from the ebuild:

suse11.0/lib/libcudart.so.2.2
linux-vdso.so.1 => (0x00007fff20bfe000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f8d18770000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f8d18554000)
librt.so.1 => /lib64/librt.so.1 (0x00007f8d1834a000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f8d1803e000)
libm.so.6 => /lib64/libm.so.6 (0x00007f8d17de8000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f8d17bcf000)
libc.so.6 => /lib64/libc.so.6 (0x00007f8d17876000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8d18bcc000)
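The two listings above name the same libraries and differ only in the load addresses, which change from run to run. A rough way to compare such listings mechanically is to strip the addresses before diffing. The helper name `normalize_ldd` and the output file names are invented for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin, drop the trailing load
# address "(0x...)", trim leading whitespace, and sort the remaining lines
# so that two listings can be compared with diff.
normalize_ldd() {
    sed -e 's/[[:space:]]*(0x[0-9a-fA-F]*)[[:space:]]*$//' \
        -e 's/^[[:space:]]*//' | sort
}

# Usage, with the directory names from the listings above:
#   ldd rhel4.7/lib/libcudart.so.2.2  | normalize_ldd > rhel.txt
#   ldd suse11.0/lib/libcudart.so.2.2 | normalize_ldd > suse.txt
#   diff rhel.txt suse.txt && echo "same dependency set"
```

Note that `ldd` reports only which libraries get loaded, so an identical result here does not rule out differences in required symbol versions.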
(In reply to comment #9)
> rhel4.7/lib/libcudart.so.2.2

Could you also please try it with rhel5.3? I agree that this is most likely an ABI problem, but I'm wondering where exactly it is coming from. It seems that the only difference between rhel4.7 and suse11.0 is that the latter requires CXXABI_1.3.1, which should be a non-issue, as it is compatible with the gcc version you're using.
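The thread does not say how the CXXABI_1.3.1 requirement was found. One possible way, assuming `readelf` from binutils is available, is to dump the version information of each library and pick out the CXXABI tags. The helper name `cxxabi_versions` is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin (for example the output of
# `readelf -V <library>`) and print each distinct CXXABI_x.y[.z] tag once.
cxxabi_versions() {
    grep -o 'CXXABI_[0-9][0-9.]*' | sort -u
}

# Usage, with the directory names from the listings above:
#   readelf -V rhel4.7/lib/libcudart.so.2.2  | cxxabi_versions
#   readelf -V suse11.0/lib/libcudart.so.2.2 | cxxabi_versions
```

A tag that shows up for one library but not the other marks a symbol version that only that binary requires from libstdc++.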
Same problem with the version for RHEL 5.3
(In reply to comment #11)
> Same problem with the version for RHEL 5.3

OK. I did some more testing and it is indeed the ABI problem I mentioned in comment #10. I've just added -r3 of nvidia-cuda-toolkit-2.2 to the tree. It uses the older RHEL binaries, which should fix the problem.
I just came across this problem again with the 3.0 version. This time I managed to fix it by upgrading binutils to 2.20, so the older binutils appears to be the cause. You might want to update the dependency list for the ebuild to reflect that.
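As a sketch of what that dependency change might look like (an assumption, not the actual change made in the tree), an ebuild can require a minimum binutils version with a versioned dependency atom. In a real ebuild the atom would be added to the existing DEPEND string rather than replacing it.

```shell
# Hypothetical ebuild fragment: require binutils 2.20 or newer at build time.
# The atom syntax ">=category/package-version" is standard Portage; putting
# it in DEPEND for this particular package is an assumption.
DEPEND=">=sys-devel/binutils-2.20"
```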