Bug 637604 - www-client/firefox-57 fails to compile (suspect sys-devel/binutils-2.28 and later)
Summary: www-client/firefox-57 fails to compile (suspect sys-devel/binutils-2.28 and l...
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal with 1 vote
Assignee: Mozilla Gentoo Team
URL:
Whiteboard:
Keywords:
Duplicates: 637658
Depends on:
Blocks:
 
Reported: 2017-11-15 18:38 UTC by Evan Teran
Modified: 2019-03-31 20:06 UTC
CC List: 14 users

See Also:
Package list:
Runtime testing required: ---


Attachments
build log of the failure. (build.log.gz, 83.06 KB, application/gzip)
2017-11-15 18:40 UTC, Evan Teran
build.log.gz for LD_LIBRARY_PATH="" emerge --oneshot firefox (build.log.gz, 86.49 KB, application/gzip)
2017-11-19 02:48 UTC, Marien Zwart
Tiny patch to work around undefined symbol errors in Rust compilation, without using gold. (ff-cargo-dont-clobber-binutils-lib-search-path.patch, 783 bytes, patch)
2017-11-20 08:13 UTC, Psi
Trivial patch for >=dev-util/cargo-0.20.0 which resolves the invalid LD_LIBRARY_PATH issue that has been breaking Firefox builds (cargo-0.20.0-fix-bad-dylib-search-path.patch, 1.24 KB, patch)
2017-11-23 08:55 UTC, Psi

Description Evan Teran 2017-11-15 18:38:27 UTC
Attempting to build www-client/firefox-57 fails with the following errors:

---------

error: linking with `/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/build/cargo-linker` failed: exit code: 1


  = note: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: symbol lookup error: /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/../../../../x86_64-pc-linux-gnu/bin/ld: undefined symbol: rx_additional_link_map_text
          collect2: error: ld returned 127 exit status

---------

Googling for the symbol in question doesn't yield an awful lot of results, but they all seem to be at least related to binutils.

My keywords for firefox are:

=www-client/firefox-57*   ~amd64
=dev-lang/rust-1.19.0     ~amd64
=dev-libs/nspr-4.17       ~amd64
=dev-util/cargo-0.21.0    ~amd64
=virtual/rust-1.19.0      ~amd64
=media-libs/libpng-1.6.34 ~amd64
=dev-libs/nss-3.33        ~amd64

All of the dependencies seem to install correctly. I will attach the build log as well.

Reproducible: Always




$  emerge --info
Portage 2.3.13 (python 3.4.5-final-0, default/linux/amd64/13.0/desktop/plasma, gcc-6.4.0, glibc-2.25-r9, 4.12.14-gentoo x86_64)
=================================================================
System uname: Linux-4.12.14-gentoo-x86_64-Intel-R-_Core-TM-_i7-4800MQ_CPU_@_2.70GHz-with-gentoo-2.4.1
KiB Mem:    16419180 total,   6121496 free
KiB Swap:   16777212 total,  16777212 free
Timestamp of repository gentoo: Wed, 15 Nov 2017 17:30:01 +0000
Head commit of repository gentoo: 12ed9d8926e9e5e21655a2c294d173fb9686d8b7
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.28.1 p1.0) 2.28.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-java/java-config:     2.2.0-r3::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.14::gentoo, 3.4.5::gentoo, 3.5.4::gentoo
dev-util/cmake:           3.8.2::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.7::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69::gentoo
sys-devel/automake:       1.11.6-r1::gentoo, 1.15-r2::gentoo
sys-devel/binutils:       2.28.1::gentoo
sys-devel/gcc:            6.4.0::gentoo, 7.2.0::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.7::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r9::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: 

gentoo-overlay
    location: /home/eteran/projects/gentoo-overlay
    masters: gentoo
    priority: 0

vmware
    location: /var/lib/layman/vmware
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -finline-functions -pipe -ggdb"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php7.0/ext-active/ /etc/php/cgi-php7.0/ext-active/ /etc/php/cli-php7.0/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -finline-functions -pipe -ggdb"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs collision-protect compressdebug config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms splitdebug strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac acl acpi activities alsa amd64 android apache2 berkdb bluetooth branding bzip2 cairo cdda cdr clang clangcodemodel clangstaticanalyzer cli cmake consolekit cracklib crypt cups curl cxx dbus debugger declarative designer dri dts dvd dvdr emboss encode exif extra fam farstream firefox flac fontconfig fuse gd gdbm gif git glamor gold gpg gpm graphics graphite graphviz gtk hangouts hidpi highlight iconv icu ipv6 jit jpeg json kde kipi kwallet lcms ldap libkms libnotify lldb mad mng modules mp3 mp4 mpeg mtp mudflap multilib mysql mysqli nat ncurses networkmanager nls nptl ntfs ogg opencl opengl openmp pam pango pci pcre pdf pdo phonon plasma png policykit postproc ppds printsupport pulseaudio python qml qt3support qt4 qt5 readline resolvconf samba script sdl seccomp semantic-desktop session smbclient spell sql sqlite ssh ssl startup-notification subversion svg taglib tcpd threads thumbnail tiff tools touchpad truetype udev udisks unicode upower usb utils v4l v4l2 valgrind vmware-tools vmware_guest_linux vmware_guest_winPreVista vmware_guest_windows vorbis wallpapers webkit wext widgets wifi winbind wxwidgets x264 xattr xcb xcomposite xetex xinerama xml xnest xscreensaver xv xvid zenmap zeroconf zip zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon 
braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="pc" INPUT_DEVICES="evdev keyboard mouse synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LLVM_TARGETS="BPF NVPTX X86" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_4" PYTHON_TARGETS="python2_7 python3_4" QEMU_SOFTMMU_TARGETS="x86_64 i386 arm" QEMU_USER_TARGETS="x86_64 i386 arm" RUBY_TARGETS="ruby22" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Evan Teran 2017-11-15 18:40:05 UTC
Created attachment 504298 [details]
build log of the failure.
Comment 2 Bernd Buschinski 2017-11-15 21:18:22 UTC
Same here,
with complete ~amd64 and
sys-devel/gcc:            7.2.0::gentoo
sys-devel/binutils:       2.29.1-r1::gentoo
Comment 3 Jory A. Pratt gentoo-dev 2017-11-16 13:58:02 UTC
Try to unset LDFLAGS and see if you can duplicate the issue please.
Comment 4 Jory A. Pratt gentoo-dev 2017-11-16 13:59:08 UTC
*** Bug 637658 has been marked as a duplicate of this bug. ***
Comment 5 Evan Teran 2017-11-16 17:35:51 UTC
Can you clarify what you mean by "unset LDFLAGS"? I don't have any LDFLAGS specified in my make.conf, so it should be the system default.

Are you suggesting that I override the default and set them to an empty string?
Comment 6 Bernd Buschinski 2017-11-16 17:45:31 UTC
I tried
unset LDFLAGS && emerge -1v firefox
still fails the same way.

LDFLAGS="" emerge -1v firefox
also fails, the same way.
Comment 7 Sven B. 2017-11-16 20:05:03 UTC
Not sure if it was the same issue, but some time back I had a similar failure with an ff57beta build when using ld.bfd; switching to ld.gold solved the problem. I haven't tested again since.
Comment 8 Evan Teran 2017-11-16 21:10:15 UTC
@Sven: a quick and dirty test seems to show that ld.gold does not suffer from this issue:

$ LD_LIBRARY_PATH=/usr/lib ld.bfd --help
ld.bfd: symbol lookup error: ld.bfd: undefined symbol: rx_additional_link_map_text

$ LD_LIBRARY_PATH=/usr/lib ld.gold --help
Usage: ld.gold [options] file...
...

So that may be a viable workaround. What is the recommended method of telling emerge to use ld.gold (at least temporarily)?
Comment 9 Ian Stakenvicius (RETIRED) gentoo-dev 2017-11-16 21:19:47 UTC
Good question.. I believe you can override LD via /etc/portage/package.env on a per-package basis but I've never done it.  

If it's a one-off, running binutils-config --linker gold, doing your emerge, and then changing it back to ld.bfd after usually works.  

The alternative is to use gold all the time and file bugs on anything it doesn't work with -- most ebuilds that can't work with gold use a workaround to switch to ld.bfd internally already.

That said, ld.bfd is supported fine in general as I've built it on multiple systems so far without issue.  The specific cause of this is something we'll need to work out.
Comment 10 ozhdfw 2017-11-17 00:47:17 UTC
I trashed my system trying to sort out all the conflicts.  Firefox 57 should be masked until all the dependencies and conflicts are sorted out.
Comment 11 Evan Teran 2017-11-17 04:27:39 UTC
For reference, I have confirmed that the issue is with the standard ld and can be worked around by using ld.gold temporarily. The following worked for me:

    binutils-config --linker ld.gold
    emerge -a firefox
    binutils-config --linker ld.bfd
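
A minimal sketch of the same switch, build, switch-back sequence wrapped in a function, so that ld.bfd is restored even when the build fails. The function name `emerge_with_gold` is invented for illustration, and it uses `emerge --oneshot` rather than `emerge -a`; the `binutils-config` calls are the ones shown above.

```shell
# emerge_with_gold PKG...: switch to ld.gold, build, then always switch back.
# Hypothetical helper; needs root, like the commands it wraps.
emerge_with_gold() {
    binutils-config --linker ld.gold || return 1
    status=0
    emerge --oneshot "$@" || status=$?
    # Restore the default linker whether or not the build succeeded.
    binutils-config --linker ld.bfd
    return "$status"
}
```

Calling `emerge_with_gold firefox` would then return emerge's own exit status after the linker has been set back.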

@ozhdfw: How did you end up trashing your system? I used the following keywords to unmask it and didn't have any real issues:

=www-client/firefox-57*   ~amd64
=dev-lang/rust-1.19.0     ~amd64
=dev-libs/nspr-4.17       ~amd64
=dev-util/cargo-0.21.0    ~amd64
=virtual/rust-1.19.0      ~amd64
=media-libs/libpng-1.6.34 ~amd64
=dev-libs/nss-3.33        ~amd64

I do have several other ebuilds keyworded as ~amd64, but I don't think any of the others are a factor on this one.
Comment 12 Andreas K. Hüttel archtester gentoo-dev 2017-11-17 14:34:40 UTC
Not a stabilization blocker for binutils-2.29.1-r1 (since firefox-57 is not stable)
Comment 13 Evan Teran 2017-11-17 15:54:50 UTC
Worth noting that I'm not using binutils-2.29, I think I'm on 2.28 (stable)
Comment 14 Andreas K. Hüttel archtester gentoo-dev 2017-11-17 16:51:33 UTC
(In reply to Evan Teran from comment #13)
> Worth noting that I'm not using binutils-2.29, I think I'm on 2.28 (stable)

Oh. Thanks.
Comment 15 Andreas K. Hüttel archtester gentoo-dev 2017-11-17 17:43:01 UTC
(In reply to Evan Teran from comment #0)
> My keywords for firefox are:
> 
> =www-client/firefox-57*   ~amd64
> =dev-lang/rust-1.19.0     ~amd64
> =dev-libs/nspr-4.17       ~amd64
> =dev-util/cargo-0.21.0    ~amd64
> =virtual/rust-1.19.0      ~amd64
> =media-libs/libpng-1.6.34 ~amd64
> =dev-libs/nss-3.33        ~amd64
> 

I have the same packages and it emerges fine here. 

The only difference in the "system packages" is gcc-5.4 (stable).

Portage 2.3.14 (python 3.4.5-final-0, default/linux/amd64/13.0/desktop/plasma, gcc-5.4.0, glibc-2.25-r9, 4.13.9-gentoo x86_64)
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.29.1 p3) 2.29.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-lang/perl:            5.26.1-r1::gentoo
dev-lang/python:          2.7.14::gentoo, 3.4.5::gentoo
dev-util/cmake:           3.8.2::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.34.8::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69::gentoo
sys-devel/automake:       1.11.6-r1::gentoo, 1.15-r2::gentoo
sys-devel/binutils:       2.29.1-r1::gentoo
sys-devel/gcc:            5.4.0-r3::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r9::gentoo
Comment 16 Jory A. Pratt gentoo-dev 2017-11-18 21:58:37 UTC
Could someone please run `LD_LIBRARY_PATH="" emerge --oneshot firefox` and post the output of the build failure?
Comment 17 Marien Zwart 2017-11-19 02:48:24 UTC
Created attachment 504734 [details]
build.log.gz for LD_LIBRARY_PATH="" emerge --oneshot firefox
Comment 18 Marien Zwart 2017-11-19 02:51:38 UTC
Which version of rust are affected people building firefox with? I see this with dev-lang/rust-1.19, which installs itself into /usr/lib64. IIRC other versions of rust or rust-bin install to a separate directory. If the bogus LD_LIBRARY_PATH is actually the rust install dir, other versions of rust might be unaffected.
Comment 19 Jory A. Pratt gentoo-dev 2017-11-19 15:23:26 UTC
If anyone has time to join me on #gentoo-moz, we will work to debug the issue. I will be there until 17:00 UTC.
Comment 20 Psi 2017-11-20 08:13:23 UTC
Created attachment 505054 [details, diff]
Tiny patch to work around undefined symbol errors in Rust compilation, without using gold.

The attached patch resolves the OP's issue, which I also had.  It does so without resorting to gold or other changes to binutils configuration (Firefox doesn't build for me with gold anyway).  It changes a single line in a wrapper script used by Cargo during the build, fixing the library search path so that ld.bfd works again.

IMPORTANT: This is a quick-and-dirty workaround, meant for dropping into /etc/portage/patches/www-client/firefox/, not a long-term solution.  You MUST make sure that the path in the modified line points to your active binutils directory when building Firefox with it!  The patch, as attached, works if you're using sys-devel/binutils-2.29.1 (doesn't matter which -r revision) on amd64, without modification.  If your *active* version of binutils is, say, 2.28, then modify the path accordingly.  This involves just changing the version number in the path using your text editor.  It's not rocket science.

If I come up with a proper fix for the issue in the next couple of days, I will post it.

In the meantime, I will post my working theories about what's going on in a separate comment.
Comment 21 Psi 2017-11-20 09:26:16 UTC
My setup:
sys-devel/binutils-2.29.1-r1
sys-libs/binutils-libs-2.29.1-r1
dev-lang/rust-1.19.0
dev-util/cargo-0.21.0


What's happening is this:

During the Rust portion of the Firefox build, something is setting LD_LIBRARY_PATH before running the Rust/Cargo linker.  Since it is being changed during the build, the failure occurs even if the variable is unset when starting the build.  I believe the culprit is Cargo.  If I get the full command line used when invoking the linker, and run it manually (even running as the portage user, sandboxed, and having sourced the environment file to match my environment to what it was during the build), the command completes just fine.  However, if I manually invoke cargo (which runs the rustc compiler and the linker) with the command line used during the build, I get the failures.  More on those command lines later.  At this point I'm not actually sure if the linker is being invoked by cargo or by rustc, but I'm picking on Cargo here because the linker to run is set in a Cargo environment variable.

Cargo is setting the LD_LIBRARY_PATH environment variable to something like this when it runs:

LD_LIBRARY_PATH=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library/release/deps:/usr/lib

The Firefox build system tells Cargo to use a wrapper script for the linker command.  So when the wrapper script ${S}/build/cargo-linker is invoked, it invokes gcc, which invokes collect2, and on down until ld is invoked.  (BTW, that wrapper script can be modified to make it very useful for troubleshooting, by dumping the environment at invocation, and running gcc with -v.)  Now when ld is invoked, it searches for libbfd-${version}.so starting in the directories set in LD_LIBRARY_PATH, instead of its hardcoded path.  This doesn't affect gold because gold doesn't use libbfd.so.
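
As a rough illustration of that troubleshooting idea, a wrapper could log its environment before handing off to the real linker command. This is a hypothetical stand-in, not the actual `build/cargo-linker` script: only `MOZ_CARGO_WRAP_LD` and `MOZ_CARGO_WRAP_LDFLAGS` come from the build command line, while `debug_link` and `CARGO_LINKER_DEBUG_LOG` are names made up for this sketch.

```shell
# debug_link ARGS...: record the environment, then run the wrapped linker.
# Hypothetical sketch; the real wrapper script may be structured differently.
debug_link() {
    # CARGO_LINKER_DEBUG_LOG is an invented variable naming the log file.
    log="${CARGO_LINKER_DEBUG_LOG:-/tmp/cargo-linker-env.log}"
    {
        echo "=== linker wrapper invoked: $* ==="
        echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH-<unset>}"
        env | sort
    } >> "$log"
    # Word splitting is intentional: both variables hold a command plus flags.
    # Append -v to MOZ_CARGO_WRAP_LD by hand if gcc's verbose output is wanted.
    ${MOZ_CARGO_WRAP_LD} ${MOZ_CARGO_WRAP_LDFLAGS} "$@"
}
```

A wrapper file built on this would end with `debug_link "$@"`; the log then shows whether LD_LIBRARY_PATH was already set by the time the linker was reached.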

The trouble is, if you have sys-libs/binutils-libs installed with the same version as your active sys-devel/binutils, then ld finds libbfd-${version}.so in /usr/lib instead of in /usr/lib64/binutils/x86_64-pc-linux-gnu/${version}!  Unfortunately, this /usr/lib/libbfd-${version}.so is a stripped-down variant of the libbfd.so in binutils' directory, so it lacks some symbols (like rx_additional_link_map_text) which exist in the libbfd.so that ld normally uses.  And while that symbol is completely unnecessary in this context, it becomes necessary when binutils is linked with -z now, which forces all symbols to be resolved at load time (which is not a bad thing).

So basically, because Cargo (or maybe Rust) plays with LD_LIBRARY_PATH, it causes the linker to find the wrong variant of the library it uses for most of its heavy lifting, so the linker fails (with some very strange errors if you managed to pass -v to gcc).

My little workaround patch works because it adds the correct location of libbfd.so to the beginning of the LD_LIBRARY_PATH.  It does this in the wrapper script that invokes the linker, so it doesn't corrupt the overall environment.
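
The effect of such a change can be sketched as a small helper that puts one directory at the front of LD_LIBRARY_PATH. This is not the attached patch, whose exact contents are not reproduced here; `prepend_lib_dir` is an invented name.

```shell
# prepend_lib_dir DIR: put DIR first in LD_LIBRARY_PATH.
# Hypothetical helper illustrating the idea behind the workaround.
prepend_lib_dir() {
    if [ -n "${LD_LIBRARY_PATH-}" ]; then
        LD_LIBRARY_PATH="$1:${LD_LIBRARY_PATH}"
    else
        # Avoid a trailing ':' here; an empty entry would mean the
        # current directory to the dynamic loader.
        LD_LIBRARY_PATH="$1"
    fi
    export LD_LIBRARY_PATH
}
```

With binutils 2.29.1 active on amd64, the call would be `prepend_lib_dir /usr/lib64/binutils/x86_64-pc-linux-gnu/2.29.1`, so that ld.bfd finds its own libbfd before the copy in /usr/lib.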

More on the command lines invoked:

This is an example of a cargo-linker wrapper script invocation that fails during build, but succeeds when run manually:
/usr/bin/x86_64-pc-linux-gnu-gcc -std=gnu99 -lpthread -fuse-linker-plugin -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wtrampolines -Wstrict-aliasing=3 -fdiagnostics-color=always -Wa,-msse2avx -Wl,-O1 -z relro -Wl,-z,relro -z now -Wl,-z,now -Wl,--as-needed -Wl,-z,call-nop=prefix-0x40 -Wl,--enable-new-dtags -Wl,--hash-style=gnu -Wl,-rpath=/usr/lib64/firefox,--enable-new-dtags -fno-lto -z relro -z now -Wl,-z,relro,-z,now -Wl,-z,noexecstack -Wl,-z,text -Wl,-z,relro -Wl,-rpath-link,/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/dist/bin -Wl,-rpath-link,/usr/lib -Wl,--as-needed -Wl,-z,noexecstack -m64 -L /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib /var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library/release/build/heapsize-daaa6fd14a54f910/build_script_build-daaa6fd14a54f910.0.o -o /var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library/release/build/heapsize-daaa6fd14a54f910/build_script_build-daaa6fd14a54f910 -Wl,--gc-sections -pie -Wl,-O1 -nodefaultlibs -L /var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library/release/deps -L /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib -Wl,-Bstatic /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libstd-4afea1b35e3086ee.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/librand-702696134eb19eb7.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libcollections-43fd961efbc1b168.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libstd_unicode-e8e70dff3f5bb4b3.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-cb7775bc3aba988d.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-66cc4af1cea8fb0f.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-8d30e02f218f5a7e.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/liballoc_jemalloc-ef11d16aad61dd80.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-9cd71f19dcf4c605.rlib 
/usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libcore-5c94038e48113487.rlib /usr/lib64/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler_builtins-701380125126dfef.rlib -Wl,-Bdynamic -l dl -l rt -l pthread -l gcc_s -l pthread -l c -l m -l rt -l pthread -l util

This is an example of an invocation of the cargo command by the build system, which, when run either by the build system *or* manually, generates the above invocation and results in failure (thus the culprit lies here):
env   RUSTC_BOOTSTRAP=1 RUSTFLAGS=' -v -C target-cpu=native -C opt-level=3 -C debuginfo=0'  CARGO_TARGET_DIR=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library RUSTC=/usr/bin/rustc MOZ_SRC=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0 MOZ_DIST=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/dist LIBCLANG_PATH="/usr/lib64/llvm/5/lib64" CLANG_PATH="/usr/lib64/llvm/5/bin/clang" PKG_CONFIG_ALLOW_CROSS=1 RUST_BACKTRACE=full MOZ_TOPOBJDIR=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff MOZ_CARGO_WRAP_LDFLAGS="-lpthread -fuse-linker-plugin -Wno-deprecated-declarations -Wno-maybe-uninitialized -Wtrampolines -Wstrict-aliasing=3 -fdiagnostics-color=always -Wa,-msse2avx -Wl,-O1 -z relro -Wl,-z,relro -z now -Wl,-z,now -Wl,--as-needed -Wl,-z,call-nop=prefix-0x40 -Wl,--enable-new-dtags -Wl,--hash-style=gnu -Wl,-rpath=/usr/lib64/firefox,--enable-new-dtags -fno-lto -z relro -z now -Wl,-z,relro,-z,now -Wl,-z,noexecstack -Wl,-z,text -Wl,-z,relro -Wl,-rpath-link,/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/dist/bin -Wl,-rpath-link,/usr/lib" MOZ_CARGO_WRAP_LD=" /usr/bin/x86_64-pc-linux-gnu-gcc -std=gnu99" CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/build/cargo-linker /usr/bin/cargo rustc --verbose --release --frozen --manifest-path /var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/toolkit/library/rust/Cargo.toml --lib --target=x86_64-unknown-linux-gnu --features "servo bindgen cubeb_pulse_rust simd-accel no-static-ideograph-encoder-tables" -- -v -C target-cpu=native -C opt-level=3 -C debuginfo=0 -C lto

Hope this helps to track down the root cause, and provides a path to a permanent solution.  The exact root is probably obvious, but I'm just too tired right now to see it. ;-)  Will track that sucker down after some sleep.  Goodnight. :-)
Comment 22 Marien Zwart 2017-11-20 10:26:18 UTC
The symptoms in this bug are suspiciously similar to those on bug 600390 in rust. But I can't find the build system code being patched on that bug in Firefox, and rust itself is no longer affected by that bug.

It'd be nicer to unset LD_LIBRARY_PATH (or remove problematic entries from it) instead of adding more stuff to it. But I haven't tracked down where LD_LIBRARY_PATH is set by the firefox build system (I assume it's the build system and not rust/cargo, because packages like ripgrep that use cargo build fine, and the patch for bug 600390 was in build system code, not rust/cargo itself).
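
Removing a single problematic entry, rather than unsetting the whole variable, might look like the sketch below. `strip_lib_dir` is an invented name, and whether dropping /usr/lib is safe for the rest of the build is untested here.

```shell
# strip_lib_dir DIR: print LD_LIBRARY_PATH without entries equal to DIR.
# Empty entries are dropped as well. Hypothetical helper.
strip_lib_dir() {
    drop="$1"
    result=""
    old_ifs="$IFS"
    IFS=":"
    # Unquoted on purpose so the value is split on ':'.
    for entry in ${LD_LIBRARY_PATH-}; do
        if [ -n "$entry" ] && [ "$entry" != "$drop" ]; then
            result="${result:+$result:}$entry"
        fi
    done
    IFS="$old_ifs"
    printf '%s\n' "$result"
}
```

For example, `LD_LIBRARY_PATH="$(strip_lib_dir /usr/lib)"` keeps the build's own deps directory while removing the entry that exposes the wrong libbfd.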
Comment 23 Ian Stakenvicius (RETIRED) gentoo-dev 2017-11-20 18:02:43 UTC
Out of curiosity, has anybody tried this with sandbox-2.12 or newer?  Does it make any difference?
Comment 24 Sven B. 2017-11-20 19:10:58 UTC
(In reply to Ian Stakenvicius from comment #23)
> Out of curiosity, has anybody tried this with sandbox-2.12 or newer?  Does
> it make any difference?

No difference, though I haven't tried sandbox < 2.12.
Comment 25 Ian Stakenvicius (RETIRED) gentoo-dev 2017-11-20 19:26:03 UTC
(In reply to Sven B. from comment #24)
> (In reply to Ian Stakenvicius from comment #23)
> > Out of curiosity, has anybody tried this with sandbox-2.12 or newer?  Does
> > it make any difference?
> 
> No difference; at least i haven't tried sandbox < 2.12;

That's sufficient. The original poster had 2.10-r4, so that at least confirms to my satisfaction that sandbox manipulation of LD-related environment variables is not to blame here.
Comment 26 Psi 2017-11-20 20:07:49 UTC
Found the root cause.  Not sure about the best way to go about fixing it, but at least devs will know where to look.

Ian, I'm using sandbox-2.12 and I do experience the problem.

Marien, Firefox does play with LD_LIBRARY_PATH quite a bit (just grep the source, it's hideous), but it's not doing so in this case.  I've verified this by running Firefox's build commands manually with an identical environment to that of the build, and using a modified version of the cargo-linker script that dumps the environment every time it's invoked.  With no LD_LIBRARY_PATH set in the environment, manually running cargo reproduces the failure (and cargo-linker shows LD_LIBRARY_PATH to now be set), while running rustc results in success (and no LD_LIBRARY_PATH set), and so on.  I agree that messing with LD_LIBRARY_PATH isn't a good fix, but it is a useful workaround in this case because Cargo is setting LD_LIBRARY_PATH to a value that is missing what ld needs in order to work.  Since I can't say that unsetting the variable won't break Cargo's build (it prepends the build directory, which it probably needs to do), I left it be.  Obviously the patch isn't a distributable solution, since it uses a hardcoded version-specific path.

Which leads us to the root of the problem.  In Cargo's source you will find:

cargo-0.21.0/src/cargo/util/paths.rs
cargo-0.21.0/src/cargo/ops/cargo_rustc/compilation.rs

paths.rs defines the functions dylib_path_envvar() which returns the string "LD_LIBRARY_PATH" on a Linux system, and dylib_path(), which appears to return the contents of that variable.

compilation.rs has a function fill_env() which pre-populates the environment with items that it will need before it spawns a subprocess (i.e. rustc).  One of the environment variables it sets is LD_LIBRARY_PATH, by way of the functions defined in paths.rs

Based on the location of compilation.rs in the source, it obviously does this only when Cargo is invoked as 'cargo rustc', but that's what Firefox is doing.

Assuming binutils and binutils-libs 2.29.1 are installed and active: Having LD_LIBRARY_PATH set to contain /usr/lib in place of or before /usr/lib64/binutils/x86_64-pc-linux-gnu/2.29.1 causes ld.bfd to pick up the wrong libbfd-2.29.1.so.  It picks up the one in /usr/lib (the one installed by binutils-libs), which doesn't have all the symbols defined that ld.bfd expects.  This would happen any time the binutils and binutils-libs packages have the same version.

I'm not a Rust programmer, so I don't have a fix for Cargo.  I will play with it and see if one comes to mind, but I think a Rust master/Gentoo dev will need to come to the rescue on this one. :-)

On another note: It might be that the difference in the two libbfd.so's could be caused by building binutils with USE="multitarget" and binutils-libs with USE="-multitarget", which is the case for me, and probably most.  Could try building both with identical USE flags and comparing nm dumps of the two.  *If* building them both with USE="multitarget" does produce the correct symbols, and *if* it allows Firefox to successfully build, then we might want to consider requiring binutils-libs to have the same multitarget USE value as binutils.  If we're lucky, it might be a fix that avoids having to make weird patches to Cargo and/or Firefox.  If our current luck continues though, there might be subtle differences that would manifest as subtle bugs in edge cases like Cargo/Firefox. :-P  I'll give it a go and see what happens.
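
A sketch of that nm comparison, assuming GNU nm, awk, sort, and comm are available. The helper names `dump_symbols` and `only_in_first` are invented, and the library paths in the comments follow the locations described earlier for version 2.29.1.

```shell
# dump_symbols LIB: list the dynamic symbols LIB defines, one per line, sorted.
dump_symbols() {
    nm -D --defined-only "$1" | awk '{ print $NF }' | sort -u
}

# only_in_first LIST_A LIST_B: print lines found in LIST_A but not in LIST_B.
# Both files must already be sorted, as dump_symbols output is.
only_in_first() {
    comm -23 "$1" "$2"
}

# Possible use (paths assume binutils and binutils-libs 2.29.1 on amd64):
#   dump_symbols /usr/lib64/binutils/x86_64-pc-linux-gnu/2.29.1/libbfd-2.29.1.so > full.txt
#   dump_symbols /usr/lib/libbfd-2.29.1.so > libs.txt
#   only_in_first full.txt libs.txt
```

If the theory holds, the last command should list rx_additional_link_map_text among the symbols missing from the binutils-libs copy.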

HTH
Comment 27 Jory A. Pratt gentoo-dev 2017-11-20 22:00:51 UTC
(In reply to Psi from comment #26)

If you believe it is cargo, you can test cargo-0.22.0 from my dev overlay and see if you can duplicate the problem. I for one am using musl and have not had the luxury of seeing this build failure.
Comment 28 Psi 2017-11-21 03:01:47 UTC
(In reply to Jory A. Pratt from comment #27)
> If you believe it is cargo, you can test cargo-0.22.0 from my dev overlay
> and see if you can duplicate the problem. I for one am using musl and have
> not had the luxury of seeing this build failure.

Good news and bad news.

Good news first.  There is a no-patch-required workaround!  Rebuild sys-libs/binutils-libs with USE="multitarget".  In this case, ld.bfd is still finding a different libbfd.so than it normally does, but this one contains all of the needed symbols.  I'm composing this on a Firefox that was built using this method, and my browser hasn't grown a second head or anything.  No real reason why it should, given that binutils-libs provides a functional libbfd.so.  Again, this all assumes that your *active* binutils version matches your binutils-libs version.  If anyone's versions don't match, and you're having this issue, say something, because that would point to something else going on (ld has a NEEDED entry for libbfd-${version}.so, where ${version} is the version of binutils).


Bad news is that the issue still exists when using cargo-0.22.0. :-(  BTW, thank you for providing that in your overlay, Jory!  It has a couple of non-fatal "character constant must be escaped" errors when building the docs, but that doesn't seem to harm anything.  cargo-0.21.0 did not have those errors, but I think I'll keep 0.22.0.

Here's a shell one-liner that one can use to check if cargo is setting a broken LD_LIBRARY_PATH during build (one obviously has to run this after the Rust building starts but before it fails):

# printf "cargo: "; cat /proc/$(pgrep -o cargo)/environ | tr "\0" "\n" | grep LD_LIBRARY_PATH ; printf "\nrustc: "; cat /proc/$(pgrep -o rustc)/environ | tr "\0" "\n" | grep LD_LIBRARY_PATH

If you get output like this:

cargo: 
rustc: LD_LIBRARY_PATH=/var/tmp/portage/www-client/firefox-57.0/work/firefox-57.0/ff/toolkit/library/release/deps:/usr/lib

then you're getting the broken LD_LIBRARY_PATH with the /usr/lib on the end.
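For anyone who would rather script the same check, here is a minimal sketch in Rust. It is not part of Cargo or Firefox; the function names (ld_library_path, system_dirs_in) and the SYSTEM_DIRS list are made up for illustration. It assumes a Linux /proc filesystem and simply parses the NUL-separated environment block that the one-liner reads with cat and tr.

```rust
/// Extract the value of LD_LIBRARY_PATH from a NUL-separated environment
/// block, as found in /proc/<pid>/environ on Linux.
/// (Hypothetical helper, not taken from Cargo.)
fn ld_library_path(environ: &[u8]) -> Option<String> {
    environ
        .split(|&b| b == 0)
        .filter_map(|entry| std::str::from_utf8(entry).ok())
        .find_map(|entry| entry.strip_prefix("LD_LIBRARY_PATH=").map(str::to_owned))
}

/// Return the entries of a colon-separated search path that name a system
/// library directory. SYSTEM_DIRS is an illustrative list, not an exhaustive one.
fn system_dirs_in(path: &str) -> Vec<&str> {
    const SYSTEM_DIRS: [&str; 4] = ["/lib", "/lib64", "/usr/lib", "/usr/lib64"];
    path.split(':')
        .filter(|dir| SYSTEM_DIRS.contains(&dir.trim_end_matches('/')))
        .collect()
}

fn main() {
    // Optional first argument: the PID to inspect. Defaults to this process.
    let pid = std::env::args().nth(1).unwrap_or_else(|| "self".to_string());
    match std::fs::read(format!("/proc/{}/environ", pid)) {
        Ok(environ) => match ld_library_path(&environ) {
            Some(value) => println!(
                "LD_LIBRARY_PATH={} (system dirs: {:?})",
                value,
                system_dirs_in(&value)
            ),
            None => println!("LD_LIBRARY_PATH is not set"),
        },
        Err(err) => eprintln!("cannot read environ: {}", err),
    }
}
```

Pass it the PID of a running rustc (for example, the output of pgrep -o rustc) while the Rust part of the build is in progress. A non-empty "system dirs" list corresponds to the broken case shown above.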

The reason this is only manifesting with Firefox 57 is because this version builds crates with --crate-type bin.  All the failures I have encountered occur with that crate type.  --crate-type lib, --crate-type staticlib, and --crate-type rlib build fine (and Firefox 56 builds only those types).  The broken LD_LIBRARY_PATH is set in ALL cases, BUT the linker does NOT get invoked unless rustc is building with --crate-type bin (and possibly other types, this is just what I've encountered).  I've verified this by modifying the cargo-linker script to drop a file (since its output usually gets blackholed) whenever it is invoked.  All the rustc invocations that build libs never invoke cargo-linker, so the issue never arises with them.

Jory, does Firefox (to your knowledge) build different/fewer Rust components if Musl (or some other non-glibc) is in use?  I'd also be interested to know if your binutils and binutils-libs versions match, because *not* having the issue in that case would be interesting.

Some other miscellaneous things noted: when building cargo-0.22.0, I ran the above shell one-liner.  As expected, cargo did not have LD_LIBRARY_PATH in its environment, but the cargo-spawned rustc processes *did* have it.  However, it produced a "safe" LD_LIBRARY_PATH like this:

cargo: 
rustc: LD_LIBRARY_PATH=/var/tmp/portage/dev-util/cargo-0.22.0/work/cargo-0.22.0/target/release/deps

Note that it doesn't have a :/usr/lib at the end, so ld.bfd doesn't get confused about which directory to search for libbfd.so.  This looks to be the "normal" operation mode, and explains why Cargo can build itself (and other things) without this issue arising.

Still trying to figure out why cargo is appending /usr/lib to LD_LIBRARY_PATH when it's invoked by Firefox. :-(  There is nothing in the command that invokes cargo that looks like it could cause this.

And yes, I rebuilt binutils-libs with USE="-multitarget" so that I could continue to troubleshoot this.  Because I'm a masochist like that. :-P
Comment 29 Ian Stakenvicius (RETIRED) gentoo-dev 2017-11-21 03:08:19 UTC
This makes sense as to why my build tests never had this issue either -- I have USE="multitarget" sync'ed between binutils and binutils-libs on all of my test build environments.
Comment 30 Jory A. Pratt gentoo-dev 2017-11-21 04:16:42 UTC
The crates built are the same for all Firefox builds; it is not dependent on libc. As you stated, though, binutils and binutils-libs should match. This is something toolchain developers should guarantee, to ensure a fully functional system no matter what.
Comment 31 Marien Zwart 2017-11-21 09:10:45 UTC
Nice sleuthing. I do have multitarget on just binutils, so that fits.

Depending on ld working using libbfd from binutils-libs does seem a little iffy. Currently binutils-libs-2.29.1-r1 applies an older patchset than binutils-2.29.1-r1, for example. Maybe that doesn't affect libbfd, but relying on that seems undesirable, if easily avoided.

And I definitely don't see why binutils-libs and binutils should match. They're just libraries that binutils has an internal copy of, as far as I can tell. If the intention was to always have them match, binutils-config should just be providing libbfd and friends from the currently selected binutils, instead of having them be a separate (unslotted) package.

I don't understand why only "cargo rustc" (and not other cargo commands) would mess with LD_LIBRARY_PATH. If I understand https://github.com/rust-lang/cargo/issues/595 correctly, "cargo rustc" exists so you can use cargo to build all dependencies of some target, and then pass additional flags to the final rustc invocation. It's not obvious to me why that cargo command would need to run its subprocesses with a special LD_LIBRARY_PATH, while other cargo commands don't.

Maybe the LD_LIBRARY_PATH manipulation in the "rustc" subcommand is just wrong and/or out of sync with the rest of cargo, and this isn't normally noticeable?
Comment 32 Psi 2017-11-22 04:32:32 UTC
(In reply to Marien Zwart from comment #31)

Good catch on the patchset difference between binutils and binutils-libs.  I hadn't even thought to check.  As it turns out, the only differences between the patchsets for those two packages are fixes for the testsuite.  There are no changes to the installed code itself between the two patchsets.  This makes sense.  As long as the two packages are compatible to begin with (having compatible patchsets in the initial release), everything should work fine, since no ABI-breaking patches would be released later for the same SLOT, and significant bugfixes would likely be applied to both.  But your point is well taken.  There is no guarantee of ABI compatibility between the two packages, and while synchronizing the "multitarget" USE flags produces ABI-compatible versions of libbfd.so now, it still remains only a *workaround* for this bug, as the real problem is that Cargo misdirects the linker.  Fortunately, with the existing binutils{,-libs} packages, any incompatible differences between the two libbfd's *should* always result in link failure.  It should *not* produce a broken binary unless there is a weird bug in binutils itself, or unless you built binutils-libs with crazy CFLAGS (binutils filters out most CFLAGS but binutils-libs does not filter).  But yes, it's certainly not what one desires to do, and certainly doesn't count as a fix.  Just think: everyone who has not experienced this bug because their binutils{,-libs} USE flags already matched (like Ian) has really been using this workaround all along, without knowing it.  Kinda disturbing, really... :-)

As to your second point, the two packages *don't* have to match.  The two packages don't have to have matching USE flags, they don't have to be ABI-compatible or even be the same version, and that is clearly an intentional design choice (in fact, using differing *versions* of the two packages should prevent using the wrong libbfd.so altogether).  binutils-libs appears to exist mainly to give other software (such as ocaml) the ability to do its own linking directly.  It isn't part of the toolchain from a purely technical standpoint, as it doesn't provide the system with a linker.  Unless some errant program sets LD_LIBRARY_PATH and *accidentally* makes it provide part of a linker... :-P  Whether or to what degree it should match with binutils is debatable.  I can see good arguments (and have seen bugs appear) either way.  But that debate likely concluded a long time ago.

My understanding so far of the purpose of 'cargo rustc' matches yours.  It appears that it sets LD_LIBRARY_PATH to the location of the dependencies that it builds so that those dependencies are used by the build programs it generates.  Many packages have a build.rs that is compiled and executed in order to control the building of that package.  It's sort of like a Makefile that gets compiled into a binary.  http://doc.crates.io/build-script.html has a simple overview.  I was mistaken when I originally said that 'cargo rustc' is the only command that sets LD_LIBRARY_PATH.  Others do it too, but they use code from the cargo_rustc directory to do it (not so obvious after all).  I believe this issue may be related to https://github.com/rust-lang/cargo/issues/3366 which involves cargo adding invalid items to LD_LIBRARY_PATH.  There have been a number of changes since the fix for that issue was put in place, and the LD_LIBRARY_PATH sanitizing is now done in its own function.  It looks like another function adds some stuff to that variable *after* the sanitizing, which could be the problem.  I'm still going through the history of changes to the affected source file to see what was changed after issue 3366 and why.  On another front, I've found code in the build.rs scripts for some of the crates that trigger this problem that tells cargo to add a particular /lib directory to the search path.  Their doing so could be causing this, but Cargo is *supposed* to be filtering these out, so round and round it goes...

It's probably going to take me a couple of days to sort this one out, so no more dissertations from me for a while. :-D  It doesn't help that I'm having to learn Rust as I go.  Never had any desire to learn it.  Still don't.  I see now Mozilla's strategy for driving Rust adoption: break Firefox with it and everyone who troubleshoots it becomes a Rust programmer. :-P
Comment 33 Psi 2017-11-23 08:55:52 UTC
Created attachment 505874
Trivial patch for >=dev-util/cargo-0.20.0 which resolves the invalid LD_LIBRARY_PATH issue that has been breaking Firefox builds

The tl;dr:

dis fix ur forefox

This is The Fix™.  And it is *dead simple*.  It applies to cargo-0.20.0 and newer.  Put it in /etc/portage/patches/dev-util/cargo and let everyone know what happens.  Unless it fails.  Then don't tell anyone.


The explanation:

Cargo PR 4006 (https://github.com/rust-lang/cargo/pull/4006) added a default dynamic library search path to cargo in context.rs that is used in various parts of the code.  In compilation.rs, this and other search paths set elsewhere are added to LD_LIBRARY_PATH.  When building for the host system (i.e. not cross-building), the default path it adds is ${sysroot}/lib, which is /usr/lib on Gentoo.  I missed this for a while because I was focusing on all the code that dynamically generates search paths.  I didn't expect to find that they had effectively hardcoded /usr/lib into LD_LIBRARY_PATH and no one had noticed until now. :-P

Never, ever, add a system library path to LD_LIBRARY_PATH unless you have a damned good reason.  The failure of ld.bfd is a perfect example of why.
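To illustrate why, here is a deliberately simplified model of the lookup order. It assumes only the well-known rule that directories in LD_LIBRARY_PATH are searched before the system's configured and default directories; it ignores RPATH/RUNPATH and the ld.so cache format entirely. The Dir type, the resolve function, and the build directory path are hypothetical, and the "filesystem" is an in-memory table so the sketch runs anywhere.

```rust
/// A directory and the library file names it contains
/// (an in-memory stand-in for the real filesystem).
struct Dir<'a> {
    path: &'a str,
    files: &'a [&'a str],
}

/// Return the first directory that provides `lib`, searching the
/// LD_LIBRARY_PATH entries first and the default directories second.
fn resolve<'a>(
    lib: &str,
    ld_library_path: &[&str],
    default_dirs: &[&str],
    fs: &[Dir<'a>],
) -> Option<&'a str> {
    ld_library_path
        .iter()
        .chain(default_dirs.iter())
        .find_map(|wanted| {
            fs.iter()
                .find(|dir| dir.path == *wanted && dir.files.iter().any(|f| *f == lib))
                .map(|dir| dir.path)
        })
}

fn main() {
    let private = "/usr/lib64/binutils/x86_64-pc-linux-gnu/2.29.1";
    let files = ["libbfd-2.29.1.so"];
    // Both packages install a library with the same file name.
    let fs = [
        Dir { path: "/usr/lib", files: &files },
        Dir { path: private, files: &files },
    ];
    // Assumed default order: the binutils private directory comes first.
    let defaults = [private, "/usr/lib"];

    // No LD_LIBRARY_PATH: ld.bfd gets the libbfd it was built against.
    println!("{:?}", resolve("libbfd-2.29.1.so", &[], &defaults, &fs));

    // With /usr/lib in LD_LIBRARY_PATH, the binutils-libs copy shadows it.
    let broken = ["/build/deps", "/usr/lib"];
    println!("{:?}", resolve("libbfd-2.29.1.so", &broken, &defaults, &fs));
}
```

The second lookup lands in /usr/lib even though nothing about the system's own configuration changed, which is the failure mode described in this bug.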

Cargo is attempting to make sure the Rust shared system libraries are visible to the rustc compiler and anything cargo executes as part of a build.  It just goes a bit overboard.  What the patch does is set Cargo's default dynamic library search path for host builds to the same path used when doing a cross-build: the Rust private library directory named for the target tuple.  On a host build, the target tuple is the same as the host tuple, so we get all the correct libraries (the path is something like /usr/lib/rustlib/x86_64-unknown-linux-gnu/lib).  As there are identical copies of those libraries in /usr/lib, and because /etc/ld.so.conf has an entry for the private directory, most systems (Gentoo or other) shouldn't need cargo to add these paths to LD_LIBRARY_PATH at all.  In fact, you can patch it to search a non-existent directory and it still builds Firefox on a Gentoo system.  However, in this form, the patch not only works on Gentoo, but it's also upstreamable, and is as simple and non-invasive as a patch can possibly be.

Incidentally, the author of PR 4006 originally suggested the path used in the patch as a possible alternative, and provided a good rationale, then talked him/herself out of it.  Oh well. :-(

So far I have successfully done the following:

* Built Firefox using the patched Cargo
* Built Cargo using the patched Cargo (just replaced the "bootstrap" cargo binary with a symlink to the system cargo binary, which worked fine)
* Built Firefox using the patched Cargo that was built using a patched Cargo ;-)

Making sure Cargo could build itself was the best way I could find to make sure that it didn't fix Firefox but break something else.  I don't exactly have a lot of Rust projects lying around.

<grumble>I don't think I've *ever* had to put this much work into something that resulted in a freaking three-line patch.</grumble>

Hope it helps.

P.S. Obviously, I don't do a lot of texting or tweeting.
Comment 34 Marien Zwart 2017-11-23 10:33:09 UTC
Thanks! I can't test this against the "normal" cargo because of bug 626272. But I've tested that the patch applies to cargo-0.22.0, and that firefox, ripgrep and one of my local ebuilds still build afterwards (on a system where LD_LIBRARY_PATH=/usr/lib ld breaks).

I don't know if the patch is actually sane, but it hasn't obviously broken anything here :)
Comment 35 Ian Stakenvicius (RETIRED) gentoo-dev 2017-11-23 16:56:57 UTC
I've confirmed firefox-57 builds fine with a USE=multitarget mis-matched binutils{,-libs} and cargo-0.20.0 with this patch.

CC'ing Cardoe (and rust project for completeness) so they can review and apply the solution.
Comment 36 Jory A. Pratt gentoo-dev 2017-11-24 03:37:35 UTC
(In reply to Psi from comment #33)

                     rustlib.push("lib");
+                    rustlib.push("rustlib");
+                    rustlib.push(self.target_triple());
+                    rustlib.push("lib");

the patch is incomplete, you do not need both rustlib.push("lib");
Comment 37 Psi 2017-11-24 05:53:30 UTC
(In reply to Jory A. Pratt from comment #36)
> 
>                      rustlib.push("lib");
> +                    rustlib.push("rustlib");
> +                    rustlib.push(self.target_triple());
> +                    rustlib.push("lib");
> 
> the patch is incomplete, you do not need both rustlib.push("lib");

Um, yes, you do (only from a standpoint of strict "correctness" - it could point to /foo/bar/baz and cargo would still work - I tried it).  This is the *unmodified* code block from ${S}/src/cargo/ops/cargo_rustc/context.rs:

let mut rustlib = PathBuf::from(line);
if kind == Kind::Host {
    if cfg!(windows) {
        rustlib.push("bin");
    } else {
        rustlib.push("lib");
    }
    self.compilation.host_dylib_path = Some(rustlib);
} else {
    rustlib.push("lib");
    rustlib.push("rustlib");
    rustlib.push(self.target_triple());
    rustlib.push("lib");
    self.compilation.target_dylib_path = Some(rustlib);
}

At this point, the variable 'line' contains the first line of output from running 'rustc --print=sysroot --print=cfg'.  So the first line of code here sets rustlib to ${sysroot}, which is "/usr" in our case.  The 'if' conditional we take is 'kind == Kind::Host'.  We're not windows (the cfg!() is confusing - the "!" does not mean "not"), so we get rustlib.push("lib").  Now rustlib is set to "/usr/lib" (slashes are added automatically by push()).  Then host_dylib_path gets set to rustlib, which means it gets set to "/usr/lib".  host_dylib_path is put into LD_LIBRARY_PATH in fill_env() from compilation.rs.

What the patch does is to mostly duplicate what the outer 'else' conditional does, which is to take the "/usr" that is initially placed in rustlib, and effectively do (in pseudo-code):

rustlib = rustlib + "/lib" + "/rustlib" + "/" + target_triple() + "/lib"

which produces "/usr/lib/rustlib/x86_64-unknown-linux-gnu/lib" (or x86_64-unknown-linux-musl or whatever one's system happens to be).  The only difference between the 'if kind == Kind::Host' and the 'else' conditionals *after* patching is whether the path produced goes into host_dylib_path or target_dylib_path, which each get handled differently by compilation.rs:fill_env().  I avoided consolidating the common parts in order to keep the patch very simple, and keep it easy to change either of the conditionals separately.
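The two results can be reproduced outside of Cargo with a small standalone sketch. The function names below are hypothetical, and the sysroot and target triple are passed in as plain strings instead of being taken from rustc's output; only PathBuf::push() and cfg!() are the real standard-library pieces being demonstrated.

```rust
use std::path::PathBuf;

/// What the unpatched host branch computes: ${sysroot}/lib
/// (or ${sysroot}/bin when compiled for Windows).
fn host_dylib_path_unpatched(sysroot: &str) -> PathBuf {
    let mut rustlib = PathBuf::from(sysroot);
    // cfg!(windows) is a macro that evaluates to a compile-time bool;
    // the "!" is macro-call syntax, not negation.
    rustlib.push(if cfg!(windows) { "bin" } else { "lib" });
    rustlib
}

/// What the 'else' branch (and the patched host branch) computes:
/// ${sysroot}/lib/rustlib/${triple}/lib
fn rustlib_private_dir(sysroot: &str, triple: &str) -> PathBuf {
    let mut rustlib = PathBuf::from(sysroot);
    // Each push() appends one component to the same path.
    rustlib.push("lib");
    rustlib.push("rustlib");
    rustlib.push(triple);
    rustlib.push("lib");
    rustlib
}

fn main() {
    println!("{}", host_dylib_path_unpatched("/usr").display());
    println!(
        "{}",
        rustlib_private_dir("/usr", "x86_64-unknown-linux-gnu").display()
    );
}
```

On a Linux host the first line is the system path that breaks ld.bfd, and the second is the Rust private directory that the patch substitutes for it.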

In a shell (adjust the host tuple as needed) you should get the following if you have dev-lang/rust-1.19.0 installed (this is the default Rust install path, which is *not* what the ebuild is trying to use, but it does anyway - I'm preparing to file a separate bug on that once I am sure that it's not just me):

$ ls -1 /usr/lib/rustlib/x86_64-unknown-linux-gnu/
lib
$ ls -1 /usr/lib/rustlib/x86_64-unknown-linux-gnu/lib/
liballoc-8d30e02f218f5a7e.rlib
liballoc_jemalloc-ef11d16aad61dd80.rlib
liballoc_system-386a94a85dfb1a8e.rlib
libarena-a70787fed58f2e0f.so
... and so on (static and shared libs).

Again, it doesn't strictly matter on Gentoo what the code puts into host_dylib_path, just as long as it's not a system path or any other path that might have conflicting libraries.  The system's dynamic linker will find the Rust libraries on its own.

The path I chose to put in the patch was one of the only two paths that would make any sense to search for Rust's libraries, because Rust installs its shared libraries in two places (bit-for-bit identical copies in both places).  /usr/lib is one (which we've discovered causes problems), so the only other one to use would be /usr/lib/rustlib/${host-tuple}/lib.  The author of https://github.com/rust-lang/cargo/pull/4006 knew this, and just picked one.  We're just picking the other, which not only works on Gentoo, but should be more palatable to upstream than removing the code altogether, or other more invasive options.


Looking at your comment again, I realize you might be thinking that the push("directory") statements are pushing colon-separated *paths* into LD_LIBRARY_PATH.  If that's the case, then sorry for the noise.  The push("directory") statements append slash-separated directory names to a single path, which is later added to LD_LIBRARY_PATH by other code.  So the code *here* is only adding *one* path to LD_LIBRARY_PATH.  Other parts of the code add more paths, such as the directory containing build dependencies.
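To make the distinction concrete, here is a rough sketch of the second step, where complete paths are joined into one colon-separated value. This is not Cargo's fill_env(); prepend_search_paths is a hypothetical helper, and the directory names are examples. It relies only on std::env::split_paths() and std::env::join_paths(), which use the platform's path-list separator (":" on Linux).

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Put `extra` directories in front of an existing search-path value
/// and join everything with the platform's path-list separator.
fn prepend_search_paths(
    extra: &[PathBuf],
    existing: Option<&str>,
) -> Result<OsString, env::JoinPathsError> {
    let mut dirs: Vec<PathBuf> = extra.to_vec();
    if let Some(value) = existing {
        dirs.extend(env::split_paths(value));
    }
    env::join_paths(dirs)
}

fn main() {
    // One path, built from several push() calls (slash-separated).
    let mut host = PathBuf::from("/usr");
    host.push("lib");

    // Another complete path, e.g. a build dependency directory.
    let deps = PathBuf::from("/build/deps");

    // Only here do separate paths become a colon-separated list.
    let value = prepend_search_paths(&[deps, host], None).expect("valid paths");
    println!("LD_LIBRARY_PATH={}", value.to_string_lossy());
}
```

So four push() calls still contribute exactly one entry to the final list.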

If you want to make sure, you can always verify what is being set in LD_LIBRARY_PATH by using the shell one-liner from Comment 28 during a cargo build.  When building Firefox, the code path affected by the patch is taken.  In some other builds (seems to be things that run 'cargo build', such as cargo building cargo), that code path isn't taken at all (or possibly its results are overwritten), and host_dylib_path is not being set (thus the default path from context.rs does not get added to LD_LIBRARY_PATH for some build types, which is really preferable).
Comment 38 Peter Fox 2018-01-31 22:31:11 UTC
I had the same issue with building firefox, but upgrading to cargo-0.24.0 fixed it. Perhaps make firefox depend on this?
Comment 39 Jory A. Pratt gentoo-dev 2019-03-31 20:06:31 UTC
Please feel free to reopen and update any bug report that can be reproduced with current ESR builds (60.x). If you feel your issue in any of these bugs needs to be looked at again, reopen and update the bug, and please attach patches when appropriate. Thank you, Mozilla Team