Summary: | dev-lang/ghc fails to build on RAP systems | ||
---|---|---|---|
Product: | Gentoo/Alt | Reporter: | Horea Christian <gentoo> |
Component: | Prefix Support | Assignee: | Gentoo's Haskell Language team <haskell> |
Status: | CONFIRMED --- | ||
Severity: | normal | CC: | ac, indocomsoft, prefix |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
URL: | http://elephly.net/posts/2017-01-09-bootstrapping-haskell-part-1.html | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=591172 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | 591172 | ||
Bug Blocks: | |||
Attachments: |
emerge --info
Patch to relocate prebuilt binaries by overriding interpreter path
Patch to relocate prebuilt binaries by overriding interpreter path (ghc ebuild patch) |
Description
Horea Christian
2017-11-14 21:44:35 UTC
I think it's caused by the following sed code:

    if use prefix; then
        # and insert LD_LIBRARY_PATH entry to EPREFIX dir tree
        # TODO: add the same for darwin's CHOST and it's DYLD_
        local new_ldpath='LD_LIBRARY_PATH="'${EPREFIX}/$(get_libdir):${EPREFIX}/usr/$(get_libdir)'${LD_LIBRARY_PATH:+:}${LD_LIBRARY_PATH}"\nexport LD_LIBRARY_PATH'
        sed -i -e '2i'"$new_ldpath" \
            "${WORKDIR}/usr/bin/$(cross)ghc-${GHC_PV}" \
            "${WORKDIR}/usr/bin/$(cross)ghci-${GHC_PV}" \
            "${WORKDIR}/usr/bin/$(cross)ghc-pkg-${GHC_PV}" \
            "${WORKDIR}/usr/bin/$(cross)hsc2hs" \
            "${WORKDIR}/usr/bin/$(cross)runghc-${GHC_PV}" \
            "$gp_back" \
            || die "Adding LD_LIBRARY_PATH for wrappers failed"

Namely, ${EPREFIX}/$(get_libdir) is actively harmful because it tries to use prefix's libc but host's dynamic loader. They are not really compatible.

Try the following:

    --- a/dev-lang/ghc/ghc-8.0.2.ebuild
    +++ b/dev-lang/ghc/ghc-8.0.2.ebuild
    @@ -320,13 +320,2 @@ relocate_ghc() {
         if use prefix; then
    -        # and insert LD_LIBRARY_PATH entry to EPREFIX dir tree
    -        # TODO: add the same for darwin's CHOST and it's DYLD_
    -        local new_ldpath='LD_LIBRARY_PATH="'${EPREFIX}/$(get_libdir):${EPREFIX}/usr/$(get_libdir)'${LD_LIBRARY_PATH:+:}${LD_LIBRARY_PATH}"\nexport LD_LIBRARY_PATH'
    -        sed -i -e '2i'"$new_ldpath" \
    -            "${WORKDIR}/usr/bin/$(cross)ghc-${GHC_PV}" \
    -            "${WORKDIR}/usr/bin/$(cross)ghci-${GHC_PV}" \
    -            "${WORKDIR}/usr/bin/$(cross)ghc-pkg-${GHC_PV}" \
    -            "${WORKDIR}/usr/bin/$(cross)hsc2hs" \
    -            "${WORKDIR}/usr/bin/$(cross)runghc-${GHC_PV}" \
    -            "$gp_back" \
    -            || die "Adding LD_LIBRARY_PATH for wrappers failed"
             hprefixify "${bin_libpath}"/${PN}*/settings

If it works I'll commit it as-is.

Hi Horea,

What is the glibc version of your host system?

@Sergei, as we have discussed in the IRC, the fundamental problem is in the distributed ghc binary packages. They are compiled against new glibc's in the latest Gentoo, making them fail to resolve symbols on old glibc of enterprise linux.

IMHO, the only way out is to assume less from the host glibc.
Would you please distribute ghc binaries built against older versions of glibc? At least as old as glibc-2.5 on RHEL5 as of year 2017.

(In reply to Sergei Trofimovich from comment #1)

Besides, I could only build ghc on RAP and RHEL5 and 6 by manually overriding ELF interpreter to Gentoo with patchelf.

(In reply to Benda Xu from comment #2)
> At least as old as glibc-2.5 on RHEL5 as of year 2017.

I just noticed that RHEL5 has reached end-of-life. So the glibc supported should be as old as glibc-2.12 on RHEL6.

(In reply to Benda Xu from comment #2)
> Hi Horea,
>
> What is the glibc version of your host system?
>
> @Sergei, as we have discussed in the IRC, the fundamental problem is in the
> distributed ghc binary packages. They are compiled against new glibc's in
> the latest Gentoo, making them fail to resolve symbols on old glibc of
> enterprise linux.

I don't think the failure in #comment1 will disappear even if ghc would be built against the ancient glibc. There, libc.so.6 is loaded from the RAP system by the host's ld.so.

> IMHO, the only way out is to assume less from the host glibc. Would you
> please distribute ghc binaries built against older versions of glibc? At
> least as old as glibc-2.5 on RHEL5 as of year 2017.

Oldest non-masked available glibc in gentoo is 2.23. I don't think haskell@ can go older than that without major effort.

(In reply to Sergei Trofimovich from comment #5)
> (In reply to Benda Xu from comment #2)
> > @Sergei, as we have discussed in the IRC, the fundamental problem is in the
> > distributed ghc binary packages. They are compiled against new glibc's in
> > the latest Gentoo, making them fail to resolve symbols on old glibc of
> > enterprise linux.
>
> I don't think the failure in #comment1 will disappear even if ghc would be
> built against the ancient glibc. There libc.so.6 loaded from RAP system by host's ld.so.

I agree with this and your patch should be applied.
My point is that the fundamental issue is the backward compatibility on systems with old glibc. With your patch, Gentoo Prefix still cannot build ghc on most of the non-Gentoo systems.

> > IMHO, the only way out is to assume less from the host glibc. Would you
> > please distribute ghc binaries built against older versions of glibc? At
> > least as old as glibc-2.5 on RHEL5 as of year 2017.
>
> Oldest non-masked available glibc in gentoo is 2.23. I don't think haskell@
> can go older than that without major effort.

Therefore, in order to support Gentoo Prefix, either we bite the bullet to invest this major effort, or we find a way to override ELF interpreter reliably.

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a6af937438bbd6d88028a5cda7ff8ba20a16721e

commit a6af937438bbd6d88028a5cda7ff8ba20a16721e
Author:     Sergei Trofimovich <slyfox@gentoo.org>
AuthorDate: 2017-11-24 07:52:12 +0000
Commit:     Sergei Trofimovich <slyfox@gentoo.org>
CommitDate: 2017-11-24 07:52:31 +0000

    dev-lang/ghc: drop LD_LIBRARY_PATH hack, bug #637532

    LD_LIBRARY_PATH only worked for prefix systems using
    host's libc.

    On systems with prefix/libc it causes host's ld.so
    to load prefix's libc.so. They are incompatible
    as ld.so relies on presence of certain private
    symbols libc.so

    Reported-by: Horea Christian
    Bug: https://bugs.gentoo.org/637532
    Package-Manager: Portage-2.3.16, Repoman-2.3.6

 dev-lang/ghc/Manifest          | 62 +++++++++++++++++++++---------------------
 dev-lang/ghc/ghc-7.10.3.ebuild | 11 --------
 dev-lang/ghc/ghc-7.8.4.ebuild  | 15 ----------
 dev-lang/ghc/ghc-8.0.2.ebuild  | 11 --------
 dev-lang/ghc/ghc-8.2.1.ebuild  | 11 --------
 dev-lang/ghc/metadata.xml      |  4 +--
 6 files changed, 33 insertions(+), 81 deletions(-)

This works now.

(In reply to Benda Xu from comment #6)
> I agree with this and your patch should be applied. My point is that, the
> fundamental issue is the backward compatibility on system with old glibc.
> With your patch, Gentoo Prefix still cannot build ghc on most of the
> non-Gentoo systems.

Agreed.

> Therefore, in order to support Gentoo Prefix, either we bite the bullet to
> invest this major effort, or we find a way to override ELF interpreter
> reliably.

I would like to try the interpreter change approach first. I guess it's roughly a matter of running 'patchelf --set-interpreter' (and maybe changing R*PATH, we'll try to tweak upstream to make R*PATH not to be absolute at some point).

My questions are:

1. Do we have an INTERPRETER string to change to already exposed in an eclass? Or should we factor out something from toolchain-glibc? (Or autodiscover default interpreter from current compiler)

2. Does haskell@ need to build ghc in a special way (any special linker flags) to ease further INTERPRETER changes?

3. How to detect in an eclass the fact that interpreter needs to be changed? (How to detect we are targeting prefix/libc and not old-style prefix using host's libc?)

(In reply to Sergei Trofimovich from comment #9)
> > Therefore, in order to support Gentoo Prefix, either we bite the bullet to
> > invest this major effort, or we find a way to override ELF interpreter
> > reliably.
>
> I would like interpreter change approach first. I guess it's roughly a
> matter of running 'patchelf --set-interpreter'

That's true. However, the problem is that patchelf does not work on all the architectures, see bug 591172.

> (and maybe changing R*PATH,
> we'll try to tweak upstream to make R*PATH not to be absolute at some point).

As far as I can see, rpath does not work here. glibc ABI is not forward compatible. If libraries (ffi, ncurses, gmp, etc.) built against new glibc (from Gentoo) and dynamic linker from old host glibc are mixed, prebuilt ghc will fail on a different library in different environments. That's the worst of portability.

> My questions are:
>
> 1. Do we have an INTERPRETER string to change to already exposed in an
> eclass?

No we don't.
> Or should we factor out something from toolchain-glibc?

I don't think it is the business of glibc. The linker takes care of it, via --dynamic-linker=xxxxx, or equivalently via gcc's -Wl,--dynamic-linker=xxxx.

> (Or
> autodiscover default interpreter from current compiler)

That can only be done by parsing gcc specs,

    gcc -dumpspecs | grep dynamic-linker

Very complex patterns to look for.

> 2. Does haskell@ need to build ghc in a special way (any special linker
> flags) to ease further INTERPRETER changes?

Yes, although not INTERPRETER change: by providing a minimal ghc built statically, not dynamically linking against ffi, gmp, or ncurses fancy libraries.

> 3. How to detect in an eclass fact that interpreter needs to be changed?
> (How to detect we are targeting prefix/libc and not old-style prefix using
> host's libc?)

    if use prefix && ! use prefix-guest; then ...; fi

prefix-guest is the USE flag for "old-style" prefix-rpath.

Hi Sergei, is it possible to ship intermediate c code instead of ELF binaries?

see: https://elephly.net/posts/2017-01-09-bootstrapping-haskell-part-1.html and http://bootstrappable.org/projects.html

(In reply to Benda Xu from comment #11)
> Hi Sergei, is it possible to ship intermediate c code instead of ELF
> binaries?

For GHC not today. GHC does not generate portable C code and final result has many drawbacks: 2-8 times slower compilation times, lack of SMP support, binary incompatibility with GHC built using native code generator.

GHC used to have rudimentary support for bootstrapping with other haskell compilers but it was never finished and was eventually removed not only from compiler but also from standard libraries. Now GHC requires recent GHC for bootstrapping.

(In reply to Sergei Trofimovich from comment #12)
> (In reply to Benda Xu from comment #11)
> > Hi Sergei, is it possible to ship intermediate c code instead of ELF
> > binaries?
>
> For GHC not today.
> GHC does not generate portable C code and final result
> has many drawbacks: 2-8 times slower compilation times, lack of SMP support,
> binary incompatibility with GHC built using native code generator.

Do you mean the unregistered C here? The performance impact doesn't matter, because it is only for bootstrapping an optimized ghc.

The latest ghc document implies unregistered C is portable: https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/codegens.html#unregisterised-compilation

It might be:
1. rebuild ghc in unregistered C mode.
2. produce the intermediate unregistered C code to replace the pre-compiled ELF.
3. in an ebuild, compile the unregistered C into a suboptimal ghc.
4. use the suboptimal ghc to bootstrap an optimized one.

(In reply to Benda Xu from comment #13)
> (In reply to Sergei Trofimovich from comment #12)
> > (In reply to Benda Xu from comment #11)
> > > Hi Sergei, is it possible to ship intermediate c code instead of ELF
> > > binaries?
> >
> > For GHC not today. GHC does not generate portable C code and final result
> > has many drawbacks: 2-8 times slower compilation times, lack of SMP support,
> > binary incompatibility with GHC built using native code generator.
>
> Do you mean the unregistered C here? The performance impact doesn't matter,
> because it is only for bootstrapping an optimized ghc.

Yes, I'm talking about --enable-unregisterised mode of GHC.

> The latest ghc document implies unregistered C is portable:
> https://downloads.haskell.org/~ghc/8.0.2/docs/html/users_guide/codegens.html#unregisterised-compilation

To clarify: doc says not portable but "portable".

Generated C code is portable in a sense that you don't need to write platform-specific code to retarget GHC to the new platform (as opposed to write assembly backend for each new target). You just need a C compiler for a target. GHC will infer all the platform specifics from target C compiler at ./configure phase (or will assume worst).
Here is a simple example that shows the platform details of generated C code:

    -- M.hs
    module M where a x = x + 42 :: Int

when compiled for 64-bit LE platform and 32-bit BE platform produce:

    $ x86_64-UNREG-linux-gnu-ghc -fforce-recomp -C M.hs && mv M.hc M-x86_64.hc
    $ m68k-unknown-linux-gnu-ghc -fforce-recomp -C M.hs && mv M.hc M-m68k.hc
    $ diff -u M-x86_64.hc M-m68k.hc

     EC_(base_GHCziNum_zdfNumInt_closure);
     FN_(M_a_entry) {
     ...
    -if ((W_)((((W_)Sp+16) - 0x28UL) < (W_)SpLim)) goto _c195; else goto _c196;
    +if ((W_)((((W_)Sp+8) - 0x14U) < (W_)SpLim)) goto _c195; else goto _c196;
     ...
    -_c193 = (W_)Hp-7;
    +_c19f = (W_)Hp-3;

Note how low-level the code is. It checks for haskell stack overflow in bytes, precisely knowing how much is needed on 32-bit system and 64-bit system (Sp+<n>). Note also how 64-bit pointers have 3 bits for tags and 32-bit pointers have only 2 bits for tags (Hp-<n>).

This C code is not portable in the sense that generated C code can be compiled on any platform. C is used by ghc exactly as an assembler is used by gcc. Even if assembly syntax would be the same on every platform you won't be able to use intermediate assembly files to bootstrap gcc on anything except the targeted system.

> It might be:
> 1. rebuild ghc in unregistered C mode.
> 2. produce the intermediate unregistered C code to replace the pre-compiled
> ELF.
> 3. in an ebuild, compile the unregistered C into a suboptimal ghc.
> 4. use the suboptimal ghc to bootstrap an optimized one.

.hc file porting is how ghc was ported to a new platform in the olden days (around version ghc-6.4). It did not work at that time without a considerable amount of manual work. It works even worse nowadays due to bitrot of the .hc porting infrastructure. Proper cross-compilation has a better chance to succeed.

Hi Sergei,

(In reply to Sergei Trofimovich from comment #14)
> (In reply to Benda Xu from comment #13)
>
> To clarify: doc says not portable but "portable".
>
> Generated C code is portable in a sense that you don't need to write
> platform-specific code to retarget GHC to the new platform (as opposed to
> write assembly backend for each new target). You just need a C compiler for
> a target. GHC will infer all the platform specifics from target C compiler
> at ./configure phase (or will assume worst).
>
> For example simple example that shows the platform details of generated C
> code:
> -- M.hs
> module M where a x = x + 42 :: Int
> when compiled for 64-bit LE platform and 32-bit BE platform produce:
> $ x86_64-UNREG-linux-gnu-ghc -fforce-recomp -C M.hs && mv M.hc M-x86_64.hc
> $ m68k-unknown-linux-gnu-ghc -fforce-recomp -C M.hs && mv M.hc M-m68k.hc
> $ diff -u M-x86_64.hc M-m68k.hc
>
> EC_(base_GHCziNum_zdfNumInt_closure);
> FN_(M_a_entry) {
> ...
> -if ((W_)((((W_)Sp+16) - 0x28UL) < (W_)SpLim)) goto _c195; else goto _c196;
> +if ((W_)((((W_)Sp+8) - 0x14U) < (W_)SpLim)) goto _c195; else goto _c196;
> ...
> -_c193 = (W_)Hp-7;
> +_c19f = (W_)Hp-3;
>
> Note how low-level the code is. It checks for haskell stack overflow in
> bytes precisely knowing how much is needed on 32-bit system and 64-bit
> system(Sp+<n>). How 64-bit pointers have 3 bits for tags and 32-bit pointers
> have only 2 bits for tags (Hp-<n>).
>
> This C code is not portable in a sense that generated C code can be compiled
> on any platform. C is used by ghc exactly as an assembler is used by gcc.
> Even if assembly syntax would be the same on every platform you won't be
> able to use intermediate assembly files to bootstrap gcc on anything except
> the targeted system.

I did not mean to port ghc to a new architecture, but to make ghc bootstrap on Prefix, where EPREFIX is not a predefined location and so shipping prebuilt binaries does not help. In this regard, even keeping assembly-level code is acceptable, because we only need the flexibility to specify a prefix.

What do you think?
Benda

I don't think current build system supports "relink-objects" use case. As ghc is a bootstrapping compiler it relies on linker command constructed by ghc.

(In reply to Sergei Trofimovich from comment #16)
> I don't think current build system supports "relink-objects" use case. As
> ghc is a bootstrapping compiler it relies on linker command constructed by
> ghc.

If the missing piece is only a set of linker command line arguments, the "relink objects" way of compiling out a seed ghc for bootstrap is doable.

*** Bug 738098 has been marked as a duplicate of this bug. ***

Is this going anywhere? Honestly, I don't think we should bother about supporting "all" Prefix targets. rpath-based Linux Prefixes may just work, or not. I won't even think about macOS here. If it works on RAP now, then let's consider that a success, and cherish it.

Created attachment 685701
Patch to relocate prebuilt binaries by overriding interpreter path
Found this bug as I was about to open a new one to submit this patch. I see that this solution is controversial, but it is simple and worked fine on ppc64le, and is much better than being stuck without ghc and its dependees (pandoc).
If the solution is not good enough for merging, at least it will be archived here on bugzilla for users to apply the patch and move on (I'd leave the bug open so that it shows up in search results).
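For readers wondering roughly what "overriding the interpreter path" involves, here is a minimal shell sketch. It is not the attached patch, whose contents are not reproduced in this thread. The function names `interp_from_readelf` and `print_relocate_cmds` are hypothetical, the sketch assumes the usual `readelf -l` output line of the form `[Requesting program interpreter: ...]`, and it only prints the `patchelf --set-interpreter` commands mentioned earlier in the thread instead of running them.

```shell
# Hypothetical helper: read `readelf -l <binary>` output on stdin and print
# the ELF interpreter (PT_INTERP) path it reports. Prints nothing if the
# input has no interpreter line (e.g. a static binary).
interp_from_readelf() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}

# Hypothetical helper: dry run only. Print one `patchelf --set-interpreter`
# command per file instead of modifying anything.
# $1 = new interpreter path, remaining arguments = ELF files to relocate.
print_relocate_cmds() {
    new_interp=$1
    shift
    for elf in "$@"; do
        printf 'patchelf --set-interpreter %s %s\n' "$new_interp" "$elf"
    done
}
```

One possible use: build any small program with the Prefix compiler, pipe `readelf -l` of the result through `interp_from_readelf` to learn the Prefix loader path, then review the commands printed by `print_relocate_cmds` for the prebuilt ghc ELF files before running them. As noted above, patchelf does not work on all architectures (bug 591172), so this can only be a partial answer.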
Created attachment 685704
Patch to relocate prebuilt binaries by overriding interpreter path (ghc ebuild patch)
|