(Assigned to Duncan Coutts at his request.)

On sparc, the ghc-asm mangler for ghc-6.4.2 marks the symbol _module_registered as:

00000004 C _module_registered

instead of as:

0000000000000000 b _module_registered

(the latter taken from amd64). This results in failures from ghci, thus:

==========================================
GHCi runtime linker: fatal error: I found a duplicate definition for symbol
   _module_registered
whilst processing object file
   /usr/lib/ghc-6.4.2/HShaskell98.o
===========================================

Apparently, this was a known problem for ghc on sparc/solaris and was fixed there, but it is also present on linux. See
https://sourceforge.net/tracker/?func=detail&atid=108032&aid=1170933&group_id=8032

The problem is specific to sparc, but otherwise should be independent of the environment.

I note a cross-reference to https://bugs.gentoo.org/show_bug.cgi?id=140369 for bookkeeping purposes; feel free to ignore it.
*** Bug 144753 has been marked as a duplicate of this bug. ***
You may like to try the same fix that was used for Sparc/Solaris:
http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl.diff?r2=1.138&r1=1.137&f=H

As you can see, that patch is applied inside of an:

"elsif ( $TargetPlatform =~ /^sparc-.*-(solaris2|openbsd)/ )"

I hope we can find a similar code snippet for sparc linux and patch that in a similar way.

If this works then please ask one of the Haskell team members to submit the patch upstream, as upstream are just about to release GHC 6.6.
It should be something very similar. The hard part is figuring out how to look at the assembly code, but this is what has to happen.

Consider this tiny C program:
==============================
int _module_registered_;
static int _module_registered ;
main()
{
    _module_registered = 1;
    _module_registered_ = 42;
}
=============================
The assembly code from this is:
==============================
        .file   "t.c"
        .section        ".text"
        .align 4
        .align 32
        .global main
        .type   main, #function
        .proc   04
main:
        !#PROLOGUE# 0
        !#PROLOGUE# 1
        sethi   %hi(_module_registered), %g1
        mov     1, %g2
        st      %g2, [%g1+%lo(_module_registered)]
        sethi   %hi(_module_registered_), %g1
        mov     42, %g3
        retl
        st      %g3, [%g1+%lo(_module_registered_)]
        .size   main, .-main
        .common _module_registered_,4,4
        .local  _module_registered
        .common _module_registered,4,4
        .section        ".note.GNU-stack"
        .ident  "GCC: (GNU) 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.9)"
=================================================
And the corresponding object file shows:
=================================================
00000000 T main
00000000 b _module_registered
00000004 C _module_registered_
=================================================

So, what we need is a .local inserted before the .common for _module_registered.

Now, experimenting on amd64 (x86_64), notice that the corresponding assembly code is:

        .comm   _module_registered_,4,4
        .local  _module_registered
        .comm   _module_registered,4,4

so it is identical to sparc except for .comm instead of .common, but the libraries for amd64 are correct. Finally, experimentation shows that if you just run the mangler on amd64 output from gcc, it doesn't convert anything, either.

That is not everything, however. If you take some .hs file, say, Expr.hs from the hugs98 demo files, and just use ghc on it, you see:

00000000 b _module_registered

both before and after mangling. And, indeed, the assembly code put out by ghc is as it should be.
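The requirement stated above (a .local directive must precede the .common for _module_registered) can be sketched as a small text filter. This is only an illustration in Python, not the Perl code in ghc-asm.lprl; the function name ensure_local and the regular expressions are assumptions made up for this example.

```python
import re

# Matches ".common sym,size,align" (sparc) or ".comm sym,size,align" (amd64).
COMMON_RE = re.compile(r"^\s*\.comm(?:on)?\s+([A-Za-z_.$][\w.$]*)\s*,")
# Matches ".local sym".
LOCAL_RE = re.compile(r"^\s*\.local\s+([A-Za-z_.$][\w.$]*)\s*$")


def ensure_local(asm_text, symbol="_module_registered"):
    """Insert '.local <symbol>' before its .common/.comm line if it is missing.

    Hypothetical helper: it only demonstrates the shape of the fix.
    """
    out = []
    declared_local = False
    for line in asm_text.splitlines():
        m_local = LOCAL_RE.match(line)
        if m_local and m_local.group(1) == symbol:
            declared_local = True
        m_common = COMMON_RE.match(line)
        if m_common and m_common.group(1) == symbol and not declared_local:
            # The symbol would otherwise become a global common ('C') symbol.
            out.append("\t.local\t" + symbol)
            declared_local = True
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    broken = "\t.common\t_module_registered_,4,4\n\t.common\t_module_registered,4,4\n"
    print(ensure_local(broken), end="")
```

Running the filter a second time on its own output leaves the text unchanged, since the .local line is then already present. The similarly named _module_registered_ is left alone because the symbol name is compared exactly.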
At this point, I am confused, and am rebuilding ghc to keep the build directory around. Also, to see if a second build has any effect.
OK, here is where the bad external gets introduced.

ghc itself normally seems to set things up the way we want (as above). However, when it builds, say, ghc-6.4.2/libraries/haskell98/libHShaskell98.a, it has split everything up and converted the local _module_registered into

00000004 C _module_registered

and nothing ever converts this back to local.

I have no idea why or how this happens, nor what should convert it back, so at this point I await more information. I suppose it is related to -split-objs, but don't know where to go from there.
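One way to spot this condition across the many small objects in an archive is to scan nm output for names that appear as common ('C') symbols in more than one object. The following is a rough Python sketch that works on nm text already captured; parse_nm and common_symbol_counts are hypothetical helper names, and the sample strings are the nm lines quoted in this bug, not new measurements.

```python
from collections import Counter


def parse_nm(nm_output):
    """Parse nm text into (value, type, name) tuples; other lines are skipped."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            value, sym_type, name = parts
            symbols.append((value, sym_type, name))
        elif len(parts) == 2:
            # Undefined symbols carry no value, e.g. "U puts".
            sym_type, name = parts
            symbols.append((None, sym_type, name))
    return symbols


def common_symbol_counts(nm_outputs):
    """Count, per symbol name, how many objects list it as common ('C')."""
    counts = Counter()
    for text in nm_outputs:
        names = {name for _, sym_type, name in parse_nm(text) if sym_type == "C"}
        counts.update(names)
    return counts


if __name__ == "__main__":
    bad = "00000004 C _module_registered\n"
    good = "00000000 b _module_registered\n"
    for name, n in common_symbol_counts([bad, bad, good]).items():
        if n > 1:
            print(name, "is a common symbol in", n, "objects")
```

A name reported here for more than one object matches the symptom described above, where the local 'b' symbol has turned into a 'C' symbol in the split objects.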
That's very interesting, thanks for investigating. So it's not the fault of the mangler. As for the splitter, suppose we have Foo.hs:

module Foo where
foo = 0

and then...

mkdir Foo_split
ghc -v -c Foo.hs -split-objs

On my amd64 it tells us:

*** Splitter
/usr/lib64/ghc-6.4.2/ghc-split /tmp/ghc3498.split_s /tmp/ghc3498.split tmp/ghc3498.split
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__1.o /tmp/ghc3498.split__1.s
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__2.o /tmp/ghc3498.split__2.s

/usr/lib/ghc-6.4.2/ghc-split is another perl script. It cuts the .s file up into lots of little ones, and ghc then assembles each one.

The ghc build does this when building the standard libraries. All the resulting <module>_split/*.o files get put into the .a file, like libHSbase.a (which has >1000 tiny .o files in it). This is not strictly necessary, but it means that later, when linking, we end up with much smaller binaries, as only the necessary bits get linked in. (Yes, GNU ld should do that automatically, but while it does it well for C code it doesn't seem to do so well for the Haskell code.)

So if we don't figure it out we can just turn off building the standard libs using -split-objs.

But if you do want to take a look, then comparing sparc vs amd64 again for the splitter, like you did for the mangler, seems like a good approach. Using -keep-tmp-files and -v you'll be able to see how to run the splitter manually and see what's going on.
I'll look at it tomorrow. Thanks.
It's in the demangler. The demangler does not do the same things when splitting and when not splitting; it needs a patch similar to the one for sparc-solaris (and to make it look more like the amd64 linux branch). I am checking a patch to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
(In reply to comment #7)
> It's in the demangler. Demangler does not do the same things when splitting
> and when not splitting; it needs a patch similar to the one for sparc-solaris
> (and to make it look more like the amd64 linux branch). I am checking a patch
> to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
>

read "[mM]angler" for "[dD]emangler" of course.
Created attachment 94957
Proposed fix for ghc-asm mangler problem on sparc-linux

The purpose of this patch is to keep the mangler from removing the .local attribute from the _module_registered symbol. Without it, ghci fails as indicated above.

With this patch, the /usr/lib/ghc-6.4.2 directory entries appear to mirror each other on sparc and on amd64; further, the compiler builds and installs fine with it. It looks as if all tests run successfully, but a complete test run on my (SB-smp) test system takes over seven hours, so I did not run all of it before forcing an install to verify the ghci fix in a live environment.

Note further: ghc-bin has the same problem and needs to be rebuilt (its libraries have multiple instances of external _module_registered, and so its version of ghci will fail). There is an obvious patch for its version of ghc-asm, however.

I believe this patch is correct, but it requires some independent verification.
Duncan, What's the status of this patch? As best as I can tell, it seems to be correct. Thanks, Ferris
(In reply to comment #10)
> Duncan,
> What's the status of this patch? As best as I can tell, it seems to be
> correct.

Hi Ferris, I'm back from the wilderness. I'm testing your patches now. Thanks for all the work.

Duncan
Cool! Great work Ferris!

The testsuite is now down from 406 to 17 failures!

OVERALL SUMMARY for test run started at Wed Sep 6 07:12:46 BST 2006
    1365 total tests, which gave rise to
    4157 test cases, of which
       0 caused framework failures
     580 were skipped

    3509 expected passes
      51 expected failures
       0 unexpected passes
      17 unexpected failures

Unexpected failures:
   arith011(prof)
   barton-mangler-bug(normal,opt,prof,ghci,threaded)
   ffi014(threaded)
   galois_raytrace(prof)
   ghciprog004(normal)
   ioref001(normal,prof,threaded)
   joao-circular(normal,opt,prof,threaded)
   seward-space-leak(ghci)

At least the barton-mangler-bug failure is a harmless minor difference in floating point precision between x86 and sparc. In others we run out of heap space.

So looking pretty good! I'll add this patch to ghc-6.4.2.ebuild. I'll take care of sending the patch upstream too.
Committed.