Bug 144752 - dev-lang/ghc-6.4.2 --- mangler (ghc-asm) on sparc marks _module_registered as entry too strongly
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: Sparc Linux
Importance: Normal normal
Assignee: Duncan Coutts (RETIRED)
URL:
Whiteboard:
Keywords:
Duplicates: 144753
Depends on:
Blocks:
 
Reported: 2006-08-22 07:43 UTC by Ferris McCormick (RETIRED)
Modified: 2006-10-03 16:42 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Proposed fix for ghc-asm mangler problem on sparc-linux (sparc-ghc-asm.patch,837 bytes, patch)
2006-08-23 12:06 UTC, Ferris McCormick (RETIRED)

Description Ferris McCormick (RETIRED) gentoo-dev 2006-08-22 07:43:21 UTC
(Assigned to Duncan Coutts at his request.)
On sparc, the ghc-asm mangler for ghc-6.4.2 marks the symbol _module_registered as:
00000004 C _module_registered
instead of as
0000000000000000 b _module_registered
(as taken from amd64).
This results in failures from ghci, thus:
==========================================
GHCi runtime linker: fatal error: I found a duplicate definition for symbol
   _module_registered
whilst processing object file
   /usr/lib/ghc-6.4.2/HShaskell98.o
===========================================
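To illustrate why the 'C' (common, global) marking matters while 'b' (local BSS) does not, here is a toy model in C. It is only a sketch and not GHCi's actual linker code: it assumes a loader that keeps one shared namespace for all non-local symbols and rejects a second definition of the same name. The names symtab, symtab_define and MAX_GLOBALS are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: these names are hypothetical, not taken from GHCi. */
#define MAX_GLOBALS 64

struct symtab {
    const char *names[MAX_GLOBALS]; /* global names defined so far */
    int count;
};

/* nm prints a lowercase type letter for most local symbols and an
   uppercase one for global symbols (a few letters such as 'u', 'v'
   and 'w' are exceptions and are not handled here). */
int nm_type_is_local(char nm_type) {
    return nm_type >= 'a' && nm_type <= 'z';
}

/* Record one symbol from an object file.
   Returns 0 on success, -1 on a duplicate global definition
   (or if the table is full). */
int symtab_define(struct symtab *t, const char *name, char nm_type) {
    assert(t != NULL && name != NULL);
    if (nm_type_is_local(nm_type))
        return 0;                  /* locals never enter the shared namespace */
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0)
            return -1;             /* "duplicate definition for symbol" */
    }
    if (t->count == MAX_GLOBALS)
        return -1;
    t->names[t->count++] = name;
    return 0;
}
```

In this model, any number of objects may each carry a 'b' _module_registered, but the second object that carries a 'C' _module_registered is rejected, which matches the shape of the ghci error above.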

Apparently, this was a known problem for ghc on sparc/solaris and fixed there, but it is also present on linux.  See https://sourceforge.net/tracker/?func=detail&atid=108032&aid=1170933&group_id=8032

Problem is specific to sparc, but otherwise should be independent of the environment.

I note a cross-reference to https://bugs.gentoo.org/show_bug.cgi?id=140369 for bookkeeping purposes; feel free to ignore it.
Comment 1 Ferris McCormick (RETIRED) gentoo-dev 2006-08-22 07:52:57 UTC
*** Bug 144753 has been marked as a duplicate of this bug. ***
Comment 2 Duncan Coutts (RETIRED) gentoo-dev 2006-08-22 08:20:18 UTC
You may like to try the same fix that was used for Sparc/Solaris:

http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl.diff?r2=1.138&r1=1.137&f=H

As you can see, that patch is applied inside of an:
  "elsif ( $TargetPlatform =~ /^sparc-.*-(solaris2|openbsd)/ )"

I hope we can find a similar code snippet for sparc linux and patch it in a similar way. If this works, please ask one of the Haskell team members to submit the patch upstream, as upstream is just about to release GHC 6.6.
Comment 3 Ferris McCormick (RETIRED) gentoo-dev 2006-08-22 10:07:16 UTC
It should be something very similar.  Hard part is figuring out how to look at the assembly code, but this is what has to happen:

Consider this tiny C program:
==============================
int        _module_registered_;
static int _module_registered ;
main() {
    _module_registered  =  1;
    _module_registered_ = 42;
}
=============================
The assembly code from this is:
==============================
        .file   "t.c"
        .section        ".text"
        .align 4
        .align 32
        .global main
        .type   main, #function
        .proc   04
main:
        !#PROLOGUE# 0
        !#PROLOGUE# 1
        sethi   %hi(_module_registered), %g1
        mov     1, %g2
        st      %g2, [%g1+%lo(_module_registered)]
        sethi   %hi(_module_registered_), %g1
        mov     42, %g3
        retl
        st      %g3, [%g1+%lo(_module_registered_)]
        .size   main, .-main
        .common _module_registered_,4,4
        .local  _module_registered
        .common _module_registered,4,4
        .section        ".note.GNU-stack"
        .ident  "GCC: (GNU) 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.9)"
=================================================
And the corresponding object file shows:
=================================================
00000000 T main
00000000 b _module_registered
00000004 C _module_registered_
=================================================
So, what we need is a .local inserted before the .common for _module_registered.

Now, experimenting on amd64 (x86_64), notice that the corresponding assembly code is:
        .comm   _module_registered_,4,4
        .local  _module_registered
        .comm   _module_registered,4,4
so, it is identical to sparc except for .comm instead of .common but the libraries for amd64 are correct.
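The rule being applied here (a .local directive ahead of the .comm/.common directive gives 'b', its absence gives 'C') can be written down as a small checker. This is a rough sketch under assumptions: common_symbol_kind is a hypothetical helper, it only understands the directive layout gcc emits in the listings above, and it skips lines longer than its buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: predicts the nm type letter for `sym` from
   assembly text. Returns 'b' if a .local directive for `sym` precedes
   its .comm/.common directive, 'C' if the symbol is only declared
   common, and '?' if no common declaration for it is found. */
char common_symbol_kind(const char *asm_text, const char *sym) {
    assert(asm_text != NULL && sym != NULL);
    int seen_local = 0;
    const char *line = asm_text;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");   /* length of the current line */
        char buf[256], directive[16], name[64];
        if (len < sizeof buf) {
            memcpy(buf, line, len);
            buf[len] = '\0';
            /* First token: directive. Second token: name up to a comma. */
            if (sscanf(buf, " %15s %63[^, \t]", directive, name) == 2
                && strcmp(name, sym) == 0) {
                if (strcmp(directive, ".local") == 0)
                    seen_local = 1;
                else if (strcmp(directive, ".comm") == 0
                         || strcmp(directive, ".common") == 0)
                    return seen_local ? 'b' : 'C';
            }
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return '?';
}
```

Fed the three-line amd64 fragment above, this returns 'C' for _module_registered_ and 'b' for _module_registered, which agrees with the nm output shown for the C test program.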

Finally, experimentation shows that if you just run the mangler on amd64 output from gcc, it doesn't convert anything, either.

That is not everything, however.  If you take some .hs file, say, Expr.hs from the hugs98 demo files, and just use ghc on it, you see that:
00000000 b _module_registered
both before and after mangling.
And, indeed, the assembly code put out by ghc is as it should be.
At this point, I am confused, and am rebuilding ghc to keep the build directory around.  Also, to see if a second build has any effect.
Comment 4 Ferris McCormick (RETIRED) gentoo-dev 2006-08-22 11:42:18 UTC
OK, here is where the bad external gets introduced.

ghc itself normally seems to set things up the way we want (as above).  However, when it builds, say, ghc-6.4.2/libraries/haskell98/libHShaskell98.a, it has split everything up and converted the local _module_registered into
00000004 C _module_registered
and nothing ever converts this back to local.  I have no idea why or how this happens, nor what should convert it back, so at this point I await more information.

I suppose it is related to -split-objs, but don't know where to go from there.
Comment 5 Duncan Coutts (RETIRED) gentoo-dev 2006-08-22 12:43:17 UTC
That's very interesting, thanks for investigating.
So it's not the fault of the mangler.

As for the splitter, suppose we have Foo.hs:

module Foo where

foo = 0


and then...

mkdir Foo_split
ghc -v -c Foo.hs -split-objs

On my amd64 it tells us:
*** Splitter
/usr/lib64/ghc-6.4.2/ghc-split /tmp/ghc3498.split_s /tmp/ghc3498.split tmp/ghc3498.split
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__1.o /tmp/ghc3498.split__1.s
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__2.o /tmp/ghc3498.split__2.s


/usr/lib/ghc-6.4.2/ghc-split is another perl script. It's cutting up the .s file into lots of little ones. Then ghc assembles each one.

The ghc build does this when building the standard libraries. All the resulting <module>_split/*.o files get put into the .a file, like libHSbase.a (which has >1000 tiny .o files in it).

This is not strictly necessary but means that later when linking we end up with much smaller binaries as only the necessary bits get linked in. (Yes, GNU ld should do that automatically but while it does it well for C code it doesn't seem to do so well for the Haskell code.)
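As a rough picture of what "cutting up the .s file" means, the sketch below counts the pieces a splitter would produce from one assembly file. It is only an illustration: count_split_pieces is a hypothetical function, and the marker string is passed in as a parameter because this sketch makes no claim about what ghc-split actually looks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch only: `marker` is a hypothetical split marker. Every line
   containing it is treated as a boundary; pieces with no non-empty
   lines are not counted. */
size_t count_split_pieces(const char *asm_text, const char *marker) {
    assert(asm_text != NULL && marker != NULL && *marker != '\0');
    size_t mlen = strlen(marker);
    size_t pieces = 0;
    int piece_has_content = 0;
    const char *line = asm_text;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");   /* length of the current line */
        int is_boundary = 0;
        /* Search for the marker within this line only. */
        for (size_t i = 0; i + mlen <= len; i++) {
            if (strncmp(line + i, marker, mlen) == 0) {
                is_boundary = 1;
                break;
            }
        }
        if (is_boundary) {
            if (piece_has_content)
                pieces++;                   /* close the piece before the marker */
            piece_has_content = 0;
        } else if (len > 0) {
            piece_has_content = 1;
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    if (piece_has_content)
        pieces++;                           /* final piece after the last marker */
    return pieces;
}
```

Each counted piece corresponds to one small .s file that is then assembled on its own, so any symbol declaration has to come out right in every piece, not just once per module.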

So if we don't figure it out we can just turn off building the standard libs using -split-objs. But if you do want to take a look, then comparing sparc vs amd64 again for the splitter, like you did for the mangler, seems like a good approach. Using -keep-tmp-files and -v you'll be able to see how to run the splitter manually and see what's going on.
Comment 6 Ferris McCormick (RETIRED) gentoo-dev 2006-08-22 13:53:48 UTC
I'll look at it tomorrow.  Thanks.
Comment 7 Ferris McCormick (RETIRED) gentoo-dev 2006-08-23 07:30:53 UTC
It's in the demangler.  Demangler does not do the same things when splitting and when not splitting; it needs a patch similar to the one for sparc-solaris (and to make it look more like the amd64 linux branch).  I am checking a patch to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
Comment 8 Ferris McCormick (RETIRED) gentoo-dev 2006-08-23 07:33:53 UTC
(In reply to comment #7)
> It's in the demangler.  Demangler does not do the same things when splitting
> and when not splitting; it needs a patch similar to the one for sparc-solaris
> (and to make it look more like the amd64 linux branch).  I am checking a patch
> to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
> 
read "[mM]angler" for "[dD]emangler" of course.
Comment 9 Ferris McCormick (RETIRED) gentoo-dev 2006-08-23 12:06:09 UTC
Created attachment 94957
Proposed fix for ghc-asm mangler problem on sparc-linux

Purpose of this patch is to keep mangler from removing .local attribute from the _module_registered symbol.  Without it, ghci fails as indicated above.

With this patch, the /usr/lib/ghc-6.4.2 directory entries appear to mirror each other on sparc and on amd64; further, the compiler builds and installs fine with it.  It looks as if all tests run successfully, but a complete test run on my (SB-smp) test system takes over seven hours, so I did not run all of it before forcing an install to verify ghci fix in a live environment.

Note further:  ghc-bin has the same problem and needs to be rebuilt (its libraries have multiple instances of external _module_registered, and so its version of ghci will fail.)  There is an obvious patch for its version of ghc-asm, however.

I believe this patch is correct, but it requires some independent verification.
Comment 10 Ferris McCormick (RETIRED) gentoo-dev 2006-09-05 07:01:50 UTC
Duncan,
  What's the status of this patch?  As best as I can tell, it seems to be correct.
Thanks,
Ferris
Comment 11 Duncan Coutts (RETIRED) gentoo-dev 2006-09-05 15:05:20 UTC
(In reply to comment #10)
> Duncan,
>   What's the status of this patch?  As best as I can tell, it seems to be
> correct.

Hi Ferris,
I'm back from the wilderness. I'm testing your patches now. Thanks for all the work.

Duncan

Comment 12 Duncan Coutts (RETIRED) gentoo-dev 2006-09-06 08:14:07 UTC
Cool! Great work Ferris!

The testsuite is now down from 406 to 17 failures!

OVERALL SUMMARY for test run started at Wed Sep  6 07:12:46 BST 2006
    1365 total tests, which gave rise to     4157 test cases, of which
       0 caused framework failures
     580 were skipped
    3509 expected passes
      51 expected failures
       0 unexpected passes
      17 unexpected failures 
Unexpected failures:
   arith011(prof)
   barton-mangler-bug(normal,opt,prof,ghci,threaded)
   ffi014(threaded)
   galois_raytrace(prof)
   ghciprog004(normal)
   ioref001(normal,prof,threaded)
   joao-circular(normal,opt,prof,threaded)
   seward-space-leak(ghci)

At least the barton-mangler-bug is a harmless minor difference in floating point precision between x86 and sparc. In others we run out of heap space.

So looking pretty good! I'll add this patch to ghc-6.4.2.ebuild. I'll take care of sending the patch upstream too.
Comment 13 Duncan Coutts (RETIRED) gentoo-dev 2006-10-03 16:42:29 UTC
Committed.