Bug#: 144752
Status: RESOLVED
Resolution: FIXED
Assigned To: Duncan Coutts (RETIRED) <dcoutts@gentoo.org>
Reporter: Ferris McCormick <fmccor@gentoo.org>

Attachment: sparc-ghc-asm.patch (patch, 837 bytes)
  Description: Proposed fix for ghc-asm mangler problem on sparc-linux
  Creator: Ferris McCormick, created 2006-08-23 12:06 0000

Description:   Opened: 2006-08-22 07:43 0000
(Assigned to Duncan Coutts at his request.)
On sparc, the ghc_asm mangler for ghc-6.4.2 marks the symbol _module_registered
as:
00000004 C _module_registered
instead of as
0000000000000000 b _module_registered
(as taken from amd64).
This results in failures from ghci, thus:
==========================================
GHCi runtime linker: fatal error: I found a duplicate definition for symbol
   _module_registered
whilst processing object file
   /usr/lib/ghc-6.4.2/HShaskell98.o
===========================================

Apparently, this was a known problem for ghc on sparc/solaris and fixed there,
but it is also present on linux.  See
https://sourceforge.net/tracker/?func=detail&atid=108032&aid=1170933&group_id=8032

Problem is specific to sparc, but otherwise should be independent from the
environment.

I note a cross-reference to https://bugs.gentoo.org/show_bug.cgi?id=140369 for
bookkeeping purposes; feel free to ignore it.

------- Comment #1 From Ferris McCormick 2006-08-22 07:52:57 0000 -------
*** Bug 144753 has been marked as a duplicate of this bug. ***

------- Comment #2 From Duncan Coutts (RETIRED) 2006-08-22 08:20:18 0000 -------
You may like to try the same fix that was used for Sparc/Solaris:

http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl.diff?r2=1.138&r1=1.137&f=H

As you can see, that patch is applied inside of an:
  "elsif ( $TargetPlatform =~ /^sparc-.*-(solaris2|openbsd)/ )"

I hope we can find a similar code snippet for sparc linux and patch that in a
similar way. If this works then please ask one of the Haskell team members to
submit the patch upstream, as upstream are just about to release GHC 6.6.

------- Comment #3 From Ferris McCormick 2006-08-22 10:07:16 0000 -------
It should be something very similar.  Hard part is figuring out how to look at
the assembly code, but this is what has to happen:

Consider this tiny C program:
==============================
int        _module_registered_;
static int _module_registered ;
main() {
    _module_registered  =  1;
    _module_registered_ = 42;
}
=============================
The assembly code from this is:
==============================
        .file   "t.c"
        .section        ".text"
        .align 4
        .align 32
        .global main
        .type   main, #function
        .proc   04
main:
        !#PROLOGUE# 0
        !#PROLOGUE# 1
        sethi   %hi(_module_registered), %g1
        mov     1, %g2
        st      %g2, [%g1+%lo(_module_registered)]
        sethi   %hi(_module_registered_), %g1
        mov     42, %g3
        retl
        st      %g3, [%g1+%lo(_module_registered_)]
        .size   main, .-main
        .common _module_registered_,4,4
        .local  _module_registered
        .common _module_registered,4,4
        .section        ".note.GNU-stack"
        .ident  "GCC: (GNU) 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.9)"
=================================================
And the corresponding object file shows:
=================================================
00000000 T main
00000000 b _module_registered
00000004 C _module_registered_
=================================================
So, what we need is a .local inserted before the .common for
_module_registered.
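
The rule can be illustrated with a small Python sketch. It is not part of the
mangler: classify_symbols is a hypothetical helper, and it assumes the
simplified model seen in the listings above, namely that a .comm/.common
symbol shows up in nm as "b" when a .local for it came first and as "C"
otherwise.

```python
import re

# Matches ".comm name,size,align" (amd64) and ".common name,size,align" (sparc).
COMMON_RE = re.compile(r"^\s*\.comm(?:on)?\s+([^,\s]+)\s*,")
# Simplification: only handles one symbol per .local directive.
LOCAL_RE = re.compile(r"^\s*\.local\s+([^,\s]+)")


def classify_symbols(asm_text):
    """Return {symbol: 'b' or 'C'} for every .comm/.common symbol.

    Assumed model: 'b' (local bss) if a .local for the symbol was seen
    earlier in the file, 'C' (global common) otherwise.
    """
    local_names = set()
    kinds = {}
    for line in asm_text.splitlines():
        m = LOCAL_RE.match(line)
        if m:
            local_names.add(m.group(1))
            continue
        m = COMMON_RE.match(line)
        if m:
            name = m.group(1)
            kinds[name] = "b" if name in local_names else "C"
    return kinds
```

Fed the tail of the assembly above, this reports _module_registered as "b";
with the .local line deleted it reports "C", which is the state of the broken
sparc libraries.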

Now, experimenting on amd64 (x86_64), notice that the corresponding assembly
code is:
        .comm   _module_registered_,4,4
        .local  _module_registered
        .comm   _module_registered,4,4
So it is identical to sparc except for .comm instead of .common, yet the
libraries for amd64 are correct.

Finally, experimentation shows that if you just run the mangler on amd64 output
from gcc, it doesn't convert anything, either.

That is not everything, however.  If you take some .hs file, say, Expr.hs from
the hugs98 demo files, and just use ghc on it, you see that:
00000000 b _module_registered
both before and after mangling.
And, indeed, the assembly code put out by ghc is as it should be.
At this point, I am confused, and am rebuilding ghc to keep the build directory
around.  Also, to see if a second build has any effect.

------- Comment #4 From Ferris McCormick 2006-08-22 11:42:18 0000 -------
OK, here is where the bad external gets introduced.

ghc itself normally seems to set things up the way we want (as above). 
However, when it builds, say, ghc-6.4.2/libraries/haskell98/libHShaskell98.a,
it has split everything up and converted the local _module_registered into
00000004 C _module_registered
and nothing ever converts this back to local.  I have no idea why or how this
happens, nor what should convert it back, so at this point I await more
information.

I suppose it is related to -split-objs, but don't know where to go from there.
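
To find which archive members are affected, something like the following
Python sketch could be run over per-object nm listings. It is only an
illustration: duplicate_globals and the object names in the test are
hypothetical, and it assumes that any symbol with an uppercase nm type other
than "U" in more than one object is what GHCi reports as a duplicate
definition (weak symbols are not treated specially).

```python
from collections import defaultdict


def duplicate_globals(nm_by_object):
    """Find global symbols defined in more than one object file.

    nm_by_object maps an object-file name to the text of its `nm` output.
    Returns {symbol: [object names]} for symbols whose type letter is
    uppercase (global) and not 'U' (undefined) in several objects.
    """
    owners = defaultdict(list)
    for obj, text in nm_by_object.items():
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 3:
                continue  # undefined symbols have no address column
            _addr, kind, name = parts
            if kind.isupper() and kind != "U":
                owners[name].append(obj)
    return {name: objs for name, objs in owners.items() if len(objs) > 1}
```

Two split objects that each carry "00000004 C _module_registered" are
reported; if both carry "00000000 b _module_registered" instead, nothing is.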

------- Comment #5 From Duncan Coutts (RETIRED) 2006-08-22 12:43:17 0000 -------
That's very interesting, thanks for investigating.
So it's not the fault of the mangler.

As for the splitter, suppose we have Foo.hs:

module Foo where

foo = 0


and then...

mkdir Foo_split
ghc -v -c Foo.hs -split-objs

On my amd64 it tells us:
*** Splitter
/usr/lib64/ghc-6.4.2/ghc-split /tmp/ghc3498.split_s /tmp/ghc3498.split
tmp/ghc3498.split
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__1.o
/tmp/ghc3498.split__1.s
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__2.o
/tmp/ghc3498.split__2.s


/usr/lib/ghc-6.4.2/ghc-split is another perl script. It's cutting up the .s
file into lots of little ones. Then ghc assembles each one.

The ghc build does this when building the standard libraries. All the resulting
<module>_split/*.o files get put into the .a file, like libHSbase.a (which has
>1000 tiny .o files in).

This is not strictly necessary but means that later when linking we end up with
much smaller binaries as only the necessary bits get linked in. (Yes, GNU ld
should do that automatically but while it does it well for C code it doesn't
seem to do so well for the Haskell code.)

So if we don't figure it out we can just turn off building the standard libs
using -split-objs. But if you do want to take a look, then comparing sparc vs
amd64 again for the splitter, like you did for the mangler, seems like a good
approach. Using -keep-tmp-files and -v you'll be able to see how to run the
splitter manually and see what's going on.

------- Comment #6 From Ferris McCormick 2006-08-22 13:53:48 0000 -------
I'll look at it tomorrow.  Thanks.

------- Comment #7 From Ferris McCormick 2006-08-23 07:30:53 0000 -------
It's in the demangler.  Demangler does not do the same things when splitting
and when not splitting; it needs a patch similar to the one for sparc-solaris
(and to make it look more like the amd64 linux branch).  I am checking a patch
to ghc-asm (well, actually to ghc-asm.lprl) at the moment.

------- Comment #8 From Ferris McCormick 2006-08-23 07:33:53 0000 -------
(In reply to comment #7)
> It's in the demangler.  Demangler does not do the same things when splitting
> and when not splitting; it needs a patch similar to the one for sparc-solaris
> (and to make it look more like the amd64 linux branch).  I am checking a patch
> to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
> 
read "[mM]angler" for "[dD]emangler" of course.

------- Comment #9 From Ferris McCormick 2006-08-23 12:06:09 0000 -------
Created an attachment (id=94957) [edit]
Proposed fix for ghc-asm mangler problem on sparc-linux

Purpose of this patch is to keep mangler from removing .local attribute from
the _module_registered symbol.  Without it, ghci fails as indicated above.

With this patch, the /usr/lib/ghc-6.4.2 directory entries appear to mirror each
other on sparc and on amd64; further, the compiler builds and installs fine
with it.  It looks as if all tests run successfully, but a complete test run on
my (SB-smp) test system takes over seven hours, so I did not run all of it
before forcing an install to verify ghci fix in a live environment.

Note further:  ghc-bin has the same problem and needs to be rebuilt (its
libraries have multiple instances of external _module_registered, and so its
version of ghci will fail.)  There is an obvious patch for its version of
ghc-asm, however.

I believe this patch is correct, but it requires some independent verification. 
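
Part of that verification, checking that the sparc and amd64 libraries mirror
each other, could be automated along the lines of this Python sketch. It is a
rough aid, not the procedure actually used here: symbol_kinds and
kind_mismatches are hypothetical helpers that compare nm type letters and
ignore the address column, since address width differs between the two
platforms.

```python
def symbol_kinds(nm_text):
    """Map symbol name -> nm type letter, ignoring the address column."""
    kinds = {}
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            # Last field is the name, the one before it is the type letter.
            kinds[parts[-1]] = parts[-2]
    return kinds


def kind_mismatches(nm_a, nm_b):
    """Return {symbol: (kind_in_a, kind_in_b)} for symbols present in both
    listings whose type letters differ."""
    a, b = symbol_kinds(nm_a), symbol_kinds(nm_b)
    return {n: (a[n], b[n]) for n in a.keys() & b.keys() if a[n] != b[n]}
```

Comparing the unpatched sparc entry "00000004 C _module_registered" with the
amd64 entry "0000000000000000 b _module_registered" flags the symbol; an
empty result for a patched library is what the fix should produce.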

------- Comment #10 From Ferris McCormick 2006-09-05 07:01:50 0000 -------
Duncan,
  What's the status of this patch?  As best as I can tell, it seems to be
correct.
Thanks,
Ferris

------- Comment #11 From Duncan Coutts (RETIRED) 2006-09-05 15:05:20 0000 -------
(In reply to comment #10)
> Duncan,
>   What's the status of this patch?  As best as I can tell, it seems to be
> correct.

Hi Ferris,
I'm back from the wilderness. I'm testing your patches now. Thanks for all the
work.

Duncan

------- Comment #12 From Duncan Coutts (RETIRED) 2006-09-06 08:14:07 0000 -------
Cool! Great work Ferris!

The testsuite is now down from 406 to 17 failures!

OVERALL SUMMARY for test run started at Wed Sep  6 07:12:46 BST 2006
    1365 total tests, which gave rise to     4157 test cases, of which
       0 caused framework failures
     580 were skipped
    3509 expected passes
      51 expected failures
       0 unexpected passes
      17 unexpected failures 
Unexpected failures:
   arith011(prof)
   barton-mangler-bug(normal,opt,prof,ghci,threaded)
   ffi014(threaded)
   galois_raytrace(prof)
   ghciprog004(normal)
   ioref001(normal,prof,threaded)
   joao-circular(normal,opt,prof,threaded)
   seward-space-leak(ghci)

At least the barton-mangler-bug failure is a harmless minor difference in
floating point precision between x86 and sparc. In others we run out of heap
space.

So looking pretty good! I'll add this patch to ghc-6.4.2.ebuild. I'll take care
of sending the patch upstream too.

------- Comment #13 From Duncan Coutts (RETIRED) 2006-10-03 16:42:29 0000 -------
Committed.
