Bug 144752 - dev-lang/ghc-6.4.2 --- mangler (ghc-asm) on sparc marks _modules_registered as entry too strongly
Bug#: 144752
Product: Gentoo Linux
Version: 2006.0
Platform: Sparc
OS/Version: Linux
Status: RESOLVED
Severity: normal
Priority: P3
Resolution: FIXED
Assigned To: dcoutts@gentoo.org
Reported By: fmccor@gentoo.org
Component: Applications
URL:
Summary: dev-lang/ghc-6.4.2 --- mangler (ghc-asm) on sparc marks _modules_registered as entry too strongly
Keywords:
Status Whiteboard:
Opened: 2006-08-22 07:43 0000
(Assigned to Duncan Coutts at his request.)
On sparc, the ghc-asm mangler for ghc-6.4.2 marks the symbol _module_registered
as:
00000004 C _module_registered
instead of as
0000000000000000 b _module_registered
(as taken from amd64).
This results in failures from ghci, thus:
==========================================
GHCi runtime linker: fatal error: I found a duplicate definition for symbol
_module_registered
whilst processing object file
/usr/lib/ghc-6.4.2/HShaskell98.o
===========================================
Apparently, this was a known problem for ghc on sparc/solaris and fixed there,
but it is also present on linux. See
https://sourceforge.net/tracker/?func=detail&atid=108032&aid=1170933&group_id=8032
Problem is specific to sparc, but otherwise should be independent from the
environment.
I note a cross-reference to https://bugs.gentoo.org/show_bug.cgi?id=140369 for
bookkeeping purposes; feel free to ignore it.
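For anyone wanting to check their own libraries for the bad symbol type, here is a minimal sketch. It assumes nm's default three-column output (value, type, name) as in the listings above; the helper name find_common_module_registered is made up for illustration.

```shell
# Hypothetical helper: reads `nm` output on stdin and prints every line where
# _module_registered shows up as a common symbol (type C) instead of a local
# BSS symbol (type b). Blank lines and "member.o:" headers are ignored.
find_common_module_registered() {
    awk 'NF == 3 && $2 == "C" && $3 == "_module_registered"'
}

# Example use, with the path taken from the error message above:
#   nm /usr/lib/ghc-6.4.2/HShaskell98.o | find_common_module_registered
```

No output means no common-symbol definition of _module_registered was found in that object.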
*** Bug 144753 has been marked as a duplicate of this bug. ***
It should be something very similar. Hard part is figuring out how to look at
the assembly code, but this is what has to happen:
Consider this tiny C program:
==============================
int _module_registered_;
static int _module_registered ;
main() {
_module_registered = 1;
_module_registered_ = 42;
}
=============================
The assembly code from this is:
==============================
.file "t.c"
.section ".text"
.align 4
.align 32
.global main
.type main, #function
.proc 04
main:
!#PROLOGUE# 0
!#PROLOGUE# 1
sethi %hi(_module_registered), %g1
mov 1, %g2
st %g2, [%g1+%lo(_module_registered)]
sethi %hi(_module_registered_), %g1
mov 42, %g3
retl
st %g3, [%g1+%lo(_module_registered_)]
.size main, .-main
.common _module_registered_,4,4
.local _module_registered
.common _module_registered,4,4
.section ".note.GNU-stack"
.ident "GCC: (GNU) 3.4.6 (Gentoo 3.4.6-r2, ssp-3.4.6-1.0, pie-8.7.9)"
=================================================
And the corresponding object file shows:
=================================================
00000000 T main
00000000 b _module_registered
00000004 C _module_registered_
=================================================
So, what we need is a .local inserted before the .common for
_module_registered.
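Purely as an illustration of that transformation (this is not the patch, and not how the Perl mangler is written), a small awk filter could do it. The function name add_local_before_common is hypothetical, and the sketch assumes the directive and the symbol are separated by whitespace as in the listing above.

```shell
# Hypothetical filter: reads assembly on stdin and makes sure that a
# ".local _module_registered" line immediately precedes the ".common" line
# for _module_registered. Other symbols (e.g. _module_registered_) are untouched.
add_local_before_common() {
    awk '
        # Insert the .local directive unless the previous line already was one.
        $1 == ".common" && $2 ~ /^_module_registered,/ && !have_local {
            print "\t.local\t_module_registered"
        }
        { print }
        # Remember whether this line was the .local directive we care about.
        { have_local = ($1 == ".local" && $2 == "_module_registered") }
    '
}

# Example use on a hypothetical file t.s:
#   add_local_before_common < t.s > t.fixed.s
```

Running the filter a second time on its own output changes nothing, since the .local line is then already in place.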
Now, experimenting on amd64 (x86_64), notice that the corresponding assembly
code is:
.comm _module_registered_,4,4
.local _module_registered
.comm _module_registered,4,4
so, it is identical to sparc except for .comm instead of .common but the
libraries for amd64 are correct.
Finally, experimentation shows that if you just run the mangler on amd64 output
from gcc, it doesn't convert anything, either.
That is not everything, however. If you take some .hs file, say, Expr.hs from
the hugs98 demo files, and just use ghc on it, you see that:
00000000 b _module_registered
both before and after mangling.
And, indeed, the assembly code put out by ghc is as it should be.
At this point, I am confused, and am rebuilding ghc to keep the build directory
around. Also, to see if a second build has any effect.
OK, here is where the bad external gets introduced.
ghc itself normally seems to set things up the way we want (as above).
However, when it builds, say, ghc-6.4.2/libraries/haskell98/libHShaskell98.a,
it has split everything up and converted the local _module_registered into
00000004 C _module_registered
and nothing ever converts this back to local. I have no idea why or how this
happens, nor what should convert it back, so at this point I await more
information.
I suppose it is related to -split-objs, but don't know where to go from there.
That's very interesting, thanks for investigating.
So it's not the fault of the mangler.
As for the splitter, suppose we have Foo.hs:
module Foo where
foo = 0
and then...
mkdir Foo_split
ghc -v -c Foo.hs -split-objs
On my amd64 it tells us:
*** Splitter
/usr/lib64/ghc-6.4.2/ghc-split /tmp/ghc3498.split_s /tmp/ghc3498.split
tmp/ghc3498.split
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__1.o
/tmp/ghc3498.split__1.s
*** Assembler
gcc -march=k8 -Wa,--noexecstack -c -o Foo_split/Foo__2.o
/tmp/ghc3498.split__2.s
/usr/lib/ghc-6.4.2/ghc-split is another perl script. It's cutting up the .s
file into lots of little ones. Then ghc assembles each one.
The ghc build does this when building the standard libraries. All the resulting
<module>_split/*.o files get put into the .a file, like libHSbase.a (which has
more than 1000 tiny .o files in it).
This is not strictly necessary but means that later when linking we end up with
much smaller binaries as only the necessary bits get linked in. (Yes, GNU ld
should do that automatically but while it does it well for C code it doesn't
seem to do so well for the Haskell code.)
So if we don't figure it out we can just turn off building the standard libs
using -split-objs. But if you do want to take a look, then comparing sparc vs
amd64 again for the splitter, as you did for the mangler, seems like a good
approach. Using -keep-tmp-files and -v you'll be able to see how to run the
splitter manually and see what's going on.
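To pull the splitter invocation out of the verbose output, something like the following sketch may help. It assumes the "*** Phase" marker lines look like the ones quoted above; the helper name show_splitter_command is made up.

```shell
# Hypothetical helper: reads `ghc -v` output on stdin and prints only the
# lines that follow a "*** Splitter" marker, up to the next "*** " marker.
show_splitter_command() {
    awk '/^\*\*\* / { in_splitter = ($0 == "*** Splitter"); next } in_splitter'
}

# Example use (2>&1 so the verbose output is captured wherever it is written):
#   mkdir Foo_split
#   ghc -v -c Foo.hs -split-objs -keep-tmp-files 2>&1 | show_splitter_command
```

The printed command can then be re-run by hand on the kept temporary files, on both sparc and amd64, and the resulting .s pieces compared.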
I'll look at it tomorrow. Thanks.
It's in the demangler. Demangler does not do the same things when splitting
and when not splitting; it needs a patch similar to the one for sparc-solaris
(and to make it look more like the amd64 linux branch). I am checking a patch
to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
(In reply to comment #7)
> It's in the demangler. Demangler does not do the same things when splitting
> and when not splitting; it needs a patch similar to the one for sparc-solaris
> (and to make it look more like the amd64 linux branch). I am checking a patch
> to ghc-asm (well, actually to ghc-asm.lprl) at the moment.
>
read "[mM]angler" for "[dD]emangler" of course.
Created an attachment (id=94957)
Proposed fix for ghc-asm mangler problem on sparc-linux
Purpose of this patch is to keep mangler from removing .local attribute from
the _module_registered symbol. Without it, ghci fails as indicated above.
With this patch, the /usr/lib/ghc-6.4.2 directory entries appear to mirror each
other on sparc and on amd64; further, the compiler builds and installs fine
with it. It looks as if all tests run successfully, but a complete test run on
my (SB-smp) test system takes over seven hours, so I did not run all of it
before forcing an install to verify ghci fix in a live environment.
Note further: ghc-bin has the same problem and needs to be rebuilt (its
libraries have multiple instances of external _module_registered, and so its
version of ghci will fail.) There is an obvious patch for its version of
ghc-asm, however.
I believe this patch is correct, but it requires some independent verification.
Duncan,
What's the status of this patch? As best as I can tell, it seems to be
correct.
Thanks,
Ferris
(In reply to comment #10)
> Duncan,
> What's the status of this patch? As best as I can tell, it seems to be
> correct.
Hi Ferris,
I'm back from the wilderness. I'm testing your patches now. Thanks for all the
work.
Duncan
Cool! Great work Ferris!
The testsuite is now down from 406 to 17 failures!
OVERALL SUMMARY for test run started at Wed Sep 6 07:12:46 BST 2006
1365 total tests, which gave rise to 4157 test cases, of which
0 caused framework failures
580 were skipped
3509 expected passes
51 expected failures
0 unexpected passes
17 unexpected failures
Unexpected failures:
arith011(prof)
barton-mangler-bug(normal,opt,prof,ghci,threaded)
ffi014(threaded)
galois_raytrace(prof)
ghciprog004(normal)
ioref001(normal,prof,threaded)
joao-circular(normal,opt,prof,threaded)
seward-space-leak(ghci)
At least the barton-mangler-bug failure is a harmless minor difference in
floating point precision between x86 and sparc. In others we run out of heap
space.
So looking pretty good! I'll add this patch to ghc-6.4.2.ebuild. I'll take care
of sending the patch upstream too.