Bug 14212 - emerge of gcc-3.2.1-r7 won't go to completion
Summary: emerge of gcc-3.2.1-r7 won't go to completion
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] GCC Porting
Hardware: x86 Linux
Importance: High normal
Assignee: Martin Schlemmer (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on: 14647
Blocks:
Reported: 2003-01-19 20:13 UTC by Guy
Modified: 2003-06-06 05:30 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Attachments
gcc.msgs.text.gz (gcc.msgs.text.gz,22.98 KB, application/gzip)
2003-01-23 20:10 UTC, Guy

Description Guy 2003-01-19 20:13:04 UTC
I can't get the emerge of gcc-3.2.1-r7 to go to completion on several machines with a variety of options.
Machine 1 is an antique I'm installing gentoo on. K6 PR200 64 megs
Machine 2 has an existing installation of 1.4_rc1 gentoo I'm updating. Celeron 466 256 megs
I've tried a number of different CFLAGS on the first machine. It seems to fail in a number of different places (depending on options) in libjava.
Machine 2 fails in RMIClass Java.

I'll attach emerge info's for both machines as well as whatever logs I can find.

The first machine has stage3-i586 tarball extracted and is at the point of 'emerge rsync', 'emerge -u world'.
Comment 1 Guy 2003-01-19 20:38:47 UTC
Aaaack! Can't add an  attachment through lynx! From the screen:
Sandbox error : urity/CodeSource.java environmental variable should be defined.
ACCESS DENIED  open_rd:  /var/tmp/portage/gcc-3.2.1-r7/work/gcc-3.2.1/libjava/java/security/cert/Certificat.java
jc1: Permission denied: can't reopen /var/tmp/portage ... (above file)

continues with regular error messages.
emerge info:
portage 2.0.46-r9 (default-x86-1.4 gcc-3.2.1 glibc-2.3.1-r2)
CHOST="i586-pc-linux-gnu", CFLAGS="-march=i586 -Os -pipe" (note) I've tried various combinations of K6, O2, O3. They've all failed. :-( Unfortunately, I did not note exactly where each combination failed.
Comment 2 Martin Schlemmer (RETIRED) gentoo-dev 2003-01-19 22:41:03 UTC
That should not happen.  Try:

 # FEATURES=-sandbox emerge gcc


Robert, if you want to have a look.  This is not the only one, seems like
some guys get segfaults.  Also with -r6 and -r1/0 I think.
Comment 3 Guy 2003-01-20 07:05:55 UTC
Martin, you're correct. Also had this on -r6. Also got segfaults on other
machine on -r6.

Sorry I didn't post where it broke on other machine last night. Was tired and
needed sleep. Will do so after I get home from work.

Will try your suggestion tonight on these two machines after I get home (after
posting the other stuff from machine 2). 

Thanx.
Comment 4 J Robert Ray 2003-01-20 15:41:26 UTC
Sandbox error : urity/CodeSource.java environmental variable should be defined.
ACCESS DENIED  open_rd: 
/var/tmp/portage/gcc-3.2.1-r7/work/gcc-3.2.1/libjava/java/security/cert/Certificat.java

This message is unsettling.


static void init_env_entries(char*** prefixes_array, int* prefixes_num,
                             char* env, int warn)
{
  int old_errno = errno;
  char* prefixes_env = getenv(env);

  if (NULL == prefixes_env) {
    fprintf(stderr,
            "Sandbox error : the %s environmental variable should be defined.\n",
            env);
 

'env' is passed as an arg to this function. The function is only called in one
place, called four times with the env arg set to one of "SANDBOX_DENY",
"SANDBOX_READ", "SANDBOX_WRITE", or "SANDBOX_PREDICT".

For 'env' to end up with the value of "urity/CodeSource.java" means there is
some serious memory trashing going on somewhere.
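
The same condition can be checked from the shell before a build starts. This is only a minimal sketch and not part of sandbox or Portage: the function name missing_sandbox_vars is hypothetical, and only the four variable names come from the comment above. It prints any of the four that are absent from the environment, which is the case where the quoted getenv() call would return NULL.

```shell
# Hypothetical helper: print each SANDBOX_* variable that is absent from
# the environment (the case where getenv() would return NULL).
missing_sandbox_vars() {
    for name in SANDBOX_DENY SANDBOX_READ SANDBOX_WRITE SANDBOX_PREDICT; do
        # printenv only sees exported variables and exits non-zero
        # when the named variable is not in the environment.
        if ! printenv "$name" >/dev/null 2>&1; then
            echo "$name"
        fi
    done
}
```

If this prints nothing inside a sandboxed build while the error above still appears, that would be consistent with the variable name being corrupted in memory rather than being unset.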
Comment 5 Guy 2003-01-20 17:47:25 UTC
Am testing Martin's suggestion now (finally! Nothing like all day meetings to
trash one's good intentions)

FWIW, Do not discount the possibility of memory going bad on machine 1. After
this attempt to emerge gcc stops, I will try to emerge memtest86. On machine 2,
this machine has been running fine with no problems for some time. I will
arrange to do memtest86 on it as well.

Results as soon as I get them.
Comment 6 Guy 2003-01-20 18:08:39 UTC
machine 2:  
  
from the screen  
  
var/tmp/portage/gcc-3.2.1-r7/work/gcc-3.2.1/libjava/java/rmi/server/RMIClassLoader.java:95:  
Class 'MalforMedURLException' not found in 'throws'.  
    throws MalforMedURLException, ClassNotFoundException  
  
2 errors  
  
...  
  
Function src_compile, line 293, exit code 2  
  
...  
  
# emerge info  
Portage 2.0.46-r9 (default-x86-1.4, gcc-3.2.1, glibc-2.3.1-r3) 
================================================================= 
System uname: 2.4.20 i686 Celeron (Mendocino) 
GENTOO_MIRRORS="http://www.ibiblio.org/pub/Linux/distributions/gentoo" 
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config 
/usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config /usr/kde/3/share/config 
/usr/share/config" 
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d" 
PORTDIR="/usr/portage" 
DISTDIR="/usr/portage/distfiles" 
PKGDIR="/usr/portage/packages" 
PORTAGE_TMPDIR="/var/tmp" 
PORTDIR_OVERLAY="" 
USE="x86 oss 3dnow apm avi crypt cups encode gif gpm jpeg libg++ libwww mikmod 
mmx mpeg ncurses nls pdflib png qtmt quicktime spell truetype xml2 xmms xv zlib 
alsa gdbm berkdb slang readline arts bonobo svga java guile X sdl tcpd pam ssl 
perl python esd imlib oggvorbis gnome gtk qt kde motif opengl mozilla" 
COMPILER="gcc3" 
CHOST="i686-pc-linux-gnu" 
CFLAGS="-march=pentium2 -O3 -pipe" 
CXXFLAGS="-march=pentium2 -O3 -pipe" 
ACCEPT_KEYWORDS="x86 ~x86" 
MAKEOPTS="-j1" 
AUTOCLEAN="yes" 
SYNC="rsync://rsync.gentoo.org/gentoo-portage" 
FEATURES="sandbox ccache" 
 
---------------------- 
 
I'm not sure what other information I can send you which could be useful. Let 
me know and I will attach it. I'll save the work directory to another name 
before I re-run the gcc emerge on this machine with the sandbox disabled. 
  
Comment 7 J Robert Ray 2003-01-20 18:16:46 UTC
var/tmp/portage/gcc-3.2.1-r7/work/gcc-3.2.1/libjava/java/rmi/server/RMIClassLoader.java:95:
 
Class 'MalforMedURLException' not found in 'throws'.  
    throws MalforMedURLException, ClassNotFoundException  

Is this a cut & paste?  It should be
'MalformedURLException' and not
'MalforMedURLException'.

The file and line given has a lowercase 'm' here.  Assuming yours really does
have an uppercase 'M', the error is legitimate, and the question is ... how did
that 'M' get there?
Comment 8 Guy 2003-01-20 19:03:38 UTC
First - to answer your question J., it is not a cut and paste (I had been running as console) so it's a rather tedious, but exact copy, created by repeatedly switching between an Xwindows session and the original console login. I took extra pains to ensure fidelity of copy. And yes, I thought the 'M' should have been 'm' myself.

I'm re-running Martin's suggestion on machine one as I forgot to turn my swap partition on. :-( [sigh]
Comment 9 Guy 2003-01-20 20:46:17 UTC
Machine 2 completed emerge of gcc-3.2.1-r7 with FEATURES="-sandbox" without a hitch.
Comment 10 Guy 2003-01-20 21:04:35 UTC
Machine 1 is still running the emerge. It's at the point of compiling from
directory (?) libstdc++-v3 program locale.cc. I don't know for sure if this is
further than where it borked out originally. I'll check it in the AM as it's
bedtime for me now.

In the meantime, I have another machine (let's call it machine 3) which has
successfully emerged gcc-3.2.1-r6 and is well on its way to finishing the
updated emerge of gcc-3.2.1-r7.

Machine 1: K6 PR200, 64 meg ram, 384 meg swap. BORKED (-sandbox being tested)
Machine 2: Celeron 466, 256 meg ram, 128 meg swap. BORKED (-sandbox worked)
Machine 3: Pentium Classic 166, 96 meg ram, 384 meg swap. no problem (-r6 to -r7)
Machine 4: K6 300, 48 meg ram, 512 meg swap. no problem (-r6 to r7)

I wouldn't draw any conclusions with this limited dataset. But it is suggestive
of an issue with some CPUs and sandbox.

This may also be an important point: Machine 1 is in the middle of an install
and is therefore running the 1.4_rc2 live cd kernel. 

If you want me to collect more info, I'll be happy to do so. Just let me know what.

'Night!
Comment 11 Martin Schlemmer (RETIRED) gentoo-dev 2003-01-21 00:07:36 UTC
Here is a bug where ld segfaults when linking libgcj:

  http://bugs.gentoo.org/show_bug.cgi?id=14142


The only bad thing is that I cannot recreate it, so there is no way to try and debug anything.
Comment 12 Guy 2003-01-21 08:16:33 UTC
OK: Machine 1:

FEATURES="-sandbox"
CFLAGS="-march=k6 -Os -pipe"

Borks in libjava

FEATURES="-sandbox"
CFLAGS="-mcpu=i586 -O2 -pipe"

Borks in libjava (different location)

Note: NO APPARENTLY FUNNY MEMORY ERRORS! :-)

1) To my non-programmer eyes, under some circumstances (yet to be fully defined)
there appears to be a problem with memory in sandbox. (IE Machine 2 completed
successfully without sandbox and Machine 1's messages are much more
'reasonable'. Unwanted, but reasonable.)

2) Machine 1 has additional problems over and above the potential memory
problem(s) with sandbox.

I've got some experimental ideas I want to try out on machine 1. Unfortunately,
this will take some time. One of them involves possibly using a different CPU. I
have 2 CYRIX PR200s, an IBM PR200 and a couple of Pentium Classics I can pop
into this motherboard. Obviously, I want to see if this specific CPU is a
possibility. The other involves running the Gentoo install to completion enough
to reboot with a kernel specifically built by and for this CPU. This machine is
the only machine I've had so far of which the kernel on the gentoo iso does NOT
identify and load the correct nic driver (8139too). It loads the aironet modules
instead. I realize this is a stretch, but in the spirit of leaving no stone
unturned ...

I'd suggest that perhaps you want to concentrate on 1) and let me worry about 2)
for now until I have more information. At your convenience, I have a low use
machine I can test any sandbox (or whatever) changes. Machine 2 is used only
once or twice a week (it's a guest machine). Given that it borks with sandbox
repeatedly and doesn't bork without sandbox, this is a good machine for such
testing. If you can think of anything else to try with Machine 1, I'm certainly
open to suggestions.

I'll post the messages from the gcc compiles on Machine 1 after I get back home
later today.

Finally, I did make a bootable CD with memtest86 and it reports that my ram is fine.

Um - My current schedule is:

to post Machine 1's error messages.

to finish configuring the desktop for Machine 3 and put Machine 3 away for now.

to put Machine 4 away for now.

to bring Machine 1 to the front of my workbench so I can start tinkering as
noted above.

I'm beginning to wonder if I should just use their actual names. :-) Anyway. I
hope all this helps!
Comment 13 Martin Schlemmer (RETIRED) gentoo-dev 2003-01-21 08:43:55 UTC
Hehe, ok.  I'll have a look at the sandbox stuff again and see if I can spot
a possible memory leak/whatever.  Also maybe try that k6 with a different -march?
Like -march=i586, or such?
Comment 14 Martin Schlemmer (RETIRED) gentoo-dev 2003-01-22 00:00:39 UTC
Guy, have a look at bug #14142 again .. his problems were due to no swap ...
Comment 15 Guy 2003-01-22 06:51:06 UTC
Martin, that was the first thing I thought of too when I saw his comment. :-) I
didn't get a chance to do anything I wanted to last night because I didn't get
in till after my bed time. However, I think this is a much better lead to follow
than what I was going to do.

It occurred to me as I thought about it more, that there are a lot of things
that a total available memory shortage would exactly account for. This includes
things like the ebuild aborting in different places on the same machine, why two
machines with the same amount of total memory (ram & swap) would behave
differently (one fails, the other succeeds), etc.

I suspect that my figure for total memory is on the borderline of the minimum
requirement to compile gcc successfully. What I hope to do tonight is:

1) Confirm the ram and swap currently available for each machine.

2) Start the Gentoo install of machine 1 over with a larger swap.

3) Pop in another hard drive in machine 2 and create a second swap and retest
with sandbox on.

It also occurred to me that a machine with the 1.4 install iso would require more
total memory than was previously required with the 1.2 and earlier installs
because the iso image itself probably pulls out some of the swap for its own uses.

If this is the case (and I feel really comfortable that we're on the right track
at last), then I can give you a good working figure for the minimum total memory
(ram & swap) required for completion of gcc. I don't know if this is something
you'll be able to test for in the ebuild, but at least you'll be able to ask
people what total memory they have when they report problems. :-)
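
A total-memory test of the kind wondered about here could be sketched in shell, the language ebuilds are written in. This is a hedged illustration and not anything taken from the gcc ebuild: the function names total_mem_mb and has_total_mem are hypothetical, and the sketch assumes a Linux /proc/meminfo that carries MemTotal: and SwapTotal: lines with values in kB. The minimum is passed in by the caller because the thread has not settled on a figure at this point.

```shell
# Hypothetical helper: print RAM + swap in whole megabytes, read from a
# meminfo-style file (defaults to /proc/meminfo; values there are in kB).
total_mem_mb() {
    awk '/^MemTotal:/ || /^SwapTotal:/ { kb += $2 }
         END { print int(kb / 1024) }' "${1:-/proc/meminfo}"
}

# Hypothetical check: succeed when RAM + swap is at least MIN_MB.
# Usage: has_total_mem MIN_MB [MEMINFO_FILE]
has_total_mem() {
    [ "$(total_mem_mb "$2")" -ge "$1" ]
}
```

For example, `has_total_mem 512 || echo "low on RAM + swap"` would print the message on a machine below 512 MB total. The optional file argument exists so the parsing can be tried against a saved copy of /proc/meminfo from another machine.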

One of the things I was wondering is if the total number of objects compiled in
gcc has been going up with each -r version. I ask only because I had been
watching the successful machine 2 gcc emerge (w/out sandbox) and I didn't recall
seeing a lot of the stuff there, especially towards the end. Is this the case?

The other thing I was wondering was if the gcc ebuild was the 'largest' ebuild
in terms of memory requirements. It's always been my impression that this is the
case with perhaps open office being second and mozilla being third. Do you have
any feel for this yourself? Despite the time requirements, I'll build open
office (as opposed to open office bin) on one of these machines if you feel open
office actually requires even more total memory. Ultimately, I'd like to give
you a fairly hard minimum total memory requirement that you can incorporate
where needed.

Well, here's to hoping we're really on the right track. ;-)
Comment 16 Guy 2003-01-23 20:10:07 UTC
Created attachment 7595 [details]
gcc.msgs.text.gz

Martin, I added 128meg to the swap partition in machine 1. Total size (ram &
swap) is 640 megs. Now, I get a segmentation fault. -bleh-

I started ssh and pulled the file resulting from 'emerge gcc &> gcc.msgs.text'
so that I could upload the attachment.
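
For a log captured this way, the lines that matter can be pulled out with grep. The following is a minimal sketch: the function name build_failures is hypothetical, and the search strings are the failure messages quoted or described in this bug, matched case-insensitively because their exact capitalisation in the compiler output is an assumption.

```shell
# Hypothetical helper: print numbered log lines containing one of the
# failure strings seen in this bug. Exit status is grep's: 0 on a match,
# 1 when nothing matched.
build_failures() {
    grep -n -i -E 'ACCESS DENIED|Sandbox error|Segmentation fault|Illegal instruction' "$1"
}
```

Note that `&>` is a bash shorthand; the portable spelling of the same redirection is `emerge gcc > gcc.msgs.text 2>&1`, after which `build_failures gcc.msgs.text` lists the matching lines.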

I've left the machine up with ssh running. If it would help you to access the
machine directly, I can email you the password (living dangerously here!) and
current internet ip address. And if it would make you more comfortable, I can
set up iptables on each of the other machines to ignore that machine's ip
address.

Let me know what you want to do or what other information you want from that
machine. Frankly I'm so stumped I can't even speculate anymore. ;-)
Comment 17 Martin Schlemmer (RETIRED) gentoo-dev 2003-01-24 13:59:17 UTC
Does Machine 1 still bork, and was that with sandbox on or off?  Tonight is a bit
difficult (same as yesterday), but it should be ok tomorrow for checking it out live.
If it's still going to be up for the next 36 hours, mail me the ip + u/p if you do not mind.
Comment 18 Guy 2003-01-24 15:23:46 UTC
This was with sandbox. I'll kick off one without in a little while and save the
msgs to another file (in /root after you chroot to it)

I've got other machines (486's) to start playing with. It's time I learned how
to use distcc anyway. 
Comment 19 Guy 2003-01-28 21:11:14 UTC
Updated info:

I've verified that the problem with machine 2 was insufficient total memory
(RAM & SWAP) to emerge gcc to completion. I did this by putting in a second disk
drive and creating a new (larger) swap partition. With sandbox, the emerge went
through to completion with no problem.

I've identified the problem with machine 1 and I'm currently running the acid
test to finally verify the problem. The problem appears to lie in the K6 PR200
processor. There is a known bug where memory becomes unreliable if there is more
than 32 megs of ram available to this processor. These are the K6 CPUs of
stepping 'B' (and earlier?). You can view a write up here:
http://membres.lycos.fr/poulot/k6bug.html.

From the web page:

=================================================================================

The AMD-K6 processor has a bug that prevents reliable operation when more than
32 MB of RAM is used.

The most common symptoms are segmentation violations (see The SIG11 FAQ) while
compiling the Linux kernel.

It can be reproduced, up to now, only when doing heavy compilations, probably
because only compilations stress the system enough. It is not a gcc problem, as
it is sometimes the program that launches gcc (it can be make, or sh) that dies.

This bug has been seen by many people all around the world. The general
consensus among them is that the bug only depends on the amount of memory used :

    * With 32 MB of RAM, or less, no problem.
    * With more than 32 MB, sporadic compile failures have been observed. 

According to AMD :

    * this bug is documented in section 2.6.2 of the AMD-K6 MMX Enhanced
Processor Revision Guide
    * it has been corrected in later chips produced in the B stepping.
    * current machines are shipped with the corrected chips. 

===============================================================================

Read the web page for the rest of the write up and also for supplemental links
including to the AMD-K6 MMX Enhanced Processor Revision Guide mentioned above.

For now, I'm changing the severity to 'normal' as it looks very much like
identification of these two problems constitutes the 'fix'. I'll have
confirmation in the AM (EST). Will post results here.

Note 1: I added bug 14647 which is a documentation change suggestion regarding
recommended swap space in the installation guide.

Note 2: Booting correctly identifies this CPU and the appropriate messages are
displayed in 'dmesg'. However, the web link report is out-of-date.
Comment 20 J Robert Ray 2003-01-28 22:57:09 UTC
Very interesting information, thanks for hunting that down!
Comment 21 Guy 2003-01-29 04:37:29 UTC
You're welcome. :-)

And gcc did complete this AM (as expected) with 32 megs of RAM and 608 megs of
swap. :D

So there are several results that I get from this leeetle episode:

1) There is nothing wrong with sandbox under even extreme circumstances. (Yeah!)

2) CFLAGS="-march=k6 -Os -pipe" works fine for emerging gcc

3) To be on the safe side, installation of Gentoo should be done with a
recommended total memory (RAM + SWAP) of 640 megs.

4) K6 PR200 CPUs with more than 32 megs of ram should be specifically asked
about when someone complains of 'segfaults' especially during the emerge of gcc.
-heh-

5) You can install Gentoo on systems with as little as 32 megs of ram even
without going through distcc - provided you have lots of time and swap. My
previous low was 48megs.

Martin, I'll leave this for you to close since, as far as I'm concerned, the
issues are all resolved. As I mentioned in my email though, adding a CPU and
memory check to the checking phase of the ebuild might be a good idea.  If you
do so, I can probably be talked into increasing the memory in machine 1
specifically to test such a functional check. (no need, I think, to test the
entire ebuild!)
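
Such a CPU and memory check might look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed ebuild change: the function name k6_ram_warning is hypothetical, it assumes the kernel reports a "model name" line containing "K6" in /proc/cpuinfo and a MemTotal: line in kB in /proc/meminfo, and it does not look at the stepping, so it cannot tell corrected chips from affected ones. It only flags the combination that point 4 above says is worth asking about.

```shell
# Hypothetical pre-build check: warn when the CPU model name mentions K6
# and installed RAM exceeds 32 MB. The stepping is not inspected, so this
# only prompts a closer look; it does not prove the CPU is affected.
# Usage: k6_ram_warning [CPUINFO_FILE] [MEMINFO_FILE]
k6_ram_warning() {
    cpuinfo="${1:-/proc/cpuinfo}"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB; 32 MB is 32768 kB.
    ram_kb=$(awk '/^MemTotal:/ { print $2 }' "$meminfo")
    if grep -q '^model name.*K6' "$cpuinfo" && [ "${ram_kb:-0}" -gt 32768 ]; then
        echo "warning: K6 CPU with more than 32 MB RAM; check the stepping"
        return 0
    fi
    return 1
}
```

The two file arguments let the check be exercised against saved copies of /proc/cpuinfo and /proc/meminfo, which would allow testing it without rebuilding gcc.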

BTW - the resulting text file from 'emerge gcc &> gcc.msgs.32m.text' was 5megs+.
Was I ever glad to see the bottom of that!
Comment 22 Guy 2003-01-29 04:53:12 UTC
Martin, I forgot! (me bad!!)

Thanx for both your time and patience. :D
Comment 23 Martin Schlemmer (RETIRED) gentoo-dev 2003-02-02 15:11:46 UTC
Hi guys, have a look at comment #21:

------------------------------------------------------------------------
3) To be on the safe side, installation of Gentoo should be done with a
recommended total memory (RAM + SWAP) of 640 megs.
------------------------------------------------------------------------

Guy, great work thanks!
Comment 24 Guy 2003-02-04 20:19:28 UTC
As far as I can tell, this is not a problem for me any more. ;-)
Comment 25 Martin Schlemmer (RETIRED) gentoo-dev 2003-02-05 18:07:17 UTC
Yep, but we need docs updated to recommend a 512/640 minimum ram+swap.
Comment 26 Guy 2003-02-06 07:23:01 UTC
I realise that - in fact, I added an enhancement request for adding that to the Installation Guide. (bug 14647).  ;-)

I wasn't suggesting this be closed. Rather, I've been going through all the open bugs I've authored or joined and indicating where I stood on each one. This way, each respective developer knows whether I still have an open issue regarding same. There were 4 or 5 total where my issues are resolved. 

-hehe- I was trying to do you guys a favor and let you know whether to expect anything else from me.
Comment 27 Martin Schlemmer (RETIRED) gentoo-dev 2003-02-09 17:40:26 UTC
Ah, did not know you added a new bug =)
Comment 28 Clemens Schwaighofer 2003-06-06 05:30:35 UTC
Very interesting thread, but it doesn't help me solve almost the same problem with my k6 here (according to cpuinfo stepping 12), 530MHz, 180MB RAM and 500MB SWAP. So the ram+swap is bigger than 600MB, but there is still the issue with k6 & more than 32MB ram. Any way to work around it? I dunno if this stepping 12 processor is one of the faulty ones, and even if it is, I don't have any 32MB memory modules anyway.

I got a working system one time, but then after the reboot it couldn't mount the XFS partition. So I am trying it again and I get the same glibc error [no problem with gcc compiling though] again and again.

make[2]: *** [/var/tmp/portage/glibc-2.3.1-r2/work/glibc-2.3.1/buildhere/sunrpc/xbootparam_prot.stmp] Illegal instruction