On sparc32 (SS20 & friends), but not on sparc64, with either gcc-3.2.3-r3 or gcc-3.3.3, the build of coreutils-5.2.0 (and reportedly perl) fails with "lost compiler syndrome", as described in the forum thread referenced in "Actual Results" below. There is no failure on sparc64, nor, so far as I can find, on any other architecture, so this looks like a gcc problem unique to sparc32.

It is easy to bandage: edit pr.c and put a space after its final character (or another newline, or anything else, it seems) so that the last thing gcc sees is "} " instead of "}".

I am trying to isolate this in something a little smaller than a 3162-line program, but since someone besides me has seen the identical failure, I am reporting it as a bug, too, in case there are lots of sparc32 people wondering what's wrong with their hardware.

To save anyone who cares some time: you don't need to compile the program, and if you use "gcc -E -v ..." the error messages are quite a bit more useful.

Here's a good preprocessor run (with the terminating space):
=====================
Compiling with flags <-v -E>
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I. 
-I../lib -I../lib -c -o pr.o $* pr.c Reading specs from /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/specs Configured with: /var/tmp/portage/gcc-3.3.3/work/gcc-3.3.3/configure --prefix=/usr --bindir=/usr/sparc-unknown-linux-gnu/gcc-bin/3.3 --includedir=/usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include --datadir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3 --mandir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3/man --infodir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3/info --enable-shared --host=sparc-unknown-linux-gnu --target=sparc-unknown-linux-gnu --with-system-zlib --enable-languages=c,c++,f77,objc --enable-threads=posix --enable-long-long --disable-checking --enable-cstdio=stdio --enable-clocale=generic --enable-__cxa_atexit --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include/g++-v3 --with-local-prefix=/usr/local --enable-shared --enable-nls --without-included-gettext --disable-multilib Thread model: posix gcc version 3.3.3 20040217 (Gentoo Linux 3.3.3, propolice-3.3-7) /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/cc1 -E -quiet -v -I. -I. -I.. -I.. -I. -I../lib -I../lib -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=3 -D__ELF__ -Dunix -D__sparc__ -D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__sparc__ -D__gnu_linux__ -D__linux__ -D__unix -D__linux -Asystem=unix -Asystem=posix -D__GCC_NEW_VARARGS__ -Acpu=sparc -Amachine=sparc -DHAVE_CONFIG_H pr.c -o pr.o ignoring nonexistent directory "/usr/local/include" ignoring nonexistent directory "/usr/sparc-unknown-linux-gnu/include" ignoring duplicate directory "." ignoring duplicate directory ".." ignoring duplicate directory "." ignoring duplicate directory "../lib" #include "..." search starts here: #include <...> search starts here: . .. ../lib /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include /usr/include End of search list. 
=========== And here's a bad one: =============== Compiling with flags <-v -E> gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I. -I../lib -I../lib -c -o pr.o $* pr.c Reading specs from /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/specs Configured with: /var/tmp/portage/gcc-3.3.3/work/gcc-3.3.3/configure --prefix=/usr --bindir=/usr/sparc-unknown-linux-gnu/gcc-bin/3.3 --includedir=/usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include --datadir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3 --mandir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3/man --infodir=/usr/share/gcc-data/sparc-unknown-linux-gnu/3.3/info --enable-shared --host=sparc-unknown-linux-gnu --target=sparc-unknown-linux-gnu --with-system-zlib --enable-languages=c,c++,f77,objc --enable-threads=posix --enable-long-long --disable-checking --enable-cstdio=stdio --enable-clocale=generic --enable-__cxa_atexit --enable-version-specific-runtime-libs --with-gxx-include-dir=/usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include/g++-v3 --with-local-prefix=/usr/local --enable-shared --enable-nls --without-included-gettext --disable-multilib Thread model: posix gcc version 3.3.3 20040217 (Gentoo Linux 3.3.3, propolice-3.3-7) /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/cc1 -E -quiet -v -I. -I. -I.. -I.. -I. -I../lib -I../lib -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=3 -D__ELF__ -Dunix -D__sparc__ -D__gnu_linux__ -Dlinux -D__ELF__ -D__unix__ -D__sparc__ -D__gnu_linux__ -D__linux__ -D__unix -D__linux -Asystem=unix -Asystem=posix -D__GCC_NEW_VARARGS__ -Acpu=sparc -Amachine=sparc -DHAVE_CONFIG_H pr.c -o pr.o ignoring nonexistent directory "/usr/local/include" ignoring nonexistent directory "/usr/sparc-unknown-linux-gnu/include" ignoring duplicate directory "." ignoring duplicate directory ".." ignoring duplicate directory "." ignoring duplicate directory "../lib" #include "..." search starts here: #include <...> search starts here: . .. 
../lib /usr/lib/gcc-lib/sparc-unknown-linux-gnu/3.3.3/include /usr/include End of search list.
pr.c:4681:1: warning: null character(s) ignored
pr.c:5627:1: warning: null character(s) ignored
cc1: internal compiler error: Segmentation fault
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://bugs.gentoo.org/> for instructions.
=============
You will notice that pr.c does not have lines 4681 or 5627.
==================
==================
NOW, here's the critical clue: if I grab pr.c from a U60 and do a diff, I get
dragonfly src # diff -u pr.c pr.c-64
dragonfly src #
dragonfly src # mv pr.c pr.c.fail
dragonfly src # cp pr.c-64 pr.c
....
Compiling with flags <-E>
gcc -DHAVE_CONFIG_H -I. -I. -I.. -I.. -I. -I../lib -I../lib -c -o pr.o $* pr.c

SO, the -64 source version compiles. So, on at least two systems, patching pr.c gives a file which gcc can't read correctly, but ftp-ing the same file from another system gives an identical file (says diff) which compiles fine. In my case, the file system is reiserfs (left over from when I thought that was a good idea). So I don't know if this is a gcc problem or if gcc is the innocent victim of something different (a segfault is a lousy error recovery method in either event). I'm as lost as gcc at this point, so I'm reporting it as a bug and waiting for some inspiration.

Reproducible: Always

Steps to Reproduce:
1. emerge -uv coreutils

Actual Results:
http://forums.gentoo.org/viewtopic.php?p=1007418#1007418
(But "can't reproduce here" won't surprise me.)

Expected Results:
Clean compile.
dragonfly src # emerge info Portage 2.0.50-r1 (default-sparc-1.4, gcc-3.3.3, glibc-2.3.2-r9, 2.4.21-sparc-r1) ================================================================= System uname: 2.4.21-sparc-r1 sparc sun4m Gentoo Base System version 1.4.3.13 ccache version 2.3 [disabled] Autoconf: sys-devel/autoconf-2.58-r1 Automake: sys-devel/automake-1.8.3 ACCEPT_KEYWORDS="sparc" AUTOCLEAN="yes" CFLAGS="-mcpu=v8 -mtune=v8 -O2 -pipe" CHOST="sparc-unknown-linux-gnu" COMPILER="gcc3" CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control" CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d" CXXFLAGS="-mcpu=v8 -mtune=v8 -O2 -pipe -Wno-deprecated -fpermissive" DISTDIR="/usr/portage/distfiles" FEATURES="" GENTOO_MIRRORS="http://gentoo.mirrors.pair.com/ ftp://gentoo.mirrors.pair.com/ ftp://mirrors.tds.net/gentoo http://mirror.clarkson.edu/pub/distributions/gentoo/ http://mirrors.tds.net/gentoo" MAKEOPTS="-j2" PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="" SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage" USE="X Xaw3d berkdb crypt cups fbcon foomaticdb gdbm gif gtk gtk2 imlib java jpeg libwww mad mikmod motif mozilla mpeg mysql ncurses nls opengl pam perl png python qt readline ruby ruby18 slang sparc spell ssl tcltk tcpd tetex tiff truetype xv zlib" =================================== Compiler was built with "-mcpu=v8" but I don't know if that is relevant (and unfortunately it takes about a day to find out on this system.) Everything but coreutils is current (and the kernel: this kernel works for SMP, so, being superstitious, I don't touch it.). As mentioned above, same source built on another system and ftp-ed in tests as identical (with diff & with wc) but compiles cleanly.
OK, with perl, it's the source file perl-5.8.2/ext/Encode/KR/ks_03_t.c, same failure, same bandage. Except that here, adding a space then deleting it cures the problem. If you regenerate it (the file) using enc2xs, second time around it's OK. Can this be anything but some sort of file system problem? Well, we'll know more "soon" because I'm putting /var/tmp/portage onto an ext2 system and trying a fresh coreutils build. ("Soon" on this SS20 means "not right away.") And, cloning /var/tmp/portage cures the pr.c problem. Next is to start it fresh. More later.
Same failure with a different file system for /var/tmp/portage (ext3, and on a different disk). The file pr.c as generated by patch badly confuses the compiler. But if I
mv pr.c pr.c.bad
cp pr.c.bad pr.c
make pr.o
everything is fine, and again
mv pr.c.bad pr.c
make pr.o
gcc goes wild. As before, diff says the files are all identical.

I can't even guess who all is at fault here. For sure, gcc, because Seg Fault is a lousy error message, but patch (or, in perl's case, enc2xs) is generating something gcc can't handle, and I hypothesize that gcc's handling of include files has to be related: diff doesn't have any problem, and any sort of rewrite of the file cures whatever is wrong. Fails with both reiserfs & ext3 file systems. At this point, I can provide information, but otherwise I am out of guesses.
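Since diff calls the copies identical while gcc warns about null characters, a byte-level check may say more than a line-oriented one. What follows is only a minimal sketch, not something taken from the report: the function names count_nul_bytes and same_bytes are invented for illustration, and it assumes the standard cmp, tr, and wc utilities.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the report.

# Print the number of NUL bytes stored in a file.
count_nul_bytes() {
    # tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Succeed only if the two files are byte-for-byte identical.
same_bytes() {
    cmp -s "$1" "$2"
}
```

For example, "same_bytes pr.c pr.c.bad && echo identical" followed by "count_nul_bytes pr.c". If the files compare equal and the count is 0, that would suggest the null characters gcc complains about are not stored in the file itself.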
Second hypothesis: This is related to the curiosity I mentioned in comment#6 to bug 43690, even though that was on a U2 instead of SS20. (Described as difficulties with bits/stdio.h under unknown circumstances, hidden with a local bits/stdio.h) Why? That comment describes an instance of the compiler's seemingly getting lost when processing a program generated by another program. In the cases here, that "other program" is either patch or a perl script. In 43690, it is a perl script. In all cases, the compiler goes bananas on a program file that looks fine, no matter how hard you look at it. This might be a case of similar symptoms with multiple causes, but it's interesting to speculate otherwise. I note further that in all cases (except perhaps the thread starter), we are on a journaled file system (reiserfs or ext3). I wonder if this is somehow related to the various "Can't build glibc" reports floating around???
Re, hypothesis 2: If the problems are related, it's a negative relationship: with gcc-3.3.3, the bug 43690 problem is gone. Everything there is working as expected without any bandaids.
This is bug 41820 -- I missed it in my search.
After this morning's update to portage-2.0.50-r3, my problem with coreutils went away. Going to try perl next, but can you see if this fixes things for you? I'm now thinking it might be related to portage's libsandbox.
Yes, coreutils just built and installed. I'm in the process of starting a compile for perl, but probably won't know before tomorrow.
perl still fails at the same spot (ks_03_t.c).
It looks like this is a sandbox issue. I can replicate the emerge problem with the default FEATURES in make.conf (which includes sandbox). Emerging perl via FEATURES="-sandbox" emerge -v perl seems to fix this problem. Portage peeps, any ideas? I can provide access to a box that can replicate this if necessary (though it's really slow, so be patient).
I'm running on a Sparc 5 170 and a 110 and have the same problem. The problem is that, for some reason, these files do not end properly on the fs in a way that gcc can recognise. Something funky must be happening on the filesystem to allow reading past the end of a file. It could be a kernel/fs issue, but I'm not sure yet. Strange that it would be on the same files on different machines if it was a fs issue. The workaround is, when the compiler starts spinning on non-existent lines (basically reading garbage after the file), abort the build and find the file. I opened the file in vi, went to the end, pressed 'enter' and saved. Vi then rewrote the file properly and it compiled fine. The next time I get this, I'm going to see if a couple of mv's fix it.
After patching, there is no \n character on the last line of pr.c. Does 'diff' ignore trailing whitespace? Anyway, echo -ne "\n\n" >> pr.c fixes the problem too. You may only need one \n character; the second is for good measure and clarity.
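The missing final newline described above can be checked and repaired a little more selectively than with an unconditional echo. This is a minimal sketch: the function name ensure_trailing_newline is invented here for illustration, and it assumes POSIX tail, od, and tr. It only appends a newline when the last byte is not already one, so running it twice leaves the file unchanged.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative and not from the thread.
# Append one newline to a file only when its last byte is not a newline.
ensure_trailing_newline() {
    # An empty file needs nothing.
    [ -s "$1" ] || return 0
    # Dump the final byte as hex; a newline shows up as "0a".
    last=$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$last" != "0a" ]; then
        printf '\n' >> "$1"
    fi
}
```

Usage would be "ensure_trailing_newline pr.c" after the patch step. On the diff question: by default GNU diff does not ignore a missing final newline (it prints "\ No newline at end of file" when only one side lacks it), so an empty diff would be consistent with both copies ending the same way.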
Portage peeps, can you take a look? I can make sparc32 shells available if need be.
Could this possibly be a gcc bug? I have a sparc32 cross-compiler with distcc set up on my Athlon box to help out my Sparc Gentoo build. I get this same error. Gcc appears to be segfaulting on my Athlon box, not the Sparc.
pr.c:4681:1: warning: null character(s) ignored
pr.c:5627:1: warning: null character(s) ignored
cc1: internal compiler error: Segmentation fault
Please submit a full bug report, with preprocessed source if appropriate.
See <URL:http://bugs.gentoo.org/> for instructions.
distcc[24574] ERROR: compile pr.c on 192.168.0.3:50000/2 failed
make[3]: *** [pr.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/var/tmp/portage/coreutils-5.2.1-r1/work/coreutils-5.2.1/src'
As a hack, I'm echoing two blank lines (at ciaran's suggestion) to the end of pr.c in the coreutils ebuilds for 5.2.0-r2 and 5.2.1-r1. That doesn't fix this bug, it just goes around it, but it should result in fewer people experiencing this weirdness.
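Because the hack only touches pr.c, it may be useful to look for other sources in the same state before a build. The sketch below assumes that a missing final newline is the trigger, which this thread has not established; the function name list_unterminated_sources is invented for illustration, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative and not from the thread.
# Print every non-empty *.c file under a directory whose last byte is not a newline.
list_unterminated_sources() {
    find "$1" -type f -name '*.c' | while IFS= read -r f; do
        # Skip empty files; they have no last byte to inspect.
        [ -s "$f" ] || continue
        # Dump the final byte as hex; a newline shows up as "0a".
        last=$(tail -c 1 "$f" | od -An -tx1 | tr -d ' \n')
        [ "$last" = "0a" ] || printf '%s\n' "$f"
    done
}
```

Something like "list_unterminated_sources /var/tmp/portage" after the unpack and patch steps would list candidates to compare against the files gcc actually chokes on.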
Also xorg-x11-6.7.99.902, in xc/programs/Xserver/hw/xfree86/drivers/ati/r128_driver.c -- the file is 3700 lines long, and compiler messages start out like this:
r128_driver.c:3702: error: stray '\1' in program
and after 14171 of these, the compiler finally bails out. Note: this is with gcc-3.3.4-r1.
Here's an interesting update. In the last few weeks, both e2fsprogs and perl needed upgrading:
sys-fs/e2fsprogs-1.35-r1
dev-lang/perl-5.8.5-r2
and both of them ran into this bug on my SS20-SMP, 2.4.27-sparc.

This system had everything distributed across 2 external SCSI disks. After moving some systems around, it turned out that one of these disks did not really want to come back on line, so after it finally did, I moved everything from it onto the other disk and removed it from the system. Now, this is an SS20 with everything (except for /boot) on one external disk.

Now, both e2fsprogs & perl build fine (making sure to start with a clean TMPDIR for them, so that all files have to be recreated). (The failing disk was /home in a reiserfs file system and, as I recall, not used during the builds. The good disk is everything else in one ext3 partition.) Maybe this will give someone an idea; for me, it's just an interesting observation.
I am tying these together, based on Comment 16 on this bug and on the observation that Bug 74739 is Comment 8 to this bug, but on a uniprocessor system.
I just used xorg-x11-6.8.0-r1 as a cross-check for comments 16, 17. The ebuild has a patch to avoid this bug; as a test, I changed the ebuild to do this:
einfo "(DON'T) Avoid bug #46593 for sparc32-SMP with kernel 2.4.xx"
#echo "/* Add a line to avoid bug #56593 on sparc32 */" >> \
# programs/Xserver/hw/xfree86/drivers/ati/r128_driver.c
(And just looking at the file shows the comment line is not added.) Now, on the same sparc32-SMP system which necessitated this patch, r128_driver.c compiles fine. The difference is as noted in Comment 16. Originally, the system was like this and failed:
/dev/sda4 == /
/dev/sdc4 == /homes
mount --rbind /homes/home1 /home1
mount --rbind /home1/tmp/portage /var/tmp/portage
============
Now, the system is just /dev/sda4 == /, and /dev/sdc is physically removed.
Bouncing this... it's *extremely* unlikely this is portage.
A forum user (starbuck) reports that changing to profile 2006.0/2.4 cures this problem. Details and discussion at http://forums.gentoo.org/viewtopic-t-448951.html
Seems to be solved by the latest toolchain (2006.0 stages). Reopen otherwise.