Summary: | On sparc32, emerge coreutils fails on compile of pr.c (gcc-3.2.3-r3; gcc-3.3.3) with flood of garbage messages. | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Ferris McCormick (RETIRED) <fmccor> |
Component: | [OLD] Core system | Assignee: | Sparc Porters <sparc> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | agaffney, eradicator, mcummings, multix, sparc |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | Sparc | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | 74739 | ||
Bug Blocks: |
Description
Ferris McCormick (RETIRED)
OK, with perl, it's the source file perl-5.8.2/ext/Encode/KR/ks_03_t.c, same failure, same bandage. Except that here, adding a space then deleting it cures the problem. If you regenerate it (the file) using enc2xs, second time around it's OK. Can this be anything but some sort of file system problem? Well, we'll know more "soon" because I'm putting /var/tmp/portage onto an ext2 system and trying a fresh coreutils build. ("Soon" on this SS20 means "not right away.")

And, cloning /var/tmp/portage cures the pr.c problem. Next is to start it fresh. More later.

Same failure with a different file system for /var/tmp/portage (ext3 and on a different disk). The file pr.c as generated by patch badly confuses the compiler. But if I

    mv pr.c pr.c.bad
    cp pr.c.bad pr.c
    make pr.o

everything is fine, and again

    mv pr.c.bad pr.c
    make pr.o

gcc goes wild. As before, diff says the files are all identical. I can't even guess who all is at fault here. For sure, gcc because Seg Fault is a lousy error message, but patch (or, in perl's case, enc2xs) is generating something gcc can't handle, and I hypothesize that gcc's handling of include files has to be related: diff doesn't have any problem, and any sort of rewrite on the file cures whatever is wrong. Fails with both reiserfs & ext3 file systems. At this point, I can provide information, but otherwise I am out of guesses.

Second hypothesis: This is related to the curiosity I mentioned in comment#6 to bug 43690, even though that was on a U2 instead of SS20. (Described as difficulties with bits/stdio.h under unknown circumstances, hidden with a local bits/stdio.h) Why? That comment describes an instance of the compiler's seemingly getting lost when processing a program generated by another program. In the cases here, that "other program" is either patch or a perl script. In 43690, it is a perl script. In all cases, the compiler goes bananas on a program file that looks fine, no matter how hard you look at it.
This might be a case of similar symptoms with multiple causes, but it's interesting to speculate otherwise. I note further that in all cases (except perhaps the thread starter), we are on a journaled file system (reiserfs or ext3).

I wonder if this is somehow related to the various "Can't build glibc" reports floating around???

Re, hypothesis 2: If the problems are related, it's a negative relationship: with gcc-3.3.3, the bug 43690 problem is gone. Everything there is working as expected without any bandaids.

This is bug 41820 -- I missed it in my search.

After this morning's update to portage-2.0.50-r3, my problem with coreutils went away. Going to try perl next, but can you see if this fixes things for you? Thinking now it might be related to portage's libsandbox.

Yes, coreutils just built and installed. I'm in the process of starting a compile for perl, but probably won't know before tomorrow.

perl still fails at the same spot (ks_03_t.c).

It looks like this is a sandbox issue. I can replicate the emerge problem with the default FEATURES in make.conf (which includes sandbox). Emerging perl via FEATURES="-sandbox" emerge -v perl seems to fix this problem.

Portage peeps, any ideas? I can provide access to a box that can replicate this if necessary (though it's really slow so be patient).

I'm running on a Sparc 5 170 and a 110 and have the same problem. The problem is that, for some reason, these files do not end properly on the fs in a way that gcc can recognise. Something funky must be happening on the filesystem to allow reading past the end of a file. It could be a kernel/fs issue, but I'm not sure yet. Strange that it would be on the same files on different machines if it was a fs issue. The workaround is, when the compiler starts spinning on non-existent lines (basically reading garbage after the file), abort the build and find the file. I opened the file in vi, went to the end, pressed 'enter' and saved.
Vi then rewrote the file properly and it compiled fine. The next time I get this, I'm going to see if a couple of mv's fixes it.

After patching, there is no \n character on the last line of pr.c. Does 'diff' ignore trailing whitespace? Anyway, echo -ne "\n\n" >> pr.c fixes the problem too. You may only need one \n character; the second is for good measure and clarity.

Portage peeps, can you take a look? I can make sparc32 shells available if need be.

Could this possibly be a gcc bug? I have a sparc32 cross-compiler with distcc setup on my Athlon box to help out my Sparc Gentoo build. I get this same error. Gcc appears to be segfaulting on my Athlon box, not the Sparc.

    pr.c:4681:1: warning: null character(s) ignored
    pr.c:5627:1: warning: null character(s) ignored
    cc1: internal compiler error: Segmentation fault
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <URL:http://bugs.gentoo.org/> for instructions.
    distcc[24574] ERROR: compile pr.c on 192.168.0.3:50000/2 failed
    make[3]: *** [pr.o] Error 1
    make[3]: *** Waiting for unfinished jobs....
    make[3]: Leaving directory `/var/tmp/portage/coreutils-5.2.1-r1/work/coreutils-5.2.1/src'

As a hack, I'm echoing two blank lines (at ciaran's suggestion) to the end of pr.c in the coreutils ebuilds for 5.2.0-r2 and 5.2.1-r1. That doesn't fix this bug, just goes around it, but it should result in fewer people experiencing this weirdness.

Also xorg-x11-6.7.99.902, in xc/programs/Xserver/hw/xfree86/drivers/ati/r128_driver.c -- the file is 3700 lines long, and compiler messages start out like this:

    r128_driver.c:3702: error: stray '\1' in program

and after 14171 of these, the compiler finally bails out. Note: this is with gcc-3.3.4-r1.

Here's an interesting update. In the last few weeks, both e2fsprogs and perl needed upgrading, so:

    sys-fs/e2fsprogs-1.35-r1
    dev-lang/perl-5.8.5-r2

and both of them ran into this bug on my SS20-SMP, 2.4.27-sparc. This system had everything distributed across 2 external scsi disks.
After moving some systems around, it turned out that one of these disks did not really want to come back on line, so after it finally did, I moved everything from it onto the other disk and removed it from the system. Now, this is an SS20 with everything (except for /boot) on one external disk. Now, both e2fsprogs & perl build fine (making sure to start with a clean TMPDIR for them, so that all files have to be recreated.) (Failing disk was /home in a reiserfs file system, and as I recall, not used during the builds. Good disk is everything else in one /ext3 partition.) Maybe this will give someone an idea; for me, it's just an interesting observation.

I am tying these together, based on Comment 16 on this bug and on the observation that Bug 74739 is Comment 8 to this bug, but on a uniprocessor system.

I just used xorg-x11-6.8.0-r1 as a cross-check for comments 16, 17. The ebuild has a patch to avoid this bug; as a test, I changed the ebuild to do this:

    einfo "(DON'T) Avoid bug #46593 for sparc32-SMP with kernel 2.4.xx"
    #echo "/* Add a line to avoid bug #56593 on sparc32 */" >> \
    # programs/Xserver/hw/xfree86/drivers/ati/r128_driver.c

(And just looking at the file shows the comment line is not added.) Now, on the same sparc32-SMP system which necessitated this patch, r128_driver.c compiles fine. Difference is as noted in Comment 16: Originally, the system was like this and failed:

    /dev/sda4 == /
    /dev/sdc4 == /homes
    mount --rbind /homes/home1 /home1
    mount --rbind /home1/tmp/portage /var/tmp/portage

Now, system is just /dev/sda4 == /, and /dev/sdc is physically removed.

Bouncing this... it's *extremely* unlikely this is portage.

A forum user (starbuck) reports that changing to profile 2006.0/2.4 cures this problem. Details and discussion at http://forums.gentoo.org/viewtopic-t-448951.html

Seems to be solved by the latest toolchain (2006.0 stages). Reopen otherwise.
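
For anyone retracing this thread: two of the symptoms reported above (no trailing \n on the patched file, and gcc warning about null characters) can be checked from the shell before compiling. The following is only a minimal sketch, assuming a POSIX shell with tail, od, tr and wc available. The function names ends_with_newline, count_nul_bytes and ensure_trailing_newline are hypothetical helpers made up for illustration; they are not part of portage, coreutils or any ebuild. It inspects only the bytes an ordinary read returns, so it would not necessarily have shown whatever made gcc read past the end of the file in this bug.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not taken from any ebuild.

# Succeeds if the last byte of file $1 is a newline (0x0a).
# An empty file has no last byte, so it fails the check.
ends_with_newline() {
    [ -s "$1" ] && [ "$(tail -c 1 "$1" | od -An -tx1 | tr -d ' \n')" = "0a" ]
}

# Prints the number of NUL bytes found in file $1.
count_nul_bytes() {
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Appends a single newline to file $1 only if it does not already end with one.
# This mirrors the echo workaround from the thread, but leaves good files alone.
ensure_trailing_newline() {
    ends_with_newline "$1" || printf '\n' >> "$1"
}
```

A possible use after the patch step would be to run ensure_trailing_newline pr.c and then confirm that count_nul_bytes pr.c prints 0. Like the echo workaround in the ebuilds, this goes around the problem rather than explaining it.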