After careful study of an xmms bug, I determined that it was a gcc bug involving a combination of -m{arch,cpu}=k6*, -O[2+] -funroll-loops, and confirmed it with a developer, whose comments are below. You might want to fix this before 1.4-final. Note that the gcc bug audit trail mentions the powerpc architecture, so this is apparently not just a k6 bug.

A gcc developer says:

Ok. This testcase is correctly compiled by the current 3.2 branch so your problems will very likely vanish by upgrading to the upcoming 3.2.2 official release.

Here's the story: gcc 3.2.1 shipped with a wrong-code generation bug, regression from gcc 3.2, for the following loop construct compiled with -mcpu=k6 -O2 -funroll-loops

  for (i=0; i<n; i++)
    array[i] = 0;

when (n%4 == 0). It was reported as PR optimization/8599.

The bug had already been fixed on the mainline (future 3.3 release) so I backported the fix to the 3.2 branch. Now it turned out that the fix for the mainline was not valid for the branch (still unclear why...) so, while fixing the problem exposed above, it broke the very same loop construct

  for (i=0; i<n; i++)
    array[i] = 0;

this time when (n%4 != 0).

So I reverted the backport patch and, after some analysis, came up with what I think is the correct fix for the 3.2 branch, which is in since then.

Now, according to the version string of your snapshot, it was checked out from the repository during the period when the _wrong_ fix was on the branch (see the dates in the audit trail of PR optimization/8599). Hence very likely the problems you ran into.
This would explain all the k6 bugs we have.
This query for k6 bugs shows 12:

http://bugs.gentoo.org/buglist.cgi?short_desc_type=allwordssubstr&short_desc=k6&long_desc_type=allwordssubstr&long_desc=k6&bug_file_loc_type=allwordssubstr&bug_file_loc=&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=RESOLVED&bug_status=VERIFIED&bug_status=CLOSED&emailtype1=exact&email1=&emailtype2=exact&email2=&bugidtype=include&bug_id=&votes=&changedin=&chfieldfrom=&chfieldto=Now&chfieldvalue=&cmdtype=doit&newqueryname=&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0=

11 of which are marked 'resolved' (including my xmms bug).

I will note again that the powerpc architecture is also mentioned in the gcc bug report, so this could be more than a k6 bug. I looked through the patches, and they involve gcc/gcc/do-loop.c. Nothing obviously k6-specific about it.

Question: is there a way, using flag-o-matic or something else, to override configure.in and delete all '-funroll' flags? My p5ab board doesn't have much cache, so I suspect any advantage of loop unrolling is lost in wait-states caused by additional instruction-cache misses. I'd like to globally eliminate those flags on my system, regardless of what packages specify, and in general be able to have absolute control over gcc flags. I did it with 'sed -e' in my xmms ebuild for bug-hunting. Maybe USE gcc-never and gcc-always flags would be nice, if doable.
Created attachment 7731: test code

This is the test code I was sent for this bug. Note the gcc options below:

/* PR optimization/8599 */
/* { dg-do run } */
/* { dg-options "-O2 -funroll-loops" } */
/* { dg-options "-mcpu=k6 -O2 -funroll-loops" { target i?86-*-* } } */
Completely out of the blue and quite possibly having nothing to do with anything but: Since you are all generally discussing K6 problems, are you aware of this? http://membres.lycos.fr/poulot/k6bug.html I ran into this with bug 14212.
Yeah, I've known about that bug for years, because I got (from an ebay auction, caveat emptor) a k6-233 which had this bug. I think this only applies to a run of k6s made in 1997. I found that link in one of the k6 bug reports here, and checked it out to refresh my memory. The symptoms of that bug, at least as I experienced it, were X and big C compiles getting sig11.

The bug we are talking about here is something else. Notice the concentration of reports around multimedia things. I suspect they are all specifying -funroll-loops in a (probably misguided) attempt to speed up things like codecs. An unrolled loop is going to produce more instruction fetches and thus more cache misses, and is thus probably a Bad Thing, at least for my box, which has a 550 MHz chip, a 100 MHz bus and only a 512K L2 cache.

Notice what the gcc info says about these options:

`-funroll-loops'
     Unroll loops whose number of iterations can be determined at compile time or upon entry to the loop. `-funroll-loops' implies both `-fstrength-reduce' and `-frerun-cse-after-loop'. This option makes code larger, and may or may not make it run faster.

`-funroll-all-loops'
     Unroll all loops, even if their number of iterations is uncertain when the loop is entered. This usually makes programs run more slowly. `-funroll-all-loops' implies the same options as `-funroll-loops',

Not exactly a ringing endorsement. xmms does -funroll-all-loops; either the xmms people did testing and found it made their codecs faster, or they just didn't read the above.

I think I'm going to roll my own gcc from gcc cvs and see if I can't kill those options completely.
I built my own gcc, with the cvs for the 3.2 branch:

src/0$gcc -v
Reading specs from /usr/local/gcc/usr/bin/../lib/gcc-lib/i586-pc-linux-gnu/3.2.2/specs
Configured with: ./configure --prefix=/usr : (reconfigured) ./configure : (reconfigured) ./configure --prefix=/usr
Thread model: posix
gcc version 3.2.2 20030130 (prerelease)

and it passes the test:

src/0$gcc -march=k6-2 -O3 -funroll-loops unroll-1.c -o unroll
src/0$./unroll
NO ERROR

Also, I have confirmation that this is not a k6-specific bug:

> Yes, the second regression was first found on PowerPC at -O2 -funroll-loops.

And that gcc-3.2.2 is coming real soon:

> We are in final testing phase so I'd say within a week, provided that there
> is no last minute showstopper.
:-) It was a shot in the dark. I did check to see what version of K6 you all were talking about but didn't see anything relevant, which is why I brought it up. I'm glad you found the problem!

BTW, FWIW - I'd have to say from my experiences with the K6 'B' stepping bug, that it manifests itself in more than just segfault 11. Most of my gcc ebuilds were exiting out without segfaults. ;-)

-hehe- Now I can look forward to breaking gcc 3.2.2
I've done emerge -C gcc and installed my own gcc-3.2.2-pre<something>. It builds xmms correctly. I ran the gcc tests and discussed the gcc results with the gcc developer, and he says it looks fine.

> I'm glad you found the problem!

Thanks. Me too. Hopefully a lot of stuff I've figured was just glitches in new software will start going away as I emerge stuff with the new gcc.

> BTW, FWIW - I'd have to say from my experiences with the K6 'B' stepping bug,
> that it manifests itself in more than just segfault 11. Most of my gcc ebuilds
> were exiting out without segfaults. ;-)

It isn't a reliable bug. It acted (for me) exactly like an intermittent memory problem or overheating chip.
Created attachment 7787: ebuild using the 2003-01-27 snapshot

This ebuild builds a vanilla version of the 2003-01-27 snapshot, applying no patches, and does not remove k6 -march or -mcpu flags from CFLAGS. It's making stage1 as I speak, and so in quite a few hours I might know if the final result is OK, but I expect it will be.
Please test gcc-3.2.2_pre20030131.
Emerged OK. I emerged a small app (aumix) with it and it was fine. It passes the unroll test:

testsrc/0$gcc -mcpu=k6 -O2 -funroll-loops unroll-1.c -o unroll
src/0$./unroll
No Error.
Ok, thanks for the effort and feedback Jim!
Thanks for taking care of this, Martin. I've hand-merged a patch Eric sent me to i386.md that should take care of the pesky k6-related assembly errors. It has built, and I am doing a make -k check now. I made a new patch file and sent it back to Eric with my notes. I'm optimistic that it can make it into 3.2.2. I'll let you know when/if it hits cvs, and I can send you the patch if you want.