Summary: | sci-libs/blas-atlas-3.9.21 fails to compile with: #error "SSE3 instruction set not enabled" | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | David Kredba <kredba> |
Component: | Current packages | Assignee: | Markus Dittrich (RETIRED) <markusle> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | cmue81, ddemidov, mehrunes_dagon, sci |
Priority: | High | ||
Version: | 2008.0 | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | Build log gziped |
Description
David Kredba
2010-01-30 07:55:52 UTC
Created attachment 217901
Build log (gzipped)
Thanks for the report, I'll have a look.
Markus

Could you please try recompiling with "-march=core2" included in your CFLAGS to make sure gcc enables SSE3?
Thanks,
Markus

It works perfectly with -march=core2. Thank you.

Great! Unfortunately, I am not quite sure what the best way to proceed from here is. It seems that without an explicit -march=core2, gcc is pessimistic and disables SSE3 instructions completely. Hence, it would definitely be good to add this to your CFLAGS. I am not sure there is much I can do about this in terms of checking within the ebuild. Hence, I'd tend to mark this as fixed as is. Are there any objections?
Thanks,
Markus

No, there are not. Maybe add a little note to the ebuild, like "if the build fails complaining about SSE3, try to ...."? That would save your time and Bugzilla space. It seems to me that the build logic of the original package has changed over time. I think that something like "if there is ssse3 in /proc/cpuinfo and the model name is one of ....... then add -march=xxxxx to the CFLAGS" in the ebuild would cause more trouble than it is worth. Thank you.

Thanks for the suggestions, David. I'll have to think about what might be most beneficial to users and might indeed add a comment to the ebuild.
Thanks again,
Markus

*** Bug 314027 has been marked as a duplicate of this bug. ***

(In reply to comment #5)
> Great! Unfortunately, I am not quite sure what the best
> way to proceed from here is. It seems that without an explicit
> march=core2 gcc is pessimistic and disables SSE3 instructions completely.

Is this true for any version of gcc?

> Hence, it would definitely be good to add this to your CFLAGS.

And what if I have amd64? IMHO I should not set the compiler flag to core2 on an AMD platform. So how can I compile sci-libs/blas-atlas-3.9.21?
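The two questions raised in this thread (does the CPU support SSE3, and does gcc actually enable it with the current flags) can be checked separately. The following is only a sketch and is not taken from the ebuild or from ATLAS. It assumes a Linux-style /proc/cpuinfo, where SSE3 is listed as `pni` and SSSE3 as `ssse3`, and a GCC-compatible compiler that defines the `__SSE3__` macro when SSE3 is enabled. The function names `cpu_has_flag` and `compiler_enables_sse3` are made up for illustration.

```shell
# Hypothetical helpers, not part of the ebuild or of ATLAS.

# cpu_has_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of
# FILE (default: /proc/cpuinfo). Note: the kernel reports SSE3 as "pni".
cpu_has_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    grep -m1 '^flags' "$file" | tr ' \t' '\n\n' | grep -qxF "$flag"
}

# compiler_enables_sse3 [COMPILER FLAGS...]
# Succeeds if the compiler (assumed: $CC, default gcc) predefines __SSE3__
# when invoked with the given flags, e.g. -march=core2 or -march=native.
compiler_enables_sse3() {
    "${CC:-gcc}" "$@" -dM -E -x c /dev/null 2>/dev/null |
        grep -q '^#define __SSE3__ '
}

# Example use:
#   cpu_has_flag pni && echo "CPU reports SSE3"
#   compiler_enables_sse3 -march=native || echo "SSE3 not enabled with these flags"
```

If the first check succeeds but the second fails for the CFLAGS in use, the compiler is being more conservative than the hardware requires, which matches the situation described above.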
You need to tell portage to use -march=athlon64-sse3, or better, -march=native. This happens if you upgrade your AMD CPU from a non-SSE3 chip to an SSE3 chip.

Use the following command to see the different flags GCC activates depending on the -march flag you set:

    diff -u \
      <(export TESTFLAGS="-march=athlon64-sse3"; export OUTPUT=athlon64-sse3; touch $OUTPUT.cc; gcc $TESTFLAGS -fverbose-asm $OUTPUT.cc -S; cat $OUTPUT.s; rm -f $OUTPUT.cc $OUTPUT.s; unset OUTPUT TESTFLAGS) \
      <(export TESTFLAGS="-march=native"; export OUTPUT=native; touch $OUTPUT.cc; gcc $TESTFLAGS -fverbose-asm $OUTPUT.cc -S; cat $OUTPUT.s; rm -f $OUTPUT.cc $OUTPUT.s; unset OUTPUT TESTFLAGS)

Sample output from my machine:

    --- /dev/fd/63 2011-02-12 16:32:18.130490005 +0100
    +++ /dev/fd/62 2011-02-12 16:32:18.130490005 +0100
    @@ -1,9 +1,11 @@
    - .file "athlon64-sse3.cc"
    + .file "native.cc"
     # GNU C++ (Gentoo 4.4.5 p1.1, pie-0.4.5) version 4.4.5 (x86_64-pc-linux-gnu)
     # compiled by GNU C version 4.4.5, GMP version 5.0.1, MPFR version 3.0.0-p3.
     # GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    -# options passed: -D_GNU_SOURCE athlon64-sse3.cc -D_FORTIFY_SOURCE=2
    -# -march=athlon64-sse3 -fverbose-asm
    +# options passed: -D_GNU_SOURCE native.cc -D_FORTIFY_SOURCE=2
    +# -march=amdfam10 -mcx16 -msahf -mpopcnt --param l1-cache-size=64 --param
    +# l1-cache-line-size=64 --param l2-cache-size=512 -mtune=amdfam10
    +# -fverbose-asm
     # options enabled: -falign-loops -fargument-alias
     # -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg -fcommon
     # -fdwarf2-cfi-asm -fearly-inlining -feliminate-unused-debug-types
    @@ -18,9 +20,10 @@
     # -ftree-reassoc -ftree-scev-cprop -ftree-switch-conversion
     # -ftree-vect-loop-version -funit-at-a-time -funwind-tables
     # -fvect-cost-model -fverbose-asm -fzero-initialized-in-bss
    -# -m128bit-long-double -m3dnow -m64 -m80387 -maccumulate-outgoing-args
    -# -malign-stringops -mfancy-math-387 -mfp-ret-in-387 -mfused-madd -mglibc
    -# -mieee-fp -mmmx -mno-sse4 -mpush-args -mred-zone -msse -msse2 -msse3
    +# -m128bit-long-double -m3dnow -m64 -m80387 -mabm
    +# -maccumulate-outgoing-args -malign-stringops -mcx16 -mfancy-math-387
    +# -mfp-ret-in-387 -mfused-madd -mglibc -mieee-fp -mmmx -mno-sse4 -mpopcnt
    +# -mpush-args -mred-zone -msse -msse2 -msse3 -msse4a
     # -mtls-direct-seg-refs
     # Compiler executable checksum: 430874aaeace404ef8c2381e200db136
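A shorter way to make the same comparison, on GCC versions that support `-Q --help=target`, is to ask the compiler directly which target options it enables for a given -march value. This is only a sketch under that assumption. It also assumes the usual two-column output in which enabled switches are marked `[enabled]`. The function names `enabled_options` and `compare_march` are hypothetical.

```shell
# Hypothetical helpers; they assume "gcc -Q --help=target" prints lines
# of the form "  -msse3    [enabled]".

# Read "gcc -Q --help=target" output on stdin and print only the names
# of the options marked [enabled], one per line.
enabled_options() {
    awk '$2 == "[enabled]" { print $1 }'
}

# compare_march A B
# Show a unified diff of the target options enabled by -march=A and
# -march=B. Uses $CC if set, otherwise gcc.
compare_march() {
    a=$(mktemp)
    b=$(mktemp)
    "${CC:-gcc}" -march="$1" -Q --help=target | enabled_options | sort > "$a"
    "${CC:-gcc}" -march="$2" -Q --help=target | enabled_options | sort > "$b"
    diff -u "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

For example, `compare_march athlon64-sse3 native` should list any extra instruction-set switches that -march=native turns on for the local CPU.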