Bug 558466 (PR67310) - sys-devel/gcc fails to compile using "-march=native" on VIA nano CPU
Summary: sys-devel/gcc fails to compile using "-march=native" on VIA nano CPU
Status: RESOLVED FIXED
Alias: PR67310
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Development
Hardware: AMD64 Linux
Importance: Normal enhancement
Assignee: Gentoo Toolchain Maintainers
URL: https://gcc.gnu.org/PR67310
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2015-08-23 12:29 UTC by Jocelyn Mayer
Modified: 2017-02-15 08:06 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
patch to properly detect VIA nano CPU for gcc 4.8.x versions (VIA_nano_4.8.diff,2.16 KB, patch)
2015-08-23 12:32 UTC, Jocelyn Mayer
patch to properly detect VIA nano CPU for gcc 4.9+ versions (VIA_nano_4.9.diff,1.17 KB, patch)
2015-08-23 12:32 UTC, Jocelyn Mayer
Overlay for sys-devel/gcc with VIA nano CPU patches included for gcc 4.8.4, 4.8.5, 4.9.3 and 5.2 (sys-devel_gcc_overlay.tar.bz2,19.86 KB, application/x-bzip)
2015-08-23 12:34 UTC, Jocelyn Mayer
Patch committed to gcc mainstream to fix VIA Nano issue with -march=native (gcc_nano.patch,2.46 KB, patch)
2016-06-02 08:28 UTC, Jocelyn Mayer

Description Jocelyn Mayer 2015-08-23 12:29:37 UTC
When using the "-march=native" option with gcc 4.8.x and 4.9.x, a VIA Nano CPU is detected as "core2" instead of "x86-64", and compilation then fails.
Using "-march=x86-64" succeeds but does not use the full instruction set of the CPU.
Thus one has to compile using:
CFLAGS="-march=x86-64 -mcx16 -msahf -mfxsr --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=1024 -mmmx -msse -msse2 -msse3 -mssse3" in order to get optimized code, which is quite painful.
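For anyone who needs to build that workaround by hand, here is a minimal Python sketch that derives the instruction-set options from the "flags" line of /proc/cpuinfo. The function name and the mapping table are illustrative assumptions, not part of gcc or of the attached patches. The table covers only the eight options listed above and relies on the kernel reporting SSE3 as "pni" and LAHF/SAHF in long mode as "lahf_lm".

```python
# Hypothetical helper: maps /proc/cpuinfo flag names to the GCC options
# used in the manual CFLAGS workaround. Only those eight options are covered.
CPUINFO_TO_GCC = [
    ("cx16", "-mcx16"),
    ("lahf_lm", "-msahf"),   # LAHF/SAHF available in 64-bit mode
    ("fxsr", "-mfxsr"),
    ("mmx", "-mmmx"),
    ("sse", "-msse"),
    ("sse2", "-msse2"),
    ("pni", "-msse3"),       # the kernel reports SSE3 as "pni"
    ("ssse3", "-mssse3"),
]


def isa_options(flags_line):
    """Return the GCC -m options matching a 'flags : ...' line."""
    _, sep, value = flags_line.partition(":")
    if not sep:
        # No "flags :" prefix, treat the whole string as the flag list.
        value = flags_line
    present = set(value.split())
    return [opt for flag, opt in CPUINFO_TO_GCC if flag in present]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                print(" ".join(isa_options(line)))
                break
```

Fed the flags line from the cpuinfo dump in this report, it returns all eight options. The cache "--param" values still have to be added by hand.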

Reproducible: Always

Steps to Reproduce:
1. > cat /proc/cpuinfo
processor       : 0
vendor_id       : CentaurHauls
cpu family      : 6
model           : 15
model name      : VIA Nano processor U2250 (1.6GHz Capable)
stepping        : 3
cpu MHz         : 800.000
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm pbe syscall nx lm constant_tsc rep_good nopl pni monitor vmx est tm2 ssse3 cx16 xtpr rng rng_en ace ace_en ace2 phe phe_en lahf_lm
bugs            :
bogomips        : 3191.71
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

2. > echo 'int main(){return 0;}' > test.c && gcc -march=native -O2 -pipe  test.c -o test && rm test.c test
Actual Results:  
Compilation fails with the following error message:
[...]
test.c:1:0: error: CPU you selected does not support x86-64 instruction set
 int main(){return 0;}
 ^


Expected Results:  
Compilation to succeed.


Running gcc with the "-v -Q" options gives more information about the problem:
gcc selects "-march=core2" and "-mtune=i386":

GNU C (GCC) version 4.8.4 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.8.4, GMP version 5.1.3, MPFR version 3.1.2-p10, MPC version 1.0.2
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
options passed:  -v
 -iprefix /usr/local/src/gcc-4.8.4/host-x86_64-unknown-linux-gnu/gcc/../lib/gcc/x86_64-unknown-linux-gnu/4.8.4/
 test.c -march=core2 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul
 -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi
 -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -mno-sse4.2 -mno-sse4.1 -mno-lzcnt
 -mno-rtm -mno-hle -mno-rdrnd -mno-f16c -mno-fsgsbase -mno-rdseed
 -mno-prfchw -mno-adx -mfxsr -mno-xsave -mno-xsaveopt
 --param l1-cache-size=64 --param l1-cache-line-size=64
 --param l2-cache-size=1024 -mtune=i386 -O2 -fno-use-linker-plugin
options enabled:  -faggressive-loop-optimizations -falign-labels
 -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg
 -fcaller-saves -fcombine-stack-adjustments -fcommon -fcompare-elim
 -fcprop-registers -fcrossjumping -fcse-follow-jumps -fdefer-pop
 -fdelete-null-pointer-checks -fdevirtualize -fdwarf2-cfi-asm
 -fearly-inlining -feliminate-unused-debug-types -fexpensive-optimizations
 -fforward-propagate -ffunction-cse -fgcse -fgcse-lm -fgnu-runtime
 -fgnu-unique -fguess-branch-probability -fhoist-adjacent-loads -fident
 -fif-conversion -fif-conversion2 -findirect-inlining -finline
 -finline-atomics -finline-functions-called-once -finline-small-functions
 -fipa-cp -fipa-profile -fipa-pure-const -fipa-reference -fipa-sra
 -fira-hoist-pressure -fira-share-save-slots -fira-share-spill-slots
 -fivopts -fkeep-static-consts -fleading-underscore -fmath-errno
 -fmerge-constants -fmerge-debug-strings -fmove-loop-invariants
 -fomit-frame-pointer -foptimize-register-move -foptimize-sibling-calls
 -foptimize-strlen -fpartial-inlining -fpeephole -fpeephole2
 -fprefetch-loop-arrays -free -freg-struct-return -fregmove
 -freorder-blocks -freorder-functions -frerun-cse-after-loop
 -fsched-critical-path-heuristic -fsched-dep-count-heuristic
 -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
 -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
 -fsched-stalled-insns-dep -fshow-column -fshrink-wrap -fsigned-zeros
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstrict-aliasing
 -fstrict-overflow -fstrict-volatile-bitfields -fsync-libcalls
 -fthread-jumps -ftoplevel-reorder -ftrapping-math -ftree-bit-ccp
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch -ftree-coalesce-vars
 -ftree-copy-prop -ftree-copyrename -ftree-cselim -ftree-dce
 -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
 -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
 -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop -ftree-pre
 -ftree-pta -ftree-reassoc -ftree-scev-cprop -ftree-sink
 -ftree-slp-vectorize -ftree-slsr -ftree-sra -ftree-switch-conversion
 -ftree-tail-merge -ftree-ter -ftree-vect-loop-version -ftree-vrp
 -funit-at-a-time -funwind-tables -fvar-tracking -fvar-tracking-assignments
 -fzero-initialized-in-bss -m128bit-long-double -m64 -m80387
 -maccumulate-outgoing-args -malign-stringops -mcx16 -mfancy-math-387
 -mfp-ret-in-387 -mfxsr -mglibc -mieee-fp -mlong-double-80 -mmmx -mno-sse4
 -mpush-args -mred-zone -msahf -msse -msse2 -msse3 -mssse3
 -mtls-direct-seg-refs

where we expect to get "-march=x86-64":
(here's the output of a patched version)

GNU C (GCC) version 4.8.4 (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.8.4, GMP version 5.1.3, MPFR version 3.1.2-p10, MPC version 1.0.2
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
options passed:  -v
 -iprefix /usr/local/src/gcc-4.8.4/host-x86_64-unknown-linux-gnu/gcc/../lib/gcc/x86_64-unknown-linux-gnu/4.8.4/
 test.c -march=x86-64 -mcx16 -msahf -mno-movbe -mno-aes -mno-pclmul
 -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi
 -mno-bmi2 -mno-tbm -mno-avx -mno-avx2 -msse3 -mssse3 -mno-sse4.2
 -mno-sse4.1 -mno-lzcnt -mno-rtm -mno-hle -mno-rdrnd -mno-f16c
 -mno-fsgsbase -mno-rdseed -mno-prfchw -mno-adx -mfxsr -mno-xsave
 -mno-xsaveopt --param l1-cache-size=64 --param l1-cache-line-size=64
 --param l2-cache-size=1024 -mtune=generic -O2 -fno-use-linker-plugin
options enabled:  -faggressive-loop-optimizations -falign-labels
 -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg
 -fcaller-saves -fcombine-stack-adjustments -fcommon -fcompare-elim
 -fcprop-registers -fcrossjumping -fcse-follow-jumps -fdefer-pop
 -fdelete-null-pointer-checks -fdevirtualize -fdwarf2-cfi-asm
 -fearly-inlining -feliminate-unused-debug-types -fexpensive-optimizations
 -fforward-propagate -ffunction-cse -fgcse -fgcse-lm -fgnu-runtime
 -fgnu-unique -fguess-branch-probability -fhoist-adjacent-loads -fident
 -fif-conversion -fif-conversion2 -findirect-inlining -finline
 -finline-atomics -finline-functions-called-once -finline-small-functions
 -fipa-cp -fipa-profile -fipa-pure-const -fipa-reference -fipa-sra
 -fira-hoist-pressure -fira-share-save-slots -fira-share-spill-slots
 -fivopts -fkeep-static-consts -fleading-underscore -fmath-errno
 -fmerge-constants -fmerge-debug-strings -fmove-loop-invariants
 -fomit-frame-pointer -foptimize-register-move -foptimize-sibling-calls
 -foptimize-strlen -fpartial-inlining -fpeephole -fpeephole2
 -fprefetch-loop-arrays -free -freg-struct-return -fregmove
 -freorder-blocks -freorder-functions -frerun-cse-after-loop
 -fsched-critical-path-heuristic -fsched-dep-count-heuristic
 -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
 -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
 -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fshrink-wrap
 -fsigned-zeros -fsplit-ivs-in-unroller -fsplit-wide-types
 -fstrict-aliasing -fstrict-overflow -fstrict-volatile-bitfields
 -fsync-libcalls -fthread-jumps -ftoplevel-reorder -ftrapping-math
 -ftree-bit-ccp -ftree-builtin-call-dce -ftree-ccp -ftree-ch
 -ftree-coalesce-vars -ftree-copy-prop -ftree-copyrename -ftree-cselim
 -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
 -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
 -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop -ftree-pre
 -ftree-pta -ftree-reassoc -ftree-scev-cprop -ftree-sink
 -ftree-slp-vectorize -ftree-slsr -ftree-sra -ftree-switch-conversion
 -ftree-tail-merge -ftree-ter -ftree-vect-loop-version -ftree-vrp
 -funit-at-a-time -funwind-tables -fvar-tracking -fvar-tracking-assignments
 -fzero-initialized-in-bss -m128bit-long-double -m64 -m80387
 -maccumulate-outgoing-args -malign-stringops -mcx16 -mfancy-math-387
 -mfp-ret-in-387 -mfxsr -mglibc -mieee-fp -mlong-double-80 -mmmx -mno-sse4
 -mpush-args -mred-zone -msahf -msse -msse2 -msse3 -mssse3
 -mtls-direct-seg-refs
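
The two dumps are long, so the relevant differences are easy to miss. As a rough aid, the Python sketch below extracts the "options passed" section from two such dumps and lists what appears in only one of them. The function names are hypothetical, and the parsing assumes the "options passed:" / "options enabled:" layout shown above.

```python
import re


def passed_options(dump):
    """Return the options listed between 'options passed:' and 'options enabled:'."""
    m = re.search(r"options passed:(.*?)(?:options enabled:|\Z)", dump, re.S)
    if m is None:
        return []
    tokens = m.group(1).split()
    opts = []
    i = 0
    while i < len(tokens):
        # Keep "--param name=value" together as one option.
        if tokens[i] == "--param" and i + 1 < len(tokens):
            opts.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            opts.append(tokens[i])
            i += 1
    return opts


def diff_options(before, after):
    """Return (options only in 'before', options only in 'after'), in dump order."""
    b, a = passed_options(before), passed_options(after)
    only_before = [o for o in b if o not in a]
    only_after = [o for o in a if o not in b]
    return only_before, only_after
```

Applied to the unpatched and patched outputs above, this should single out the -march and -mtune changes plus the added -msse3 and -mssse3 options.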

The attached patches add VIA Nano CPU detection for gcc 4.8.x and 4.9.x.
The 4.9.x patch also applies to gcc 5.2 but, for some reason, I cannot get gcc to enter the CPU detection routine, as if the "-march=native", "-mtune=native" and "-mcpu=native" options were discarded somewhere. However, gcc 5.2 does compile for the VIA Nano CPU with "-march=native", but fails to enable SSE3 and SSSE3 support.
You'll also find attached the sys-devel/gcc overlay directory I created for my tests.
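
To make the expected outcome concrete, here is a small Python sketch of the decision this report asks for: a CPU whose vendor is "CentaurHauls" and which advertises long mode ("lm") should end up with "-march=x86-64". This is only an illustration of the expected result based on /proc/cpuinfo. It is not the logic of the attached patches, which work inside gcc's own detection routine, and the function name is hypothetical.

```python
def expected_march(cpuinfo_text):
    """Return the -march value this report expects, or None if it does not apply.

    Illustrative only: looks at the first 'vendor_id' and 'flags' entries
    of a /proc/cpuinfo dump.
    """
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence of each key (first CPU listed).
            fields.setdefault(key.strip(), value.strip())
    flags = fields.get("flags", "").split()
    if fields.get("vendor_id") == "CentaurHauls" and "lm" in flags:
        return "x86-64"
    return None


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(expected_march(f.read()))
```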

As a test, I recompiled the whole distribution using patched gcc 4.8.4 with:
emerge --verbose --empty @system
emerge --verbose --empty @world  (in a newly opened shell)
It compiled and runs fine.

gcc versions 4.8.5 and 4.9.3 were only checked by inspecting their output on a test program.

Please note that these patches should fix support for all known VIA Nano CPUs but won't help with VIA Eden support, which is likely to be broken too as far as I can see.

For information, I also opened a bug on the gcc bug tracker:
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67310>

Please consider applying these patches.
Comment 1 Jocelyn Mayer 2015-08-23 12:32:14 UTC
Created attachment 409920
patch to properly detect VIA nano CPU for gcc 4.8.x versions
Comment 2 Jocelyn Mayer 2015-08-23 12:32:39 UTC
Created attachment 409922
patch to properly detect VIA nano CPU for gcc 4.9+ versions
Comment 3 Jocelyn Mayer 2015-08-23 12:34:45 UTC
Created attachment 409928
Overlay for sys-devel/gcc with VIA nano CPU patches included for gcc 4.8.4, 4.8.5, 4.9.3 and 5.2
Comment 4 Jocelyn Mayer 2016-06-02 08:26:26 UTC
The following patch has been committed by the gcc maintainers. It makes the -march=native option usable on any x86_64 VIA CPU and applies to gcc >= 4.9.
Refer to:
https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00044.html
Please consider adding this patch to the Gentoo patchset until it is committed to all gcc branches.
Comment 5 Jocelyn Mayer 2016-06-02 08:28:27 UTC
Created attachment 436152
Patch committed to gcc mainstream to fix VIA Nano issue with -march=native
Comment 6 SpanKY gentoo-dev 2016-06-13 18:50:20 UTC
i've added the backport to gcc-5.4.0:
https://sources.gentoo.org/gentoo/src/patchsets/gcc/5.4.0/gentoo/71_all_gcc-5-march-native-pr67310.patch

i'll close the bug once it's queued for 4.9 too
Comment 7 SpanKY gentoo-dev 2017-02-15 08:06:56 UTC
fix is in the 4.9.4 release