Bug 384129 - sys-devel/gcc-4.5.3-r1 uses -mtune=generic instead of -mtune=core2 when using -march=native
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-09-22 19:16 UTC by Pacho Ramos
Modified: 2013-01-15 00:53 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
cpuid.c (cpuid.c,882 bytes, text/plain)
2011-09-22 23:05 UTC, SpanKY
x86-64 not detected properly, config.log (config.log,12.76 KB, text/plain)
2011-09-23 00:15 UTC, Matthew Thode ( prometheanfire )

Description Pacho Ramos gentoo-dev 2011-09-22 19:16:05 UTC
While the old stable gcc-4.4 used these options:

/usr/libexec/gcc/x86_64-pc-linux-gnu/4.4.5/cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf -maes -mpclmul -mpopcnt -mavx --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2 -quiet -dumpbase - -auxbase-strip /dev/null -o /tmp/ccqMDf3U.s


gcc-4.5 uses:

/usr/libexec/gcc/x86_64-pc-linux-gnu/4.5.3/cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf -maes -mpclmul -mpopcnt -mavx --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=3072 -mtune=generic -quiet -dumpbase - -auxbase-strip /dev/null -o /tmp/ccCfq05p.s

This makes no sense, as the gcc man page explains:

           generic
               Produce code optimized for the most common IA32/AMD64/EM64T processors.  If you know the CPU on which your code will run, then you
               should use the corresponding -mtune option instead of -mtune=generic.  But, if you do not know exactly what CPU users of your
               application will have, then you should use this option.
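
The two cc1 invocations above can be compared mechanically. The following is a minimal sketch, not part of the original report: the helper names parse_cc1_options and diff_options are hypothetical, and the command lines are shortened copies of the ones quoted above (temporary-file arguments dropped). It only looks at -m* options and --param pairs.

```python
import shlex


def parse_cc1_options(cmdline):
    """Collect -m* options and '--param name=value' pairs from a cc1 command line."""
    tokens = shlex.split(cmdline)
    opts = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "--param" and i + 1 < len(tokens):
            # '--param' takes the next token as 'name=value'
            name, _, value = tokens[i + 1].partition("=")
            opts["--param " + name] = value
            i += 2
            continue
        if tok.startswith("-m"):
            # e.g. '-march=core2' -> ('-march', 'core2'); '-mavx' -> ('-mavx', '')
            name, _, value = tok.partition("=")
            opts[name] = value
        i += 1
    return opts


def diff_options(old, new):
    """Return {option: (old_value, new_value)} for every option that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Shortened copies of the command lines from the report (assumption: only the
# option part matters for the comparison).
GCC44 = ("cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf -maes "
         "-mpclmul -mpopcnt -mavx --param l1-cache-size=32 "
         "--param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2")
GCC45 = ("cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf -maes "
         "-mpclmul -mpopcnt -mavx --param l1-cache-size=32 "
         "--param l1-cache-line-size=64 --param l2-cache-size=3072 -mtune=generic")

changes = diff_options(parse_cc1_options(GCC44), parse_cc1_options(GCC45))
for option, (before, after) in changes.items():
    print(f"{option}: {before} -> {after}")
```

For these two lines the only differences are the l2-cache-size parameter (256 vs 3072) and -mtune (core2 vs generic); -march and the instruction-set flags are identical.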


My emerge --info:

Portage 2.1.10.19 (default/linux/amd64/10.0/desktop/gnome, gcc-4.5.3, glibc-2.12.2-r0, 2.6.38-tuxonice-r2 x86_64)
=================================================================
System uname: Linux-2.6.38-tuxonice-r2-x86_64-Intel-R-_Core-TM-_i5-2410M_CPU_@_2.30GHz-with-gentoo-2.0.3
Timestamp of tree: Thu, 22 Sep 2011 15:30:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:          4.1_p9
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.1-r1, 3.1.3-r1
dev-util/ccache:          2.4-r9
dev-util/cmake:           2.8.4-r1
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.0.3
sys-apps/openrc:          0.8.3-r1
sys-apps/sandbox:         2.4
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.4.5, 4.5.3-r1
sys-devel/gcc-config:     1.4.1-r1
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r1
sys-kernel/linux-headers: 2.6.36.1 (virtual/os-headers)
sys-libs/glibc:           2.12.2
Repositories: gentoo sunrise x11 rainyday x-portage
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native"
DISTDIR="/usr/distfiles"
FEATURES="assume-digests binpkg-logs ccache distlocks ebuild-locks fixlafiles fixpackages multilib-strict news parallel-fetch protect-owned sandbox sfperms sign split-log strict test test-fail-continue unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://mirror.ovh.net/gentoo-distfiles/ http://ftp.heanet.ie/pub/gentoo/"
LANG="es_ES.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,--hash-style=gnu"
LINGUAS="es es_ES en_US"
MAKEOPTS="-j5"
PKGDIR="/usr/local/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/layman/sunrise /usr/portage/local/layman/x11 /usr/portage/local/layman/rainyday /usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa amd64 applet bash-completion berkdb bluetooth branding bzip2 cairo cdda cddb cdr cli consolekit cracklib crypt cups cvs cxx dbus djvu dri dts dvd dvdr dvi eds emboss enchant encode evo exif fam fat ffmpeg firefox flac fortran fuse gdbm gdu gif gnome gnome-keyring gpm gstreamer gtk gtk3 gtkstyle iconv jpeg kpathsea latex lcms ldap libnotify lyx mad mms mmx mmxext mng modules mono mp3 mp4 mpeg mudflap multilib musicbrainz nautilus ncurses network-cron networkmanager nls nptl nptlonly ntfs ntp nvidia ogg opengl openmp optimized-qmake pam pango pch pcre pdf perl png policykit ppds pppd python qt3support readline reiserfs sdl session smp sna spell sse sse2 sse3 ssl ssse3 startup-notification svg sysfs t1lib tcpd test theora threads tiff truetype udev unicode usb vaapi vcd vdpau vorbis wifi x264 xcb xml xorg xulrunner xv xvid youtube zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 
lcdm001 mtxorb ncurses text" LINGUAS="es es_ES en_US" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev nvidia vesa intel i915 i965" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS


Reproducible: Always




cat /proc/cpuinfo:

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
stepping	: 7
cpu MHz		: 800.000
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4589.60
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
stepping	: 7
cpu MHz		: 800.000
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4588.98
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 2
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
stepping	: 7
cpu MHz		: 2301.000
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4588.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 3
vendor_id	: GenuineIntel
cpu family	: 6
model		: 42
model name	: Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
stepping	: 7
cpu MHz		: 800.000
cache size	: 3072 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 2
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 x2apic popcnt aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4588.98
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
Comment 1 Agostino Sarubbo gentoo-dev 2011-09-22 21:35:41 UTC
IMHO this is because gcc-4.5 works better with -mtune=generic than with a specific tuning.
Comment 2 Pacho Ramos gentoo-dev 2011-09-22 21:59:05 UTC
It's only using generic on the laptop with the i5 processor.
Comment 3 SpanKY gentoo-dev 2011-09-22 23:05:39 UTC
Created attachment 287455 [details]
cpuid.c

post the output from this:
gcc cpuid.c && ./a.out
Comment 4 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2011-09-22 23:18:13 UTC
This is causing qemu-kvm-0.15 to make gcc think that it is a simple pentium-m instead of what native usually gives (core2; this is a first-gen i7).  Notice the COLLECT_GCC_OPTIONS

Configured with: /var/tmp/portage/sys-devel/gcc-4.5.3-r1/work/gcc-4.5.3/configure --prefix=/usr --bindir=/usr/x86_64-pc-linux-gnu/gcc-bin/4.5.3 --includedir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/include --datadir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.3 --mandir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.3/man --infodir=/usr/share/gcc-data/x86_64-pc-linux-gnu/4.5.3/info --with-gxx-include-dir=/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/include/g++-v4 --host=x86_64-pc-linux-gnu --build=x86_64-pc-linux-gnu --disable-altivec --disable-fixed-point --without-ppl --without-cloog --disable-lto --enable-nls --without-included-gettext --with-system-zlib --disable-werror --enable-secureplt --disable-multilib --enable-libmudflap --disable-libssp --enable-esp --enable-libgomp --with-python-dir=/share/gcc-data/x86_64-pc-linux-gnu/4.5.3/python --enable-checking=release --disable-libgcj --enable-languages=c,c++ --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/ --with-pkgversion='Gentoo Hardened 4.5.3-r1 p1.0, pie-0.4.5'
Thread model: posix
gcc version 4.5.3 (Gentoo Hardened 4.5.3-r1 p1.0, pie-0.4.5) 
COLLECT_GCC_OPTIONS='-B/var/tmp/portage/sys-devel/gcc-4.5.3-r1/work/build/gcc/' '-c' '-g' '-O2' '-pipe'  '-v' '-Q' '-fPIE' '-pie'
 cc1 -v -iprefix /root/../../../lib/gcc/x86_64-pc-linux-gnu/4.5.3/ conftest.c -D_FORTIFY_SOURCE=2 -march=pentium-m -mcx16 -msahf -mpopcnt --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=4096 -mtune=generic -fno-strict-overflow -dumpbase conftest.c -auxbase conftest -g -O2 -version -fPIE -fstack-protector-all -o - |
 as -V -Qy --64 -o conftest.o -
Comment 5 Ryan Hill (RETIRED) gentoo-dev 2011-09-22 23:49:59 UTC
GCC versions before 4.6 don't have a proper cost model for Core and later processors.  Tuning for generic produces better code in 4.5 so it's the default.
Comment 6 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2011-09-23 00:14:37 UTC
(In reply to comment #5)
> GCC versions before 4.6 don't have a proper cost model for Core and later
> processors.  Tuning for generic produces better code in 4.5 so it's the
> default.

This is causing my x86_64 VM to be detected as only 32-bit.


I attached this file; look at line 99:
/var/tmp/portage/sys-devel/gcc-4.5.3-r1/work/build/x86_64-pc-linux-gnu/libgcc/config.log
Comment 7 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2011-09-23 00:15:51 UTC
Created attachment 287457 [details]
x86-64 not detected properly, config.log

/var/tmp/portage/sys-devel/gcc-4.5.3-r1/work/build/x86_64-pc-linux-gnu/libgcc/config.log
Comment 8 Matthew Thode ( prometheanfire ) archtester Gentoo Infrastructure gentoo-dev Security 2011-09-23 00:30:43 UTC
I have added bug 384149 since this is a march detection issue, not mtune.
Comment 9 Pacho Ramos gentoo-dev 2011-09-23 10:33:32 UTC
(In reply to comment #3)
> Created attachment 287455 [details]
> cpuid.c
> 
> post the output from this:
> gcc cpuid.c && ./a.out

This one:

$ gcc cpuid.c && ./a.out
0x1: 0x206a7 0x1100800 0x1fbae3bf 0xbfebfbff
0x80000000: 0x80000008 0 0 0
0x80000001: 0 0 0x1 0x28100800
0x80000002: 0x20202020 0x49202020 0x6c65746e 0x20295228
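
The first line of that output can be decoded by hand. The following is a minimal sketch, assuming the four columns are EAX, EBX, ECX and EDX in that order (the source of cpuid.c is not shown in this thread, so that ordering is an assumption). The function names are hypothetical; the bit positions are the standard CPUID leaf 1 layout, and the extended-model rule follows the Intel convention.

```python
def decode_cpuid_signature(eax):
    """Split CPUID leaf 1 EAX into display family, model and stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    if base_family == 0xF:
        # Extended family is only added when the base family is 15
        family += ext_family
    model = base_model
    if base_family in (0x6, 0xF):
        # Intel convention: extended model supplies the high nibble
        model |= ext_model << 4
    return {"family": family, "model": model, "stepping": stepping}


# A selection of CPUID leaf 1 ECX feature bits (not the full list)
ECX_FEATURES = {
    "sse3": 0, "pclmulqdq": 1, "ssse3": 9, "fma": 12, "cx16": 13,
    "sse4_1": 19, "sse4_2": 20, "movbe": 22, "popcnt": 23,
    "aes": 25, "avx": 28,
}


def decode_ecx_features(ecx):
    """Map each selected feature name to True/False depending on its ECX bit."""
    return {name: bool((ecx >> bit) & 1) for name, bit in ECX_FEATURES.items()}


# Values taken from the "0x1:" line of the output above
sig = decode_cpuid_signature(0x206A7)
feat = decode_ecx_features(0x1FBAE3BF)
print(sig)
print("missing:", sorted(name for name, present in feat.items() if not present))
```

Under that assumption the signature decodes to family 6, model 42, stepping 7, which agrees with the /proc/cpuinfo listing in the description. Of the selected feature bits, only fma and movbe are clear; avx, aes, pclmulqdq, popcnt and cx16 are set.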

(In reply to comment #5)
> GCC versions before 4.6 don't have a proper cost model for Core and later
> processors.  Tuning for generic produces better code in 4.5 so it's the
> default.

Are you sure, then, that "core2" with gcc-4.4 was producing slower code than "generic" with gcc-4.5? Please note that I am getting this problem only on an i5 processor; other core2 models still seem to be detected correctly by gcc-4.5.
Comment 10 Pacho Ramos gentoo-dev 2011-09-23 12:06:54 UTC
Looks like this has changed in gcc-4.6:

\_ /usr/libexec/gcc/x86_64-pc-linux-gnu/4.6.1/cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=3072 -mtune=corei7-avx -quiet -dumpbase - -auxbase-strip /dev/null -o /tmp/ccgcWHIx.s

(well, mine is i5 instead of i7, but I guess corei7-avx is also valid for it).

I guess I will need to help fix the gcc-4.6 tracker bugs to try to get it stabilized soon ;)
Comment 11 Marcin Mirosław 2013-01-14 10:31:50 UTC
Pacho, with gcc-4.6.3 I have the same problem on a CPU E3-1230 V2 @ 3.30GHz. GCC detects it as "corei7-avx".
Comment 12 Pacho Ramos gentoo-dev 2013-01-14 19:51:13 UTC
(In reply to comment #11)
> Pacho, with gcc-4.6.3 I have the same problem, CPU E3-1230 V2 @ 3.30GHz. Gcc
> detect it as "corei7-avx".

I would open a different bug, as this refers to a different CPU, and also report it upstream to verify that it's not expected.
Comment 13 Ryan Hill (RETIRED) gentoo-dev 2013-01-15 00:53:03 UTC
Not a bug.