Bug 332733 - sys-devel/gcc-4.4.3-r2: probably wrong optimization options chosen by "-march=native"
Status: RESOLVED WORKSFORME
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Toolchain Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-08-14 12:07 UTC by Pacho Ramos
Modified: 2010-08-16 10:51 UTC
CC List: 0 users

See Also:
Package list:
Runtime testing required: ---


Description Pacho Ramos gentoo-dev 2010-08-14 12:07:19 UTC
I checked which options -march=native selects on one of my laptops, following these instructions:
http://en.chys.info/2010/04/what-exactly-marchnative-means/

But when reviewing the options actually used, I got:

$ ps af | grep cc1
18118 pts/1    S+     0:00  \_ grep --colour=auto cc1
18116 pts/0    S+     0:00      \_ /usr/libexec/gcc/i686-pc-linux-gnu/4.4.3/cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=prescott --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=generic -quiet -dumpbase - -auxbase-strip /dev/null -o /tmp/ccLS5xw5.s
13580 tty3     S+     0:00          \_ /usr/libexec/gcc/i686-pc-linux-gnu/4.4.3/cc1 -quiet - -D_FORTIFY_SOURCE=2 -march=prescott --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=generic -quiet -dumpbase - -auxbase-strip /dev/null -o /tmp/ccSnTxP2.s

My /proc/cpuinfo is the following:

$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           T2300  @ 1.66GHz
stepping	: 8
cpu MHz		: 996.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips	: 3324.55
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 14
model name	: Genuine Intel(R) CPU           T2300  @ 1.66GHz
stepping	: 8
cpu MHz		: 996.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts aperfmperf pni monitor vmx est tm2 xtpr pdcm
bogomips	: 3324.56
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:

I see two problems:
1. -mtune=generic is being passed instead of a CPU-specific value such as -mtune=<cpu>

According to "man gcc", it looks like the code is being tuned for a generic set of CPUs instead of this specific one:

           generic
               Produce code optimized for the most common IA32/AMD64/EM64T processors.  If you know the CPU on which your code will run, then
               you should use the corresponding -mtune option instead of -mtune=generic.  But, if you do not know exactly what CPU users of
               your application will have, then you should use this option.

               As new processors are deployed in the marketplace, the behavior of this option will change.  Therefore, if you upgrade to a
               newer version of GCC, the code generated option will change to reflect the processors that were most common when that version
               of GCC was released.

               There is no -march=generic option because -march indicates the instruction set the compiler can use, and there is no generic
               instruction set applicable to all processors.  In contrast, -mtune indicates the processor (or, in this case, collection of
               processors) for which the code is optimized.

2. -march=prescott

I am not sure my processor is really a Prescott: even though it supports SSE3, it is listed as a Pentium M based processor at the following links:
http://en.wikipedia.org/wiki/List_of_Intel_microprocessors#Intel_Core
http://en.wikipedia.org/wiki/Yonah_(microprocessor)
http://en.wikipedia.org/wiki/List_of_Intel_Core_microprocessors#Core_Duo

So I would pass "-march=pentium-m -msse3" instead.

Thanks a lot for your help

Reproducible: Always




$ emerge --info
Portage 2.1.8.3 (default/linux/amd64/10.0/desktop/gnome, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-tuxonice-r1 x86_64)
=================================================================
System uname: Linux-2.6.34-tuxonice-r1-x86_64-Intel-R-_Core-TM-2_Duo_CPU_T9300_@_2.50GHz-with-gentoo-1.12.13
Timestamp of tree: Sun, 01 Aug 2010 08:35:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     4.0_p37
dev-java/java-config: 2.1.11
dev-lang/python:     2.6.5-r2
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.8.1-r2
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=native"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe -march=native"
DISTDIR="/usr/distfiles"
FEATURES="assume-digests autoaddcvs ccache cvs distlocks fixpackages multilib-strict news parallel-fetch protect-owned sandbox sfperms sign split-log strict test test-fail-continue unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://ftp.free.fr/mirrors/ftp.gentoo.org"
LANG="es_ES.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="es es_ES en_US"
MAKEOPTS="-j3"
PKGDIR="/usr/local/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/layman/sunrise /usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X aac acl acpi alsa amd64 avahi bash-completion berkdb bluetooth branding bzip2 cairo cleartype cli consolekit cracklib crypt cups cxx daap dbus dell djvu dri dts dvi eds emboss encode evo exif fam fat ffmpeg firefox flac fortran fuse gdbm gdu gif git gnome gnome-keyring gpm gstreamer gtk hal hddtemp iconv ieee1394 ipod java jpeg laptop latex lcdfilter lcms libnotify lm_sensors lyx lzma mad mikmod mmx mmxext mng modules mono mp3 mp4 mpeg mudflap multilib musicbrainz nautilus ncurses network network-cron networkmanager nls nptl nptlonly ntfs nvidia ogg opengl openmp pam pango pch pcre pdf perl pidgin png policykit ppds pppd python qt3support readline reflection reiserfs sdl session spell spl sse sse2 sse3 sse4 ssl ssse3 startup-notification subversion svg sysfs t1lib tcpd threads tiff truetype unicode usb v4l2 vdpau vorbis webkit wifi x264 xattr xcb xml xmp xorg xpm xulrunner xv xvid zeroconf zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="es es_ES en_US" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia nv" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Pacho Ramos gentoo-dev 2010-08-14 12:12:38 UTC
Here is the proper emerge --info from the affected computer (sorry, the one above is from my other laptop):

$ emerge --info
Portage 2.1.8.3 (default/linux/x86/10.0/desktop/gnome, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-tuxonice-r1 i686)
=================================================================
System uname: Linux-2.6.34-tuxonice-r1-i686-Genuine_Intel-R-_CPU_T2300_@_1.66GHz-with-gentoo-1.12.13
Timestamp of tree: Fri, 30 Jul 2010 12:00:20 +0000
ccache version 2.4 [enabled]
app-shells/bash:     4.0_p37
dev-java/java-config: 2.1.11
dev-lang/python:     2.6.5-r2
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.6.4-r3
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="*"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=native -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -march=native -pipe -fomit-frame-pointer"
DISTDIR="/usr/distfiles"
FEATURES="assume-digests ccache distlocks fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://ftp.free.fr/mirrors/ftp.gentoo.org"
LANG="es_ES.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="es es_ES en_US"
MAKEOPTS="-j3"
PKGDIR="/usr/local/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/layman/sunrise /usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa avahi bash-completion berkdb branding bzip2 cairo cdda cddb cdinstall cdr cleartype cli consolekit cracklib crypt cups cxx daap dbus djvu dri dts dvd dvdr dvi eds emboss encode evo exif fam fat ffmpeg firefox flac fortran fuse gdbm gdu gif gnome gnome-keyring gpm gstreamer gtk hal iconv java jpeg laptop latex lcms libnotify lm_sensors lzma mad mikmod mmx mmxext mng modules mono mp3 mp4 mpeg mudflap musicbrainz nautilus ncurses network network-cron networkmanager nls nptl nptlonly ntfs nvidia ogg opengl openmp pam pango pch pcre pdf perl png policykit ppds pppd python qt3support readline reflection reiserfs sdl session spell spl sse sse2 sse3 ssl startup-notification svg sysfs t1lib tcpd threads tiff truetype udev unicode usb v4l2 vcd vorbis wifi x264 x86 xattr xcb xml xmp xorg xpm xulrunner xv xvid zeroconf zlib" ALSA_CARDS="hda-intel" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="es es_ES en_US" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia nv vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 2 SpanKY gentoo-dev 2010-08-14 17:50:52 UTC
-mtune=generic can tune however gcc feels is best.  just because it picks a few common procs doesn't mean that it won't work on others.  it just means it won't work as well as it could if you selected a -mtune=<cpu> that matched the cpu it actually runs on.

your cpuinfo output says you *don't* have sse3 support.  are you sure that output is from the right computer?
Comment 3 Ryan Hill (RETIRED) gentoo-dev 2010-08-14 19:57:33 UTC
"pni" (Prescott New Instructions) is sse3.

Everything looks fine to me.  -mtune=generic is standard for -march=native.  -march=prescott is the correct setting for that CPU on a 32bit system (if it was x86_64 it would be core2 or nocona depending on the GCC version).  My old laptop had the same processor and a guy at Intel told me to use prescott instead of pentium-m.

If you're interested in how it decides what options to use check out gcc/config/i386/driver-i386.c.
Comment 4 Pacho Ramos gentoo-dev 2010-08-15 12:03:39 UTC
(In reply to comment #2)
> -mtune=generic can tune however gcc feels is best.  just because it picks a few
> common procs doesn't mean that it won't work on others.  it just means it won't
> work as well as it could if you selected a -mtune=<cpu> that matched the cpu it
> actually runs on.

Then why doesn't gcc select "-mtune=prescott" instead of generic, if that would be better? Please note that on my other laptop (the one the wrong emerge --info came from), gcc properly passes -mtune=core2. The gcc manpage says that "native" should choose the best option for the host processor, and "generic" does not seem to be that :-/

(In reply to comment #3)
> "pni" (Prescott New Instructions) is sse3.
> 
> Everything looks fine to me.  -mtune=generic is standard for -march=native. 

It is not the standard on all systems. I can only test on two laptops right now, and on my Dell it doesn't pass "generic" (but maybe that is because it's a 64-bit system :-/)

> -march=prescott is the correct setting for that CPU on a 32bit system (if it
> was x86_64 it would be core2 or nocona depending on the GCC version).

If I remember correctly, the T2300 doesn't support x86_64 at all... are you sure you had exactly the same processor?

>   My old
> laptop had the same processor and a guy at Intel told me to use prescott
> instead of pentium-m.
> 
> If you're interested in how it decides what options to use check out
> gcc/config/i386/driver-i386.c.
> 

Will take a look at it, thanks :-)
Comment 5 Pacho Ramos gentoo-dev 2010-08-15 12:26:23 UTC
> > If you're interested in how it decides what options to use check out
> > gcc/config/i386/driver-i386.c.
> > 
> 
> Will take a look at it, thanks :-)
> 

It looks like this is a "special case", but maybe I have misunderstood how it works:

    case PROCESSOR_PENTIUMPRO:
      if (has_longmode)
        /* It is Core 2 Duo.  */
        cpu = "core2";
      else if (arch)
        {
          if (has_sse3)
            /* It is Core Duo.  */
            cpu = "prescott";
          else if (has_sse2)
            /* It is Pentium M.  */
            cpu = "pentium-m";
          else if (has_sse)
            /* It is Pentium III.  */
            cpu = "pentium3";
          else if (has_mmx)
            /* It is Pentium II.  */
            cpu = "pentium2";
          else
            /* Default to Pentium Pro.  */
            cpu = "pentiumpro";
        }
      else
        /* For -mtune, we default to -mtune=generic.  */
        cpu = "generic";
      break;

So it seems that it chooses "prescott" because the processor has "sse3", and "-mtune=generic" because that is forced for some reason :-|

Do you know where I could ask why these options were chosen? I have looked at the NEWS and ChangeLog files without success :-(

Thanks
Comment 6 Ryan Hill (RETIRED) gentoo-dev 2010-08-15 19:51:41 UTC
(In reply to comment #4)

> Then why doesn't gcc select "-mtune=prescott" instead of generic, if that
> would be better? Please note that on my other laptop (the one the wrong
> emerge --info came from), gcc properly passes -mtune=core2. The gcc manpage
> says that "native" should choose the best option for the host processor, and
> "generic" does not seem to be that :-/

No clue.  You'd have to ask upstream.  But my point is it's working as they intended, so no bug.

> > -march=prescott is the correct setting for that CPU on a 32bit system (if it
> > was x86_64 it would be core2 or nocona depending on the GCC version).
> 
> > If I remember correctly, the T2300 doesn't support x86_64 at all... are you
> > sure you had exactly the same processor?

My mistake, it is only 32bit.
Comment 7 Pacho Ramos gentoo-dev 2010-08-16 10:51:35 UTC
OK, thanks :-)