Bug 460196 - Revise GCC Optimization guide's -O3 recommendation
Summary: Revise GCC Optimization guide's -O3 recommendation
Status: RESOLVED OBSOLETE
Alias: None
Product: [OLD] Docs on www.gentoo.org
Classification: Unclassified
Component: Other documents
Hardware: All All
Importance: Normal normal
Assignee: Docs Team
URL: http://www.gentoo.org/doc/en/gcc-opti...
Whiteboard:
Keywords: Goal
Depends on:
Blocks:
 
Reported: 2013-03-04 00:26 UTC by Richard Yao (RETIRED)
Modified: 2013-07-27 20:35 UTC

Description Richard Yao (RETIRED) gentoo-dev 2013-03-04 00:26:55 UTC
The advice against using -O3 is currect given by the GCC Optimization documentation is correct. However, the statement about "compilation failure or unexpected program behavior" is not strictly correct. Specifically, package maintainers will replace -O3 with -O2 in CFLAGS whenever -O3 causes runtime issues.

I suggest changing the documentation to state:

-O3: This is the highest level of optimization possible. It improves upon -O2 by enabling optimization passes that dramatically increase both compilation time and binary size for what is often marginal benefit. The significant increase in binary size increases pressure on both the kernel disk cache and CPU hardware caches. This reduces performance in comparison to -O2 in most real-world workloads. The few software packages where -O3 has greater than marginal benefit often use handwritten assembly instead of C, which makes compiler optimization irrelevant. For these reasons, we recommend against building packages with -O3.
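
For anyone who wants to see which optimizer flags -O3 actually adds on a given compiler, GCC can report its optimizer settings per level. The following is a minimal sketch, assuming GCC's documented `-Q --help=optimizers` query. The helper names opt_flags and diff_opt_levels are made up for illustration, and the output differs between GCC versions:

```shell
#!/bin/sh
# Sketch: show optimizer flags whose state differs between two -O levels.
# CC defaults to gcc; opt_flags and diff_opt_levels are hypothetical names.

opt_flags() {
    # $1 is an optimization level such as -O2 or -O3.
    # The level must precede --help=optimizers so that it affects the report.
    "${CC:-gcc}" -c -Q "$1" --help=optimizers 2>/dev/null |
        grep -E '^[[:space:]]+-f' | sort
}

diff_opt_levels() {
    tmp_a=$(mktemp) && tmp_b=$(mktemp) || return 1
    opt_flags "$1" > "$tmp_a"
    opt_flags "$2" > "$tmp_b"
    diff "$tmp_a" "$tmp_b"
    rc=$?                      # 0 = identical, 1 = the levels differ
    rm -f "$tmp_a" "$tmp_b"
    return "$rc"
}

# Only run the comparison when a compiler is actually installed.
if command -v "${CC:-gcc}" >/dev/null 2>&1; then
    diff_opt_levels -O2 -O3 || true
fi
```

Lines prefixed with `>` in the diff show the -O3 state of each flag that changed. This lists which flags change, not what they cost in compile time or binary size.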
Comment 1 Richard Yao (RETIRED) gentoo-dev 2013-03-04 00:30:19 UTC
I missed a typo when revising the initial comment. Please do a mental `s/is currect //` when reading it.
Comment 2 Richard Yao (RETIRED) gentoo-dev 2013-03-04 00:45:19 UTC
It just occurred to me that we might want to correct the guide to add -Ofast, which is the new highest level of optimization possible. I should probably amend my original proposal to be:


-O3: This adds some additional optimization passes to those used by -O2. The additional optimization passes dramatically increase both compilation time and binary size for what is often marginal benefit. The significant increase in binary size increases pressure on both the kernel disk cache and CPU hardware caches. This reduces performance in comparison to -O2 in most real-world workloads. The few software packages where -O3 has greater than marginal benefit often use handwritten assembly instead of C, which makes compiler optimization irrelevant. For these reasons, we recommend against building packages with -O3.
-Ofast: This is the highest level of optimization possible, and is equivalent to specifying -O3 -ffast-math. Using -ffast-math will modify floating point calculations whenever a faster calculation is known that generates a similar result. This is equivalent to saying that it is okay for 2 + 2 = 5 because 5 is close to 4. This will cause runtime failures and other odd behavior in nearly all software that does floating point arithmetic. Therefore -Ofast is not supported.
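
One concrete way to see the kind of change -ffast-math permits is compensated (Kahan) summation, where the correction term is algebraically zero and so may be removed once the compiler is allowed to reassociate floating point math. The sketch below is a hypothetical demo, not taken from the guide: the file name kahan.c and the loop are assumptions, and whether the two builds differ depends on the GCC version and target:

```shell
#!/bin/sh
# Hypothetical demo: Kahan summation built with and without -ffast-math.
# The temporary directory is left in place so the source can be inspected.
workdir=$(mktemp -d) || exit 1

cat > "$workdir/kahan.c" <<'EOF'
#include <stdio.h>

int main(void) {
    float sum = 0.0f, comp = 0.0f;   /* running sum and rounding compensation */
    for (int i = 0; i < 10000000; i++) {
        float y = 0.1f - comp;
        float t = sum + y;
        comp = (t - sum) - y;        /* algebraically zero; fast-math may drop it */
        sum = t;
    }
    printf("%.1f\n", sum);
    return 0;
}
EOF

# Build and run once per flag set, but only if gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    for flags in "-O3" "-O3 -ffast-math"; do
        # $flags is intentionally unquoted so it splits into separate options.
        gcc -std=c99 $flags -o "$workdir/kahan" "$workdir/kahan.c" &&
            printf '%s: ' "$flags" && "$workdir/kahan"
    done
fi
```

If the two printed sums differ, the compensation step was optimized away in the -ffast-math build. If they match, this particular compiler kept it, which says nothing about other code.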
Comment 3 Mark Wright gentoo-dev 2013-03-04 01:07:46 UTC
This is not actually recommended, and -flto is unsupported on Gentoo.
The gcc devs discussed building gcc with -O3 -flto and
profiled feedback here:

http://marc.info/?l=gcc&m=134424163230104&w=4

The gcc build documentation describes using -flto and profiled feedback:

http://gcc.gnu.org/install/build.html

  `bootstrap-lto'
  Enables Link-Time Optimization for host tools during
  bootstrapping. `BUILD_CONFIG=bootstrap-lto' is equivalent to adding
  -flto to `BOOT_CFLAGS'. 

  ...

  To bootstrap the compiler with profile feedback, use make profiledbootstrap.

So translating that to Gentoo:

(1) Contents of /etc/portage/env/O3-flto-cflags

CFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"
CXXFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"
BOOT_CFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"

(2) Add line to /etc/portage/package.env

sys-devel/gcc:4.7 O3-flto-cflags

(3) Add line to /etc/portage/make.conf

GCC_MAKE_TARGET="profiledbootstrap"

(4) Try building it, probably best to enable the testsuite
(will take a long time to build, testsuite takes longer but probably
better to check the resulting compiler):

FEATURES=test nice emerge -v sys-devel/gcc:4.7
Comment 4 Richard Yao (RETIRED) gentoo-dev 2013-03-04 01:13:29 UTC
I should probably clarify that this bug involves our recommendations for make.conf. Individual packages that might benefit from -O3 have no effect on such recommendations.
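
For reference, the kind of global setting under discussion looks roughly like the fragment below. This is only an illustrative sketch of a make.conf following the -O2 recommendation; the -march=native and -pipe values are borrowed from comment 3 and are assumptions, not something this bug prescribes:

```shell
# /etc/portage/make.conf (fragment, illustrative values only)
# -O2 is the level the guide recommends for global use.
CFLAGS="-O2 -march=native -pipe"
# Reuse the same flags for C++ so the two cannot drift apart.
CXXFLAGS="${CFLAGS}"
```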

With that said, it might be worthwhile for toolchain to modify the GCC ebuilds to support USE=custom-optimization and specify -O3 when USE=custom-optimization is not in effect. That should be a separate bug.
Comment 5 Ryan Hill (RETIRED) gentoo-dev 2013-03-04 01:36:15 UTC
That should not be a bug at all, because it would set a record as the fastest closed bug in the history of bugzilla.

I think your suggested description for -O3 is an improvement, but I don't know if I agree with it completely.  I'll come up with something.  I'd rather we didn't document -Ofast.
Comment 6 Richard Yao (RETIRED) gentoo-dev 2013-03-04 02:07:59 UTC
(In reply to comment #5)
> That should not be a bug at all, because it would set a record as the
> fastest closed bug in the history of bugzilla.
> 
> I think your suggested description for -O3 is an improvement, but I don't
> know if I agree with it completely.  I'll come up with something.  I'd
> rather we didn't document -Ofast.

Someone raised the issue in IRC, which led me to think about revising the language we use. I will be happy to see any improvement to this.
Comment 7 nm (RETIRED) gentoo-dev 2013-03-04 12:18:53 UTC
not every package does this. also, on the off-chance folks are compiling their own stuff or items from overlays, there's no guarantee that flags are being filtered.

stuff like what's described in the guide DOES happen if users get away with -O3. tons of reports on the forums & bugzie from users who've set -O3 globally. the guide's text is simpler and more straight to the point. i don't want to have to go into extended discussions of programming languages and caches in this doc...a more general command of "this is bad; don't do it" suffices. :)
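
To make the filtering discussed here concrete: in ebuilds it is normally done with the replace-flags helper from flag-o-matic.eclass. The function below is only a plain-shell stand-in with a made-up name (replace_flag), sketched to show the effect on a flag list. It is not the eclass code:

```shell
#!/bin/sh
# Hypothetical stand-in for per-package flag filtering.
# Usage: replace_flag OLD NEW FLAG...
replace_flag() {
    old=$1
    new=$2
    shift 2
    out=
    for f in "$@"; do
        # Swap exact matches only; every other flag passes through unchanged.
        [ "$f" = "$old" ] && f=$new
        out="${out:+$out }$f"
    done
    printf '%s\n' "$out"
}

# Example: downgrade a global -O3 for one build.
replace_flag -O3 -O2 -O3 -march=native -pipe
```

Anything built outside the tree, or from an overlay that never calls such a helper, still sees the unfiltered global flags.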
Comment 8 Ryan Hill (RETIRED) gentoo-dev 2013-03-05 00:54:54 UTC
Josh, I agree, but there's still some tweaking I'd like to do.
Comment 9 nm (RETIRED) gentoo-dev 2013-03-05 09:13:57 UTC
(In reply to comment #8)
> Josh, I agree, but there's still some tweaking I'd like to do.

aight. gimme text, and i'll drop it in.
Comment 10 Sven Vermeulen (RETIRED) gentoo-dev 2013-04-06 11:45:53 UTC
I'm not opposed to documenting -Ofast, as long as we make it clear what it means (and/or what "support" from Gentoo you might get if you use it).
Comment 11 Sven Vermeulen (RETIRED) gentoo-dev 2013-07-27 20:35:26 UTC
This document has been moved to the Gentoo wiki and can be found at https://wiki.gentoo.org/wiki/GCC_optimization. We welcome any contributions on this guide and recommend you create an account (if you do not have one already) and make the adjustments to the article as needed. In case of doubt, use the Talk page to discuss potential changes before applying them.