The advice against using -O3 is currect given by the GCC Optimization documentation is correct. However, the statement about "compilation failure or unexpected program behavior" is not strictly correct. Specifically, package maintainers will replace -O3 with -O2 in CFLAGS whenever -O3 causes runtime issues. I suggest changing the documentation to state:
-O3: This is the highest level of optimization possible. It improves upon -O2 by enabling optimization passes that dramatically increase both compilation time and binary size for what is often marginal benefit. The significant increase in binary size increases pressure on both the kernel disk cache and CPU hardware caches. This reduces performance in comparison to -O2 in most real-world workloads. The few software packages where -O3 has greater than marginal benefit often use handwritten assembly instead of C, which makes compiler optimization irrelevant. For these reasons, we recommend against building packages with -O3.
I missed a typo when revising the initial comment. Please do a mental `s/is currect //` when reading it.
It just occurred to me that we might want to correct the guide to add -Ofast, which is the new highest level of optimization possible. I should probably amend my original proposal to be:
-O3: This adds some additional optimization passes to those used by -O2. The additional optimization passes dramatically increase both compilation time and binary size for what is often marginal benefit. The significant increase in binary size increases pressure on both the kernel disk cache and CPU hardware caches. This reduces performance in comparison to -O2 in most real-world workloads. The few software packages where -O3 has greater than marginal benefit often use handwritten assembly instead of C, which makes compiler optimization irrelevant. For these reasons, we recommend against building packages with -O3.
-Ofast: This is the highest level of optimization possible, and is equivalent to specifying -O3 -ffast-math. Using -ffast-math will modify floating-point calculations whenever a faster calculation is known that generates a similar result. This is equivalent to saying that it is okay for 2 + 2 = 5 because 5 is close to 4. This will cause runtime failures and other odd behavior in nearly all software that does floating-point arithmetic. Therefore -Ofast is not supported.
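As a rough illustration of why reordering floating-point arithmetic can change a result (reassociation is one of the transformations -ffast-math permits), the sketch below evaluates the same three-term sum with two groupings. It uses awk's double-precision arithmetic rather than compiled C, and the helper name fp_grouping_demo is made up for this example, so it shows the arithmetic property only, not what any particular GCC version actually emits.

```shell
# Hypothetical helper (not part of GCC or Portage): prints (a + b) + c
# followed by a + (b + c), both computed in awk's double precision.
fp_grouping_demo() {
    awk 'BEGIN {
        a = 1e16; b = -1e16; c = 1
        # Left grouping cancels a and b first, so c survives.
        # Right grouping adds c to b first, where it is lost to rounding.
        printf "%.1f %.1f\n", (a + b) + c, a + (b + c)
    }'
}

fp_grouping_demo
```

With IEEE 754 doubles the two groupings give different answers, which is why a compiler that is allowed to regroup the sum can change a program's output.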
This is not actually recommended, and -flto is unsupported on Gentoo.
The gcc devs discussed building gcc with -O3 -flto and profiled feedback here:
http://marc.info/?l=gcc&m=134424163230104&w=4
The gcc build documentation describes using -flto and profiled feedback:
http://gcc.gnu.org/install/build.html
`bootstrap-lto'
    Enables Link-Time Optimization for host tools during bootstrapping. `BUILD_CONFIG=bootstrap-lto' is equivalent to adding -flto to `BOOT_CFLAGS'.
...
To bootstrap the compiler with profile feedback, use make profiledbootstrap.
So translating that to Gentoo:
(1) Contents of /etc/portage/env/O3-flto-cflags
    CFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"
    CXXFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"
    BOOT_CFLAGS="-O3 -march=native -flto=8 -frandom-seed=1 -pipe"
(2) Add line to /etc/portage/package.env
    sys-devel/gcc:4.7 O3-flto-cflags
(3) Add line to /etc/portage/make.conf
    GCC_MAKE_TARGET="profiledbootstrap"
(4) Try building it, probably best with the testsuite enabled (the build will take a long time and the testsuite takes longer, but it is probably better to check the resulting compiler):
    FEATURES=test nice emerge -v sys-devel/gcc:4.7
I should probably clarify that this bug involves our recommendations for make.conf. Individual packages that might benefit from -O3 have no effect on such recommendations. With that said, it might be worthwhile for toolchain to modify the GCC ebuilds to support USE=custom-optimization and specify -O3 when USE=custom-optimization is not in effect. That should be a separate bug.
That should not be a bug at all, because it would set a record as the fastest closed bug in the history of bugzilla. I think your suggested description for -O3 is an improvement, but I don't know if I agree with it completely. I'll come up with something. I'd rather we didn't document -Ofast.
(In reply to comment #5)
> That should not be a bug at all, because it would set a record as the
> fastest closed bug in the history of bugzilla.
>
> I think your suggested description for -O3 is an improvement, but I don't
> know if I agree with it completely. I'll come up with something. I'd
> rather we didn't document -Ofast.
Someone raised the issue in IRC, which led me to think about revising the language we use. I will be happy to see any improvement to this.
not every package does this. also, on the off-chance folks are compiling their own stuff or items from overlays, there's no guarantee that flags are being filtered. stuff like what's described in the guide DOES happen if users get away with -O3. tons of reports on the forums & bugzie from users who've set -O3 globally. the guide's text is simpler and more straight to the point. i don't want to have to go into extended discussions of programming languages and caches in this doc...a more general command of "this is bad; don't do it" suffices. :)
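For readers unfamiliar with what "filtering" a flag amounts to, here is a minimal sketch of swapping -O3 for -O2 in a flag string. In ebuilds this is normally done with helpers from flag-o-matic.eclass such as replace-flags; the stand-alone function swap_flag below is a hypothetical imitation written so it can run outside Portage, and it is not the eclass implementation.

```shell
# Hypothetical stand-in for a flag-replacing helper; not the eclass code.
# Usage: swap_flag OLD NEW FLAG...
swap_flag() {
    old=$1 new=$2
    shift 2
    out=
    for flag in "$@"; do
        # Replace only exact matches of the old flag.
        [ "$flag" = "$old" ] && flag=$new
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}

CFLAGS="-O3 -march=native -pipe"
# Word splitting of $CFLAGS is intentional: each flag becomes one argument.
CFLAGS=$(swap_flag -O3 -O2 $CFLAGS)
echo "$CFLAGS"
```

Code built outside an ebuild that performs a step like this receives the global CFLAGS unchanged, which is the situation described above.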
Josh, I agree, but there's still some tweaking I'd like to do.
(In reply to comment #8)
> Josh, I agree, but there's still some tweaking I'd like to do.
aight. gimme text, and i'll drop it in.
I'm not opposed to documenting -Ofast, as long as we make it clear what it means (and/or what "support" from Gentoo you might get if you use it).
This document has been moved to the Gentoo wiki and can be found at https://wiki.gentoo.org/wiki/GCC_optimization. We welcome any contributions on this guide and recommend you create an account (if you do not have one already) and make the adjustments to the article as needed. In case of doubt, use the Talk page to discuss potential changes before applying them.