CPython now supports two new configure flags: --enable-optimizations and --with-lto. With these it will automatically try to do PGO (Profile-Guided Optimization) or set itself up to use LTO (Link-Time Optimization). https://bugs.python.org/issue25702 https://bugs.python.org/issue26359 These are only available in CPython >=3.6.0, >=3.5.3 and >=2.7.13. Could the ebuilds please be updated to support these optimizations as optional USE flags? Upstream seems to suggest that production builds should be built with --enable-optimizations, but I still think it should be hidden behind a USE flag until widespread testing is done. Note: --with-lto and --enable-optimizations seem to be broken with GCC < 5.4, as per this bug report: https://bugs.python.org/issue29641 Note: I tried simply adding --enable-optimizations to the ebuild, but ran into sandbox issues, which I don't have the skill to resolve. Thanks in advance.
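The version cut-offs listed in the report can be checked mechanically. The following is a minimal sketch, not taken from any ebuild: `version_ge` and `supports_opt_flags` are hypothetical helper names, and the comparison assumes a `sort` that supports `-V` (version sort, as in GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helpers: decide whether a CPython version is new enough for
# --enable-optimizations / --with-lto, using the cut-offs from this report
# (>=3.6.0, >=3.5.3, >=2.7.13).

# version_ge A B: succeeds if version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

supports_opt_flags() {
    case "$1" in
        2.7.*)         version_ge "$1" 2.7.13 ;;
        3.5.*)         version_ge "$1" 3.5.3 ;;
        2.*|3.[0-4].*) return 1 ;;
        *)             version_ge "$1" 3.6.0 ;;
    esac
}

for v in 2.7.12 2.7.13 3.5.2 3.5.3 3.6.0; do
    if supports_opt_flags "$v"; then
        echo "$v: flags available"
    else
        echo "$v: flags not available"
    fi
done
```

Each release branch is matched separately because the flags were backported to maintenance branches at different patch levels.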
I think I've already seen a bug about this but can't find it now. As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag. People who want that can put '-flto' or equivalent in C*FLAGS. As for PGO, I don't know the details but I suppose a similar rule may apply.
> As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag. Citation needed. Where is the policy that says we can't add a USE flag to pass --enable-lto to configure? > As for PGO, I don't know the details but I suppose a similar rule may apply. In this case, PGO involves building python twice, with a profiling step between the two builds. The profiling step basically involves running the test suite with a "--pgo" flag. This is complex enough to warrant a USE flag I think.
(In reply to Mike Gilbert from comment #2) > > As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag. > > Citation needed. Where is the policy that says we can't add a USE flag to > pass --enable-lto to configure? Since when do we use USE flags to alter CFLAGS? Just because upstreams have --enable-O1, --enable-sse etc. just to alter CFLAGS doesn't mean we ought to use that. It's incomplete but here's the initial list: https://wiki.gentoo.org/wiki/Project:Quality_Assurance/User_vs_upstream_FLAGS > > > As for PGO, I don't know the details but I suppose a similar rule may apply. > > In this case, PGO involves building python twice, with a profiling step > between the two builds. The profiling step basically involves running the > test suite with a "--pgo" flag. > > This is complex enough to warrant a USE flag I think. Yes, sounds like this. I wonder if this gives any real gain though ;-).
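The PGO cycle described in the quoted text (instrumented build, profiling run, optimized rebuild) can be sketched generically. This is a dry run that only prints the commands; it is not the CPython build system. `-fprofile-generate` and `-fprofile-use` are the GCC flags for this cycle, while `pgo_plan`, `app.c` and `workload.txt` are hypothetical names used for illustration.

```shell
#!/bin/sh
# Dry-run sketch of a generic GCC PGO cycle. Nothing is compiled: each step
# is only printed. "app.c" and "workload.txt" are hypothetical placeholders.

pgo_plan() {
    src=$1
    workload=$2
    # 1. Instrumented build: the binary records execution counts when run.
    echo "gcc -O2 -fprofile-generate -o app $src"
    # 2. Profiling run: exercise the binary with a representative workload.
    #    (CPython uses its test suite for this step: ./python -m test --pgo)
    echo "./app < $workload"
    # 3. Optimized rebuild: the compiler reads the recorded profile data.
    echo "gcc -O2 -fprofile-use -o app $src"
}

pgo_plan app.c workload.txt
```

The need for a second full compile plus a workload run in between is what makes PGO more than a plain CFLAGS change.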
(In reply to Michał Górny from comment #3) > Since when do we use USE flags to alter CFLAGS? Just because upstreams have > --enable-O1, --enable-sse etc. just to alter CFLAGS doesn't mean we ought to > use that. I don't disagree; I just don't like seeing "policy" thrown around without documentation to reference. Thanks for the wiki link.
Created attachment 470080 [details] sandbox violation I merely added the --enable-optimizations flag to the ebuild, and this is the sandbox violation I got.
According to ActiveState and Antoine Pitrou (who is on the core Python dev team), one should get a ~10% speedup. https://www.activestate.com/blog/2014/06/python-performance-boost-using-profile-guided-optimization https://github.com/ContinuumIO/anaconda-issues/issues/423 I wanted to test it out myself, but ran into sandbox issues, so I can't confirm. Re LTO: if all the flag does is add -flto, then I agree it should rather be left as a CFLAGS option one can set manually.
Pull request with patch to address this created: https://github.com/gentoo/gentoo/pull/5768
Created attachment 501540 [details] Profile generation part. Can anyone confirm there are no "sandbox issues" with recent Python versions? I modified the python-2.7.14 and python-3.6.3 ebuilds like `use pgo && emake profile-opt || emake` and `use pgo && emake profile-opt CPPFLAGS= CFLAGS= LDFLAGS= || emake CPPFLAGS= CFLAGS= LDFLAGS=`; I cleared the flags because I have LTO flags set in my *FLAGS. Both builds finished successfully. Previously I had issues with python-2.7.13 and python-3.6.1 PGO builds. Some tests failed due to restricted access to network/sound/..., but that is not fatal for the build process (TODO: more tests, more profile data).
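One thing to note about the `use pgo && emake profile-opt || emake` form: in plain shell semantics, `A && B || C` also runs C when B fails, so a failed PGO build could fall through to a normal build. Whether that matters in an ebuild depends on how `emake` handles failure in the EAPI in use. A minimal sketch of an explicit if/else follows; `use` and `emake` are stubbed here so that it runs outside Portage, and `USE_FLAGS` and `build_python` are hypothetical names.

```shell
#!/bin/sh
# Stand-ins so the sketch runs outside Portage; in a real ebuild, `use` and
# `emake` are provided by the package manager. USE_FLAGS is hypothetical.
USE_FLAGS=${USE_FLAGS:-pgo}
use() {
    case " $USE_FLAGS " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
emake() {
    if [ $# -gt 0 ]; then echo "make $*"; else echo "make"; fi
}

# Explicit if/else: exactly one of the two build targets is selected,
# with no fallback from the PGO build to the plain build.
build_python() {
    if use pgo; then
        emake profile-opt
    else
        emake
    fi
}

build_python
```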
I tried with python3.6.3, and got exactly the same sandbox violation. I used the standard ebuild and added `--enable-optimizations` in myeconfargs. Am I doing it wrong? How am I supposed to do it?
(In reply to Nickolas Grigoriadis from comment #9) > I tried with python3.6.3, and got exactly the same sandbox violation. I can't reproduce this even with the --enable-optimizations variant (BTW, this option doesn't enable LTO optimizations now in 3.6.3). Do you have PYTHONPATH or any PYTHON* env variables set?
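A quick way to answer the question about PYTHON* variables is to filter the environment by name prefix. This is a small sketch; `list_python_env` is a hypothetical helper name.

```shell
#!/bin/sh
# List environment variables whose names start with PYTHON
# (PYTHONPATH, PYTHONHOME, ...). Prints nothing if none are set.
list_python_env() {
    env | grep '^PYTHON' || true
}

if [ -n "$(list_python_env)" ]; then
    echo "PYTHON* variables are set:"
    list_python_env
else
    echo "no PYTHON* variables set"
fi
```

Note that this inspects the environment of the current shell; the build itself runs in the environment Portage sets up, which may differ.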
Created attachment 501696 [details] Build log I checked `env` and there is nothing with the word PYTHON in there at all, so I assume not. I always run my own Python apps in a virtualenv. I first built it without the change, and it worked. After that succeeded, I did exactly this:
cd /usr/portage/dev-lang/python
vi python-3.6.3.ebuild (added --enable-optimizations to the bottom of myeconfargs)
ebuild python-3.6.3.ebuild manifest
emerge -1 =dev-lang/python-3.6.3
It built quickly, then ran a lot of tests slowly, then rebuilt, and failed. I attached the build log. Here is my emerge --info:
Portage 2.3.8 (python 2.7.12-final-0, default/linux/amd64/13.0/desktop/plasma/systemd, gcc-5.4.0, glibc-2.25-r8, 4.12.12-gentoo x86_64)
=================================================================
System uname: Linux-4.12.12-gentoo-x86_64-Intel-R-_Core-TM-_i7-3720QM_CPU_@_2.60GHz-with-gentoo-2.4.1
KiB Mem: 16369884 total, 4881232 free
KiB Swap: 16895988 total, 13587424 free
Timestamp of repository gentoo: Thu, 02 Nov 2017 05:00:01 +0000
Head commit of repository gentoo: ae21ca21612e06936986de5ebb6a5663e16b2a55
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.28.1 p1.0) 2.28.1
app-shells/bash: 4.3_p48-r1::gentoo
dev-java/java-config: 2.2.0-r3::gentoo
dev-lang/perl: 5.24.3::gentoo
dev-lang/python: 2.7.12::gentoo, 3.5.4::gentoo, 3.6.3::gentoo
dev-util/cmake: 3.8.2::gentoo
dev-util/pkgconfig: 0.29.2::gentoo
sys-apps/baselayout: 2.4.1-r2::gentoo
sys-apps/openrc: 0.32.1::gentoo
sys-apps/sandbox: 2.10-r4::gentoo
sys-devel/autoconf: 2.13::gentoo, 2.69::gentoo
sys-devel/automake: 1.13.4::gentoo, 1.15-r2::gentoo
sys-devel/binutils: 2.28.1::gentoo
sys-devel/gcc: 5.4.0-r3::gentoo
sys-devel/gcc-config: 1.8-r1::gentoo
sys-devel/libtool: 2.4.6-r3::gentoo
sys-devel/make: 4.2.1::gentoo
sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers)
sys-libs/glibc: 2.25-r8::gentoo
Repositories:
gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: --exclude '*/ChangeLog*'
grigi
    location: /var/lib/layman/grigi
    masters: gentoo
    priority: 0
graaff
    location: /var/lib/layman/graaff
    masters: gentoo
    priority: 50
jorgicio
    location: /var/lib/layman/jorgicio
    masters: gentoo
    priority: 50
kde
    location: /var/lib/layman/kde
    masters: gentoo
    priority: 50
lmiphay
    location: /var/lib/layman/lmiphay
    masters: gentoo
    priority: 50
steam-overlay
    location: /var/lib/layman/steam-overlay
    masters: gentoo
    priority: 50
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA googleearth AdobeFlash-10.3 Oracle-BCLA-JavaSE skype- PUEL AdobeFlash-11.x AdobeAIRSDK ArxFatalis-EULA-GOG Google-TOS"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=ivybridge"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=ivybridge"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_ZA.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude '*/ChangeLog*'"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/dev/shm"
USE="X a52 aac acl acpi activities aes alsa amd64 avx bash-completion berkdb branding bzip2 cairo cdda cdr cli cracklib crypt cups cxx dbus declarative dell dri dts dvd dvdr emboss encode exif fam firefox flac fortran gdbm gif glamor gpm gtk iconv ipv6 jpeg kde kipi laptop lcms ldap libnotify lm_sensors mad mmx mmxext mng modules mp3 mp4 mpeg multilib ncurses networkmanager nls nptl ogg opencl opengl openmax openmp pam pango pcre pdf phonon plasma png policykit popcnt ppds pulseaudio qml qt3support qt5 readline sdl seccomp session spell sse sse2 sse3 sse4_1 sse4_2 ssl ssse3 startup-notification svg systemd tcpd tiff truetype udev udisks unicode upower usb v4l vaapi vdpau vorbis vulkan wayland widgets wxwidgets x264 xattr xcb xcomposite xinerama xml xscreensaver xv xvid zlib"
ABI_X86="32 64"
ALSA_CARDS="hda-intel"
APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"
CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author"
COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog"
CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
ELIBC="glibc"
GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx"
INPUT_DEVICES="evdev synaptics"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer"
LINGUAS="en_GB en_US en"
LLVM_TARGETS="AMDGPU BPF X86"
OFFICE_IMPLEMENTATION="libreoffice"
PHP_TARGETS="php5-6"
POSTGRES_TARGETS="postgres9_5"
PYTHON_SINGLE_TARGET="python3_5"
PYTHON_TARGETS="python2_7 python3_5"
RUBY_TARGETS="ruby22"
USERLAND="GNU"
VIDEO_CARDS="radeon radeonsi amdgpu"
XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
Performance measured by `python -m performance run --fast --output ../outfile`
$ python -V
Python 3.6.4
Benchmark                Base     Changed  delta
pickle_dict              3,9E-05  2,9E-05  14,86%
unpickle_list            4,3E-06  3,3E-06  13,01%
pickle_list              5,3E-06  4,2E-06  11,84%
pickle                   1,2E-05  9,3E-06  11,69%
richards                 1,1E-01  9,2E-02  9,00%
deltablue                1,1E-02  8,9E-03  8,52%
go                       3,4E-01  2,9E-01  8,35%
mako                     2,6E-02  2,2E-02  8,22%
unpickle                 1,5E-05  1,3E-05  8,07%
telco                    8,5E-03  7,2E-03  7,78%
genshi_text              4,0E-02  3,5E-02  7,15%
genshi_xml               8,4E-02  7,3E-02  7,07%
logging_silent           4,3E-07  3,8E-07  7,00%
scimark_sor              2,8E-01  2,4E-01  6,83%
django_template          2,0E-01  1,7E-01  6,65%
logging_simple           1,4E-05  1,2E-05  6,64%
html5lib                 1,2E-01  1,1E-01  6,63%
unpickle_pure_python     4,8E-04  4,2E-04  6,38%
raytrace                 7,6E-01  6,7E-01  6,23%
pathlib                  2,4E-02  2,1E-02  6,20%
chaos                    1,6E-01  1,4E-01  6,09%
scimark_monte_carlo      1,5E-01  1,3E-01  5,97%
logging_format           1,6E-05  1,4E-05  5,89%
pickle_pure_python       6,5E-04  5,8E-04  5,82%
sqlite_synth             3,9E-06  3,5E-06  5,80%
nqueens                  1,2E-01  1,1E-01  5,11%
fannkuch                 5,5E-01  5,0E-01  4,97%
xml_etree_generate       1,3E-01  1,2E-01  4,97%
meteor_contest           1,1E-01  1,0E-01  4,84%
sympy_sum                1,3E-01  1,1E-01  4,81%
json_loads               3,0E-05  2,7E-05  4,73%
2to3                     4,0E-01  3,6E-01  4,68%
chameleon                1,2E-02  1,1E-02  4,58%
regex_compile            2,3E-01  2,1E-01  4,49%
hexiom                   1,4E-02  1,2E-02  4,47%
scimark_lu               2,9E-01  2,6E-01  4,47%
xml_etree_process        1,0E-01  9,6E-02  4,20%
dulwich_log              8,8E-02  8,1E-02  4,07%
sympy_str                2,6E-01  2,4E-01  4,02%
json_dumps               1,4E-02  1,3E-02  3,92%
sympy_expand             5,9E-01  5,5E-01  3,62%
crypto_pyaes             1,3E-01  1,2E-01  3,26%
pidigits                 1,7E-01  1,6E-01  3,09%
scimark_sparse_mat_mult  4,9E-03  4,6E-03  2,73%
float                    1,5E-01  1,4E-01  2,60%
sympy_integrate          2,7E-02  2,5E-02  2,56%
unpack_sequence          4,7E-08  4,5E-08  2,30%
regex_dna                1,6E-01  1,5E-01  2,27%
regex_v8                 2,4E-02  2,3E-02  2,17%
spectral_norm            1,6E-01  1,5E-01  2,09%
python_startup           1,0E-02  9,7E-03  1,96%
python_startup_no_site   6,5E-03  6,3E-03  1,96%
nbody                    1,7E-01  1,7E-01  1,65%
tornado_http             2,2E-01  2,2E-01  1,39%
sqlalchemy_declarative   1,8E-01  1,8E-01  1,38%
sqlalchemy_imperative    3,6E-02  3,5E-02  1,29%
xml_etree_parse          3,2E-01  3,1E-01  1,14%
regex_effbot             2,7E-03  2,6E-03  1,14%
xml_etree_iterparse      1,8E-01  1,7E-01  1,07%
scimark_fft              4,0E-01  4,0E-01  0,36%
I have been using the ebuilds from here: https://github.com/InBetweenNames/gentooLTO/tree/master/dev-lang/python successfully for a while now. The speedup is noticeable: running the tests on one of my projects drops the run time from 7.4s to 6.7s. The changes are quite small, and all hidden behind a pgo USE flag.
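For reference, the run times quoted above (7.4s and 6.7s) correspond to a reduction of roughly 9.5%. A small sketch of the arithmetic, using awk for the floating-point division; `pct_reduction` is a hypothetical helper name.

```shell
#!/bin/sh
# Percentage reduction in run time: (before - after) / before * 100.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
pct_reduction() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_reduction 7.4 6.7   # timings quoted above, in seconds
```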
Created attachment 528032 [details, diff] PGO patch for dev-lang/python-3.6.5
Created attachment 528034 [details, diff] PGO patch for dev-lang/python-3.5.5
Created attachment 528036 [details, diff] PGO patch for dev-lang/python-3.4.8
Created attachment 528038 [details, diff] PGO patch for dev-lang/python-2.7.14-r1
I'm providing an up-to-date python:2.7 ebuild with optional PGO and LTO support in my overlay: https://github.com/stefantalpalaru/gentoo-overlay Some older benchmarks: https://old.reddit.com/r/Gentoo/comments/8n38ak/devlangpython2715r104_enable_pgo_for_extensions/
*** Bug 541966 has been marked as a duplicate of this bug. ***
Created attachment 641952 [details, diff] pgo-lto patches These are pgo and lto patches for the latest python 3.7, 3.8, and 3.9 in the tree. Substantial changes were needed w.r.t. the previous patches.
I submitted a unified patch against the tree just now for python 3.7, 3.8, and 3.9. Benchmarking on 3.7 shows substantial benefits, about 20% for the tests I care about most.
(In reply to Joel Berendzen from comment #21) > I submitted a unified patch against the tree just now for python 3.7, 3.8, > and 3.9. Benchmarking on 3.7 shows substantial benefits, about 20% for the > tests I care about most. I wonder if you should re-benchmark, because your patch has a typo in the 3.7 version ("use_wth")? It seems correct for the other two ebuilds, though. Given that this is supported and highly recommended upstream, that people like Red Hat have made a big deal about the speedup, and that Gentoo relies heavily on Python, is there a reason not to merge this?
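To illustrate why a misspelled helper matters: in a shell, a command substitution that calls a non-existent command expands to nothing, so the configure option would simply be missing rather than causing an obvious failure. The sketch below uses a stand-in for the ebuild helper `use_with` so that it runs outside Portage; `USE_FLAGS` is a hypothetical variable, and whether the typo actually affected the benchmark in question is not known from this thread.

```shell
#!/bin/sh
# Stand-in for the ebuild helper `use_with`, so this runs outside Portage.
# It prints --with-<flag> or --without-<flag> depending on a hypothetical
# USE_FLAGS variable (Portage tracks USE state itself).
USE_FLAGS=${USE_FLAGS:-lto}
use_with() {
    case " $USE_FLAGS " in
        *" $1 "*) echo "--with-$1" ;;
        *)        echo "--without-$1" ;;
    esac
}

# Correct spelling: the option reaches the configure arguments.
echo "configure args: [$(use_with lto)]"

# Misspelled helper: the shell reports "command not found" on stderr and
# the substitution expands to nothing, so the option is silently dropped.
echo "configure args: [$(use_wth lto 2>/dev/null)]"
```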
FWIW, I've got a fair bit more motivation to do this now that my simple package.env doesn't work. Will work on a PR.
Just a heads-up: although the source compiles just fine with PGO and LTO, sandbox doesn't like that it's trying to unlink some pyc files during the testing phase and will block the installation, so any work on this needs to take sandbox into account.
running build_scripts
copying and adjusting /var/tmp/portage/dev-lang/python-3.9.8/work/Python-3.9.8/Tools/scripts/pydoc3 -> build/scripts-3.9
copying and adjusting /var/tmp/portage/dev-lang/python-3.9.8/work/Python-3.9.8/Tools/scripts/idle3 -> build/scripts-3.9
copying and adjusting /var/tmp/portage/dev-lang/python-3.9.8/work/Python-3.9.8/Tools/scripts/2to3 -> build/scripts-3.9
changing mode of build/scripts-3.9/pydoc3 from 644 to 755
changing mode of build/scripts-3.9/idle3 from 644 to 755
changing mode of build/scripts-3.9/2to3 from 644 to 755
renaming build/scripts-3.9/pydoc3 to build/scripts-3.9/pydoc3.9
renaming build/scripts-3.9/idle3 to build/scripts-3.9/idle3.9
renaming build/scripts-3.9/2to3 to build/scripts-3.9/2to3-3.9
make[1]: Leaving directory '/var/tmp/portage/dev-lang/python-3.9.8/work/Python-3.9.8'
>>> Source compiled.
 * ----------------------- SANDBOX ACCESS VIOLATION SUMMARY -----------------------
 * LOG FILE: "/var/tmp/portage/dev-lang/python-3.9.8/temp/sandbox.log"
 *
VERSION 1.0
FORMAT: F - Function called
FORMAT: S - Access Status
FORMAT: P - Path as passed to function
FORMAT: A - Absolute Path (not canonical)
FORMAT: R - Canonical Path
FORMAT: C - Command Line
F: unlink
S: deny
P: /usr/lib/python3.9/site-packages/locking_import.pyc
A: /usr/lib/python3.9/site-packages/locking_import.pyc
R: /usr/lib/python3.9/site-packages/locking_import.pyc
C: ./python -m test --pgo
F: unlink
S: deny
P: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.pyc
A: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.pyc
R: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.pyc
C: ./python -m test --pgo
F: unlink
S: deny
P: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.opt-1.pyc
A: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.opt-1.pyc
R: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.opt-1.pyc
C: ./python -m test --pgo
. . . . .
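When a sandbox log is long, it can help to reduce it to the set of directories involved. The following is a minimal sketch that assumes the log stores each field (F:, S:, P:, ...) on its own line, as the FORMAT legend above suggests; `violation_dirs` is a hypothetical helper name, and the sample input is copied from the summary above.

```shell
#!/bin/sh
# Print the unique parent directories of every path listed on a "P: " line
# of a sandbox log read from standard input.
violation_dirs() {
    grep '^P: ' | sed 's/^P: //' | while IFS= read -r path; do
        dirname "$path"
    done | sort -u
}

# Sample input: two records taken from the violation summary.
violation_dirs <<'EOF'
F: unlink
S: deny
P: /usr/lib/python3.9/site-packages/locking_import.pyc
C: ./python -m test --pgo
F: unlink
S: deny
P: /usr/lib/python3.9/site-packages/__pycache__/locking_import.cpython-39.pyc
C: ./python -m test --pgo
EOF
```

For the sample this yields the installed `site-packages` directory and its `__pycache__` subdirectory, which shows that the profiling run is touching the live system installation rather than the build tree.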
The bug has been closed via the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=2c862d1f086bb8a9ce70631dfce1f2d7b5676b4c
commit 2c862d1f086bb8a9ce70631dfce1f2d7b5676b4c
Author: Sam James <sam@gentoo.org>
AuthorDate: 2021-11-17 10:05:45 +0000
Commit: Sam James <sam@gentoo.org>
CommitDate: 2021-11-17 10:46:27 +0000
    dev-lang/python: add PGO
    Closes: https://bugs.gentoo.org/615412
    Closes: https://github.com/gentoo/gentoo/pull/20077
    Signed-off-by: Sam James <sam@gentoo.org>
 dev-lang/python/metadata.xml | 1 +
 dev-lang/python/python-3.11.0_alpha2.ebuild | 17 ++++++++++++++++-
 2 files changed, 17 insertions(+), 1 deletion(-)
Additionally, it has been referenced in the following commit(s):
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=074a78fe03e77fe3337dae773c3d46d0af47eb95
commit 074a78fe03e77fe3337dae773c3d46d0af47eb95
Author: Sam James <sam@gentoo.org>
AuthorDate: 2021-11-17 10:49:17 +0000
Commit: Sam James <sam@gentoo.org>
CommitDate: 2021-11-17 10:49:17 +0000
    dev-lang/python: backport PGO, LTO to 3.9.x and 3.10.x
    Bug: https://bugs.gentoo.org/615412
    Bug: https://bugs.gentoo.org/700012
    Signed-off-by: Sam James <sam@gentoo.org>
 dev-lang/python/python-3.10.0_p1.ebuild | 18 +++++++++++++++++-
 dev-lang/python/python-3.9.9.ebuild | 18 +++++++++++++++++-
 2 files changed, 34 insertions(+), 2 deletions(-)
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b85750dd2326759e7f102d81e5943bd4e5557a8a
commit b85750dd2326759e7f102d81e5943bd4e5557a8a
Author: Sam James <sam@gentoo.org>
AuthorDate: 2021-11-17 10:06:12 +0000
Commit: Sam James <sam@gentoo.org>
CommitDate: 2021-11-17 10:46:28 +0000
    dev-lang/python: add LTO
    Bug: https://bugs.gentoo.org/615412
    Bug: https://bugs.gentoo.org/700012
    Signed-off-by: Sam James <sam@gentoo.org>
    Closes: https://github.com/gentoo/gentoo/pull/22853
    Signed-off-by: Sam James <sam@gentoo.org>
 dev-lang/python/metadata.xml | 1 +
 dev-lang/python/python-3.11.0_alpha2.ebuild | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)