Bug 615412 - dev-lang/python: add pgo support
Summary: dev-lang/python: add pgo support
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal with 5 votes
Assignee: Python Gentoo Team
URL:
Whiteboard:
Keywords:
Duplicates: 541966
Depends on:
Blocks:
 
Reported: 2017-04-13 05:36 UTC by Nickolas Grigoriadis
Modified: 2020-10-29 17:40 UTC
CC List: 13 users

See Also:
Package list:
Runtime testing required: ---


Attachments
sandbox violation (file_615412.txt, 683 bytes, text/plain)
2017-04-15 05:30 UTC, Nickolas Grigoriadis
Profile generation part. (python3.6-tests.log, 21.32 KB, text/x-log)
2017-11-01 08:27 UTC, Bug Bugs
Build log (build-python3.6.3.log, 346.99 KB, text/x-log)
2017-11-02 09:58 UTC, Nickolas Grigoriadis
PGO patch for dev-lang/python-3.6.5 (3.6.5-python-pgo.patch, 1.29 KB, patch)
2018-04-19 08:52 UTC, Nickolas Grigoriadis
PGO patch for dev-lang/python-3.5.5 (3.5.5-python-pgo.patch, 1.29 KB, patch)
2018-04-19 08:53 UTC, Nickolas Grigoriadis
PGO patch for dev-lang/python-3.4.8 (3.4.8-python-pgo.patch, 1.37 KB, patch)
2018-04-19 08:53 UTC, Nickolas Grigoriadis
PGO patch for dev-lang/python-2.7.14-r1 (2.7.14-r1-python-pgo.patch, 1.19 KB, patch)
2018-04-19 08:53 UTC, Nickolas Grigoriadis
pgo-lto patches (pgo+lto-3.7-3.8-3.9.patch, 7.75 KB, patch)
2020-05-26 18:37 UTC, Joel Berendzen

Description Nickolas Grigoriadis 2017-04-13 05:36:26 UTC
CPython now supports two new compile-time flags:
--enable-optimizations
--with-lto

With these, it will automatically try to do PGO (profile-guided optimization) or set itself up to use LTO (link-time optimization).
https://bugs.python.org/issue25702
https://bugs.python.org/issue26359

These only appear on CPython >=3.6.0, >=3.5.3, >=2.7.13
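
As a rough illustration of those version thresholds, the shell sketch below checks whether a given CPython version should accept --enable-optimizations. The function name is made up for this example, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
# Hypothetical helper, not part of any ebuild or eclass.
# Returns 0 if the given CPython version is at or above the thresholds
# quoted above (>=2.7.13, >=3.5.3, >=3.6.0), 1 otherwise.
supports_enable_optimizations() {
    ver=$1
    case $ver in
        2.7.*) min=2.7.13 ;;
        3.5.*) min=3.5.3 ;;
        3.*)   min=3.6.0 ;;
        *)     return 1 ;;
    esac
    # With version sort, ver >= min exactly when min sorts first (or ties).
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}

supports_enable_optimizations 3.6.3 && echo "3.6.3: supported"
supports_enable_optimizations 3.5.2 || echo "3.5.2: not supported"
```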

Could the ebuilds please be updated to support these optimizations as optional use flags?

Upstream seems to suggest that production builds should be built with --enable-optimizations, but I still think it should be hidden behind a USE flag until it has seen widespread testing.

Note: --with-lto and --enable-optimizations seem to be broken in GCC < 5.4 as per this bug report:
https://bugs.python.org/issue29641

Note: I tried to just add --enable-optimizations to the ebuild, but ran into sandbox issues, which I don't have the skill to resolve.

Thanks in advance
Comment 1 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-04-13 17:02:40 UTC
I think I've already seen a bug about this but can't find it now.

As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag. People who want that can put '-flto' or equivalent in C*FLAGS.

As for PGO, I don't know the details but I suppose a similar rule may apply.
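
For the C*FLAGS route on this one package only, Portage's per-package environment files are one option. A minimal sketch, assuming a file name `lto.conf` picked for this example and a toolchain where -flto works; whether the flag is also needed in LDFLAGS depends on the setup:

```shell
# /etc/portage/env/lto.conf  (the file name is an arbitrary choice)
CFLAGS="${CFLAGS} -flto"
CXXFLAGS="${CXXFLAGS} -flto"
LDFLAGS="${LDFLAGS} -flto"

# Then enable it for Python only by adding this line to
# /etc/portage/package.env:
#   dev-lang/python lto.conf
```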
Comment 2 Mike Gilbert gentoo-dev 2017-04-13 17:15:05 UTC
> As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag.

Citation needed. Where is the policy that says we can't add a USE flag to pass --with-lto to configure?

> As for PGO, I don't know the details but I suppose a similar rule may apply.

In this case, PGO involves building python twice, with a profiling step between the two builds. The profiling step basically involves running the test suite with a "--pgo" flag.

This is complex enough to warrant a USE flag I think.
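
Outside of Portage, that cycle can be sketched roughly as follows. This is only an illustration: the `run` and `pgo_build` helpers are invented for the example, the commands are printed rather than executed unless DRY_RUN=0 is set, and the `profile-opt` make target is the one used later in this thread.

```shell
# Sketch only: run from the top of an unpacked CPython source tree.
# By default the commands are just printed; set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pgo_build() {
    run ./configure || return 1
    # profile-opt drives the whole cycle: instrumented build, profiling
    # run of the test suite, then the final build using the profile.
    run make profile-opt
}

pgo_build
```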
Comment 3 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-04-13 17:43:11 UTC
(In reply to Mike Gilbert from comment #2)
> > As for LTO, it's against Gentoo policy to have ebuilds enforce that, unconditionally or via USE flag.
> 
> Citation needed. Where is the policy that says we can't add a USE flag to
> pass --with-lto to configure?

Since when do we use USE flags to alter CFLAGS? Just because upstreams have --enable-O1, --enable-sse etc. just to alter CFLAGS doesn't mean we ought to use that.

It's incomplete but here's the initial list: https://wiki.gentoo.org/wiki/Project:Quality_Assurance/User_vs_upstream_FLAGS

> 
> > As for PGO, I don't know the details but I suppose a similar rule may apply.
> 
> In this case, PGO involves building python twice, with a profiling step
> between the two builds. The profiling step basically involves running the
> test suite with a "--pgo" flag.
> 
> This is complex enough to warrant a USE flag I think.

Yes, sounds like it. I wonder if this gives any real gain though ;-).
Comment 4 Mike Gilbert gentoo-dev 2017-04-13 17:59:19 UTC
(In reply to Michał Górny from comment #3)
> Since when do we use USE flags to alter CFLAGS? Just because upstreams have
> --enable-O1, --enable-sse etc. just to alter CFLAGS doesn't mean we ought to
> use that.

I don't disagree; I just don't like seeing "policy" thrown around without documentation to reference. Thanks for the wiki link.
Comment 5 Nickolas Grigoriadis 2017-04-15 05:30:13 UTC
Created attachment 470080 [details]
sandbox violation

I merely added the --enable-optimizations flag to the ebuild, and this is the sandbox violation I got.
Comment 6 Nickolas Grigoriadis 2017-04-15 05:30:28 UTC
According to ActiveState and Antoine Pitrou (who is on the core Python dev team), one should get a ~10% speedup.
https://www.activestate.com/blog/2014/06/python-performance-boost-using-profile-guided-optimization
https://github.com/ContinuumIO/anaconda-issues/issues/423

I wanted to test it out myself, but ran into sandbox issues, so I can't confirm.

Re LTO: if all the flag does is add -flto, then I agree it should be left as a CFLAGS option one can set manually.
Comment 7 Shane Peelar 2017-09-22 19:42:50 UTC
Pull request with patch to address this created: https://github.com/gentoo/gentoo/pull/5768
Comment 8 Bug Bugs 2017-11-01 08:27:43 UTC
Created attachment 501540 [details]
Profile generation part.

Can anyone confirm there are no "sandbox issues" with recent python versions?

I modified the python-2.7.14 and python-3.6.3 ebuilds to use `use pgo && emake profile-opt || emake`, and also `use pgo && emake profile-opt CPPFLAGS= CFLAGS= LDFLAGS= || emake CPPFLAGS= CFLAGS= LDFLAGS=` because I have LTO flags set in my *FLAGS. Both builds finished successfully. Previously I had issues with PGO builds of python-2.7.13 and python-3.6.1.

Some tests failed due to restricted access to network/sound/..., but that is not fatal for the build process (TODO: the more tests that run, the more profile data is collected).
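
One thing to keep in mind with the `use pgo && emake profile-opt || emake` form: in shell, `a && b || c` also runs `c` when `b` fails, so it is not a strict if/else. Whether that matters inside an ebuild depends on how `emake` failures are handled there. Below is a sketch of the explicit form, with stand-in `use` and `emake` functions so it can be run outside Portage (inside an ebuild both are provided by the package manager):

```shell
# Stand-ins for demonstration only; real ebuilds get these from Portage.
use()   { case " ${USE} " in *" $1 "*) return 0 ;; *) return 1 ;; esac; }
emake() { echo make "$@"; }

src_compile() {
    if use pgo; then
        # Instrumented build, profiling run, optimized rebuild.
        emake profile-opt
    else
        emake
    fi
}

USE="pgo ssl"
src_compile
```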
Comment 9 Nickolas Grigoriadis 2017-11-01 13:01:46 UTC
I tried with python3.6.3, and got exactly the same sandbox violation.

I used the standard ebuild and added `--enable-optimizations` in myeconfargs.

Am I doing it wrong? How am I supposed to do it?
Comment 10 Bug Bugs 2017-11-01 16:15:28 UTC
(In reply to Nickolas Grigoriadis from comment #9)
> I tried with python3.6.3, and got exactly the same sandbox violation.

I can't reproduce this even with the --enable-optimizations variant (BTW, this option doesn't enable LTO optimizations in 3.6.3 now).

Do you have PYTHONPATH or any PYTHON* env variables set?
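
A quick way to check, as a small sketch with a made-up function name. Note that the environment Portage builds in may differ from an interactive shell, so this only covers the latter.

```shell
# Print every environment variable whose name starts with PYTHON,
# or a short message if there are none.
list_python_env() {
    env | grep '^PYTHON[A-Za-z0-9_]*=' || echo "no PYTHON* variables set"
}

list_python_env
```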
Comment 11 Nickolas Grigoriadis 2017-11-02 09:58:55 UTC
Created attachment 501696 [details]
Build log

I checked `env` and there is nothing with the word PYTHON in there at all, so I assume not.
I always do my own python apps in a virtualenv.

I just built it without the change, and it worked.

After that successful build, I did exactly this:
cd /usr/portage/dev-lang/python
vi python-3.6.3.ebuild (added --enable-optimizations to the bottom of myeconfargs)
ebuild python-3.6.3.ebuild manifest
emerge -1 =dev-lang/python-3.6.3

It built quickly, then ran a lot of tests slowly, then rebuilt.
And failed.

I attached the build log.

Here is my emerge --info
Portage 2.3.8 (python 2.7.12-final-0, default/linux/amd64/13.0/desktop/plasma/systemd, gcc-5.4.0, glibc-2.25-r8, 4.12.12-gentoo x86_64)
=================================================================
System uname: Linux-4.12.12-gentoo-x86_64-Intel-R-_Core-TM-_i7-3720QM_CPU_@_2.60GHz-with-gentoo-2.4.1
KiB Mem:    16369884 total,   4881232 free
KiB Swap:   16895988 total,  13587424 free
Timestamp of repository gentoo: Thu, 02 Nov 2017 05:00:01 +0000
Head commit of repository gentoo: ae21ca21612e06936986de5ebb6a5663e16b2a55
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.28.1 p1.0) 2.28.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-java/java-config:     2.2.0-r3::gentoo
dev-lang/perl:            5.24.3::gentoo
dev-lang/python:          2.7.12::gentoo, 3.5.4::gentoo, 3.6.3::gentoo
dev-util/cmake:           3.8.2::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1-r2::gentoo
sys-apps/openrc:          0.32.1::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69::gentoo
sys-devel/automake:       1.13.4::gentoo, 1.15-r2::gentoo
sys-devel/binutils:       2.28.1::gentoo
sys-devel/gcc:            5.4.0-r3::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.25-r8::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-extra-opts: --exclude '*/ChangeLog*'

grigi
    location: /var/lib/layman/grigi
    masters: gentoo
    priority: 0

graaff
    location: /var/lib/layman/graaff
    masters: gentoo
    priority: 50

jorgicio
    location: /var/lib/layman/jorgicio
    masters: gentoo
    priority: 50

kde
    location: /var/lib/layman/kde
    masters: gentoo
    priority: 50

lmiphay
    location: /var/lib/layman/lmiphay
    masters: gentoo
    priority: 50

steam-overlay
    location: /var/lib/layman/steam-overlay
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA googleearth AdobeFlash-10.3 Oracle-BCLA-JavaSE skype-4.0.0.7-copyright PUEL AdobeFlash-11.x AdobeAIRSDK ArxFatalis-EULA-GOG Google-TOS"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=ivybridge"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=ivybridge"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync multilib-strict news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_ZA.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude '*/ChangeLog*'"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/dev/shm"
USE="X a52 aac acl acpi activities aes alsa amd64 avx bash-completion berkdb branding bzip2 cairo cdda cdr cli cracklib crypt cups cxx dbus declarative dell dri dts dvd dvdr emboss encode exif fam firefox flac fortran gdbm gif glamor gpm gtk iconv ipv6 jpeg kde kipi laptop lcms ldap libnotify lm_sensors mad mmx mmxext mng modules mp3 mp4 mpeg multilib ncurses networkmanager nls nptl ogg opencl opengl openmax openmp pam pango pcre pdf phonon plasma png policykit popcnt ppds pulseaudio qml qt3support qt5 readline sdl seccomp session spell sse sse2 sse3 sse4_1 sse4_2 ssl ssse3 startup-notification svg systemd tcpd tiff truetype udev udisks unicode upower usb v4l vaapi vdpau vorbis vulkan wayland widgets wxwidgets x264 xattr xcb xcomposite xinerama xml xscreensaver xv xvid zlib" ABI_X86="32 64" ALSA_CARDS="hda-intel" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en_GB en_US en" 
LLVM_TARGETS="AMDGPU BPF X86" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" POSTGRES_TARGETS="postgres9_5" PYTHON_SINGLE_TARGET="python3_5" PYTHON_TARGETS="python2_7 python3_5" RUBY_TARGETS="ruby22" USERLAND="GNU" VIDEO_CARDS="radeon radeonsi amdgpu" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS
Comment 12 Francesco Riosa 2018-03-30 18:40:49 UTC
Performance measured by `python -m performance run --fast --output ../outfile`

$ python -V
Python 3.6.4

Benchmark	Base	Changed	delta

pickle_dict	3,9E-05	2,9E-05	14,86%
unpickle_list	4,3E-06	3,3E-06	13,01%
pickle_list	5,3E-06	4,2E-06	11,84%
pickle	1,2E-05	9,3E-06	11,69%
richards	1,1E-01	9,2E-02	9,00%
deltablue	1,1E-02	8,9E-03	8,52%
go	3,4E-01	2,9E-01	8,35%
mako	2,6E-02	2,2E-02	8,22%
unpickle	1,5E-05	1,3E-05	8,07%
telco	8,5E-03	7,2E-03	7,78%
genshi_text	4,0E-02	3,5E-02	7,15%
genshi_xml	8,4E-02	7,3E-02	7,07%
logging_silent	4,3E-07	3,8E-07	7,00%
scimark_sor	2,8E-01	2,4E-01	6,83%
django_template	2,0E-01	1,7E-01	6,65%
logging_simple	1,4E-05	1,2E-05	6,64%
html5lib	1,2E-01	1,1E-01	6,63%
unpickle_pure_python	4,8E-04	4,2E-04	6,38%
raytrace	7,6E-01	6,7E-01	6,23%
pathlib	2,4E-02	2,1E-02	6,20%
chaos	1,6E-01	1,4E-01	6,09%
scimark_monte_carlo	1,5E-01	1,3E-01	5,97%
logging_format	1,6E-05	1,4E-05	5,89%
pickle_pure_python	6,5E-04	5,8E-04	5,82%
sqlite_synth	3,9E-06	3,5E-06	5,80%
nqueens	1,2E-01	1,1E-01	5,11%
fannkuch	5,5E-01	5,0E-01	4,97%
xml_etree_generate	1,3E-01	1,2E-01	4,97%
meteor_contest	1,1E-01	1,0E-01	4,84%
sympy_sum	1,3E-01	1,1E-01	4,81%
json_loads	3,0E-05	2,7E-05	4,73%
2to3	4,0E-01	3,6E-01	4,68%
chameleon	1,2E-02	1,1E-02	4,58%
regex_compile	2,3E-01	2,1E-01	4,49%
hexiom	1,4E-02	1,2E-02	4,47%
scimark_lu	2,9E-01	2,6E-01	4,47%
xml_etree_process	1,0E-01	9,6E-02	4,20%
dulwich_log	8,8E-02	8,1E-02	4,07%
sympy_str	2,6E-01	2,4E-01	4,02%
json_dumps	1,4E-02	1,3E-02	3,92%
sympy_expand	5,9E-01	5,5E-01	3,62%
crypto_pyaes	1,3E-01	1,2E-01	3,26%
pidigits	1,7E-01	1,6E-01	3,09%
scimark_sparse_mat_mult	4,9E-03	4,6E-03	2,73%
float	1,5E-01	1,4E-01	2,60%
sympy_integrate	2,7E-02	2,5E-02	2,56%
unpack_sequence	4,7E-08	4,5E-08	2,30%
regex_dna	1,6E-01	1,5E-01	2,27%
regex_v8	2,4E-02	2,3E-02	2,17%
spectral_norm	1,6E-01	1,5E-01	2,09%
python_startup	1,0E-02	9,7E-03	1,96%
python_startup_no_site	6,5E-03	6,3E-03	1,96%
nbody	1,7E-01	1,7E-01	1,65%
tornado_http	2,2E-01	2,2E-01	1,39%
sqlalchemy_declarative	1,8E-01	1,8E-01	1,38%
sqlalchemy_imperative	3,6E-02	3,5E-02	1,29%
xml_etree_parse	3,2E-01	3,1E-01	1,14%
regex_effbot	2,7E-03	2,6E-03	1,14%
xml_etree_iterparse	1,8E-01	1,7E-01	1,07%
scimark_fft	4,0E-01	4,0E-01	0,36%
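
The table does not say how the delta column was computed. Judging only from the rounded Base and Changed values, it looks closer to (Base - Changed) / (Base + Changed) than to the plain reduction relative to Base; that is an inference from the numbers, not something stated in this thread. A small sketch that prints both figures for one row (the `delta` helper is made up for this example and relies on awk for the arithmetic):

```shell
# Usage: delta BASE CHANGED   (decimal commas are accepted, as in the table)
# Prints two percentages: the reduction relative to BASE, and the
# difference normalized by BASE + CHANGED.
delta() {
    b=$(printf '%s' "$1" | tr ',' '.')
    c=$(printf '%s' "$2" | tr ',' '.')
    LC_ALL=C awk -v b="$b" -v c="$c" 'BEGIN {
        printf "%.1f %.1f\n", (b - c) / b * 100, (b - c) / (b + c) * 100
    }'
}

delta 3,9E-05 2,9E-05   # pickle_dict row
```

For the pickle_dict row this prints 25.6 and 14.7; the second figure is close to the listed 14,86%, with the gap plausibly down to the rounding of the displayed values.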
Comment 13 Nickolas Grigoriadis 2018-04-19 08:40:30 UTC
I have been using the ebuilds from here: https://github.com/InBetweenNames/gentooLTO/tree/master/dev-lang/python successfully for a while now.

The speedup is noticeable. Running the tests on one of my projects drops the run-time from 7.4s to 6.7s.

The changes are quite small, and all hidden behind a pgo use-flag.
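
To see how an installed interpreter was configured, one option is to look at the configure arguments it recorded. A rough sketch, assuming `python3` is on PATH; the helper name is made up, and a build that calls `make profile-opt` directly without the configure flag would not show up this way:

```shell
# Returns 0 if the given configure argument string contains
# --enable-optimizations.
has_enable_optimizations() {
    case " $1 " in
        *--enable-optimizations*) return 0 ;;
        *) return 1 ;;
    esac
}

# CONFIG_ARGS records the arguments ./configure was called with.
if command -v python3 >/dev/null 2>&1; then
    args=$(python3 -c 'import sysconfig; print(sysconfig.get_config_var("CONFIG_ARGS") or "")')
    if has_enable_optimizations "$args"; then
        echo "built with --enable-optimizations"
    else
        echo "--enable-optimizations not found in CONFIG_ARGS"
    fi
fi
```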
Comment 14 Nickolas Grigoriadis 2018-04-19 08:52:56 UTC
Created attachment 528032 [details, diff]
PGO patch for dev-lang/python-3.6.5
Comment 15 Nickolas Grigoriadis 2018-04-19 08:53:16 UTC
Created attachment 528034 [details, diff]
PGO patch for dev-lang/python-3.5.5
Comment 16 Nickolas Grigoriadis 2018-04-19 08:53:38 UTC
Created attachment 528036 [details, diff]
PGO patch for dev-lang/python-3.4.8
Comment 17 Nickolas Grigoriadis 2018-04-19 08:53:59 UTC
Created attachment 528038 [details, diff]
PGO patch for dev-lang/python-2.7.14-r1
Comment 18 Ștefan Talpalaru 2019-03-04 01:30:23 UTC
I'm providing an up-to-date python:2.7 ebuild with optional PGO and LTO support in my overlay: https://github.com/stefantalpalaru/gentoo-overlay

Some older benchmarks: https://old.reddit.com/r/Gentoo/comments/8n38ak/devlangpython2715r104_enable_pgo_for_extensions/
Comment 19 Matt Turner gentoo-dev 2020-03-27 16:33:41 UTC
*** Bug 541966 has been marked as a duplicate of this bug. ***
Comment 20 Joel Berendzen 2020-05-26 18:37:46 UTC
Created attachment 641952 [details, diff]
pgo-lto patches

These are pgo and lto patches for the latest python 3.7, 3.8, and 3.9 in the tree.  Substantial changes were needed w.r.t. the previous patches.
Comment 21 Joel Berendzen 2020-05-26 18:40:25 UTC
I submitted a unified patch against the tree just now for python 3.7, 3.8, and 3.9.  Benchmarking on 3.7 shows substantial benefits, about 20% for the tests I care about most.