In its current shape, portage has serious performance problems (mostly CPU-wise). It has acceptable speed on desktops and servers, but on embedded or low-end devices, using portage is a real pain. This bug is intended to track patches and ideas related to performance improvements.
Created attachment 196987: cache_heavy_regexes.patch
cache_heavy_regexes.patch uses a dict-based cache to reuse the extremely heavy regexes from Package.regex and dbapi._iter_match_use. Additionally, _iter_match_use no longer builds one huge regex but splits it into implicit and explicit parts. The implicit part (very large) is the same for all packages, so this avoids heavy recompilation. `emerge -vpuND world system` goes down from 4:00 to 3:35 minutes on my 400 MHz ARMv6 box with this patch.
cache_heavy_regexes.patch was applied (in modified form) as r13801.
Created attachment 197434: faster_loops_in_setcpv_v1.patch
faster_loops_in_setcpv_v1.patch improves the performance of setcpv by using list comprehensions instead of manual loops wherever possible. It reduces `emerge -puND world system` from 3:35 to 3:20.
Created attachment 197437: faster_loops_in_setcpv_v2.patch
faster_loops_in_setcpv_v2.patch is an alternative approach to the setcpv performance improvement (see previous comment). Instead of doing len(use_expand_split) * len(iuse_implicit) loops, this patch iterates over iuse_implicit once, builds a dict mapping each prefix to its iuse_implicit entries, and then iterates over the entries of this dict. It reduces `emerge -puND world system` from 3:35 to 3:05.
(In reply to comment #3)
> Created an attachment (id=197434)
> faster_loops_in_setcpv_v1.patch

Thanks, this is in svn r13815.

(In reply to comment #4)
> Created an attachment (id=197437)
> faster_loops_in_setcpv_v2.patch

+ use_expand_iuses = dict()
+ for item in iuse_implicit:
+     s = item.split("_", 1)
+     if len(s) > 1:
+         prefix = s[0]
+         l = use_expand_iuses.get(prefix, set())
+         if not l:
+             use_expand_iuses[prefix] = l
+         l.add(item)

This part seems to assume that the prefix does not contain an underscore, which is not true for things like video_cards.
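To make the underscore concern concrete, here is a small sketch comparing grouping by the first underscore with grouping by a list of known prefixes. Both helper names are invented for this example, and the second function is just one possible way to get the right grouping (it goes back to a nested loop), not necessarily what any committed version does.

```python
def group_by_split(flags):
    """Group flags by the text before the first underscore."""
    groups = {}
    for flag in flags:
        parts = flag.split("_", 1)
        if len(parts) > 1:
            groups.setdefault(parts[0], set()).add(flag)
    return groups


def group_by_known_prefix(flags, prefixes):
    """Group flags under each known prefix that is followed by an underscore."""
    groups = {}
    for flag in flags:
        for prefix in prefixes:
            if flag.startswith(prefix + "_"):
                groups.setdefault(prefix, set()).add(flag)
    return groups


flags = ["video_cards_radeon", "linguas_en", "ssl"]
naive = group_by_split(flags)
aware = group_by_known_prefix(flags, ["video_cards", "linguas"])
```

With the split-based grouping, video_cards_radeon ends up under the key "video" instead of "video_cards", which is the problem described above.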
(In reply to comment #5)
> (In reply to comment #4)
> > Created an attachment (id=197437)
> > faster_loops_in_setcpv_v2.patch
>
> This part seems to assume that prefix does not contain underscore, which is not
> true for things like video_cards.

In svn r13823 I've committed a version of this optimization which accounts for the above issue.
Created attachment 197827: faster_loops_in_setcpv_v4.patch
faster_loops_in_setcpv_v4.patch makes setcpv even faster.
(In reply to comment #7)
> Created an attachment (id=197827)
> faster_loops_in_setcpv_v4.patch
>
> faster_loops_in_setcpv_v4.patch makes setcpv even more faster

How much faster is it? It's a lot less readable. Also, I think those x[:len(y)] == y comparisons should really be x[:len(y)+1] == y + "_".
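The difference between the two comparisons mentioned in this review can be shown with a short sketch. The function names are invented here, and "linguasfoo" is a made-up flag used only to trigger the false positive.

```python
def matches_loose(flag, prefix):
    """The x[:len(y)] == y form: true for any flag that merely starts with prefix."""
    return flag[:len(prefix)] == prefix


def matches_strict(flag, prefix):
    """The x[:len(y)+1] == y + "_" form: the prefix must be followed by an underscore."""
    return flag[:len(prefix) + 1] == prefix + "_"


loose_real = matches_loose("linguas_en", "linguas")
strict_real = matches_strict("linguas_en", "linguas")
loose_fake = matches_loose("linguasfoo", "linguas")
strict_fake = matches_strict("linguasfoo", "linguas")
```

Both forms accept linguas_en, but only the loose form also accepts linguasfoo, which shares the leading characters without being an expansion of the prefix.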
Created attachment 198253: use_readlines_when_reading_whole_file.patch
This patch makes `emerge -puND world system` 5% faster by using file.readlines() instead of list(file).
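As the patch name suggests, the change is about reading a whole file with readlines(). A minimal sketch showing that the two ways of reading produce the same list of lines (the helper names and the temporary file contents are made up for the example; no timing is measured here):

```python
import os
import tempfile


def read_with_list(path):
    """Read all lines by iterating over the file object."""
    with open(path) as f:
        return list(f)


def read_with_readlines(path):
    """Read all lines in one call."""
    with open(path) as f:
        return f.readlines()


# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("sys-apps/portage\napp-shells/bash\n")
    path = tmp.name

try:
    lines_a = read_with_list(path)
    lines_b = read_with_readlines(path)
finally:
    os.unlink(path)
```

Since the results are identical, callers that need the whole file can switch from one form to the other without further changes.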
(In reply to comment #9)
> Created an attachment (id=198253)
> use_readlines_when_reading_whole_file.patch

Thanks, that's in svn r13834.
This is fixed in 2.2_rc34.
Created attachment 203267: Removes getattr call from Task
Created attachment 203269: build tuple in one go in catpkgsplit instead of array + extend + tuple()
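The attachment title describes replacing a list + extend + tuple() sequence with a single tuple expression. A rough sketch of that kind of change, using made-up helper names and assuming a (package, version, revision) triple as input rather than the real catpkgsplit internals:

```python
def split_with_list(category, pkg_parts):
    """Older style: build a list, extend it, then convert it to a tuple."""
    result = [category]
    result.extend(pkg_parts)
    return tuple(result)


def split_in_one_go(category, pkg_parts):
    """Newer style: build the final tuple directly, with no intermediate list."""
    return (category, pkg_parts[0], pkg_parts[1], pkg_parts[2])


parts = ("portage", "2.2", "r1")
old_style = split_with_list("sys-apps", parts)
new_style = split_in_one_go("sys-apps", parts)
```

Both return the same tuple; the second form just skips the temporary list and the extra calls.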
Just ideas:
1. Split load_emerge_config into separate config/vartree reads. It'll speed up things that do not actually need vartree (--help as one example). Low priority.
2. Rewrite isvalidatom & friends with regexes. It currently uses ~20% of all init time (before depgraph creation).
3. Make emerge functions bypass the unicode module and function wrappers introduced recently. That's another 20% of init time.
(In reply to comment #12)
> Created an attachment (id=203267)
> Removes getattr call from Task

(In reply to comment #13)
> Created an attachment (id=203269)
> build tuple in one go in catpkgsplit instead of array + extend + tuple()

Thanks, those are in svn r14209 and r14211.
Created attachment 203406: implements isvalidatom using regex
The attached patch reimplements isvalidatom using a regex. It reduces the time spent in isvalidatom from 10 to 2 seconds when doing 'emerge -vp paludis' (43->35 'user' total). It should also improve repoman dependency checking. The only known problem is an overly relaxed pkg name part: it is supposed to reject a pkg name that 'ends in a hyphen followed by one or more digits', such as '~foo/bar-1-0.5', but allows it instead. However, the previous implementation also allowed it, so no regression is introduced.
(In reply to comment #16)
> Created an attachment (id=203406)
> implements isvalidatom using regex
>
> Attached patch reimplements isvalidatom using regex. It reduces time spent in
> isvalidatom from 10 to 2 seconds doing 'emerge -vp paludis' (43->35 'user'
> total). It also should improve repoman dependency checking.
>
> The only known problem is over relaxed pkg name part, which is supposed to
> reject pkg name that 'ends in a hyphen followed by one or more digits'
> '~foo/bar-1-0.5', but allows it instead. However previous implementation also
> allowed it, so no regression problem is introduced.

I would modify the test to actually fail on foo-1-3-5 or whatever. If the code is 'broken', the test should also be broken. I believe I wrote code that you could use to mark the test as an expected failure so it can be fixed later.
(In reply to comment #16)
> Created an attachment (id=203406)
> implements isvalidatom using regex

Thanks, that's in svn r14213.
Created attachment 203509: Simplifies isvalidatom regex and adds comments
The attached patch greatly simplifies the atom regex (gaining even more performance) and switches it to verbose mode with comments. It also adds more corner-case tests.
Created attachment 203512: Fixed verbose regex
I somehow messed up the previous patch; this is a corrected version.
Created attachment 203515: even simpler regex
Created attachment 203519: Even more simpler regex
Sorry for the spam :( This is the last version.
(In reply to comment #22)
> Created an attachment (id=203519)

Thanks, this is in svn r14219.
Created attachment 204572: Turns Package.metadata into a dict
The attached patch increases Package creation speed by turning Package.metadata into a dict and avoiding slow reflection.
(In reply to comment #24)
> Created an attachment (id=204572)
> Turns Package.metadata into a dict

Thanks, this is in svn r14280.
Created attachment 204597: Use existing atom instances instead of doing Atom(Atom)
The attached patch makes the code use existing atom instances instead of doing Atom(Atom). Also, cpv_getkey is now done with a regex. This patch reduces atom creations from 70k to 18k on emerge -puND world here (at 1.3k per second, that saves 40 seconds!).
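The idea of reusing an existing instance instead of wrapping it again can be sketched like this. The `Atom` class below is a tiny stand-in with a creation counter, and `to_atom` is a made-up helper; neither is the real Portage code.

```python
class Atom(object):
    """Stand-in for an expensive-to-construct atom object."""

    created = 0  # counts how many instances have been constructed

    def __init__(self, text):
        Atom.created += 1
        self.text = str(text)

    def __str__(self):
        return self.text


def to_atom(value):
    """Return `value` unchanged if it is already an Atom, else construct one."""
    if isinstance(value, Atom):
        return value
    return Atom(value)


first = to_atom("sys-apps/portage")
second = to_atom(first)  # no new instance is created here
```

Passing an existing Atom through `to_atom` costs one isinstance check instead of a full construction, which is where the saving comes from when the same atoms are passed around many times.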
(In reply to comment #26)
> Created an attachment (id=204597)
> Use existing atom instances instead of doing Atom(Atom)

Thanks, that's in svn r14282.
Created attachment 204673: Adds caching to module_wrapper
The attached patch adds caching to module_wrapper. That reduces the time of 'emerge -vp paludis' from 35 to 32 seconds.
Created attachment 204685: Makes Atom inherit from str and increases Atom creation rate from 1.2k per second to 1.6k
The attached patch makes Atom inherit from str and increases the Atom creation rate from 1.2k per second to 1.6k. This eliminates the need for str methods in Atom.
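A minimal sketch of what inheriting from str buys: the object is itself a string, so string methods, comparison and hashing come for free and no forwarding methods are needed. The class below is a toy with made-up `category` and `package` attributes, not the Atom class from the patch.

```python
class Atom(str):
    """Toy atom that is a real string with a couple of parsed attributes."""

    def __new__(cls, text):
        # str is immutable, so the value must be set in __new__, not __init__.
        self = str.__new__(cls, text)
        category, _, package = text.partition("/")
        self.category = category
        self.package = package
        return self


atom = Atom("sys-apps/portage")
starts = atom.startswith("sys-")        # inherited str method
lookup = {atom: 1}["sys-apps/portage"]  # hashes and compares like a plain str
```

Nothing such as startswith or __eq__ has to be written by hand; the extra attributes live alongside the inherited string behaviour.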
Thanks very much for the Portage improvements, Marat! :) Keep up the good work and keep sending us improvements please :)
(In reply to comment #28)
> Created an attachment (id=204673)
> Adds caching to module_wrapper

(In reply to comment #29)
> Created an attachment (id=204685)
> Makes Atom inherit from str and increases Atom creation rate from 1.2k per
> second to 1.6k

Thanks, those are in svn r14299 and r14300.
Created attachment 204770: Simplify match_to_list down to single list comprehension
The attached patch simplifies match_to_list down to a single list comprehension.
(In reply to comment #32)
> Created an attachment (id=204770)
> Simplify match_to_list down to single list comprehension

Thanks, this is in svn r14325.
Created attachment 204883: Optimizes unicode wrappers by avoiding redundant isinstance checks
The attached patch optimizes the unicode wrappers by avoiding redundant isinstance checks.
(In reply to comment #34)
> Created an attachment (id=204883)
> Optimizes unicode wrappers by avoiding redundant isinstance checks

Thanks, this is in svn r14376.
Created attachment 204996: Improves vartree.getpath timing from 1.2 msec to 100 usec
The attached patch improves vartree.getpath timing from 1.2 msec to 100 usec. Such a big difference is caused by the unicode wrappers.
(In reply to comment #36)
> Created an attachment (id=204996)
> Improves vartree.getpath timing from 1.2 msec to 100 usec

Thanks, that's in svn r14392.
Created attachment 205039: Improves porttree.auxget timings from 11msec down to 5.8msec
The attached patch improves porttree.auxget timings from 11 msec down to 5.8 msec. This reduces 'emerge -vp paludis' from 30 secs to 26.5 secs.
(In reply to comment #38)
> Created an attachment (id=205039)
> Improves porttree.auxget timings from 11msec down to 5.8msec

Thanks, this is in svn r14398.
This is fixed in 2.1.7.