I was playing with 'equery hasuse' to answer a question on the gentoo-user ML and had it fill up 1GB of RAM and almost all of 1GB of swap before I killed it. Investigation shows that the find_all_* functions in helpers.py return a list of package objects. Unfortunately, when dealing with a large number of packages, the memory required to build and return this list is extremely large. I am attaching a patch that changes these functions to return a list of package names instead, and I have modified equery and etcat to instantiate the package objects as needed from the list of names returned. With these changes, memory usage grew very little even when processing the entire portage tree.
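To illustrate the approach described above, here is a minimal sketch. It does not use the real gentoolkit code: `Package`, `find_all_package_names`, `packages_with_use` and the toy `USE_DB` mapping are all hypothetical stand-ins, assumed only for illustration. The point is that the search function hands back cheap strings and the caller builds one heavyweight object at a time.

```python
class Package:
    """Hypothetical stand-in for a heavyweight package object."""

    def __init__(self, cpv):
        self.cpv = cpv
        # Pretend each instance carries a lot of per-package state.
        self.metadata = {"cpv": cpv}

    def has_use(self, flag, use_db):
        return flag in use_db.get(self.cpv, ())


def find_all_package_names(use_db):
    """Return only the names (plain strings), not Package objects."""
    return sorted(use_db)


def packages_with_use(flag, use_db):
    """Instantiate one Package per iteration; it is dropped after the check."""
    matches = []
    for cpv in find_all_package_names(use_db):
        pkg = Package(cpv)  # created on demand from the name
        if pkg.has_use(flag, use_db):
            matches.append(cpv)
    return matches


# Made-up data, not taken from any real tree.
USE_DB = {
    "cat-a/foo-1.0": ("perl", "ssl"),
    "cat-b/bar-2.1": ("ssl",),
    "cat-c/baz-0.3": ("perl",),
}

print(packages_with_use("perl", USE_DB))
```

Only one `Package` is alive at any moment inside the loop, so peak memory depends on the size of the name list rather than on the number of package objects.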
Created attachment 57449 [details, diff] patch to limit memory usage in equery
Created attachment 57450 [details, diff] patch to limit memory usage in equery
Blech, screwed up the patch for etcat.
Comment on attachment 57450 [details, diff] patch to limit memory usage in equery
I found the underlying cause of the issue. I will submit a patch later.
Created attachment 57657 [details, diff] package.py patch
The underlying cause was package.py creating a copy of the portage.config object for every package object created. This patch solves the memory usage and significantly speeds up equery as well. I'll attach full benchmarks, but an 'equery hasuse -p perl' dropped from 14 minutes to 46 seconds.
Created attachment 57658 [details] Patch benchmarks
The problem is that we need this copy: the setcpv() call (required to reflect package.use settings) makes the config instance package-specific :-/
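The trade-off can be sketched as follows. `FakeConfig` is a hypothetical stand-in, not the real portage.config; only the method name `setcpv()` comes from the discussion, and its body here is an assumption made for illustration. `CopyingPackage` and `SharingPackage` are likewise invented names.

```python
import copy


class FakeConfig:
    """Hypothetical stand-in for a config object with package-specific state."""

    def __init__(self, package_use):
        self.package_use = package_use  # cpv -> tuple of extra flags
        self.base_use = ("global-flag",)
        self.current_cpv = None
        self.use = self.base_use

    def setcpv(self, cpv):
        # After this call the instance reflects settings for one package only.
        self.current_cpv = cpv
        self.use = self.base_use + self.package_use.get(cpv, ())


class CopyingPackage:
    """One private config per package: always correct, but costs memory."""

    def __init__(self, cpv, settings):
        self.cpv = cpv
        self.settings = copy.copy(settings)
        self.settings.setcpv(cpv)

    def get_use(self):
        return self.settings.use


class SharingPackage:
    """One shared config: cheap, but setcpv() must be re-run before each read."""

    def __init__(self, cpv, settings):
        self.cpv = cpv
        self.settings = settings  # reference to the shared object, no copy

    def get_use(self):
        self.settings.setcpv(self.cpv)
        return self.settings.use
```

With the shared variant, the config always reflects whichever package called `setcpv()` last, so every accessor has to set it again before reading. That is the price of dropping the per-package copy.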
Hmm, I should have actually checked the patch first ... It should work, but I'm not really happy with that if statement, just thinking about a more general solution (in portage.config) right now.
*** Bug 99517 has been marked as a duplicate of this bug. ***
Created attachment 68588 [details, diff] package.py.patch
Created attachment 68874 [details, diff] gentoolkit patch
Updated patch to add semaphore access to the global portage.config object.
Created attachment 68900 [details, diff] Updated gentoolkit patch
Changed Semaphore to Lock, removed reset() call.
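For readers without the attachment at hand, the general idea of guarding a single shared config object with a lock might look like the sketch below. This is not the contents of the patch. `FakeConfig`, `get_use_flags` and `query_concurrently` are hypothetical names, and the `setcpv()` body is an assumption; only `threading.Lock` is a real standard-library API.

```python
import threading


class FakeConfig:
    """Hypothetical stand-in for the global config object."""

    def __init__(self, package_use):
        self.package_use = package_use  # cpv -> tuple of extra flags
        self.use = ()

    def setcpv(self, cpv):
        # Makes the shared instance specific to one package.
        self.use = ("global-flag",) + self.package_use.get(cpv, ())


# One config shared by everything, plus one lock that protects it.
_settings = FakeConfig({"cat/a-1": ("perl",), "cat/b-1": ("ssl",)})
_settings_lock = threading.Lock()


def get_use_flags(cpv):
    """Set the package and read the result as one atomic step."""
    with _settings_lock:
        _settings.setcpv(cpv)
        return tuple(_settings.use)


def query_concurrently(cpvs):
    """Query several packages from separate threads and collect the results."""
    results = {}

    def worker(cpv):
        results[cpv] = get_use_flags(cpv)

    threads = [threading.Thread(target=worker, args=(c,)) for c in cpvs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A plain Lock is enough here because only one holder is ever allowed at a time; a semaphore initialised to 1 would behave the same way, just with a more general interface than needed.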
Fix is in subversion
Fix is in gentoolkit-0.2.1_pre8
Fixed in gentoolkit-0.2.1