Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 90444 - dblink.getcontents() speedup
Summary: dblink.getcontents() speedup
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core (show other bugs)
Hardware: All Linux
: High enhancement (vote)
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 133068
  Show dependency tree
 
Reported: 2005-04-25 15:41 UTC by TGL
Modified: 2006-05-14 20:13 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
dblink_getcontents--remove_normpath_calls.patch (dblink_getcontents--remove_normpath_calls.patch,482 bytes, patch)
2005-04-25 15:42 UTC, TGL
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description TGL 2005-04-25 15:41:15 UTC
While profiling a script that reads CONTENTS files using 'dblink.getcontents()', i've seen that it was spending quite some time in ~400k calls to 'os.path.normpath()' (one per referenced element). I think there is nothing to normalize here (paths in CONTENTS files are already normal by construction) and thus this calls could be avoided. 

The one-line patch i will attach does this. It's effect is easy to check, for instance with equery:

Before:
% time equery belongs /bin/bash > /dev/null
real    0m13.583s
user    0m13.215s
sys     0m0.249s

After:
% time equery belongs /bin/bash > /dev/null
real    0m6.526s
user    0m6.218s
sys     0m0.246s


Reproducible: Always
Steps to Reproduce:
Comment 1 TGL 2005-04-25 15:42:48 UTC
Created attachment 57225 [details, diff]
dblink_getcontents--remove_normpath_calls.patch

(patch is against cvs HEAD)
Comment 2 Brian Harring (RETIRED) gentoo-dev 2005-04-25 16:05:13 UTC
This modification allows for /{2,} in file paths in contents, which is invalid (won't hurt anything, but is invalid).

We need to be defensive both in writing, and reading from contents imo.
Comment 3 TGL 2005-04-25 16:42:00 UTC
"normpath()" is just about removing '.', '..' and redundant '/', but it won't garantee anything else about the paths. So i don't think it adds any safeguard in the reading of CONTENTS file. The only thing that may need it is the 'root' variable, since it comes from the user, but that could be done once for all where the global is defined. 
Comment 4 Jason Stubbs (RETIRED) gentoo-dev 2005-04-28 21:05:35 UTC
What is the need to be defensive in reading?
Comment 5 Jason Stubbs (RETIRED) gentoo-dev 2005-07-28 07:24:40 UTC
Putting a hold on feature requests for portage as they are drowning out the 
bugs. Most of these features should be available in the next major version of 
portage. But for the time being, they are just drowning out the major bugs and 
delaying the next version's progress. 
 
Any bugs that contain patches and any bugs for etc-update or dispatch-conf can 
be reopened. Sorry, I'm just not good enough with bugzilla. ;) 
Comment 6 TGL 2005-07-28 07:44:42 UTC
Reopening since this one is not a feature request, but a simple improvement
proposal with its patch.
Comment 7 Alec Warner (RETIRED) archtester gentoo-dev Security 2006-05-11 18:29:41 UTC
(In reply to comment #4)
> What is the need to be defensive in reading?
> 

technically the contents could be corrupt, however I think it would be up to the called to fix any errors in the contents, so I'm for removing it..
Comment 8 Zac Medico gentoo-dev 2006-05-14 20:13:55 UTC
It's difficult to justify a sacrifice of robustness for performance.  The contents could be slightly malformed and a sacrifice in robustness here could potentially cause a detectable file collision to go undetected.  In svn r3357 the code now uses both os.path.join and os.path.normpath in favor of maximum robustness.