Bug 35931 - [patch] unmerge speedup (and fix?)
Summary: [patch] unmerge speedup (and fix?)
Status: RESOLVED INVALID
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Unclassified
Hardware: All Linux
Importance: High minor
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-12-16 05:00 UTC by TGL
Modified: 2011-10-30 22:18 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
unmerge-speedup--without-normalization.patch (unmerge-speedup--without-normalization.patch,1.15 KB, patch)
2003-12-16 05:02 UTC, TGL
unmerge-speedup--with-normalization.patch (unmerge-speedup--with-normalization.patch,3.94 KB, patch)
2003-12-16 05:03 UTC, TGL

Description TGL 2003-12-16 05:00:13 UTC
While doing some package cleanup, I realized I had something like 10 old kernel source trees "installed". In fact, when such a tree is outdated, I simply `rm -rf` it and often forget about the package. Okay, my bad, but unmerging these 10 non-existent directories took a long time, because the existence of each file was checked.

So I've patched unmerge to filter non-existent paths out of the file list before unmerging. This preprocessing costs almost nothing once the contents list is sorted (and it has to be sorted anyway), and it speeds things up a lot in some cases.
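
To illustrate the idea, here is a minimal sketch of such a filter. It is not the attached patch: the function name `filter_missing` and the injectable `exists` parameter are hypothetical, and it assumes that every directory owned by the package is itself listed in the contents, so that a parent always sorts before its children.

```python
import os


def filter_missing(paths, exists=os.path.lexists):
    """Return the sorted paths that still need to be handled by unmerge.

    Hypothetical helper: once a path is found missing, every path listed
    directly or indirectly below it is dropped without touching the
    filesystem again. The topmost missing path is kept so that it can
    still be reported once.
    """
    known_missing = set()
    kept = []
    for path in sorted(paths):
        # A parent is a strict prefix of its children, so it has already
        # been seen by the time we reach them in sorted order.
        if os.path.dirname(path) in known_missing:
            known_missing.add(path)  # lets deeper descendants be skipped too
            continue
        if not exists(path):
            known_missing.add(path)
        kept.append(path)
    return kept
```

With a removed kernel tree, only the top-level directory would go through the existence check and be reported, which matches the expected results given further down in this report.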

Also, while writing this, I noticed something inconsistent in the code:

 for mykey in pkgfiles.keys():
     ...normalize mykey to remove some "/"...
     ...access to pkgfiles[mykey]...

If you execute this when some paths in the CONTENTS file are not already normalized, it will give you a key error. Fortunately, CONTENTS files only contain normalized paths, so I think the normalization here is just redundant.
Anyway, I will attach my patch in two versions:
 - one that assumes that paths in CONTENTS are normalized (this seems safe, since no such key error has been reported so far).
 - one that doesn't make this assumption, but fixes this potential unmerge bug by keeping a "normalized key -> real key" dictionary.
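
The second approach can be sketched as follows. This is an illustration only, not the code from the attachment: the helper names `normalized_index` and `lookup` are hypothetical, `os.path.normpath` stands in for whatever normalization unmerge really applies, and the code is written for a current Python rather than the one Portage 2.0.50 ran on.

```python
import os.path


def normalized_index(pkgfiles):
    """Build a "normalized key -> real key" dictionary for pkgfiles."""
    index = {}
    for real_key in pkgfiles:
        # Assumption: normpath approximates the slash cleanup done in unmerge.
        index[os.path.normpath(real_key)] = real_key
    return index


def lookup(pkgfiles, index, path):
    """Fetch the entry for path, whatever form its key has in pkgfiles."""
    return pkgfiles[index[os.path.normpath(path)]]
```

Indexing `pkgfiles` directly with a normalized path fails with a `KeyError` whenever the stored key was not normalized; going through the index avoids that without rewriting the keys themselves.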


Reproducible: Always
Steps to Reproduce:
1- rm -rf /usr/src/linux-X.Y.Z-flavor
2- emerge -C =sys-kernel/flavor-sources-X.Y.Z
Actual Results:  
<snip>
!found /usr/src/linux-X.Y.Z-flavor/path/to/a/file
!found /usr/src/linux-X.Y.Z-flavor/path/to/another/file
!found /usr/src/linux-X.Y.Z-flavor/path/to/a/third/file
<very long snip>
!found /usr/src/linux-X.Y.Z
<snip>

Expected Results:  
<snip>
!found /usr/src/linux-X.Y.Z
<snip>
Comment 1 TGL 2003-12-16 05:02:12 UTC
Created attachment 22290
unmerge-speedup--without-normalization.patch

This version assumes paths are normalized in CONTENTS.
Patch is against 2.0.50_pre1
Comment 2 TGL 2003-12-16 05:03:22 UTC
Created attachment 22291
unmerge-speedup--with-normalization.patch

This version doesn't assume that paths are already normalized.
Patch is also against 2.0.50_pre1.
Comment 3 TGL 2003-12-18 08:33:35 UTC
> unmerging this 10 non existing directories took much time, because 
> existence of each file was checked. 

I think I have to correct this:

The speedup I noticed with this filtering patch was mainly due to the slowness of my terminal (I was using gnome-terminal). Because without the patch there was still a lot of output for non-existent files, unmerging an already deleted kernel source tree took several minutes (~4), whereas it takes ~20 seconds with the patch.

But if I redirect this output to /dev/null, then the same unmerge without the patch drops to ~40 secs. It is still slower than the patched version, but not by that much. The real overhead of testing the existence of ~12000 obviously non-existent files is only ~20 secs, much less than what I thought at first.
Comment 4 Nicholas Jones (RETIRED) gentoo-dev 2003-12-22 23:25:40 UTC
If the difference is the text output, that won't matter in the
future.
Comment 5 TGL 2003-12-23 07:04:28 UTC
I don't really agree with your resolution. Yes, the main difference (i.e. several minutes) was text output. That said, the cost of pure file existence testing is also real: it is ~20 sec on my computer for a kernel unmerge, which is 50% of the total "emerge -C" time. Removing this unnecessary work is still an improvement, no? The current unmerge behavior, which consists of "/a/b/c1 does not exist... oh, and neither does /a/b/c2, nor does /a/b/c3... <snip 12000 others like that>... hum, btw there was no /a/b...", is completely absurd, and I don't see why fixing it would be invalid (especially considering the fix is just a ~10 line addition).