Bug 23418 - Optimization for slow "Updating Portage Cache..." on emerge sync
Summary: Optimization for slow "Updating Portage Cache..." on emerge sync
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Unclassified
Hardware: All Linux
Importance: Highest enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Duplicates: 90518
Depends on:
Blocks: 835380
Reported: 2003-06-24 16:24 UTC by Ross Girshick
Modified: 2023-05-20 07:44 UTC
CC List: 12 users

See Also:
Package list:
Runtime testing required: ---


Attachments

Description Ross Girshick 2003-06-24 16:24:54 UTC
I've been a bit troubled by how slowly Portage updates the dep cache. On my laptop
it takes approx. 4.5 minutes. I looked into the code and came up with a few ways
to make it faster. Below is a patch.

1) Use symlinks instead of copying the metadata cache files from
/usr/portage/metadata/cache/. This can be done by changing the shutil.copy2 line
to os.symlink in portdb.aux_get()

2) (no patch -- just an idea) Instead of copying and then regenerating
missing/old cache items, regenerate what's needed in place in
../metadata/cache/. This saves disk space (currently about 36MB) and time.

3) (Patch below) Use local rsync to sync /var/cache/edb/dep/ with
/usr/portage/metadata/cache/. I've had success with the patch below; it reduces
my update times to < 1 minute on average.
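
Idea 1 can be illustrated outside of Portage. The following is only a minimal sketch in modern Python, not the code in portdb.aux_get(): the helper name populate_cache_entry and its arguments are hypothetical, and it shows nothing more than the difference between creating a symlink and copying with shutil.copy2.

```python
import os
import shutil


def populate_cache_entry(src, dst, use_symlink=True):
    """Make dst reflect the metadata file src, either by symlink or by copy.

    Hypothetical helper for illustration; not part of Portage.
    """
    # Remove any previous entry first: os.symlink fails if dst already exists.
    # lexists() is used so that a dangling symlink is also detected.
    if os.path.lexists(dst):
        os.unlink(dst)
    if use_symlink:
        # Only a directory entry is created; no file data is read or written.
        os.symlink(os.path.abspath(src), dst)
    else:
        # copy2 copies the contents plus the mtime and permission bits.
        shutil.copy2(src, dst)
```

One consequence of the symlink variant worth keeping in mind: os.stat() follows the link, and writing through the link modifies the file under metadata/cache/ itself, so any later regeneration would change the synced tree rather than a private copy.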

As you can see, I have some questions in the code. This is my first couple of
days hacking portage, so I may be missing something. But this seems like an easy
and quick way to make emerge sync a bit snappier.

--Ross Girshick

--- emerge.orig 2003-06-24 15:51:10.000000000 -0400
+++ emerge      2003-06-24 17:40:26.000000000 -0400
@@ -1656,31 +1656,55 @@
                sys.exit(1)
        if os.path.exists(myportdir+"/metadata/cache"):
                print "\n>>> Updating Portage cache...  ",
-               os.umask(0002)
-               if os.path.exists(portage.dbcachedir):
-                       portage.spawn("rm -Rf "+portage.dbcachedir,free=1)
-               try:
-                       os.mkdir(portage.dbcachedir)
-                       os.chown(portage.dbcachedir, os.getuid(), portage.portage_gid)
-                       os.chmod(portage.dbcachedir, 06775)
-                       os.umask(002)
-               except:
-                       pass
-               mynodes=portage.portdb.cp_all()
+               sys.stdout.flush()
+               # We shouldn't have to worry about this because when portage is imported dbcachedir is created if it's missing
+               if not os.path.exists(portage.dbcachedir):
+                       print "!!! Cache Directory " + portage.dbcachedir + " does not exist. Re-running emerge should fix this"
+                       sys.exit(1)
+               # XXX If we don't --delete, then we don't have to regenerate the cache files...what danger does this create?
+               # maybe it's sufficient to use --delete only every N syncs??? XXX
+               #update_cache_command = "/usr/bin/rsync -rlptD --delete --delete-after " \
+               update_cache_command = "/usr/bin/rsync -rlptD " + myportdir + "/metadata/cache/ " + portage.dbcachedir
+               exitcode = portage.spawn(update_cache_command, free=1)
+               # print update_cache_command
+               if (exitcode > 0):
+                       ## more error info might be good
+                       print "!!! local rsync error (cache update failed): " + str(exitcode) + "\n"
+                       sys.exit(1)
+               mynodes = portage.portdb.cp_all()
                for x in mynodes:
-                       myxsplit=x.split("/")
-                       if not os.path.exists(portage.dbcachedir+"/"+myxsplit[0]):
-                               os.mkdir(portage.dbcachedir+"/"+myxsplit[0])
-                               os.chown(portage.dbcachedir+"/"+myxsplit[0], os.getuid(), portage.portage_gid)
-                               os.chmod(portage.dbcachedir+"/"+myxsplit[0], 06775)
-                       mymatches=portage.portdb.xmatch("match-all",x)
+                       mymatches = portage.portdb.xmatch("match-all", x)
                        for y in mymatches:
                                update_spinner()
+                               mydbkey = portage.dbcachedir+y
+                               myebuild, in_overlay = portage.portdb.findname2(y)
+                               myebuild_mtime = os.stat(myebuild)[ST_MTIME]
                                try:
-                                       ignored=portage.portdb.aux_get(y,[],metacachedir=myportdir+"/metadata/cache")
-                               except:
-                                       pass
-               portage.spawn("chmod -R g+rw "+portage.dbcachedir, free=1)
+                                       mydbkeystat = os.stat(mydbkey)
+                                       if mydbkeystat[ST_SIZE] == 0 or myebuild_mtime != mydbkeystat[ST_MTIME]:
+                                               doregen = 1
+                                       else:
+                                               doregen = 0
+                               except OSError:
+                                       doregen = 1
+
+                               if doregen:
+                                       ##print "doregen " + mydbkey + "\n"
+                                       try:
+                                               os.unlink(mydbkey)
+                                       except:
+                                               pass
+                                       # regenerate the dep cache file using doebuild interface
+                                       if portage.doebuild(myebuild, "depend", "/"):
+                                               #depend returned non-zero exit code...
+                                               sys.stderr.write(str(red("\nemerge sync:")+" (0) Error in "+y+" ebuild.\n"
+                                               "               Check for syntax error or corruption in the ebuild. (--debug)\n\n"))
+                                       try:
+                                               os.utime(mydbkey, (myebuild_mtime, myebuild_mtime))
+                                       except (IOError, OSError):
+                                               sys.stderr.write(str(red("\nemerge sync:")+" (1) Error in "+y+" ebuild.\n"
+                                               "               Check for syntax error or corruption in the ebuild. (--debug)\n\n"))
+               ##portage.spawn("chmod -R g+rw "+portage.dbcachedir, free=1)
                sys.stdout.write("\b\b  ...done!\n\n")
                sys.stdout.flush()
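
The staleness test that the patch applies to each cache file (regenerate when the file is missing, empty, or carries a different mtime than its ebuild) can be read in isolation as the sketch below. This is a restatement in modern Python under stated assumptions, not the patch itself: the function names needs_regen and stamp_cache are hypothetical, and timestamps are truncated to whole seconds to mimic the integer ST_MTIME comparison.

```python
import os


def needs_regen(ebuild_path, cache_path):
    """Return True if the cache entry is missing, empty, or out of step with the ebuild."""
    ebuild_mtime = int(os.stat(ebuild_path).st_mtime)
    try:
        cache_stat = os.stat(cache_path)
    except OSError:
        # No cache entry at all: it has to be generated.
        return True
    return cache_stat.st_size == 0 or int(cache_stat.st_mtime) != ebuild_mtime


def stamp_cache(ebuild_path, cache_path):
    """Give the cache entry the ebuild's mtime so the next check sees it as current."""
    mtime = os.stat(ebuild_path).st_mtime
    os.utime(cache_path, (mtime, mtime))
```

In a loop over all ebuilds, only the entries for which needs_regen() returns True would be handed to the (expensive) regeneration step, followed by stamp_cache().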


Reproducible: Always
Steps to Reproduce:
Comment 1 Martin Holzer (RETIRED) gentoo-dev 2003-06-24 16:28:53 UTC
Since 2.0.48_preX it really takes a very long time.
Comment 2 Nicholas Jones (RETIRED) gentoo-dev 2003-06-24 18:33:50 UTC
The metadata is not always right. It is done the way it is because, if you
have your own packages, the data must be changed. It is slow now
because it is actually being done correctly. I may add a frozen
metadata setup for people who do not use an overlay, but overlay
support is the present reason for the slowness.
Comment 3 Ross Girshick 2003-06-24 20:30:42 UTC
I understand that the metadata might not always be correct. When it isn't, the cache update code needs to regenerate the cache file for the ebuild (whether in the main tree or in the overlay). But copying every file from metadata/cache/ to edb/dep/, including those that need no update, seems sub-optimal. If that approach is taken, then something like rsync saves a lot of data transfer. The present method of deleting all of the dep cache and then shutil.copy2'ing the files over creates a pretty high I/O load on my system.

Since I'm new to working with the portage code, I'm sure that I'm missing something regarding the overlay. Below is a patch for the 2nd option that I mentioned in the first post. I've symlinked my /var/cache/edb/dep to /usr/portage/metadata/cache. This patch makes portage update the metadata in place without any copying. It takes about 1/4 of the time it took before.

It also generates a file that contains a list of regenerated cache files that rsync should not destroy (--exclude-from regen_list), but presently I'm not getting rsync to honor this file. If it does, and there are few changes, the update takes ~20 seconds.
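
For reference, the rsync invocations discussed here can be assembled as an argument list rather than a concatenated shell string. The sketch below is an illustration only: the function name build_cache_rsync_argv and its parameters are hypothetical, and the only rsync options used are the ones that already appear in this report (-rlptD, --delete, --delete-after, --exclude-from).

```python
import os


def build_cache_rsync_argv(portdir, dbcachedir, exclude_from=None, delete=False):
    """Build an rsync argument list that mirrors metadata/cache into the dep cache."""
    argv = ["/usr/bin/rsync", "-rlptD"]
    if delete:
        # Remove files from the destination that no longer exist in the source.
        argv += ["--delete", "--delete-after"]
    if exclude_from is not None and os.path.exists(exclude_from):
        # Entries listed in this file are left alone by rsync.
        argv += ["--exclude-from", exclude_from]
    # The trailing slash on the source means "the contents of this directory".
    argv += [os.path.join(portdir, "metadata", "cache") + "/", dbcachedir]
    return argv
```

The resulting list could be passed to subprocess.run(argv), which avoids quoting problems if a path ever contains a space.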

If this patch is presently ignoring the overlay situation, how can it be fixed to work with the overlay feature?

Thanks
ross

--- emerge.orig 2003-06-24 15:51:10.000000000 -0400
+++ emerge      2003-06-24 23:36:18.000000000 -0400
@@ -1590,6 +1590,11 @@
                                mycommand=mycommand+" --exclude-from "+portage.settings["RSYNC_EXCLUDEFROM"]
                        else:
                                print "!!! RSYNC_EXCLUDEFROM specified, but file does not exist."
+               if os.path.exists(myportdir+"/metadata/regenlist"):
+                       mycommand=mycommand+" --exclude-from "+myportdir+"/metadata/regenlist"
+                       regen_file = open(myportdir+"/metadata/regenlist", "r")
+               else:
+                       regen_file = None
                mycommand=mycommand+" "+syncuri+"/* "+myportdir
                print ">>> starting rsync with "+syncuri+"..."
                exitcode=portage.spawn(mycommand,free=1)
@@ -1656,31 +1661,62 @@
                sys.exit(1)
        if os.path.exists(myportdir+"/metadata/cache"):
                print "\n>>> Updating Portage cache...  ",
-               os.umask(0002)
-               if os.path.exists(portage.dbcachedir):
-                       portage.spawn("rm -Rf "+portage.dbcachedir,free=1)
-               try:
-                       os.mkdir(portage.dbcachedir)
-                       os.chown(portage.dbcachedir, os.getuid(), portage.portage_gid)
-                       os.chmod(portage.dbcachedir, 06775)
-                       os.umask(002)
-               except:
-                       pass
-               mynodes=portage.portdb.cp_all()
-               for x in mynodes:
-                       myxsplit=x.split("/")
-                       if not os.path.exists(portage.dbcachedir+"/"+myxsplit[0]):
-                               os.mkdir(portage.dbcachedir+"/"+myxsplit[0])
-                               os.chown(portage.dbcachedir+"/"+myxsplit[0], os.getuid(), portage.portage_gid)
-                               os.chmod(portage.dbcachedir+"/"+myxsplit[0], 06775)
-                       mymatches=portage.portdb.xmatch("match-all",x)
-                       for y in mymatches:
+               if regen_file:
+                       regen_list = regen_file.readlines()
+                       regen_file.close()
+                       for path_to_cpv in regen_list:
+                               cpv = string.rstrip(string.replace(path_to_cpv, myportdir+"/metadata/cache/", ""))
+                               #XXX Not sure if i need to do anything about the overlay XXX
+                               ebuild, in_overlay = portage.portdb.findname2(cpv)
+                               if not os.path.exists(ebuild):
+                                       regen_list.remove(cpv)
+               else:
+                       regen_list = []
+               node_list = portage.portdb.cp_all()
+               for node in node_list:
+                       cpv_list = portage.portdb.xmatch("match-all", node)
+                       for cpv in cpv_list:
                                update_spinner()
+                               md_file = myportdir+"/metadata/cache/"+cpv
+                               ebuild, in_overlay = portage.portdb.findname2(cpv)
+                               ebuild_mtime = os.stat(ebuild)[ST_MTIME]
                                try:
-                                       ignored=portage.portdb.aux_get(y,[],metacachedir=myportdir+"/metadata/cache")
-                               except:
-                                       pass
-               portage.spawn("chmod -R g+rw "+portage.dbcachedir, free=1)
+                                       md_file_stat = os.stat(md_file)
+                                       if md_file_stat[ST_SIZE] == 0 or ebuild_mtime != md_file_stat[ST_MTIME]:
+                                               doregen = 1
+                                       else:
+                                               doregen = 0
+                               except OSError:
+                                       doregen = 1
+                               if doregen:
+                                       ##print "doregen " + cpv + "\n"
+                                       regen_list.append(myportdir+"/metadata/cache/"+cpv+"\n")
+                                       try:
+                                               os.unlink(md_file)
+                                       except:
+                                               pass
+                                       # regenerate the dep cache file using doebuild interface
+                                       # this places it in /var/cache/edb/dep, which is now a symlink to myportdir/metadata/cache
+                                       if portage.doebuild(ebuild, "depend", "/"):
+                                               #depend returned non-zero exit code...
+                                               sys.stderr.write(str(red("\nemerge sync:")+" (0) Error in "+cpv+" ebuild.\n"
+                                               "               Check for syntax error or corruption in the ebuild. (--debug)\n\n"))
+                                       try:
+                                               os.utime(md_file, (ebuild_mtime, ebuild_mtime))
+                                       except (IOError, OSError):
+                                               sys.stderr.write(str(red("\nemerge sync:")+" (1) Error in "+cpv+" ebuild.\n"
+                                               "               Check for syntax error or corruption in the ebuild. (--debug)\n\n"))
+               #print regen_list
+               if len(regen_list) >= 1:
+                       try:
+                               os.unlink(myportdir+"/metadata/regenlist")
+                       except:
+                               pass
+                       regen_file = open(myportdir+"/metadata/regenlist", "w")
+                       for regen in regen_list:
+                               regen_file.write(regen)
+                       regen_file.flush()
+                       regen_file.close()
                sys.stdout.write("\b\b  ...done!\n\n")
                sys.stdout.flush()
Comment 4 Ross Girshick 2003-06-25 07:25:49 UTC
If you're interested in pursuing my last patch, I've fixed the rsync --exclude-from issue (I should have read the man pages more carefully) and a bug with the regen_list handling.

Also, what's the proper method of posting patches? I'm new to bugzilla and find that the line wrapping makes them almost unusable.

here's what to change:

+               if regen_file:
+                       regen_list = regen_file.readlines()
+                       regen_file.close()
+                       for path_to_cpv in regen_list:
+                               cpv = string.rstrip(string.replace(path_to_cpv, "/metadata/cache/", ""))
+                               #XXX Not sure if i need to do anything about the overlay XXX
+                               ebuild, in_overlay = portage.portdb.findname2(cpv)
+                               if not os.path.exists(ebuild):
+                                       regen_list.remove(cpv)
+               else:
+                       regen_list = ['/metadata/regenlist']


+                               if doregen:
+                                       ##print "doregen " + cpv + "\n"
+                                       if regen_list.__contains__("/metadata/cache/"+cpv+"\n") == 0:
+                                               regen_list.append("/metadata/cache/"+cpv+"\n")


+               if len(regen_list) >= 1:
+                       try:
+                               ### in case we have a crash between unlinking the previous regenlist and writing the new one
+                               shutil.copy2(myportdir+"/metadata/regenlist", myportdir+"/metadata/regenlist.bak")
+                               os.unlink(myportdir+"/metadata/regenlist")
+                       except:
+                               pass
Comment 5 rob holland (RETIRED) gentoo-dev 2003-06-27 09:11:47 UTC
Hi Ross. They should be posted as text/plain attachments, please :)
Comment 6 Ross Girshick 2003-06-28 14:21:45 UTC
Thanks. I'd also like to apologize for posting some premature patches.
This is the first time I've tried contributing to an open source project;
I got a little excited and jumped the gun :).

I'm still working on alternative ways of speeding up the cache-building
process. I'll let you know when I have a mature patch.
Comment 7 Timo Boettcher 2003-07-13 07:16:54 UTC
It takes a lot longer for me.

notebook root # time emerge sync
[...]
real    597m14.397s
user    37m2.430s
sys     25m36.090s
notebook root #

That's about 10 hours, which is *WAY* too long.

My notebook is a P-75 with 14 MB of RAM. It seems to be swapping stuff in and out of memory the whole time; at least its HDD LED remains lit during the process.

It would be nice if someone could bump up the priority/severity of this, since ten hours for a sync isn't acceptable for me.
Comment 8 Nicholas Jones (RETIRED) gentoo-dev 2003-07-16 01:42:45 UTC
Timo:
Ouch... I'd recommend you just not use metadata...
I kill it on my p100, normally.

Control-c it when it goes to update it.

I'm looking into alternatives for massively
IO bound systems like that. 14M ram is painful.

------------

Linking /var/cache/edb isn't an optimal solution. When I made the
changes that made it slower, I was attempting to fix a worst-case
problem for the 'average' box... Which I picked in the 500-800 MHz
range with a reasonable amount of RAM... As I've noticed and been
told, it's painful on the lower-end machines.

If you link edb/dep to metadata, you guarantee that you have to
stat 2 files, change the mtime on one, and then ensure that overlays
do not contain a copy of the same ebuild. By simply recreating the
cache, I eliminated the stat calls and did a (cached, on most boxes)
perfect copy to a new file. I'm looking at some db solutions for
this to cut down IO and seeking, but any solution needs to remain
flexible and work properly with multiple trees.
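
The overlay concern can be made concrete with a small sketch. It is not Portage's findname2(): the function names, the argument layout, and the assumption that overlays are searched before the main tree are all illustrative. The point is only that metadata shipped with the main tree is trustworthy for a package version when no overlay supplies its own copy of that ebuild.

```python
import os


def resolve_ebuild(category, package, version, trees):
    """Return (ebuild_path, tree) for the first tree holding the ebuild, else (None, None)."""
    relative = os.path.join(category, package, "%s-%s.ebuild" % (package, version))
    for tree in trees:
        candidate = os.path.join(tree, relative)
        if os.path.exists(candidate):
            return candidate, tree
    return None, None


def can_use_shipped_metadata(category, package, version, main_tree, overlays):
    """True only if the ebuild resolves to the main tree, i.e. no overlay overrides it."""
    # Assumption: overlays take precedence, so they are searched first.
    trees = list(overlays) + [main_tree]
    path, tree = resolve_ebuild(category, package, version, trees)
    return path is not None and tree == main_tree
```

Each call costs at least one extra existence check per overlay, which is the kind of additional filesystem access described above.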
Comment 9 Jason Stubbs (RETIRED) gentoo-dev 2005-05-26 15:45:17 UTC
There's an improvement in >=portage-2.0.51.21 or thereabouts where metadata
is only copied if the cache is out of date. This makes the worst case slightly
worse but the best case much, much better. There are still ways in which it can be
improved (and possibly cut out altogether) later on, but this bug is not
quantitative, so I'll be closing it when something from the above goes stable.
Comment 10 Jason Stubbs (RETIRED) gentoo-dev 2005-07-14 05:47:57 UTC
Fixed on or before 2.0.51.22-r1 
Comment 11 Jason Stubbs (RETIRED) gentoo-dev 2005-07-14 06:58:41 UTC
Looking through the batch of bugs, I'm not sure that some of these are
actually fixed in stable. For others, the requirements may have changed after
the initial fix was committed.

If you think this bug has been closed incorrectly, please reopen it or ask that
it be reopened.
Comment 12 Brian Harring (RETIRED) gentoo-dev 2006-01-07 16:31:02 UTC
*** Bug 90518 has been marked as a duplicate of this bug. ***
Comment 13 Brian Harring (RETIRED) gentoo-dev 2006-01-07 16:32:11 UTC
zac, reopen if you don't agree with reasoning in this bug for why disabling the metadata transfer shouldn't happen (yet).

Or... do up that write overlay cache backend I poked ya about ;)