Summary: | metadata transfer should be done after overlays update | ||
---|---|---|---|
Product: | Portage Development | Reporter: | Andrew Savchenko <bircoph> |
Component: | Unclassified | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | bircoph |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Andrew Savchenko
2015-02-22 11:28:19 UTC
In any case, almost all repositories other than "gentoo" (including "science") provide no metadata cache (in ${repository_location}/metadata/md5-cache if 'cache-formats = md5-dict' is set). You should run `emerge --regen` to generate the metadata cache locally for these repositories.

This is not necessary: emerge --regen will just duplicate effort. After running emerge --metadata I have the files /var/cache/edb/dep/var/lib/layman/*.sqlite updated for each overlay. sqlite3 confirms that the database structure and content of these files are similar to those of /var/cache/edb/dep/usr/portage.sqlite.

1. Delete /var/cache/edb/dep/*
2. Run: emerge --metadata
3. Check sizes of /var/cache/edb/dep/**/*.sqlite
4. Run: emerge --regen
5. Check sizes of /var/cache/edb/dep/**/*.sqlite

They will be different from those at step 3.

While we look at making the change needed.

You can add either a /etc/portage/postsync.d hook that runs once after all repos are updated, or a /etc/portage/repo.postsync.d hook that runs once for each repo synced. For repo.postsync.d, three items are passed to the hook script: repo name, location, sync-uri.

Hmm, indeed, you're right: after deleting the old ones, the new ones were empty databases. Now I wonder why I had those files filled in the first place.

Another interesting observation: portage.sqlite was 10% smaller after removal and regeneration. Probably sqlite3 $i 'reindex; vacuum' should be used once in a while...

(In reply to Brian Dolbec from comment #4)
> While we look at making the change needed.
>
> You can add either a /etc/portage/postsync.d hook that runs once after all
> repos are updated or add a /etc/portage/repo.postsync.d hook that runs once
> for each repo synced. For repo.postsync.d three items are passed in to
> the hook script, repo name, location, sync-uri.

In that case emerge --metadata will be run twice: after the Gentoo tree sync and after all overlays are updated. This is time-consuming, especially on several old boxes of mine.
Right now I have solved this by falling back to the eix-update utility (with --regen hook) and by disabling autosync for layman overlays. This way I basically use a pre-2.2.16 portage configuration.

(In reply to Andrew Savchenko from comment #0)
> 1. Updates main tree.
> 2. Runs emerge --metadata.

It's not the same thing as emerge --metadata. It only transfers metadata for the repo that was just synced.

> 3. Updates overlays.
>
> This way old overlay data is being cached during emerge --metadata run,
> which is wrong. Please run metadata sync only after all repositories were
> updated.

No, it does the right thing, because it will transfer metadata for an overlay only if it has a metadata/md5-cache directory. Since the overlays don't have metadata/md5-cache directories, it skips the metadata transfer.

(In reply to Brian Dolbec from comment #4)
> While we look at making the change needed.

As explained above, no change is needed. The action_metadata function does not do any extra work. It is able to operate on one repo at a time.

> You can add either a /etc/portage/postsync.d hook that runs once after all
> repos are updated or add a /etc/portage/repo.postsync.d hook that runs once
> for each repo synced. For repo.postsync.d three items are passed in to
> the hook script, repo name, location, sync-uri.

This would be pointless, because action_metadata will already be called for the repo if the metadata/md5-cache directory exists.

For reference, see the SyncManager._sync_callback method:
https://github.com/gentoo/portage/blob/v2.2.17/pym/portage/sync/controller.py#L309

Note that it calls action_metadata only if the metadata/md5-cache directory exists. Also note that it uses the porttrees=[self.repo.location] parameter so that the function only transfers metadata for the current repository.

Now I have configured repo.postsync.d to run egencache for overlays, based on the example scripts. Looks like everything works fine now. Thank you for the explanations.
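
For readers following the thread, the per-repository decision described above can be sketched roughly as follows. This is only an illustration, not Portage's actual code: the function names `has_md5_cache` and `postsync_action` are hypothetical, and the `egencache --update --repo=<name>` command line is an assumption that should be checked against `egencache --help` on the installed Portage version (newer releases may spell the option differently). The argument order a real repo.postsync.d hook receives should likewise be checked against the example script shipped with Portage.

```python
import os
import tempfile


def has_md5_cache(repo_location):
    """Return True if the repository ships a metadata/md5-cache directory."""
    return os.path.isdir(os.path.join(repo_location, "metadata", "md5-cache"))


def postsync_action(repo_name, repo_location):
    """Describe what should happen after one repository has been synced.

    Hypothetical helper: returns ("transfer", None) when a shipped md5-cache
    exists (the thread says Portage transfers it by itself in that case),
    otherwise ("regen", argv), where argv is a command a hook could run to
    generate the cache locally.
    """
    if has_md5_cache(repo_location):
        return ("transfer", None)
    # Assumption: egencache accepts --update and --repo=<name>.
    return ("regen", ["egencache", "--update", "--repo=" + repo_name])


if __name__ == "__main__":
    # Demonstration with throwaway directories instead of real repositories.
    with tempfile.TemporaryDirectory() as root:
        gentoo = os.path.join(root, "gentoo")
        science = os.path.join(root, "science")
        os.makedirs(os.path.join(gentoo, "metadata", "md5-cache"))
        os.makedirs(science)
        print(postsync_action("gentoo", gentoo))
        print(postsync_action("science", science))
```

The sketch only builds the command and does not execute it, so it can be tried on a machine without Portage; a real hook would pass the returned list to something like `subprocess.call`.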