Hello,

I use portage-2.2.17 with sqlite cache enabled, thus I have metadata-transfer enabled in my FEATURES in order to run emerge --metadata after Gentoo tree sync. I also use overlays via layman's sync plugin, so I have repos.conf/layman.conf as follows:

[science]
priority = 50
location = /var/lib/layman/science
layman-type = git
sync-type = laymansync
sync-uri = git://git.overlays.gentoo.org/proj/sci.git
auto-sync = Yes

(the same for other overlays)

The problem is that when running emerge --sync (or emaint sync -a) portage does the following in order:

1. Updates main tree.
2. Runs emerge --metadata.
3. Updates overlays.

This way old overlay data is being cached during emerge --metadata run, which is wrong. Please run metadata sync only after all repositories were updated.
Anyway, almost all repositories other than "gentoo" (including "science") provide no metadata cache (it would be in ${repository_location}/metadata/md5-cache if 'cache-formats = md5-dict' is set). You should run `emerge --regen` to generate the metadata cache locally for these repositories.
This is not necessary: emerge --regen will just duplicate effort. After running emerge --metadata I have the files /var/cache/edb/dep/var/lib/layman/*.sqlite updated for each overlay. sqlite3 confirms that the database structure and content of these files are similar to /var/cache/edb/dep/usr/portage.sqlite.
1. Delete /var/cache/edb/dep/*
2. Run: emerge --metadata
3. Check sizes of /var/cache/edb/dep/**/*.sqlite
4. Run: emerge --regen
5. Check sizes of /var/cache/edb/dep/**/*.sqlite

They will be different than at step 3.
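The size checks in steps 3 and 5 can be scripted. The following is a minimal sketch, assuming the cache root is /var/cache/edb/dep as in the steps above; sqlite_sizes and changed are hypothetical helper names, not part of portage.

```python
#!/usr/bin/env python3
"""Record the size of every *.sqlite file under a cache directory."""
import sys
from pathlib import Path


def sqlite_sizes(root):
    """Map each *.sqlite file below root (recursively) to its size in bytes."""
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in sorted(Path(root).rglob("*.sqlite"))
        if path.is_file()
    }


def changed(before, after):
    """Return files whose size differs or that exist in only one snapshot."""
    return sorted(
        name
        for name in set(before) | set(after)
        if before.get(name) != after.get(name)
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): sizes.py /var/cache/edb/dep
    for name, size in sqlite_sizes(sys.argv[1]).items():
        print(f"{size:>12}  {name}")
```

Taking one snapshot after step 2 and another after step 4, then passing both to changed(), lists the databases that emerge --regen actually altered.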
While we look at making the change needed.

You can add either a /etc/portage/postsync.d hook that runs once after all repos are updated or add a /etc/portage/repo.postsync.d hook that runs once for each repo synced. For repo.postsync.d three items are passed in to the hook script: repo name, location, sync-uri.
Hmm, indeed, you're right: after deleting the old files, the new ones were empty databases. Now I wonder why I had those files filled in the first place.

Another interesting observation: portage.sqlite was 10% smaller after removal and regeneration. Probably sqlite3 $i 'reindex; vacuum' should be used once in a while...
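The 'reindex; vacuum' idea can also be run through Python's standard sqlite3 module. This is a minimal sketch; compact_sqlite is a hypothetical helper, and it assumes that nothing else (for example a running emerge) has the database open at the time.

```python
import os
import sqlite3


def compact_sqlite(path):
    """Run REINDEX and VACUUM on one database; return (size_before, size_after)."""
    before = os.path.getsize(path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("REINDEX")
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)
```

Calling compact_sqlite("/var/cache/edb/dep/usr/portage.sqlite") would report the file size before and after, which makes it easy to see whether the occasional cleanup is worth the time.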
(In reply to Brian Dolbec from comment #4)
> While we look at making the change needed.
>
> You can add either a /etc/portage/postsync.d hook that runs once after all
> repos are updated or add a /etc/portage/repo.postsync.d hook that runs once
> for each repo synced. For repo.postsync.d three items are passed in to
> the hook script, repo name, location, sync-uri.

In such a case emerge --metadata will be run twice: after the Gentoo tree sync and after all overlays are updated. This is time-consuming, especially on several old boxes of mine.

Right now I have solved this by falling back to the eix-update utility (with a --regen hook) and by disabling auto-sync for layman overlays. This way I basically use a pre-2.2.16 portage configuration.
(In reply to Andrew Savchenko from comment #0)
> 1. Updates main tree.
> 2. Runs emerge --metadata.

It's not the same thing as emerge --metadata. It only transfers metadata for the repo that was just synced.

> 3. Updates overlays.
>
> This way old overlay data is being cached during emerge --metadata run,
> which is wrong. Please run metadata sync only after all repositories were
> updated.

No, it does the right thing, because it will transfer metadata for an overlay only if it has a metadata/md5-cache directory. Since the overlays don't have metadata/md5-cache directories, it skips the metadata transfer.

(In reply to Brian Dolbec from comment #4)
> While we look at making the change needed.

As explained above, no change is needed. The action_metadata function does not do any extra work. It is able to operate on one repo at a time.

> You can add either a /etc/portage/postsync.d hook that runs once after all
> repos are updated or add a /etc/portage/repo.postsync.d hook that runs once
> for each repo synced. For repo.postsync.d three items are passed in to
> the hook script, repo name, location, sync-uri.

This would be pointless, because action_metadata will already be called for the repo if the metadata/md5-cache directory exists.
For reference, see the SyncManager._sync_callback method:

https://github.com/gentoo/portage/blob/v2.2.17/pym/portage/sync/controller.py#L309

Note that it calls action_metadata only if the metadata/md5-cache directory exists. Also note that it uses the porttrees=[self.repo.location] parameter so that the function only transfers metadata for the current repository.
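The check described here amounts to roughly the following. This is a simplified sketch of the condition, not the actual portage code; should_transfer_metadata is a hypothetical helper, and its features argument stands in for the configured FEATURES.

```python
import os


def should_transfer_metadata(repo_location, features):
    """Return True if a just-synced repo would get its metadata transferred.

    Simplified: the transfer needs metadata-transfer in FEATURES and a
    metadata/md5-cache directory inside the repository itself.
    """
    if "metadata-transfer" not in features:
        return False
    return os.path.isdir(os.path.join(repo_location, "metadata", "md5-cache"))
```

For an overlay such as /var/lib/layman/science that ships no metadata/md5-cache directory, this returns False, which matches the skipped transfer described in this thread.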
I have now configured repo.postsync.d to run egencache for the overlays, based on the example scripts. Looks like everything works fine now. Thank you for the explanations.
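For anyone finding this later, a hook of this kind might look like the sketch below. It is an illustration written in Python, not one of the shipped example scripts. It relies only on the first argument passed to the hook (the repository name), and the --jobs value is an arbitrary assumption. The file would need to be executable to be picked up.

```python
#!/usr/bin/env python3
"""Sketch of a /etc/portage/repo.postsync.d hook that runs egencache."""
import subprocess
import sys


def egencache_command(repo_name, jobs=2):
    """Build the egencache call for one repo, or None if nothing should run.

    The "gentoo" repository is skipped because it already ships a
    pregenerated metadata/md5-cache.
    """
    if not repo_name or repo_name == "gentoo":
        return None
    return ["egencache", "--update", f"--repo={repo_name}", f"--jobs={jobs}"]


def main(argv):
    # argv[1] is the repository name handed to the hook by portage.
    cmd = egencache_command(argv[1] if len(argv) > 1 else "")
    if cmd is None:
        return 0
    # Propagate the egencache exit status so a failure is visible to portage.
    return subprocess.call(cmd)


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Because this runs once per repository, only the overlay that was just synced gets its cache regenerated, and the main tree is left to the normal metadata transfer.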