Bug 568934 - sys-apps/portage: use md5 instead of mtime for /var/cache/edb/dep (flat_hash and sqlite) cache formats
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All All
Importance: Normal enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks: 552814 604854
Reported: 2015-12-21 04:09 UTC by Zac Medico
Modified: 2017-02-10 22:25 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Description Zac Medico gentoo-dev 2015-12-21 04:09:24 UTC
Metadata cache entries which are generated on-demand in /var/cache/edb/dep currently use mtimes for validation. It would be better to use md5, like the cache entries written by egencache. Using md5 means that cache entries can remain valid after a `git reset --hard` command resets all of the timestamps (see bug 568890).
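
A minimal sketch of why a digest survives a timestamp reset while an mtime check does not. This is not Portage's code: the helper names and the `_mtime_` key are assumptions made for illustration (`_md5_` is the key that appears in the traceback later in this bug), and the timestamp change is simulated with `os.utime` rather than an actual `git reset --hard`.

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the hex MD5 digest of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def make_entry(path):
    """Record both validation keys for a hypothetical cache entry."""
    return {"_mtime_": os.stat(path).st_mtime_ns, "_md5_": md5_of(path)}


def valid_by_mtime(entry, path):
    """Valid only if the file's timestamp is unchanged."""
    return entry["_mtime_"] == os.stat(path).st_mtime_ns


def valid_by_md5(entry, path):
    """Valid as long as the file's contents are unchanged."""
    return entry["_md5_"] == md5_of(path)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        ebuild = os.path.join(tmp, "foo-1.0.ebuild")
        with open(ebuild, "w") as f:
            f.write('EAPI=6\nDESCRIPTION="example"\n')
        entry = make_entry(ebuild)

        # Simulate a checkout that leaves the contents alone but
        # moves the timestamp forward by ten seconds.
        st = os.stat(ebuild)
        os.utime(ebuild, ns=(st.st_atime_ns, st.st_mtime_ns + 10 * 10**9))

        return valid_by_mtime(entry, ebuild), valid_by_md5(entry, ebuild)
```

Under these assumptions, `demo()` reports the mtime check as failed and the md5 check as still passing, which is the behaviour the description is asking for.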

For interoperability with older portage and other package managers, we can include both timestamps and md5 digests in the cache entries (the timestamps can later be dropped after a reasonable transition period has elapsed).
Comment 1 Zac Medico gentoo-dev 2015-12-21 05:54:20 UTC
(In reply to Zac Medico from comment #0)
> For interoperability with older portage and other package managers, we can
> include both timestamps and md5 digests in the cache entries (the timestamps
> can later be dropped after a reasonable transition period has elapsed).

It seems that sort of interoperability will be a lot of work. It will be much less work to simply override the validate_entry method so that it can validate entries containing either md5 digests or mtimes.
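
A rough sketch of what validation that accepts either kind of entry could look like. The standalone function, its signature, and the `_mtime_` key are hypothetical and chosen for illustration; the real `validate_entry` method in Portage may take different arguments and do more.

```python
def validate_entry(entry, current):
    """Return True if a cached entry still matches the current file state.

    entry:   dict loaded from the cache; may carry "_md5_", "_mtime_", or both.
    current: dict with the freshly computed "_md5_" and "_mtime_" of the ebuild.
    (Both shapes are assumptions for this sketch.)
    """
    # Prefer the content digest when the entry has one.
    md5 = entry.get("_md5_")
    if md5:
        return md5 == current["_md5_"]

    # Fall back to the timestamp for entries written in the older format.
    mtime = entry.get("_mtime_")
    if mtime is not None and mtime != "":
        return int(mtime) == int(current["_mtime_"])

    # An entry with neither key cannot be validated.
    return False
```

With this shape, entries written with only an mtime stay usable, and entries carrying a digest are checked by content, so no dual-format write is needed.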
Comment 2 Zac Medico gentoo-dev 2015-12-22 07:06:42 UTC
There's a patch in the following branch:

https://github.com/zmedico/portage/tree/bug_568934

I've posted it for review here:

https://archives.gentoo.org/gentoo-portage-dev/message/6f8396a0ca6a07fb6a19bb80f216f1b3
Comment 3 Zac Medico gentoo-dev 2015-12-22 17:38:37 UTC
The flat_hash patch is in the master branch:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=669d11bd8af5a2bd4cca1710a09c94294ad1e4dd

I'll follow up with another patch to fix the remaining modules that use mtime.
Comment 5 Zac Medico gentoo-dev 2016-07-05 16:09:30 UTC
The forward compatibility code is included in the latest stable version of portage (2.2.28 at this time), so now it's safe to replace mtime with md5 in the cache. I'll begin work on the patches.
Comment 7 Zac Medico gentoo-dev 2016-07-13 11:35:01 UTC
This is in the master branch now:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=9abbda7d054761ae6c333d3e6d420632b9658b6d
Comment 8 Coacher 2016-07-23 15:05:55 UTC
Hello.

With these changes I get the following stack trace at the 'Updating Portage cache' step with FEATURES=metadata-transfer:

# emerge --sync && emerge -avuDN world && emerge --depclean -a && emerge -a1v @preserved-rebuild && revdep-rebuild -i -- -av     
>>> Syncing repository 'gentoo' into '/var/portage'...
/usr/bin/git fetch origin --depth 1
remote: Counting objects: 22, done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 22 (delta 16), reused 22 (delta 16), pack-reused 0
Unpacking objects: 100% (22/22), done.
From https://github.com/gentoo-mirror/gentoo
 + 5c0162b...d67625d master     -> origin/master  (forced update)
=== Sync completed for gentoo
q: Updating ebuild cache in /var/portage ... 
q: Finished 38140 entries in 0.417753 seconds

>>> Updating Portage cache
  0% [>                                                                       ]Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.4/emerge", line 50, in <module>
    retval = emerge_main()
  File "/usr/lib64/python3.4/site-packages/_emerge/main.py", line 1194, in emerge_main
    return run_action(emerge_config)
  File "/usr/lib64/python3.4/site-packages/_emerge/actions.py", line 3134, in run_action
    return action_sync(emerge_config)
  File "/usr/lib64/python3.4/site-packages/_emerge/actions.py", line 2000, in action_sync
    retvals = syncer.auto_sync(options={'return-messages': False})
  File "/usr/lib64/python3.4/site-packages/portage/emaint/modules/sync/sync.py", line 98, in auto_sync
    emaint_opts=options)
  File "/usr/lib64/python3.4/site-packages/portage/emaint/modules/sync/sync.py", line 232, in _sync
    sync_scheduler.wait()
  File "/usr/lib64/python3.4/site-packages/_emerge/AsynchronousTask.py", line 54, in wait
    self._wait()
  File "/usr/lib64/python3.4/site-packages/portage/util/_async/AsyncScheduler.py", line 81, in _wait
    self._event_loop.iteration()
  File "/usr/lib64/python3.4/site-packages/portage/util/_eventloop/EventLoop.py", line 270, in iteration
    if not x.callback(f, event, *x.args):
  File "/usr/lib64/python3.4/site-packages/_emerge/PipeReader.py", line 80, in _output_handler
    self.wait()
  File "/usr/lib64/python3.4/site-packages/_emerge/AsynchronousTask.py", line 57, in wait
    self._wait_hook()
  File "/usr/lib64/python3.4/site-packages/_emerge/AsynchronousTask.py", line 175, in _wait_hook
    self._exit_listener_stack.pop()(self)
  File "/usr/lib64/python3.4/site-packages/portage/util/_async/AsyncFunction.py", line 61, in _async_func_reader_exit
    self.wait()
  File "/usr/lib64/python3.4/site-packages/_emerge/AsynchronousTask.py", line 57, in wait
    self._wait_hook()
  File "/usr/lib64/python3.4/site-packages/_emerge/AsynchronousTask.py", line 175, in _wait_hook
    self._exit_listener_stack.pop()(self)
  File "/usr/lib64/python3.4/site-packages/portage/sync/controller.py", line 386, in _sync_task_exit
    self.sync_callback(self.sync_task)
  File "/usr/lib64/python3.4/site-packages/portage/sync/controller.py", line 356, in _sync_callback
    porttrees=[repo.location])
  File "/usr/lib64/python3.4/site-packages/portage/metadata.py", line 152, in action_metadata
    if not (dest[dest_chf_key] == src[dest_chf_key] and \
KeyError: '_md5_'
Comment 9 Zac Medico gentoo-dev 2016-07-23 22:01:49 UTC
(In reply to Coacher from comment #8)
> in action_metadata
>     if not (dest[dest_chf_key] == src[dest_chf_key] and \
> KeyError: '_md5_'

I'm looking into it now. Hopefully I'll have a patch ready soon.
Comment 10 Zac Medico gentoo-dev 2016-07-23 23:10:42 UTC
(In reply to Coacher from comment #8)
>   File "/usr/lib64/python3.4/site-packages/portage/metadata.py", line 152,
> in action_metadata
>     if not (dest[dest_chf_key] == src[dest_chf_key] and \
> KeyError: '_md5_'

Fixed now:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=bb2f061345fa487061e90922707aab2ddb4b1687
Comment 11 Coacher 2016-07-24 10:50:44 UTC
I can confirm that my problem is gone. Thank you very much for such a quick fix.

Please note that this problem didn't occur on another machine with FEATURES=metadata-transfer and the portdbapi.auxdbmodule = portage.cache.sqlite.database setting.
It occurred only on the machine with FEATURES=metadata-transfer and no special portdbapi.auxdbmodule setting.
Comment 12 Zac Medico gentoo-dev 2016-07-24 11:06:42 UTC
(In reply to Coacher from comment #11)
> I can confirm that my problem is gone. Thank you very much for such a quick
> fix.

Great, thanks for testing.

> Please note that this problem didn't occur on another machine with
> FEATURES=metadata-transfer and portdbapi.auxdbmodule =
> portage.cache.sqlite.database setting.
> Only on the machine with FEATURES=metadata-transfer and without any special
> portdbapi.auxdbmodule setting.

Yeah, that's because our sqlite cache module returns an empty string instead of raising KeyError.
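
To illustrate the difference, here is a small sketch with two stand-in entry types: a plain dict, which raises KeyError on a missing key, and a hypothetical `LenientEntry` that returns an empty string instead, mimicking the behaviour described for the sqlite module. The class and function names are invented for this example, and `same_hash` is only one possible defensive comparison, not necessarily what the fix commit does.

```python
class LenientEntry(dict):
    """Stand-in for an entry that yields '' for a missing key."""

    def __missing__(self, key):
        return ""


def naive_compare(dest, src, chf_key="_md5_"):
    """Index both entries directly, as in the line shown in the traceback."""
    return dest[chf_key] == src[chf_key]


def same_hash(dest, src, chf_key="_md5_"):
    """Compare hashes without assuming the key exists in either entry."""
    dest_hash = dest.get(chf_key) or None
    src_hash = src.get(chf_key) or None
    return dest_hash is not None and dest_hash == src_hash
```

With a plain dict that lacks `_md5_`, `naive_compare` raises KeyError, while the lenient entry quietly compares `''` against the real digest and returns False. `same_hash` treats a missing or empty hash as "no match" for both kinds of entry.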
Comment 13 Oleh 2016-09-28 05:37:48 UTC
This speeds up metadata generation by a very large margin. Looks like a good improvement.
Comment 14 Zac Medico gentoo-dev 2017-02-10 18:46:19 UTC
Fixed in portage-2.3.3.