337696 – support alternative cache formats besides existing $PORTDIR/metadata/cache format

Status:	CONFIRMED

Alias:	None

Product:	Portage Development
Classification:	Unclassified
Component:	Enhancement/Feature Requests
Hardware:	All Linux

Importance:	High enhancement
Assignee:	Portage team

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2010-09-16 19:13 UTC by Dennis Schridde
Modified:	2023-12-28 03:01 UTC
CC List:	0 users

See Also:	546536
Package list:
Runtime testing required:	---

Description Dennis Schridde 2010-09-16 19:13:47 UTC

portage/dbapi/porttree.py:309 seems to require the existence of metadata/cache before trying to create a pregen_auxdb from metadbmodule.

I would like to experiment with the metadata cache residing in an external database, without a metadata/cache directory. Hence it would be nice if the metadbmodule itself checked for this directory's existence, instead of portdbapi doing it. This would allow for other modules without that requirement.
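The request above can be illustrated with a minimal sketch in which each cache backend decides for itself whether it can serve a repository. The names DirectoryCache, ExternalCache, usable and pick_backend are hypothetical and are not part of Portage; only the metadata/cache path comes from the report.

```python
import os


class DirectoryCache:
    """Hypothetical backend that needs <repo>/metadata/cache on disk."""

    @classmethod
    def usable(cls, repo_path):
        # The directory check lives in the backend, not in the caller.
        return os.path.isdir(os.path.join(repo_path, "metadata", "cache"))


class ExternalCache:
    """Hypothetical backend storing its data outside the repository."""

    @classmethod
    def usable(cls, repo_path):
        # No filesystem layout is required inside the repository.
        return True


def pick_backend(repo_path, backends):
    """Return the first backend class that reports it can serve repo_path."""
    for backend in backends:
        if backend.usable(repo_path):
            return backend
    return None
```

With this arrangement the caller never looks at metadata/cache itself, so a backend without that requirement can be selected for a repository that lacks the directory.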

Reproducible: Always

Comment 1 Zac Medico gentoo-dev 2010-09-16 22:52:33 UTC

The metadata/cache directory and its contents are specified by PMS, so it seems like you are extending the specification. In this case we'll probably want to add some kind of metadata to indicate that an extension is in use, perhaps using metadata/layout.conf as suggested in bug #331683, comment #5.

Also, note that the portdbapi constructor currently passes the metadata/cache path to the portdbapi.metadbmodule constructor.

Since you're experimenting, maybe you should just create the metadata/cache directory in the target repository. You could even put your cache data somewhere in there if you wanted to.

Comment 2 Dennis Schridde 2010-09-17 00:16:35 UTC

(In reply to comment #1)
> The metadata/cache directory and its contents are specified by PMS, so it seems
> like you are extending the specification.
I am not trying to fiddle with the contents of that directory. I am merely trying to build/sync a pregenerated cache totally independent of this directory. Hence, to support different caching methods independent of a certain filesystem layout, I think it would be better for the metadbmodule to decide whether it can work on the given repository or not.

> Since you're experimenting, maybe you should just create the metadata/cache
> directory in the target repository.
Yes, true. But I found it "bad" to create a dummy directory just to make portage load the metadbmodule.

Comment 3 Zac Medico gentoo-dev 2010-09-17 01:52:37 UTC

(In reply to comment #2)
> I am not trying to fiddle with the contents of that directory. I am merely
> trying to build/sync a pregenerated cache totally independent of this
> directory. Hence, to support different caching methods independent of a certain
> filesystem layout, I think it would be better for the metadbmodule to decide
> whether it can work on the given repository or not.

So, you want portage to instantiate a bunch of different modules so that each module can probe to see if its cache format is available? Wouldn't it be a lot simpler if the repository specified the available formats in metadata/layout.conf?

Comment 4 Dennis Schridde 2010-09-17 02:27:29 UTC

(In reply to comment #3)
> Wouldn't it be a lot simpler if the repository specified the available formats
> in metadata/layout.conf?
There are instances where the original repository contains no metadata (for various reasons, e.g. because the maintainer does not care, or there is no infrastructure to reliably generate it, or because the repository comes directly from svn where metadata generation post-commit hooks are impractical, or ...).
So the user might want to generate their own metadata independently of the original repository.

Comment 5 Zac Medico gentoo-dev 2010-09-17 02:32:41 UTC

(In reply to comment #4)
> There are instances where the original repository contains no metadata (for
> various reasons, e.g. because the maintainer does not care, or there is no
> infrastructure to reliably generate it, or because the repository comes
> directly from svn where metadata generation post-commit hooks are impractical,
> or ...).
> So the user might want to generate their own metadata independently of the
> original repository.

If the user is going to the trouble of generating metadata, then at the same time they can generate a metadata/layout.conf entry specifying which cache format(s) have been generated.

Comment 6 Dennis Schridde 2010-09-17 02:46:25 UTC

(In reply to comment #5)
> If the user is going to the trouble of generating metadata, then at the same
> time they can generate a metadata/layout.conf entry specifying which cache
> format(s) have been generated.
Agreed. Does this mean that auxdbmodule is selected from /etc/portage/modules, and metadbmodule comes from $REPOS/metadata/layout.conf? Is a syntax for specifying it already defined? [1] suggests it is not; is that correct?

[1] http://dev.gentoo.org/~zmedico/portage/doc/man/portage.5.html

Comment 7 Zac Medico gentoo-dev 2010-09-17 02:54:29 UTC

Right, there's no syntax for it yet. Something like this should be fine:

  cache_formats = foo bar

The "foo" and "bar" should be format identifiers that are independent of the class that implements them. So, portage will have an internal mapping that tells it which module to load in order to read format "foo". This mapping could be specified in /etc/portage/modules with settings like these:

  portdbapi.metadbmodule.foo = portage.cache.metadata_foo.database
  portdbapi.metadbmodule.bar = portage.cache.metadata_bar.database
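A minimal sketch of the lookup described above, assuming both files use plain "key = value" lines. The helper names parse_key_value and select_cache_modules are hypothetical, and this is not how Portage itself parses these files; it only shows format identifiers from layout.conf being resolved through the proposed /etc/portage/modules mapping.

```python
def parse_key_value(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def select_cache_modules(layout_text, modules_text):
    """Map each advertised format to its implementing module, if configured.

    Formats without a mapping entry are skipped, and the order given in
    cache_formats is preserved.
    """
    layout = parse_key_value(layout_text)
    modules = parse_key_value(modules_text)
    prefix = "portdbapi.metadbmodule."
    selected = []
    for fmt in layout.get("cache_formats", "").split():
        dotted_path = modules.get(prefix + fmt)
        if dotted_path is not None:
            selected.append((fmt, dotted_path))
    return selected
```

Keeping the identifier ("foo") separate from the class path means a repository only advertises what it ships, while the local configuration decides which code reads it.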

Comment 8 Dennis Schridde 2010-09-17 09:15:13 UTC

(In reply to comment #7)
> Right, there's no syntax for it yet. Something like this should be fine:
> 
>   cache_formats = foo bar
When designing this, do not forget that some formats might require additional options, e.g. network addresses.

This seems to fit there:
---
cache_formats = foo

foo.url = ...
---

Or this:
---
[cache]
formats = foo

[foo]
url = ...
---
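A minimal sketch of reading the second (sectioned) variant with Python's standard configparser. The function name read_cache_options is hypothetical, and the sectioned layout is only the proposal made in this comment, not an existing layout.conf syntax.

```python
import configparser


def read_cache_options(text):
    """Return {format: {option: value}} for every format listed in [cache]."""
    # Interpolation is disabled so that '%' characters in URLs are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    formats = parser.get("cache", "formats", fallback="").split()
    options = {}
    for fmt in formats:
        # A format without its own section simply has no extra options.
        options[fmt] = dict(parser[fmt]) if parser.has_section(fmt) else {}
    return options
```

The per-format section keeps options such as a URL next to the format that needs them, without adding keys to the top level.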

Comment 9 Zac Medico gentoo-dev 2010-09-17 09:24:13 UTC

If the cache is distributed separately then you'll have to tag both the repo and the cache with UUIDs that you can compare to make sure that they correspond to the same snapshot in time.
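A minimal sketch of such a check, assuming each side carries its snapshot UUID as a text string. The function name cache_matches_repo is hypothetical, and where the tags would be stored is left open here.

```python
import uuid


def cache_matches_repo(repo_tag, cache_tag):
    """Return True only if both tags are valid UUIDs naming the same snapshot."""
    try:
        # Parsing normalizes case and surrounding whitespace before comparing.
        return uuid.UUID(repo_tag.strip()) == uuid.UUID(cache_tag.strip())
    except ValueError:
        # A malformed tag is treated as a mismatch rather than an error.
        return False
```

A mismatch would mean the separately distributed cache belongs to a different snapshot and should not be trusted for this repository state.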

Comment 10 Dennis Schridde 2012-08-20 21:21:52 UTC

I have seen md5-cache recently. Does it implement this?

Comment 11 Zac Medico gentoo-dev 2012-08-20 21:49:37 UTC

(In reply to comment #10)
> I have seen md5-cache recently. Does it implement this?

Yes, there's a cache-formats setting in metadata/layout.conf. The code is in pym/portage/repository/config.py.

Comment 12 Zac Medico gentoo-dev 2013-02-15 17:28:13 UTC

The metadbmodule configuration that you mention in comment #0 was removed when we switched to using the metadata/layout.conf cache-formats in this commit:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=d4ea29bf6a3ce35d49e0f54f9173e3a6e42da2d6

In stable portage (2.1.11.51), the md5-cache format has been the default since this commit:

http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=e760c8d2a4ccc56e351ac37904c715f596b58e42