Bug 261380 - reduced manifests w/ git
Summary: reduced manifests w/ git
Status: RESOLVED DUPLICATE of bug 333691
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-03-05 22:47 UTC by Caleb Cushing
Modified: 2011-09-20 08:23 UTC
CC List: 6 users

See Also:
Package list:
Runtime testing required: ---


Description Caleb Cushing 2009-03-05 22:47:00 UTC
With git, Manifest entries aren't needed for any files in the tree/overlay, since git already tracks a SHA-1 for each of them. The Manifest should only cover distfiles.

programs that will need updating

repoman/ebuild to generate dist only manifest

cache-tools, to make sure that they don't yell about missing manifests

portage, to check the git sha1's instead of the 

my silly pseudo code:

if (using_git) then
   # git diff --quiet compares the file against the blob in the repository
   # (by SHA-1) and exits 0 if it matches HEAD, 1 if it differs.
   check_git_sha1 = exit status of: git diff --quiet HEAD -- $package.ebuild
   if ( check_git_sha1 == 1 )
       error "file does not match git HEAD" (or some better error message)
       exit 1
   check_manifest_dist
       # now go ahead and check the DIST section of the Manifest files
else
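
For illustration only, the git branch of the pseudo code above could be sketched in Python roughly as follows. The function names (ebuild_matches_head, dist_entries) and the injectable run parameter are hypothetical and not part of portage; the sketch assumes the usual Manifest layout in which each line starts with its entry type (DIST, EBUILD, AUX, MISC). Note that git diff does not report untracked files, so a real check would need to handle those separately.

```python
import subprocess


def ebuild_matches_head(ebuild_path, repo_dir=".", run=subprocess.run):
    """Return True if ebuild_path is identical to its blob in HEAD.

    `git diff --quiet` exits 0 when there is no difference and 1 when
    there is one; any other exit status is treated as a git failure.
    `run` is injectable (hypothetical parameter) so the check can be
    exercised without a real repository.
    """
    result = run(
        ["git", "diff", "--quiet", "HEAD", "--", ebuild_path],
        cwd=repo_dir,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError("git diff failed with status %d" % result.returncode)
    return result.returncode == 0


def dist_entries(manifest_text):
    """Keep only the DIST lines of a Manifest (the distfile digests)."""
    return [
        line
        for line in manifest_text.splitlines()
        if line.split(" ", 1)[0] == "DIST"
    ]
```

A caller would refuse to proceed when ebuild_matches_head() returns False, and otherwise verify only the lines returned by dist_entries().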

Other notes: how to get the actual hash.


 # hash of 'Makefile' in the most recent commit on the current branch
 $ git rev-parse HEAD:Makefile
 27b9569746179e68c635bdaab8e57395f63faf01

 # hash of 'Makefile' in the index
 $ git rev-parse :Makefile
 27b9569746179e68c635bdaab8e57395f63faf01

 # hash of 'Makefile' in some arbitrary revision
 $ git rev-parse v1.5.1:Makefile
 b159ffd0ae49c28725de6549132e0ad3a3b69d20

And you can compute the git blob hash of any file using git hash-object:

 $ git hash-object --stdin < Makefile
 27b9569746179e68c635bdaab8e57395f63faf01
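
As a rough sketch, the same blob hash can be computed without spawning git at all: a git blob id is the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the raw file content. The function names below are hypothetical, and the sketch assumes a repository using SHA-1 object ids and no content filters (it matches the --stdin form shown above, which hashes the bytes as-is).

```python
import hashlib


def git_blob_sha1(data):
    """Return the id git assigns to a blob holding the bytes `data`."""
    # Header is b"blob <size>" followed by a single NUL byte.
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def git_blob_sha1_of_file(path):
    """Hash a file's raw bytes the way `git hash-object --stdin` does."""
    with open(path, "rb") as f:
        return git_blob_sha1(f.read())
```

For example, git_blob_sha1(b"") gives e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known id of the empty blob.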

per prior conversation w/ zmedico

Reproducible: Always
Comment 1 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2009-03-05 22:53:39 UTC
As an FYI, this has been discussed on the gentoo-scm ML. You should join and contribute there as well, obviously you have experience with it ;) Plz check the archives first before bringing up any old topics.
Comment 2 Caleb Cushing 2009-03-05 23:11:02 UTC
That's been pointed out previously. I reviewed the archive and am on the list, although it doesn't seem to get much traffic (or something is wrong with my subscription). I'm not aware that a conclusion has been reached.
Comment 3 Zac Medico gentoo-dev 2009-03-08 20:55:54 UTC
(In reply to comment #0)
>  # hash of 'Makefile' in the most recent commit on the current branch
>  $ git rev-parse HEAD:Makefile
>  27b9569746179e68c635bdaab8e57395f63faf01

It will be useful to have a shared library for this (with python bindings), since it's relatively expensive to spawn a process like that. Having quick access to the digests (as the manifest currently provides) will allow quick validation of new metadata cache entries which will contain digests as described here:

http://archives.gentoo.org/gentoo-dev/msg_cfa80e33ee5fa6f854120ddfb9b468b3.xml

Without quick access to ebuild digests, in order to validate metadata cache we'll have to compute ebuild digests during dependency calculations and that will hurt performance.
Comment 4 Mike Auty (RETIRED) gentoo-dev 2009-03-08 21:00:41 UTC
Of interest might be dulwich [1] and gitpython [2] (there are ebuilds for both in my overlay).  Both are pure-python git libraries (although it's not clear exactly how much each implements).  Still, they're a starting point...  5:)

[1] https://launchpad.net/dulwich
[2] http://pypi.python.org/pypi/GitPython/
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-03-09 01:57:41 UTC
Everybody here, can we please continue this discussion explicitly on the -scm mailing list? We're trying to keep everything together in one place, so that anybody wanting to review the progress doesn't need to follow much in the way of links.

zmedico:
access to the Git index is certainly possible without the exec. What isn't avoidable in any case is checking that the ebuilds haven't changed (which git does via lstat for the most part).

For the short term, however, I thought I had noted on the -scm mailing list that you can list the SHA1 ids of the entire tree (or any subset) with a single exec() call using 'git ls-files --stage'. Adding '-d -m -o -k -t -v' is useful as it shows you potentially changed items as well, and '-u' beyond that shows only the unmerged files.
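
A minimal sketch of the single-exec approach, assuming Python and the NUL-terminated form of the output ('-z') so that unusual file names parse safely. The helper names are hypothetical; each record from 'git ls-files --stage -z' has the form "<mode> <sha1> <stage><TAB><path>".

```python
import subprocess


def parse_ls_files_stage(output):
    """Parse `git ls-files --stage -z` output into {path: sha1}.

    Each NUL-terminated record is "<mode> <sha1> <stage><TAB><path>".
    Only stage 0 (normal, unconflicted) entries are kept.
    """
    digests = {}
    for record in output.split(b"\0"):
        if not record:
            continue
        meta, path = record.split(b"\t", 1)
        mode, sha1, stage = meta.split(b" ")
        if stage == b"0":
            name = path.decode("utf-8", "surrogateescape")
            digests[name] = sha1.decode("ascii")
    return digests


def index_digests(repo_dir=".", paths=()):
    """Collect index SHA-1s for the whole tree (or `paths`) in one exec."""
    out = subprocess.run(
        ["git", "ls-files", "--stage", "-z", "--", *paths],
        cwd=repo_dir,
        check=True,
        stdout=subprocess.PIPE,
    ).stdout
    return parse_ls_files_stage(out)
```

The resulting dictionary reflects the index, not the working tree, so it would still need to be paired with a check that the files on disk are unmodified.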
Comment 6 Sebastian Luther (few) 2011-09-20 08:23:55 UTC

*** This bug has been marked as a duplicate of bug 333691 ***