Bug 333773 - cvs.eclass: ECVS_VERSION specifying 'global revision'-like tag
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Eclasses
Hardware: All Linux
Importance: High enhancement
Assignee: SpanKY
Reported: 2010-08-21 10:03 UTC by Michał Górny
Modified: 2010-08-24 21:00 UTC

Description Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-21 10:03:37 UTC
It would be really useful for app-portage/smart-live-rebuild (bug #310975) if cvs.eclass could export some kind of 'global revision' variable like svn, git and other VCSes do.

I suggest naming the var ECVS_VERSION (similarly to other VCS eclasses). To keep it simple, the value could be some kind of 'last change date' (not sure if that would notice file removals though) or something similar.
Comment 1 SpanKY gentoo-dev 2010-08-21 18:26:11 UTC
the problem with cvs is that it has no repo-level information.  individual files have dates/revs, but that is it.  there would be no way to find this out without querying every file in the entire CVS tree and finding out the latest changed date across them all.

that means you're basically left with the checkout date, and that's no different from using `date` or something similar.  if you have any clever ideas, feel free to re-open.
Comment 2 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-21 19:23:09 UTC
Hm, I see it's harder than I thought. How about simply writing down mtime of the checkout dir there? I think that would be enough to detect whether the tree was changed since last merge. That approach seems to work for me, and it is quite simple to implement and lightweight too.
Comment 3 SpanKY gentoo-dev 2010-08-21 19:33:36 UTC
i think mtime of a dir only directly reflects the files in it.  so you'd need again a recursive walk of the whole CVS tree.
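
The point about directory mtimes can be checked with a small experiment. This is only a sketch, assuming GNU touch and stat; the temporary tree and file names are made up for illustration.

```sh
#!/bin/sh
# A directory's mtime changes only when entries directly inside it are
# added, removed or renamed. Changes deeper in the tree do not touch it.
tmp=$(mktemp -d)
mkdir -p "$tmp/tree/sub"
echo old > "$tmp/tree/sub/file"

# Pin the top-level directory to a known timestamp.
touch -d '2000-01-01 00:00:00 UTC' "$tmp/tree"
before=$(stat -c %Y "$tmp/tree")

# Rewrite a file one level down, roughly as 'cvs up' might.
echo new > "$tmp/tree/sub/file"
after_nested=$(stat -c %Y "$tmp/tree")

# Create a file directly in the top-level directory.
: > "$tmp/tree/added"
after_direct=$(stat -c %Y "$tmp/tree")

if [ "$before" = "$after_nested" ]; then
    echo "nested change: top-level mtime unchanged"
fi
if [ "$before" != "$after_direct" ]; then
    echo "direct change: top-level mtime updated"
fi
rm -rf "$tmp"
```

So a single stat of the checkout root would miss updates that only affect subdirectories.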

the CVS/Entries files should contain all the rev info for particular files.  so it shouldnt be too bad unless you have a large cvs tree ?  do we really have any ebuilds using cvs.eclass anymore ?

trouble then becomes a sorting one ...
find -ipath '*/CVS/Entries' -exec sed -r 's|^/.*/.*/(.*)/.*/$|\1|' {} + \
    | sort -n -k 5
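
One possible way around the sorting trouble is to convert each timestamp to seconds since the epoch before sorting. The following is a rough sketch rather than anything from the eclass: latest_entry_epoch is a hypothetical name, GNU find and GNU date are assumed, and the Entries timestamps are assumed to be UTC.

```sh
#!/bin/sh
# Hypothetical helper: print the newest file timestamp recorded in any
# CVS/Entries file below $1, as seconds since the epoch.
latest_entry_epoch() (
    find "${1:-.}" -type f -path '*/CVS/Entries' -exec cat {} + |
        # File entries look like /name/revision/timestamp/options/tagdate;
        # directory entries start with "D" and are skipped here.
        sed -n 's|^/[^/]*/[^/]*/\([^/]*\)/.*|\1|p' |
        while IFS= read -r ts; do
            [ -n "$ts" ] || continue
            # Fields that are not dates (e.g. "dummy timestamp") make
            # date fail and are silently dropped.
            date -u -d "$ts" +%s 2>/dev/null
        done |
        sort -n |
        tail -n 1
)
```

Calling `latest_entry_epoch /path/to/checkout` would then print a single number that only grows when a newer file timestamp shows up. It still reads every Entries file, so the recursive walk is not avoided.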
Comment 4 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-21 20:28:50 UTC
Another thing worth considering might be 'cvs history'; it seems to give quite nice results, but I find most of them really hard to understand.
Comment 5 SpanKY gentoo-dev 2010-08-21 21:10:36 UTC
i'm not sure we can rely on the history file.  a lot of projects nuke it and/or symlink it to /dev/null due to its easily excessive expansion.  ive seen that thing hit hundreds of megabytes before.
Comment 6 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-21 21:21:30 UTC
To sum up, we have three possibilities now:
1) simple stat of the directory as suggested by me -- might not be accurate but easy and seems enough for simple update checks,
2) CVS/Entries mangling -- would have to implement some date parsing/sorting,
3) 'cvs log' mangling -- awfully slow but sortable (8s for a simple project).

I think I would personally still go with 1). What I need exactly is to be able to guess (not necessarily correctly) whether anything has changed between the last time the package was merged (i.e. when the envvar was written to environment.bz2) and the current checkout state (i.e. whether any 'cvs up' call updated files). That seems achievable with my idea, without requiring much effort or wasting the user's time.
Comment 7 David Leverton 2010-08-22 13:52:34 UTC
(In reply to comment #0)
> It would be really useful for app-portage/smart-live-rebuild (bug #310975) if
> cvs.eclass could export some kind of 'global revision' variable like svn, git
> and other VCSes do.

I assume you just want some value which changes when any file has been updated, it doesn't have to correspond to a revision number or timestamp or anything?  In which case:

(In reply to comment #6)
> 2) CVS/Entries mangling -- would have to implement some date parsing/sorting,

wouldn't really need any date manipulation, it should be enough to take a hash of the relevant information (probably the path/filenames with the revision number of each file, sorted to make sure it's deterministic).
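
As a minimal sketch of that idea: the function below hashes sorted "path revision" pairs and ignores the timestamps entirely. cvs_revision_hash is a hypothetical name, sha1sum is assumed to be available, and paths containing newlines are not handled.

```sh
#!/bin/sh
# Hypothetical helper: hash the (path, revision) pair of every file that
# is listed in a CVS/Entries file below $1.
cvs_revision_hash() (
    cd "${1:-.}" || exit 1
    find . -type f -path '*/CVS/Entries' |
        while IFS= read -r entries; do
            dir=${entries%/CVS/Entries}
            # File entries: /name/revision/timestamp/options/tagdate
            awk -F/ -v dir="$dir" '/^\// { print dir "/" $2 " " $3 }' "$entries"
        done |
        # Sort in the C locale so the result does not depend on the order
        # in which find visits files or on the user's locale.
        LC_ALL=C sort |
        sha1sum |
        cut -d' ' -f1
)
```

The value changes when a file is added, removed or gets a new revision, and stays the same when only the order of lines inside an Entries file changes.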
Comment 8 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-22 18:07:32 UTC
(In reply to comment #7)
> (In reply to comment #6)
> > 2) CVS/Entries mangling -- would have to implement some date parsing/sorting,
> 
> wouldn't really need any date manipulation, it should be enough to take a hash
> of the relevant information (probably the path/filenames with the revision
> number of each file, sorted to make sure it's deterministic).

We'd still need to iterate over every CVS/Entries file there, and rely on some external hashing tool. But it seems worth considering to me.
Comment 9 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-22 18:48:57 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #6)
> > > 2) CVS/Entries mangling -- would have to implement some date parsing/sorting,
> > 
> > wouldn't really need any date manipulation, it should be enough to take a hash
> > of the relevant information (probably the path/filenames with the revision
> > number of each file, sorted to make sure it's deterministic).
> 
> We'd still need to iterate over every CVS/Entries file there, and rely on some
> external hashing tool. But seems considerable to me.

While trying to implement that, I ran into a serious problem. The find tool iterates over files in a pretty random order (i.e. the filesystem order), and I think we can't guarantee that CVS entries are ordered in any particular way either (updated files are moved to the bottom).

Unless we implement a lot of sorting everywhere, this makes the hashing idea as reliable as my simple checkout directory timestamp check.

I'm reopening the bug to hopefully get some more discussion on the topic.
Comment 10 SpanKY gentoo-dev 2010-08-22 19:48:51 UTC
it isnt random at all.  piping the output into sort is trivial.

if that is acceptable, then it's easy to add:
export ECVS_VERSION=`find -ipath '*/CVS/Entries' -exec cat {} + | sort | sha1sum`
Comment 11 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-22 20:18:52 UTC
(In reply to comment #10)
> it isnt random at all.  piping the output into sort is trivial.
> 
> if that is acceptable, then it's easy to add:
> export ECVS_VERSION=`find -ipath '*/CVS/Entries' -exec cat {} + | sort |
> sha1sum`

Not exactly. The Entries file doesn't contain the full paths to the files. But I think we could prepend each line of it with them and then sort. My suggestion would be then:

export ECVS_VERSION=$(find -type d -name CVS -prune -exec sed -n -e 's;^/;{}:;p' {}/Entries \; | sort | sha1sum | cut -d' ' -f1)

The list consists then only of the file entries, of the form:

./CVS:black_720x576.mpg/1.1/Sat Jun  3 09:44:41 2006//
./CVS:config.c/1.94/Sun May 30 23:24:12 2010//

The 'CVS' part is unnecessary/incorrect/whatever and first slash is eaten but I don't think it really matters. The formula is POSIX-compliant, except for the sha1sum call.
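
For comparison, here is a sketch of a stricter variant. It assumes that a find invoked without a starting path, and a {} embedded inside a longer argument, may not behave the same on every implementation, so it passes the directories to a small sh script instead. cvs_entries_checksum is a hypothetical name, and POSIX cksum stands in for sha1sum, which makes the result a much weaker 32-bit CRC.

```sh
#!/bin/sh
# Hypothetical helper: checksum of all file entries below $1, each line
# prefixed with the CVS directory it came from.
cvs_entries_checksum() (
    cd "${1:-.}" || exit 1
    find . -type d -name CVS -prune -exec sh -c '
        for d do
            [ -f "$d/Entries" ] || continue
            while IFS= read -r line; do
                # Keep file entries only (they start with "/") and drop
                # that leading slash, as in the sed expression above.
                case $line in
                    /*) printf "%s:%s\n" "$d" "${line#/}" ;;
                esac
            done < "$d/Entries"
        done' sh {} + |
        LC_ALL=C sort |
        cksum |
        cut -d" " -f1
)
```

Running it from inside the checkout root keeps the prefixes relative (./CVS:...), so the value does not depend on where the checkout lives on disk.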
Comment 12 SpanKY gentoo-dev 2010-08-22 20:28:00 UTC
i dont think the full path is necessary.  the likelihood of getting a file in two different paths with the same name, rev, and timestamp (down to the second) is small enough to not worry about it.
Comment 13 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-22 20:34:14 UTC
(In reply to comment #12)
> i dont think the full path is necessary.  the likelihood of getting a file in
> two different paths with the same name, rev, and timestamp (down to the second)
> is small enough to not worry about it.

I think that if we're doing something already, we should make it safe, especially since it doesn't cost much.
Comment 14 SpanKY gentoo-dev 2010-08-22 20:50:52 UTC
... yet you're ok with relying on a hash function to provide collision-free results.  sorry, that argument isnt going to fly.

ive committed this:
http://sources.gentoo.org/eclass/cvs.eclass?r1=1.73&r2=1.74
Comment 15 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2010-08-24 09:36:11 UTC
(In reply to comment #14)
> ... yet you're ok with relying on a hash function to provide collision-free
> results.  sorry, that argument isnt going to fly.
> 
> ive committed this:
> http://sources.gentoo.org/eclass/cvs.eclass?r1=1.73&r2=1.74

One more request. Could you add LC_ALL=C to the sort call?
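
To illustrate why the locale matters here: sort orders lines by the collation rules of the current locale, so the same Entries lines can reach sha1sum in a different order for different users. A small sketch follows; the Makefile entry is made up for the example.

```sh
#!/bin/sh
# Two Entries-style lines whose relative order depends on collation rules
# (uppercase "M" vs lowercase "c").
lines='/config.c/1.94/Sun May 30 23:24:12 2010//
/Makefile/1.3/Sun May 30 23:24:12 2010//'

# Pinned to the C locale: plain byte order, the same on every system.
pinned=$(printf '%s\n' "$lines" | LC_ALL=C sort | sha1sum | cut -d' ' -f1)

# Inherits the caller's locale: the order, and so the hash, may differ.
inherited=$(printf '%s\n' "$lines" | sort | sha1sum | cut -d' ' -f1)

first=$(printf '%s\n' "$lines" | LC_ALL=C sort | head -n 1)
echo "first line in C order: $first"
echo "pinned:    $pinned"
echo "inherited: $inherited"
```

In the C locale the Makefile line comes first because "M" has a lower byte value than "c". Whether the two hashes differ depends on the locale the script runs in.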
Comment 16 SpanKY gentoo-dev 2010-08-24 21:00:17 UTC
indeed; i should have seen that too.  thanks.

http://sources.gentoo.org/eclass/cvs.eclass?r1=1.74&r2=1.75