Bug 333705 - validation of git tree created by conversion process
Summary: validation of git tree created by conversion process
Status: RESOLVED OBSOLETE
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Unspecified
Hardware: All Linux
Importance: High normal with 3 votes
Assignee: Robin Johnson
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 333531
Reported: 2010-08-20 20:19 UTC by Thilo Bangert (RETIRED)
Modified: 2024-01-27 08:48 UTC
CC List: 9 users

See Also:
Package list:
Runtime testing required: ---


Description Thilo Bangert (RETIRED) gentoo-dev 2010-08-20 20:19:24 UTC
It must be verified that the conversion process produces a git repository which is equivalent to the original.

Last status:
> Still needs work; I hope to work on it myself in February.
Comment 1 Alex Bennee 2010-08-24 09:12:17 UTC
FYI: I have some experience in maintaining a CVS->GIT conversion at work. Due to the size of the tree, the conventional cvsps-based import is incredibly slow, so I needed to resurrect Keith Packard's parsecvs (http://github.com/stsquad/parsecvs) and tweak it. It does require direct access to the CVS repo, though. If cvsps is working for you, you should probably stick with it.
Comment 2 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-08-24 16:43:18 UTC
bennee:
We're using a modified cvs2svn; the config and timing results have been posted on the -scm list. A full conversion is down to a couple of hours now, but we still need validation of the converted tree.
Comment 3 Andreas K. Hüttel archtester gentoo-dev 2012-01-14 15:44:38 UTC
Any news here?
Comment 4 Dirkjan Ochtman (RETIRED) gentoo-dev 2012-05-22 09:46:06 UTC
Does anyone have any scripts/ideas/strategy for validation yet?
Comment 5 Michael Weber (RETIRED) gentoo-dev 2012-05-23 12:17:23 UTC
I see two aspects (please add anything I missed):

1) We have some files in gentoo-x86 (eclasses, licenses, ebuilds).
These can be diffed non-destructively with a raw diff -ru, or
destructively after stripping headers with something like `sed -e '1,3s:^#.*::' -i` for all tagged files.

Problem: Headers on ebuilds should not be altered, for (signed) manifest integrity.

==> We can detect missing/changed files from a user's point of view.
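
A minimal sketch of the file-level comparison described above, assuming both the CVS checkout and the git checkout are available as local directories. The function names (`strip_header`, `list_files`, `compare_trees`) are hypothetical and not part of any existing validation script. Header stripping is optional and only applied in memory, so the files on disk (and their manifests) stay untouched.

```python
import os


def strip_header(data: bytes, header_lines: int = 3) -> bytes:
    """Blank out comment lines among the first `header_lines` lines."""
    lines = data.split(b"\n")
    for i in range(min(header_lines, len(lines))):
        if lines[i].startswith(b"#"):
            lines[i] = b""
    return b"\n".join(lines)


def list_files(root, skip_dirs=("CVS", ".git")):
    """Map relative path -> absolute path for every file under root."""
    found = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune VCS metadata directories in place so os.walk skips them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            full = os.path.join(dirpath, name)
            found[os.path.relpath(full, root)] = full
    return found


def compare_trees(cvs_root, git_root, strip=False):
    """Return (missing_in_git, extra_in_git, changed) as sorted lists."""
    cvs_files = list_files(cvs_root)
    git_files = list_files(git_root)
    missing = sorted(set(cvs_files) - set(git_files))
    extra = sorted(set(git_files) - set(cvs_files))
    changed = []
    for rel in sorted(set(cvs_files) & set(git_files)):
        with open(cvs_files[rel], "rb") as fa, open(git_files[rel], "rb") as fb:
            a, b = fa.read(), fb.read()
        if strip:
            a, b = strip_header(a), strip_header(b)
        if a != b:
            changed.append(rel)
    return missing, extra, changed
```

Running it once with `strip=False` and once with `strip=True` separates files that differ only in their header comments from files whose content differs.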

2) We have a lot of changelogs for (deleted) files that we don't actually use/trust/want to look at (we do keep Changelog files for now).

We can either review the cvs2git algorithm and spot-check some `cvs log` vs. `git log` output, or we can write wrappers that do so on all data (or a random subset) by parsing `cvs log` and `git log` and making a fuzzy comparison.

Anything I missed?
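
A rough sketch of what the fuzzy log comparison could look like, assuming the `cvs log` and `git log` output has already been parsed into simple records (the parsing itself is left out here). `LogEntry`, `entries_match`, `unmatched`, and the tolerance values are all hypothetical choices for illustration. It matches entries one-to-one, which is a simplification if the conversion merges several CVS revisions into one git commit.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class LogEntry:
    author: str
    timestamp: int  # seconds since the epoch, UTC
    message: str


def normalize(message: str) -> str:
    """Collapse all whitespace runs so formatting differences are ignored."""
    return " ".join(message.split())


def entries_match(cvs, git, max_skew=300, min_ratio=0.9):
    """Fuzzy match: same author, close timestamps, similar messages."""
    if cvs.author != git.author:
        return False
    if abs(cvs.timestamp - git.timestamp) > max_skew:
        return False
    ratio = SequenceMatcher(
        None, normalize(cvs.message), normalize(git.message)
    ).ratio()
    return ratio >= min_ratio


def unmatched(cvs_entries, git_entries, **tolerances):
    """Return (cvs entries without a git match, git entries left over)."""
    remaining = list(git_entries)
    missing = []
    for cvs in cvs_entries:
        for i, git in enumerate(remaining):
            if entries_match(cvs, git, **tolerances):
                del remaining[i]  # each git entry may be matched only once
                break
        else:
            missing.append(cvs)
    return missing, remaining
```

Both returned lists being empty means every parsed CVS entry found a counterpart; anything left over is a candidate for manual inspection rather than proof of a conversion bug.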
Comment 6 Dirkjan Ochtman (RETIRED) gentoo-dev 2012-05-25 11:12:45 UTC
Is there infra that I could get access to if I were going to work on this? Similarly, are the modified cvs2svn and other scripts available somewhere?
Comment 7 Richard Freeman gentoo-dev 2013-01-02 23:16:28 UTC
(In reply to comment #6)
> Is there infra that I could get access to if I were going to work on this?
> Similarly, are the modified cvs2svn and other scripts available somewhere?

ferringb would need to comment on the actual scripts.  You can find a cvs snapshot and corresponding git bundle in d.g.o:/space/git-work.  If any non-devs intend to actually contribute I could probably mirror this elsewhere.
Comment 8 Herbert Wantesh 2013-11-05 00:05:55 UTC
any non-devs still needed? i could help
Comment 9 Richard Freeman gentoo-dev 2013-11-05 01:22:23 UTC
(In reply to puchu from comment #8)
> any non-devs still needed? i could help

Robin - what are your thoughts?  At this point the actual migration routine seems to work fine.  Certainly I can take another look at it if you make any changes or want to confirm that things are still fine.  

I don't think we're likely to be able to do complete real-time validation of a migrated tree to be honest.  There is just way too much commit manipulation in order to combine manifests/etc into single commits.  

I don't think this is really what is holding up the migration at this point.  Should we consider closing this?
Comment 10 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2013-11-05 03:48:53 UTC
Leave it open for me please, I'm hoping to commit a lot of time in the next two months to it.
Comment 11 Richard Freeman gentoo-dev 2013-11-05 03:51:31 UTC
(In reply to Robin Johnson from comment #10)
> Leave it open for me please, I'm hoping to commit a lot of time in the next
> two months to it.

If you need anything from me let me know.  I have my routines scripted to run in parallel on a single host now, but they're more designed to manually scan for interesting patterns and less about detecting any issue.  Their main advantage is that they're implemented in a completely different way than the migration process, so they are a good independent test.
Comment 12 Steve Arnold archtester gentoo-dev 2013-12-11 05:33:03 UTC
Where are we on this so far?  Have the final "tweaks" been made to the cvs export scripts?  If so, are people happy with the results, or is there more V&V work to do?  I have a little IV&V background, plus I've written some build verification stuff at work.  Is there anything V&V-ish left I can help with?
Comment 13 Brian Harring (RETIRED) gentoo-dev 2014-02-21 15:17:04 UTC
The conversion scripts have been refreshed/fixed for cvs2svn config incompatibilities; that work exists at git://pkgcore.org/git-conversion-tools/ .  Additionally, I've got a conversion running on packages.gentooexperimental.org; I need to refresh it, but it looks like we can restart the validation process and ensure the scripts are fine.

Rich0; got time to resume your end of this work now?
Comment 14 Richard Freeman gentoo-dev 2014-02-21 15:55:31 UTC
(In reply to Brian Harring from comment #13)
> 
> Rich0; got time to resume your end of this work now?

Absolutely, and I'm happy to involve anybody who wants to be (I try to post what I'm doing on -scm, in bugs, and in the git repo README in any case).

Just email me with the location of a matching cvs snapshot (server-side) and git bundle at any time if you want me to take a look at it.
Comment 15 Brian Harring (RETIRED) gentoo-dev 2014-02-22 06:01:50 UTC
Rich0; can you check over http://wiki.gentoo.org/wiki/Project:Infrastructure/Git_migration and ensure the 'how to' for running your validation scripts is documented either there, or in the README?

Meanwhile, there is a new dump in dev.gentoo.org:/space/git-work/dumps/ ; both the cvs tree snapshot and .gitbundle are in there.
Comment 16 Richard Freeman gentoo-dev 2014-02-23 18:54:59 UTC
(In reply to Brian Harring from comment #15)
> Rich0; can you check over
> http://wiki.gentoo.org/wiki/Project:Infrastructure/Git_migration and ensure
> the 'how to' for running your validation scripts is documented either there,
> or in the README?
> 
> Meanwhile, there is a new dump in dev.gentoo.org:/space/git-work/dumps/ ;
> both the cvs tree snapshot and .gitbundle are in there.

I put in a VERY rough set of steps on the wiki.  If you have issues following it feel free to clean it up or ping me to clarify steps - it needs some work, but I wanted to at least get something out there.

I checked gentoo-x86-20140221-22:05.cvs.tar.bz2 vs gentoo-x86-20140221-22:05.gitbundle, and the keyword expansion isn't enabled in the git conversion, which means that just about all of the file hashes don't match.  

As an example git show 3953b8b3981c410f65cb45bda1c09da6168005f6:
# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header$

cvs co -p -r 1.5 gentoo-x86/app-accessibility/accerciser/accerciser-1.10.1.ebuild:
# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/app-accessibility/accerciser/Attic/accerciser-1.10.1.ebuild,v 1.5 2010/09/11 18:37:22 josejx Exp $

If I find other issues I'll post them, but since almost none of the hashes match, any more subtle problems will be lost.  I can look at authors and commit messages though.
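
One way to keep hash comparison useful despite the keyword mismatch, sketched under the assumption that collapsing expanded keywords is acceptable for validation purposes: normalize `$Header: ...$` back to `$Header$` on both sides before hashing. The helper names (`collapse_keywords`, `normalized_sha1`) are hypothetical. The keyword list is the standard RCS/CVS set, and the multi-line text that `$Log$` appends after the keyword line is not handled.

```python
import hashlib
import re

# Standard RCS/CVS keywords that may appear in expanded form.
KEYWORDS = (
    b"Author", b"Date", b"Header", b"Id", b"Locker", b"Log",
    b"Name", b"RCSfile", b"Revision", b"Source", b"State",
)

# Matches "$Keyword: anything-on-one-line $".
EXPANDED = re.compile(rb"\$(" + b"|".join(KEYWORDS) + rb"):[^$\n]*\$")


def collapse_keywords(data: bytes) -> bytes:
    """Rewrite every expanded keyword to its unexpanded "$Keyword$" form."""
    return EXPANDED.sub(rb"$\1$", data)


def normalized_sha1(data: bytes) -> str:
    """SHA-1 of the content with keyword expansion removed."""
    return hashlib.sha1(collapse_keywords(data)).hexdigest()


# The two header lines quoted in this comment hash identically once collapsed.
cvs_line = (
    b"# $Header: /var/cvsroot/gentoo-x86/app-accessibility/accerciser/"
    b"Attic/accerciser-1.10.1.ebuild,v 1.5 2010/09/11 18:37:22 josejx Exp $\n"
)
git_line = b"# $Header$\n"
same = normalized_sha1(cvs_line) == normalized_sha1(git_line)
```

This does not say which form the converted repository should contain; it only removes the expansion difference so that any other content differences show up in the hashes.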
Comment 17 Alex Xu (Hello71) 2014-06-24 23:01:35 UTC
I recall there being a bug on keyword expansion; barring that, what's stopping this from moving forward?
Comment 18 Richard Freeman gentoo-dev 2014-06-24 23:40:15 UTC
(In reply to Alex Xu (Hello71) from comment #17)
> I recall there being a bug on keyword expansion; barring that, what's
> stopping this from moving forward?

As far as I'm concerned the validation is done.  Certainly more could be done with it, but I'm not aware of any issues with the conversion routines at this point.
Comment 19 Brian Harring 2024-01-27 06:38:53 UTC
This can be invalid'd/closed.