It must be checked that the conversion process produces a git repository that is equivalent to the original. Last status: > Still needs work; I hope to work on it in February myself.
FYI: I have some experience maintaining a CVS->GIT conversion at work. Due to the size of the tree, the conventional cvsps-based import is incredibly slow, so I needed to resurrect Keith Packard's parsecvs (http://github.com/stsquad/parsecvs) and tweak it. It does require direct access to the CVS repo, though. If cvsps is working for you, you should probably stick with it.
bennee: We're using a modified cvs2svn, the config and timing results have been posted on the -scm list. It's down to a couple of hours for a full conversion now, but we need validation of that converted tree.
Any news here?
Does anyone have any scripts/ideas/strategy for validation yet?
I see two aspects (please add anything I miss):
1) We have some files in gentoo-x86 (eclasses, licenses, ebuilds). These can be diffed non-destructively with a raw diff -ru, or destructively after stripping headers with something like `sed -e '1,3s:^#.*::' -i` for all tagged files. Problem: headers on ebuilds should not be altered, for (signed) manifest integrity. ==> We can detect missing/changed files from a user's point of view.
2) We have a lot of changelogs on the (deleted) files that we don't actually use/trust/want to look at (we do keep Changelog files for now). We can either check the cvs2git algorithm and test some `cvs log` vs. `git log` output, or we can write wrappers to do so on all data (or a random subset) by parsing 'cvs log' and 'git log' and making a fuzzy comparison.
Anything I missed?
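A minimal sketch of the file-level check described in point 1, assuming both trees have already been checked out into plain directories. The function names, the three-line header window and the report layout are illustrative assumptions, not part of any existing validation script. It compares files in memory, so unlike `sed -i` it never modifies the ebuilds on disk.

```python
import os

HEADER_LINES = 3  # assumption: header comments only occur in the first three lines


def strip_header(data: bytes) -> bytes:
    """Blank out '#' comment lines among the first HEADER_LINES lines."""
    lines = data.split(b"\n")
    for i in range(min(HEADER_LINES, len(lines))):
        if lines[i].startswith(b"#"):
            lines[i] = b""
    return b"\n".join(lines)


def list_files(root: str) -> set:
    """Relative paths of all files under root, skipping VCS metadata directories."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune CVS/ and .git/ so only tree content is compared.
        dirnames[:] = [d for d in dirnames if d not in ("CVS", ".git")]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def read_bytes(path: str) -> bytes:
    with open(path, "rb") as fh:
        return fh.read()


def compare_trees(cvs_root: str, git_root: str, ignore_header: bool = False) -> dict:
    """Report files missing on either side and files whose content differs."""
    cvs_files = list_files(cvs_root)
    git_files = list_files(git_root)
    report = {
        "only_in_cvs": sorted(cvs_files - git_files),
        "only_in_git": sorted(git_files - cvs_files),
        "changed": [],
    }
    for rel in sorted(cvs_files & git_files):
        a = read_bytes(os.path.join(cvs_root, rel))
        b = read_bytes(os.path.join(git_root, rel))
        if ignore_header:
            a, b = strip_header(a), strip_header(b)
        if a != b:
            report["changed"].append(rel)
    return report
```

Running it once with `ignore_header=False` and once with `ignore_header=True` separates header-only differences from real content differences.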
Is there infra that I could get access to if I were going to work on this? Similarly, are the modified cvs2svn and other scripts available somewhere?
(In reply to comment #6) > Is there infra that I could access to if I were going to work on this? > Similarly, are the modified cvs2svn and other scripts available somewhere? ferringb would need to comment on the actual scripts. You can find a cvs snapshot and corresponding git bundle in d.g.o:/space/git-work. If any non-devs intend to actually contribute I could probably mirror this elsewhere.
any non-devs still needed? i could help
(In reply to puchu from comment #8) > any non-devs still needed? i could help Robin - what are your thoughts? At this point the actual migration routine seems to work fine. Certainly I can take another look at it if you make any changes or want to confirm that things are still fine. I don't think we're likely to be able to do complete real-time validation of a migrated tree to be honest. There is just way too much commit manipulation in order to combine manifests/etc into single commits. I don't think this is really what is holding up the migration at this point. Should we consider closing this?
Leave it open for me please, I'm hoping to commit a lot of time in the next two months to it.
(In reply to Robin Johnson from comment #10) > Leave it open for me please, I'm hoping to commit a lot of time in the next > two months to it. If you need anything from me let me know. I have my routines scripted to run in parallel on a single host now, but they're more designed to manually scan for interesting patterns and less about detecting any issue. Their main advantage is that they're implemented in a completely different way than the migration process, so they are a good independent test.
Where are we on this so far? Have the final "tweaks" been made to the cvs export scripts? If so, are people happy with the results, or is there more V&V work to do? I have a little IV&V background, plus I've written some build verification stuff at work. Is there anything V&V-ish left I can help with?
The conversion scripts have been refreshed/fixed for cvs2svn config incompatibilities; that work exists at git://pkgcore.org/git-conversion-tools/ . Additionally, I've got a conversion running on packages.gentooexperimental.org; I need to refresh it, but it looks like we can restart the validation process and ensure the scripts are fine. Rich0; got time to resume your end of this work now?
(In reply to Brian Harring from comment #13) > > Rich0; got time to resume your end of this work now? Absolutely, and I'm happy to involve anybody who wants to be (I try to post what I'm doing on -scm, in bugs, and in the git repo README in any case). Just email me with the location of a matching cvs snapshot (server-side) and git bundle at any time if you want me to take a look at it.
Rich0; can you check over http://wiki.gentoo.org/wiki/Project:Infrastructure/Git_migration and ensure the 'how to' for running your validation scripts is documented either there, or in the README? Meanwhile, there is a new dump in dev.gentoo.org:/space/git-work/dumps/ ; both the cvs tree snapshot and .gitbundle is in there.
(In reply to Brian Harring from comment #15)
> Rich0; can you check over
> http://wiki.gentoo.org/wiki/Project:Infrastructure/Git_migration and ensure
> the 'how to' for running your validation scripts is documented either there,
> or in the README?
>
> Meanwhile, there is a new dump in dev.gentoo.org:/space/git-work/dumps/ ;
> both the cvs tree snapshot and .gitbundle is in there.

I put in a VERY rough set of steps on the wiki. If you have issues following it, feel free to clean it up or ping me to clarify steps - it needs some work, but I wanted to at least get something out there.

I checked gentoo-x86-20140221-22:05.cvs.tar.bz2 vs gentoo-x86-20140221-22:05.gitbundle, and keyword expansion isn't enabled in the git conversion, which means that almost none of the file hashes match. As an example:

git show 3953b8b3981c410f65cb45bda1c09da6168005f6:
# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header$

cvs co -p -r 1.5 gentoo-x86/app-accessibility/accerciser/accerciser-1.10.1.ebuild:
# Copyright 1999-2010 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Header: /var/cvsroot/gentoo-x86/app-accessibility/accerciser/Attic/accerciser-1.10.1.ebuild,v 1.5 2010/09/11 18:37:22 josejx Exp $

If I find other issues I'll post them, but since the hashes don't match, any more subtle problems will be hidden. I can still look at authors and commit messages, though.
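One possible way to keep comparing content while keyword expansion differs is to collapse expanded CVS keywords back to their bare form before hashing. The sketch below is an illustration only, not the validation scripts mentioned in this bug. The function names are hypothetical, the keyword list follows the standard CVS keywords (`$Log$` is left out because its expansion spans several lines), and the hash is a plain SHA-1 of the normalised content, not a git blob id.

```python
import hashlib
import re

# Standard single-line CVS keywords; $Log$ is deliberately excluded.
KEYWORDS = (
    "Author", "Date", "Header", "Id", "Locker",
    "Name", "RCSfile", "Revision", "Source", "State",
)

# Matches an expanded keyword such as "$Header: /path/file,v 1.5 ... Exp $".
_EXPANDED = re.compile(
    rb"\$(" + b"|".join(k.encode("ascii") for k in KEYWORDS) + rb"):[^$\n]*\$"
)


def collapse_keywords(data: bytes) -> bytes:
    """Rewrite '$Keyword: value $' as '$Keyword$', leaving everything else alone."""
    return _EXPANDED.sub(rb"$\1$", data)


def content_hash(data: bytes) -> str:
    """SHA-1 of the content after keyword collapsing (not a git blob id)."""
    return hashlib.sha1(collapse_keywords(data)).hexdigest()


if __name__ == "__main__":
    cvs_line = (
        b"# $Header: /var/cvsroot/gentoo-x86/app-accessibility/accerciser/"
        b"Attic/accerciser-1.10.1.ebuild,v 1.5 2010/09/11 18:37:22 josejx Exp $\n"
    )
    git_line = b"# $Header$\n"
    print(content_hash(cvs_line) == content_hash(git_line))
```

With this normalisation, the two header lines from the example above hash identically, so any remaining mismatch would point at a real content difference.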
I recall there being a bug on keyword expansion; barring that, what's stopping this from moving forward?
(In reply to Alex Xu (Hello71) from comment #17) > I recall there being a bug on keyword expansion; barring that, what's > stopping this from moving forward? As far as I'm concerned the validation is done. Certainly more could be done with it, but I'm not aware of any issues with the conversion routines at this point.
This can be invalid'd/closed.