Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 418431 - >=dev-vcs/git-1.7.9 git-svn is broken with SVN 1.7 and can corrupt data
Summary: >=dev-vcs/git-1.7.9 git-svn is broken with SVN 1.7 and can corrupt data
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: All Linux
: Highest critical (vote)
Assignee: Robin Johnson
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 333531
  Show dependency tree
 
Reported: 2012-05-31 19:42 UTC by Robin Johnson
Modified: 2012-08-22 17:53 UTC (History)
13 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-05-31 19:42:44 UTC
This affects >=git-1.7.8 with Subversion 1.7. The Git ebuilds have had ALL keywords removed because this can cause data loss.

The error generally looks like this:                                                                                
svn: E235000: In file 'subversion/libsvn_subr/dirent_uri.c' line 2291: assertion failed (svn_uri_is_canonical(url, pool))

The tests that test directly are:
t9100 t9118 t9120       

Most of t91xx will fail unless you change the working directory of 'trash directory' to NOT contain a space.

See git-1.7.8-git-svn-1.7-canonical-path.patch as a partial patch, but it breaks things worse.

If you are running git-svn for real, on a git-svn repo created with an older version of git, and then do 'git-svn fetch', and there was a file in the repo BEFORE the new version that contained a triggering filename, and afterwards the escaping requirements on that filename were different, then git will see them as different files, and your git-svn repo will become very weird and broken.

Reference URLs:
http://comments.gmane.org/gmane.comp.version-control.git/185117
http://comments.gmane.org/gmane.comp.version-control.git/184644
Comment 1 Jeremy Olexa (darkside) (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2012-06-14 13:58:04 UTC
Keywords:    1.7.8.6:0: ~alpha ~amd64 ~amd64-fbsd ~amd64-linux ~arm ~hppa ~ia64
                        ~ia64-hpux ~ia64-linux ~mips ~ppc ~ppc-aix ~ppc-macos
                        ~ppc64 ~s390 ~sh ~sparc ~sparc-fbsd ~sparc-solaris
                        ~sparc64-solaris ~x64-freebsd ~x64-macos ~x64-solaris
                        ~x86 ~x86-fbsd ~x86-freebsd ~x86-interix ~x86-linux
                        ~x86-macos ~x86-solaris
Keywords:    1.7.9_rc2:0: 

(In reply to comment #0)
> This affects >=git-1.7.8 with Subversion 1.7. The Git ebuilds have had ALL
> keywords removed because this can cause data loss.

Did you mean >=1.7.9 ?? I'm curious to know if this bug blocks a stablereq for 1.7.8.6
Comment 2 Jeremy Olexa (darkside) (RETIRED) archtester Gentoo Infrastructure gentoo-dev Security 2012-06-26 13:25:33 UTC
  05 Feb 2012; Robin H. Johnson <robbat2@gentoo.org> +git-1.7.9.ebuild:
  Version bump. Please be careful of git-svn functionality with SVN 1.7 if your
  SVN repo URL, branch name or tag names contains characters that need URL
  escaping
Comment 4 Michael G Schwern 2012-07-10 03:46:18 UTC
Robin contacted me to work on this.  I have the basic canonicalization problem licked by using SVN's own API.
https://github.com/schwern/git/commit/5e4163b45e1a35a9d06d1c27c5c4d7bffa639e7e

But there's a lot of places in the code that do not canonicalize, or do not use the functions.  Rather than hunt them all down, I'm reworking the Git::SVN::* objects to do it automatically.
https://github.com/schwern/git/issues

I'll try not to go too deep down any rabbit holes.
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-07-13 17:33:13 UTC
Bounty for git-svn quoting & escaping bug with Subversion 1.7

Value: $500 USD
Funder: Gentoo Foundation
Contact: Robin Johnson (robbat2), on behalf of Gentoo Foundation Trustees

Work summary:
Fix git-svn to work with Subversion 1.7, all relevant testcases should pass.

Scope of work:
Produce a patch that fixes git-svn to work with the escaping and quoting
requirements new to Subversion 1.7. This will be confirmed against Git
testcases for git-svn, the t91xx series. Of specific interest are t9118, t9120,
and possibly t9100 (may be unrelated).

Requirements of work:
The submittor shall provide a patch for Git, against the latest development
tree, or version 1.7.11.2 source, conforming to the instructions in the
SubmittingPatches and CodingGuidelines files of the Git source tree.
The patch must also be submitted to the Git upstream developers for review and
merging. Ideally it should also be accepted by upstream, but the contact will be able to help with that.

Licensing of work:
The patch shall be licensed GPL-2 to comply with upstream Git licensing.
The copyright of the patch shall be assigned to the Gentoo Foundation, while
the author shall retain any moral rights to the work as required by the
relevant legal jurisdictions.

Additional requirements:
- All interested parties shall declare their intent to work this bounty in
  Gentoo bug #418431, prior to starting work on the patch, which also contains
  details of prior research on the problem. This is intented to provide
  openness and reduce duplication of effort.
- Multiple parties may collaborate on any submission for this bounty, but shall
  nominate a single point of contact, and if successful, that contact shall be
  solely responsible for any division of payment.
Comment 6 Michael G Schwern 2012-07-13 20:26:18 UTC
Thanks for posting the bounty, Robin.  This is my statement of intent to work on it.  My work so far can be found here: https://github.com/schwern/git  Changes are in the fix-canonical branch (which is the default branch for that fork).  What I'm doing to fix it are listed as issues.  https://github.com/schwern/git/issues

So far...
* The canonicalization routines have been fixed to use the SVN API (adjusting for the 1.6 and 1.7 versions).

This didn't really fix much as
Comment 7 Michael G Schwern 2012-07-13 20:34:41 UTC
Whoops, hit enter too soon.

Fixing the canonicalization routines didn't have much impact as there's tons of places in the code which forget to canonicalize or only do partial canonicalization.  Rather than hunt down all these locations just enough to make the tests work, my approach is to do the canonicalization in the object path and url accessors.  That should cover everything.  Unfortunately, most of the objects don't have accessors, so I'll be adding them.

First step of that refactoring is complete, all the classes have been moved out of the git-svn main file.  Now they can be worked on without becoming cross eyed.

Next step is to add the accessors.
https://github.com/schwern/git/issues/3

Then turn on canonicalization.
https://github.com/schwern/git/issues/4

Hopefully universal canonicalization will not mess up something else.  If it does, I'll add special accessors to return the original path/url and use it as necessary.

On top of that, some SVN API routines are pickier than their own canonicalization does.  For example, SVN::Ra->get_dir will not accept "foo/.." style paths and their canonicalization will not fix that.  I'm not sure how widespread this is.
https://github.com/schwern/git/commit/951e4e96a1514a5e7d94dcc5a6afb68262e17cae

I'll be on freenode as Schwern.
Comment 8 Michael G Schwern 2012-07-16 23:31:30 UTC
I've got everything working except for two tests.

* t9100-git-svn-basic.sh tests 11-13, previously mentioned and probably unrelated.

* t9118-git-svn-funky-branch-names.sh test #2.

This one appears to be a change in SVN itself.  The failing test involves trying to create a branch named "pr ject/branches/not-a%40{0}reflog" to make sure git-svn isn't fooled by something that looks like a revision (see 08fd28bb08a33b419cb91935659635cd053c880b and 73d419558d9fa4de3be28bd58158636bc739808e.

It looks like SVN simply isn't allowing that branch name any more.

By putting an exit after the first test which sets up the SVN repository, I saw the SVN branch is simply called "not-a".

~/devel/git/t/trash_directory.t9118-git-svn-funky-branch-names$ svn ls "file:///Users/schwern/devel/git/t/trash_directory.t9118-git-svn-funky-branch-names/svnrepo/pr ject/branches"
.leading_dot/
Abo-Uebernahme (Bug #994)/
fun plugin/
more fun plugin!/
not-a/
trailing_dot./
trailing_dotlock.lock/

So I think that problem is unrelated.  However, I don't have a copy of SVN 1.6 handy to verify.

All the work is in https://github.com/schwern/git  Where do you want to go from here?
Comment 9 Michael G Schwern 2012-07-16 23:58:40 UTC
I can confirm that SVN changed its behavior between 1.6.18 and 1.7.5 with regard to a branch named "not-a%40{0}reflog".

$ mkdir project project/trunk project/branches project/tag

$ echo foo > project/trunk/foo

$ svnrepo=file:///tmp/svn_test/svnrepo

$ svnadmin1.6 create $svnrepo

$ svn1.6  cp -m "trailing .lock" "$svnrepo/pr ject/trunk" "$svnrepo/pr ject/branches/trailing_dotlock.lock"

Committed revision 2.

$ svn1.6 cp -m "reflog" "$svnrepo/pr ject/trunk" "$svnrepo/pr ject/branches/not-a%40{0}reflog"

Committed revision 3.

$ svn1.7 cp -m "reflog" "$svnrepo/pr ject/trunk" "$svnrepo/pr ject/branches/not-a%40{0}reflog"

Committed revision 4.

$ svn1.6 ls "$svnrepo/pr ject/branches"
not-a/
not-a@{0}reflog/
trailing_dotlock.lock/

$ svn1.7 ls "$svnrepo/pr ject/branches"
not-a/
not-a@{0}reflog/
trailing_dotlock.lock/

I'm tempted to just delete that test.
Comment 10 Michael G Schwern 2012-07-17 00:10:06 UTC
I fixed the test to look at what branch name was created.
https://github.com/schwern/git/commit/4cab2bec33afb22844ce0183ee077a5ef178f7b0
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-07-17 06:23:29 UTC
@schwern/all:
I'm testing now.
Comment 12 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-07-24 21:56:46 UTC
trustees:
Michael Schwern has completed the work, and it passes the tests most of the time now (something lurking with module install and which copy of the modules get used for testing in the sandbox), and is just getting it merged with upstream (on take 2).
I'll send his details for payment of the bounty to the alias.

schwern:
Got time later in the week (friday early afternoon maybe?) to go over the sandbox failures I'm getting? I suspect it's just perl packaging/perl5lib issue.
Comment 13 Michael G Schwern 2012-07-25 01:09:29 UTC
@robin  Sure, no problem.  Meet on IRC?  I'm free on Friday until 3:30pm PDT, let me know a me a time.

Otherwise I'm still pushing the upstream process forward.  It involves a lot of rebasing.  I also renamed the branch, it's git-svn/fix-canonical now.
Comment 14 Michael G Schwern 2012-07-28 09:54:37 UTC
With the exception of t9100-git-svn-basic.sh tests 11-13 mentioned earlier (something to do with symlinks) everything works with SVN 1.6 and 1.7 in my repo.

https://github.com/schwern/git and the branch is git-svn/fix-svn-1.7.  It's been significantly reworked and made easier to understand.  It's branched from git.git's master as of July 25th.  I might have to do some small rebasing to make upstream happy, but I'm happy with it and the content should be stable.

All patches have been sent upstream and are under review by git.git.
Comment 15 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-08-22 04:44:41 UTC
Upstream has merged to the pu branch that will be in v1.7.13.
I have released git-1.7.12-r1 in Gentoo, with the git-svn fixes backported, so we have them in the meantime.
Comment 16 Rafał Mużyło 2012-08-22 15:10:04 UTC
(In reply to comment #15)
> Upstream has merged to the pu branch that will be in v1.7.13.
> I have released git-1.7.12-r1 in Gentoo, with the git-svn fixes backported,
> so we have them in the meantime.

AFAIU (at least going by mail list discussion), upstream was not quite satisfied with some of solutions used in the initial patchset. Are those in -r1 from the initial patchset or the merged version ?
Comment 17 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-08-22 17:53:13 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > Upstream has merged to the pu branch that will be in v1.7.13.
> > I have released git-1.7.12-r1 in Gentoo, with the git-svn fixes backported,
> > so we have them in the meantime.
> 
> AFAIU (at least going by mail list discussion), upstream was not quite
> satisfied with some of solutions used in the initial patchset. Are those in
> -r1 from the initial patchset or the merged version ?

-r1 contains the merged version that is going into v1.7.13, not the initial version.