
Bug 702970

Summary: sys-apps/portage: emerge-webrsync enable xz snapshots with tarsync
Product: Portage Development Reporter: Zac Medico <zmedico>
Component: Tools Assignee: Portage team <dev-portage>
Status: RESOLVED FIXED    
Severity: normal CC: robbat2, thomas.lindroth
Priority: Normal Keywords: InVCS
Version: unspecified   
Hardware: All   
OS: All   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=332799
https://bugs.gentoo.org/show_bug.cgi?id=703460
https://bugs.gentoo.org/show_bug.cgi?id=632018
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 701268    
Attachments: Reset diffball's XZ decompressor in cseek

Description Zac Medico gentoo-dev 2019-12-15 07:15:22 UTC
It looks like tarsync should support xz snapshots since diffball now supports xz:

https://github.com/zmedico/diffball/commit/c6565e078c89e2f7428cc99135810b006ae13f95
Comment 1 Larry the Git Cow gentoo-dev 2019-12-15 07:40:38 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=ce0f37932dfa194566aa32bf6f3a11066c4854fe

commit ce0f37932dfa194566aa32bf6f3a11066c4854fe
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-12-15 07:21:01 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-12-15 07:39:29 +0000

    emerge-webrsync: enable xz snapshots for tarsync
    
    There's xz support in current versions of diffball.
    
    Bug: https://bugs.gentoo.org/702970
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/emerge-webrsync | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
Comment 2 Larry the Git Cow gentoo-dev 2019-12-15 23:00:43 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=9557f104aa85b65c7d394c52c5c8d8727a111651

commit 9557f104aa85b65c7d394c52c5c8d8727a111651
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-12-15 22:53:50 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-12-15 22:56:34 +0000

    sys-apps/portage: Bump to version 2.3.82
    
     #310009 emerge: Show package USE in conflict messages
     #680456 display relevant FEATURES when unshare fails
     #693454 emerge-webrsync: support gentoo-YYYYMMDD snapshots
     #702146 emerge: drop FEATURES=distcc-pump support
     #702970 emerge-webrsync: enable xz snapshots for tarsync
    
    Bug: https://bugs.gentoo.org/701268
    Bug: https://bugs.gentoo.org/310009
    Bug: https://bugs.gentoo.org/680456
    Bug: https://bugs.gentoo.org/693454
    Bug: https://bugs.gentoo.org/702146
    Bug: https://bugs.gentoo.org/702970
    Package-Manager: Portage-2.3.82, Repoman-2.3.20
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 sys-apps/portage/Manifest              |   1 +
 sys-apps/portage/portage-2.3.82.ebuild | 272 +++++++++++++++++++++++++++++++++
 2 files changed, 273 insertions(+)
Comment 3 Zac Medico gentoo-dev 2019-12-21 21:44:40 UTC
I'm seeing an "error reading" failure with gentoo-YYYYMMDD snapshots that does not occur with portage-YYYYMMDD snapshots (it should be fixed by eliminating the Perl script related to bug 703460):




> # emerge --sync gentoo
> >>> Syncing repository 'gentoo' into '/var/db/repos/gentoo'...
>  * Using keys from /usr/share/openpgp-keys/gentoo-release.asc
>  * Refreshing keys via WKD ...                                                                                         [ ok ]
> Fetching most recent snapshot ...
> Trying to retrieve 20191220 snapshot from http://distfiles.gentoo.org ...
> Fetching file gentoo-20191220.tar.xz.md5sum ...
> Fetching file gentoo-20191220.tar.xz.gpgsig ...
> Fetching file gentoo-20191220.tar.xz ...
> Checking digest ...
> Checking signature ...
> gpg: Signature made Fri 20 Dec 2019 04:57:27 PM PST
> gpg:                using RSA key E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
> gpg: Good signature from "Gentoo ebuild repository signing key (Automated Signing Key) <infrastructure@gentoo.org>" [unknown]
> gpg:                 aka "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [unknown]
> gpg: WARNING: Using untrusted key!
> Getting snapshot timestamp ...
> Syncing local tree ...
> scanning tarball...
> error reading /var/cache/distfiles/gentoo-20191220.tar.xz
> emerge-webrsync: error: tarsync failed; tarball is corrupt? (/var/cache/distfiles/gentoo-20191220.tar.xz)
> Trying to retrieve 20191219 snapshot from http://distfiles.gentoo.org ...
> Fetching file gentoo-20191219.tar.xz.md5sum ...
> Fetching file gentoo-20191219.tar.xz.gpgsig ...
> Fetching file gentoo-20191219.tar.xz ...
Comment 4 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2019-12-23 19:01:15 UTC
zmedico: please retest with snapshot gentoo-20191222 or newer for the Perl script change.

The deltas/ are going to need some larger reworking, as the old naming scheme was just:
deltas/snapshot-YYYYMMDD-YYYYMMDD.patch.bz2
It doesn't say at all whether a delta is based on the gentoo or the portage snapshots.
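
To illustrate the naming problem, here is a minimal Python sketch that parses a delta file name of the form quoted above. The function name `parse_delta_name` and the returned keys are hypothetical, and the assumption that each date is exactly eight digits comes only from the YYYYMMDD placeholder. The point is that the name yields two dates and nothing else, so there is no field from which to tell a gentoo-based delta from a portage-based one.

```python
import re

# Pattern for the old scheme: snapshot-YYYYMMDD-YYYYMMDD.patch.bz2
# (assumes eight-digit dates, as the YYYYMMDD placeholder suggests).
DELTA_RE = re.compile(r"^snapshot-(\d{8})-(\d{8})\.patch\.bz2$")


def parse_delta_name(name):
    """Return the two dates encoded in a delta file name, or None if it does not match."""
    match = DELTA_RE.match(name)
    if match is None:
        return None
    # Only the start and end dates are recoverable; the base snapshot
    # series (gentoo vs. portage) is not part of the name.
    return {"from": match.group(1), "to": match.group(2)}
```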
Comment 5 Zac Medico gentoo-dev 2019-12-23 20:21:39 UTC
(In reply to Robin Johnson from comment #4)
> zmedico: please retest with snapshot gentoo-20191222 or newer for the Perl
> script change.

The gentoo-20191222.tar.xz snapshot works for me with tarsync. Thanks!
Comment 6 Thomas Lindroth 2020-02-13 17:48:46 UTC
Created attachment 613620 [details, diff]
Reset diffball's XZ decompressor in cseek

I got another sync failure today with the gentoo-20200212.tar.xz tarball. I started looking into the problem and believe I've found the cause. Diffball's support for XZ is broken.

Tarsync uses dev-util/diffball for reading tar files. It goes through these steps:
1. Decompress the tarball in memory and generate a list of all files in it.
2. Check which files exist on the local filesystem.
3. Unpack the missing files from the tarball.
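
The three steps can be sketched in Python with the standard `tarfile` module. This is an illustration of the two-pass pattern only, not tarsync's or diffball's actual code: the names `sync_missing` and `make_tar_xz` are hypothetical, the existence check is reduced to `os.path.exists`, and member names are trusted without any path sanitising.

```python
import io
import os
import tarfile


def make_tar_xz(files):
    """Build a small .tar.xz archive in memory from a {name: bytes} mapping (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sync_missing(tar_bytes, dest):
    """Unpack only the regular files that are absent under dest; return their names."""
    # Step 1: first pass over the compressed tarball to list its members.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:xz") as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]

    # Step 2: check which of those files are missing locally.
    missing = [n for n in names if not os.path.exists(os.path.join(dest, n))]

    # Step 3: second pass, starting again from the beginning of the
    # compressed stream, unpacking only the missing files.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:xz") as tar:
        for name in missing:
            target = os.path.join(dest, name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(tar.extractfile(name).read())
    return missing
```

Here the second pass simply reopens the archive, which gives it a fresh decompressor. A reader that keeps one handle open across both passes has to rewind it explicitly instead.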

Since it's not possible to seek in compressed files, step 3 must first reset the decompressor and unpack the tarball from the beginning again. The reset is done in libcfile/cfile.c:cseek().

https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L660
For BZIP2_COMPRESSOR, cseek() resets many parameters of the compressor state (cfh->bzs), including cfh->bzs->avail_in, but for XZ_COMPRESSOR it doesn't reset anything in cfh->xzs.

The difference between the broken and working tarballs is that the broken tarballs have cfh->xzs->avail_in != 0 at the end of step 1.

https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L998
Because of that, this check then assumes there is still data left in RAM to decompress and tries to decompress a stale buffer. The call to lzma_code() fails as a result.

I've attached a patch against diffball to reset the XZ decompressor in cseek, just like the BZIP2 decompressor. With that patch, the broken tarball now works.
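
The idea behind the fix can be shown with Python's standard `lzma` module. This is a sketch under stated assumptions and does not reproduce the attached C patch or diffball's code: `XZReader`, `rewind`, `reset_decoder` and `read_all` are hypothetical names. The reader keeps one decoder across reads. Rewinding the underlying file without also replacing the decoder leaves stale decoder state behind, so the second pass does not return the stream from the beginning.

```python
import io
import lzma


class XZReader:
    """Sequential reader over an xz stream that can only seek by rewinding."""

    CHUNK = 4096  # compressed bytes fetched from the file per refill

    def __init__(self, raw):
        self.raw = raw
        self.decoder = lzma.LZMADecompressor()

    def read(self, n):
        """Return up to n decompressed bytes (n > 0), or b'' at end of stream."""
        while not self.decoder.eof:
            if self.decoder.needs_input:
                chunk = self.raw.read(self.CHUNK)
                if not chunk:
                    break  # file exhausted before the stream ended
            else:
                chunk = b""  # the decoder still holds buffered input
            data = self.decoder.decompress(chunk, max_length=n)
            if data:
                return data
        return b""

    def rewind(self, reset_decoder=True):
        """Seek back to the start; a correct rewind also discards decoder state."""
        self.raw.seek(0)
        if reset_decoder:
            self.decoder = lzma.LZMADecompressor()


def read_all(reader, n=64):
    """Drain the reader in pieces of at most n bytes."""
    parts = []
    while True:
        piece = reader.read(n)
        if not piece:
            return b"".join(parts)
        parts.append(piece)
```

With `reset_decoder=True` a second `read_all` after `rewind()` returns the same bytes as the first. With `reset_decoder=False` the old decoder is reused and the second pass does not reproduce the payload.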
Comment 7 Thomas Lindroth 2020-02-13 17:52:27 UTC
Dammit. I commented on the wrong bug. The above comment was for bug 708736.