It looks like tarsync should support xz snapshots since diffball now supports xz: https://github.com/zmedico/diffball/commit/c6565e078c89e2f7428cc99135810b006ae13f95
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=ce0f37932dfa194566aa32bf6f3a11066c4854fe

commit ce0f37932dfa194566aa32bf6f3a11066c4854fe
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-12-15 07:21:01 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-12-15 07:39:29 +0000

    emerge-webrsync: enable xz snapshots for tarsync

    There's xz support in current versions of diffball.

    Bug: https://bugs.gentoo.org/702970
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 bin/emerge-webrsync | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=9557f104aa85b65c7d394c52c5c8d8727a111651

commit 9557f104aa85b65c7d394c52c5c8d8727a111651
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-12-15 22:53:50 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-12-15 22:56:34 +0000

    sys-apps/portage: Bump to version 2.3.82

     #310009 emerge: Show package USE in conflict messages
     #680456 display relevant FEATURES when unshare fails
     #693454 emerge-webrsync: support gentoo-YYYYMMDD snapshots
     #702146 emerge: drop FEATURES=distcc-pump support
     #702970 emerge-webrsync: enable xz snapshots for tarsync

    Bug: https://bugs.gentoo.org/701268
    Bug: https://bugs.gentoo.org/310009
    Bug: https://bugs.gentoo.org/680456
    Bug: https://bugs.gentoo.org/693454
    Bug: https://bugs.gentoo.org/702146
    Bug: https://bugs.gentoo.org/702970
    Package-Manager: Portage-2.3.82, Repoman-2.3.20
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 sys-apps/portage/Manifest              |   1 +
 sys-apps/portage/portage-2.3.82.ebuild | 272 +++++++++++++++++++++++++++++++++
 2 files changed, 273 insertions(+)
I'm seeing an "error reading" failure like this with gentoo-YYYYMMDD snapshots (should be fixed by elimination of the perl script related to bug 703460), which does not occur with portage-YYYYMMDD snapshots:

> # emerge --sync gentoo
> >>> Syncing repository 'gentoo' into '/var/db/repos/gentoo'...
> * Using keys from /usr/share/openpgp-keys/gentoo-release.asc
> * Refreshing keys via WKD ... [ ok ]
> Fetching most recent snapshot ...
> Trying to retrieve 20191220 snapshot from http://distfiles.gentoo.org ...
> Fetching file gentoo-20191220.tar.xz.md5sum ...
> Fetching file gentoo-20191220.tar.xz.gpgsig ...
> Fetching file gentoo-20191220.tar.xz ...
> Checking digest ...
> Checking signature ...
> gpg: Signature made Fri 20 Dec 2019 04:57:27 PM PST
> gpg: using RSA key E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
> gpg: Good signature from "Gentoo ebuild repository signing key (Automated Signing Key) <infrastructure@gentoo.org>" [unknown]
> gpg: aka "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [unknown]
> gpg: WARNING: Using untrusted key!
> Getting snapshot timestamp ...
> Syncing local tree ...
> scanning tarball...
> error reading /var/cache/distfiles/gentoo-20191220.tar.xz
> emerge-webrsync: error: tarsync failed; tarball is corrupt? (/var/cache/distfiles/gentoo-20191220.tar.xz)
> Trying to retrieve 20191219 snapshot from http://distfiles.gentoo.org ...
> Fetching file gentoo-20191219.tar.xz.md5sum ...
> Fetching file gentoo-20191219.tar.xz.gpgsig ...
> Fetching file gentoo-20191219.tar.xz ...
zmedico: please retest with snapshot gentoo-20191222 or newer for the Perl script change.

The deltas/ are going to need some larger reworking, as the old naming scheme was just:
deltas/snapshot-YYYYMMDD-YYYYMMDD.patch.bz2

That name doesn't say whether the delta was made against the gentoo or the portage base points at all.
(In reply to Robin Johnson from comment #4)
> zmedico: please retest with snapshot gentoo-20191222 or newer for the Perl
> script change.

The gentoo-20191222.tar.xz snapshot works for me with tarsync. Thanks!
Created attachment 613620
Reset diffball's XZ decompressor in cseek

I got another sync failure today with the gentoo-20200212.tar.xz tarball. I started looking into the problem and believe I've found the cause: diffball's support for XZ is broken.

Tarsync uses dev-util/diffball for reading tar files. It goes through these steps:
1. Decompress the tarball in memory and generate a list of all files in it.
2. Check which files exist on the local filesystem.
3. Unpack the missing files from the tarball.

Since it's not possible to seek in compressed files, step 3 must first reset the decompressor and unpack the tarball from the beginning again. The reset is done in libcfile/cfile.c:cseek().
https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L660

For BZIP2_COMPRESSOR, cseek() resets a lot of parameters of the decompressor state (cfh->bzs), including cfh->bzs->avail_in, but for XZ_COMPRESSOR it doesn't reset anything in cfh->xzs.

The difference between the broken and working tarballs is that the broken tarballs have cfh->xzs->avail_in != 0 at the end of step 1.
https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L998
Because of that, this check then assumes there is still data left in RAM to decompress and tries to decompress a stale buffer. The call to lzma_code() fails as a result.

I've attached a patch against diffball to reset the XZ decompressor in cseek just like the BZIP2 decompressor. With that patch the broken tarball now works.
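To illustrate the rewind logic described above, here is a minimal sketch in Python using the standard lzma module. It is not diffball's C code and not the attached patch. The class name RewindableXZReader and its fields are hypothetical, chosen only for this example. The point it tries to show is that a backward seek has to discard all decompressor state, including any leftover input bookkeeping (the analogue of avail_in), before decoding again from the start of the file.

```python
import lzma


class RewindableXZReader:
    """Hypothetical reader over an in-memory .xz blob (illustration only)."""

    def __init__(self, compressed: bytes, chunk_size: int = 64):
        self._data = compressed
        self._chunk = chunk_size
        self._reset()

    def _reset(self):
        # Start over with a fresh decompressor and clear every piece of
        # buffered state. Keeping old input/output bookkeeping here is the
        # kind of stale state the comment above describes.
        self._dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        self._in_pos = 0     # offset into the compressed input
        self._out_pos = 0    # offset into the decompressed stream
        self._pending = b""  # decompressed bytes not yet handed out

    def read(self, n: int) -> bytes:
        # Feed compressed chunks until n bytes are available or the stream ends.
        while len(self._pending) < n and not self._dec.eof:
            chunk = self._data[self._in_pos:self._in_pos + self._chunk]
            if not chunk:
                break  # input exhausted (truncated stream)
            self._in_pos += len(chunk)
            self._pending += self._dec.decompress(chunk)
        out, self._pending = self._pending[:n], self._pending[n:]
        self._out_pos += len(out)
        return out

    def seek(self, offset: int) -> int:
        # Compressed data cannot be seeked backwards: rewind by resetting,
        # then decode forward and throw away bytes up to the target offset.
        if offset < self._out_pos:
            self._reset()
        remaining = offset - self._out_pos
        while remaining > 0:
            skipped = self.read(min(remaining, 4096))
            if not skipped:
                break  # target lies past the end of the stream
            remaining -= len(skipped)
        return self._out_pos


if __name__ == "__main__":
    payload = bytes(range(256)) * 40
    reader = RewindableXZReader(lzma.compress(payload))
    reader.read(len(payload))  # first pass, like scanning the tarball
    reader.seek(100)           # second pass needs a full reset
    print(reader.read(16) == payload[100:116])
```

In this sketch the first full read plays the role of step 1 (scanning the tarball) and the later seek plays the role of step 3. If _reset() left _in_pos or _pending untouched, the second pass would resume from a stale position in much the same way the unreset cfh->xzs does.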
Dammit. I commented on the wrong bug. The above comment was for bug 708736.