Bug 708736 - dev-util/diffball: webrsync fails with tarsync and the new gentoo-YYYYMMDD snapshot format
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Zac Medico
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks:
 
Reported: 2020-02-08 16:49 UTC by Thomas Lindroth
Modified: 2020-10-12 20:19 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Reset diffball's XZ decompressor in cseek (dont_break_webrsync.patch,512 bytes, patch)
2020-02-13 17:54 UTC, Thomas Lindroth

Description Thomas Lindroth 2020-02-08 16:49:11 UTC
Syncing with webrsync fails on some days. The problem first started 2020-01-23 with the gentoo-20200122.tar.xz tarball. That coincided with the update to portage-2.3.84-r1 on 2020-01-21 and the new gentoo-YYYYMMDD snapshot format.

The following tarballs have failed so far:
gentoo-20200122.tar.xz
gentoo-20200203.tar.xz (not sure about this one)
gentoo-20200206.tar.xz
gentoo-20200207.tar.xz

All the other tarballs between 2020-01-23 and now worked. The snapshots-create.sh script from mastermirror-scripts.git doesn't show any changes since 2020-01-06, so it seems that some tarballs are broken intermittently.

If I uninstall tarsync-0.2.1-r1, the sync works fine. I guess this is the same problem as https://bugs.gentoo.org/702970#c3, but because the failures are intermittent it appeared to be working.

Output from emaint sync --auto:
>>> Syncing repository 'gentoo' into '/usr/portage'...
 * Using keys from /usr/share/openpgp-keys/gentoo-release.asc
 * Refreshing keys via WKD ...                                                                                                [ ok ]
Fetching most recent snapshot ...
Trying to retrieve 20200207 snapshot from https://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo ...
Fetching file gentoo-20200207.tar.xz.md5sum ...
Fetching file gentoo-20200207.tar.xz.gpgsig ...
Fetching file gentoo-20200207.tar.xz ...
Checking digest ...
Checking signature ...
gpg: Signature made Sat 08 Feb 2020 01:55:56 AM CET
gpg:                using RSA key E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
gpg: Good signature from "Gentoo ebuild repository signing key (Automated Signing Key) <infrastructure@gentoo.org>" [unknown]
gpg:                 aka "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [unknown]
gpg: WARNING: Using untrusted key!
Getting snapshot timestamp ...
Syncing local tree ...
scanning tarball...
scanning existing target directory...
3459 to update
failed seeking in tarball!
failed transfering file Manifest
emerge-webrsync: error: tarsync failed; tarball is corrupt? (/var/tmp/portage/webrsync-ggbyfx/gentoo-20200207.tar.xz)
Trying to retrieve 20200206 snapshot from https://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo ...
Fetching file gentoo-20200206.tar.xz.md5sum ...
Fetching file gentoo-20200206.tar.xz.gpgsig ...
Fetching file gentoo-20200206.tar.xz ...
Checking digest ...
Checking signature ...
gpg: Signature made Fri 07 Feb 2020 01:56:15 AM CET
gpg:                using RSA key E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
gpg: Good signature from "Gentoo ebuild repository signing key (Automated Signing Key) <infrastructure@gentoo.org>" [unknown]
gpg:                 aka "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [unknown]
gpg: WARNING: Using untrusted key!
Getting snapshot timestamp ...
Syncing local tree ...
scanning tarball...
scanning existing target directory...
3193 to update
failed seeking in tarball!
failed transfering file Manifest
emerge-webrsync: error: tarsync failed; tarball is corrupt? (/var/tmp/portage/webrsync-ggbyfx/gentoo-20200206.tar.xz)
Trying to retrieve 20200205 snapshot from https://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo ...
Fetching file gentoo-20200205.tar.xz.md5sum ...
Fetching file gentoo-20200205.tar.xz.gpgsig ...
Fetching file gentoo-20200205.tar.xz ...
Checking digest ...
Checking signature ...
gpg: Signature made Thu 06 Feb 2020 01:56:18 AM CET
gpg:                using RSA key E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
gpg: Good signature from "Gentoo ebuild repository signing key (Automated Signing Key) <infrastructure@gentoo.org>" [unknown]
gpg:                 aka "Gentoo Portage Snapshot Signing Key (Automated Signing Key)" [unknown]
gpg: WARNING: Using untrusted key!
Getting snapshot timestamp ...
Syncing local tree ...
scanning tarball...
scanning existing target directory...
664 to update
664 files written, 156952 entires verified, 1340210 bytes written
=== Sync completed for gentoo

Action: sync for repo: gentoo, returned code = 0




/etc/portage/repos.conf/gentoo.conf:
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /usr/portage
sync-type = webrsync
auto-sync = yes
sync-webrsync-verify-signature = yes
sync-openpgp-key-path = /usr/share/openpgp-keys/gentoo-release.asc




emerge --info:
Portage 2.3.84 (python 2.7.17-final-0, default/linux/amd64/17.1/desktop, gcc-8.3.0, glibc-2.29-r7, 4.19.90 x86_64)
=================================================================
System uname: Linux-4.19.90-x86_64-Intel-R-_Core-TM-_i7-4790K_CPU_@_4.00GHz-with-gentoo-2.6
KiB Mem:    16245008 total,    503132 free
KiB Swap:    4194300 total,   4192508 free
Timestamp of repository gentoo: Thu, 06 Feb 2020 00:45:01 +0000
sh bash 4.4_p23-r1
ld GNU ld (Gentoo 2.32 p2) 2.32.0
ccache version 3.7.6 [disabled]
app-shells/bash:          4.4_p23-r1::gentoo
dev-lang/perl:            5.30.1::gentoo
dev-lang/python:          2.7.17::gentoo, 3.6.9::gentoo, 3.7.5-r1::gentoo
dev-util/ccache:          3.7.6::gentoo
dev-util/cmake:           3.14.6::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.42.1::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r4::gentoo
sys-devel/automake:       1.15.1-r2::gentoo, 1.16.1-r1::gentoo
sys-devel/binutils:       2.32-r1::gentoo
sys-devel/gcc:            8.3.0-r1::gentoo
sys-devel/gcc-config:     2.1::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 4.19::gentoo (virtual/os-headers)
sys-libs/glibc:           2.29-r7::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: webrsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    sync-webrsync-verify-signature: yes

tholins-overlay
    location: /usr/local/portage/my-overlay
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -march=native -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/tomoyo/conf /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=native -pipe"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs compress-build-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch parallel-install pid-sandbox preserve-libs protect-owned sandbox sfperms splitdebug strict unknown-features-warn unmerge-backup unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="https://ftp-stud.hs-esslingen.de/pub/Mirrors/gentoo/ https://mirror.dkm.cz/gentoo/ https://ftp.lanet.kr/pub/gentoo/"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j8 -l12"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac acl acpi alsa amd64 berkdb branding bzip2 cairo cdda cdr cjk cli consolekit crypt cxx dbus dri dts dvd dvdr emboss encode exif flac fortran gdbm gif gpm gtk hardened iconv icu jpeg lcms ldap libnotify libtirpc mad mng mp3 mp4 mpeg multilib ncurses nls nptl offensive ogg opengl openmp pam pango pcre pdf png policykit ppds python qt5 readline sdl seccomp spell split-usr ssl startup-notification svg tcpd tiff truetype udev udisks unicode upower usb vorbis wxwidgets x264 xattr xcb xinerama xml xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-2" POSTGRES_TARGETS="postgres10 postgres11" 
PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_6" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="vesa intel i965" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Thomas Lindroth 2020-02-13 17:54:53 UTC
Created attachment 613622
Reset diffball's XZ decompressor in cseek

I got another sync failure today with the gentoo-20200212.tar.xz tarball. I started looking into the problem and believe I've found the cause. Diffball's support for XZ is broken.

Tarsync uses dev-util/diffball for reading tar files. It goes through these steps:
1. Decompress the tarball in memory and generate a list of all files in it.
2. Check which files exist on the local filesystem.
3. Unpack the missing files from the tarball.

Since it's not possible to seek in compressed files, step 3 must first reset the decompressor and unpack the tarball from the beginning again. The reset is done in libcfile/cfile.c:cseek().
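
To illustrate why a backward seek in a compressed stream amounts to "reset and decode again from the start", here is a minimal sketch in C. It is not diffball's code: the run-length format and the names rle_reader, rle_reset, rle_read and rle_seek are hypothetical stand-ins chosen so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy format (NOT XZ): a sequence of (count, byte) pairs. */
typedef struct {
    const unsigned char *raw;  /* compressed input */
    size_t raw_len;
    size_t raw_pos;            /* next compressed byte to consume */
    unsigned char cur;         /* byte value of the current run */
    unsigned run_left;         /* bytes still to emit from the current run */
    size_t out_pos;            /* current offset in the decompressed stream */
} rle_reader;

/* Put the decoder back at the very start of the stream. */
static void rle_reset(rle_reader *r) {
    r->raw_pos = 0;
    r->cur = 0;
    r->run_left = 0;
    r->out_pos = 0;
}

static void rle_open(rle_reader *r, const unsigned char *raw, size_t len) {
    assert(raw != NULL);
    r->raw = raw;
    r->raw_len = len;
    rle_reset(r);
}

/* Decode up to n bytes; dst may be NULL to discard them. Returns bytes decoded. */
static size_t rle_read(rle_reader *r, unsigned char *dst, size_t n) {
    size_t done = 0;
    while (done < n) {
        if (r->run_left == 0) {
            if (r->raw_pos + 2 > r->raw_len)
                break;                      /* end of compressed input */
            r->run_left = r->raw[r->raw_pos];
            r->cur = r->raw[r->raw_pos + 1];
            r->raw_pos += 2;
            continue;                       /* also skips zero-length runs */
        }
        if (dst)
            dst[done] = r->cur;
        done++;
        r->run_left--;
        r->out_pos++;
    }
    return done;
}

/* Seek to an absolute decompressed offset. A backward seek cannot jump:
 * it resets the decoder and decodes (discarding) up to the target. */
static int rle_seek(rle_reader *r, size_t target) {
    if (target < r->out_pos)
        rle_reset(r);
    size_t skip = target - r->out_pos;
    return rle_read(r, NULL, skip) == skip ? 0 : -1;
}
```

With the input {3,'a', 2,'b', 4,'c'}, reading everything yields "aaabbcccc"; a following rle_seek(&r, 2) restarts the decoder, and the next four bytes read are "abbc". The point is that every piece of decoder state has to go back to its initial value in the reset, which is the part that matters for this bug.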

https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L660
The BZIP2_COMPRESSOR case resets many fields of its decompressor state (cfh->bzs), including cfh->bzs->avail_in, but the XZ_COMPRESSOR case doesn't reset anything in cfh->xzs.

The difference between the broken and working tarballs is that the broken tarballs have cfh->xzs->avail_in != 0 at the end of step 1.

https://github.com/rafaelmartins/diffball/blob/master/libcfile/cfile.c#L998
Because of that, this check assumes there is still data left in RAM to decompress and tries to decompress a stale buffer. The call to lzma_code() fails as a result.
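
The failure mode can be reproduced in miniature. The sketch below is a toy model under stated assumptions, not diffball's cfile.c: the format (a magic byte, payload, a 0x00 end marker, then padding), the 4-byte window and the names toy_handle, toy_read and toy_rewind are all hypothetical. It only mirrors the idea that the input window is refilled when avail_in is 0, so a rewind that leaves avail_in untouched feeds leftover bytes to a freshly reset decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WINDOW = 4, MAGIC = 0xFD };

typedef struct {
    const unsigned char *file;   /* whole "compressed file" (hypothetical) */
    size_t file_len;
    size_t file_pos;             /* next file byte to load into the window */
    unsigned char window[WINDOW];
    size_t next_in;              /* index of next unconsumed window byte */
    size_t avail_in;             /* unconsumed bytes left in the window */
    int seen_magic;              /* decoder state: header already parsed? */
    int finished;                /* decoder state: end marker reached? */
} toy_handle;

static void toy_open(toy_handle *h, const unsigned char *file, size_t len) {
    assert(file != NULL);
    memset(h, 0, sizeof *h);
    h->file = file;
    h->file_len = len;
}

/* Decode up to n payload bytes into dst. Returns the count, or -1 on error. */
static int toy_read(toy_handle *h, unsigned char *dst, size_t n) {
    size_t done = 0;
    while (done < n && !h->finished) {
        if (h->avail_in == 0) {          /* refill only when the window is drained */
            size_t left = h->file_len - h->file_pos;
            size_t take = left < WINDOW ? left : WINDOW;
            if (take == 0)
                return -1;               /* truncated input */
            memcpy(h->window, h->file + h->file_pos, take);
            h->file_pos += take;
            h->next_in = 0;
            h->avail_in = take;
        }
        unsigned char b = h->window[h->next_in++];
        h->avail_in--;
        if (!h->seen_magic) {
            if (b != MAGIC)
                return -1;               /* stale or corrupt data: not a stream header */
            h->seen_magic = 1;
        } else if (b == 0x00) {
            h->finished = 1;             /* end marker; padding may remain in the window */
        } else {
            dst[done++] = b;
        }
    }
    return (int)done;
}

/* Rewind to the start. reset_input == 0 models the bug: the decoder state
 * and file position are reset, but the stale window contents are kept. */
static void toy_rewind(toy_handle *h, int reset_input) {
    h->file_pos = 0;
    h->seen_magic = 0;
    h->finished = 0;
    if (reset_input) {
        h->next_in = 0;
        h->avail_in = 0;
    }
}
```

For the file {0xFD,'h','i','!',0x00,0xAA,0xAA,0xAA} the end marker lands in the second window, so avail_in is 3 after the first pass; toy_rewind(&h, 0) followed by toy_read() then returns -1, while toy_rewind(&h, 1) reads "hi!" again. For {0xFD,'h','i',0x00} the stream ends exactly on a window boundary, avail_in is 0, and even the buggy rewind works, which matches the intermittent behaviour seen across snapshots.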

I've attached a patch against diffball to reset the XZ decompressor in cseek, just like the BZIP2 decompressor. With that patch the broken tarballs now work.
Comment 2 Zac Medico gentoo-dev 2020-02-16 00:29:59 UTC
(In reply to Thomas Lindroth from comment #1)
> Created attachment 613622
> Reset diffball's XZ decompressor in cseek

Good catch. This fixes the interaction with the crefill logic here:

>	if(0 == cfh->xzs->avail_in && (cfh->raw.offset +
>		(cfh->raw.end - cfh->xzs->avail_in) < cfh->raw_total_len)) {
Comment 3 Larry the Git Cow gentoo-dev 2020-02-16 01:50:14 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f78d49bc40c58783eaff9ec874c2f733e63bcc5c

commit f78d49bc40c58783eaff9ec874c2f733e63bcc5c
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2020-02-16 01:32:45 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2020-02-16 01:50:06 +0000

    dev-util/diffball: 1.0.1-r2 revbump for bug 708736
    
    Fix tarsync seek failure with xz compressed files.
    
    Bug: https://bugs.gentoo.org/708736
    Package-Manager: Portage-2.3.89, Repoman-2.3.20
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 dev-util/diffball/Manifest                 |  1 +
 dev-util/diffball/diffball-1.0.1-r2.ebuild | 42 ++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)