Bug 585524 - Georgia Tech Rsync Mirror is incredibly slow
Summary: Georgia Tech Rsync Mirror is incredibly slow
Status: CONFIRMED
Alias: None
Product: Mirrors
Classification: Unclassified
Component: Server Problem
Hardware: All Linux
Importance: Normal normal with 2 votes
Assignee: Mirror Admins
Duplicates: 607134
Reported: 2016-06-10 13:15 UTC by Joshua Kinard
Modified: 2020-09-09 18:05 UTC
CC List: 5 users

Description Joshua Kinard gentoo-dev 2016-06-10 13:15:20 UTC
The Georgia Tech Rsync Mirror (rsync://128.61.111.9/gentoo-portage) seems to have been slow lately.  This may have been going on for a while.

My last 'emerge --sync' run against it took over 30 minutes to complete.  Here are the rsync stats from that run:

Number of files: 210,113 (reg: 182,671, dir: 27,442)
Number of created files: 76 (reg: 75, dir: 1)
Number of deleted files: 78 (reg: 77, dir: 1)
Number of regular files transferred: 365
Total file size: 416.21M bytes
Total transferred file size: 6.00M bytes
Literal data: 6.00M bytes
Matched data: 0 bytes
File list size: 4.83M
File list generation time: 0.092 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 39.95K
Total bytes received: 11.36M

sent 39.95K bytes  received 11.36M bytes  2.31K bytes/sec
total size is 416.21M  speedup is 36.52
=== Sync completed for gentoo
q: Updating ebuild cache in /usr/portage ...
q: Finished 40444 entries in 1.621322 seconds

2.31K/sec is rather slow.  Is this expected for this mirror?
Comment 1 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-08-20 21:49:46 UTC
@Kumba:
I can't reproduce your painful speeds; I get 1.5-2Mbit for that mirror.
Regardless, I've CC'd the GATech mirror admin.

Can you test other mirrors as well?

# rsync rsync://128.61.111.9/gentoo-portage/ /tmp/test --stats -av
...
Number of files: 205,188 (reg: 177,792, dir: 27,396)
Number of created files: 24,910 (reg: 24,899, dir: 11)
Number of regular files transferred: 95,943
Total file size: 412,605,111 bytes
Total transferred file size: 251,991,174 bytes
Literal data: 155,314,568 bytes
Matched data: 96,676,606 bytes
File list size: 5,889,597
File list generation time: 0.097 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 2,970,745
Total bytes received: 165,133,520

sent 2,970,745 bytes  received 165,133,520 bytes  162,341.15 bytes/sec
total size is 412,605,111  speedup is 2.45
Comment 2 Joshua Kinard gentoo-dev 2016-08-21 02:30:04 UTC
(In reply to Robin Johnson from comment #1)

So this is weird.  The slowdown might be something emerge/portage is doing.  If I use your rsync command and sync an entire tree to a ramdrive, these are my statistics:

Number of files: 205,193 (reg: 177,797, dir: 27,396)
Number of created files: 205,193 (reg: 177,797, dir: 27,396)
Number of regular files transferred: 177,797
Total file size: 412,614,482 bytes
Total transferred file size: 412,614,482 bytes
Literal data: 412,614,482 bytes
Matched data: 0 bytes
File list size: 3,091,228
File list generation time: 1.751 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 3,487,804
Total bytes received: 425,548,536

sent 3,487,804 bytes  received 425,548,536 bytes  154,970.68 bytes/sec
total size is 412,614,482  speedup is 0.96

The speed is roughly similar to yours, @ 154,970.68 bytes/sec.  But if I run "emerge --sync", and it happens to select the gatech server for rsync, I get these statistics instead:

Number of files: 205,193 (reg: 177,797, dir: 27,396)
Number of created files: 45 (reg: 43, dir: 2)
Number of regular files transferred: 302
Total file size: 412.61M bytes
Total transferred file size: 1.95M bytes
Literal data: 1.95M bytes
Matched data: 0 bytes
File list size: 2.38M
File list generation time: 0.925 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 33.36K
Total bytes received: 7.17M

sent 33.36K bytes  received 7.17M bytes  9.30K bytes/sec
total size is 412.61M  speedup is 57.30
=== Sync completed for gentoo

That's 9.30K bytes/sec instead (emerge must be passing a different command flag to print K bytes).

All of the other rsync mirrors sync via emerge at normal speeds, except for this one.
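
The two bytes/sec figures in this comment are printed in different notations, most likely because the rsync options Portage uses include --human-readable (it appears in the PORTAGE_RSYNC_OPTS posted later in this report). A minimal Python sketch for putting them on the same scale, assuming a single --human-readable means units of 1,000 as the rsync man page describes; the helper name to_bytes is hypothetical:

```python
# Multipliers for rsync's suffixes, assuming a single --human-readable
# (units of 1,000; a doubled -hh would use 1,024 instead).
SUFFIXES = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def to_bytes(text: str) -> float:
    """Convert an rsync figure such as '9.30K' or '154,970.68' to bytes."""
    text = text.strip().replace(",", "")
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


manual = to_bytes("154,970.68")   # manual rsync run from this comment
via_emerge = to_bytes("9.30K")    # emerge --sync run from this comment
print(f"manual run is about {manual / via_emerge:.1f}x faster")
```

Under that assumption the manual run comes out roughly 17 times faster than the emerge-driven one, even though both hit the same server.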
Comment 3 Neil Bright 2016-08-21 02:45:54 UTC
What IP address are you coming from?  I can look in the logs to see if there is anything suspect.

Additionally, please don't use a hard-coded IP address for GTlib.  We have four hosts providing rsync services, and they all point at the same bits.  rsync://rsync.gtlib.gatech.edu/gentoo-portage would be much preferable.

Gentoo folks - if you could change rsync3.us.gentoo.org from an A record to a CNAME pointing to rsync.gtlib.gatech.edu this would balance things on our end a bit better.

--
Neil Bright
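
To see which hosts a mirror name currently resolves to (rather than pinning one IP), something like the following Python sketch could be used. It only relies on the standard library's socket.getaddrinfo; the function names are hypothetical, and the addresses returned depend entirely on the DNS records at the time it is run:

```python
import socket


def unique_addresses(addrinfo):
    """Return the distinct IP addresses from socket.getaddrinfo() output."""
    seen = []
    for _family, _type, _proto, _canonname, sockaddr in addrinfo:
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


def mirror_addresses(host, port=873):
    """Resolve a mirror host name; 873 is the standard rsync daemon port."""
    return unique_addresses(
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    )


# Example call (needs network access):
#   for ip in mirror_addresses("rsync.gtlib.gatech.edu"):
#       print(ip)
```

Pointing rsync at the host name instead of one of these addresses lets the round robin spread connections across all of them.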
Comment 4 Joshua Kinard gentoo-dev 2016-08-21 02:50:02 UTC
(In reply to Neil Bright from comment #3)

Neil, I will e-mail you my current ISP-assigned address shortly.  I'll defer to Infra on the other bits.
Comment 5 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2016-08-21 05:35:04 UTC
DNS updated to CNAME instead for rsync3.us, and all 4 IPs added to rsync.us rotation.

I was testing w/ the same IP given by Kumba originally, to make sure I hit the same results rather than any v6 hosts as would normally be preferred on my end.
Comment 6 Neil Bright 2016-08-26 19:56:26 UTC
Hi Josh,

Looking at our logs from your Comcast address (sent out of band), I do see a number of relatively small transfers, but also a couple of 'max connections reached' type messages out of our rsyncd.

What does emerge do when it can't make a successful rsync connection?  My best hypothesis at the moment is that you see slowness when we've maxed out rsync connections.

The DNS round robin implemented by Robin should help with this if you're using rsync3.us.gentoo.org in your configuration.
Comment 7 Joshua Kinard gentoo-dev 2016-09-05 10:58:14 UTC
(In reply to Neil Bright from comment #6)

Hi Neil,

As far as I can tell, the transfer is only slow when emerge does it against my current tree.  If I run the 'rsync' command manually, either a cutdown set of options or the full option set used by emerge, and start syncing the tree to an empty directory in a ramdrive or to my filesystem, the transfer runs very fast.  So I'm at a bit of a loss as to why this issue is happening.

I might have to blow the local copy of the tree away, then drop the machine and run some disk checking tools just to make sure that area of the drive isn't a bit screwy.  Though, that wouldn't explain why other mirrors seem to have no problem whatsoever.
Comment 8 Matt Turner gentoo-dev 2017-01-26 18:12:36 UTC
*** Bug 607134 has been marked as a duplicate of this bug. ***
Comment 9 Gabriel Marcano 2017-06-14 12:18:12 UTC
I am suffering from the issue described here. I blew away my portage directory, to no avail. Doing a manual wget against the Georgia Tech server is fast... For some reason it's only eix-sync (which I assume is calling emerge --sync under the hood) that seems to be suffering severely. Here is my emerge --info:

# emerge --info
Portage 2.3.6 (python 3.4.6-final-0, default/linux/amd64/13.0/desktop/plasma, gcc-7.1.0, glibc-2.24-r1, 4.11.3-gentoo x86_64)
=================================================================
System uname: Linux-4.11.3-gentoo-x86_64-Intel-R-_Core-TM-_i7-3770K_CPU_@_3.50GHz-with-gentoo-2.4.1
KiB Mem:    16367984 total,   1672048 free
KiB Swap:   16776192 total,  16776192 free
Timestamp of repository gentoo: Wed, 14 Jun 2017 00:45:01 +0000
sh bash 4.4_p12
ld GNU ld (Gentoo 2.27 p1.0) 2.27
distcc 3.2rc1 x86_64-pc-linux-gnu [disabled]
ccache version 3.3.4 [disabled]
app-shells/bash:          4.4_p12::gentoo
dev-java/java-config:     2.2.0-r3::gentoo
dev-lang/perl:            5.24.1-r2::gentoo
dev-lang/python:          2.7.13::gentoo, 3.4.6::gentoo, 3.5.3::gentoo
dev-util/ccache:          3.3.4::gentoo
dev-util/cmake:           3.8.2::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.4.1::gentoo
sys-apps/openrc:          0.27.2::gentoo
sys-apps/sandbox:         2.10-r4::gentoo
sys-devel/autoconf:       2.13::gentoo, 2.69-r3::gentoo
sys-devel/automake:       1.11.6-r2::gentoo, 1.13.4-r1::gentoo, 1.15-r2::gentoo
sys-devel/binutils:       2.27::gentoo, 2.28-r2::gentoo
sys-devel/gcc:            6.3.0::gentoo, 7.1.0-r1::gentoo
sys-devel/gcc-config:     1.8-r1::gentoo
sys-devel/libtool:        2.4.6-r4::gentoo
sys-devel/make:           4.2.1-r1::gentoo
sys-kernel/linux-headers: 4.10::gentoo (virtual/os-headers)
sys-libs/glibc:           2.24-r1::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.us.gentoo.org/gentoo-portage
    priority: -1000

Local_Overlay
    location: /usr/local/portage
    masters: gentoo
    eclass-overrides: gentoo Local_Overlay

chrytoo
    location: /var/lib/layman/chrytoo
    masters: gentoo
    priority: 50

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=ivybridge -pipe -fomit-frame-pointer -O2"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/themes/oxygen-gtk/gtk-2.0"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=ivybridge -pipe -fomit-frame-pointer -O2"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs clean-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms split-log strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="en_US.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/tmp"
USE="=-ppp X a52 aac acl acpi alsa amd64 avx bash-completion berkdb bluetooth branding bzip2 cairo cdda cdr cli consolekit cracklib crypt cups cxx dbus declarative dri dts dvd dvdr emboss encode exif fam firefox flac fortran gdbm gif glamor gpm gtk iconv ipv6 jpeg kde kipi kwallet lcms ldap libnotify mad mmx mng modules mp3 mp4 mpeg mtp multilib ncurses nls nptl nvidia ogg opengl openmp pam pango pcre pdf phonon plasma png policykit ppds qml qt qt3support qt4 qt5 readline sdl seccomp semantic-desktop session spell sse sse2 sse3 sse4 ssl startup-notification svg tcpd tiff truetype udev udisks unicode upower usb vdpau vorbis widgets wxwidgets x264 xattr xcb xcomposite xinerama xml xscreensaver xv xvid zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64 emu" INPUT_DEVICES="evdev roccat_tyon" KERNEL="linux" L10N="en-US es" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="en es" NETBEANS_MODULES="cnd mobility" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" PYTHON_SINGLE_TARGET="python3_4" PYTHON_TARGETS="python2_7 python3_4 python3_4 python3_5" RUBY_TARGETS="ruby21 ruby22 ruby22 ruby23 ruby24" USERLAND="GNU" VIDEO_CARDS="nvidia nouveau intel" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
USE_PYTHON="2.7 3.4 3.5"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 10 Gabriel Marcano 2018-10-11 01:52:02 UTC
I am willing to help troubleshoot this issue; I just need someone to tell me what information they need. Every time eix-sync tries to use the Georgia Tech mirrors, the transfer is still incredibly slow.
Comment 11 Neil Bright 2018-10-11 13:57:07 UTC
Hi Gabriel,

Can we start with some basic network diagnostics?  Point a browser at http://www.speedtest.net and search for the server at the Georgia Institute of Technology and post results.
Comment 12 Gabriel Marcano 2018-10-11 16:12:50 UTC
Here's the information from speedtest.net (My ISP is Verizon FiOS, and I live in the Washington DC area):

IP_ADDRESS        108.48.74.178
TEST_DATE         10/11/2018 4:06 PM
TIME_ZONE         GMT
DOWNLOAD_MEGABITS 58.18
UPLOAD_MEGABITS   50.95
LATENCY_MS        19
SERVER_NAME       Atlanta, GA
DISTANCE_MILES    550

A direct connection seems fine; I'm getting the full advertised speeds from my ISP. This also makes sense, since data I wget from the Georgia Tech mirrors downloads fast. It's only when I run eix-sync (and emerge --sync) that things slow to a crawl.
Comment 13 Andrey Hippo 2019-01-02 19:12:22 UTC
Same issue for me -- GA Tech mirror is painfully slow, and rsync.us.gentoo.org resolves very frequently to GA Tech, and there is no blacklist option available.

I also live in the DC area, and the problem happens both from work (Cox) and from home (FIOS).

`emerge --sync` stats from work (Cox):
Number of files: 163,363 (reg: 135,869, dir: 27,494)
Number of created files: 588 (reg: 565, dir: 23)
Number of deleted files: 645 (reg: 628, dir: 17)
Number of regular files transferred: 3,515
Total file size: 220.11M bytes
Total transferred file size: 18.08M bytes
Literal data: 18.08M bytes
Matched data: 0 bytes
File list size: 2.87M
File list generation time: 0.378 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 98.48K
Total bytes received: 22.46M

sent 98.48K bytes  received 22.46M bytes  28.57K bytes/sec
total size is 220.11M  speedup is 9.76


Speedtest from work (Cox):
IP_ADDRESS        70.163.25.XXX
TEST_DATE         1/2/2019 6:56 PM
TIME_ZONE         GMT
DOWNLOAD_MEGABITS 96.26
UPLOAD_MEGABITS   20.63
LATENCY_MS        25
SERVER_NAME       Atlanta, GA
DISTANCE_MILES    550
CONNECTION_MODE   multi


Speedtest from home (FIOS):
$ speedtest-cli --server 3165
Retrieving speedtest.net configuration...
Testing from Verizon Fios (100.15.83.XXX)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Georgia Institute of Technology (Atlanta, GA) [877.47 km]: 32.862 ms
Testing download speed........
Download: 80.25 Mbit/s
Testing upload speed.........
Upload: 76.12 Mbit/s
Comment 14 Alec Warner archtester Gentoo Infrastructure gentoo-dev Security 2019-01-02 20:09:25 UTC
Hi, I'm the mirror admin for Gentoo, and I'm definitely interested in expectations around how long operations take.

(In reply to Andrey Hippo from comment #13)
> Same issue for me -- GA Tech mirror is painfully slow, and
> rsync.us.gentoo.org resolves very frequently to GA Tech, and there is no
> blacklist option available.

You say painful, but can you describe that in more detail? Is it painful because your bandwidth should be 1Gbit but rsync stats say it's 30KB/s? What kind of speed do you expect?

How fast should emerge --sync be to be considered not painful?

Comment 15 Alec Warner archtester Gentoo Infrastructure gentoo-dev Security 2019-01-02 21:54:50 UTC
For example, from home (1 Gbit FIOS) I see:

Number of files: 163,370 (reg: 135,875, dir: 27,495)
Number of created files: 3 (reg: 3)
Number of deleted files: 0
Number of regular files transferred: 20
Total file size: 220,136,057 bytes
Total transferred file size: 334,558 bytes
Literal data: 101,492 bytes
Matched data: 233,066 bytes
File list size: 3,736,844
File list generation time: 0.080 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 118,315
Total bytes received: 4,407,968

sent 118,315 bytes  received 4,407,968 bytes  6,112.47 bytes/sec
total size is 220,136,057  speedup is 48.64

real    12m19.890s
user    0m1.072s
sys     0m1.293s

So from my view this is 'painful' because it takes 12 minutes and we transferred very little actual data (< 5MB). I don't actually think it's bandwidth; the number (about 6 KB/s) is just a mean that rsync calculated by dividing roughly 4.5MB by 12 minutes. Note that the above is an incremental sync; it's not even doing a full tree.

To compare:

Number of files: 163,370 (reg: 135,875, dir: 27,495)
Number of created files: 0
Number of deleted files: 0
Number of regular files transferred: 10
Total file size: 220,136,062 bytes
Total transferred file size: 5,250 bytes
Literal data: 5,250 bytes
Matched data: 0 bytes
File list size: 3,185,037
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 100,886
Total bytes received: 3,770,811

sent 100,886 bytes  received 3,770,811 bytes  455,493.76 bytes/sec
total size is 220,136,062  speedup is 56.86

real    0m8.242s
user    0m0.826s
sys     0m0.992s

This mirror is a single-core GCP container with 4GB of RAM, serving rsync out of RAM (basically no disks). I don't intend to highlight the bandwidth (which, again, is just a mean that rsync computes as data_sent / total_time) but rather that the incremental sync itself takes < 60s, which is more in line with what I would expect (and is really the metric that I'm using to evaluate performance).

-A
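
Since rsync's summary rate is just total bytes moved divided by elapsed time, the figure can be inverted to recover roughly how long each run took. A small Python sketch of that arithmetic, using only the numbers from the two runs in this comment (the function name is hypothetical):

```python
def implied_seconds(sent: int, received: int, rate: float) -> float:
    """Invert rsync's summary rate: (sent + received) / rate = elapsed seconds."""
    return (sent + received) / rate


# Figures copied from the two runs in this comment.
slow = implied_seconds(118_315, 4_407_968, 6_112.47)
fast = implied_seconds(100_886, 3_770_811, 455_493.76)

print(f"slow run: {slow:.1f} s (about {slow / 60:.1f} min)")
print(f"fast run: {fast:.1f} s")
```

The results land close to the `real` times reported by time (12m19.890s and 0m8.242s), which supports reading the bytes/sec figure as a restatement of elapsed time and not as a measure of link bandwidth.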
Comment 16 Andrey Hippo 2019-01-02 23:24:21 UTC
(In reply to Alec Warner from comment #14)
> Hi I'm the mirror admin for Gentoo and I'm definitely interested in
> expectations around how long operations take.
> 
> (In reply to Andrey Hippo from comment #13)
> > Same issue for me -- GA Tech mirror is painfully slow, and
> > rsync.us.gentoo.org resolves very frequently to GA Tech, and there is no
> > blacklist option available.
> 
> You say painful, but can you describe that in more detail? Is it painful
> because your bandwidth should be 1Gbit but rsync stats say it's 30KB/s? What
> kind of speed do you expect?
> 
> How fast should emerge --sync be to be considered not painful?

Sure.
In addition to the `emerge --sync` cron job, I tend to run it interactively.
So, I perceive `emerge --sync` to be "slow" if I can read file names as they are being rsync'ed.
Similarly, I perceive `emerge --sync` to be "fast" if the lines and filenames fly by.
I don't have hard numbers though.

I'll measure `emerge --sync` time tomorrow to have some numbers.
(I just synced a few hours ago, so there are not many changes out there right now)

> Is it painful because your bandwidth should be 1Gbit but rsync stats say it's 30KB/s?
No, not really.
For "slow" rsync mirrors, I don't usually have rsync stats at all because I simply abort `emerge --sync` and rerun it.
(so that it hopefully picks up another mirror from the rotation)

Total rsync time is probably a good metric as you say below.

(In reply to Alec Warner from comment #15)
> So for myself for example (1gbit FIOS) from home:
> <snip>
> 
> So from my view this is 'painful' because it takes 12 minutes and we
> transferred very little actual data (< 5MB). I don't actually think it's
> bandwidth, the number (6kb/s) is just a mean that it calculated by doing
> 4.5MB / 12 minutes (yields about 6KB/s). Note that the above is an
> incremental sync; it's not even doing a full tree.
> 
> To compare:
> 
> <snip>
> 
> real    0m8.242s
> user    0m0.826s
> sys     0m0.992s
> 
> This mirror is a single core GCP container with 4GB of RAM serving rsync out
> of RAM (no disks, basically.)
Which mirror is this?
I'm eager to start using it instead of rsync.us.gentoo.org rotation. :)

> I don't intend to highlight the bandwidth
> (which again is just some mean number rsync computes of data_sent /
> total_time) but instead to highlight that the incremental sync itself takes
> < 60s which is more in line with what I would expect (and is really the
> metric that I'm using to evaluate performance).


Yeah, you're right that the average speed reported by rsync is not a good metric.
I agree that the total time rsync takes is a better metric.

Unfortunately, signature verification performed after rsync now takes a good portion of `emerge --sync` time.
Comment 17 Alec Warner archtester Gentoo Infrastructure gentoo-dev Security 2019-01-02 23:49:51 UTC
(In reply to Andrey Hippo from comment #16)
> (In reply to Alec Warner from comment #14)
> > Hi I'm the mirror admin for Gentoo and I'm definitely interested in
> > expectations around how long operations take.
> > 
> > (In reply to Andrey Hippo from comment #13)
> > > Same issue for me -- GA Tech mirror is painfully slow, and
> > > rsync.us.gentoo.org resolves very frequently to GA Tech, and there is no
> > > blacklist option available.
> > 
> > You say painful, but can you describe that in more detail? Is it painful
> > because your bandwidth should be 1Gbit but rsync stats say it's 30KB/s? What
> > kind of speed do you expect?
> > 
> > How fast should emerge --sync be to be considered not painful?
> 
> Sure.
> In addition to the `emerge --sync` cron job, I tend to run it interactively.
> So, I perceive `emerge --sync` to be "slow" if I can read file names as they
> are being rsync'ed.
> Similarly, I perceive `emerge --sync` to be "fast" if the lines and
> filenames fly by.
> I don't have hard numbers though.
> 
> I'll measure `emerge --sync` time tomorrow to have some numbers.
> (I just synced a few hours ago, so there are not many changes out there
> right now)
> 
> > Is it painful because your bandwidth should be 1Gbit but rsync stats say it's 30KB/s?
> No, not really.
> For "slow" rsync mirrors, I don't usually have rsync stats at all because I
> simply abort `emerge --sync` and rerun it.
> (so that it hopefully picks up another mirror from the rotation)
> 
> Total rsync time is probably a good metric as you say below.

On my end I'm using an SSD and I'd expect emerge --sync to take < 60s, so that is nominally the experience I'm aiming for.

> 
> (In reply to Alec Warner from comment #15)
> > So for myself for example (1gbit FIOS) from home:
> > <snip>
> > 
> > So from my view this is 'painful' because it takes 12 minutes and we
> > transferred very little actual data (< 5MB). I don't actually think it's
> > bandwidth, the number (6kb/s) is just a mean that it calculated by doing
> > 4.5MB / 12 minutes (yields about 6KB/s). Note that the above is an
> > incremental sync; it's not even doing a full tree.
> > 
> > To compare:
> > 
> > <snip>
> > 
> > real    0m8.242s
> > user    0m0.826s
> > sys     0m0.992s
> > 
> > This mirror is a single core GCP container with 4GB of RAM serving rsync out
> > of RAM (no disks, basically.)
> Which mirror is this?
> I'm eager to start using it instead of rsync.us.gentoo.org rotation. :)

35.190.132.250 is currently in the rsync.us.gentoo.org rotation; so you are plausibly already using it.

> 
> > I don't intend to highlight the bandwidth
> > (which again is just some mean number rsync computes of data_sent /
> > total_time) but instead to highlight that the incremental sync itself takes
> > < 60s which is more in line with what I would expect (and is really the
> > metric that I'm using to evaluate performance).
> 
> 
> Yeah, you're right that the average speed reported by rsync is not a good
> metric.
> I agree that the total time rsync takes is a better metric.
> 
> Unfortunately, signature verification performed after rsync now takes a good
> portion of `emerge --sync` time.

Yeah, I'm not considering this as part of my evaluation because nothing I do on the rsync side will help here, but it's something to keep in mind.

-A
Comment 18 Andrey Hippo 2019-01-04 03:31:30 UTC
(In reply to Andrey Hippo from comment #16)
> I'll measure `emerge --sync` time tomorrow to have some numbers.
> (I just synced a few hours ago, so there are not many changes out there
> right now)

I timed how long rsync takes with the same arguments that `emerge --sync` uses.
(/usr/portage/.tmp-unverified-download-quarantine.* were `cp -al`'ed from /usr/portage/.tmp-unverified-download-quarantine, prepared by Portage)

1. Fast
 $ /usr/bin/time -v rsync --recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git --verbose --progress rsync://35.190.132.250/gentoo-portage/ /usr/portage/.tmp-unverified-download-quarantine.1
 Number of regular files transferred: 504
 sent 39.33K bytes  received 8.90M bytes  940.71K bytes/sec
 Elapsed (wall clock) time (h:mm:ss or m:ss): 0:09.33

2. Almost fast
 $ /usr/bin/time -v rsync --recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git --verbose --progress rsync://rsync25.us.gentoo.org/gentoo-portage/ /usr/portage/.tmp-unverified-download-quarantine.2
 Number of regular files transferred: 502
 sent 39.49K bytes  received 9.95M bytes  605.34K bytes/sec
 Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.76

3. Painfully slow
 First, GA Tech mirror timed out 3 times after "receiving incremental file list".
 Then it was finally able to go through, slowly (the IP address was likely 128.61.111.9).
 $ /usr/bin/time -v rsync --recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=30 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git --verbose --progress rsync://rsync3.us.gentoo.org/gentoo-portage/ /usr/portage/.tmp-unverified-download-quarantine.3
 Number of regular files transferred: 502
 sent 39.51K bytes  received 10.09M bytes  13.66K bytes/sec
 Elapsed (wall clock) time (h:mm:ss or m:ss): 12:20.82
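
To put the three runs on one scale, the elapsed fields printed by /usr/bin/time -v can be converted to seconds. A minimal Python sketch, assuming the "h:mm:ss or m:ss" layout shown in the output above; the function name is hypothetical and the host labels are simply the ones from the three commands:

```python
def elapsed_seconds(text: str) -> float:
    """Convert GNU time's 'h:mm:ss' or 'm:ss' elapsed field to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed times copied from the three runs above.
runs = {
    "35.190.132.250": "0:09.33",
    "rsync25.us.gentoo.org": "0:16.76",
    "rsync3.us.gentoo.org": "12:20.82",
}

fastest = min(elapsed_seconds(t) for t in runs.values())
for host, text in runs.items():
    secs = elapsed_seconds(text)
    print(f"{host}: {secs:.2f} s ({secs / fastest:.1f}x the fastest run)")
```

All three runs transferred about 500 files, so the difference in wall-clock time is the useful comparison here.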
Comment 19 Bob Johnson 2020-03-09 23:27:47 UTC
Most of the reporters on here are seeing blazingly fast speeds from Georgia Tech. It takes my system 2 to 3 HOURS to download from that mirror, and almost invariably the manifest verification fails after it finally completes:

Number of files: 155,167 (reg: 128,607, dir: 26,560)
Number of created files: 195 (reg: 187, dir: 8)
Number of deleted files: 180 (reg: 173, dir: 7)
Number of regular files transferred: 1,085
Total file size: 205.19M bytes
Total transferred file size: 9.73M bytes
Literal data: 9.73M bytes
Matched data: 0 bytes
File list size: 3.71M
File list generation time: 6.032 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 51.48K
Total bytes received: 13.78M

sent 51.48K bytes  received 13.78M bytes  1.38K bytes/sec
total size is 205.19M  speedup is 14.83
 * Manifest timestamp: 2020-03-09 16:39:06 UTC
 * Valid OpenPGP signature found:
 * - primary key: DCD05B71EAB94199527F44ACDB6B8C1F96D8BF6D
 * - subkey: E1D6ABB63BFCFB4BA02FDF1CEC590EEAC9189250
 * - timestamp: 2020-03-09 16:39:06 UTC
 * Verifying /usr/portage/.tmp-unverified-download-quarantine ...!!! Manifest verification failed:
Manifest mismatch for metadata/Manifest.gz
  __size__: expected: 2837, have: 2835

Action: sync for repo: gentoo, returned code = 1

None of the other mirrors in the US rotation have that problem, typically taking only a minute or less to completely download approximately the same number of bytes on my 50M cable ISP connection.
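
As a rough cross-check of the "2 to 3 HOURS" figure, the rsync summary line above can be converted back into an approximate duration. This is only a sketch, assuming the reported rate is roughly (sent + received) / elapsed and that --human-readable prints decimal units (K = 1000, M = 1000^2). The helper names parse_size and estimate_seconds are hypothetical, not part of rsync or Portage.

```python
import re

# Assumption: rsync --human-readable prints decimal units (K = 1000, M = 1000**2).
UNITS = {"": 1, "K": 1000, "M": 1000**2, "G": 1000**3}

# Matches lines such as:
#   sent 51.48K bytes  received 13.78M bytes  1.38K bytes/sec
SUMMARY = re.compile(
    r"sent\s+([\d.,]+)([KMG]?) bytes\s+received\s+([\d.,]+)([KMG]?) bytes\s+"
    r"([\d.,]+)([KMG]?) bytes/sec"
)


def parse_size(number, unit):
    """Convert a value such as ('13.78', 'M') to a byte count."""
    return float(number.replace(",", "")) * UNITS[unit]


def estimate_seconds(summary_line):
    """Estimate the transfer duration from an rsync summary line."""
    m = SUMMARY.search(summary_line)
    if m is None:
        raise ValueError("not an rsync summary line")
    sent = parse_size(m.group(1), m.group(2))
    received = parse_size(m.group(3), m.group(4))
    rate = parse_size(m.group(5), m.group(6))
    return (sent + received) / rate


line = "sent 51.48K bytes  received 13.78M bytes  1.38K bytes/sec"
print(f"approx. {estimate_seconds(line) / 3600:.1f} hours")
```

For the summary line in this comment, the estimate comes out at about 2.8 hours, which is consistent with the 2 to 3 hours observed.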
Comment 20 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2020-03-10 18:42:24 UTC
(In reply to Bob Johnson from comment #19)
> Most of the reporters on here are seeing blazingly fast speeds from Georgia
> Tech. It takes my system 2 to 3 HOURS to download from that mirror, and
> almost invariably the manifest verification fails after it finally completes:
Yes, if the sync takes this long, your download is effectively a smeared copy of the files over time. Since some files come from before a modification window and others from after it, there is a high chance the Manifest wasn't from the same point in time.

> sent 51.48K bytes  received 13.78M bytes  1.38K bytes/sec
> total size is 205.19M  speedup is 14.83
...
> None of the other mirrors in the US rotation have that problem, typically
> taking only a minute or less to completely download approximately the same
> number of bytes on my 50M cable ISP connection.

We're really suffering from lack of reproducibility in this bug: 5 users over 4 years, and nobody in Gentoo infra has been able to reproduce it from our own systems.

So for somebody that CAN reproduce it, here's what needs to happen:
1. Reduce the command down to the simplest possible variant: ideally one that depends on no local state and makes just ONE call to rsync, with no other code.
2. Capture a tcpdump of the ENTIRE transaction, every single byte of it.

For the definition of "fast" vs "slow": it needs to complete within the window of a single mirror sync, so 30 minutes at the absolute maximum, and ideally half of that.
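
As an illustration only, the two steps above could be prepared roughly as follows. This sketch just builds the two command lines and prints them; it does not run anything. The function name build_capture_commands, the interface eth0, the pcap filename, and the empty destination directory /tmp/gentoo-portage-test are all assumptions chosen for the example, and the IP address is the one from the original report.

```python
import shlex


def build_capture_commands(host, dest, iface="eth0", pcap="rsync-capture.pcap"):
    """Return (tcpdump_argv, rsync_argv) for one self-contained sync attempt."""
    # -s 0 captures full packets; the "host" filter limits the capture to the mirror.
    tcpdump = ["tcpdump", "-n", "-i", iface, "-s", "0", "-w", pcap, "host", host]
    # A single rsync call into a fresh directory, so no local state is involved.
    rsync = [
        "rsync", "--recursive", "--links", "--times", "--whole-file",
        "--stats", "--human-readable", "--timeout=180",
        f"rsync://{host}/gentoo-portage/", dest,
    ]
    return tcpdump, rsync


tcpdump_cmd, rsync_cmd = build_capture_commands(
    "128.61.111.9", "/tmp/gentoo-portage-test"
)
print(shlex.join(tcpdump_cmd))  # start this first, in a separate terminal
print(shlex.join(rsync_cmd))    # then run this and stop tcpdump when it exits
```

Syncing into an empty directory means a full download, so the capture will be large, but it removes any dependence on the state of the local tree.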
Comment 21 Joshua Kinard gentoo-dev 2020-03-11 04:01:47 UTC
(In reply to Robin Johnson from comment #20)
> We're really suffering from lack of reproducibility in this bug: 5 users
> over 4 years, and none of the Gentoo infra system able to reproduce it from
> our own systems.
> 
> So for somebody that CAN reproduce it, here's what needs to happen:
> 1. Reduce the command down to the simplest possible variant, ideally depends
> on no local state, and has just ONE call to rsync, no other code.
> 2. Capture a tcpdump of the ENTIRE transaction, every single byte of it.
> 
> For the definition of "fast" vs "slow": it needs to complete within the
> window of a single mirror sync, so 30 minutes at the absolute maximum, and
> ideally half of that.

Easily reproduced for me on the same IP (128.61.111.9) that I initially reported on about 4 years ago.  Since then, my ISP has periodically raised my internet speed, and I've replaced both the cable modem and my router (OpenWRT -> FreeBSD 12.x).  I did not attempt to reduce the command, just ran "emerge --sync".  I captured about 30 minutes of the transfer before killing it, because of the mirror sync window.  It was transferring updates to 'dev-java/gnu-crypto' when I killed it.  My portage tree was only a few days out of date, so that is abysmally slow.

PCAP is uploaded to the root of my dev folder.  Just reach in and grab it.  The filename is "rsync_gtlib_gatech_edu.pcap.xz".  The series of ACKs and RSTs at the end is from Ctrl+C'ing the --sync.  If you want a capture of a full sync, I can do another one in a day or two.
Comment 22 Andrey Hippo 2020-03-14 16:00:36 UTC
(In reply to Robin Johnson from comment #20)

> We're really suffering from lack of reproducibility in this bug: 5 users
> over 4 years, and none of the Gentoo infra system able to reproduce it from
> our own systems.
> 
> So for somebody that CAN reproduce it, here's what needs to happen:
> 1. Reduce the command down to the simplest possible variant, ideally depends
> on no local state, and has just ONE call to rsync, no other code.
> 2. Capture a tcpdump of the ENTIRE transaction, every single byte of it.
> 
> For the definition of "fast" vs "slow": it needs to complete within the
> window of a single mirror sync, so 30 minutes at the absolute maximum, and
> ideally half of that.

Please find my pcap at the link below (it's 50MiB xz-compressed, so I can't attach it here):
https://drive.google.com/file/d/1Jmwwo1ygvEP3jIVM-esY3LRbftvxBur-/view

The pcap is captured by "tcpdump -n -i eth0 -p host rsync3.us.gentoo.org -s 0 -w rsync3.us.full.pcap -v" while running the following rsync command.
I started it around 1:04am EDT on March 11 and it took almost 6 hours (!) to complete.
rsync --recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git --verbose --progress rsync://rsync3.us.gentoo.org/gentoo-portage/ /usr/portage/.tmp-unverified-download-quarantine.3

Rsync stats:
Number of files: 155,235 (reg: 128,670, dir: 26,565)
Number of created files: 155,235 (reg: 128,670, dir: 26,565)
Number of deleted files: 0
Number of regular files transferred: 128,670
Total file size: 205.79M bytes
Total transferred file size: 205.79M bytes
Literal data: 205.79M bytes
Matched data: 0 bytes
File list size: 4.46M
File list generation time: 17.762 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 2.59M
Total bytes received: 215.55M

sent 2.59M bytes  received 215.55M bytes  10.74K bytes/sec
total size is 205.79M  speedup is 0.94

Elapsed (wall clock) time (h:mm:ss or m:ss): 5:38:28
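
To compare the wall-clock times reported in this bug against the sync window from comment 20 (30 minutes at most, ideally half of that), a small sketch like the following may help. The function names and the labels "ok", "borderline", and "too slow" are hypothetical, chosen only for this example.

```python
def elapsed_to_seconds(text):
    """Parse GNU time's 'h:mm:ss or m:ss' elapsed format into seconds."""
    parts = [float(p) for p in text.split(":")]
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours = 0.0
        minutes, seconds = parts
    else:
        raise ValueError(f"unexpected elapsed format: {text!r}")
    return hours * 3600 + minutes * 60 + seconds


def classify(seconds, window_minutes=30):
    """Label a sync duration against the mirror sync window."""
    window = window_minutes * 60
    if seconds <= window / 2:
        return "ok"          # within the ideal half-window
    if seconds <= window:
        return "borderline"  # still inside one sync window
    return "too slow"        # files may be smeared across mirror updates


for elapsed in ("0:09.33", "0:16.76", "12:20.82", "5:38:28"):
    print(elapsed, classify(elapsed_to_seconds(elapsed)))
```

By this measure the 12:20.82 run from comment 18 still fits inside the window, while the 5:38:28 run above is far outside it.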