Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 738738 - [2a00:1828:a00d:ffff::6] / 89.238.71.6 (turnstone.gentoo.org) rsync mirror resets connections
Summary: [2a00:1828:a00d:ffff::6] / 89.238.71.6 (turnstone.gentoo.org) rsync mirror re...
Status: IN_PROGRESS
Alias: None
Product: Mirrors
Classification: Unclassified
Component: Server Problem (show other bugs)
Hardware: All Linux
: Normal normal
Assignee: Mirror Admins
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-08-23 18:51 UTC by Vjaceslavs Klimovs
Modified: 2022-01-23 22:43 UTC (History)
4 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Vjaceslavs Klimovs 2020-08-23 18:51:52 UTC
* Running emerge --sync
>>> Syncing repository 'gentoo' into '/var/db/repos/gentoo'...
 * Using keys from /usr/share/openpgp-keys/gentoo-release.asc
 * Refreshing keys via WKD ...                                                                                                                                                                              [ ok ]
>>> Starting rsync with rsync://[2a00:1828:a00d:ffff::6]/gentoo-portage...
>>> Checking server timestamp ...
rsync: [Receiver] safe_read failed to read 1 bytes: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(276) [Receiver=3.2.3]
>>> Retrying...


>>> Starting retry 1 of 1 with rsync://89.238.71.6/gentoo-portage
>>> Checking server timestamp ...
rsync: [Receiver] safe_read failed to read 1 bytes: Connection reset by peer (104)
rsync error: error in rsync protocol data stream (code 12) at io.c(276) [Receiver=3.2.3]
>>> Retrying...
!!! Exhausted addresses for rsync.gentoo.org

Action: sync for repo: gentoo, returned code = 1


 * emerge --sync failed
 * Time statistics:
     4 seconds for syncing
     4 seconds total
Comment 1 Alec Warner (RETIRED) archtester gentoo-dev Security 2020-08-23 19:50:44 UTC
a few questions here:
 - how far are you from turnstone (like a ping, what is your RTT?)
 - The log you provided implies we tried twice for 2 seconds a piece; so this isn't strictly a timeout problem?
 - Can you provide your client IP (so I can look at the logs?)

Is SYNC set in /etc/portage/make.conf ?

-A
Comment 2 Vjaceslavs Klimovs 2020-08-23 20:04:17 UTC
1) Distance from turnstone:
pulley ~ # ping turnstone.gentoo.org
PING turnstone.gentoo.org(2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6)) 56 data bytes
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=1 ttl=51 time=166 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=2 ttl=51 time=167 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=3 ttl=51 time=169 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=4 ttl=51 time=168 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=5 ttl=51 time=168 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=6 ttl=51 time=166 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=7 ttl=51 time=169 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=8 ttl=51 time=166 ms
64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=9 ttl=51 time=169 ms

2) Capture while attempting sync:
pulley ~ # tcpdump -i eno1 'host 2a00:1828:a00d:ffff::6 or host 89.238.71.6'
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), capture size 262144 bytes
13:00:59.831597 IP6 2600:1700:5450:d9a1::1.52960 > 2a00:1828:a00d:ffff::6.rsync: Flags [S], seq 1567411857, win 64800, options [mss 1440,sackOK,TS val 1753053048 ecr 0,nop,wscale 7], length 0
13:01:00.000589 IP6 2a00:1828:a00d:ffff::6.rsync > 2600:1700:5450:d9a1::1.52960: Flags [S.], seq 3101754608, ack 1567411858, win 65535, options [mss 1440,nop,nop,TS val 3803797059 ecr 1753053048,nop,wscale 9], length 0
13:01:00.000638 IP6 2600:1700:5450:d9a1::1.52960 > 2a00:1828:a00d:ffff::6.rsync: Flags [.], ack 1, win 507, options [nop,nop,TS val 1753053217 ecr 3803797059], length 0
13:01:00.000734 IP6 2600:1700:5450:d9a1::1.52960 > 2a00:1828:a00d:ffff::6.rsync: Flags [P.], seq 1:15, ack 1, win 507, options [nop,nop,TS val 1753053217 ecr 3803797059], length 14
13:01:00.167964 IP6 2a00:1828:a00d:ffff::6.rsync > 2600:1700:5450:d9a1::1.52960: Flags [.], ack 15, win 128, options [nop,nop,TS val 3803797228 ecr 1753053217], length 0
13:01:00.167983 IP6 2a00:1828:a00d:ffff::6.rsync > 2600:1700:5450:d9a1::1.52960: Flags [R.], seq 1, ack 15, win 128, options [nop,nop,TS val 3803797228 ecr 1753053217], length 0
13:01:00.171558 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 > turnstone.gentoo.org.rsync: Flags [S], seq 2920500784, win 64240, options [mss 1460,sackOK,TS val 683869120 ecr 0,nop,wscale 7], length 0
13:01:00.338322 IP turnstone.gentoo.org.rsync > 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [S.], seq 664649618, ack 2920500785, win 65535, options [mss 1460,nop,nop,TS val 2196955357 ecr 683869120,nop,wscale 9], length 0
13:01:00.338381 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 > turnstone.gentoo.org.rsync: Flags [.], ack 1, win 502, options [nop,nop,TS val 683869286 ecr 2196955357], length 0
13:01:00.338442 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 > turnstone.gentoo.org.rsync: Flags [P.], seq 1:15, ack 1, win 502, options [nop,nop,TS val 683869287 ecr 2196955357], length 14
13:01:00.503479 IP turnstone.gentoo.org.rsync > 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [.], ack 15, win 128, options [nop,nop,TS val 2196955524 ecr 683869287], length 0
13:01:00.503840 IP turnstone.gentoo.org.rsync > 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [R.], seq 1, ack 15, win 128, options [nop,nop,TS val 2196955524 ecr 683869287], length 0

(looks like turnstone resets the connection)

3) Source addresses are 75.3.241.49 and 2600:1700:5450:d9a0::1
Comment 3 Vjaceslavs Klimovs 2020-08-23 20:05:37 UTC
repos.conf entry:

pulley ~ # cat /etc/portage/repos.conf/gentoo.conf
[DEFAULT]
main-repo = gentoo

[gentoo]
location = /var/db/repos/gentoo
sync-type = rsync
sync-uri = rsync://rsync.gentoo.org/gentoo-portage
auto-sync = yes
sync-rsync-verify-jobs = 1
sync-rsync-verify-metamanifest = yes
sync-rsync-verify-max-age = 24
sync-openpgp-key-path = /usr/share/openpgp-keys/gentoo-release.asc
sync-openpgp-keyserver = hkps://keys.gentoo.org
sync-openpgp-key-refresh-retry-count = 40
sync-openpgp-key-refresh-retry-overall-timeout = 1200
sync-openpgp-key-refresh-retry-delay-exp-base = 2
sync-openpgp-key-refresh-retry-delay-max = 60
sync-openpgp-key-refresh-retry-delay-mult = 4
sync-webrsync-verify-signature = yes
Comment 4 Alec Warner (RETIRED) archtester gentoo-dev Security 2020-08-23 20:46:19 UTC
(In reply to Vjaceslavs Klimovs from comment #2)
> 1) Distance from turnstone:
> pulley ~ # ping turnstone.gentoo.org
> PING turnstone.gentoo.org(2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6))
> 56 data bytes
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=1
> ttl=51 time=166 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=2
> ttl=51 time=167 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=3
> ttl=51 time=169 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=4
> ttl=51 time=168 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=5
> ttl=51 time=168 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=6
> ttl=51 time=166 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=7
> ttl=51 time=169 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=8
> ttl=51 time=166 ms
> 64 bytes from 2a00:1828:a00d:ffff::6 (2a00:1828:a00d:ffff::6): icmp_seq=9
> ttl=51 time=169 ms
> 
> 2) Capture while attempting sync:
> pulley ~ # tcpdump -i eno1 'host 2a00:1828:a00d:ffff::6 or host 89.238.71.6'
> dropped privs to tcpdump
> tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
> listening on eno1, link-type EN10MB (Ethernet), capture size 262144 bytes
> 13:00:59.831597 IP6 2600:1700:5450:d9a1::1.52960 >
> 2a00:1828:a00d:ffff::6.rsync: Flags [S], seq 1567411857, win 64800, options
> [mss 1440,sackOK,TS val 1753053048 ecr 0,nop,wscale 7], length 0
> 13:01:00.000589 IP6 2a00:1828:a00d:ffff::6.rsync >
> 2600:1700:5450:d9a1::1.52960: Flags [S.], seq 3101754608, ack 1567411858,
> win 65535, options [mss 1440,nop,nop,TS val 3803797059 ecr
> 1753053048,nop,wscale 9], length 0
> 13:01:00.000638 IP6 2600:1700:5450:d9a1::1.52960 >
> 2a00:1828:a00d:ffff::6.rsync: Flags [.], ack 1, win 507, options [nop,nop,TS
> val 1753053217 ecr 3803797059], length 0
> 13:01:00.000734 IP6 2600:1700:5450:d9a1::1.52960 >
> 2a00:1828:a00d:ffff::6.rsync: Flags [P.], seq 1:15, ack 1, win 507, options
> [nop,nop,TS val 1753053217 ecr 3803797059], length 14
> 13:01:00.167964 IP6 2a00:1828:a00d:ffff::6.rsync >
> 2600:1700:5450:d9a1::1.52960: Flags [.], ack 15, win 128, options
> [nop,nop,TS val 3803797228 ecr 1753053217], length 0
> 13:01:00.167983 IP6 2a00:1828:a00d:ffff::6.rsync >
> 2600:1700:5450:d9a1::1.52960: Flags [R.], seq 1, ack 15, win 128, options
> [nop,nop,TS val 3803797228 ecr 1753053217], length 0
> 13:01:00.171558 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 >
> turnstone.gentoo.org.rsync: Flags [S], seq 2920500784, win 64240, options
> [mss 1460,sackOK,TS val 683869120 ecr 0,nop,wscale 7], length 0
> 13:01:00.338322 IP turnstone.gentoo.org.rsync >
> 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [S.], seq
> 664649618, ack 2920500785, win 65535, options [mss 1460,nop,nop,TS val
> 2196955357 ecr 683869120,nop,wscale 9], length 0
> 13:01:00.338381 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 >
> turnstone.gentoo.org.rsync: Flags [.], ack 1, win 502, options [nop,nop,TS
> val 683869286 ecr 2196955357], length 0
> 13:01:00.338442 IP 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406 >
> turnstone.gentoo.org.rsync: Flags [P.], seq 1:15, ack 1, win 502, options
> [nop,nop,TS val 683869287 ecr 2196955357], length 14
> 13:01:00.503479 IP turnstone.gentoo.org.rsync >
> 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [.], ack 15, win
> 128, options [nop,nop,TS val 2196955524 ecr 683869287], length 0
> 13:01:00.503840 IP turnstone.gentoo.org.rsync >
> 75-3-241-49.lightspeed.sntcca.sbcglobal.net.35406: Flags [R.], seq 1, ack
> 15, win 128, options [nop,nop,TS val 2196955524 ecr 683869287], length 0
> 
> (looks like turnstone resets the connection)
> 
> 3) Source addresses are 75.3.241.49 and 2600:1700:5450:d9a0::1

COol

Aug 23 18:38:47 turnstone xinetd[3302]: refused connect from XXXX due to excessive load.

We have lost a few rsync nodes recently due to maintenance. I've done a few things:
 - Let rsync run at a higher priority.
 - Doubled the load limits.

Right now xinetd is dropping connections for you (and others) often due to high machine load. This works well on a machine that is not shared (rsync connections tend to take a while if there is data to transfer and throttling new reqs is often effective to sustain service.) On Turnstone, we have some batch jobs running in tehe backgruond that can drive up the load, so the load capping is less effective there.

The new limits should take effect in an hour or so.

-A
Comment 5 Alec Warner (RETIRED) archtester gentoo-dev Security 2020-08-24 18:07:44 UTC
https://infra-status.gentoo.org/notice/20200824-rsync

We have increased the number of serving nodes as well.  I will monitor the situation over the coming days but I believe things should be better now.

-A
Comment 6 Vjaceslavs Klimovs 2020-10-25 19:04:19 UTC
Should this be resolved now?
Comment 7 Jiahao_XU 2021-02-16 06:12:43 UTC
I also had this issue a couple of times and had to use `emerge-webrsync` instead.

I am in Australia, the ping to 89.238.71.6 is roughly 400ms.
Comment 8 Alec Warner (RETIRED) archtester gentoo-dev Security 2021-02-16 06:34:08 UTC
(In reply to Jiahao_XU from comment #7)
> I also had this issue a couple of times and had to use `emerge-webrsync`
> instead.
> 
> I am in Australia, the ping to 89.238.71.6 is roughly 400ms.

It still drops rsyncs from time to time, yes.

It looks like about 10% of syncs fail because the host is too busy.

-A
Comment 9 Jiahao_XU 2021-02-17 12:30:50 UTC
Just kidding, but maybe it's time to switch to P2P.
Comment 10 Jiahao_XU 2021-02-17 12:32:56 UTC
Maybe something like ipfs can be used to improve the situation.
Comment 11 Alec Warner (RETIRED) archtester gentoo-dev Security 2021-02-17 16:09:26 UTC
(In reply to Jiahao_XU from comment #10)
> Maybe something like ipfs can be used to improve the situation.

We have 43 mirrors; try using another one. rsync.asia.gentoo.org has 5 machines in it for example.

-A
Comment 12 Jennifer Deborja 2022-01-23 22:38:42 UTC
Device

Identifiers
Pseudo ID: ffffffff-e743-ec99-ffff-ffffef05ac4a
User agent: Mozilla/5.0 (Linux; Android 9; CPH2083 Build/PPR1.180610.011; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/97.0.4692.98 Mobile Safari/537.36__Dalvik/2.1.0 (Linux; U; Android 9; CPH2083 Build/PPR1.180610.011)

NFS
Present: No
Enabled: No

Display
Display Resolution: 1395x720
Screen Density: unknown
Screen Size: 5.82
Screen Refresh rate: 60.15 Hz

Date
Date: Jan 24, 2022
Time: 06:36:49
Uptime: 07:18:36

CPU
ABIS x32: armeabi-v7a_armeabi
ABIS x64: arm64-v8a
ABIS: arm64-v8a_armeabi-v7a_armeabiManufacturer: OPPO
Model: CPH2083
OS Codename: unknown
OS Version: 9
Language: en
Device Serial: "secret"
Display Version: CPH2083EX_11_A.65
Radio Version: M_V3_P10,M_V3_P10
Product: CPH2083
Board: oppo6765_19451
Hardware: mt6765
Device Rooted: No
Loader: unknown
Brand: OPPO
Host: Ubuntu-134-13
Release: 9
Codename: REL
SDK version: 28
Build ID: PPR1.180610.011
Core: nullDevice type: Phone
Phone type: GSM
Orientation: Portrait