Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 449644 - ftp.halifax.rwth-aachen.de missing newer stages
Summary: ftp.halifax.rwth-aachen.de missing newer stages
Status: RESOLVED FIXED
Alias: None
Product: Mirrors
Classification: Unclassified
Component: Server Problem (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: Mirror Admins
URL: http://ftp.halifax.rwth-aachen.de/gen...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-01-01 14:10 UTC by Chí-Thanh Christopher Nguyễn
Modified: 2014-12-16 22:07 UTC (History)
1 user (show)

See Also:
Package list:
Runtime testing required: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Chí-Thanh Christopher Nguyễn gentoo-dev 2013-01-01 14:10:47 UTC
All stages directories starting from 20121222/ seem to be emtpy. This has the consequence that e.g. current-stage3-armv6j_hardfp/ is also empty.

However latest-stage3-armv6j_hardfp.txt points to an existing stage in the 20121220/ directory

As this host appears to be in the distfiles.gentoo.org mirror rotation this can cause problems for users.
Comment 1 Chí-Thanh Christopher Nguyễn gentoo-dev 2013-01-16 15:17:34 UTC
The issue seems to affect all arches since 20121222

Please at least take ftp.halifax.rwth-aachen.de out of distfiles.g.o DNS rotation. Several users have already complained on IRC that the stages are missing.
Comment 2 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-01-16 16:35:10 UTC
manually verified missing stages. Removing.

remote: Pushing commit to ns0.gentoo.org (corvid)
remote: From git+ssh://git.gentoo.org/infra/dns-zones
remote:    28332a8..f0ad6bf  master     -> origin/master
remote: Updating 28332a8..f0ad6bf
remote: Fast-forward
remote:  zones/global/org.gentoo.distfiles |    3 ++-
remote:  1 files changed, 2 insertions(+), 1 deletions(-)

TTL is 7200 seconds (2h), so don't expect it gone until then (or longer.)

We still need to contact them and tell them to fix it.

-A
Comment 3 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-01-17 05:54:56 UTC
Email sent to their mirror-admin...hopefully I didn't screw that up.

-A
Comment 4 Carsten Otto 2013-01-17 08:35:38 UTC
Hi,

I'm one of the mirror admins. The error might be on your side (on masterdistfiles-eu.gentoo.org).
The following errors started to pop up on December 22nd (04:31 German time).

rsync: connection unexpectedly closed (279202474 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.]
rsync: connection unexpectedly closed (1680614 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]

Best regards,
Carsten
Comment 5 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-01-17 09:16:18 UTC
I sessionized our rsync logs:

kea rsync # for file in /tmp/rsyncd.log.*; do awk -F[ '{ print $2 }' $file | cut -f 1 -d] | sort -u | zgrep -F -f - $(basename ${file%_*}); done | awk '/connect/ { print $8 }' | sort | uniq -c
    131 ftp.halifax.rwth-aachen.de
      1 ftp.prz.edu.pl

Those are the only two hosts experiencing issues. For some reason your mirror disconnects from mastermirror-eu. I can confirm that the last non-error sync was late Dec 21, 2012.

I'll try a tcpdump capture during your attempted sync so we can see what is happening traffic wise.

-A
Comment 6 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2013-01-17 11:23:13 UTC
Add the mirror admin to CC so he/they can resolve it, please.

http://mirrorstats.gentoo.org/#de
Comment 7 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-01-17 21:44:59 UTC
I have started a tcpdump on mirrormaster-eu to hopefully add insight into why the rsync connections are timing out.

We also have netflow data from 20121222 (the date these syncs started to go bad.) I'm unfamiliar with how to read the flow data, but I will try to figure that part out.

-A
Comment 8 Carsten Otto 2013-01-18 10:11:36 UTC
If it helps, here are the last few lines of the last rsync runs. Maybe your file system is slow on the last files, or transfer in general takes too long and hits a timeout?

==> rsync-ftpsync-gentoo.log.0 <==
releases/x86/autobuilds/latest-stage3-i686.txt
releases/x86/autobuilds/latest-stage3.txt
snapshots/
snapshots/.timestamp-snapshots.x
snapshots/portage-20130110.tar.bz2

==> rsync-ftpsync-gentoo.log.1 <==
releases/x86/autobuilds/latest-stage3-i686.txt
releases/x86/autobuilds/latest-stage3.txt
snapshots/
snapshots/.timestamp-snapshots.x
snapshots/portage-20130110.tar.bz2

==> rsync-ftpsync-gentoo.log.2 <==
releases/x86/autobuilds/latest-stage3-i686-hardened.txt
releases/x86/autobuilds/latest-stage3-i686.txt
releases/x86/autobuilds/latest-stage3.txt
snapshots/.timestamp-snapshots.x
snapshots/portage-20130109.tar.bz2

==> rsync-ftpsync-gentoo.log.3 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.4 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.5 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.6 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.7 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.8 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.9 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.10 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.11 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.12 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.13 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso

==> rsync-ftpsync-gentoo.log.14 <==
releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt
releases/sparc/autobuilds/latest-stage3-sparc64.txt
releases/sparc/autobuilds/latest-stage3.txt
releases/x86/20121221/
releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso
Comment 9 Carsten Otto 2013-01-18 10:12:25 UTC
I just noticed that the number of transferred bytes always is roughly the same:

rsync-ftpsync-gentoo.error.0:rsync: connection unexpectedly closed (1681810 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.1:rsync: connection unexpectedly closed (1681288 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.10:rsync: connection unexpectedly closed (1679981 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.11:rsync: connection unexpectedly closed (1680238 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.12:rsync: connection unexpectedly closed (1680332 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.13:rsync: connection unexpectedly closed (34061600 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.13:rsync: connection unexpectedly closed (1680243 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.14:rsync: connection unexpectedly closed (1680409 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.2:rsync: connection unexpectedly closed (113423523 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.2:rsync: connection unexpectedly closed (1680622 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.3:rsync: connection unexpectedly closed (1679873 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.4:rsync: connection unexpectedly closed (178224813 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.4:rsync: connection unexpectedly closed (1679931 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.5:rsync: connection unexpectedly closed (1680698 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.6:rsync: connection unexpectedly closed (26172014 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.6:rsync: connection unexpectedly closed (1680449 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.7:rsync: connection unexpectedly closed (279202474 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.7:rsync: connection unexpectedly closed (1680614 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.8:rsync: connection unexpectedly closed (325185712 bytes received so far) [receiver]
rsync-ftpsync-gentoo.error.8:rsync: connection unexpectedly closed (1680183 bytes received so far) [generator]
rsync-ftpsync-gentoo.error.9:rsync: connection unexpectedly closed (1679983 bytes received so far) [generator]
Comment 10 Carsten Otto 2013-01-23 12:26:53 UTC
Any news on this?
Comment 11 Carsten Otto 2013-01-29 20:14:33 UTC
I'd love to see my mirror being working again. So... what's the status? Thanks!
Comment 12 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-02-02 21:48:18 UTC
Apologies, I am going to look at the tcpdump today.

-A
Comment 13 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-02-03 08:24:58 UTC
I examined the stream at 2013-01-26 23:24:40 (UTC)

# You ACK, and say 'I can receive 512 bytes'
61532	2013-01-26 23:24:40.570918	137.226.34.42	194.116.84.30	TCP	60	57799 > rsync [ACK] Seq=19998985 Ack=78018360 Win=512 Len=0

# We send you 512 bytes, and say 'WINDOW FULL'
61533	2013-01-26 23:24:40.821932	194.116.84.30	137.226.34.42	RSYNC	566	[TCP Window Full] Data
# You ACK our reply; window=0
61534	2013-01-26 23:24:40.837448	137.226.34.42	194.116.84.30	TCP	60	[TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0

# We send keep-alive, we can transmit 7561216 bytes
61535	2013-01-26 23:24:41.081931	194.116.84.30	137.226.34.42	TCP	54	[TCP Keep-Alive] rsync > 57799 [ACK] Seq=78018871 Ack=19998985 Win=7561216 Len=0

# You reply (as per RFC 793), still ZeroWindow
61536	2013-01-26 23:24:41.097300	137.226.34.42	194.116.84.30	TCP	60	[TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0

<This game of TCP KeepAlive + ZeroWindow goes on for ~4 minutes.>

61553	2013-01-26 23:28:48.029944	194.116.84.30	137.226.34.42	TCP	54	[TCP Keep-Alive] rsync > 57799 [ACK] Seq=78018871 Ack=19998985 Win=7561216 Len=0
61554	2013-01-26 23:28:48.045128	137.226.34.42	194.116.84.30	TCP	60	[TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0
61555	2013-01-26 23:30:40.870891	194.116.84.30	137.226.34.42	TCP	54	rsync > 57799 [RST, ACK] Seq=78018872 Ack=19998985 Win=7561216 Len=0

This RST is not super clear to me. We sent keep-alives, and received responses. The TCP RFC is pretty clear that we should not drop the connection due to the ZeroWindow (we should technically wait for you to re-open.) However I imagine at some point TCP will just be like 'screw this' and close the connection. The keep-alives get progressively further apart. 

200ms
500ms
1000ms
2000ms
4000ms
6000ms
16000ms
30000ms
60000ms
12000ms
RST

So looking at the stream; ZeroWindow happens all the time. Usually just 1-2 keepalives (so 200-700ms) are needed, followed by a TCP Window Update packet.

# Here is a subset of packets from the same stream, 2 seconds earlier.
61502	2013-01-26 23:24:37.814633	137.226.34.42	194.116.84.30	TCP	60	57799 > rsync [ACK] Seq=19998985 Ack=77997332 Win=512 Len=0
61503	2013-01-26 23:24:38.065934	194.116.84.30	137.226.34.42	RSYNC	566	[TCP Window Full] Data
61504	2013-01-26 23:24:38.081544	137.226.34.42	194.116.84.30	TCP	60	[TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=0 Len=0
61505	2013-01-26 23:24:38.325932	194.116.84.30	137.226.34.42	TCP	54	[TCP Keep-Alive] rsync > 57799 [ACK] Seq=77997843 Ack=19998985 Win=7561216 Len=0
61506	2013-01-26 23:24:38.341609	137.226.34.42	194.116.84.30	TCP	60	[TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=0 Len=0
61507	2013-01-26 23:24:38.413927	137.226.34.42	194.116.84.30	TCP	60	[TCP Window Update] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=2048 Len=0

In this sub-conversation, Zero Window lasted about 600ms, before you re-opened the window at 2048 bytes.

This is all very weird anyhow. This one TCP stream started at 23:00:00 UTC and ran fine for 23 minutes (no ZeroWindow condition anywhere in the stream.) Then suddenly at:

55440	2013-01-26 23:21:31.858980	137.226.34.42	194.116.84.30	TCP	60	57799 > rsync [ACK] Seq=19998985 Ack=61744153 Win=11606528 Len=0

This was the last time you ever increased the TCP receive window any significant amount. We transmitted fine for another minute or so.

# This is the first ZeroWindow I could find.
60127	2013-01-26 23:22:42.797932	194.116.84.30	137.226.34.42	RSYNC	566	[TCP Window Full] Data

However after this your side would only reopen the window for some tiny number of bytes (usually < 8kb.) Is there something on your side that would cause rsync to be unable to receive data?

Note that both my wireshark and my TCP is a bit rusty; most of this was digging through RFCs.

-A
Comment 14 Alec Warner (RETIRED) archtester gentoo-dev Security 2013-02-03 08:26:56 UTC
Sorry that was unclear. When I said 'ZeroWindow happens all the time' I meant that it happened very often after 23:21 in this stream.
Comment 15 Carsten Otto 2013-02-03 14:33:36 UTC
I see that, after several minutes of unproblematic transfer, we decrease the window size until it reaches 0. After that a few tiny windows are used, but nothing really happens. According to "atop" the Gentoo LVM is at 100% usage, although only (constantly!) 0.11 MByte/sec are written and IO is (constant!) at 18ms-19ms. I run "sync", which did not change anything. I will investigate this further (in the next few days). Maybe our policy of doing a lot in RAM is the problem (we have 92 GByte of RAM in this machine), it could be that after roughly 20 minutes the data is written to disk and the connection needs to wait until this is done?
Comment 16 Carsten Otto 2013-02-05 19:11:04 UTC
It seems the issue is resolved? At least I did not get any error mail after my last comment.
Comment 17 Carsten Otto 2013-03-19 11:22:13 UTC
*bounce* *wave* What's the status? I don't see progress and we also get way less traffic for Gentoo.
Comment 18 Chí-Thanh Christopher Nguyễn gentoo-dev 2013-07-27 18:37:14 UTC
Yes it appears that the problem has gone away.
infra: can you add the mirror back to DNS rotation?
Comment 19 Alex Legler (RETIRED) archtester gentoo-dev Security 2014-12-16 22:07:29 UTC
(In reply to Chí-Thanh Christopher Nguyễn from comment #18)
> Yes it appears that the problem has gone away.
> infra: can you add the mirror back to DNS rotation?

done.