All stages directories starting from 20121222/ seem to be emtpy. This has the consequence that e.g. current-stage3-armv6j_hardfp/ is also empty. However latest-stage3-armv6j_hardfp.txt points to an existing stage in the 20121220/ directory As this host appears to be in the distfiles.gentoo.org mirror rotation this can cause problems for users.
The issue seems to affect all arches since 20121222 Please at least take ftp.halifax.rwth-aachen.de out of distfiles.g.o DNS rotation. Several users have already complained on IRC that the stages are missing.
manually verified missing stages. Removing. remote: Pushing commit to ns0.gentoo.org (corvid) remote: From git+ssh://git.gentoo.org/infra/dns-zones remote: 28332a8..f0ad6bf master -> origin/master remote: Updating 28332a8..f0ad6bf remote: Fast-forward remote: zones/global/org.gentoo.distfiles | 3 ++- remote: 1 files changed, 2 insertions(+), 1 deletions(-) TTL is 7200 seconds (2h), so don't expect it gone until then (or longer.) We still need to contact them and tell them to fix it. -A
Email sent to their mirror-admin...hopefully I didn't screw that up. -A
Hi, I'm one of the mirror admins. The error might be on your side (on masterdistfiles-eu.gentoo.org). The following errors started to pop up on December 22nd (04:31 German time). rsync: connection unexpectedly closed (279202474 bytes received so far) [receiver] rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.] rsync: connection unexpectedly closed (1680614 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7] Best regards, Carsten
I sessionized our rsync logs: kea rsync # for file in /tmp/rsyncd.log.*; do awk -F[ '{ print $2 }' $file | cut -f 1 -d] | sort -u | zgrep -F -f - $(basename ${file%_*}); done | awk '/connect/ { print $8 }' | sort | uniq -c 131 ftp.halifax.rwth-aachen.de 1 ftp.prz.edu.pl Those are the only two hosts experiencing issues. For some reason your mirror disconnects from mastermirror-eu. I can confirm that the last non-error sync was late Dec 21, 2012. I'll try a tcpdump capture during your attempted sync so we can see what is happening traffic wise. -A
Add the mirror admin to CC so he/they can resolve it, please. http://mirrorstats.gentoo.org/#de
I have started a tcpdump on mirrormaster-eu to hopefully add insight into why the rsync connections are timing out. We also have netflow data from 20121222 (the date these syncs started to go bad.) I'm unfamiliar with how to read the flow data, but I will try to figure that part out. -A
If it helps, here are the last few lines of the last rsync runs. Maybe your file system is slow on the last files, or transfer in general takes too long and hits a timeout? ==> rsync-ftpsync-gentoo.log.0 <== releases/x86/autobuilds/latest-stage3-i686.txt releases/x86/autobuilds/latest-stage3.txt snapshots/ snapshots/.timestamp-snapshots.x snapshots/portage-20130110.tar.bz2 ==> rsync-ftpsync-gentoo.log.1 <== releases/x86/autobuilds/latest-stage3-i686.txt releases/x86/autobuilds/latest-stage3.txt snapshots/ snapshots/.timestamp-snapshots.x snapshots/portage-20130110.tar.bz2 ==> rsync-ftpsync-gentoo.log.2 <== releases/x86/autobuilds/latest-stage3-i686-hardened.txt releases/x86/autobuilds/latest-stage3-i686.txt releases/x86/autobuilds/latest-stage3.txt snapshots/.timestamp-snapshots.x snapshots/portage-20130109.tar.bz2 ==> rsync-ftpsync-gentoo.log.3 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.4 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.5 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.6 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.7 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.8 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.9 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.10 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.11 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.12 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.13 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso ==> rsync-ftpsync-gentoo.log.14 <== releases/sparc/autobuilds/latest-stage3-sparc64-multilib.txt releases/sparc/autobuilds/latest-stage3-sparc64.txt releases/sparc/autobuilds/latest-stage3.txt releases/x86/20121221/ releases/x86/20121221/livedvd-x86-amd64-32ul-20121221.iso
I just noticed that the number of transferred bytes always is roughly the same: rsync-ftpsync-gentoo.error.0:rsync: connection unexpectedly closed (1681810 bytes received so far) [generator] rsync-ftpsync-gentoo.error.1:rsync: connection unexpectedly closed (1681288 bytes received so far) [generator] rsync-ftpsync-gentoo.error.10:rsync: connection unexpectedly closed (1679981 bytes received so far) [generator] rsync-ftpsync-gentoo.error.11:rsync: connection unexpectedly closed (1680238 bytes received so far) [generator] rsync-ftpsync-gentoo.error.12:rsync: connection unexpectedly closed (1680332 bytes received so far) [generator] rsync-ftpsync-gentoo.error.13:rsync: connection unexpectedly closed (34061600 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.13:rsync: connection unexpectedly closed (1680243 bytes received so far) [generator] rsync-ftpsync-gentoo.error.14:rsync: connection unexpectedly closed (1680409 bytes received so far) [generator] rsync-ftpsync-gentoo.error.2:rsync: connection unexpectedly closed (113423523 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.2:rsync: connection unexpectedly closed (1680622 bytes received so far) [generator] rsync-ftpsync-gentoo.error.3:rsync: connection unexpectedly closed (1679873 bytes received so far) [generator] rsync-ftpsync-gentoo.error.4:rsync: connection unexpectedly closed (178224813 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.4:rsync: connection unexpectedly closed (1679931 bytes received so far) [generator] rsync-ftpsync-gentoo.error.5:rsync: connection unexpectedly closed (1680698 bytes received so far) [generator] rsync-ftpsync-gentoo.error.6:rsync: connection unexpectedly closed (26172014 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.6:rsync: connection unexpectedly closed (1680449 bytes received so far) [generator] rsync-ftpsync-gentoo.error.7:rsync: connection unexpectedly closed (279202474 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.7:rsync: connection unexpectedly closed (1680614 bytes received so far) [generator] rsync-ftpsync-gentoo.error.8:rsync: connection unexpectedly closed (325185712 bytes received so far) [receiver] rsync-ftpsync-gentoo.error.8:rsync: connection unexpectedly closed (1680183 bytes received so far) [generator] rsync-ftpsync-gentoo.error.9:rsync: connection unexpectedly closed (1679983 bytes received so far) [generator]
Any news on this?
I'd love to see my mirror being working again. So... what's the status? Thanks!
Apologies, I am going to look at the tcpdump today. -A
I examined the stream at 2013-01-26 23:24:40 (UTC) # You ACK, and say 'I can receive 512 bytes' 61532 2013-01-26 23:24:40.570918 137.226.34.42 194.116.84.30 TCP 60 57799 > rsync [ACK] Seq=19998985 Ack=78018360 Win=512 Len=0 # We send you 512 bytes, and say 'WINDOW FULL' 61533 2013-01-26 23:24:40.821932 194.116.84.30 137.226.34.42 RSYNC 566 [TCP Window Full] Data # You ACK our reply; window=0 61534 2013-01-26 23:24:40.837448 137.226.34.42 194.116.84.30 TCP 60 [TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0 # We send keep-alive, we can transmit 7561216 bytes 61535 2013-01-26 23:24:41.081931 194.116.84.30 137.226.34.42 TCP 54 [TCP Keep-Alive] rsync > 57799 [ACK] Seq=78018871 Ack=19998985 Win=7561216 Len=0 # You reply (as per RFC 793), still ZeroWindow 61536 2013-01-26 23:24:41.097300 137.226.34.42 194.116.84.30 TCP 60 [TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0 <This game of TCP KeepAlive + ZeroWindow goes on for ~4 minutes.> 61553 2013-01-26 23:28:48.029944 194.116.84.30 137.226.34.42 TCP 54 [TCP Keep-Alive] rsync > 57799 [ACK] Seq=78018871 Ack=19998985 Win=7561216 Len=0 61554 2013-01-26 23:28:48.045128 137.226.34.42 194.116.84.30 TCP 60 [TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=78018872 Win=0 Len=0 61555 2013-01-26 23:30:40.870891 194.116.84.30 137.226.34.42 TCP 54 rsync > 57799 [RST, ACK] Seq=78018872 Ack=19998985 Win=7561216 Len=0 This RST is not super clear to me. We sent keep-alives, and received responses. The TCP RFC is pretty clear that we should not drop the connection due to the ZeroWindow (we should technically wait for you to re-open.) However I imagine at some point TCP will just be like 'screw this' and close the connection. The keep-alives get progressively further apart. 200ms 500ms 1000ms 2000ms 4000ms 6000ms 16000ms 30000ms 60000ms 12000ms RST So looking at the stream; ZeroWindow happens all the time. Usually just 1-2 keepalives (so 200-700ms) are needed, followed by a TCP Window Update packet. # Here is a subset of packets from the same stream, 2 seconds earlier. 61502 2013-01-26 23:24:37.814633 137.226.34.42 194.116.84.30 TCP 60 57799 > rsync [ACK] Seq=19998985 Ack=77997332 Win=512 Len=0 61503 2013-01-26 23:24:38.065934 194.116.84.30 137.226.34.42 RSYNC 566 [TCP Window Full] Data 61504 2013-01-26 23:24:38.081544 137.226.34.42 194.116.84.30 TCP 60 [TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=0 Len=0 61505 2013-01-26 23:24:38.325932 194.116.84.30 137.226.34.42 TCP 54 [TCP Keep-Alive] rsync > 57799 [ACK] Seq=77997843 Ack=19998985 Win=7561216 Len=0 61506 2013-01-26 23:24:38.341609 137.226.34.42 194.116.84.30 TCP 60 [TCP ZeroWindow] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=0 Len=0 61507 2013-01-26 23:24:38.413927 137.226.34.42 194.116.84.30 TCP 60 [TCP Window Update] 57799 > rsync [ACK] Seq=19998985 Ack=77997844 Win=2048 Len=0 In this sub-conversation, Zero Window lasted about 600ms, before you re-opened the window at 2048 bytes. This is all very weird anyhow. This one TCP stream started at 23:00:00 UTC and ran fine for 23 minutes (no ZeroWindow condition anywhere in the stream.) Then suddenly at: 55440 2013-01-26 23:21:31.858980 137.226.34.42 194.116.84.30 TCP 60 57799 > rsync [ACK] Seq=19998985 Ack=61744153 Win=11606528 Len=0 This was the last time you ever increased the TCP receive window any significant amount. We transmitted fine for another minute or so. # This is the first ZeroWindow I could find. 60127 2013-01-26 23:22:42.797932 194.116.84.30 137.226.34.42 RSYNC 566 [TCP Window Full] Data However after this your side would only reopen the window for some tiny number of bytes (usually < 8kb.) Is there something on your side that would cause rsync to be unable to receive data? Note that both my wireshark and my TCP is a bit rusty; most of this was digging through RFCs. -A
Sorry that was unclear. When I said 'ZeroWindow happens all the time' I meant that it happened very often after 23:21 in this stream.
I see that, after several minutes of unproblematic transfer, we decrease the window size until it reaches 0. After that a few tiny windows are used, but nothing really happens. According to "atop" the Gentoo LVM is at 100% usage, although only (constantly!) 0.11 MByte/sec are written and IO is (constant!) at 18ms-19ms. I run "sync", which did not change anything. I will investigate this further (in the next few days). Maybe our policy of doing a lot in RAM is the problem (we have 92 GByte of RAM in this machine), it could be that after roughly 20 minutes the data is written to disk and the connection needs to wait until this is done?
It seems the issue is resolved? At least I did not get any error mail after my last comment.
*bounce* *wave* What's the status? I don't see progress and we also get way less traffic for Gentoo.
Yes it appears that the problem has gone away. infra: can you add the mirror back to DNS rotation?
(In reply to Chí-Thanh Christopher Nguyễn from comment #18) > Yes it appears that the problem has gone away. > infra: can you add the mirror back to DNS rotation? done.