Greetings,

I believe the PORTAGE_RSYNC_RETRIES setting in portage has been broken for the non-default case since the default behavior changed from this:

    PORTAGE_RSYNC_RETRIES = [NUMBER]
        The number of times rsync should retry on failed connections before giving up. Defaults to 3.

to this:

    PORTAGE_RSYNC_RETRIES = [NUMBER]
        The number of times rsync should retry on failed connections before giving up. If set to a negative number, then retry until all possible addresses are exhausted. Defaults to -1.

Scenario(s):

If no SYNC or PORTAGE_RSYNC_RETRIES setting is specified in /etc/make.conf, the default SYNC is rsync://rsync.gentoo.org, which is currently a multi-IP DNS entry of three IPv4 hosts and one IPv6 host. If we block connections[1] to these hosts, we can test PORTAGE_RSYNC_RETRIES settings.

With neither variable set, the default PORTAGE_RSYNC_RETRIES setting of -1 works as described.

If we set PORTAGE_RSYNC_RETRIES to less than the number of hosts in a DNS entry, the setting works as described.

If we set PORTAGE_RSYNC_RETRIES higher than the number of hosts in the DNS entry (8, 15, 45, 100, whatever), portage does not retry this number of times; it reverts to the -1 behavior. It shows the same behavior if we run the same test against rsync.us.gentoo.org, which has many more hosts in its DNS entry.

I suspect this hasn't been noticed until now because our excellent Gentoo infra team never has all hosts in a DNS entry down at the same time. ;)

However, this becomes particularly problematic when the SYNC setting is an IP or a DNS entry with just one host. This is often the case for people running local networks that re-mirror the Gentoo tree to local (or remote vpn ;) clients, being good Gentoo netizens, etc. You will get 0 retries, ever, no matter what PORTAGE_RSYNC_RETRIES is set to. This causes automation to fail when the first connection does not succeed (for whatever reason), despite retries being set to a high number.
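To make the gap concrete, here is a small Python model of the attempt counts described above. It is only a sketch of the reported symptoms, not portage code: the function names are hypothetical, and it assumes that "N retries" means one initial attempt plus N further attempts.

```python
def expected_attempts(retries: int, n_addresses: int) -> int:
    """Documented behavior (assumed: one initial attempt plus `retries` retries).

    A negative value means "try every resolved address once".
    """
    if retries < 0:
        return n_addresses
    return 1 + retries


def observed_attempts(retries: int, n_addresses: int) -> int:
    """Behavior as described in this report: the attempt count never
    exceeds the number of resolved addresses, whatever the setting."""
    if retries < 0:
        return n_addresses
    return min(1 + retries, n_addresses)


if __name__ == "__main__":
    # (retries, number of addresses behind the SYNC host)
    for retries, n in [(-1, 4), (2, 4), (8, 4), (100, 1)]:
        print(retries, n, expected_attempts(retries, n), observed_attempts(retries, n))
```

The last case, (100, 1), is the single-host mirror scenario: the model gives one observed attempt where 101 would be expected.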
I have attached a patch which checks the retries setting before returning the host-list-exhaustion error code. It also adds the code needed to cycle through the host list in order when the PORTAGE_RSYNC_RETRIES setting exceeds the number of rsync hosts in a DNS entry.

The patch applies cleanly to portage 2.1.11.31. It applies to current portage 2.1.11.x with 75 lines of offset, no fuzz. I didn't test the latest portage-2.1.11.x, but I looked at the source and didn't see any change which would fix this bug.

Most of my emerge --info for this instance is not applicable to Gentoo or this bug. Here is the part of it that may have some relevance, though:

Portage 2.1.11.31 (gcc-4.5.4, glibc-2.15-r3, 3.4.X-amazon-xen x86_64)
=================================================================
System uname: Linux-3.4.X-amazon-xen-x86_64-Intel-R-_Xeon-R-_CPU_E5645_@_2.40GHz-with-gentoo-2.1
ld GNU ld (GNU Binutils) 2.22
app-shells/bash:          4.2_p37
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.3-r2, 3.2.3
dev-util/cmake:           2.8.9
dev-util/pkgconfig:       0.27.1
sys-apps/baselayout:      2.1-r1
sys-apps/openrc:          0.11.6-r1::overlay
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.11.6
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.5.4
sys-devel/gcc-config:     1.7.3
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.6 (virtual/os-headers)
sys-libs/glibc:           2.15-r3

[1] Something like this is helpful here (for IPv4; do similar for IPv6):

    for i in $(host rsync.us.gentoo.org | grep "has address" | awk '{ print $4 }'); do iptables -A OUTPUT -p tcp -d "$i" -j DROP; done

Reproducible: Always
Created attachment 367454 [details, diff] Fix PORTAGE_RSYNC_RETRIES functionality
I hope I put this ticket in the right place; please feel free to move it if I didn't.
Created attachment 428380 [details, diff] portage-2.2.20.1-fix-rsync-retries.patch

It's been a while, but an updated patch is now available for this issue. I'm providing it on behalf of Gordon, since his retired status prevents him from updating the bug (his email address is no longer valid for this). Let me know if there are any questions or alterations needed.
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=c7c04b4f4f4f5e6f18ad76366b535dcbad72989e

commit c7c04b4f4f4f5e6f18ad76366b535dcbad72989e
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2017-10-27 19:40:01 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2017-10-27 19:42:44 +0000

    RsyncSync: fix PORTAGE_RSYNC_RETRIES (bug 497596)

    When PORTAGE_RSYNC_RETRIES is set to a positive integer,
    recycle the uris until the specified number of retries
    has been exhausted.

    Bug: https://bugs.gentoo.org/497596

 pym/portage/sync/modules/rsync/rsync.py | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
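For readers who want the idea without opening the diff, the "recycle the uris" behavior can be sketched roughly as follows. This is an illustrative stand-alone loop, not the code from rsync.py: `sync_with_retries` and `try_sync` are hypothetical names, and it assumes that a retry budget of N allows one initial attempt plus N retries.

```python
from typing import Callable, List


def sync_with_retries(
    addresses: List[str],
    max_retries: int,
    try_sync: Callable[[str], bool],
) -> List[str]:
    """Return the addresses attempted, in order, until success or give-up.

    max_retries < 0  : try each address once, then give up.
    max_retries >= 0 : one initial attempt plus up to max_retries retries,
                       refilling the address list whenever it runs out.
    """
    if not addresses:
        return []
    pending = list(addresses)
    attempted: List[str] = []
    retries = 0
    while True:
        if not pending:
            if max_retries < 0:
                break  # every address has been tried once
            pending = list(addresses)  # recycle the list, keeping its order
        address = pending.pop(0)
        attempted.append(address)
        if try_sync(address):
            break  # sync succeeded
        if 0 <= max_retries <= retries:
            break  # retry budget used up
        retries += 1
    return attempted


if __name__ == "__main__":
    # Single-host mirror that always fails, with a budget of 3 retries.
    print(sync_with_retries(["192.0.2.1"], 3, lambda _addr: False))
```

With a single address and a budget of 3, the sketch makes four attempts against the same host, which is the case the original report says never retried at all.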