Bug 180009 - timeout 15 when rsyncing initial timestamp.chk is too low in some cases
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Interface (emerge)
Hardware: All Linux
Importance: High enhancement
Assignee: Portage team
Keywords: InVCS
Depends on:
Blocks: 181949
Reported: 2007-05-27 14:02 UTC by Petr Behan
Modified: 2020-12-27 02:58 UTC
CC List: 0 users

See Also:
Package list:
Runtime testing required: ---

temporary fix, patch against portage-, apply in /usr/lib/portage/bin (emerge.patch,429 bytes, patch)
2007-05-27 14:06 UTC, Petr Behan
add a PORTAGE_RSYNC_INITIAL_TIMEOUT config variable (rsync_initial_timeout.patch,897 bytes, patch)
2007-05-27 21:45 UTC, Zac Medico
add a PORTAGE_RSYNC_INITIAL_TIMEOUT config variable (initial_timeout.patch,1.78 KB, patch)
2007-05-28 07:34 UTC, Zac Medico

Description Petr Behan 2007-05-27 14:02:44 UTC
In my setup, one server holds a Portage mirror and the other computers on the network sync with it, a fairly standard setup I guess. The problem is that I use rsyncd's "pre-xfer exec" option on the server to run a script that checks how old the Portage tree on the server is. If it is too old, the script first runs "emerge --sync", which takes a while, and all that time rsyncd waits without sending any data. Sometime during the past year this setup broke, and since then I have had to sync twice: the first sync starts the update on the server but times out after 3 retries. A second sync a few minutes later succeeds (because there is no initial delay if the Portage tree on the server is recent).
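
A "pre-xfer exec" helper of the kind described above might look like the following minimal Python sketch. The actual script is not shown in this report; the timestamp path, the age threshold, and the function names are all assumptions made for illustration.

```python
"""Hypothetical rsyncd "pre-xfer exec" helper: refresh a stale tree before serving it."""
import os
import subprocess
import time

# Assumed values for illustration; neither is taken from the report.
TIMESTAMP_FILE = "/usr/portage/metadata/timestamp.chk"
MAX_AGE_SECONDS = 24 * 60 * 60


def is_stale(path, max_age_seconds, now=None):
    """Return True if path is missing or was modified more than max_age_seconds ago."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable timestamp file is treated as stale.
        return True
    return (now - mtime) > max_age_seconds


def main():
    """Sync the server's tree if it is stale and return an exit status."""
    if is_stale(TIMESTAMP_FILE, MAX_AGE_SECONDS):
        # rsyncd sends nothing to the client until this command returns,
        # which is the delay that trips the 15-second timer on the client.
        return subprocess.run(["emerge", "--sync"]).returncode
    return 0
```

A wrapper script would call sys.exit(main()) and be named in the "pre-xfer exec" line of the [portage] section.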

The reason is that emerge first tries to sync metadata/timestamp.chk and interrupts the child process after 15 seconds, regardless of any timeouts set in make.conf. I temporarily solved this on my system with the patch below, but I'm not sure how to fix it properly. I doubt the timer is there just for fun, so my suggestion is to add a "slow-sync" feature or a --slowsync option to emerge, whichever is easier to add and can be stored in make.conf as a permanent setting. This option/feature would completely disable the timer used to handle unresponsive rsync and let it rely only on --timeout value supplied in PORTAGE_RSYNC_EXTRA_OPTS.

Reproducible: Always

Steps to Reproduce:
1. Add the following line to the [portage] section of /etc/rsyncd.conf on the rsync server:
   pre-xfer exec = sleep 200
2. Run emerge --sync on the client.

Actual Results:  
>>> Starting rsync with rsync://
>>> Checking server timestamp ...
timed out
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(244) [receiver=2.6.9]
>>> Retrying...

[...another 3 retries...]

!!! Rsync has not successfully finished. It is recommended that you keep
!!! trying or that you use the 'emerge-webrsync' option if you are unable
!!! to use rsync due to firewall or other restrictions. This should be a
!!! temporary problem unless complications exist with your network
!!! (and possibly your system's filesystem) configuration.

Expected Results:  
successful --sync
Comment 1 Petr Behan 2007-05-27 14:06:06 UTC
Created attachment 120445
temporary fix, patch against portage-, apply in /usr/lib/portage/bin
Comment 2 Zac Medico gentoo-dev 2007-05-27 14:13:33 UTC
If you set PORTAGE_RSYNC_RETRIES to a higher number, then it will keep retrying.  Is that good enough?  It seems like you have an abnormal setup there.
Comment 3 Petr Behan 2007-05-27 15:58:22 UTC
OK, I didn't think of using PORTAGE_RSYNC_RETRIES before. It fixes the issue, but in my opinion it still belongs in the "workarounds" category (same as my patch). On my setup it would spam about 8 retries, create 8 server threads all waiting until the first one finishes, and break again if someone decides that 10 seconds should be enough for everyone. The setup IS ABNORMAL, in a convenience + bandwidth-saving way. I don't want to sync via cron: sometimes nothing interesting happens for a few weeks and syncing would be a waste, and syncing on the server by hand requires logging in, remembering when the last sync was... This setup was the first that came to mind, and it worked fine until the timer was implemented sometime in the last year (I didn't look into it when the problem started; I just blamed rsync for a while and made do with the two-phase sync).

It's nothing major, with two easy workarounds so far... Maybe I should have flagged it as an enhancement suggestion instead of a minor bug *changed*
Comment 4 Zac Medico gentoo-dev 2007-05-27 21:45:48 UTC
Created attachment 120484

(In reply to comment #0)
> timer used to handle unresponsive rsync and let it rely only on --timeout value

For bug 168646 I wrote a script that probes all of the rsync servers and from that  I found that in some cases the rsync client will hang indefinitely on the initial connection attempt.  It does this regardless of the --timeout option, which only seems to apply after the initial connection has been made.

I'm not sure if the above should be considered a bug in rsync or not.  If they really intend for the --timeout not to apply to the initial connection attempt, perhaps they should add an --initial-timeout option.
Comment 5 Zac Medico gentoo-dev 2007-05-28 07:34:20 UTC
Created attachment 120500

This version is in svn r6651 and it allows PORTAGE_RSYNC_INITIAL_TIMEOUT=0 to disable the timeout.
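
The behaviour discussed in this bug (kill the child process after a fixed number of seconds, with 0 meaning "never") can be illustrated with a minimal Python sketch. This is not the code from r6651: the function name and return convention are hypothetical, and it uses subprocess.run's timeout for brevity. It also limits the whole run of the command, whereas the timer in this bug applies to the initial timestamp check.

```python
import subprocess


def run_with_initial_timeout(cmd, initial_timeout):
    """Run cmd, killing it after initial_timeout seconds; 0 disables the limit.

    Returns the exit status, or None if the command was killed on timeout.
    (Hypothetical helper for illustration only.)
    """
    # subprocess.run treats timeout=None as "wait indefinitely".
    timeout = initial_timeout if initial_timeout > 0 else None
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # The child has already been killed by subprocess.run at this point.
        return None
```

With a limit of 15, a server that stays silent for 200 seconds makes every attempt return None; with 0, the call simply waits until the server responds.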
Comment 6 Petr Behan 2007-05-28 20:48:39 UTC
(In reply to comment #5)

Thanks for your time, that looks exactly like what I was hoping for and works for me.
Comment 7 Zac Medico gentoo-dev 2007-05-31 01:29:06 UTC
This has been released in