Bug 99628 - mirrorselect 1.1.3: Deep mode returns too few servers.
Bug#: 99628 Product:  Gentoo Linux Version: unspecified Platform: All
OS/Version: All Status: CLOSED Severity: normal Priority: P2
Resolution: FIXED Assigned To: tercel@gentoo.org Reported By: 1i5t5.duncan@cox.net
Component: Applications
URL: 
Summary: mirrorselect 1.1.3: Deep mode returns too few servers.
Keywords:  
Status Whiteboard: 
Opened: 2005-07-20 00:52 0000
Description:
Mirrorselect's deep mode behaves as if it will return only one server: it
immediately sets max-time to the first server's time, so no server slower
than that is ever returned.  With -s20, I get only 1-5 servers back, because
those are the only ones under the artificially limited time constraint.
  
Instead, mirrorselect -D -s20 should take the first 20 successful returns,
setting max-time to the highest time among those 20, then replace the slowest
server in the list (and reset max-time) each time it finds a faster one.
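
The replacement scheme described above could be sketched in Python roughly as
follows.  This is only an illustration, not mirrorselect's actual code:
select_fastest and the probe(host, timeout) callable are hypothetical names,
and probe is assumed to return the download time in seconds, or None if the
download failed or timed out.

```python
import heapq


def select_fastest(probe, hosts, wanted, default_timeout=10.0):
    """Return up to `wanted` (time, host) pairs, fastest first.

    `probe(host, timeout)` is a hypothetical callable: it returns the
    download time in seconds, or None on failure or timeout.
    """
    # Max-heap emulated with negated times, so kept[0] is the slowest host kept.
    kept = []
    timeout = default_timeout
    for host in hosts:
        elapsed = probe(host, timeout)
        if elapsed is None:
            continue
        if len(kept) < wanted:
            heapq.heappush(kept, (-elapsed, host))
        elif elapsed < -kept[0][0]:
            # Faster than the slowest host kept so far: swap it in.
            heapq.heapreplace(kept, (-elapsed, host))
        # Tighten the timeout only once the list is full.
        if len(kept) == wanted:
            timeout = -kept[0][0]
    return sorted((-neg, host) for neg, host in kept)
```

The point of the sketch is the last condition: the timeout stays at its
default until N hosts have been collected, and only then tracks the slowest
host still in the list.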
  
Note that -D is the only mode that works for those of us behind NAPT routers
that filter the ICMP(??) replies netselect needs in order to function.

Reproducible: Always
Steps to Reproduce:
mirrorselect -oD -s20 
  
Actual Results:  
mirrorselect returns too few servers, because it artificially limits the 
max-time based on the first return, not the first N returns, where N is the 
number of servers requested. 

Expected Results:  
mirrorselect should return the fastest N servers from the list, provided that N 
servers return /distfiles/mirrorselect-test, of course. 

I doubt emerge info is relevant, but I'll attach it, just in case.

------- Comment #1 From Duncan 2005-07-20 00:55:29 0000 -------
Created an attachment (id=63863) [details]
emerge info output

------- Comment #2 From Duncan 2005-07-20 01:01:37 0000 -------
Argh!  Mirrorselect version 1.1.3.  This new Bugzilla version has a useragent
field that I filled in with that info, but it doesn't seem to appear in the
final report.  Changing the summary to reflect the version.

------- Comment #3 From Colin Kingsley (RETIRED) 2005-07-20 13:10:07 0000 -------
Could you please run mirrorselect with the -d flag (in addition to whatever
other flags you were using), redirect the output into a file, and post that
file on this bug?

I'd like to analyze exactly what is going wrong with my algorithm. This issue
does not happen for me, nor should it for you, but I've only been able to test
on a few very similar internet connections, so I could be wrong.

Once again, thanks for the testing.

------- Comment #4 From Duncan 2005-07-21 01:41:54 0000 -------
(In reply to comment #3) 
> please run mirrorselect with the -d flag [and others used] and redirect the 
> output into a file, and post that file on this bug? 
 
Absolutely!  I probably should have done that before (I ran the -d flag myself,
but didn't think about posting the output)!  Only... one of my ISP's routers is
behaving very strangely right now, and I don't trust it enough to try updating
to the new version with the make.conf fix at present, so I'll try tomorrow.
(With connectivity what it is right now, some sites working, some failing, some
failing intermittently, I couldn't get an accurate run anyway... =8^( )
 
FWIW tho, I had similar problems pre-1.0, but it was a different issue, IIRC.  
Back then, the test files were apparently not on most of the servers, or at 
least when I checked them manually by browser they were missing, and changing 
the test file to something else cured the issue.  This time, the files are 
showing up when I check manually, and the debug info seems to indicate they are 
being found as well, only the thing is rejecting the downloads too early, as 
described. 
 
One other possibility... I'm running python-2.4.1-r1 (~arch), which just came
out of testing not long ago.  Maybe mirrorselect is triggering some issue
there, perhaps only on ~amd64, a corner case nobody's caught yet?
I don't think it likely, but it's possible.
 
... I just wish that stupid ISP router would get back to full working order so 
I could test this thing and post the log! 

------- Comment #5 From Duncan 2005-07-25 03:24:12 0000 -------
Created an attachment (id=64240) [details]
mirrorselect debug output

Well, router and follow-on cable modem issues fixed, so I can finally get a
decent debug test, full debug output attached.  The highlights are below.

Mirrorselect version (updated):
1.1.4

Command used:
mirrorselect -odD -s20 -t10

Partial output from first two mirrors:

[1 of 156]
_deeptime(): timeout is 10
deeptime(): 0.775030851364 seconds
_list_add(): added host [snip] with a time of 0.775030851364
_list_add(): new max time is 0.775030851364 seconds, and now len(host_dict)= 1

[2 of 156]
_deeptime(): timeout is 0.775030851364
deeptime(): download timed out. killing wget.
deeptime(): wget returned -1, adding 10 to delta

Final number of hosts over number requested:
8/20

See what I mean?  It sets the timeout to the return time of the first host, so
nothing higher than that gets added to the list.  As it happens, there were
seven additional mirrors faster than the first one tested, so I got a total of
eight in the output list, but I requested twenty.  Often, I only get four.
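
For illustration, here is a toy Python model of the behaviour the log
suggests.  It is inferred from the debug output only, not taken from the
mirrorselect source; buggy_select is a hypothetical name and the mirror names
and timings are made up (only 0.775 echoes the log).

```python
def buggy_select(timed_hosts, wanted, default_timeout=10.0):
    """Toy model: the timeout is tightened after the very first success."""
    kept = []
    timeout = default_timeout
    for host, seconds in timed_hosts:
        if seconds > timeout:
            continue  # models "download timed out. killing wget."
        kept.append((seconds, host))
        kept.sort()
        del kept[wanted:]
        # The suspected bug: max time is reset even while len(kept) < wanted.
        timeout = kept[-1][0]
    return kept


mirrors = [("m1", 0.775), ("m2", 1.2), ("m3", 0.5), ("m4", 0.9), ("m5", 0.3)]
# Prints 3, although 4 were requested and all 5 respond within 10 seconds.
print(len(buggy_select(mirrors, wanted=4)))
```

Only hosts faster than the first one tested can ever be added, which matches
the 8/20 result above.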

------- Comment #6 From Colin Kingsley (RETIRED) 2005-07-25 18:42:34 0000 -------
Fixed in next release (1.1.5)

------- Comment #7 From Duncan 2005-07-26 12:51:13 0000 -------
Confirmed fixed in mirrorselect-1.1.5.  Setting bug full-closed.