Never-ending story about broken mirrors and portage misunderstanding them...:

When a fetch attempt hits a broken mirror which returns an error page, under some circumstances portage "continues" the download at another mirror, appending the correct content to the HTML error message. From my attempts (http://ftp.tu-clausthal.de is a wonderful broken mirror):

<...cut...>
>>> Downloading http://ftp.tu-clausthal.de/pub/linux/gentoo/distfiles/tetex-texmf-2.0.tar.gz
--03:04:10--  http://ftp.tu-clausthal.de/pub/linux/gentoo/distfiles/tetex-texmf-2.0.tar.gz
           => `/usr/portage/distfiles/tetex-texmf-2.0.tar.gz'
Resolving ftp.tu-clausthal.de... done.
Connecting to ftp.tu-clausthal.de[139.174.2.36]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=> ] 835          815.43K/s

03:04:10 (815.43 KB/s) - `/usr/portage/distfiles/tetex-texmf-2.0.tar.gz' saved [835]

>>> Resuming download...
>>> Downloading http://gentoo.oregonstate.edu//distfiles/tetex-texmf-2.0.tar.gz
--03:04:10--  http://gentoo.oregonstate.edu//distfiles/tetex-texmf-2.0.tar.gz
           => `/usr/portage/distfiles/tetex-texmf-2.0.tar.gz'
Resolving gentoo.oregonstate.edu... done.
Connecting to gentoo.oregonstate.edu[128.193.0.3]:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
<...cut...>

# file /usr/portage/distfiles/tetex-texmf-2.0.tar.gz
/usr/portage/distfiles/tetex-texmf-2.0.tar.gz: HTML document text
# strings /usr/portage/distfiles/tetex-texmf-2.0.tar.gz|head
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML><HEAD><LINK REV="made" HREF="mailto:ftpadm@rz.tu-clausthal.de">
<TITLE>File too large (please use FTP)</TITLE></HEAD><BODY BGCOLOR="#C0C0C0" BACKGROUND="/icons/backgnd.gif">
<ADDRESS><A HREF="http://www.rz.tu-clausthal.de/">Rechenzentrum</A> der <A HREF="http://www.tu-clausthal.de/">TU Clausthal</A></ADDRESS>
<IMG SRC="/icons/rzHR.gif" ALT="" WIDTH="576" HEIGHT="23">
<H1>File too large (please use FTP)</H1>
<P>The requested file is very large, therefore it is <STRONG>not</STRONG> available via HTTP. Please use FTP instead.
<HR><ADDRESS>Generated by <A HREF="http://www.apache.org/">Apache/1.3.26 (Unix)</A> at <A HREF="/">ftp.tu-clausthal.de</A>, 20,
Better error-checking on content is what you mean?
You can imagine that it's no fun when (as in my experiment) a >50MB file appears to download but always ends up damaged because you have a faulty mirror in the list. Maybe portage should check whether the downloaded document's MIME type actually matches the expected one. I suppose that with mirrors like http://ftp.tu-clausthal.de you can never get a full list of the fancy error messages they generate without an error code and without reason. A "wandering checksum" would be too much, I think; it would bloat the ebuilds.
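To make the MIME-type idea concrete, here is a minimal Python sketch. It is not portage's actual fetch code: the function name `content_type_is_suspicious` and the `TEXT_TYPES` set are assumptions made up for illustration. It flags a response only when the file name indicates a binary or compressed file but the server announces a text type, as in the `200 OK ... [text/html]` response in the log above.

```python
import mimetypes

# Hypothetical set: types a broken mirror is likely to send for an error page.
TEXT_TYPES = {"text/html", "text/plain", "application/xhtml+xml"}


def content_type_is_suspicious(filename, content_type):
    """Return True if a text Content-Type was announced for a binary distfile."""
    # Drop parameters such as "; charset=iso-8859-1" and normalise case.
    announced = content_type.split(";", 1)[0].strip().lower()

    # Guess what the file name implies, e.g. ("application/x-tar", "gzip").
    expected, encoding = mimetypes.guess_type(filename)

    # Only judge files we can positively identify as compressed or non-text;
    # unknown extensions are left alone to avoid false positives.
    looks_binary = encoding is not None or (
        expected is not None and not expected.startswith("text/")
    )
    return looks_binary and announced in TEXT_TYPES
```

For example, `content_type_is_suspicious("tetex-texmf-2.0.tar.gz", "text/html")` is true, while the same file served as `application/x-gzip` passes. The check is deliberately one-sided: many mirrors send generic types such as `application/octet-stream` for valid tarballs, so demanding an exact match would reject good downloads.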
This is a known issue with ftp.tu-clausthal.de, but I don't know who the admin is, so I can't get in touch with them. I'm CC'ing klieber in case he knows.
Hm, yes, but you can imagine there can always be a mirror behaving exactly as unpredictably as this one. There can be cases of failure which are similar to this; do you remember e.g. when sourceforge.net changed their download pages from only-ours to please-select-from-these-mirrors? IMHO it's better to fix the software or to find heuristics which make unexpected server behaviour less harmful. E.g. too small && wrong MIME type -> trash.
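The "too small && wrong MIME type -> trash" heuristic could be sketched roughly as follows, again in Python and again not taken from portage. As a stand-in for the MIME type it sniffs the first bytes of the file on disk, since the Content-Type header is no longer available once the file has been saved. The names `looks_like_html` and `discard_if_bogus` and the 4 KiB limit are assumptions chosen for illustration.

```python
import os

# Assumption: anything under 4 KiB is "too small" to be a real distfile.
SMALL_FILE_LIMIT = 4096


def looks_like_html(path):
    """Sniff the start of the file for the usual opening of an HTML page."""
    with open(path, "rb") as handle:
        head = handle.read(512).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


def discard_if_bogus(path, limit=SMALL_FILE_LIMIT):
    """Delete a small HTML file so the next mirror starts from byte 0.

    Returns True if the file was removed, False if it was kept.
    """
    if os.path.getsize(path) < limit and looks_like_html(path):
        os.remove(path)
        return True
    return False
```

Run before the "Resuming download" step, this would throw away the 835-byte error page instead of letting the next mirror append to it. Requiring both conditions keeps the heuristic conservative: a genuinely small distfile or a large partial download is never deleted.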
I've removed their http mirror from the thirdpartymirrors list. If it gets fixed, one way or another, it can be put back. http://ftp.tu-clausthal.de/pub/linux/gentoo/distfiles
Any success with determining the admin of ftp.tu-clausthal.de? I had contact with him about 9 months ago. Would need to check my mail archive, if necessary.
The file's there and I believe that the mirror in question no longer offers http access. Closing bug.