Summary: | user_agent in /etc/wgetrc might break mirror://sourceforge/ for ebuilds | ||
---|---|---|---|
Product: | Portage Development | Reporter: | Jaak Ristioja <jaak> |
Component: | Documentation | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | graaff, Storklerk |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | | Runtime testing required: | --- |
Bug Depends on: | |||
Bug Blocks: | 377365 | ||
Attachments: | Example fetch log | ||
Description
Jaak Ristioja
2010-07-06 23:04:19 UTC
Created attachment 237807
Example fetch log
Here's the example output of a failed fetch from a mirror://sourceforge/ URI.
app-accessibility/eflite was just an example! Every other ebuild with mirror://sourceforge/ also failed. The downloaded file is always some SourceForge download page, i.e. HTML code.

Anyway, it appears to work now. I think it might have been a temporary issue with the SourceForge mirrors rather than Gentoo.

Ah, sorry, I was mistaken. All mirror://sourceforge/ fetches still fail. And as you can see from the log I attached previously, the downloaded file is always text/html. Maybe SourceForge changed the setup of its mirrors or something?

This blocks development of ebuilds for packages which can only be downloaded from the SourceForge mirrors.

Bug wranglers, can you please redirect this to the appropriate assignee?
Thanks,
William

Works for me

>>> Downloading 'http://internode.dl.sourceforge.net/sourceforge/eflite/eflite-0.4.1.tar.gz'
--2010-07-15 14:15:36-- http://internode.dl.sourceforge.net/sourceforge/eflite/eflite-0.4.1.tar.gz
Resolving internode.dl.sourceforge.net... 150.101.135.12
Connecting to internode.dl.sourceforge.net|150.101.135.12|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://downloads.sourceforge.net/sourceforge/eflite/eflite-0.4.1.tar.gz?download&failedmirror=internode.dl.sourceforge.net [following]
--2010-07-15 14:15:37-- http://downloads.sourceforge.net/sourceforge/eflite/eflite-0.4.1.tar.gz?download&failedmirror=internode.dl.sourceforge.net
Resolving downloads.sourceforge.net... 216.34.181.59
Connecting to downloads.sourceforge.net|216.34.181.59|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://downloads.sourceforge.net/project/eflite/eflite/0.4.1/eflite-0.4.1.tar.gz [following]
--2010-07-15 14:15:37-- http://downloads.sourceforge.net/project/eflite/eflite/0.4.1/eflite-0.4.1.tar.gz
Reusing existing connection to downloads.sourceforge.net:80.
HTTP request sent, awaiting response... 302 Found
Location: http://freefr.dl.sourceforge.net/project/eflite/eflite/0.4.1/eflite-0.4.1.tar.gz [following]
--2010-07-15 14:15:37-- http://freefr.dl.sourceforge.net/project/eflite/eflite/0.4.1/eflite-0.4.1.tar.gz
Resolving freefr.dl.sourceforge.net... 88.191.250.132
Connecting to freefr.dl.sourceforge.net|88.191.250.132|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 63192 (62K) [application/x-gzip]
Saving to: `/usr/portage/distfiles/eflite-0.4.1.tar.gz'
100%[=====================================================================================================================>] 63,192 246K/s in 0.3s
2010-07-15 14:15:38 (246 KB/s) - `/usr/portage/distfiles/eflite-0.4.1.tar.gz' saved [63192/63192]
 * eflite-0.4.1.tar.gz RMD160 SHA1 SHA256 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ...

Ok, comparing your fetch log to the attached one, I've tracked this down to SourceForge redirecting us depending on the HTTP User-Agent header. Some users (e.g. weird paranoid geeks like me) might have a different user_agent set in /etc/wgetrc. This is what appears to break the SourceForge mirroring.
Hence, I suggest that FETCHCOMMAND for emerge should contain a wget flag like "-U Wget" to force the User-Agent. Even user agent strings like "GentooEmerge" appear to work, whereas "Mozilla/5.0" and such do not.
If that is not an option, I think there should at least be a comment in /etc/wgetrc warning the administrator that changing the user_agent option might break some mirroring for Portage.

wget isn't broken

(In reply to comment #7)
> Hence, I suggest that FETCHCOMMAND for emerge should contain a wget flag like
> "-U Wget" to force the User-Agent. Even user agent strings like "GentooEmerge"
> appear to work, whereas "Mozilla/5.0" and such do not.
I don't think that we should force this on the user; after all, there must be a reason why the User-Agent was changed in the first place. If it's acceptable for you to have the User-Agent overwritten, set your own FETCHCOMMAND in make.conf (man make.conf).

I wonder if there is an appropriate place to add a warning for this.

Or add a mention to the default make.conf, but I doubt people will see it ...

The real cause is advertising crap from SourceForge. See:
http://sourceforge.net/apps/trac/sourceforge/ticket/13094#comment:1
If SF thinks a browser (or anything that might look like one) wants to download a file, they deliver HTML ad crap and JavaScript to trigger the download. So if you change your UA in /etc/wgetrc (maybe to get around a stupid restriction that some other site has on "wget" or unknown UAs) like Jaak, or have a proxy that rewrites the UA of all HTTP requests to the same value (like me), then SF will happily deliver you some HTML with a 200 OK that will then fail the Portage size/checksum tests.
The only real solution I'm seeing here is to change the SourceForge mirror URLs in Portage, as the current ones no longer seem correct.
On trying to download kipi-plugins-1.3.0.tar.bz2 I get the following redirects:
Original request: http://freefr.dl.sourceforge.net/sourceforge/kipi/kipi-plugins-1.3.0.tar.bz2
-> Redirected to: http://downloads.sourceforge.net/sourceforge/kipi/kipi-plugins-1.3.0.tar.bz2?download&failedmirror=freefr.dl.sourceforge.net
-> Redirected to: http://sourceforge.net/project/downloading.php?groupname=kipi&filename=kipi-plugins-1.3.0.tar.bz2&use_mirror=puzzle
-> Redirected to: http://sourceforge.net/projects/kipi/files/kipi-plugins/1.3.0/kipi-plugins-1.3.0.tar.bz2/download
-> That returns the ad page, which has the following direct link embedded: http://downloads.sourceforge.net/project/kipi/kipi-plugins/1.3.0/kipi-plugins-1.3.0.tar.bz2?r=&ts=1284190127&use_mirror=netcologne
On each try from Portage the pattern repeats: the first mirror fails, the use_mirror URL gets redirected to the new ad-infested download URL, and then the download fails because wget only gets the ad.
So shouldn't the SF mirror URLs
http://{mirror}.sourceforge.net/sourceforge/{project}/{file}
be replaced by something like
http://downloads.sourceforge.net/project/{project}/{file}?r=&ts=1284190127&use_mirror={mirror}
(ts looks like the current Unix timestamp)

I'm seeing this behaviour now as well, and I don't have an explicit user agent in my /etc/wgetrc.

Is this still an issue?

Yes.
-rw-r--r-- 1 portage portage 25656 Jul 21 14:04 sleuthkit-4.1.0.tar.gz._checksum_failure_.xihbaz
-rw-r--r-- 1 portage portage 25777 Jul 11 08:39 espeak-1.47.11-source.zip._checksum_failure_.u7y5_l
These two examples still linger in my distfiles directory as the latest failures. For each package there will be tens of files, each around 25k in size and containing only an HTML page.
My "fix" is either to download these packages manually and move them into distfiles, or to just wait until the next day when the Gentoo mirrors have caught up. But it's not a really good fix, because this will abort the current emerge process.
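
For anyone wanting to confirm that the leftover files in distfiles really are HTML pages rather than archives, here is a rough sketch. The function name find_html_distfiles is made up for illustration, and the default directory is simply the distfiles path that appears in the fetch log earlier in this bug. It only looks at the first 512 bytes of each file, so treat a match as a hint, not proof.

```shell
#!/bin/sh
# Hypothetical helper (not part of Portage): print the files in a distfiles
# directory whose first bytes look like an HTML page instead of an archive.
# The default path is an assumption taken from the fetch log in this bug.
find_html_distfiles() {
    dir="${1:-/usr/portage/distfiles}"
    for f in "$dir"/*; do
        # Skip directories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        # An HTML page announces itself early; 512 bytes is an arbitrary cutoff.
        if head -c 512 "$f" | LC_ALL=C grep -aqiE '<!doctype html|<html'; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `find_html_distfiles /usr/portage/distfiles` would list the suspect files one per line; nothing is deleted.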
--keep-going mostly works.

I don't think we should fix this in Portage (and certainly not in wget). Ebuilds should choose better links instead.

I've added a note to the /etc/wgetrc file, but the answer is still "if you change user_agent in /etc/wgetrc, you need to also set a custom FETCHCOMMAND to reset that back to something else".
http://sources.gentoo.org/net-misc/wget/files/wget-1.14-wgetrc.patch?rev=1.1
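
A minimal sketch of what such a custom FETCHCOMMAND could look like in /etc/make.conf. The wget options other than --user-agent are modeled on Portage's stock make.globals entry and are an assumption here, so compare against your own make.globals before copying. The "Wget" string is the one reported to work earlier in this bug.

```shell
# /etc/make.conf fragment. Assumption: the retry/timeout options mirror the
# stock make.globals defaults; only --user-agent is the actual change.
# A command-line --user-agent takes precedence over user_agent in /etc/wgetrc,
# so only Portage fetches are affected.
FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp --user-agent=Wget -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""
# Same command with -c so that partial downloads are resumed.
RESUMECOMMAND="wget -c -t 3 -T 60 --passive-ftp --user-agent=Wget -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""
```

The escaped \${DISTDIR}, \${FILE} and \${URI} placeholders are left unexpanded on purpose so that Portage can substitute them at fetch time.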