Bug 327229 - user_agent in /etc/wgetrc might break mirror://sourceforge/ for ebuilds
Summary: user_agent in /etc/wgetrc might break mirror://sourceforge/ for ebuilds
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Documentation
Hardware: All Linux
Importance: High normal
Assignee: Portage team
Depends on:
Blocks: 377365
Reported: 2010-07-06 23:04 UTC by Jaak Ristioja
Modified: 2013-12-23 03:26 UTC

Example fetch log (sourceforge_mirror_fetch.log,12.30 KB, text/plain)
2010-07-06 23:08 UTC, Jaak Ristioja

Description Jaak Ristioja 2010-07-06 23:04:19 UTC
Things like
appear to be broken in ebuilds. The installed Portage is the latest stable version.
Comment 1 Jaak Ristioja 2010-07-06 23:08:00 UTC
Created attachment 237807
Example fetch log

Here's the example output of a failed fetch from a mirror://sourceforge/.
Comment 2 Jaak Ristioja 2010-07-07 08:45:13 UTC
app-accessibility/eflite was just an example! Every other ebuild with mirror://sourceforge/ also failed. The downloaded file is always some SourceForge download page, i.e. HTML code.

Anyway, it appears to work now. I think it might have been a temporary issue with the sourceforge mirrors rather than Gentoo.
Comment 3 Jaak Ristioja 2010-07-07 08:56:21 UTC
Ah, sorry, I was mistaken. All mirror://sourceforge/ fetches still fail. And as you can see from the log I attached previously, the downloaded file is always text/html. Maybe SourceForge changed the setup of its mirrors or something?
Comment 4 Jaak Ristioja 2010-07-12 15:33:51 UTC
This blocks development of ebuilds for packages which can only be downloaded from the sourceforge mirrors.
Comment 5 William Hubbs gentoo-dev 2010-07-14 16:44:33 UTC
Bug wranglers,

can you please redirect this to the appropriate assignee?


Comment 6 Kacper Kowalik (Xarthisius) (RETIRED) gentoo-dev 2010-07-15 12:18:33 UTC
Works for me
>>> Downloading ''
--2010-07-15 14:15:36--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: [following]
--2010-07-15 14:15:37--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: [following]
--2010-07-15 14:15:37--
Reusing existing connection to
HTTP request sent, awaiting response... 302 Found
Location: [following]
--2010-07-15 14:15:37--
Connecting to||:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 63192 (62K) [application/x-gzip]
Saving to: `/usr/portage/distfiles/eflite-0.4.1.tar.gz'

100%[=====================================================================================================================>] 63,192       246K/s   in 0.3s    

2010-07-15 14:15:38 (246 KB/s) - `/usr/portage/distfiles/eflite-0.4.1.tar.gz' saved [63192/63192]

 * eflite-0.4.1.tar.gz RMD160 SHA1 SHA256 size ;-) ...                                                                                        [ ok ]
 * checking ebuild checksums ;-) ...                                                                                                                    [ ok ]
 * checking auxfile checksums ;-) ...                                                                                                                   [ ok ]
 * checking miscfile checksums ;-) ...   
Comment 7 Jaak Ristioja 2010-07-15 20:22:35 UTC
Ok, comparing your fetch log to the attached one, I've tracked this down to SourceForge redirecting us depending on the HTTP User-Agent header. Some users (e.g. weird paranoid geeks like me) might have a different user_agent set in /etc/wgetrc. This is what appears to break the SourceForge mirroring.

Hence, I suggest that FETCHCOMMAND for emerge should contain a wget flag like "-U Wget" to force the User-Agent. Even user agent strings like "GentooEmerge" appear to work, whereas "Mozilla/5.0" and such do not.

If that is not an option, I think there should at least be a comment in /etc/wgetrc warning the administrator that changing the user_agent option might break some mirroring for portage.
Comment 8 SpanKY gentoo-dev 2010-07-18 02:24:52 UTC
wget isn't broken
Comment 9 Sebastian Luther (few) 2010-07-19 05:10:50 UTC
(In reply to comment #7)
> Hence, I suggest that FETCHCOMMAND for emerge should contain a wget flag like
> "-U Wget" to force the User-Agent. Even user agent strings like "GentooEmerge"
> appear to work, whereas "Mozilla/5.0" and such do not.

I don't think that we should force this on the user, after all, there must be a reason why the User-Agent was changed in the first place.

If it's acceptable for you to get the User-Agent overwritten, set your own FETCHCOMMAND in make.conf (man make.conf).

I wonder if there is an appropriate place to add a warning for this.
Comment 10 SpanKY gentoo-dev 2010-07-26 19:59:15 UTC
Or add a mention to the default make.conf. But I doubt people will see it ...
Comment 11 Torsten Kaiser 2010-09-11 08:01:30 UTC
The real cause is advertising crap from sourceforge.


If SF thinks a browser (or anything that might look like one) wants to download a file, they deliver HTML ad crap and JavaScript to trigger the download.

So if you change your UA in /etc/wgetrc like Jaak (maybe to get around a stupid restriction that some other site places on "wget" or unknown UAs), or have a proxy that rewrites the UA of all HTTP requests to the same value (like me), then SF will happily deliver you some HTML with a 200 OK, which will then fail the Portage size/checksum tests.

The only real solution I'm seeing here is to change the SourceForge mirror URLs in Portage, as the current ones no longer seem correct.

On trying to download kipi-plugins-1.3.0.tar.bz2 I get the following redirects:
Original request:
-> Redirected to:
-> Redirected to:
-> Redirected to:
-> That returns the ad page which has the following direct link embedded:

On each try from Portage the pattern repeats: the first mirror fails, the use_mirror URL gets redirected to the new ad-infested download URL, and then the download fails because wget only gets the ad.

So shouldn't the or SF-mirror-URLs http://{mirror}{project}/{file} be replaced by something like{project}/{file}?r=&ts=1284190127&use_mirror={mirror}

(ts looks like the current unix timestamp)
Comment 12 Hans de Graaff gentoo-dev 2010-11-24 12:25:48 UTC
I'm seeing this behaviour now as well, and I don't have an explicit user agent in my /etc/wgetrc.
Comment 13 Zac Medico gentoo-dev 2013-08-12 16:35:43 UTC
Is this still an issue?
Comment 14 Torsten Kaiser 2013-08-12 17:51:08 UTC

-rw-r--r-- 1 portage portage      25656 Jul 21 14:04 sleuthkit-4.1.0.tar.gz._checksum_failure_.xihbaz
-rw-r--r-- 1 portage portage      25777 Jul 11 08:39

These two examples still linger in my distfiles directory as the latest failures.
For each package there will be tens of files, each around 25k in size and containing only an HTML page.

My "fix" is either to download these packages manually and move them into distfiles, or just to wait until the next day when the Gentoo mirrors have caught up. But it's not a really good fix, because the failure aborts the current emerge process. --keep-going mostly works.
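
For anyone wanting to check whether their own distfiles directory holds the same kind of leftovers, here is a rough sketch. It assumes the failed downloads keep the `._checksum_failure_.` naming seen above and that an HTML ad page can be recognised by a `<!DOCTYPE html` or `<html` marker in its first kilobyte. The function name `list_html_failures` is made up for this example, and the default path is only the distfiles location that appears in the fetch log in comment 6.

```shell
# Print every *._checksum_failure_.* file in a directory whose first
# kilobyte looks like an HTML page rather than an archive.
# Hypothetical helper, not part of Portage.
list_html_failures() {
    dir="${1:-/usr/portage/distfiles}"
    for f in "$dir"/*._checksum_failure_.*; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -f "$f" ] || continue
        if head -c 1024 "$f" | grep -qiE '<!DOCTYPE html|<html'; then
            printf '%s\n' "$f"
        fi
    done
}
```

Calling `list_html_failures` with no argument scans the default directory; pass another path as the first argument to scan elsewhere. The function only lists files and leaves deleting them to the administrator.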
Comment 15 Alexander Berntsen (RETIRED) gentoo-dev 2013-09-19 09:55:14 UTC
I don't think we should fix this in Portage (and certainly not in wget). Ebuilds should choose better links instead.
Comment 16 SpanKY gentoo-dev 2013-12-23 03:26:30 UTC
I've added a note to the /etc/wgetrc file. But the answer is still "if you change user_agent in /etc/wgetrc, you need to also set a custom FETCHCOMMAND to reset that back to something else".
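
A minimal sketch of what such an override in make.conf might look like, using the "Wget" agent string that comment 7 reported as working. The other wget options (`-t 3 -T 60 --passive-ftp`) are an assumption modelled on Portage's stock fetch commands; compare them against the FETCHCOMMAND and RESUMECOMMAND defaults shipped in your own make.globals before copying anything.

```shell
# make.conf fragment (sketch): pin the User-Agent on the command line so that
# a user_agent setting in /etc/wgetrc no longer affects Portage fetches.
# The escaped \${...} placeholders are left literal for Portage to expand.
FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp --user-agent=Wget -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""
RESUMECOMMAND="wget -c -t 3 -T 60 --passive-ftp --user-agent=Wget -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""
```

Options given on the wget command line take precedence over wgetrc, so the custom agent stays in effect for every other use of wget on the system.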