At least >=net-misc/wget-1.19.2 has a bad interaction with the default FETCHCOMMAND:

       --compression=type
           Choose the type of compression to be used. Legal values are auto, gzip and none.

           If auto or gzip are specified, Wget asks the server to compress the file using the gzip compression format. If the server compresses the file and responds with the "Content-Encoding" header field set appropriately, the file will be decompressed automatically. This is the default.

           If none is specified, wget will not ask the server to compress the file and will not decompress any server responses.

So when the webserver sets the Content-Encoding header, the fetched file gets uncompressed:

Resolving erlang.org (erlang.org)... 192.121.151.106
Connecting to erlang.org (erlang.org)|192.121.151.106|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Wed, 13 Dec 2017 16:48:14 GMT
  Server: Apache/1.3.42 (Unix)
  Last-Modified: Wed, 13 Dec 2017 08:56:39 GMT
  ETag: "222878-5319a50-5a30eb47"
  Accept-Ranges: bytes
  Content-Length: 87136848
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: application/x-tar
  Content-Encoding: gzip
Length: 87136848 (83M) [application/x-tar]
Saving to: 'otp_src_20.2.tar.gz'

[...]

# stat /usr/portage/distfiles/otp_src_20.2.tar.gz
  File: /usr/portage/distfiles/otp_src_20.2.tar.gz
  Size: 237035520       Blocks: 462960     IO Block: 4096   regular file

So an 87MB file is now 237M on disk. Yeargh!

This creates broken Manifests and causes lots of frustration.

/usr/share/portage/config/make.globals contains:

FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""

which should have --compression=none injected into the options
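One way to confirm that a fetched distfile was silently decompressed is to check whether it still starts with the gzip magic bytes (0x1f 0x8b). The following is only a sketch: is_gzip is a hypothetical helper name, not something Portage provides, and the path in the usage comment is the one from the stat output above.

```shell
# Hypothetical helper: succeeds if the file starts with the gzip magic bytes.
is_gzip() {
    # Read the first two bytes and print them as hex without offsets or spaces.
    magic=$(head -c 2 -- "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage (path taken from the report above):
#   is_gzip /usr/portage/distfiles/otp_src_20.2.tar.gz || echo "not gzip: decompressed on fetch?"
```

Running `gzip -t` on the file should give a similar answer; the magic-byte check just avoids reading the whole archive.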
There was a user on IRC who encountered something like this; it turns out he was using a misconfigured proxy server. Are you sure this is wget's fault?
(In reply to Patrick Lauer from comment #0)
> which should have --compression=none injected into the options

Or --compression=auto so wget will do the right thing whether the file is sent deflated or not. This also sounds like a wget bug considering the docs you pasted say the auto behaviour is the default, which it's clearly not.
(In reply to Mike Gilbert from comment #1)
Never mind. I can reproduce this without any proxy involved.
Tested locally with --compression=none in FETCHCOMMAND; the file size is now as intended.
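For anyone wanting to apply the same workaround locally, a sketch of the override follows. It assumes the setting goes into /etc/portage/make.conf and simply adds --compression=none to the make.globals default quoted in comment #0; everything else is unchanged.

```shell
# Assumed location: /etc/portage/make.conf
# Same as the make.globals default, plus --compression=none so wget neither
# requests compression nor decompresses the server's response.
FETCHCOMMAND="wget -t 3 -T 60 --passive-ftp --compression=none -O \"\${DISTDIR}/\${FILE}\" \"\${URI}\""
```

Any other wget-based fetch setting in use would presumably need the same option.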
Copying wget maintainer since this new behavior is pretty questionable.
*** Bug 640948 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 636238 ***
The erlang.org HTTP server incorrectly sets "Content-Encoding: x-gzip":

> $ curl -IL http://www.erlang.org/download/otp_src_20.1.tar.gz
> HTTP/1.1 301 Moved Permanently
> Server: nginx
> Date: Wed, 13 Dec 2017 22:09:45 GMT
> Content-Type: text/html
> Content-Length: 178
> Connection: keep-alive
> Location: http://erlang.org/download/otp_src_20.1.tar.gz
>
> HTTP/1.1 200 OK
> Date: Wed, 13 Dec 2017 22:15:03 GMT
> Server: Apache/1.3.42 (Unix)
> Last-Modified: Tue, 26 Sep 2017 10:17:28 GMT
> ETag: "22319d-534bcd8-59ca2938"
> Accept-Ranges: bytes
> Content-Length: 87342296
> Content-Type: application/x-tar
> Content-Encoding: x-gzip
(In reply to Zac Medico from comment #8)
> The erlang.org http server incorrectly sets "Content-Encoding: x-gzip":

If you add the "--compressed" option, curl sends "Accept-Encoding: deflate, gzip" and the web server responds with "Content-Encoding: gzip", but without re-compressing the data stream.

So yeah, this seems to be a config issue on the web server, but one that would be easily worked around by updating FETCHCOMMAND.
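To compare what the server reports with and without "--compressed", the Content-Encoding header can be filtered out of the curl output. This is only a sketch: content_encoding is a hypothetical helper name, and the URL in the usage comment is the one from comment #8.

```shell
# Hypothetical helper: print the Content-Encoding value(s) found in the
# HTTP response headers read from stdin.
content_encoding() {
    # Strip carriage returns, then match the header name case-insensitively.
    tr -d '\r' | awk -F': *' 'tolower($1) == "content-encoding" { print $2 }'
}

# Usage (needs network access):
#   curl -sIL http://erlang.org/download/otp_src_20.1.tar.gz | content_encoding
#   curl -sIL --compressed http://erlang.org/download/otp_src_20.1.tar.gz | content_encoding
```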
Gentoo would be great (again) if people spent as much time reporting problems to server admins as they do shoving problems under the carpet.
The upstream web server does not obey the HTTP spec with respect to the "Content-Encoding: x-gzip" header, so an argument can be made to switch the files in SRC_URI to mirror://gentoo until the problem has been fixed upstream.
I think this is solved now that the ebuilds have moved to using GitHub tarballs.