I'm trying to fetch a binary package I've uploaded to Amazon AWS S3 but it is failing. The file name is dvd+rw-tools-7.1-r1.tbz2, but S3 doesn't like the plus symbol (+) when this is converted to a URL.

I attempted to look through the code and find an easy way to resolve this. For binary packages the function is _start in _emerge/BinPkgFetcher.py. Adding the actual modifications to this file to conditionally quote the URL isn't difficult, but I don't know how to retry the fetch if it fails the first time.

I haven't looked to see what happens with source package fetches.

***********************************************
*** Current output for binary package fetch ***
***********************************************
>>> Fetching (1 of 1) app-cdr/dvd+rw-tools-7.1-r1::gentoo
--2016-10-21 01:27:18-- https://s3.amazonaws.com/mybucket/app-cdr/dvd+rw-tools-7.1-r1.tbz2
Resolving s3.amazonaws.com... 52.216.64.51
Connecting to s3.amazonaws.com|52.216.64.51|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2016-10-21 01:27:19 ERROR 403: Forbidden.

********************************
*** Expected output on retry ***
********************************
>>> Fetching (1 of 1) app-cdr/dvd+rw-tools-7.1-r1::gentoo
--2016-10-21 01:27:52-- https://s3.amazonaws.com/mybucket/app-cdr/dvd%2Brw-tools-7.1-r1.tbz2
Resolving s3.amazonaws.com... 52.216.65.51
Connecting to s3.amazonaws.com|52.216.65.51|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 140577 (137K) [application/x-tar]
Saving to: '/usr/portage/packages/app-cdr/dvd+rw-tools-7.1-r1.tbz2.partial'
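
For illustration, the "quote the URL and retry" idea described above could look roughly like the sketch below. This is not Portage code: candidate_uris, fetch_with_fallback and the fetch callable are hypothetical names, and only Python's standard urllib.parse is used. It assumes the retry should happen only when percent-encoding actually changes the path.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def candidate_uris(uri):
    """Return the URI as given, then a variant with a percent-encoded path.

    Hypothetical helper, not part of Portage.
    """
    parts = urlsplit(uri)
    # safe="/%" keeps path separators and any existing percent-escapes
    # untouched, so an already-encoded URI is not encoded a second time.
    quoted_path = quote(parts.path, safe="/%")
    quoted = urlunsplit(parts._replace(path=quoted_path))
    return [uri] if quoted == uri else [uri, quoted]


def fetch_with_fallback(uri, fetch):
    """Try each candidate in order and return the first one that succeeds.

    `fetch` is a hypothetical callable taking a URI and returning True
    on success; None is returned if every candidate fails.
    """
    for candidate in candidate_uris(uri):
        if fetch(candidate):
            return candidate
    return None
```

With the URL from this report, the first candidate is the unmodified URI and the second one replaces "+" with "%2B", which matches the two requests shown in the output above.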
This isn't just limited to dvd+rw-tools, although that does make the bug easier to understand. I've had the issue with gtk+ packages too. I'm not sure if the URL encoding should be done by default. If that is the case then the change is trivial to implement.
Adjusting the summary to address the binpkg aspect. SRC_URI is being addressed via bug 598380.
IMHO, Portage should leave URIs alone. If we encode "+" as "%2B" then how would one pass a literal plus sign?

Note that RFC 3986 defines "+" (amongst other characters) as a sub-delimiter, in order "to provide a set of delimiting characters that are distinguishable from other data within a URI".

Maybe it is clearer for "/" which is also a reserved character. You only encode it as "%2F" if you *don't* want it to act as a delimiter.

(In reply to MCassaniti from comment #0)
> https://s3.amazonaws.com/mybucket/app-cdr/dvd+rw-tools-7.1-r1.tbz2
> Resolving s3.amazonaws.com... 52.216.64.51
> Connecting to s3.amazonaws.com|52.216.64.51|:443... connected.
> HTTP request sent, awaiting response... 403 Forbidden

> https://s3.amazonaws.com/mybucket/app-cdr/dvd%2Brw-tools-7.1-r1.tbz2
> Resolving s3.amazonaws.com... 52.216.65.51
> Connecting to s3.amazonaws.com|52.216.65.51|:443... connected.
> HTTP request sent, awaiting response... 200 OK
> [...]

As one can see, the 403 and 200 responses are generated on the server side.
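
The point about "/" and "%2F" can be seen with a small sketch using Python's standard urllib.parse. The helper name path_segments and the example.invalid host are made up for illustration. The path is split on literal "/" first and only then percent-decoded, so an encoded "%2F" ends up as data inside a segment rather than as a delimiter.

```python
from urllib.parse import unquote, urlsplit


def path_segments(uri):
    """Split the path on literal '/' and then percent-decode each segment.

    Hypothetical helper for illustration only.
    """
    # The path starts with '/', so the first split element is empty.
    raw_segments = urlsplit(uri).path.split("/")[1:]
    return [unquote(segment) for segment in raw_segments]


# A literal "/" separates segments: three segments.
print(path_segments("https://example.invalid/a/b/c"))
# "%2F" is data within one segment: two segments.
print(path_segments("https://example.invalid/a%2Fb/c"))
```

The two URIs therefore identify different paths, which is why blanket re-encoding of reserved characters on the client side can change what is being requested.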