Bug 174704 - SRC_URI and URLs containing ampersands or question marks
Summary: SRC_URI and URLs containing ampersands or question marks
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Ebuild Support
Hardware: All Linux
Importance: High normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-04-15 18:23 UTC by Phillip Berndt
Modified: 2007-10-01 02:09 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
Use urllib for URL parsing (portage-correct-srcuri.patch,2.31 KB, patch)
2007-04-21 19:56 UTC, Phillip Berndt

Description Phillip Berndt 2007-04-15 18:23:23 UTC
Portage can't handle SRC_URI URLs containing ampersands or question marks. Problems include:
* The default FETCHCOMMAND does not enclose ${URI} in quotes.
* Even if the file is stored in distfiles/, Portage says that it can't be found, although it is listed in the Manifest (created by ebuild ... digest).

Some websites use query parameters for static files, e.g.
 http://www.host.tld/download.tbz       --> Download information page
 http://www.host.tld/download.tbz?dl=1  --> Download

I suggest using urlparse instead of basename for URI-parsing.

Reproducible: Always

Steps to Reproduce:
1. Create an ebuild containing a SRC_URI following the scheme http://*/foo?bar&baz
2. Call ebuild foo.ebuild digest
3. Enjoy.
Comment 1 solar (RETIRED) gentoo-dev 2007-04-15 18:33:57 UTC
At least one ebuild in the tree uses ampersands, and it escapes them:
games-puzzle/hoh-bin/hoh-bin-1.01.ebuild:SRC_URI="http://retrospec.sgn.net/download.php?id=63\&path=games/hoh/bin/hohlin-${PV/./}.tar.bz2"
Comment 2 Phillip Berndt 2007-04-15 18:52:28 UTC
Ah, that's a workaround I didn't think of. :) basename strips everything before the last /.

If the URL was specified as

SRC_URI="http://retrospec.sgn.net/download.php?path=games/hoh/bin/hohlin-${PV/./}.tar.bz2\&id=63"

(which should make no difference), it no longer works. Of course, one could always append a filename to the URL, e.g.

 http://www.host.tld/download.tbz?dl=1 ->
 http://www.host.tld/download.tbz?dl=1\&/download.tbz
 
but in any case, I'd prefer to resolve this (potential) design flaw.
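
To illustrate why the parameter order matters, here is a minimal Python sketch. It assumes the distfile name is simply the basename of the whole URI, query string included; `distfile_name` is a hypothetical helper for illustration, not a Portage function, and the URLs use a plain `&` with `${PV/./}` already expanded to 101.

```python
import posixpath


def distfile_name(uri):
    # Assumption: the filename is whatever follows the last "/" in the URI,
    # with the query string still attached.
    return posixpath.basename(uri)


# Filename parameter last: the basename happens to be the real filename.
path_last = "http://retrospec.sgn.net/download.php?id=63&path=games/hoh/bin/hohlin-101.tar.bz2"
# Same parameters in the other order: the trailing "&id=63" leaks into the name.
path_first = "http://retrospec.sgn.net/download.php?path=games/hoh/bin/hohlin-101.tar.bz2&id=63"
# The "append a filename" workaround from above.
appended = "http://www.host.tld/download.tbz?dl=1&/download.tbz"

print(distfile_name(path_last))   # hohlin-101.tar.bz2
print(distfile_name(path_first))  # hohlin-101.tar.bz2&id=63
print(distfile_name(appended))    # download.tbz
```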
Comment 3 Phillip Berndt 2007-04-15 18:54:32 UTC
Ps.

nijil hoh-bin # ebuild hoh-bin-1.01.ebuild fetch
...
>>> Downloading 'http://retrospec.sgn.net/download.php?id=63\&path=games/hoh/bin/hohlin-101.tar.bz2'
20:53:39 (116.34 KB/s) - `/usr/portage/distfiles/game-links.php?link=hoh.1' saved [30635]

That doesn't work for me either :)
Comment 4 Jakub Moc (RETIRED) gentoo-dev 2007-04-16 11:28:31 UTC
Bug 172068 is related.
Comment 5 Phillip Berndt 2007-04-21 19:56:00 UTC
Created attachment 116928
Use urllib for URL parsing

Since there has been no further reaction on this topic, I wrote a patch to implement my suggestion. I don't know anything about Portage coding standards, so please regard it as a proof of concept.

With this patch, all URLs in portage.py and md5check.py are converted to filenames using urlparse and basename. I tested it using
 SRC_URI="http://foobar/foo/bar/baz.tar.bz2?foobar"

After editing make.conf to
 FETCHCOMMAND="/usr/bin/wget -t 2 -T 60 --passive-ftp -O '\${DISTDIR}/\${FILE}' '\${URI}'"
the file was saved as baz.tar.bz2. References to that file in ebuilds were correct at first sight.
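
The idea of combining urlparse and basename can be sketched as follows. This is not the attached patch, only a minimal illustration in current Python, where urlparse lives in `urllib.parse`; `uri_to_filename` is a hypothetical name.

```python
import posixpath
from urllib.parse import urlparse


def uri_to_filename(uri):
    # Keep only the path component of the URI (the query string and
    # fragment are dropped), then take the last path segment.
    path = urlparse(uri).path
    return posixpath.basename(path)


# The test URI from this comment.
print(uri_to_filename("http://foobar/foo/bar/baz.tar.bz2?foobar"))
# -> baz.tar.bz2

# A download-script URL: every file served by this script would map to
# the same name, which is why shared filenames need checking.
print(uri_to_filename(
    "http://www.hackl.dhs.org/data/download/download.php?file=spacerider-0.13.tar.bz2"
))
# -> download.php
```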

I also checked the whole Portage tree for packages that might end up sharing a distfiles filename under the patch (I searched with m/(?<!#)(SRC_URI="[^"]+:\/\/\S+(\?|&)[^"]*)/i). Affected packages include:
------
/usr/portage/games-puzzle/hoh-bin/hoh-bin-1.01.ebuild
 SRC_URI="http://retrospec.sgn.net/download.php?id=63\&path=games/hoh/bin/hohlin-${PV/./}.tar.bz2
/usr/portage/sys-power/nvram-wakeup/nvram-wakeup-0.97_p863.ebuild
 SRC_URI="${SRC_URI}
 	http://nvram-wakeup.svn.sourceforge.net/viewvc/*checkout*/nvram-wakeup/trunk/nvram-wakeup/nvram-wakeup-mb.c?revision=${REV}
/usr/portage/games-board/r-katro/r-katro-0.7.0.ebuild
 SRC_URI="http://f.rodrigo.free.fr/r-tech/cmp/addon-module/link/link.php?games/r-katro/${P}.tar.bz2
/usr/portage/dev-python/pythonutils/pythonutils-0.2.3.ebuild
 SRC_URI="http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=${P}.zip
/usr/portage/dev-python/pythonutils/pythonutils-0.2.5.ebuild
 SRC_URI="http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=${P}.zip
/usr/portage/dev-python/pythonutils/pythonutils-0.2.0.ebuild
 SRC_URI="http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=${P}.zip
/usr/portage/dev-python/pythonutils/pythonutils-0.2.2.ebuild
 SRC_URI="http://www.voidspace.org.uk/cgi-bin/voidspace/downman.py?file=${P}.zip
/usr/portage/media-sound/kstreamripper/kstreamripper-0.3.4.ebuild
 SRC_URI="http://www.tuxipuxi.org/?download=${P}.tar.bz2
/usr/portage/games-arcade/spacerider/spacerider-0.13.ebuild
 SRC_URI="http://www.hackl.dhs.org/data/download/download.php?file=${P}.tar.bz2
-----
(All of them create files like "/usr/portage/distfiles/download.php?file=spacerider-0.13.tar.bz2")
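
For reference, the search above can be approximated in Python. This is a sketch, not the script actually used: the pattern is a direct transcription of the Perl regex, `find_query_uris` is a hypothetical helper, and the sample lines are illustrative.

```python
import re

# Transcription of the Perl pattern: a SRC_URI assignment not directly
# preceded by "#", containing "://" and later a "?" or "&".
QUERY_URI = re.compile(r'(?<!#)(SRC_URI="[^"]+://\S+(\?|&)[^"]*)', re.IGNORECASE)


def find_query_uris(lines):
    # Return the matched SRC_URI fragment for every line that matches.
    hits = []
    for line in lines:
        match = QUERY_URI.search(line)
        if match:
            hits.append(match.group(1))
    return hits


sample = [
    'SRC_URI="http://www.tuxipuxi.org/?download=${P}.tar.bz2"',  # query string
    'SRC_URI="mirror://gentoo/${P}.tar.bz2"',                    # no ? or &
    '#SRC_URI="http://example.invalid/get.php?file=x"',          # commented out
]

for hit in find_query_uris(sample):
    print(hit)
```

Only the first sample line is reported; the match stops before the closing quote, which is why the entries in the list above have no trailing quote.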

A more generic and maybe even nicer approach (than my patch) would be to force package filenames to
 category/package.<original extension>
which, however, would not be compatible with the existing distfiles/ structure.
Comment 6 Zac Medico gentoo-dev 2007-06-01 14:45:48 UTC
In order to handle all the different possible cases, it seems like what's really needed is an ability to specify a mapping, as suggested in bug 177863.
Comment 7 Zac Medico gentoo-dev 2007-10-01 02:09:31 UTC
(In reply to comment #5)
> Created an attachment (id=116928)
> Use urllib for URL parsing

Since your patch introduces incompatible behavior (files saved and unpacked using different names), and the enhancement suggested in bug 177863 will solve the problem, I'm marking this WONTFIX. Thanks anyway for trying.