Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 175058

Summary: emerge/portage does not support DISTDIR path with spaces in the name
Product: Portage Development Reporter: Alexander Skwar <askwar>
Component: Enhancement/Feature RequestsAssignee: Portage team <dev-portage>
Status: RESOLVED FIXED    
Severity: normal CC: eva, togge.gentoo, tom.gl
Priority: High Keywords: InVCS
Version: 2.1   
Hardware: All   
OS: All   
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 172589    
Attachments: Adds " around the distdir var in portage/__init__.py
quote the variables in FETCHCOMMAND and RESUMECOMMAND
avoid potential quoting issues by spawning FETCHCOMMAND without a shell
expand all possible variables in each argument

Description Alexander Skwar 2007-04-18 10:18:49 UTC
I'm using Portage 2.1.2.4. In make.conf, I've set, among other things:

DISTDIR="/Windows/Documents and Settings/askwar/My Documents/Downloads/Unix/Gentoo/Distfiles

So the path for DISTDIR contains spaces in the name.

Now I tried to have emerge fetch some file:

caskwar@winnb000488 /Gentoo/Portage/tree/x11-apps $ emerge -vf xrdb

These are the packages that would be fetched, in order:

Calculating dependencies... done!
[ebuild   R   ] x11-apps/xrdb-1.0.3  USE="-debug" 98 kB 

Total: 1 package (1 reinstall), Size of downloads: 98 kB


>>> Emerging (1 of 1) x11-apps/xrdb-1.0.3 to /
Adjusting permissions recursively: '/Windows/Documents and Settings/askwar/My Documents/Downloads/Unix/Gentoo/Distfiles/'
Adjusting permissions recursively: '/Windows/Documents and Settings/askwar/My Documents/Downloads/Unix/Gentoo/Distfiles/.locks'
>>> Downloading 'http://gentoo.supp.name/distfiles/xrdb-1.0.3.tar.bz2'
--12:09:54--  http://and/
           => `/Windows/Documents/index.html'
Resolving and... failed: Unknown host.
--12:09:55--  http://settings/askwar/My
           => `/Windows/Documents/My'
Resolving settings... failed: Unknown host.
--12:09:55--  http://documents/Downloads/Unix/Gentoo/Distfiles
           => `/Windows/Documents/Distfiles'
Resolving documents... failed: Unknown host.
--12:09:55--  http://gentoo.supp.name/distfiles/xrdb-1.0.3.tar.bz2
           => `/Windows/Documents/xrdb-1.0.3.tar.bz2'
Resolving gentoo.supp.name... 82.208.58.65
Connecting to gentoo.supp.name|82.208.58.65|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 99'840 (98K) [application/x-tar]

100%[====================================>] 99'840       550.29K/s             

12:09:55 (547.99 KB/s) - `/Windows/Documents/xrdb-1.0.3.tar.bz2' saved [99840/99840]


This did not work, as the file was stored on the wrong directory.


It would be nice, if Portage would either support spaces in path names or if it would completely refuse to work, if there's a space in the path name (ie. check if space are in path, and if so, die with an error message to that effect).
Comment 1 Togge 2007-04-18 11:37:00 UTC
Here's a fix, hope it's the correct one :)

$ svn diff pym/portage/__init__.py 
Index: pym/portage/__init__.py
===================================================================
--- pym/portage/__init__.py	(revision 6415)
+++ pym/portage/__init__.py	(working copy)
@@ -2601,8 +2601,8 @@
 				else:
 					resumecommand=mysettings["RESUMECOMMAND"]
 
-				fetchcommand=fetchcommand.replace("${DISTDIR}",mysettings["DISTDIR"])
-				resumecommand=resumecommand.replace("${DISTDIR}",mysettings["DISTDIR"])
+				fetchcommand=fetchcommand.replace("${DISTDIR}",'"'+mysettings["DISTDIR"]+'"')
+				resumecommand=resumecommand.replace("${DISTDIR}",'"'+mysettings["DISTDIR"]'"')
 
 				if not can_fetch:
 					if fetched != 2:
Comment 2 Togge 2007-04-18 11:39:20 UTC
Created attachment 116626 [details, diff]
Adds " around the distdir var in portage/__init__.py

Damn those line breaks...
Comment 3 Alexander Skwar 2007-04-18 11:58:46 UTC
(In reply to comment #1)
> Here's a fix, hope it's the correct one :)
> 
> $ svn diff pym/portage/__init__.py 
> Index: pym/portage/__init__.py

Couldn't find this file, so I applied the patch against /usr/lib/portage/pym/portage.py

I'd like to propose the following though:

                                fetchcommand=fetchcommand.replace("${DISTDIR}","'" + mysettings["DISTDIR"].encode('string-escape') + "'")                                                                                                             
                                resumecommand=resumecommand.replace("${DISTDIR}","'" + mysettings["DISTDIR"].encode('string-escape') + "'")                                                                                                           
                                                                                                                           

Ie. enclose the string in ' and encode the string. The encode() will encode ' to \', see http://www.thescripts.com/forum/thread40131.html

With this, even more funny path names like 

/tmp/Funny " Directory

(yes, with a ") work.
Comment 4 Alexander Skwar 2007-04-28 14:24:08 UTC
Any chance that this bug gets fixed in an upcoming release of portage?
Comment 5 Zac Medico gentoo-dev 2007-04-28 17:42:18 UTC
Created attachment 117564 [details, diff]
quote the variables in FETCHCOMMAND and RESUMECOMMAND

Rather than do a special case for ${DISTDIR}, I've simply quoted both variables in the default FETCHCOMMAND and RESUMECOMMAND.
Comment 6 Alexander Skwar 2007-04-28 20:29:38 UTC
This will still break for directories named like

  /tmp/Funny " Directory

(ie. with " in the name).

And what about

  /tmp `rm -rf /` Dir

?

With the command I suggested in Comment #3, all of these would work.
Comment 7 Zac Medico gentoo-dev 2007-04-29 05:37:04 UTC
(In reply to comment #3)
> resumecommand=resumecommand.replace("${DISTDIR}","'" +
> mysettings["DISTDIR"].encode('string-escape') + "'")                            

This is getting ridiculous.  I think it will be better if we just bypass the bash shell completely.  I doubt that anybody actually uses any shell features in their FETCHCOMMAND anyway.
Comment 8 Zac Medico gentoo-dev 2007-04-29 06:40:42 UTC
Created attachment 117612 [details, diff]
avoid potential quoting issues by spawning FETCHCOMMAND without a shell

This is in svn r6455:6457.
Comment 9 Alexander Skwar 2007-04-29 13:51:24 UTC
(In reply to comment #7)
> (In reply to comment #3)
> > resumecommand=resumecommand.replace("${DISTDIR}","'" +
> > mysettings["DISTDIR"].encode('string-escape') + "'")                            
> 
> This is getting ridiculous. 

I disagree. If  there's something that's ridiculous, it is that user entered values are used directly in a code; ie. without making sure that the code the user entered is safe (by whatever is defined as being "safe").

> I think it will be better if we just bypass the
> bash shell completely.

Maybe. Why make the code so complicated?

BTW: The same issue exists for all the other *DIR entries like PORTDIR, PKGDIR, .... (To be honest, I haven't checked, but I'd suppose so.)


>  I doubt that anybody actually uses any shell features
> in their FETCHCOMMAND anyway.

Well, it might happen, though. Suppose you've got multiple people who are allowed to edit make.conf and one is trying to play a trick on the other "portage admins". Or suppose there's some typo in the make.conf (eg. accidently pasted text). Without proper "safety" checks, bad things might happen.

And no, this is not about "holding the hands of other people". It's about writing something, that provides safety; or that doesn't feature unnecessary dangers.
Comment 10 Marius Mauch (RETIRED) gentoo-dev 2007-04-29 14:06:23 UTC
IMO we should simply ensure that all config variables have sane values, e.g. for paths that they match [-0-9a-zA-Z/_] or so and abort if we get unexpected input, rather than checking all usages for potential issues.
Comment 11 Alexander Skwar 2007-04-29 15:57:15 UTC
(In reply to comment #10)
> IMO we should simply ensure that all config variables have sane values, e.g.
> for paths that they match [-0-9a-zA-Z/_] or so and abort if we get unexpected
> input, rather than checking all usages for potential issues.

Fine. This way, only safe values are allowed.

But please don't simply check for a-zA-Z, but take into consideration "special" characters like the german umlauts (äöüÄÖÜ and also ß). So, please check for something like :alnum:. And please also allow a space in the list of valid characters.
Comment 12 Zac Medico gentoo-dev 2007-04-29 18:07:49 UTC
(In reply to comment #9)
> > I think it will be better if we just bypass the
> > bash shell completely.
> 
> Maybe. Why make the code so complicated?

Why bother to spawn the command via a shell when it creates potential problems and nobody needs it anyway?  I think the patch that I've attached is complex.  It's pretty simple and it doesn't put any restriction on the characters that can be used in ${DISTDIR} or ${URI} since the arguments are passed to the exec call as an array.
Comment 13 Zac Medico gentoo-dev 2007-04-29 18:08:41 UTC
(In reply to comment #12)
> I think the patch that I've attached is complex. 
 s/I think/I don't think/ ;)
Comment 14 Zac Medico gentoo-dev 2007-05-05 04:37:44 UTC
This has been released in 2.1.2.6.
Comment 15 TGL 2007-05-05 11:56:58 UTC
Here is my FETCHCOMMAND:
FETCHCOMMAND="/usr/bin/curl --ftp-method singlecwd -g -f -k \
	--connect-timeout 10 --retry 3 \
	-o \${DISTDIR}/\${FILE} \${URI}"

Sure, it no more works with this patch, since "${DISTDIR}/${FILE}" is no more substituted.  For now, i've worked around that by adding it to the "variables" dict:
     variables = {"${DISTDIR}":mysettings["DISTDIR"],
-        "${URI}":loc, "${FILE}":myfile}
+        "${URI}":loc, "${FILE}":myfile,
+        "${DISTDIR}/${FILE}":mysettings["DISTDIR"]+"/"+myfile}

But I don't think it's a real fix: what if, for instance, someone uses  "--output-document=\${FILE}" in his FETCHCOMMAND?  I'm not sure how to rewrite this code so that it doesn't expect the variables in FETCHCOMMAND are all isolated by the split() function.

Could someone reopen this bug?
Thanks.
Comment 16 Zac Medico gentoo-dev 2007-05-05 17:45:49 UTC
Created attachment 118264 [details, diff]
expand all possible variables in each argument

This is fixed in svn rr6477:6479.

(In reply to comment #15)
> I'm not sure how to rewrite
> this code so that it doesn't expect the variables in FETCHCOMMAND are all
> isolated by the split() function.
Comment 17 Zac Medico gentoo-dev 2007-05-05 17:47:40 UTC
Reopening until the last patch is released.
Comment 18 Zac Medico gentoo-dev 2007-05-07 03:33:34 UTC
This has been released in 2.1.2.7.