OK. This is pretty bad, and we are kind of blessed that a service error on the part of portagefilelist.de brought it up.
The script uses curl to fetch "http://www.portagefilelist.de/index.php/Special:PFLQuery2?file=$1&searchfile=lookup&lookup=file&txt" (where "$1" is the user's input). At the moment, this service is giving a 403 Forbidden response, so the awk code in e-file fails.
Probably the easiest way to understand the security issue I'm referring to is to simply run "e-file foo" at the moment. You will find it builds all kinds of shell commands directly from the response that comes back from the web server.
The first one is on line 65:
cmd="ls -tgGd --time-style=+%c /var/db/pkg/" pkg "* 2>/dev/null"
There, that "pkg" is an awk variable that came from splitting the input on
"/". In my execution I end up with this error:
sh: -c: line 0: `ls -tgGd --time-style=+%c /var/db/pkg/</body></html>/* 2>/dev/null'
This resulted in an ugly shell error -- which I've seen several users mention
on #gentoo -- but the real problem is that if that HTML was something
malicious, it could wreak havoc.
I don't think I need to make a proof-of-concept input, but I could do so,
easily, if you wanted.
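For illustration only, here is a harmless sketch in Python (not the e-file code itself) of what happens when text from the server is pasted into a shell command line. The function names and the sample "pkg" value are made up; shlex.quote is shown only to make the difference visible, and not invoking a shell at all would be the better fix.

```python
import shlex

def build_cmd_unsafe(pkg):
    # Mirrors the awk line: untrusted text is pasted straight into a shell string.
    return "ls -tgGd --time-style=+%c /var/db/pkg/" + pkg + "* 2>/dev/null"

def build_cmd_quoted(pkg):
    # Same command, but the untrusted part is quoted so the shell sees one word.
    # The trailing * stays outside the quotes so globbing still works.
    return ("ls -tgGd --time-style=+%c /var/db/pkg/"
            + shlex.quote(pkg) + "* 2>/dev/null")

# Hypothetical hostile "package name" that a server (or a MitM) could send back.
evil = "x; echo INJECTED #"

print(build_cmd_unsafe(evil))   # the shell would treat "echo INJECTED" as a second command
print(build_cmd_quoted(evil))   # the hostile text is confined to a single quoted argument
```

Nothing is executed here; the two strings are only printed so they can be compared.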
My proposal is to completely offline the package for now, then when
portagefilelist.de gets its "Special:PFLQuery2" back up (I submitted a bug
report to them -- I don't know how responsive they are), then someone should
rewrite a legitimate script, probably in Python.
I am willing to do this, but for a modest fee of $4,350.
Only kidding about the fee. I'd be fine with rewriting it. From the looks of
it the output from the original Special:PFLQuery2 was pretty well-formed. If
we used Python's httplib, not only could we fail more gracefully in events
such as the current 403 error, but we could also do things like use os.stat()
to get the file time rather than a constructed "ls ..." command.
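A rough sketch of that idea, assuming Python 3 (where httplib lives on as http.client and urllib.request sits on top of it). The function names fetch_listing and installed_mtime and the db_root parameter are hypothetical, not part of any existing script, and the URL is passed in rather than hard-coded.

```python
import glob
import os
import urllib.error
import urllib.request

def fetch_listing(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response body as text, or None if the request fails."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # e.g. a 403 Forbidden: report it and stop, never parse the error page.
        print("server returned HTTP %d, giving up" % err.code)
        return None
    except urllib.error.URLError as err:
        print("could not reach server: %s" % err.reason)
        return None

def installed_mtime(category_pkg, db_root="/var/db/pkg"):
    """Newest mtime among installed versions matching category/pkg, or None."""
    # glob.escape keeps wildcard characters in the untrusted name literal;
    # no shell is involved at any point.
    pattern = os.path.join(db_root, glob.escape(category_pkg) + "*")
    times = [os.stat(path).st_mtime for path in glob.glob(pattern)]
    return max(times) if times else None
```

With this shape, an HTML error page never reaches the code that looks at /var/db/pkg, and a bogus name simply matches nothing.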
Hit me up on IRC as "rking" or email firstname.lastname@example.org if you need anything.
I think I fixed the service outage.
Please acknowledge (or not, if it doesn't ;) ).
Currently there is no "nice" API for searching :( I have to program a new interface. Maybe I will do it in SOAP. That (providing a WSDL) would make programming clients easy, wouldn't it?
Yep, it looks great.
Now that I see the format, it is simple TSV - using awk to build up shell commands is definitely not the right approach.
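To illustrate the TSV point: a parser can treat every field as plain data and never hand any of it to a shell. This is a minimal sketch using Python's csv module; the column names and the sample line are hypothetical, since the real column layout of the text interface is not spelled out in this report.

```python
import csv
import io

# Hypothetical column layout; adjust to the real header of the text interface.
FIELDS = ["category", "package", "path", "file", "version"]

def parse_listing(text):
    """Turn TSV text into a list of dicts; rows with the wrong width are dropped."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        if len(row) != len(FIELDS):
            # An HTML error page or a truncated line ends up here and is ignored.
            continue
        rows.append(dict(zip(FIELDS, row)))
    return rows

# Made-up sample: one well-formed row followed by a stray piece of HTML.
sample = "app-misc\tfoo\t/usr/bin\tfoo\t1.0\n</body></html>\n"
print(parse_listing(sample))
```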
Here is what I have so far:
there is https://github.com/richardgv/e-file-py/blob/master/e-file-py.py
Is it yours? I am a little confused because an issue was created in my own tracker:
Maybe I should stop public use of Jira? However, let's talk about the script.
It's nice and it will help a lot because there is nothing that I have to do (sorry, I have less and less time :-( ). The script directly uses the HTML output of http://www.portagefilelist.de .
There is just one thing: It is using dev-python/beautifulsoup-4 which is currently keyworded in portage.
But I think we should put it in Portage and wait for stabilization of beautifulsoup.
@rking: Next time please don't assign your bugs yourself but use the default assignee email@example.com. It took me two weeks to become aware of this bug.
Hrm, weird… I don't remember assigning the bug to myself. Oops.
So, I notice the text interface is down (again) and again the awk-based version is doing crazy stuff on failure (it starts off by taking a really long time to fail, then failing similar to before).
This puts more pressure on the use of the py3 version.
And no, it's not my code, but I like it.
(In reply to comment #5)
> So, I notice the text interface is down (again) and again the awk-based
> version is doing crazy stuff on failure (it starts off by taking a really
> long time to fail, then failing similar to before).
> This puts more pressure on the use of the py3 version.
Well, I had to delete millions of records (pre-2009). Thus the DB was "down for maintenance". Sorry for that :-(
I have sent a message to the original author of e-file. Maybe he wants to participate.
(In reply to comment #3)
> there is https://github.com/richardgv/e-file-py/blob/master/e-file-py.py
> Is it yours? I am a little confused because an issue was created in my own
It's mine. (richardgv == Richard Grenville) Sorry for the late reply; I forgot that the issue tracker is anonymous and that I won't get any email notification when you reply. Indeed, I suppose using an issue tracking system that supports user registration might be a better idea.
> It's nice and it will help a lot because there is nothing that I have to do
> (sorry, I have less and less time :-( ). The script directly uses the HTML
> output of http://www.portagefilelist.de .
Thanks for your appreciation. Do you have any other requirements about the script? The script definitely lacks proper documentation of all its features, but I simply don't like writing documentation...
> There is just one thing: It is using dev-python/beautifulsoup-4 which is
> currently keyworded in portage.
At the time I wrote the script, dev-python/beautifulsoup:4 already had the keyword ~amd64, as far as I remember. I could have used regular expressions to parse the HTML result instead, but I thought that would be even more unreliable than BeautifulSoup.
> But I think we should put it in Portage and wait for
> stabilization of beautifulsoup.
Could you put a link to my script on portagefilelist.de first?
(In reply to comment #6)
> I have sent a message to the original author of e-file. Maybe he wants to
bones7456's blog ( http://luy.li/ ) has not been updated for almost a year. I'm afraid you cannot expect him to come back shortly.
@@ -25,7 +25,7 @@
-curl -s $URL | awk -v isgentoo=$isgentoo '
+curl -s -f $URL | awk -v isgentoo=$isgentoo '
Does this patch fix the issue? Is there a way to test it?
(In reply to Daniel Pielmeier from comment #9)
> Does this patch fix the issue? Is there a way to test it?
Yes. Do a MitM for the www.portagefilelist.de domain that returns something irresponsible from tcp/80.
That's what would have worked when I tested the patch I posted, but since the web server in question was on the blink at the time, I didn't need it.
(In reply to Jeroen Roovers from comment #10)
> (In reply to Daniel Pielmeier from comment #9)
> > Does this patch fix the issue? Is there a way to test it?
> Yes. Do a MitM for the www.portagefilelist.de domain that returns something
> irresponsible from tcp/80.
The easiest would probably be to add www.portagefilelist.de to your hosts file and direct it to something that isn't www.portagefilelist.de .
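One way to provide that "something" is a tiny local web server that always answers 403 with an HTML body, much like the broken service did. This is only a sketch: the handler name, the body text and the helper functions are made up for illustration.

```python
import http.server
import threading
import urllib.error
import urllib.request

class BrokenHandler(http.server.BaseHTTPRequestHandler):
    """Answers every GET with 403 and an HTML body."""

    def do_GET(self):
        body = b"<html><body>Forbidden</body></html>\n"
        self.send_response(403)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet

def start_server(port=0):
    """Start the fake server on 127.0.0.1; port 0 lets the OS pick a free port."""
    server = http.server.HTTPServer(("127.0.0.1", port), BrokenHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def status_of(url):
    """Return the HTTP status code for url, including error statuses."""
    # An empty ProxyHandler keeps the request from being sent through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

To stand in for www.portagefilelist.de it would have to listen on port 80 (which normally needs root), with a hosts entry pointing that name at 127.0.0.1. With "curl -s -f" the HTML body should then no longer reach awk, whereas without "-f" it does.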
+*pfl-2.4-r1 (14 Oct 2013)
+ 14 Oct 2013; Daniel Pielmeier <firstname.lastname@example.org> +pfl-2.4-r1.ebuild,
+ Revision bump. Fixes bug #413191. Thanks to rking for the report and Jeroen
+ Roovers for the fix.
Okay I did test this. Using the "-f" flag does not result in garbage output if there is a problem with the webserver.
@rking, Jeroen, Daniel: Thanks for taking care of this.
As discussed with Richard, I will create a JSON interface to PFL, including a JSON Schema. That should take care of server-to-client impacts in the future.
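On the client side, the point of a schema is that a response can be checked before any of it is used. A minimal sketch using only Python's standard library (a real client might use a proper JSON Schema validator instead); the top-level "result" key and the field names are hypothetical, since the interface does not exist yet.

```python
import json

# Hypothetical required fields and their expected types.
REQUIRED = {"category": str, "package": str, "path": str}

def parse_result(text):
    """Return the list of validated entries, or raise ValueError."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError("not JSON: %s" % err)
    if not isinstance(data, dict) or not isinstance(data.get("result"), list):
        raise ValueError("unexpected top-level structure")
    for entry in data["result"]:
        if not isinstance(entry, dict):
            raise ValueError("entry is not an object")
        for key, kind in REQUIRED.items():
            if not isinstance(entry.get(key), kind):
                raise ValueError("bad or missing field: %s" % key)
    return data["result"]
```

An HTML error page fails at the first step, so nothing from it can end up in later processing.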
(In reply to Daniel from comment #13)
> @rking, Jeroen, Daniel: Thanks for taking care of this.
> As discussed with Richard, I will create a JSON interface to PFL, including a
> JSON Schema. That should take care of server-to-client impacts in the
Switching to HTTPS (with a proper certificate) would help, too.
@security: any reason to keep this bug open? pfl-2.4-r1 fixes this bug and is already stable.
Per previous comment a patch has been applied. GLSA needed?
Fix was committed 2 1/2 years ago and no vote placed.
*pfl-2.4-r1 (14 Oct 2013)
14 Oct 2013; Daniel Pielmeier <email@example.com> +pfl-2.4-r1.ebuild,
Revision bump. Fixes bug #413191. Thanks to rking for the report and Jeroen
Roovers for the fix.
Final stable marking:
21 Jan 2014; Markus Meier <firstname.lastname@example.org> pfl-2.4-r1.ebuild:
arm stable, bug #488730