Bug#: 162493
Status: RESOLVED
Resolution: FIXED
Assigned To: Portage Utilities Team <tools-portage@gentoo.org>
Reporter: Chris Gottbrath <gentoo@gil-barad.net>
Bug 162493 blocks: 170220 172955 181170 186549 194356
Votes: 20



Description:   Opened: 2007-01-17 06:48 0000
When I run any glsa-check query that reads GLSA 200701-12, it crashes with a
UnicodeEncodeError on a non-ASCII character in someone's name.


Reproducible: Always

Steps to Reproduce:
1. glsa-check -l 200701-12

Actual Results:  
% glsa-check -l 200701-12
[A] means this GLSA was already applied,
[U] means the system is not affected and
[N] indicates that the system might be affected.

Traceback (most recent call last):
  File "/usr/bin/glsa-check", line 205, in ?
    sys.exit(summarylist(glsalist))
  File "/usr/bin/glsa-check", line 171, in summarylist
    myglsa = Glsa(myid, glsaconfig)
  File "/usr/lib/gentoolkit/pym/glsa.py", line 414, in __init__
    self.read()
  File "/usr/lib/gentoolkit/pym/glsa.py", line 432, in read
    self.parse(urllib.urlopen(myurl))
  File "/usr/lib/gentoolkit/pym/glsa.py", line 470, in parse
    self.description = getText(myroot.getElementsByTagName("description")[0], format="xml")
  File "/usr/lib/gentoolkit/pym/glsa.py", line 233, in getText
    return str(rValue)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 8: ordinal not in range(128)


Expected Results:  
glsa-check should handle unicode properly.

In my environment the following 'quick fix' avoids the crash but replaces the
two characters with question marks. I suspect this is not the 'correct' fix. 

% diff -c /usr/lib/gentoolkit/pym/glsa.py /home/chrisg/temp/
*** /usr/lib/gentoolkit/pym/glsa.py     Wed Jan 17 01:13:07 2007
--- /home/chrisg/temp/glsa.py   Wed Jan 17 02:25:48 2007
***************
*** 230,235 ****
--- 230,236 ----
        if format == "strip":
                rValue = rValue.strip(" \n\t")
                rValue = re.sub("[\s]{2,}", " ", rValue)
+       rValue=rValue.encode('ascii','replace')  # fix to handle unicode input
        return str(rValue)

  def getMultiTagsText(rootnode, tagname, format):
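
For readers following along, here is a minimal sketch of the failure and of the quick fix above. It is written for Python 3, where the ASCII encoding has to be requested explicitly; in the Python 2 of this report, str() applied to a unicode object performed that encoding implicitly. The sample text and the two helper names are hypothetical and are not taken from glsa.py.

```python
# Hypothetical sample text; the real GLSA description is not reproduced here.
text = u"Reported by Andr\xe9"


def to_ascii_strict(value):
    # Explicit form of what Python 2's str() did implicitly to a unicode object.
    return value.encode("ascii")


def to_ascii_replace(value):
    # The quick fix from the diff: characters outside ASCII become '?'.
    return value.encode("ascii", "replace")


try:
    to_ascii_strict(text)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

replaced = to_ascii_replace(text)
print(strict_failed, replaced)
```

The 'replace' handler avoids the exception but loses the original character, which matches the question marks described above.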

------- Comment #1 From Jakub Moc 2007-01-17 13:24:12 0000 -------
*** Bug 162532 has been marked as a duplicate of this bug. ***

------- Comment #2 From Jakub Moc 2007-01-17 13:49:22 0000 -------
just emerge --sync; we've replaced the offending characters in that GLSA
meanwhile...

------- Comment #3 From Richard Benjamin Voigt 2007-04-01 03:01:16 0000 -------
Another one....  I think Chris's solution would be quite reasonable.

glsa-check -l 200703-26
[A] means this GLSA was already applied,
[U] means the system is not affected and
[N] indicates that the system might be affected.

Traceback (most recent call last):
  File "/usr/bin/glsa-check", line 212, in ?
    sys.exit(summarylist(glsalist))
  File "/usr/bin/glsa-check", line 172, in summarylist
    myglsa = Glsa(myid, glsaconfig)
  File "/usr/lib/gentoolkit/pym/glsa.py", line 414, in __init__
    self.read()
  File "/usr/lib/gentoolkit/pym/glsa.py", line 432, in read
    self.parse(urllib.urlopen(myurl))
  File "/usr/lib/gentoolkit/pym/glsa.py", line 470, in parse
    self.description = getText(myroot.getElementsByTagName("description")[0], format="xml")
  File "/usr/lib/gentoolkit/pym/glsa.py", line 233, in getText
    return str(rValue)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 11: ordinal not in range(128)

------- Comment #4 From vicaya 2007-04-01 07:56:44 0000 -------
A better fix would be changing line 233 to:

return str(rValue.encode('utf-8'));

The glsa*.xml files are already encoded in UTF-8, so re-encoding their text to
ASCII (Python's default encoding) is bound to cause problems.
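
A small sketch of this suggestion, assuming Python 3 and a hypothetical helper name (get_text_utf8 is not part of glsa.py). Encoding to UTF-8 keeps every character, so the text can be decoded back unchanged. In Python 2 the outer str() in the proposed line is redundant, because encode() already returns a byte string.

```python
def get_text_utf8(value):
    """Hypothetical stand-in for the last line of getText(): return UTF-8 bytes."""
    return value.encode("utf-8")


text = u"caf\xe9"  # hypothetical sample containing the character from the traceback
encoded = get_text_utf8(text)
decoded = encoded.decode("utf-8")
print(encoded, decoded == text)
```

Unlike the ASCII 'replace' approach, nothing is lost here: the single character u'\xe9' becomes the two bytes 0xC3 0xA9.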

------- Comment #5 From Marius Mauch 2007-05-30 18:17:07 0000 -------
Did anyone actually test those changes with a GLSA containing non-Ascii
characters (with all glsa-check operations)? I have to admit that I'm pretty
ignorant when it comes to Unicode issues, so I'm not exactly the most qualified
person to test this.
Two things one should be aware of here:
1) the current conversion mainly exists to ensure that we only pass Ascii
strings into portage as portage does a few type checks that would fail with
Unicode strings resulting in even nastier error messages.
2) in recent versions glsa-check got a new --mail option, if glsa.py would
return strings containing non-ascii characters one would have to make sure that
we set the correct MIME type for mails.
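
To illustrate point 2, here is a hedged sketch of declaring the charset on a mail body using Python's standard email package. How glsa-check --mail actually builds its message is not shown in this report, so the body text and subject below are hypothetical.

```python
from email.mime.text import MIMEText

# Hypothetical mail body containing a non-ASCII character.
body = u"GLSA summary: reported by Andr\xe9"

# Passing the charset makes the Content-Type header declare it,
# so mail clients know how to decode the body.
msg = MIMEText(body, "plain", "utf-8")
msg["Subject"] = "glsa-check report"  # hypothetical subject line

charset = msg.get_content_charset()
round_trip = msg.get_payload(decode=True).decode("utf-8")
print(msg["Content-Type"], charset, round_trip == body)
```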

------- Comment #6 From Rob M. 2007-06-07 05:40:28 0000 -------
I have the exact same bug, same traceback and everything. Just synced an hour
ago.

Just tried the fix provided; it works fine. This should probably be committed
after some more testing.

------- Comment #7 From Sebastian Siewior 2007-06-07 10:14:45 0000 -------
Just synced, same problem with glsa-200706-02.xml.
Fix from comment #1 did not work, fix #4 works fine. Please apply this glsa

------- Comment #8 From Toby Murray 2007-07-25 13:47:52 0000 -------
And another one this morning. This seems like such a simple problem to fix
permanently... 

------- Comment #9 From Calum 2007-07-26 17:12:15 0000 -------
I'm suffering too on several boxes.

  File "/usr/lib/gentoolkit/pym/glsa.py", line 233, in getText
    return str(rValue)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 34: ordinal not in range(128)

Is there something I've not done - like emerge python with the right unicode
support? If so, I'll do it.

------- Comment #10 From Dmitry Karasik 2007-10-01 03:03:17 0000 -------
Same problem with glsa-200709-18.xml.

Traceback (most recent call last):
  File "/usr/bin/glsa-check", line 168, in <module>
    myglsa = Glsa(x, glsaconfig)
  File "/usr/lib/gentoolkit/pym/glsa.py", line 441, in __init__
    self.read()
  File "/usr/lib/gentoolkit/pym/glsa.py", line 459, in read
    self.parse(urllib.urlopen(myurl))
  File "/usr/lib/gentoolkit/pym/glsa.py", line 497, in parse
    self.description = getText(myroot.getElementsByTagName("description")[0], format="xml")
  File "/usr/lib/gentoolkit/pym/glsa.py", line 242, in getText
    return str(rValue)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 544: ordinal not in range(128)

------- Comment #11 From Jakub Moc 2007-10-01 18:18:17 0000 -------
*** Bug 194404 has been marked as a duplicate of this bug. ***

------- Comment #12 From Toby Murray 2007-10-01 22:01:23 0000 -------
Is this ever going to get fixed or are we just going to continue to come back
here every 2-3 months and complain about it?

------- Comment #13 From Calum 2007-10-02 08:34:29 0000 -------
Agreed. As someone who only updates packages based on the cron output of
glsa-check -l affected, this is quite important to me.
If it stops working, I don't get any notice of security alerts.
They already stopped issuing kernel GLSAs (for some reason I can't fathom) -
but this needs to be solid.
Not all of us have time to read all the bugzilla security entries.

------- Comment #14 From Calum 2007-10-02 08:37:06 0000 -------
Who's going to change the severity then? :)

------- Comment #15 From Gerben Vos 2007-10-02 11:28:49 0000 -------
Added a few things to my proposed fix at bug 194404 .

------- Comment #16 From hexa 2007-10-02 14:07:23 0000 -------
All my security scripts fail every time we have some strange GLSA entry, and I
fail to get the report by mail. That leaves my servers vulnerable while I think
I did my work.

PLEASE, someone fix this and change the severity.

------- Comment #17 From Andrea 2007-10-02 16:56:58 0000 -------
Same problem here with gentoolkit-0.2.3-r1

Fixed changing line 233 in "gentoolkit/pym/glsa.py"
from:
return str(rValue)
to:
return rValue.encode('utf-8')

------- Comment #18 From Peter Bichler 2007-10-08 10:33:52 0000 -------
This works fine for me.
Many thanks


(In reply to comment #17)
> Same problem here with gentoolkit-0.2.3-r1
> 
> Fixed changing line 233 in "gentoolkit/pym/glsa.py"
> from:
> return str(rValue)
> to:
> return rValue.encode('utf-8')
> 

------- Comment #19 From hexa 2007-10-08 10:39:38 0000 -------
(In reply to comment #18)
> This works fine for me.
> Many thanks
> 
> 
It will until you emerge gentoolkit again :-)

------- Comment #20 From Paul Varner 2008-02-21 01:51:41 0000 -------
Released in gentoolkit-0.2.4_rc2
