
Bug 532784

Summary: UnicodeDecodeError on urlopen error with ru_RU.UTF-8 locale
Product: Portage Development
Reporter: Mike Hiretsky <mh>
Component: Core
Assignee: Portage team <dev-portage>
Status: RESOLVED FIXED
Severity: major
CC: mike
Priority: Normal
Keywords: InVCS
Version: 2.2   
Hardware: All   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 484436    
Attachments: [PATCH] bintree.py: do not pass unicode encoding with non-string type

Description Mike Hiretsky 2014-12-17 10:47:35 UTC
If the network (or host) is unreachable, urlopen raises a URLError with the message "[Errno 101] Network is unreachable". I hit this bug with the ru_RU.UTF-8 locale. It appears because the glibc message is localized, which is activated by "locale.setlocale(locale.LC_ALL, '')" in the _emerge/main.py module. I am using Python 2.7.

!!! Error fetching binhost package info from 'http://5.255.226.146/calculate/CLDX/grp/x86_64'
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/emerge", line 50, in <module>
    retval = emerge_main()
  File "/usr/lib64/python2.7/site-packages/_emerge/main.py", line 1070, in emerge_main
    return run_action(emerge_config)
  File "/usr/lib64/python2.7/site-packages/_emerge/actions.py", line 3713, in run_action
    getbinpkgs="--getbinpkg" in emerge_config.opts)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 627, in populate
    self._populate(getbinpkgs)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1010, in _populate
    writemsg("!!! %s\n\n" % str(e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 27: ordinal not in range(128)
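
A minimal sketch of why the traceback points at byte 0xd0 in position 27. It is written for Python 3, where the decode has to be spelled out; under Python 2.7 the same decode presumably happened implicitly when the byte string from str(e) was mixed into a unicode format string. The Russian text used for errno 101 is an assumption, not copied from the failing system.

```python
# Assumed Russian glibc message for ENETUNREACH, wrapped the way
# URLError formats itself: "<urlopen error [Errno 101] ...>".
raw = b"<urlopen error [Errno 101] " + "Сеть недоступна".encode("utf-8") + b">"

failure = None
try:
    # Explicit stand-in for Python 2's implicit ASCII coercion.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    failure = exc

# The ASCII prefix is 27 bytes long, so the first Cyrillic byte sits at index 27.
print(failure.start, hex(raw[failure.start]))  # prints: 27 0xd0
```

The English message never trips this because it is pure ASCII, which would explain why the bug only shows up with a localized glibc.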
Comment 1 Zac Medico gentoo-dev 2014-12-17 17:38:48 UTC
There's a patch in this branch:

	https://github.com/zmedico/portage/tree/bug_532784

I've posted it for review here:

	http://thread.gmane.org/gmane.linux.gentoo.portage.devel/5012
Comment 2 Zac Medico gentoo-dev 2014-12-17 22:14:57 UTC
This is in the master branch now:

https://github.com/gentoo/portage/commit/4496ee37d6fa327ada635c67500e82f830141a9e
Comment 3 Mike Hiretsky 2014-12-19 07:49:43 UTC
I built portage from 9999 today and now get a different error:

!!! Error fetching binhost package info from 'http://5.255.226.146/calculate/CLDX/grp/x86_64'
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/emerge", line 50, in <module>
    retval = emerge_main()
  File "/usr/lib64/python2.7/site-packages/_emerge/main.py", line 1087, in emerge_main
    return run_action(emerge_config)
  File "/usr/lib64/python2.7/site-packages/_emerge/actions.py", line 2811, in run_action
    getbinpkgs="--getbinpkg" in emerge_config.opts)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 630, in populate
    self._populate(getbinpkgs)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1031, in _populate
    _encodings["stdio"], errors="replace"))
TypeError: coercing to Unicode: need string or buffer, URLError found
Comment 4 Zac Medico gentoo-dev 2014-12-19 14:53:12 UTC
(In reply to Mike Hiretsky from comment #3)
>   File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1031, in _populate
>     _encodings["stdio"], errors="replace"))
> TypeError: coercing to Unicode: need string or buffer, URLError found

Hmm, this one is tricky. It seems impossible to reproduce with my locale:

Python 2.7.7 (default, Sep  9 2014, 07:37:40) 
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'en_US.utf8'
>>> import urllib2
>>> e = urllib2.URLError('foo')
>>> unicode(e)
u'<urlopen error foo>'
>>> e = urllib2.URLError(u'foo')
>>> unicode(e)
u'<urlopen error foo>'

I guess I'll have to generate a Russian locale, and do some experimenting...
Comment 5 Arfrever Frehtes Taifersar Arahesis 2014-12-19 15:37:16 UTC
(In reply to Zac Medico from comment #4)

You probably need to add "ru" to LINGUAS in make.conf and rebuild glibc.
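
A quick way to check whether glibc actually translates errno messages on a given machine might look like the sketch below. It is written for Python 3, and `localized_strerror` is a hypothetical helper name, not something from portage.

```python
import errno
import locale
import os


def localized_strerror(locale_name, code=errno.ENETUNREACH):
    """Return (applied, message) for an errno under the requested locale.

    Hypothetical helper: applied is False when the locale is not generated
    on this system, in which case the message comes from the current locale.
    """
    previous = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        try:
            locale.setlocale(locale.LC_ALL, locale_name)
            applied = True
        except locale.Error:
            applied = False
        return applied, os.strerror(code)
    finally:
        # Put the process locale back the way it was.
        locale.setlocale(locale.LC_ALL, previous)


applied, message = localized_strerror("ru_RU.UTF-8")
print(applied, message)
```

If `applied` is True but the message is still in English, the locale exists but the glibc message catalog for it is probably missing, which is the situation the LINGUAS suggestion addresses.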
Comment 6 Mike Hiretsky 2015-04-29 08:06:48 UTC
!!! Error fetching binhost package info from 'http://mirror.cnet.kz/calculate/CLDX/grp/x86_64'
Traceback (most recent call last):
  File "/usr/lib/python-exec/python2.7/emerge", line 50, in <module>
    retval = emerge_main()
  File "/usr/lib64/python2.7/site-packages/_emerge/main.py", line 1154, in emerge_main
    return run_action(emerge_config)
  File "/usr/lib64/python2.7/site-packages/_emerge/actions.py", line 2818, in run_action
    getbinpkgs="--getbinpkg" in emerge_config.opts)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 633, in populate
    self._populate(getbinpkgs)
  File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1034, in _populate
    _encodings["stdio"], errors="replace"))
TypeError: coercing to Unicode: need string or buffer, URLError found
Comment 7 mike@marineau.org 2015-04-30 00:52:51 UTC
I get the following error when the binhost URL returns a 404, with LANG=en_US.UTF-8:

> !!! Error fetching binhost package info from 'http://builds.developer.core-os.net/sdk/amd64/668.0.0/pkgs/'
> Traceback (most recent call last):
>   File "/usr/lib/python-exec/python2.7/emerge", line 50, in <module>
>     retval = emerge_main()
>   File "/usr/lib64/python2.7/site-packages/_emerge/main.py", line 1154, in emerge_main
>     return run_action(emerge_config)
>   File "/usr/lib64/python2.7/site-packages/_emerge/actions.py", line 2818, in run_action
>     getbinpkgs="--getbinpkg" in emerge_config.opts)
>   File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 633, in populate
>     self._populate(getbinpkgs)
>   File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1034, in _populate
>     _encodings["stdio"], errors="replace"))
> TypeError: coercing to Unicode: need string or buffer, HTTPError found

Previously this gracefully logged an error and continued on.
Comment 8 mike@marineau.org 2015-04-30 01:57:02 UTC
(In reply to Zac Medico from comment #4)
> (In reply to Mike Hiretsky from comment #3)
> >   File "/usr/lib64/python2.7/site-packages/portage/dbapi/bintree.py", line 1031, in _populate
> >     _encodings["stdio"], errors="replace"))
> > TypeError: coercing to Unicode: need string or buffer, URLError found
> 
> Hmm, this one is tricky. It seems impossible to reproduce with my locale:
> 
> Python 2.7.7 (default, Sep  9 2014, 07:37:40) 
> [GCC 4.7.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL, '')
> 'en_US.utf8'
> >>> import urllib2
> >>> e = urllib2.URLError('foo')
> >>> unicode(e)
> u'<urlopen error foo>'
> >>> e = urllib2.URLError(u'foo')
> >>> unicode(e)
> u'<urlopen error foo>'
> 
> I guess I'll have to generate a Russian locale, and do some experimenting...

The session above did not raise an error because it calls unicode() without an encoding, unlike the code in question. unicode() accepts an arbitrary object only when no encoding argument is given; with one, it requires a string or buffer:

>>> e = Exception("failure")
>>> unicode(e)
u'failure'
>>> unicode(e, 'utf_8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: coercing to Unicode: need string or buffer, exceptions.Exception found

The documentation says: "If encoding and/or errors are given, unicode() will decode the object which can either be an 8-bit string or a character buffer using the codec for encoding... If no optional parameters are given, unicode() will mimic the behaviour of str()"

So calling unicode() this way is invalid; just drop the extra args. :)
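
For anyone reading this on Python 3: str() follows the same rule there, so the behaviour can still be reproduced. This is only an analogue of the Python 2 session above, not the portage code itself.

```python
e = Exception("failure")

# Without an encoding, str() simply asks the object for its text form.
plain = str(e)

# With an encoding, str() insists on a bytes-like object and rejects anything else.
decode_error = None
try:
    str(e, "utf_8")
except TypeError as exc:
    decode_error = exc

print(plain)
print(type(decode_error).__name__)
```

The encoding and errors arguments only make sense when there are raw bytes to decode, so passing them alongside an exception object fails before any decoding is attempted.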
Comment 9 mike@marineau.org 2015-04-30 02:14:42 UTC
Created attachment 402260
[PATCH] bintree.py: do not pass unicode encoding with non-string type
Comment 10 Zac Medico gentoo-dev 2015-05-04 03:17:16 UTC
(In reply to mike@marineau.org from comment #9)
> Created attachment 402260
> [PATCH] bintree.py: do not pass unicode encoding with non-string type

Maybe we should also add errors="replace" to the _unicode arguments.
Comment 11 Zac Medico gentoo-dev 2015-05-04 03:20:26 UTC
(In reply to Zac Medico from comment #10)
> Maybe we should also add errors="replace" to the _unicode arguments.

Actually, that probably raises TypeError. So, let's apply the patch as is.
Comment 12 Zac Medico gentoo-dev 2015-05-04 03:35:41 UTC
(In reply to Zac Medico from comment #11)
> (In reply to Zac Medico from comment #10)
> > Maybe we should also add errors="replace" to the _unicode arguments.
> 
> Actually, that probably raises TypeError. So, let's apply the patch as is.

I'm going to modify it to handle UnicodeDecodeError (as in comment #0).
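
One way such handling could look, as a Python 3 sketch under stated assumptions rather than the change that was committed: convert without an encoding first, and only if that raises UnicodeDecodeError, decode the offending bytes with errors="replace". `LocalizedError` and `error_text` are hypothetical names; the class only imitates Python 2's implicit ASCII decode so that the fallback path can be exercised.

```python
class LocalizedError(Exception):
    """Hypothetical stand-in for an error carrying a non-ASCII byte message."""

    def __str__(self):
        # Imitates Python 2, where converting such a message to text
        # implicitly decoded it as ASCII and could fail.
        return self.args[0].decode("ascii")


def error_text(exc, encoding="utf_8"):
    """Return a printable message for exc without raising on bad bytes."""
    try:
        # No encoding argument here: passing one with a non-string
        # object is what raised TypeError earlier in this thread.
        return str(exc)
    except UnicodeDecodeError as uerror:
        # uerror.object holds the bytes that could not be decoded.
        return str(uerror.object, encoding=encoding, errors="replace")


print(error_text(LocalizedError("Сеть недоступна".encode("utf-8"))))
print(error_text(ValueError("plain message")))
```

Taking the bytes from the UnicodeDecodeError itself avoids guessing where the exception stores its message, and errors="replace" means a wrong encoding guess degrades to replacement characters instead of a second traceback.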
Comment 14 Brian Dolbec (RETIRED) gentoo-dev 2015-05-19 19:50:06 UTC
Released in portage-2.2.19