Bug 604826

Summary: www-client/pybugz - bugz search -C Stabilization -C Keywording: UnicodeEncodeError: 'ascii' codec can't encode character '\u2008' in position 55: ordinal not in range(128)
Product: Gentoo Linux Reporter: Jeroen Roovers (RETIRED) <jer>
Component: Current packages Assignee: William Hubbs <williamh>
Status: CONFIRMED ---    
Severity: normal CC: toralf
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
Package list:
Runtime testing required: ---
Attachments: pybugz-unicode.txt

Description Jeroen Roovers (RETIRED) gentoo-dev 2017-01-06 12:29:09 UTC
Created attachment 458902 [details]

[ebuild   R    ] www-client/pybugz-0.13::gentoo  USE="-zsh-completion" PYTHON_TARGETS="python3_4 -python3_5" 0 KiB
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2017-01-06 12:45:08 UTC
While recreating the bug list using the web interface, I found the next bug on the list:

591972	=dev-libs/grok-0.9.2 stabilization request
591974	=dev-libs/librdkafka-0.9.1 stabilization request
591978	=net-libs/libnet-1.2_rc3-r1 stabilization request

But I don't see anything wrong with bug #591974 or bug #591978.

I also can't reproduce this on another system.
Comment 2 Arfrever Frehtes Taifersar Arahesis 2017-01-06 20:14:51 UTC
\u2008 is a whitespace character, potentially hard to see :) .


This character is currently used in summary of bug #591978.

Problem occurs when C locale is used:

$ python3.4 -c 'print("x\u2008x")'
x x
$ LC_ALL="C" python3.4 -c 'print("x\u2008x")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u2008' in position 1: ordinal not in range(128)

The solution is to manually encode the string to bytes (using UTF-8 encoding) and to write directly to the underlying binary stream (sys.stdout.buffer). Remember to add an explicit \n when calling write() on sys.stdout or sys.stdout.buffer, since write() does not append one the way print() does.

$ LC_ALL="C" python3.4 -c 'import sys; sys.stdout.buffer.write("x\u2008x\n".encode("UTF-8"))'
x x
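A minimal sketch of that approach as a reusable helper. The name print_utf8 is hypothetical and not part of pybugz; the ASCII-only wrapper at the end only simulates what stdout looks like under the C locale.

```python
import io
import sys


def print_utf8(text, stream=None):
    """Write text plus a newline as UTF-8 bytes, bypassing the stream's own encoding.

    Hypothetical helper for illustration; not an existing pybugz function.
    """
    if stream is None:
        stream = sys.stdout
    # Flush pending text first so output written via print() keeps its order.
    stream.flush()
    # write() does not append a newline the way print() does, so add it here.
    stream.buffer.write((text + "\n").encode("utf-8"))
    stream.buffer.flush()


# Simulate stdout under LC_ALL=C: a text stream limited to ASCII.
raw = io.BytesIO()
ascii_out = io.TextIOWrapper(raw, encoding="ascii")
print_utf8("x\u2008x", ascii_out)
```

Called without a stream argument, the helper writes to sys.stdout.buffer, which is the case described above.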
Comment 3 Toralf Förster gentoo-dev 2019-07-24 17:27:15 UTC
IMO this bug has appeared again in the last few days. Not on the stable Gentoo hardened system acting as the tinderbox, though, but in its unstable chroot images.

Within an image I get:

mr-fox ~ # bugz -q --columns 400 search --show-status ' undefined reference to'
Traceback (most recent call last):
  File "/usr/lib/python-exec/python3.6/bugz", line 11, in <module>
    load_entry_point('pybugz==0.13', 'console_scripts', 'bugz')()
  File "/usr/lib64/python3.6/site-packages/bugz/", line 702, in main
  File "/usr/lib64/python3.6/site-packages/bugz/", line 678, in search
    list_bugs(result, settings)
  File "/usr/lib64/python3.6/site-packages/bugz/", line 103, in list_bugs
UnicodeEncodeError: 'ascii' codec can't encode character '\u2026' in position 89: ordinal not in range(128)

mr-fox ~ # emerge -qpvO www-client/pybugz
 * waiting for lock on /var/db/.pkg.portage_lockfile ...                                                                                                      [ ok ]
[ebuild   R   ] www-client/pybugz-0.13  USE="-zsh-completion" PYTHON_TARGETS="python3_6 -python3_5 -python3_7" 

whereas the same query outside of the chroot works fine:

tinderbox@mr-fox ~ $ bugz -q --columns 400 search --show-status ' undefined reference to'
650304 CONFIRMED    toolchain            dev-libs/cloog-0.18.4 with >=dev-libs/isl-0.19: …/ undefined reference to `isl_basic_set_drop_constraint'

/me wonders how to deal with that?
Comment 4 Toralf Förster gentoo-dev 2019-07-24 18:43:22 UTC
(In reply to Toralf Förster from comment #3)
So FWIW I do have a reproducer here in these images:

tinderbox@mr-fox ~ $ grep -H "ordinal not in range" img/*/var/tmp/tb/issues/*/body | cut -f1 -d':' | sort -u | xargs ls -lt
-rw-rw-rw- 1 root root 369128 Jul 24 20:38 img/17.1_desktop-stable-20190722-130807/var/tmp/tb/issues/20190724-203818-app-i18n_librime-1.2.9/body
-rw-rw-rw- 1 root root 598320 Jul 24 08:37 img/17.1_no-multilib_hardened-20190722-003824/var/tmp/tb/issues/20190724-083634-dev-libs_cloog-0.18.4/body
-rw-rw-rw- 1 root root 488399 Jul 24 01:37 img/17.1_desktop_gnome-libressl-20190721-114445/var/tmp/tb/issues/20190724-013714-net-firewall_ipt_netflow-2.4/body
-rw-rw-rw- 1 root root 343113 Jul 23 22:33 img/17.1_desktop-stable-20190722-130807/var/tmp/tb/issues/20190723-223312-dev-util_lttng-tools-2.7.1/body
-rw-rw-rw- 1 root root 315987 Jul 23 22:33 img/17.1-stable_libressl_abi32+64-20190723-212825/var/tmp/tb/issues/20190723-223222-sys-libs_ncurses-6.1_p20181020/body
-rw-rw-rw- 1 root root 267670 Jul 23 15:22 img/17.1_no-multilib_hardened-20190722-003824/var/tmp/tb/issues/20190723-152148-x11-misc_unclutter-xfixes-1.5/body
-rw-rw-rw- 1 root root 503473 Jul 23 12:55 img/17.1_no-multilib_hardened-20190722-003824/var/tmp/tb/issues/20190723-125428-net-firewall_xtables-addons-3.3/body
-rw-rw-rw- 1 root root 271530 Jul 23 12:24 img/17.1_desktop_plasma-stable_libressl-20190723-112602/var/tmp/tb/issues/20190723-122346-dev-perl_Net-SSLeay-1.820.0/body
-rw-rw-rw- 1 root root 343837 Jul 23 02:59 img/17.1_desktop-stable-20190722-130807/var/tmp/tb/issues/20190723-025832-dev-util_lttng-tools-2.7.1/body
Comment 5 Toralf Förster gentoo-dev 2019-07-26 20:44:34 UTC
Well, maybe the problem here was that I did not run "eselect locale set en_US.utf8" in the affected images, but maybe bugz should set it?
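
For comparing a working system with an affected image, a small diagnostic sketch along these lines shows which codec Python picked for stdout and whether it can represent the offending character. The helper names codec_name and can_encode are made up for illustration.

```python
import codecs
import locale
import sys


def codec_name(encoding):
    """Return Python's canonical name for an encoding, e.g. 'ascii' or 'utf-8'."""
    return codecs.lookup(encoding).name


def can_encode(text, encoding):
    """Return True if text can be encoded with the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Report what the current process uses; all output here is plain ASCII,
# so the report itself cannot trigger the error.
stdout_encoding = getattr(sys.stdout, "encoding", None)
if stdout_encoding:
    name = codec_name(stdout_encoding)
    sys.stdout.write("stdout codec: %s\n" % name)
    sys.stdout.write("can encode U+2008: %s\n" % can_encode("\u2008", name))
sys.stdout.write("locale preferred encoding: %s\n" % locale.getpreferredencoding(False))
```

If the stdout codec is reported as ascii inside the image but utf-8 outside, the difference in behaviour comes from the locale settings rather than from the installed pybugz version.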
Comment 6 Arfrever Frehtes Taifersar Arahesis 2019-07-26 23:32:47 UTC
(In reply to Toralf Förster from comment #5)
> Well, maybe the problem here was that I did not run "eselect locale set
> en_US.utf8" in the affected images, but maybe bugz should set it?

Using a UTF-8 locale is a user-side workaround; the fix that can be implemented in pybugz itself is described in comment #2.
Calls to print() with potentially unsafe input should be replaced with calls to sys.stdout.buffer.write().
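
A rough before/after sketch of such a replacement in a listing loop. The record layout and the function names list_bugs_print and list_bugs_bytes are assumptions for illustration and do not reflect pybugz's actual code; the sample summaries are invented, and the ASCII-only stream stands in for stdout under the C locale.

```python
import io

# Invented sample records; the field names are assumptions, not pybugz's data model.
bugs = [
    {"id": 1, "summary": "plain ASCII summary"},
    {"id": 2, "summary": "summary with\u2008punctuation space"},
]


def list_bugs_print(bugs, out):
    """Before: print() encodes with the stream's encoding and may fail."""
    for bug in bugs:
        print("%d %s" % (bug["id"], bug["summary"]), file=out)


def list_bugs_bytes(bugs, out):
    """After: encode to UTF-8 explicitly and write to the binary stream."""
    for bug in bugs:
        line = "%d %s\n" % (bug["id"], bug["summary"])
        out.buffer.write(line.encode("utf-8"))
    out.buffer.flush()


def ascii_stream():
    """Text stream limited to ASCII, similar to stdout under LC_ALL=C."""
    return io.TextIOWrapper(io.BytesIO(), encoding="ascii")


out = ascii_stream()
try:
    list_bugs_print(bugs, out)
    print_failed = False
except UnicodeEncodeError:
    print_failed = True

out = ascii_stream()
list_bugs_bytes(bugs, out)
written = out.buffer.getvalue()
```

The print() variant stops at the second record with UnicodeEncodeError, while the bytes variant writes both lines regardless of the stream's encoding.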