Bug 460288 - sys-apps/portage: UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 7: surrogates not allowed
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Interface (emerge)
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL: http://dpaste.com/1011727/
Whiteboard:
Keywords:
Depends on:
Blocks: 472632
Reported: 2013-03-04 17:17 UTC by aves
Modified: 2020-04-18 20:28 UTC

Description aves 2013-03-04 17:17:28 UTC
This probably also affects older versions, but I just noticed that when I emerge vanilla-sources-3.8.1 with the deblob USE flag, some blobs fail to be removed.

LIBERTAS_USB - Marvell Libertas 8388 USB 802.11b/g cards
drivers/net/wireless/libertas/if_usb.c: removed blobs
Traceback (most recent call last):
  File "/var/tmp/portage/sys-kernel/vanilla-sources-3.8.1/temp/deblob-check-script-bwGgd3", line 48, in <module>
    for line in sys.stdin:
  File "/usr/lib64/python3.2/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 169: ordinal not in range(128)
ERROR: failed removing blobs from drivers/net/wireless/libertas/README

Here is the full list of drivers that fail that way:
ATM_SOLOS, DVB_USB_AF9035, VIDEO_BT848, B43, WLAGS49_H2, LIBERTAS_USB, USB_EMI26, USB_EMI62.
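
The UnicodeDecodeError above is what Python 3 raises when a script iterates over sys.stdin in text mode while the locale encoding is ASCII and the input contains a non-ASCII byte (0xc2 here). Below is a minimal sketch of one way a script could avoid depending on the locale, assuming the input is mostly UTF-8 and that replacing undecodable bytes is acceptable. iter_text_lines is a hypothetical helper for illustration, not code from the deblob script:

```python
import io


def iter_text_lines(binary_stream, encoding="utf-8"):
    """Yield text lines from a binary stream, independent of the locale.

    Hypothetical helper for illustration. Undecodable bytes become U+FFFD
    instead of raising UnicodeDecodeError.
    """
    text = io.TextIOWrapper(binary_stream, encoding=encoding, errors="replace")
    for line in text:
        yield line


# Stand-in for sys.stdin.buffer: the first line contains the bytes 0xc2 0xa9.
sample = io.BytesIO(b"Copyright \xc2\xa9 Example\nplain ASCII line\n")
lines = list(iter_text_lines(sample))
print(lines)

# The same bytes cannot be decoded with the ASCII codec.
try:
    b"Copyright \xc2\xa9".decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True
print(ascii_failed)
```

In a real script the stream would be sys.stdin.buffer rather than an in-memory sample.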

I don't know what causes this, but I think it means that these proprietary drivers can be used without the user knowing that they are non-free.

Someone suggested that I `export LANG=C`, but that didn't solve the problem, and Portage failed in a curious way the first time I tried to emerge vanilla-sources:

Jelly ~ # emerge vanilla-sources
Traceback (most recent call last):
  File "/usr/bin/emerge", line 51, in <module>
    retval = emerge_main()
  File "/usr/lib64/portage/pym/_emerge/main.py", line 1044, in emerge_main
    gc_locals=locals().clear)
  File "/usr/lib64/portage/pym/_emerge/actions.py", line 3295, in run_action
    ext = os.path.splitext(x)[1]
  File "/usr/lib64/portage/pym/portage/__init__.py", line 244, in __call__
    wrapped_args, wrapped_kwargs = self._process_args(args, kwargs)
  File "/usr/lib64/portage/pym/portage/__init__.py", line 231, in _process_args
    for x in args]
  File "/usr/lib64/portage/pym/portage/__init__.py", line 231, in <listcomp>
    for x in args]
  File "/usr/lib64/portage/pym/portage/__init__.py", line 179, in _unicode_encode
    s = s.encode(encoding, errors)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc3' in position 7: surrogates not allowed

Reproducible: Always




Jelly ~ # emerge --info vanilla-sources
Portage 2.3.5-r7 (funtoo/1.0/linux-gnu/arch/x86-64bit, gcc-4.6.3, glibc-2.15-r3, 3.8.1-gnu x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-3.8.1-gnu-x86_64-AMD_FX-tm-6100_Six-Core_Processor-with-gentoo-2.2.0
KiB Mem:     4045576 total,   3574780 free
KiB Swap:    8388604 total,   8388604 free
Timestamp of tree: Mon, 04 Mar 2013 06:45:01 +0000
ld GNU ld (GNU Binutils) 2.22
app-shells/bash:          4.2_p37
dev-lang/python:          2.7.3-r1000, 3.2.3-r1000
dev-util/cmake:           2.8.10.2-r1
sys-apps/baselayout:      2.2.0-r4
sys-apps/openrc:          0.10.2-r7
sys-apps/sandbox:         2.6
sys-devel/autoconf:       2.13, 2.69
sys-devel/automake:       1.9.6-r3, 1.11.6
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.6.3
sys-devel/gcc-config:     1.5-r1
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r4
sys-kernel/linux-headers: 3.4-r2 (virtual/os-headers)
sys-libs/glibc:           2.15-r3
Repositories: gentoo Jelly
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O3 -pipe -march=bdver1 -mno-movbe -mno-fma -mno-bmi -mno-tbm --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-O3 -pipe -march=bdver1 -mno-movbe -mno-fma -mno-bmi -mno-tbm --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1"
DISTDIR="/usr/portage/distfiles"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="C"
LDFLAGS="-Wl,-O1 -Wl,--sort-common -Wl,--as-needed"
MAKEOPTS="-j7"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="git://github.com/funtoo/ports-2012.git"
SYNC_USER="root"
USE="X acl alsa amd64 apng berkdb bzip2 cdr cracklib crypt cups cxx dbus dri dvd dvdr dvdread flac gdbm gif gpm iconv icu ipv6 jpeg lame mad mmx modules mp3 mpeg mudflap multilib ncurses nls nptl ogg opengl openmp pam pcre png pppd python readline resolvconf sse sse2 ssl tcpd tiff truetype udev unicode vorbis wavpack xml zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias authn_core authz_core socache_shmcb unixd" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" PHP_TARGETS="php5-3" PYTHON_ABIS="2.7 3.2" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_2" QEMU_SOFTMMU_TARGETS="i386 x86_64" QEMU_USER_TARGETS="i386 x86_64" RUBY_TARGETS="ruby18 ruby19" USERLAND="GNU" 
VIDEO_CARDS="nouveau" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, SYNC_UMASK

=================================================================
                        Package Settings
=================================================================

sys-kernel/vanilla-sources-3.8.0 was built with the following:
USE="deblob (multilib) symlink -build"
CFLAGS="-O2 -pipe -march=bdver1 -mno-movbe -mno-fma -mno-bmi -mno-tbm --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1"
CXXFLAGS="-O2 -pipe -march=bdver1 -mno-movbe -mno-fma -mno-bmi -mno-tbm --param l1-cache-size=16 --param l1-cache-line-size=64 --param l2-cache-size=2048 -mtune=bdver1"


sys-kernel/vanilla-sources-3.8.1 was built with the following:
USE="deblob (multilib) symlink -build"
Comment 1 Zac Medico gentoo-dev 2013-03-04 18:28:57 UTC
The deblob-check-script is a separate program that is not part of sys-apps/portage, so it should be treated as an independent issue (see bug 458032).

The issue with portage looks like this:

http://stackoverflow.com/questions/11735363/python3-unicodeencodeerror-only-when-run-from-crontab

I'm not sure if there is a solution, other than using a UTF-8 locale. Please post the output of the following 2 commands:

locale
env LANG=C locale
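
Alongside the two locale commands, it can help to see which encodings Python itself derives from the environment. The following is only a sketch. encoding_report and normalized are hypothetical names, and newer Python versions may report utf-8 even under LANG=C because of their C locale handling:

```python
import codecs
import locale
import sys


def normalized(name):
    """Return Python's canonical codec name, or the raw name if unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return name


def encoding_report():
    """Collect the encodings Python derives from the current environment."""
    return {
        "preferred": normalized(locale.getpreferredencoding(False)),
        "filesystem": normalized(sys.getfilesystemencoding()),
        "stdin": normalized(getattr(sys.stdin, "encoding", None) or "unknown"),
    }


report = encoding_report()
for key, value in sorted(report.items()):
    print(key, value)
```

Running it once normally and once with LANG=C in the environment shows whether the interpreter falls back to ASCII.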
Comment 2 aves 2013-03-04 18:43:12 UTC
Jelly ~ # locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=POSIX
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Jelly ~ # env LANG=C locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=POSIX
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=

The bug occurred just once. Now it works even when LANG=C.
Comment 3 Zac Medico gentoo-dev 2013-03-04 18:56:05 UTC
(In reply to comment #2)
> The bug occurred just once. Now it works even when LANG=C.

It may be dependent on the state of the terminal used for input. It's strange that pure ascii input like "vanilla-sources" could somehow be translated into that \udcc3 character. Please re-open if you are able to consistently reproduce the issue.
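
For reference, this sketch shows one way a lone \udcc3 can arise: when bytes are decoded with the surrogateescape error handler, an undecodable byte 0xc3 is mapped to U+DCC3, and a later strict UTF-8 encode rejects it. The input below is made up for illustration; nothing in this report establishes where a stray byte actually came from, and decode_like_argv is a hypothetical name:

```python
def decode_like_argv(raw, encoding="ascii"):
    """Decode bytes with surrogateescape, as Python 3 does for arguments."""
    return raw.decode(encoding, "surrogateescape")


# Hypothetical input: seven ASCII characters followed by a stray 0xc3 byte.
arg = decode_like_argv(b"vanilla\xc3")
print(ascii(arg))

try:
    arg.encode("utf-8")  # strict error handler, as in the traceback
    error = None
except UnicodeEncodeError as exc:
    error = exc
print(error)
```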
Comment 4 Sebastian Pipping gentoo-dev 2020-02-27 22:09:48 UTC
For the record, I ran into this issue again today.  In my case, the hostname of the machine contained weird data:

# hostname | od -x
0000000 a170 1f69 7f40 000a
0000007

Once I set a healthy hostname, the problem went away.
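
The od -x dump above can be turned back into the raw hostname bytes to see why they are a problem. od -x prints two-byte words in host byte order, so this sketch assumes a little-endian machine (the comment does not state the architecture). od_x_to_bytes is a hypothetical helper, and the length 7 is taken from the final offset line of the dump:

```python
import struct


def od_x_to_bytes(words, length):
    """Rebuild raw bytes from `od -x` words, assuming little-endian order."""
    raw = b"".join(struct.pack("<H", int(word, 16)) for word in words.split())
    return raw[:length]  # drop the padding byte od adds for odd lengths


raw = od_x_to_bytes("a170 1f69 7f40 000a", 7)
print(raw)  # b'p\xa1i\x1f@\x7f\n'

# Byte 0xa1 cannot start a UTF-8 sequence, so a strict decode fails.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
print(strict_ok)

# With surrogateescape the byte survives as the lone surrogate U+DCA1.
name = raw.decode("utf-8", "surrogateescape")
print(ascii(name))
```

Under these assumptions the hostname contains the byte 0xa1 plus the control characters 0x1f and 0x7f, which is consistent with it producing a surrogate when decoded.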
Comment 5 Zac Medico gentoo-dev 2020-02-27 23:46:54 UTC
Looks like this could happen for a string that Python decodes using the surrogateescape error handler (such as program arguments). We can re-encode the string as bytes and then attempt to decode it again as UTF-8.
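
A minimal sketch of that re-encode and decode idea, not the change that was actually made in Portage. redecode is a hypothetical helper, and the choice to replace bytes that are still invalid with U+FFFD is an assumption:

```python
def redecode(text, source_encoding="utf-8"):
    """Undo surrogateescape, then decode the original bytes as UTF-8.

    Hypothetical helper for illustration. Bytes that are not valid UTF-8
    are replaced with U+FFFD so the result can always be encoded again.
    """
    raw = text.encode(source_encoding, "surrogateescape")
    return raw.decode("utf-8", "replace")


# Two escaped bytes that together form a valid UTF-8 sequence (0xc3 0xa9).
fixed = redecode("caf\udcc3\udca9")
print(ascii(fixed))

# A lone escaped byte is not valid UTF-8, so it becomes U+FFFD.
lossy = redecode("caf\udcc3")
print(ascii(lossy))

# Either result can now be encoded with the strict handler.
print(fixed.encode("utf-8"), lossy.encode("utf-8"))
```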
Comment 6 Vladimir Varlamov 2020-04-17 12:13:33 UTC
portage-2.3.99-r1

# hostname | od -x
0000000 9be1 0a9c
0000004

same error for dev-lang/perl
Comment 7 Zac Medico gentoo-dev 2020-04-18 20:13:28 UTC
(In reply to Sebastian Pipping from comment #4)
> For the record, I ran into this issue again today.  In my case, the hostname
> of the machine contained weird data:
> 
> # hostname | od -x
> 0000000 a170 1f69 7f40 000a
> 0000007
> 
> Once I set a healthy hostname, the problem went away.

(In reply to Vladimir Varlamov from comment #6)
> portage-2.3.99-r1
> 
> # hostname | od -x
> 0000000 9be1 0a9c
> 0000004
> 
> same error for dev-lang/perl

Please post a traceback so that I can see the source of the problem.
Comment 8 Zac Medico gentoo-dev 2020-04-18 20:22:48 UTC
(In reply to Zac Medico from comment #7)
> Please post a traceback so that I can see the source of the problem.

Actually, please open a new bug.

The issue that triggered the traceback shown in comment #0 would have been fixed by this commit:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=a636c88eb998c562bfa8310862caa36315335aae

commit a636c88eb998c562bfa8310862caa36315335aae
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2013-06-20 03:11:37 -0700
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2013-06-20 03:11:37 -0700

    Decode sys.argv with surrogateescape for Python 3
    
    With Python 3, the surrogateescape encoding error handler makes it
    possible to access the original argv bytes, which can be useful
    if their actual encoding does no match the filesystem encoding.

 bin/portageq            |  3 +--
 bin/repoman             |  3 +--
 pym/_emerge/main.py     |  6 ++----
 pym/portage/__init__.py | 13 +++++++++++++
 4 files changed, 17 insertions(+), 8 deletions(-)
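
As a rough illustration of what the commit message describes, the sketch below recovers the original argument bytes from surrogate-escaped strings using os.fsencode. It is not the code from the commit; original_argv_bytes is a hypothetical name, and a POSIX system is assumed:

```python
import os
import sys


def original_argv_bytes(argv=None):
    """Return command-line arguments as the bytes the OS passed in."""
    if argv is None:
        argv = sys.argv
    # os.fsencode reverses the filesystem-encoding decode applied to argv.
    return [os.fsencode(arg) for arg in argv]


# Simulate an argument that arrived with a stray 0xc3 byte (POSIX assumed).
simulated = [os.fsdecode(b"emerge"), os.fsdecode(b"caf\xc3")]
recovered = original_argv_bytes(simulated)
print(recovered)
```

With the bytes in hand, a program can try a different encoding than the filesystem encoding instead of failing on the escaped characters.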