Bug 705122

Summary:	app-text/asciidoc-8.6.10_p20181016 when compiling net-firewall/nftables-0.9.3-r1::gentoo -- UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 5796: ordinal not in range(128)
Product:	Gentoo Linux	Reporter:	Gary E. Miller <gem>
Component:	Current packages	Assignee:	No maintainer - Look at https://wiki.gentoo.org/wiki/Project:Proxy_Maintainers if you want to take care of it <maintainer-needed>
Status:	UNCONFIRMED ---
Severity:	normal	CC:	base-system, gem, jstein, Klaus+gentoo, klondike, prometheanfire, proxy-maint, reagentoo, shimarin, tvorup
Priority:	Normal
Version:	unspecified
Hardware:	All
OS:	Linux
See Also:	https://github.com/asciidoc/asciidoc-py3/issues/92
Whiteboard:
Package list:		Runtime testing required:	---
Attachments:	build log

Description Gary E. Miller 2020-01-10 18:57:43 UTC

a2x -L --doctype manpage --format manpage -D . nft.txt
Traceback (most recent call last):
  File "/usr/bin/a2x", line 931, in <module>
    source_options = get_source_options(sys.argv[-1])
  File "/usr/bin/a2x", line 337, in get_source_options
    for line in f:
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 5796: ordinal not in range(128)



Reproducible: Always

Steps to Reproduce:
1. emerge nftables
Actual Results:  
Making all in doc
make[2]: Entering directory '/var/tmp/portage/net-firewall/nftables-0.9.3-r1/work/nftables-0.9.3/doc'
a2x -L --doctype manpage --format manpage -D . nft.txt
Traceback (most recent call last):
  File "/usr/bin/a2x", line 931, in <module>
    source_options = get_source_options(sys.argv[-1])
  File "/usr/bin/a2x", line 337, in get_source_options
    for line in f:
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 5796: ordinal not in range(128)
make[2]: *** [Makefile:643: nft.8] Error 1


Expected Results:  
A clean emerge

spidey ~ # emerge --info '=net-firewall/nftables-0.9.3-r1::gentoo'
Portage 2.3.79 (python 2.7.17-final-0, default/linux/amd64/17.1, gcc-9.2.0, glibc-2.29-r7, 5.4.6-gentoo x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-5.4.6-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_E5-1620_v3_@_3.50GHz-with-gentoo-2.6
KiB Mem:    16346000 total,  13429748 free
KiB Swap:   33554428 total,  33554428 free
Timestamp of repository gentoo: Fri, 10 Jan 2020 18:00:02 +0000
Head commit of repository gentoo: eb8ad6cd6910158aaaf51bb1adf3c9a3e27effdc
Head commit of repository brother-overlay: 47f14e766fbbd0cf283c733b1501788eab5deba4

sh bash 4.4_p23-r1
ld GNU ld (Gentoo 2.32 p2) 2.32.0
distcc[26207] (dcc_trace_version) distcc 3.3.3 x86_64-pc-linux-gnu; built Dec 30 2019 14:04:55 [disabled]
ccache version 3.7.6 [disabled]
app-shells/bash:          4.4_p23-r1::gentoo
dev-java/java-config:     2.2.0-r4::gentoo
dev-lang/perl:            5.30.1::gentoo
dev-lang/python:          2.7.17::gentoo, 3.5.7::gentoo, 3.6.9::gentoo
dev-util/ccache:          3.7.6::gentoo
dev-util/cmake:           3.14.6::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.6-r1::gentoo
sys-apps/openrc:          0.42.1::gentoo
sys-apps/sandbox:         2.13::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r4::gentoo
sys-devel/automake:       1.11.6-r3::gentoo, 1.13.4-r2::gentoo, 1.14.1::gentoo, 1.15.1-r2::gentoo, 1.16.1-r1::gentoo
sys-devel/binutils:       2.32-r1::gentoo
sys-devel/gcc:            9.2.0-r3::gentoo
sys-devel/gcc-config:     2.1::gentoo
sys-devel/libtool:        2.4.6-r3::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 5.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.29-r7::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://backup.rellim.com/gentoo-portage
    priority: -1000
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-metamanifest: yes
    sync-rsync-extra-opts: --exclude ChangeLog* --delete-excluded
    sync-rsync-verify-max-age: 24

brother-overlay
    location: /usr/local/overlay/brother-overlay
    sync-type: git
    sync-uri: https://github.com/stefan-langenmaier/brother-overlay.git
    masters: gentoo

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -march=haswell"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/gnupg/qualified.txt /usr/share/themes/oxygen-gtk/gtk-2.0 /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php7.2/ext-active/ /etc/php/apache2-php7.3/ext-active/ /etc/php/apache2-php7.4/ext-active/ /etc/php/cgi-php7.2/ext-active/ /etc/php/cgi-php7.3/ext-active/ /etc/php/cgi-php7.4/ext-active/ /etc/php/cli-php7.2/ext-active/ /etc/php/cli-php7.3/ext-active/ /etc/php/cli-php7.4/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -pipe -march=haswell"
DISTDIR="/var/cache/distfiles"
EMERGE_DEFAULT_OPTS="--keep-going --with-bdeps=y --backtrack=5"
ENV_UNSET="DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs cgroup config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned sfperms splitdebug unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://gentoo.mirrors.hoobly.com/ rsync://gentoo.gossamerhost.com/gentoo-distfiles/ ftp://mirror.datapipe.net/gentoo http://mirror.csclub.uwaterloo.ca/gentoo-distfiles/ ftp://mirror.csclub.uwaterloo.ca/gentoo-distfiles/"
LANG="en_US.utf8"
LC_ALL="C"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-O1,--hash-style=gnu,--enable-new-dtags"
LINGUAS="en en_US"
MAKEOPTS="-j1 -l3"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_EXTRA_OPTS="--exclude ChangeLog* --delete-excluded"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp/"
USE="X a52 aac aacplus aacs aalib acl adns aesicm aften aio alsa amd64 amr ao apache2 archive ares aspell avahi avresample avx bash-completion bcmath berkdb binfilter bluray bzip2 cacert cairo calendar caps cdda cgi cgroup cli community corefonts cron crypt cscope cups curl cxx daap dane dbi dbus dirac dlna dri drm dvbcsa dvben50221 dvd ecwj2k egl enca encode examples exif exit extraengine faac faad ffmpeg fftw firmware flac fontconfig fontforge foomaticdb fortran fping fpm fpx gbm gcrypt gd gdbm geoip geos gif gimp git glamor glib gml gmp gmplayer gnome-keyring gnuplot gnutls gpg gphoto2 gps gs gsm gtk gtk3 gui hardened harfbuzz hddtemp hdf hdf5 hdri highlight http2 hwdb iconv icu id3tag ilbc imagemagick imap infinality inifile inotify introspection ipmi iproute2 ipv6 jack java jbig jce jemalloc jpeg json lcms libass libedit libextractor liblockfile libnotify libsamplerate libsecret libsoxr libtirpc lmdb lto lxc lz4 lzma lzo mad managesieve mbox memcached mhash mjpeg mmap mmx mmxext mng mp3 mp4 mpeg mtp multilib mysql ncurses netcdf netlink network nfs nfsv4 nfsv41 nodrm nptl nsplugin offensive ogdi ogg openal openexr opengl openmp openssl openvg opus osmesa pam pango pcap pch pcntl pcre pdf pdfimport pdo perl pgo plotutils plugins png policykit postproc ppds printsupport pth pulseaudio python q32 q8 qt4 qt5 rar raw readline realtime rle romio rpc rpz rrl rtmp scanner schroedinger scrypt sctp seccomp server sharedmem slp smi smime smp sndfile snmp sockets spamassassin speex spell sphinx split-usr sqlite sqlite3 sse sse2 sse3 sse4_1 ssh ssl ssse3 stats subtitles svg syslog szip t1lib tcpd theora thin-splines threads threadsafe thumbnail tiff tk tokudb tools truetype twolame udev unbound unicode upcall update_drivedb urandom usb utils v4l2 vaapi vhosts vim-syntax vlc vorbis vorbix vpx wavpack webkit webkit2 webp widgets wikipedia winbind wmf x264 xattr xcb xface xine xinerama xml xmp xorg xpm xslt xvfb xvid xvmc xz zeroconf zip zlib zstd" ABI_X86="64" 
ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_core authn_dbm authn_file authz_core authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif socache_shmcb speling status unique_id unixd userdir usertrack vhost_alias http2 proxy proxy_fcgi" APACHE2_MPMS="worker" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse" KERNEL="linux" L10N="en en-US" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" NETBEANS_MODULES="apisupport cnd groovy gsf harness ide identity j2ee java mobility nb php profiler soa visualweb webcommon websvccommon xml" OFFICE_IMPLEMENTATION="libreoffice" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_6" PYTHON_TARGETS="python2_7 python3_5 python3_6" RUBY_TARGETS="ruby24 ruby25" USERLAND="GNU" VIDEO_CARDS="vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, INSTALL_MASK, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS

spidey ~ # emerge -pqv '=net-firewall/nftables-0.9.3-r1::gentoo'
[ebuild     U ] net-firewall/nftables-0.9.3-r1 [0.9.0-r5] USE="doc* gmp json modern-kernel python%* readline -debug -static-libs% -xtables%" PYTHON_TARGETS="python3_6%* (-python3_7)" 
spidey ~ # 

Typical Python3 problem.  Builds fine with USE=-doc

Comment 1 Gary E. Miller 2020-01-10 18:58:41 UTC

Created attachment 602928
build log

Comment 2 Francisco Blas Izquierdo Riera (RETIRED) gentoo-dev 2020-01-15 17:23:05 UTC

*** Bug 705158 has been marked as a duplicate of this bug. ***

Comment 3 Francisco Blas Izquierdo Riera (RETIRED) gentoo-dev 2020-01-15 17:23:58 UTC

I could reproduce this error after upgrading to the new asciidoc version, which uses python3.6 instead of python2.7.

Comment 4 Francisco Blas Izquierdo Riera (RETIRED) gentoo-dev 2020-01-15 17:38:50 UTC

This is a bug in app-text/asciidoc, combined with a second bug introduced when python2.7 was removed (and therefore stopped being the default) on 5 January by commit cd3f25deb13cf4d6c9d721d515dbf772a988426f.

Replacing line 336 and the lines following it in /usr/bin/a2x with the following, correct Python code solves the issue. I'm reassigning this to the asciidoc maintainers, as we can't just remove the © from the bottom of the document.

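        # Read bytes so the locale's default text encoding is never consulted.
        with open(asciidoc_file, 'rb') as f:
            for line in f:
                # Raw bytes pattern: avoids the invalid "\s" escape in a plain b'' literal.
                mo = re.search(rb'^//\s*a2x:', line)
                if mo:
                    options += ' ' + line[mo.end():].strip().decode('ascii')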
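
For anyone who wants to check this without a full emerge, here is a minimal, self-contained sketch of the failure and of the binary-mode workaround. It is not the actual a2x source: the function names, the sample file contents and the "--no-xmllint" option line are made up for illustration, and encoding='ascii' is passed explicitly as a stand-in for the decoder that the traceback shows being used.

```python
import os
import re
import tempfile


def write_sample():
    """Write a small UTF-8 document (hypothetical content) and return its path."""
    text = '// a2x: --no-xmllint\nSome text\nCopyright \u00a9 2008\n'
    fd, path = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'wb') as f:
        f.write(text.encode('utf-8'))  # the copyright sign becomes bytes 0xc2 0xa9
    return path


def options_text_ascii(path):
    """Mimic the failing code path: text mode with an ASCII decoder."""
    options = ''
    with open(path, encoding='ascii') as f:
        for line in f:  # raises UnicodeDecodeError because of byte 0xc2
            mo = re.search(r'^//\s*a2x:', line)
            if mo:
                options += ' ' + line[mo.end():].strip()
    return options.strip()


def options_binary(path):
    """Workaround: read bytes and decode only the matched option text."""
    options = ''
    with open(path, 'rb') as f:
        for line in f:
            mo = re.search(rb'^//\s*a2x:', line)
            if mo:
                options += ' ' + line[mo.end():].strip().decode('ascii')
    return options.strip()


if __name__ == '__main__':
    sample = write_sample()
    try:
        try:
            options_text_ascii(sample)
        except UnicodeDecodeError as exc:
            print('text mode failed:', exc)
        print('binary mode found:', options_binary(sample))
    finally:
        os.remove(sample)
```

The binary variant only decodes the text after a "// a2x:" marker, so non-ASCII characters elsewhere in the document (such as the ©) never reach a decoder.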

Comment 5 Marc Joliet 2020-01-18 19:41:07 UTC

(In reply to Francisco Blas Izquierdo Riera from comment #4)
> This is a bug in app-text/asciidoc, combined with a second bug introduced
> when python2.7 was removed (and therefore stopped being the default) on
> 5 January by commit cd3f25deb13cf4d6c9d721d515dbf772a988426f.
> 
> Replacing line 336 and the lines following it in /usr/bin/a2x with the
> following, correct Python code solves the issue. I'm reassigning this to the
> asciidoc maintainers, as we can't just remove the © from the bottom of the
> document.
> 
>         with open(asciidoc_file, 'rb') as f:
>             for line in f:
>                 mo = re.search(rb'^//\s*a2x:', line)
>                 if mo:
>                     options += ' ' + line[mo.end():].strip().decode('ascii')

Sorry for the delay, I was sick for the last three days.  If you could provide me with a proper unified diff for your POC with version information (latest git? the version in portage?), then I can use that in my upstream bug report.

FWIW, I have not been able to reproduce this (by trying the command line by itself on the downloaded "nft.txt" or by installing via portage), but then I also don't set LC_ALL="C".  Is that actually supported?  My understanding was that using LC_ALL at all is discouraged (ah, yes, I see that the localization guide at https://wiki.gentoo.org/wiki/Localization/Guide says so).

Having now glanced at the documentation for "open()" in Python, I tend to agree with you that Asciidoc should not be (indirectly) relying on the locale being correct. It should either pass an encoding to open() (after all, Asciidoc assumes UTF-8 documents unless overridden by an ":encoding:" attribute in the document header, see http://asciidoc.org/userguide.html#_gotchas) or (as you suggested) open the file as binary.
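
To make the first alternative concrete, a rough sketch of passing an encoding to open() could look like the following. It is only an illustration, not a patch against a2x: the function name, the "encoding" parameter with its 'utf-8' default, and the sample text are assumptions, and it makes no attempt to honour an ":encoding:" header attribute.

```python
import os
import re
import tempfile


def get_options_with_encoding(path, encoding='utf-8'):
    """Collect '// a2x:' options, decoding with an explicit encoding
    rather than whatever the current locale implies."""
    options = ''
    with open(path, encoding=encoding) as f:
        for line in f:
            mo = re.search(r'^//\s*a2x:', line)
            if mo:
                options += ' ' + line[mo.end():].strip()
    return options.strip()


def write_sample():
    """Write a hypothetical UTF-8 document with two option lines and a (c) sign."""
    text = '// a2x: --no-xmllint\n// a2x: --verbose\nCopyright \u00a9 2008\n'
    fd, path = tempfile.mkstemp(suffix='.txt')
    with os.fdopen(fd, 'wb') as f:
        f.write(text.encode('utf-8'))
    return path


if __name__ == '__main__':
    sample = write_sample()
    try:
        print(get_options_with_encoding(sample))
    finally:
        os.remove(sample)
```

Unlike the binary variant, this decodes every line, so a document that is not valid UTF-8 would still fail unless the matching encoding is passed in.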

So while I consider it an Asciidoc bug, the LC_ALL="C" irritates me somewhat (if you don't know what I mean, look at the emerge --info output from OP).  What do you think?

Comment 6 Marc Joliet 2020-01-18 20:13:51 UTC

Alright, I opened a bug upstream, see https://github.com/asciidoc/asciidoc-py3/issues/92.