Bug 790737 - dev-libs/libxml2-2.9.12 with dev-python/lxml: mangles XML files
Summary: dev-libs/libxml2-2.9.12 with dev-python/lxml: mangles XML files
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Stabilization
Hardware: All Linux
Importance: Normal normal with 1 vote
Assignee: Sam James
URL: https://bugs.launchpad.net/lxml/+bug/...
Whiteboard:
Keywords: CC-ARCHES, PullRequest
Duplicates: 791127 791139 792159 792492
Depends on:
Blocks: CVE-2021-3516, CVE-2021-3517, CVE-2021-3518, CVE-2021-3537, CVE-2021-3541
Reported: 2021-05-18 05:37 UTC by Milan Beneš
Modified: 2021-07-02 13:41 UTC
CC List: 14 users

See Also:
Package list:
dev-libs/libxml2-2.9.12-r2 dev-python/lxml-4.6.3-r1
Runtime testing required: ---
nattka: sanity-check+


Description Milan Beneš 2021-05-18 05:37:42 UTC
After upgrading to libxml2-2.9.12, ansible emits the following error:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 1740, column 1
fatal: [lucifer]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n  File \"/home/saruman/.ansible/tmp/ansible-local-33645134corbh0j/ansible-tmp-1621313947.440982-3364892-134798764758155/AnsiballZ_netconf_config.py\", line 102, in <module>\n    _ansiballz_main()\n  File \"/home/saruman/.ansible/tmp/ansible-local-33645134corbh0j/ansible-tmp-1621313947.440982-3364892-134798764758155/AnsiballZ_netconf_config.py\", line 94, in _ansiballz_main\n    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n  File \"/home/saruman/.ansible/tmp/ansible-local-33645134corbh0j/ansible-tmp-1621313947.440982-3364892-134798764758155/AnsiballZ_netconf_config.py\", line 40, in invoke_module\n    runpy.run_module(mod_name='ansible_collections.ansible.netcommon.plugins.modules.netconf_config', init_globals=None, run_name='__main__', alter_sys=True)\n  File \"/usr/lib/python3.9/runpy.py\", line 210, in run_module\n    return _run_module_code(code, init_globals, run_name, mod_spec)\n  File \"/usr/lib/python3.9/runpy.py\", line 97, in _run_module_code\n    _run_code(code, mod_globals, init_globals,\n  File \"/usr/lib/python3.9/runpy.py\", line 87, in _run_code\n    exec(code, run_globals)\n  File \"/tmp/ansible_netconf_config_payload_bcmjlwgk/ansible_netconf_config_payload.zip/ansible_collections/ansible/netcommon/plugins/modules/netconf_config.py\", line 765, in <module>\n  File \"/tmp/ansible_netconf_config_payload_bcmjlwgk/ansible_netconf_config_payload.zip/ansible_collections/ansible/netcommon/plugins/modules/netconf_config.py\", line 739, in main\n  File \"/tmp/ansible_netconf_config_payload_bcmjlwgk/ansible_netconf_config_payload.zip/ansible_collections/ansible/netcommon/plugins/module_utils/network/netconf/netconf.py\", line 140, in sanitize_xml\n  File \"src/lxml/etree.pyx\", line 3237, in lxml.etree.fromstring\n  File \"src/lxml/parser.pxi\", line 1896, in lxml.etree._parseMemoryDocument\n  File \"src/lxml/parser.pxi\", line 1784, in 
lxml.etree._parseDoc\n  File \"src/lxml/parser.pxi\", line 1141, in lxml.etree._BaseParser._parseDoc\n  File \"src/lxml/parser.pxi\", line 615, in lxml.etree._ParserContext._handleParseResultDoc\n  File \"src/lxml/parser.pxi\", line 725, in lxml.etree._handleParseResult\n  File \"src/lxml/parser.pxi\", line 654, in lxml.etree._raiseParseError\n  File \"<string>\", line 1740\nlxml.etree.XMLSyntaxError: Extra content at the end of the document, line 1740, column 1\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

dev-libs/libxml2-2.9.10-r5 works as expected
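
To tell which libxml2 release lxml is actually running against, something like the following sketch may help. It assumes lxml is importable and relies on `lxml.etree.LIBXML_VERSION`; the helper names `may_be_affected` and `describe` are hypothetical and exist only for this illustration. The runtime version tuple cannot distinguish a patched 2.9.12 build from an unpatched one, so a match is only a hint.

```python
# Sketch only: flags the libxml2 release named in this report.
# A patched distribution build of 2.9.12 reports the same tuple,
# so a match means "possibly affected", not "definitely affected".
AFFECTED = (2, 9, 12)


def may_be_affected(runtime_version):
    """Return True if the version tuple matches the release from this report."""
    return tuple(runtime_version[:3]) == AFFECTED


def describe(runtime_version):
    """Build a one-line, human-readable verdict for a libxml2 version tuple."""
    version = ".".join(str(part) for part in runtime_version)
    if may_be_affected(runtime_version):
        return f"libxml2 {version}: possibly affected unless the distribution patched it"
    return f"libxml2 {version}: not the release named in this report"


try:
    from lxml import etree  # optional: only needed to inspect the live system
except ImportError:
    etree = None

if etree is not None:
    # LIBXML_VERSION is the libxml2 version lxml is running against, as a tuple of ints.
    print(describe(etree.LIBXML_VERSION))
```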

emerge --info

Portage 3.0.18 (python 3.9.4-final-0, default/linux/amd64/17.1/desktop/plasma/systemd, gcc-10.2.0, glibc-2.32-r7, 5.10.27-gentoo x86_64)
=================================================================
System uname: Linux-5.10.27-gentoo-x86_64-AMD_Ryzen_5_4500U_with_Radeon_Graphics-with-glibc2.32
KiB Mem:    32314324 total,  16651804 free
KiB Swap:          0 total,         0 free
Timestamp of repository gentoo: Tue, 18 May 2021 05:20:11 +0000
Head commit of repository gentoo: d9ba153210ee9f1eef14691f9aac51b5326988a7

Head commit of repository saruman_common_overlay: b8e5c9021b06716c938a9c1a330a6897724fb35c

sh bash 5.1_p8
ld GNU ld (Gentoo 2.35.2 p1) 2.35.2
app-shells/bash:          5.1_p8::gentoo
dev-java/java-config:     2.3.1::gentoo
dev-lang/perl:            5.30.3::gentoo
dev-lang/python:          3.9.4_p1::gentoo
dev-lang/rust:            1.51.0-r2::gentoo
dev-util/cmake:           3.18.5::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.7::gentoo
sys-apps/sandbox:         2.23::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r5::gentoo
sys-devel/automake:       1.13.4-r2::gentoo, 1.16.2-r1::gentoo
sys-devel/binutils:       2.35.2::gentoo
sys-devel/gcc:            10.2.0-r5::gentoo
sys-devel/gcc-config:     2.4::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.3::gentoo
sys-kernel/linux-headers: 5.10::gentoo (virtual/os-headers)
sys-libs/glibc:           2.32-r7::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/gentoo
    priority: -1000

saruman_common_overlay
    location: /var/db/repos/saruman_common_overlay
    sync-type: git
    sync-uri: https://git.benesovi.eu/saruman/saruman_common_overlay.git
    masters: gentoo

x-portage
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA Oracle-BCLA-JavaSE NVIDIA-CUDA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=znver2 -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=znver2 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="cs_CZ.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="cs en"
MAKEOPTS="-j7"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac acl acpi activities alsa amd64 avahi bash-completion berkdb bluetooth branding bzip2 cairo cdda cddb cdr cli crypt cups custom-optimization dbus declarative djvu dri dts dvb dvd dvdr emboss encode exif fat ffmpeg flac fortran gdbm gif gimp gpm gui hwaccel iconv icq icu ipv6 jabber java jemalloc jpeg kde kipi kwallet lcms libglvnd libnotify libtirpc lm_sensors mad mng mp3 mp4 mpeg multilib multimedia ncurses networkmanager nfs nfsdcld nfsv3 nls nptl ntfs ntp offensive ogg opengl openmp oscar pam pango pcre pdf phonon plasma pm-utils png policykit ppds pulseaudio qml qt5 rar readline samba savedconfig scanner sdl seccomp semantic-desktop slp sna sound spell split-usr ssl startup-notification svg system-boost system-cairo system-harfbuzz system-icu system-jpeg system-libevent system-libvpx system-sqlite systemd taglib tcpd thumbnail thumbnails tiff truetype udev udisks unicode upnp upower usb usbredir user-session vaapi vdpau virgl vorbis vpx vulkan webgl widevine widgets wifi wxwidgets x264 x265 xattr xcb xml xv xvid zeroconf zlib zsh-completion" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" CAMERAS="canon ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx avx2 f16c fma3 mmx mmxext pclmul popcnt rdrand sha sse sse2 
sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput" KERNEL="linux" L10N="cs en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-3 php7-4" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" QEMU_SOFTMMU_TARGETS="arm i386 x86_64" QEMU_USER_TARGETS="arm i386 x86_64" RUBY_TARGETS="ruby26" SANE_BACKENDS="net" USERLAND="GNU" VIDEO_CARDS="amdgpu radeonsi" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RUSTFLAGS
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-18 05:42:03 UTC
This is related to changes in libxml2 and how lxml handles them: https://gitlab.gnome.org/GNOME/libxml2/-/issues/255.

libxml2's upstream reckons that lxml is to blame here.
Comment 2 Larry the Git Cow gentoo-dev 2021-05-18 12:51:39 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=10828321263bb5a7c9285fb53771c7ae00283ae6

commit 10828321263bb5a7c9285fb53771c7ae00283ae6
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2021-05-18 12:33:42 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2021-05-18 12:51:33 +0000

    dev-python/lxml: Force old libxml2 for the time being
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Michał Górny <mgorny@gentoo.org>

 dev-python/lxml/lxml-4.6.3.ebuild | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
Comment 3 Thomas Deutschmann (RETIRED) gentoo-dev 2021-05-19 19:14:59 UTC
*** Bug 791127 has been marked as a duplicate of this bug. ***
Comment 4 Stefan Schmid 2021-05-19 19:46:31 UTC
(In reply to Larry the Git Cow from comment #2)
> The bug has been referenced in the following commit(s):
> 
> https://gitweb.gentoo.org/repo/gentoo.git/commit/
> ?id=10828321263bb5a7c9285fb53771c7ae00283ae6

And now, because of this, we get the following message with every world update:

> WARNING: One or more updates/rebuilds have been skipped due to a dependency conflict:
> 
> dev-libs/libxml2:2
> 
>   (dev-libs/libxml2-2.9.12:2/2::gentoo, ebuild scheduled for merge) USE="icu ipv6 lzma python readline -debug -examples -static-libs -test -verify-sig" ABI_X86="(64) -32 (-x32)" PYTHON_TARGETS="python3_8 -python3_7 -python3_9" conflicts with
>     <dev-libs/libxml2-2.9.12 required by (dev-python/lxml-4.6.3:0/0::gentoo, installed) USE="threads -doc -examples -test" ABI_X86="(64)" PYTHON_TARGETS="python3_8 (-pypy3) -python3_7 -python3_9"

A better solution would be very welcome...
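
The warning above follows from the installed lxml-4.6.3 requiring `<dev-libs/libxml2-2.9.12`, which libxml2-2.9.12 itself cannot satisfy. The sketch below illustrates that comparison under strong simplifying assumptions: it handles only plain dotted versions with an optional `-rN` revision, it is not Portage's actual version-comparison algorithm (which also covers suffixes such as `_p` and `_rc`), and the function names are hypothetical.

```python
import re


def parse_version(version):
    """Split '2.9.12-r2' into ((2, 9, 12), 2); a missing revision counts as r0."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?:-r(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    revision = int(match.group(2) or 0)
    return numbers, revision


def satisfies_less_than(candidate, bound):
    """True if candidate would satisfy a '<category/package-bound' style dependency."""
    return parse_version(candidate) < parse_version(bound)


# The old stable version fits under the bound; the new one does not,
# so the update is skipped while lxml-4.6.3 stays installed.
for candidate in ("2.9.10-r5", "2.9.12"):
    print(candidate, satisfies_less_than(candidate, "2.9.12"))
```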
Comment 5 Mike Gilbert gentoo-dev 2021-05-19 20:11:39 UTC
*** Bug 791139 has been marked as a duplicate of this bug. ***
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-19 20:54:45 UTC
There is a candidate patch we can use, but I’m currently testing it. Please be patient.

I am aware of the frustrating “warning” but it is really harmless and just informational. We have asked in the past for it to be downgraded to a different phrase.
Comment 7 Larry the Git Cow gentoo-dev 2021-05-20 01:46:59 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b77811a11fd46ecd492592b6facfdcacc0b79143

commit b77811a11fd46ecd492592b6facfdcacc0b79143
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-05-20 01:35:00 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-05-20 01:46:18 +0000

    dev-python/lxml: add ~arch revbump to allow patched libxml2
    
    libxml2-2.9.12-r1 includes a compatibility patch to restore/support
    older behaviour which lxml relies upon.
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-python/lxml/lxml-4.6.3-r1.ebuild | 100 +++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=6d220e09b25048d28d6598f8a6ffb62f0fbe92b4

commit 6d220e09b25048d28d6598f8a6ffb62f0fbe92b4
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-05-20 01:33:39 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-05-20 01:46:11 +0000

    dev-libs/libxml2: include lxml compatibility patch
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Sam James <sam@gentoo.org>

 .../libxml2-2.9.12-fix-lxml-compatibility.patch    | 214 ++++++++++++++++++
 dev-libs/libxml2/libxml2-2.9.12-r1.ebuild          | 245 +++++++++++++++++++++
 2 files changed, 459 insertions(+)
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-21 23:11:59 UTC
Patch merged. 

I'm minded to stable this version for now, but I'm still considering using our previous patchset and just cherry-picking...
Comment 9 Michal Privoznik 2021-05-25 12:43:06 UTC
FYI - the original upstream patch introduced a regression. I'll be opening a PR with the backport soon.
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-25 12:44:07 UTC
(In reply to Michal Privoznik from comment #9)
> FYI - the original upstream patch introduced a regression. I'll be opening a
> PR with the backport soon.

Thanks for letting me know. Seems the wait before stabling was justified.

I'm still tempted to just go back to our old 2.9.10+git patches and cherry-pick the important ones.
Comment 11 Larry the Git Cow gentoo-dev 2021-05-25 13:24:01 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5dd398f1e5a3acda39f669bd9d94a6f9d715ac59

commit 5dd398f1e5a3acda39f669bd9d94a6f9d715ac59
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: 2021-05-25 12:58:19 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-05-25 13:23:53 +0000

    dev-libs/libxml2: add new upstream patch to fix lxml regression
    
    A regression was introduced by the previous upstream patch (which we
    added to Gentoo in 6d220e09b25048d28d6598f8a6ffb62f0fbe92b4).
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-libs/libxml2/Manifest                 |   1 +
 dev-libs/libxml2/libxml2-2.9.12-r2.ebuild | 249 ++++++++++++++++++++++++++++++
 2 files changed, 250 insertions(+)
Comment 12 Larry the Git Cow gentoo-dev 2021-05-25 13:27:16 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5d9183a38a07a1d99855e5d8f684b18a51dd949b

commit 5d9183a38a07a1d99855e5d8f684b18a51dd949b
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-05-25 13:26:52 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-05-25 13:26:56 +0000

    dev-libs/libxml2: drop 2.9.12-r1
    
    Upstream patch was found to be flawed.
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-libs/libxml2/Manifest                 |   1 -
 dev-libs/libxml2/libxml2-2.9.12-r1.ebuild | 249 ------------------------------
 2 files changed, 250 deletions(-)
Comment 13 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-26 08:36:12 UTC
*** Bug 792159 has been marked as a duplicate of this bug. ***
Comment 14 Ionen Wolkens gentoo-dev 2021-05-27 18:15:50 UTC
*** Bug 792492 has been marked as a duplicate of this bug. ***
Comment 15 Larry the Git Cow gentoo-dev 2021-05-28 03:23:29 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=42986adf347f8f9d7a246b0be00c4010fd441424

commit 42986adf347f8f9d7a246b0be00c4010fd441424
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2021-05-28 03:22:33 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2021-05-28 03:22:56 +0000

    dev-python/lxml: tighten libxml2 bounds
    
    Only as a nudge for people doing partial upgrades or similar. -r1 had
    a general regression (contained in the upstream patch) rather than
    any issues specific to lxml.
    
    Bug: https://bugs.gentoo.org/790737
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-python/lxml/lxml-4.6.3-r1.ebuild | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-05-31 11:39:23 UTC
amd64 done
Comment 17 Rolf Eike Beer archtester 2021-05-31 19:37:29 UTC
sparc done
Comment 18 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-06-01 00:46:07 UTC
arm done
Comment 19 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-06-03 00:41:38 UTC
arm64 done
Comment 20 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-06-03 16:34:24 UTC
x86 done
Comment 21 Rolf Eike Beer archtester 2021-06-03 16:58:32 UTC
hppa stable
Comment 22 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-06-13 06:03:57 UTC
ppc done
Comment 23 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-06-13 06:04:02 UTC
ppc64 done

all arches done