
Bug 745162

Summary: dev-libs/libxml2-2.9.10-r3: Unicode handling causes itstool crashes
Product: Gentoo Linux
Reporter: Alex Belits <abelits>
Component: Current packages
Assignee: Sam James <sam>
Status: RESOLVED FIXED
Severity: normal
CC: abelits, base-system, gentoo, lacyc3, soap, StormByte, toralf
Priority: Normal
Keywords: PATCH
Version: unspecified   
Hardware: All   
OS: Linux   
URL: https://gitlab.gnome.org/GNOME/libxml2/-/issues/64
See Also: https://bugs.gentoo.org/show_bug.cgi?id=701020
Whiteboard: Workaround patch from Fedora was in place, may need to restore it for now; not addressed upstream
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 734968, 745135    
Attachments: Patch used in Red Hat and Debian builds
broken-libxml2.tar.xz

Description Alex Belits 2020-09-28 19:04:42 UTC
The build of app-editors/pluma-2.24.1 fails when itstool crashes. Further investigation showed that the segmentation fault is caused by a known libxml2 bug.

Reproducible: Always

Steps to Reproduce:
1. Build app-editors/pluma-2.24.1
Actual Results:  
Build fails on a segmentation fault in itstool.

Expected Results:  
Pluma 2.24.1 built.
Comment 1 Alex Belits 2020-09-28 19:06:19 UTC
Created attachment 662872 [details, diff]
Patch used in Red Hat and Debian builds
Comment 2 Alex Belits 2020-09-29 20:47:04 UTC
https://gitlab.gnome.org/GNOME/libxml2/-/issues/187 seems to be a similar but different issue. In both cases a malformed document causes a segfault; however, that problem is with document structure and not broken Unicode. Looking at the same file in the libxml2 source, I can see a number of places where strings are passed from vsnprintf() to Python with no checks.

In at least two instances, the null termination of vsnprintf() results also has off-by-one errors in truncation detection. Fortunately, the error is on the safe side; however, that truncation can potentially produce _another_ invalid Unicode sequence from a valid one.
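Editorial illustration of the mechanism described in the comment above, not code from libxml2 or from the attached patch. It is a minimal Python sketch, assuming a fixed-size C buffer that reserves one byte for the terminating NUL. The helper names (`truncate_like_fixed_buffer`, `is_valid_utf8`, `decode_defensively`) are hypothetical. It shows how a byte-level cut can split a multibyte UTF-8 character, and how a receiver could decode such bytes without raising an error.

```python
def truncate_like_fixed_buffer(data: bytes, bufsize: int) -> bytes:
    """Keep at most bufsize - 1 bytes, the way a C buffer that reserves
    one byte for the terminating NUL would (assumed model)."""
    return data[: max(bufsize - 1, 0)]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_defensively(data: bytes) -> str:
    """Decode without raising: invalid bytes become U+FFFD."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # "café" is 5 bytes in UTF-8: the final character takes two bytes.
    full = "caf\u00e9".encode("utf-8")
    print(is_valid_utf8(full))

    # A 5-byte buffer leaves room for 4 payload bytes, so the cut falls
    # in the middle of the two-byte sequence.
    cut = truncate_like_fixed_buffer(full, 5)
    print(cut, is_valid_utf8(cut))

    # Defensive decoding still yields a usable string.
    print(repr(decode_defensively(cut)))
```

The point of the sketch is only that valid input can become invalid after truncation, so a check or a tolerant decode is needed at the boundary where C strings are handed to Python.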
Comment 3 OzTiram 2020-11-28 18:11:37 UTC
I can confirm this behaviour. I applied the patch from 
https://gitweb.gentoo.org/repo/gentoo.git/plain/dev-libs/libxml2/files/2.9.9-python3-unicode-errors.patch?id=47c1fed5929fd9633e535c9da15d34c1f09d065a

and it solved the problem.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2020-12-02 23:56:51 UTC
I can re-apply the old patch, but is this *actually* reported upstream?
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2020-12-03 00:08:40 UTC
(In reply to Sam James from comment #5)
> I can re-apply the old patch, but is this *actually* reported upstream?

Ah: https://gitlab.gnome.org/GNOME/libxml2/-/issues/64.

(https://745162.bugs.gentoo.org/attachment.cgi?id=662872)
Comment 7 Larry the Git Cow gentoo-dev 2020-12-03 00:14:53 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=da4038c33b2c7684f5766d6e8f1d1089e863e87c

commit da4038c33b2c7684f5766d6e8f1d1089e863e87c
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2020-12-03 00:13:24 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2020-12-03 00:14:49 +0000

    dev-libs/libxml2: restore itstool-segfault patch
    
    We stopped applying this patch during a roll
    of a new patchset (my fault), but it seems to still
    be needed. Noticed when building some of MATE.
    
    Bug: https://bugs.gentoo.org/745162
    Package-Manager: Portage-3.0.9, Repoman-3.0.2
    Signed-off-by: Sam James <sam@gentoo.org>

 .../libxml2-2.9.8-python3-unicode-errors.patch     | 34 ++++++++++++++++++++++
 ...2-2.9.10-r3.ebuild => libxml2-2.9.10-r4.ebuild} |  3 ++
 2 files changed, 37 insertions(+)
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-01-12 22:26:05 UTC
*** Bug 744739 has been marked as a duplicate of this bug. ***
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-01-12 22:26:49 UTC
*** Bug 734968 has been marked as a duplicate of this bug. ***
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-01-12 22:26:57 UTC
*** Bug 745135 has been marked as a duplicate of this bug. ***
Comment 11 gsra99 2022-11-27 22:05:07 UTC
I'm having this issue with libxml2-2.10.3. Is the patch no longer being applied to this version of libxml2?
Comment 12 Kobboi 2022-11-27 22:40:29 UTC
(In reply to gsra99 from comment #11)
> I'm having this issue with libxml2-2.10.3. Is the patch no longer being
> applied to this version of libxml2?

When trying to build something in the tree? Or are you calling itstool yourself or is some program you work with using itstool?
Comment 13 gsra99 2022-11-28 22:53:38 UTC
(In reply to Kobboi from comment #12)
> (In reply to gsra99 from comment #11)
> > I'm having this issue with libxml2-2.10.3. Is the patch no longer being
> > applied to this version of libxml2?
> 
> When trying to build something in the tree? Or are you calling itstool
> yourself or is some program you work with using itstool?

I am trying to build and install pluma-1.24.2. I get the same error as mentioned for pluma-1.24.1. However, libxml2 is now at v2.10 rather than 2.9.
Comment 14 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-11-28 22:57:17 UTC
This is back again, it seems, so reopening. I managed to construct a minimalish reproducer about 6 months ago but it then succeeded on a modern libxml2 :(

https://forums.gentoo.org/viewtopic-t-1159343.html mentions a test case which may be useful for debugging (I can't try it right now).
Comment 15 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-11-28 22:57:53 UTC
*** Bug 868630 has been marked as a duplicate of this bug. ***
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-11-28 22:58:11 UTC
*** Bug 878183 has been marked as a duplicate of this bug. ***
Comment 17 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-11-28 22:58:45 UTC
*** Bug 878179 has been marked as a duplicate of this bug. ***
Comment 18 OzTiram 2022-11-29 19:23:11 UTC
The old patch still works.
After building libxml2 with the patch, I was able to compile pluma and mate-applets.
Comment 19 Larry the Git Cow gentoo-dev 2022-11-29 19:59:05 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=bee4fbd32b0225d09bf7fca8d38b83a9ac368bfd

commit bee4fbd32b0225d09bf7fca8d38b83a9ac368bfd
Author:     David Seifert <soap@gentoo.org>
AuthorDate: 2022-11-29 19:58:44 +0000
Commit:     David Seifert <soap@gentoo.org>
CommitDate: 2022-11-29 19:58:44 +0000

    dev-libs/libxml2: add workaround patch for itstool breakage
    
    Bug: https://bugs.gentoo.org/745162
    Signed-off-by: David Seifert <soap@gentoo.org>

 .../libxml2-2.10.3-python3-unicode-errors.patch    | 35 ++++++++++++++++++++++
 ...xml2-2.10.3.ebuild => libxml2-2.10.3-r1.ebuild} |  2 ++
 2 files changed, 37 insertions(+)
Comment 20 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-04-23 07:17:01 UTC
Created attachment 860557 [details]
broken-libxml2.tar.xz

In an old chroot (2019-10-30) I created, I get:
```
/var/tmp/portage/app-office/gnumeric-1.12.44/work/gnumeric-1.12.44/doc # itstool -m "/var/tmp/portage/app-office/gnumeric-1.12.44/work/gnumeric-1.12.44/doc/cs/cs.mo" C/gnumeric.xml
Warning: Could not merge cs translation for msgid:
b"Using <inlineequation> <_:alt-1/> <_:mathphrase-2/> </inlineequation> as the z-value of the <inlineequation> <_:alt-3/> <_:mathphrase-4/> </inlineequation> percentile of the standard normal distribution, set the initial estimate of the number of iterations required as the smallest integer <inlineequation> <_:alt-5/> <_:mathphrase-6/> </inlineequation> such that <_:equation-7/>. Note that if <inlineequation> <_:alt-8/> <_:mathphrase-9/> </inlineequation> is small, it would be more appropriate to use the student's t-distribution of <inlineequation> <_:alt-10/> <_:mathphrase-11/> </inlineequation> instead of <inlineequation> <_:alt-12/> <_:mathphrase-13/> </inlineequation>."
Segmentation fault (core dumped)
```

I also ended up minimising a reproducer derived from itstool, but it doesn't reproduce with the latest libxml2, even though the crash does still happen with the latest libxml2 when the patch is not applied (not for gnumeric, but for the MATE packages mentioned above on the forums: https://forums.gentoo.org/viewtopic-t-1159343.html). This implies that my reduced itstool case was too specific.

Attaching a smaller version of the forums case, but we still need either a small C or Python reproducer which leverages the libxml2 API.
Comment 21 Larry the Git Cow gentoo-dev 2023-05-01 06:25:06 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=eb72ee41457cc0b7f93c1d70f170ddc8ea877175

commit eb72ee41457cc0b7f93c1d70f170ddc8ea877175
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-05-01 06:24:33 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-05-01 06:24:33 +0000

    dev-libs/libxml2: add note re itstool/unicode crash patch
    
    We need to check if it's still relevant w/ 2.11.0.
    
    Bug: https://bugs.gentoo.org/745162
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-libs/libxml2/libxml2-2.11.1.ebuild | 3 +++
 profiles/package.mask                  | 2 ++
 2 files changed, 5 insertions(+)
Comment 22 Larry the Git Cow gentoo-dev 2023-05-10 19:34:22 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=9b2ad65342b2445a38775260e7f4497d06466ee4

commit 9b2ad65342b2445a38775260e7f4497d06466ee4
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2023-05-10 19:33:44 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2023-05-10 19:33:44 +0000

    profiles: unmask new libxml2
    
    Seems to have fixed the python bindings issue too: https://gitlab.gnome.org/GNOME/libxml2/-/commit/76c6da420923f2721a2e16adfcef8707a2454a1b.
    
    Closes: https://bugs.gentoo.org/745162
    Signed-off-by: Sam James <sam@gentoo.org>

 dev-libs/libxml2/libxml2-2.11.2.ebuild | 3 ---
 profiles/package.mask                  | 7 -------
 2 files changed, 10 deletions(-)