Bug 924601 - app-admin/clustershell-1.9.2 fails test (hang): test deadlocks with >=dev-libs/expat-2.6.0
Status: IN_PROGRESS
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Petr Vaněk
URL: https://github.com/cea-hpc/clustershe...
Whiteboard:
Keywords: TESTFAILURE
Depends on:
Blocks: CVE-2023-52425 CVE-2024-28757
Reported: 2024-02-14 18:33 UTC by Agostino Sarubbo
Modified: 2024-04-03 09:40 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
build.log (build.log,314.27 KB, text/plain)
2024-02-14 18:34 UTC, Agostino Sarubbo

Description Agostino Sarubbo gentoo-dev 2024-02-14 18:33:58 UTC
https://blogs.gentoo.org/ago/2020/07/04/gentoo-tinderbox/

Issue: app-admin/clustershell-1.9.2 fails tests.
Discovered on: x86 (internal ref: tinderbox_x86)
System: GCC-14-SYSTEM (https://wiki.gentoo.org/wiki/Project:Tinderbox/Common_Issues_Helper#GCC-14)

Info about the issue:
https://wiki.gentoo.org/wiki/Project:Tinderbox/Common_Issues_Helper#CF0015
Comment 1 Agostino Sarubbo gentoo-dev 2024-02-14 18:34:00 UTC
Created attachment 884992
build.log

build log and emerge --info
Comment 2 Agostino Sarubbo gentoo-dev 2024-02-14 18:34:01 UTC
Error(s) that match a known pattern:


ERROR:ClusterShell.Gateway:MessageProcessingError: Invalid "message" attributes: missing key "gateway"
test groups when not allowed to read some YAML config file ... DEBUG:ClusterShell.NodeUtils:[Errno 13] Permission denied: '/var/tmp/portage/app-admin/clustershell-1.9.2/temp/cs-test-vyo42aiu/cs-test-24lq2nzx.yaml'
Comment 3 Petr Vaněk gentoo-dev 2024-02-15 14:32:40 UTC
This bug happens on amd64 as well. I was able to bisect it to commit
9429ff64f132 ("dev-lang/python: Bump to 3.10.13_p3"), which allows
dev-libs/expat-2.6.0 to be pulled in. So this issue happens only with
dev-libs/expat-2.6.0; the tests pass if I downgrade to expat-2.5.0.
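
To check which Expat a given Python interpreter is actually using, something like the following sketch may help. The helper name includes_partial_token_change is hypothetical, and the (2, 6, 0) threshold is taken only from the versions named in this comment.

```python
from xml.parsers import expat

def includes_partial_token_change(version=expat.version_info):
    """Return True if the Expat version tuple is 2.6.0 or newer.

    Hypothetical helper: the threshold comes from this report, where the
    tests hang with expat-2.6.0 and pass with expat-2.5.0.
    """
    return tuple(version) >= (2, 6, 0)

# EXPAT_VERSION is a string naming the Expat release in use,
# version_info is the same release as a (major, minor, micro) tuple.
print(expat.EXPAT_VERSION, includes_partial_token_change())
```

On a system where Python is linked against the system libexpat, as on Gentoo, this should reflect the installed dev-libs/expat version.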

The issue is a deadlock in test_basic_noop from
https://github.com/cea-hpc/clustershell/blob/v1.9.2/tests/TreeGatewayTest.py

There are two threads sending data through pipes, and the main process is stuck
waiting for more data, most probably in this function:
https://github.com/cea-hpc/clustershell/blob/v1.9.2/tests/TreeGatewayTest.py#L131-L136

When the issue occurs, strace looks like this:

write(7, "<channel version=\"1.9.2\">\n", 26) = 26
read(8, "<?xml version=\"1.0\" encoding=\"ut"..., 4096) = 39
read(8, "<channel", 4096)               = 8
read(8, " version=\"1.9.2\"", 4096)     = 16
read(8, ">", 4096)                      = 1
read(8, 

While normally it looks like this:

write(7, "<channel version=\"1.9.2\">\n", 26) = 26
read(8, "<?xml version=\"1.0\" encoding=\"ut"..., 4096) = 39
read(8, "<channel", 4096)               = 8
read(8, " version=\"1.9.2\"", 4096)     = 16
read(8, ">", 4096)                      = 1
write(7, "</channel>\n", 11)            = 11
read(8, "</channel>", 4096)             = 10
close(8)                                = 0
close(7)                                = 0
write(2, "ok\n", 3)                     = 3

I bisected libexpat, and the issue is triggered by this commit:
https://github.com/libexpat/libexpat/commit/9cdf9b8d77d5c2c2a27d15fb68dd3f83cafb45a1
("Skip parsing after repeated partials on the same token")
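
A minimal sketch that may help reproduce the behaviour outside of ClusterShell: it feeds the same chunks seen in the strace output to a plain pyexpat parser and records after which chunk the start-element callback for <channel> fires. The function name feed_in_chunks is hypothetical, and this is not ClusterShell's own code. Unlike the real test, it always sends the closing tag and a final flush, so it cannot hang.

```python
from xml.parsers import expat

# Chunk boundaries copied from the read() calls in the strace output above.
CHUNKS = [
    b'<?xml version="1.0" encoding="utf-8"?>\n',
    b'<channel',
    b' version="1.9.2"',
    b'>',
    b'</channel>',
]

def feed_in_chunks(chunks):
    """Feed chunks one at a time and record when start tags are reported.

    Returns a list of (chunk_index, element_name, attributes) tuples, where
    chunk_index is the chunk being fed when the callback fired, or
    len(chunks) if it only fired on the final flush.
    """
    fired = []
    current = {"index": None}

    def on_start(name, attrs):
        fired.append((current["index"], name, dict(attrs)))

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    for i, chunk in enumerate(chunks):
        current["index"] = i
        parser.Parse(chunk, False)   # more data may follow
    current["index"] = len(chunks)
    parser.Parse(b"", True)          # final call: parse whatever is still buffered
    return fired

for index, name, attrs in feed_in_chunks(CHUNKS):
    print("start of", name, attrs, "reported while feeding chunk", index)
```

If the callback is reported at chunk 3 (the lone ">"), the parser delivered the start tag as soon as it was complete. If it is reported later, the start tag was held back, which would match the hang above: the gateway never sees the opening <channel> and the peer never sends more data.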

Any idea?
Comment 4 Sebastian Pipping gentoo-dev 2024-02-15 18:41:11 UTC
Thanks for the detailed report!  I will try to find time for a closer look on the weekend.
Comment 5 Sebastian Pipping gentoo-dev 2024-02-18 11:50:03 UTC
I've had a chance to investigate this further.  Options for next steps depend on how CPython upstream feels about my new, related pull request: https://github.com/python/cpython/pull/115623  Let's see about that first.