| Summary: | sys-apps/systemd-247.2-r4: test failures on non-systemd host | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Sam James <sam> |
| Component: | Current packages | Assignee: | Gentoo systemd Team <systemd> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | | |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| See Also: | https://github.com/systemd/systemd/pull/19025 | | |
| Whiteboard: | | | |
| Package list: | | Runtime testing required: | --- |
| Bug Depends on: | | | |
| Bug Blocks: | 770244 | | |
| Attachments: | build.log.xz (sparc), testlog.txt.xz (sparc), build.log.xz (sparc), testlog.txt.xz (sparc), build.log.xz (amd64), testlog.txt.xz (amd64) | | |
Description
Sam James
2021-02-21 05:51:22 UTC
Created attachment 687807
build.log.xz (sparc)
Created attachment 687810
testlog.txt.xz (sparc)
Created attachment 687813
build.log.xz (sparc)
(Originally, dbus wasn't running, so there were extra fixable errors).
# grep "FAIL" /var/tmp/portage/sys-apps/systemd-247.2-r4/temp/build.log
309/562 test-oomd-util FAIL 2.83s (killed by signal 6 SIGABRT)
321/562 test-engine FAIL 2.36s (killed by signal 6 SIGABRT)
329/562 test-unit-name FAIL 2.71s (killed by signal 6 SIGABRT)
330/562 test-load-fragment FAIL 2.81s (killed by signal 6 SIGABRT)
430/562 test-cgroup-util FAIL 2.92s (killed by signal 6 SIGABRT)
436/562 test-path-util FAIL 3.01s (killed by signal 6 SIGABRT)
438/562 test-path FAIL 3.01s (killed by signal 6 SIGABRT)
443/562 test-sched-prio FAIL 2.74s (killed by signal 6 SIGABRT)
491/562 test-bus-creds FAIL 0.30s (killed by signal 6 SIGABRT)
497/562 test-login FAIL 0.64s (killed by signal 6 SIGABRT)
Created attachment 687816
testlog.txt.xz (sparc)
Created attachment 687819
build.log.xz (amd64)
9 failures on amd64 vs SPARC's 10:
# grep "FAIL" /var/tmp/portage/sys-apps/systemd-247.2-r4/temp/build.log
300/541 test-oomd-util FAIL 0.09s (killed by signal 6 SIGABRT)
311/541 test-engine FAIL 0.09s (killed by signal 6 SIGABRT)
319/541 test-unit-name FAIL 0.03s (killed by signal 6 SIGABRT)
320/541 test-load-fragment FAIL 0.03s (killed by signal 6 SIGABRT)
419/541 test-cgroup-util FAIL 0.02s (killed by signal 6 SIGABRT)
427/541 test-path FAIL 0.02s (killed by signal 6 SIGABRT)
432/541 test-sched-prio FAIL 0.02s (killed by signal 6 SIGABRT)
480/541 test-bus-creds FAIL 0.02s (killed by signal 6 SIGABRT)
486/541 test-login FAIL 0.01s (killed by signal 6 SIGABRT)
The test which fails only on SPARC (both with and without dbus, so I guess it's not a fluke):
436/562 test-path-util FAIL 3.01s (killed by signal 6 SIGABRT)
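The SPARC-only failure can also be derived mechanically rather than by eye. The following is a minimal sketch, assuming both build logs have been saved locally and that each failure line has the shape shown in the grep output above (index, test name, then the word FAIL). The function names and the log file names in the usage note are hypothetical.

```shell
# Print the names of failed tests found in a meson build log, sorted and de-duplicated.
# Assumes whitespace-separated lines such as:
#   436/562 test-path-util FAIL 3.01s (killed by signal 6 SIGABRT)
failed_tests() {
    awk '$3 == "FAIL" { print $2 }' "$1" | LC_ALL=C sort -u
}

# Print the tests that failed in the first log but not in the second.
only_failed_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    failed_tests "$1" > "$a"
    failed_tests "$2" > "$b"
    # comm -23 keeps lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling `only_failed_in_first sparc-build.log amd64-build.log` on the two lists in this report should print only test-path-util.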
Created attachment 687822
testlog.txt.xz (amd64)
Found at /var/tmp/portage/sys-apps/systemd-247.2-r4/work/systemd-stable-247.2-abi_x86_32.x86/meson-logs/testlog.txt.
No other testlog.txt exists, so I assume that because the 32-bit (abi_x86_32) test run failed, the 64-bit run was never started.
> I'm aware this is a bit of a pain as a case, but until now (< 247), I've been able to run tests on non-systemd hosts.
I would like to see proof of this: as far as I am aware, the systemd test suite has never worked without systemd running as PID 1.
When I run the tests in a container running sysvinit as PID 1 and OpenRC as the service manager, I get no failures. I wonder if there is something subtle like a missing mount point or kernel sysctl that is causing the problem on your baremetal OpenRC systems.

All of the failing tests have a similar cause related to cgroups that looks something like this:

statfs("/sys/fs/cgroup/systemd" failed: No such file or directory
Assertion 'r >= 0' failed at src/shared/tests.c:269, function enter_cgroup(). Aborting.

Basically, I need to figure out exactly how this line of code gets reached.
https://github.com/systemd/systemd/blob/v247/src/basic/cgroup-util.c#L2026

What is rc_cgroup_mode set to in rc.conf?

What does /sys/fs/cgroup look like? Please check /proc/self/mounts for any mounts starting with that path.

(In reply to Mike Gilbert from comment #9)
> All of the failing tests have a similar cause related to cgroups that looks
> something like this:
>
> statfs("/sys/fs/cgroup/systemd" failed: No such file or directory
> Assertion 'r >= 0' failed at src/shared/tests.c:269, function
> enter_cgroup(). Aborting.
>
> Basically, I need to figure out exactly how this line of code gets reached.
>
> https://github.com/systemd/systemd/blob/v247/src/basic/cgroup-util.c#L2026
>
> What is rc_cgroup_mode set to in rc.conf?
>
> What does /sys/fs/cgroup look like? Please check /proc/self/mounts for any
> mounts starting with that path.

For this host, catbus, the host is systemd with an OpenRC chroot mounted using gentoo-chrootiez [0]. Nothing has (knowingly) really changed here.
# grep "cgroup" /proc/self/mounts
tmpfs /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,mode=755 0 0
tmpfs /sys/fs/cgroup/portage cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/portage/python3.8/cgroup-release-agent,name=portage 0 0

rc.conf within chroot:
>rc_shell=/sbin/sulogin
>unicode="YES"
>rc_tty_number=12
>rc_sys="prefix"
>rc_controller_cgroups="NO"
>rc_depend_strict="NO"
>rc_need="!net !dev !udev-mount !sysfs !checkfs !fsck !netmount !logger !clock !modules"

[0] https://github.com/trofi/gentoo-chrootiez

Try running the following commands in the chroot. This will put a cgroup instance where systemd expects to find it.
> mkdir /sys/fs/cgroup/systemd
> mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
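To check whether the mount suggested above actually took effect, the mount table can be filtered by mount point rather than by a plain grep for "cgroup". This is a minimal sketch, assuming the standard /proc/self/mounts field order (device, mount point, filesystem type, options); the function names are hypothetical.

```shell
# List "mountpoint fstype" for every mount at or below /sys/fs/cgroup.
# Reads the file given as $1, defaulting to /proc/self/mounts.
cgroup_mounts() {
    awk '$2 == "/sys/fs/cgroup" || index($2, "/sys/fs/cgroup/") == 1 { print $2, $3 }' \
        "${1:-/proc/self/mounts}"
}

# Succeed only if a cgroup (v1) filesystem is mounted at /sys/fs/cgroup/systemd.
has_systemd_hierarchy() {
    cgroup_mounts "$@" | grep -qxF '/sys/fs/cgroup/systemd cgroup'
}
```

Running `cgroup_mounts` inside the chroot before and after the mkdir and mount commands shows whether /sys/fs/cgroup/systemd appeared; `has_systemd_hierarchy && echo ok` gives a yes/no answer.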
(In reply to Mike Gilbert from comment #11)
> Try running the following commands in the chroot. This will put a cgroup
> instance where systemd expects to find it.
>
> > mkdir /sys/fs/cgroup/systemd
> > mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd

I've tried poking but I'm not sure how to get past this:
# mkdir /sys/fs/cgroup/systemd
mkdir: cannot create directory ‘/sys/fs/cgroup/systemd’: No such file or directory

Ok, here's what I think is happening:

1. When you enter the chroot, /sys/fs/cgroup is not mounted.
2. When you run emerge, it mounts /sys/fs/cgroup and sets up its own hierarchy.
https://gitweb.gentoo.org/proj/portage.git/tree/lib/_emerge/AbstractEbuildProcess.py?h=portage-3.0.17#n73
3. The systemd code in cg_unified_cached() sees that /sys/fs/cgroup is mounted as a tmpfs, and assumes that it will find /sys/fs/cgroup/unified or /sys/fs/cgroup/systemd mounted as cgroup2 or cgroup, respectively.
https://github.com/systemd/systemd/blob/v247/src/basic/cgroup-util.c#L1997

A workaround would be to ensure that /sys/fs/cgroup is already mounted, with fstype equal to either cgroup2 or tmpfs. If /sys/fs/cgroup is mounted as a tmpfs, ensure that /sys/fs/cgroup/unified is mounted as cgroup2, or /sys/fs/cgroup/systemd is mounted as cgroup.

Could you give this patch a try?
https://github.com/floppym/systemd/commit/236ce6ce005fecfdc80e0e9fb8a77f698bbc6aa7

(In reply to Mike Gilbert from comment #14)
> Could you give this patch a try?
>
> https://github.com/floppym/systemd/commit/236ce6ce005fecfdc80e0e9fb8a77f698bbc6aa7

Before:
Ok: 565
Expected Fail: 0
Fail: 10
Unexpected Pass: 0
Skipped: 17
Timeout: 0

After:
Ok: 567
Expected Fail: 0
Fail: 1
Unexpected Pass: 0
Skipped: 24
Timeout: 0

The failure was one we already discussed as acceptable for now:
387/592 test-fs-util FAIL 0.11s (killed by signal 6 SIGABRT)

So, thanks -- works!
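The layout check described in the analysis above (what cg_unified_cached() expects to find under /sys/fs/cgroup) can be approximated from the mount table. This is only a simplified sketch of that description, not the systemd code: it inspects mount entries instead of calling statfs(), and the function names and printed labels (unified, hybrid, legacy, broken, unknown) are illustrative assumptions rather than systemd identifiers.

```shell
# Print the filesystem type mounted exactly at mount point $2, according to
# the mounts file $1. The last matching entry wins, since later mounts shadow
# earlier ones. Prints nothing if the path is not a mount point.
fstype_at() {
    awk -v mp="$2" '$2 == mp { t = $3 } END { if (t != "") print t }' "$1"
}

# Classify the cgroup layout following the description in this thread.
# Returns non-zero for layouts the description does not cover.
classify_cgroup_layout() {
    mounts=${1:-/proc/self/mounts}
    root=$(fstype_at "$mounts" /sys/fs/cgroup)
    case $root in
        cgroup2) echo unified; return 0 ;;
        tmpfs) ;;                        # look at the sub-mounts below
        *) echo unknown; return 1 ;;     # not mounted, or an unexpected fstype
    esac
    if [ "$(fstype_at "$mounts" /sys/fs/cgroup/unified)" = cgroup2 ]; then
        echo hybrid
    elif [ "$(fstype_at "$mounts" /sys/fs/cgroup/systemd)" = cgroup ]; then
        echo legacy
    else
        # tmpfs at /sys/fs/cgroup with neither expected sub-mount:
        # the situation reported in this bug.
        echo broken
        return 1
    fi
}
```

With the mount table quoted earlier in this report (a tmpfs at /sys/fs/cgroup and only the portage hierarchy below it), `classify_cgroup_layout` prints broken.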
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=e473f70fbcfc239779f91c1649af4f369e0f2b6e

commit e473f70fbcfc239779f91c1649af4f369e0f2b6e
Author:     Mike Gilbert <floppym@gentoo.org>
AuthorDate: 2021-03-17 14:18:23 +0000
Commit:     Mike Gilbert <floppym@gentoo.org>
CommitDate: 2021-03-17 14:18:23 +0000

    sys-apps/systemd: fix cgroup-related test failures

    Closes: https://bugs.gentoo.org/771819
    Signed-off-by: Mike Gilbert <floppym@gentoo.org>

 sys-apps/systemd/files/247-cgroup-test.patch | 35 ++++++++++++++++++++++++++++
 sys-apps/systemd/systemd-247.2-r4.ebuild     |  1 +
 2 files changed, 36 insertions(+)