Bug 828542 - app-emulation/libvirt-7.(9|10).0: broken systemd-machinectl integration for libvirt_lxc containers
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Matthias Maier
Reported: 2021-12-08 05:37 UTC by Matthias Maier
Modified: 2023-06-18 01:40 UTC

Attachments
emerge --info (emerge.info,7.05 KB, application/x-info)
2021-12-08 20:49 UTC, Matthias Maier

Description Matthias Maier gentoo-dev 2021-12-08 05:37:41 UTC
The systemd-machined integration for libvirt_lxc containers seems to be somewhat broken:

 - machinectl status lxc-$PID-$CONTAINER no longer shows IP addresses

 - machinectl shell lxc-$PID-$CONTAINER drops into a root shell on the host (!) instead of in the container

I will try to bisect to find the troublesome upstream commit tomorrow.

Reproducible: Always

Steps to Reproduce:
1. Start a libvirt_lxc container.
2. Run machinectl status or machinectl shell against it.
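
A minimal shell sketch of the steps above, for anyone trying to reproduce this. The helper names (match_lxc_machine, reproduce_status) are made up for illustration. It assumes the lxc:/// connection URI and that the machine appears in the machinectl list output under a name of the form lxc-$PID-$CONTAINER, as described above. The glob is loose, so a container whose name ends in "-$CONTAINER" would match as well.

```shell
#!/bin/sh
# match_lxc_machine CONTAINER
# Reads `machinectl list --no-legend` style lines on stdin and prints every
# machine name of the form lxc-<something>-<CONTAINER> (hypothetical helper).
match_lxc_machine() {
    while read -r name _rest; do
        case "$name" in
            lxc-*-"$1") printf '%s\n' "$name" ;;
        esac
    done
}

# reproduce_status CONTAINER
# Starts the container through libvirt and asks machined for its status.
# With an affected libvirt the status output is expected to lack IP addresses.
reproduce_status() {
    virsh --connect lxc:/// start "$1" || return 1
    machine=$(machinectl list --no-legend | match_lxc_machine "$1" | head -n 1)
    [ -n "$machine" ] || return 1
    machinectl status "$machine"
}
```

Usage would be something like `reproduce_status mycontainer` as root, where mycontainer is a placeholder for an already defined libvirt_lxc domain.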
Comment 1 Matthias Maier gentoo-dev 2021-12-08 20:49:28 UTC
Created attachment 757767 [details]
emerge --info
Comment 2 Matthias Maier gentoo-dev 2021-12-08 20:57:20 UTC
I have attached an emerge --info.

I am running sys-apps/systemd-249.6 with cgroup v2 layout.


Looking at this issue a bit more:

 - Startup and shutdown of lxc containers work just fine

 - The issue seems to be in /usr/libexec/libvirt_lxc (independent of the running daemon), at the point where it registers a new container with systemd-machined. For lack of a better description, crucial information such as the associated namespaces and IP addresses seems to be missing from that registration.

 - I have bisected the problem down to the following upstream commit:


1b9ce05ce241a581d4e80228c92ceb0266f21f94 is the first bad commit
commit 1b9ce05ce241a581d4e80228c92ceb0266f21f94
Author: Cole Robinson <crobinso@redhat.com>
Date:   Tue Oct 5 09:42:12 2021 -0400

    lxc: controller: Fix container launch on cgroup v1
    
    With cgroup v1 I'm seeing LXC container startup failures:
    
    $ sudo virt-install --connect lxc:/// --name test-container --memory 128
    --boot init=/bin/sh
    
    Starting install...
    ERROR    error from service:
    GDBus.Error:org.freedesktop.machine1.NoMachineForPID: PID 2145047 does
    not belong to any known machine
    
    libvirt 7.0.0 works but 7.1.0+ does not. The root error seems to predate
    that, showing up in syslog, but commit 9c1693eff made it fatal:
    
    commit 9c1693eff427661616ce1bd2795688f87288a412
    Author: Pavel Hrdina <phrdina@redhat.com>
    Date:   Fri Feb 5 16:17:35 2021 +0100
    
         vircgroup: use DBus call to systemd for some APIs
    
    The error comes from virSystemdGetMachineByPID. The PID that shows up in
    the above error message does not match the leader PID as reported by
    machinectl.
    
    This change fixes the error. Things seem to continue to work with
    cgroupsv2 after this change.
    
    https://gitlab.com/libvirt/libvirt/-/issues/182
    
    Tested-by: Jim Fehlig <jfehlig@suse.com>
    Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
    Signed-off-by: Cole Robinson <crobinso@redhat.com>

 src/lxc/lxc_controller.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 3 Matthias Maier gentoo-dev 2021-12-08 21:03:00 UTC
In summary, this change:

--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -865,12 +865,12 @@ static int virLXCControllerSetupCgroupLimits(virLXCController *ctrl)
     nodeset = virDomainNumatuneGetNodeset(ctrl->def->numa, auto_nodeset, -1);
 
     if (!(ctrl->cgroup = virLXCCgroupCreate(ctrl->def,
-                                            ctrl->initpid,
+                                            getpid(),
                                             ctrl->nnicindexes,
                                             ctrl->nicindexes)))
         goto cleanup;
 
-    if (virCgroupAddMachineProcess(ctrl->cgroup, getpid()) < 0)
+    if (virCgroupAddMachineProcess(ctrl->cgroup, ctrl->initpid) < 0)
         goto cleanup;
 
     /* Add all qemu-nbd tasks to the cgroup */


somehow breaks systemd-machined's namespace association under the cgroup v2 layout. For example,

  $ machinectl shell <container>

will happily open a shell on the host, because machined talks to the wrong systemd instance (the one in the root namespace rather than the one in the container namespace).
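
One way to check this from the host is to compare the namespaces of the leader PID that machined has recorded for the machine with those of a host process. The sketch below is only a diagnostic aid under stated assumptions: the helper names (ns_id, same_ns, check_leader) are made up, and it assumes that machined uses the leader PID to locate the container's namespaces. After the change above, virLXCCgroupCreate receives getpid() (the controller) instead of ctrl->initpid, so a leader that shares the host's pid and mnt namespaces would be consistent with the behaviour described here.

```shell
#!/bin/sh
# ns_id PID TYPE
# Prints the namespace identifier of a process, e.g. "pid:[4026531836]".
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_ns PID_A PID_B TYPE
# Returns 0 if both processes are in the same namespace of the given type,
# 1 if they differ, 2 if either namespace link cannot be read.
same_ns() {
    a=$(ns_id "$1" "$3") || return 2
    b=$(ns_id "$2" "$3") || return 2
    [ "$a" = "$b" ]
}

# check_leader MACHINE
# Compares the machine's leader PID (as reported by machined) with the
# calling shell, which is assumed to run in the host namespaces.
check_leader() {
    leader=$(machinectl show "$1" --property=Leader --value) || return 2
    for ns in pid mnt net; do
        rc=0
        same_ns "$leader" "$$" "$ns" || rc=$?
        case $rc in
            0) echo "$ns: leader $leader shares the host namespace" ;;
            1) echo "$ns: leader $leader is in a separate namespace" ;;
            *) echo "$ns: cannot read namespace of $leader (root needed?)" ;;
        esac
    done
}
```

Run as root, for example `check_leader lxc-$PID-$CONTAINER`. A container that is configured to share the host network will legitimately report the same net namespace, so the pid and mnt lines are the more telling ones.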
Comment 4 Larry the Git Cow gentoo-dev 2021-12-08 21:14:58 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=46d2a0c12d7304c56bcb4ece27fa831e8bcaadf5

commit 46d2a0c12d7304c56bcb4ece27fa831e8bcaadf5
Author:     Matthias Maier <tamiko@gentoo.org>
AuthorDate: 2021-12-08 21:14:21 +0000
Commit:     Matthias Maier <tamiko@gentoo.org>
CommitDate: 2021-12-08 21:14:52 +0000

    app-emulation/libvirt: v7.(9|10).0: (temporary) fix cgroup v2 support
    
    Revert an upstream commit that fixed an libvirt_lxc container startup
    issue with cgroup v1 layout. The patch in question breaks
    systemd-machined integration (at least under cgroup v2 layout).
    
    Le't temporarily revert the commit in question until upstream has found
    a proper fix.
    
    Bug: https://bugs.gentoo.org/828542
    Package-Manager: Portage-3.0.28, Repoman-3.0.3
    Signed-off-by: Matthias Maier <tamiko@gentoo.org>

 .../libvirt/files/libvirt-7.9.0-fix_cgroupv2.patch | 32 ++++++++++++++++++++++
 ...virt-7.10.0.ebuild => libvirt-7.10.0-r1.ebuild} |  1 +
 ...ibvirt-7.9.0.ebuild => libvirt-7.9.0-r1.ebuild} |  1 +
 3 files changed, 34 insertions(+)