Hi, I've installed =sys-cluster/k3s-1.25.4_p1, but it enters a crash loop as soon as I start it.

Reproducible: Always

Steps to Reproduce:
1. emerge sys-cluster/k3s
2. /etc/init.d/k3s start

Actual Results:
Note the crash loop in /var/log/k3s/k3s.log

Expected Results:
The cluster should start.

Attaching the relevant files.
Created attachment 874153: emerge info
Created attachment 874154: logfile
Update: setting rc_cgroup_mode="hybrid" in /etc/rc.conf makes the problem go away
Hi, a quick update. I've traced the root issue to the cpu cgroup v2 controller not being settable.

    root@box /sys/fs/cgroup # echo "+cpuset +io +memory +hugetlb +pids +rdma +misc" > cgroup.subtree_control
    root@box /sys/fs/cgroup # echo "+cpu" > cgroup.subtree_control
    -bash: echo: write error: Invalid argument

With cgroup v1 it starts, but since its support is being phased out from k8s, that's not a good solution.
I have zero understanding of cgroup and there's no trace of the reason in the kernel logs.
I'm taking hints.
Thanks a lot!
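For anyone hitting the same thing, a quick way to see which controllers the kernel offers but which are not yet enabled for child cgroups is to compare cgroup.controllers with cgroup.subtree_control. The helper below is only a sketch; the function name missing_controllers is made up for illustration, and the directory defaults to /sys/fs/cgroup as in the transcript above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenRC or k3s): print every controller
# listed in cgroup.controllers that is absent from cgroup.subtree_control.
# Usage: missing_controllers [CGROUP_DIR]   (defaults to /sys/fs/cgroup)
missing_controllers() {
    dir="${1:-/sys/fs/cgroup}"
    available=""
    enabled=""
    # Both files hold a single space-separated line of controller names.
    read -r available < "${dir}/cgroup.controllers" || true
    read -r enabled < "${dir}/cgroup.subtree_control" || true
    for c in ${available}; do
        case " ${enabled} " in
            *" ${c} "*) ;;          # already enabled for children
            *) echo "${c}" ;;       # offered by the kernel, not enabled
        esac
    done
}
```

On the box above, running missing_controllers after the first echo would presumably list cpu, since that is the one write that failed.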
One further finding, sorry for the spam.
I CAN enable "+cpu" in cgroup.subtree_control if I do that via tty1 BEFORE LOGGING IN via lxdm. But afterwards it always ends up in EINVAL.

The following might be related (from Documentation/admin-guide/cgroup-v2.rst):

WARNING: cgroup2 doesn't yet support control of realtime processes and the cpu controller can only be enabled when all RT processes are in the root cgroup. Be aware that system management software may already have placed RT processes into nonroot cgroups during the system boot process, and these processes may need to be moved to the root cgroup before the cpu controller can be enabled.
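If the kernel warning quoted above is indeed the cause, the offending processes would be realtime ones (scheduling class FF or RR) sitting outside the root cgroup. The filter below is a sketch under assumptions: the name rt_outside_root is hypothetical, the input is expected as "pid class cgroup" lines, and "0::/" is assumed to be how the root cgroup appears on a cgroup v2 only system.

```shell
#!/bin/sh
# Hypothetical filter: read "PID CLS CGROUP" lines on stdin and print the
# ones whose scheduling class is realtime (FF = SCHED_FIFO, RR = SCHED_RR)
# and whose cgroup is not the root cgroup.
# Assumption: the root cgroup is printed as "0::/" on a cgroup v2 only system.
rt_outside_root() {
    awk '($2 == "FF" || $2 == "RR") && $3 != "0::/" { print $1, $2, $3 }'
}
```

Assuming a procps-style ps that supports the cls and cgroup output fields, it could be fed with: ps -e -o pid= -o cls= -o cgroup= | rt_outside_root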
OK, I've figured this out: the bug is in openrc.

/etc/init.d/cgroups has:

    start()
    {
        # set up kernel support for cgroups
        if [ -d /sys/fs/cgroup ]; then
            mount_cgroups
            restorecon_cgroups
        fi
        return 0
    }

The "mount_cgroups" func handles the three different cases (legacy, hybrid and unified).
Important! Note that "unified" is the default case here.

    mount_cgroups()
    {
        case "${rc_cgroup_mode:-unified}" in
            hybrid) cgroups_hybrid ;;
            legacy) cgroups_legacy ;;
            unified) cgroups_unified ;;
        esac
        return 0
    }

In turn, the "cgroups_unified" func reads:

    cgroups_unified()
    {
        cgroup2_base
        cgroup2_controllers
        return 0
    }

The cgroup2_base func just mounts /sys/fs/cgroup; the interesting one is "cgroup2_controllers".
A snip of the func follows. Note how "unified" is NOT the default anymore.

    cgroup2_controllers()
    [...]
        read -r active < "${cgroup_path}/cgroup.controllers"
        for x in ${active}; do
            case "$rc_cgroup_mode" in
                unified)
                    echo "+${x}" > "${cgroup_path}/cgroup.subtree_control"
                    ;;
                hybrid)
    [...]

This leads to the issue I was experiencing: when rc_cgroup_mode is left unset (the default) in /etc/rc.conf, the "unified" mode should be used (cgroup v2 only).
However, due to the inconsistency in the /etc/init.d/cgroups script presented above, the cgroup v2 fs gets mounted but the controllers are not enabled: with the variable unset, the case in cgroup2_controllers matches no branch. And once the system boot progresses, it becomes too late for the "cpu" controller to be enabled manually.

The workaround is to explicitly set rc_cgroup_mode="unified".
A proper fix belongs in /etc/init.d/cgroups. Shall I file a bug? Where?

Now, back to this very bug: it can be closed.
Sorry for the noise, it took a while to figure this out.
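To make the inconsistency concrete, here is a minimal sketch of one way the controller loop could apply the same default as mount_cgroups. This is NOT the actual OpenRC code or patch: the function name enable_controllers and its two positional arguments are made up for illustration, and the other modes stay elided just as in the snippet above.

```shell
#!/bin/sh
# Hypothetical sketch, not the real /etc/init.d/cgroups.
# Usage: enable_controllers MODE CGROUP_PATH
# An empty MODE models rc_cgroup_mode being unset in /etc/rc.conf.
enable_controllers() {
    mode="${1:-unified}"    # same default that mount_cgroups applies
    cgroup_path="$2"
    [ -r "${cgroup_path}/cgroup.controllers" ] || return 1
    active=""
    read -r active < "${cgroup_path}/cgroup.controllers" || true
    for x in ${active}; do
        case "${mode}" in
            unified)
                # On a real cgroup2 mount each write enables one controller.
                # The append (>>) is only here so the sketch can also be
                # tried against a scratch directory of regular files.
                echo "+${x}" >> "${cgroup_path}/cgroup.subtree_control"
                ;;
            # other modes elided, as in the snippet quoted above
        esac
    done
    return 0
}
```

With the default applied in one place, an unset mode and an explicit "unified" take the same branch, which is the behaviour the explicit rc_cgroup_mode="unified" workaround forces today.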
(In reply to acab from comment #6)
> ok i've figured this out: the bug is in openrc
>

I think it should be okay to just rename this bug / move it into the OpenRC component, unless I'm misunderstanding the issue.

> [...]
(In reply to Sam James from comment #7)
> (In reply to acab from comment #6)
> > ok i've figured this out: the bug is in openrc
> >
>
> I think it should be okay to just rename this bug / move it into the OpenRC
> component, unless I'm misunderstanding the issue.
>
> > [...]

No, that's correct. Go ahead.
Thanks
Does https://github.com/OpenRC/openrc/pull/669 look sufficient? I've not tested it yet.
(In reply to Sam James from comment #9)
> Does https://github.com/OpenRC/openrc/pull/669 look sufficient? I've not
> tested it yet.

Just tested it and it works perfectly.
Thanks!
Thanks!
This will be fixed in 0.52.