
Use systemd cgroup manager when kubelet/containerd are managed by systemd unit #4099

Closed · Fixed by #4236
fabriziopandini opened this issue Jan 21, 2021 · 3 comments

Labels: area/control-plane (Issues or PRs related to control-plane lifecycle management) · area/upgrades (Issues or PRs related to upgrades) · kind/feature (Categorizes issue or PR as related to a new feature) · lifecycle/active (Indicates that an issue or PR is actively being worked on by a contributor) · priority/important-soon (Must be staffed and worked on either currently, or very soon, ideally in time for the next release)
Milestone: v0.4.0

fabriziopandini (Member) commented Jan 21, 2021

User Story

As a user, I would like to use systemd cgroup driver when kubelet/containerd are managed by systemd unit.

Detailed Description

Image builder uses systemd units to run both the kubelet and containerd; systemd allocates a cgroup per systemd unit, whereas both the kubelet and containerd use cgroupfs as their default cgroup driver.

As a result, there will be two different cgroup managers on each machine, which leads to two conflicting views of the machine's resources. In the field, people have reported cases where such systems become unstable under resource pressure (see here for more context).

This is not ideal; instead, we should make sure everything is configured to use the systemd cgroup driver only.
Making this happen requires a coordinated, multi-project effort:

  • We should make sure the image builder configures containerd to use the systemd cgroup driver. This effort is tracked in "containerd should be using systemd cgroup hierarchy" image-builder#471; the assumption is that this will be the default for images with Kubernetes version >= v1.21.
  • We should make sure that kubeadm defaults the KubeletConfiguration to the systemd cgroup driver. This effort is tracked in "default the 'cgroupDriver' setting of the kubelet to 'systemd'" kubernetes/kubeadm#2376; again, the assumption is that this will be the default for clusters initialized with Kubernetes version >= v1.21.
  • We should make sure that the KubeadmControlPlane updates the KubeletConfiguration when executing upgrades from v1.20 to v1.21. This effort should be tracked in this repository and backported to the v0.3 branch.
  • TBD how to handle this change in CAPD
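Concretely, the kubeadm change in the second bullet amounts to defaulting a single field in the KubeletConfiguration. A minimal sketch of the resulting configuration (the exact file layout kubeadm generates is not shown here; only the relevant field is):

```yaml
# Sketch of the KubeletConfiguration default proposed for v1.21+.
# The only setting this issue is concerned with is cgroupDriver.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd   # previously defaulted to "cgroupfs"
```

On the containerd side (first bullet), the matching setting is `SystemdCgroup = true` under the runc runtime options in containerd's `config.toml`, which is what image-builder#471 tracks.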

Anything else you would like to add:

We should account for configurations not using containerd, configurations not using systemd, and configurations not using kubeadm as a bootstrap/control-plane provider.

Given that both the image builder and kubeadm allow overriding the default, and that the KubeadmControlPlane is optional, I don't see blockers for these scenarios; but if anyone has more context on these use cases, please comment.
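For setups that intentionally stay on cgroupfs, kubeadm's override path is a multi-document config file passed to `kubeadm init --config`. A hedged sketch (the version numbers are illustrative):

```yaml
# Sketch: explicitly keeping cgroupfs on a machine where neither
# containerd nor the kubelet is managed by systemd units.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.21.0
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: cgroupfs
```

Whatever value is chosen, the kubelet and the container runtime must agree on it; the instability described above comes from the two drivers diverging, not from either driver per se.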

This was discussed during CAPI office hours on the 20th of January.

/kind feature

@k8s-ci-robot added the kind/feature label Jan 21, 2021
@fabriziopandini added the area/control-plane and area/upgrades labels Jan 21, 2021
fabriziopandini (Member, Author) commented:

/milestone v0.4.0
/priority important-soon

/assign

@k8s-ci-robot k8s-ci-robot added this to the v0.4.0 milestone Jan 21, 2021
@k8s-ci-robot added the priority/important-soon label Jan 21, 2021
neolit123 (Member) commented:

I've sent the PR for moving to systemd in kubeadm 1.21:
kubernetes/kubernetes#99471

fabriziopandini (Member, Author) commented:

/lifecycle active

@k8s-ci-robot added the lifecycle/active label Mar 10, 2021