
Fix io limiting with cgroup v2 #9457

Closed
wants to merge 1 commit into from

Conversation

@Furisto (Member) commented Apr 21, 2022

Description

Fix io limiting with cgroup v2.
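For context, a minimal sketch (not this PR's actual diff) of what an IO limit looks like under cgroup v2: the per-device throttles that v1 spread across blkio.throttle.read_bps_device and friends collapse into a single io.max file keyed by device major:minor. The cgroup path and the 8:0 device numbers below are illustrative assumptions.

```go
// Illustrative sketch only: write a cgroup v2 IO limit.
// The cgroup path and the 8:0 (sda) device numbers are assumptions.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// setIOMax writes a bandwidth limit for one device into io.max.
// Format: "MAJ:MIN rbps=<bytes/s> wbps=<bytes/s> riops=<ops/s> wiops=<ops/s>";
// omitted keys are left unchanged and "max" clears a limit.
func setIOMax(cgroup string, major, minor uint32, rbps, wbps uint64) error {
	limit := fmt.Sprintf("%d:%d rbps=%d wbps=%d", major, minor, rbps, wbps)
	return os.WriteFile(filepath.Join(cgroup, "io.max"), []byte(limit), 0644)
}

func main() {
	// Limit device 8:0 to 100 MiB/s reads and writes for this workspace cgroup.
	if err := setIOMax("/sys/fs/cgroup/workspace", 8, 0, 100<<20, 100<<20); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```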

Related Issue(s)

n.a.

How to test

https://www.loom.com/share/3fc4a931fe3c4541b673650f86d56430

Release Notes

Ensure io limiting of workspaces works with cgroup v2

Documentation

@Furisto (Member Author) commented Apr 21, 2022

/werft run

👍 started the job as gitpod-build-fo-fix-v2-io-limit.1

github-actions bot added the label team: workspace (Issue belongs to the Workspace team) on Apr 21, 2022
@aledbf (Member) commented Apr 21, 2022

@Furisto could you refactor v1 and this plugin to share the device logic?

@csweichel (Contributor)

Also, the user cgroup is modifiable by the user, i.e. they can disable their own limits.

@Furisto (Member Author) commented Apr 21, 2022

@csweichel Good point. Actually, even if we hid that within another delegated cgroup namespace, you could always create another child cgroup and move your IO-intensive process into it 🤔
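A minimal sketch of the escape being described (hypothetical paths, not code from this PR): inside a delegated subtree, anything that can write to the cgroup fs can create a child cgroup and migrate itself into it via cgroup.procs.

```go
// Hypothetical sketch of the escape described above: create a child
// cgroup inside a delegated subtree and move this process into it.
// The /sys/fs/cgroup/workspace path is an assumption.
package main

import (
	"os"
	"path/filepath"
	"strconv"
)

func main() {
	child := "/sys/fs/cgroup/workspace/escape"
	if err := os.Mkdir(child, 0755); err != nil && !os.IsExist(err) {
		panic(err)
	}
	// Writing our PID to the child's cgroup.procs migrates this process;
	// if the parent's io.max does not take effect there, IO-heavy work
	// now runs without the intended limit.
	pid := strconv.Itoa(os.Getpid())
	if err := os.WriteFile(filepath.Join(child, "cgroup.procs"), []byte(pid), 0644); err != nil {
		panic(err)
	}
}
```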

@csweichel (Contributor)

Aren't cgroups hierarchical? I.e. you can only act within the context/limits of your parent cgroup?

@Furisto (Member Author) commented Apr 21, 2022

I would have expected that, but that does not appear to be the case with regard to io.max limits.

Furisto marked this pull request as draft on April 21, 2022 15:14
@csweichel (Contributor)

kernel bug? :P

@utam0k (Contributor) commented Apr 22, 2022

😭

@utam0k (Contributor) commented Apr 22, 2022

Worst case, why not walk the tree structure of the whole cgroupfs and apply the limits there? In cgroup v2 a process can only belong to a leaf cgroup. However, we would need to check whether that search can realistically terminate in finite time.

Or, monitor the cgroup notifications. We can use cgroup.events to monitor when a process is registered or deleted. The implementation would be tough, but it should be possible to use this to detect when a user creates a new cgroup and moves a process.
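A rough sketch of the second idea, assuming golang.org/x/sys/unix and an illustrative cgroup path: the kernel emits an inotify modify event on cgroup.events when its populated state flips, so an agent could watch that file and react, e.g. by re-applying limits.

```go
// Rough sketch of watching cgroup.events via inotify, as proposed
// above. The kernel generates a modify event on this file when its
// populated/frozen state changes. The watched path is an assumption.
package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	const events = "/sys/fs/cgroup/workspace/cgroup.events"

	fd, err := unix.InotifyInit1(0)
	if err != nil {
		panic(err)
	}
	defer unix.Close(fd)

	if _, err := unix.InotifyAddWatch(fd, events, unix.IN_MODIFY); err != nil {
		panic(err)
	}

	buf := make([]byte, 4096)
	for {
		// Blocks until the file changes; at that point we would re-read
		// it and, e.g., re-apply io.max to the newly populated cgroup.
		if _, err := unix.Read(fd, buf); err != nil {
			panic(err)
		}
		data, _ := os.ReadFile(events)
		fmt.Printf("cgroup.events changed:\n%s", data)
	}
}
```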

@Furisto (Member Author) commented Apr 24, 2022

> Worst case, why not walk the tree structure of the whole cgroupfs and apply the limits there? In cgroup v2 a process can only belong to a leaf cgroup. However, we would need to check whether that search can realistically terminate in finite time.
>
> Or, monitor the cgroup notifications. We can use cgroup.events to monitor when a process is registered or deleted. The implementation would be tough, but it should be possible to use this to detect when a user creates a new cgroup and moves a process.

  • When the limit is removed from the cgroup the process resides in, the process no longer has any limits. As far as I can tell, the limits of the parent have no effect, so applying the limit to the whole tree structure is not effective.

  • I am pretty sure the user could make the modification and then make the file read-only, mount tmpfs over it, or pull other shenanigans so that subsequent writes have no effect.

@kylos101 (Contributor)

@Furisto can we close this for now and reopen it when we decide to resume the work? I'm asking so we can remove it from our list of active PRs. 🙏

Furisto closed this on May 25, 2022
@aledbf (Member) commented May 25, 2022

@Furisto another thing we need to test is the switch to systemd instead of cgroupfs as the cgroup driver for cgroup v2. Not sure if the behavior is the same.

@aledbf (Member) commented May 25, 2022

If we want to upgrade to Kubernetes 1.24, we need to switch to the systemd cgroup driver: k3s-io/k3s#5462

@Furisto (Member Author) commented May 25, 2022

The most significant change will be that the location of the container's cgroup changes. Since the cgroup is created by systemd, it will live in the systemd-managed subtree of the cgroup fs. Considering that we can get the cgroup path from containerd, the change will probably be transparent to us and no code will need to be adapted.

From a resource-control standpoint nothing changes: runc connects to systemd via D-Bus and instructs it to write the resource limits into the cgroup fs. So this is just an indirection; everything still ends up in the cgroup fs, just in a different location. We would still write to the cgroup fs directly in order to control resources.
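A minimal sketch of why the driver switch could stay transparent (assumptions: stdlib only, a unified cgroup v2 mount at /sys/fs/cgroup): resolving a container's cgroup from its PID via /proc/&lt;pid&gt;/cgroup yields the right directory whether systemd or cgroupfs created it.

```go
// Sketch: resolve a process's cgroup v2 directory from /proc/<pid>/cgroup.
// Under cgroup v2 the file holds a single "0::<path>" entry relative to
// the cgroup fs mount, regardless of which driver created the cgroup.
package main

import (
	"fmt"
	"os"
	"strings"
)

func cgroupPathForPID(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/cgroup", pid))
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(strings.TrimSpace(string(data)), "\n") {
		// e.g. "0::/kubepods.slice/.../cri-containerd-<id>.scope" under systemd
		if rest, ok := strings.CutPrefix(line, "0::"); ok {
			return "/sys/fs/cgroup" + rest, nil
		}
	}
	return "", fmt.Errorf("no cgroup v2 entry for pid %d", pid)
}

func main() {
	path, err := cgroupPathForPID(os.Getpid())
	if err != nil {
		panic(err)
	}
	fmt.Println(path)
}
```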
