-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Detect whether Delegate is available on both slices and scopes #1776
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
You need to sign your commit with the DCO. Please look at the contributing docs for information on the process. |
I've copied |
@crosbymichael Will do, I'll update the commit description and force-push it here. @adelton The problem you reported in kubernetes/kubernetes#61474 is:
Which this commit should fix. The problem you reported in the comment above (kubelet startup hanging) is actually fixed by #1772. So I think you need both :-) While #1772 by fixes the problem of kubelet hanging on systemd v236 and below, that commit by itself won't fix the problem you're seeing starting kubelet on systemd v238, then this commit will fix that too. Can you try with both of them cherry-picked? Thanks! |
Thank you. Now I took #1776 and cherry-picked 8ab251f from #1772 on top of that, and copied
message and node not getting ready. |
Thanks for testing @adelton I'm working with the hypothesis that systemd v238 will not return an error that Delegate is read-only (which is what the code is checking) but that the attribute does not exist (which is effectively the change that went in on v237.) Will perform some additional tests using busctl and update this code to do that instead. (In fact, I think I used to have a "go run" test case I could use on v238 too, let me dig that up.) I'll let you know when I have a new version you can test. Cheers, |
Starting with systemd 237, in preparation for cgroup v2, delegation is only now available for scopes, not slices. Update libcontainer code to detect whether delegation is available on both and use that information when creating new slices. Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Yes that was it... I just added @adelton Can you please test this again in your environment? Note that #1772 is now merged and I rebased this branch on top of that one, so if you just sync the vendor/ tree in Kubernetes to where this latest commit of mine is, you should be good to go... It would be nice to confirm that that's indeed fixed before we merge this! Cheers, |
Why on earth... cgroupv2 still isn't ready yet and they're already removing features from systemd and breaking things "just for fun" because "everyone should just switch to cgroupv2 -- ignore that important controllers are totally absent from cgroupv2, and systemd doesn't support setting up different paths for different controllers because of cgroupv2". Not to mention that the scope/slice distinction is entirely an invention of systemd. All of that being said, if they've broken it then we will need this patch (once tested). Although I believe that adding |
I tested it with a small 15-line Go program creating a slice. It fails with I'd still prefer to wait for @adelton to confirm that it does fix the issue in Kubernetes, just to make sure there's nothing else lurking there... Cheers, |
@umohnani is also testing this out. Thanks! |
This patch fixed https://bugzilla.redhat.com/show_bug.cgi?id=1558425 for me. See also https://pagure.io/atomic-wg/issue/452#comment-505389 |
Awesome, thanks for testing! I think we're good to go... Merge? |
Once this one is in, I'll update kubernetes/kubernetes#61926 to include everything including this PR to fix the issue in Kubernetes code base too. Cheers! |
Works for me also! |
works here as well 👍 |
PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that makes Kubelet startup hang when using systemd cgroup driver (by adding a timeout) and further PR opencontainers/runc#1772 fixes that bug by checking the proper error status before waiting on the channel. PR opencontainers/runc#1776 checks whether Delegate works in slices, which keeps libcontainer systemd cgroup driver working on systemd v237+. PR opencontainers/runc#1781 makes the channel buffered, so if we time out waiting on the channel, the updater will not block trying to it since there are no longer any consumers.
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Update libcontainer to include PRs with fixes to systemd cgroup driver **What this PR does / why we need it**: PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that makes Kubelet startup hang when using systemd cgroup driver (by adding a timeout) and further PR opencontainers/runc#1772 fixes that bug by checking the proper error status before waiting on the channel. PR opencontainers/runc#1776 checks whether Delegate works in slices, which keeps libcontainer systemd cgroup driver working on systemd v237+. **Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*: Fixes #61474 **Special notes for your reviewer**: /assign @derekwaynecarr cc @vikaschoudhary16 @sjenning @adelton @mrunalp **Release note**: ```release-note NONE ```
PR opencontainers/runc#1754 works around an issue in manager.Apply(-1) that makes Kubelet startup hang when using systemd cgroup driver (by adding a timeout) and further PR opencontainers/runc#1772 fixes that bug by checking the proper error status before waiting on the channel. PR opencontainers/runc#1776 checks whether Delegate works in slices, which keeps libcontainer systemd cgroup driver working on systemd v237+. PR opencontainers/runc#1781 makes the channel buffered, so if we time out waiting on the channel, the updater will not block trying to it since there are no longer any consumers.
Starting with systemd 237, in preparation for cgroup v2, delegation is only now available for scopes, not slices.
Update libcontainer code to detect whether delegation is available on both and use that information when creating new slices.
Fixes kubernetes/kubernetes#61474.
/cc @adelton, @jmontleon, @derekwaynecarr.