Initial NTH v2 design proposal #556
Big fan of the approach 👍
* IMDS — (EC2) Instance Metadata Service
* k8s — Kubernetes
* CRD — Custom Resource Definitions
* Spot ITN — Spot Interruption Termination Event
nit: v1 - Current/existing implementation of NTH
NTH Queue-Processor mode has accumulated a wealth of configuration options, which can make it cumbersome to install. Configuration options and different modes of operation have caused the Helm chart to become large and complex.
nit: example of helm installation step with a realistic config of nth
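For illustration only (a sketch, not part of the proposal): the value names below follow the public aws-node-termination-handler v1 Helm chart for queue-processor mode, but the queue URL, account, region, and webhook values are placeholders.

```yaml
# values.yaml for the v1 aws-node-termination-handler Helm chart,
# queue-processor mode (illustrative; URLs and region are placeholders)
enableSqsTerminationDraining: true
awsRegion: us-east-1
queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/nth-queue
enableSpotInterruptionDraining: true
enableRebalanceMonitoring: true
checkASGTagBeforeDraining: false
webhookURL: https://hooks.example.com/nth
webhookTemplate: '{"text":"[NTH] {{ .Kind }} on {{ .NodeName }}"}'
```

Installed with something like `helm install aws-node-termination-handler eks/aws-node-termination-handler -n kube-system -f values.yaml`, and this is still only a subset of the available options.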
```yaml
template: |
  {"text":"[NTH][Instance Interruption] EventID: {{ .EventID }} - Kind: {{ .Kind }} - Instance: {{ .InstanceID }} - Node: {{ .NodeName }} - Description: {{ .Description }} - Start Time: {{ .StartTime }}"}
...
```
related to comment above, a comparison of an equivalent QP/IMDS NTH config to this would make some of the benefits clearer
Logical Terminators allow users to customize termination of specific node groups or provisioner-managed nodes using a node selector. For example, a logical Terminator could be configured to respond to events on nodes with the label `training-group-1`. Nodes in `training-group-1` may need some extra time to drain, which can be configured on the Terminator resource via the `pod-termination-grace-period` and the `node-termination-grace-period`. Other groups could use a second logical Terminator resource with more aggressive grace periods.
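A two-group setup like the one described above might look roughly as follows. This is only a sketch: the API group/version, label key, and field layout are assumptions; only the two grace-period names come from the proposal text.

```yaml
# Hypothetical sketch: a slow-draining Terminator for training-group-1...
apiVersion: nth.k8s.aws/v1alpha1    # assumed group/version
kind: Terminator
metadata:
  name: training-group-1-terminator
spec:
  nodeSelector:
    node-group: training-group-1    # assumed label key
  pod-termination-grace-period: 10m
  node-termination-grace-period: 15m
---
# ...and a second Terminator with more aggressive grace periods
apiVersion: nth.k8s.aws/v1alpha1
kind: Terminator
metadata:
  name: default-terminator
spec:
  nodeSelector:
    node-group: default
  pod-termination-grace-period: 30s
  node-termination-grace-period: 1m
```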
Could be useful to show both of these configurations in the example below, to make it more clear how the different node groups are selected and handled.
This approach looks great, but have you considered splitting the monitoring and termination workloads? A rough suggestion would be a new project containing a termination controller and CRDs for termination events and termination configuration. The controller would watch for termination event CRDs being created and process them with the relevant termination configuration. The termination controller would be abstracted and wouldn't be tied to any specific implementation for creating the termination events. The AWS NTH v2 could then just be a producer of termination events, as could Karpenter. Alternatively, with the same logic as above, the AWS NTH v2 could combine the monitor controller and termination controller, which would still support Karpenter.
We've definitely thought about breaking out the Interruption Monitoring logic and Termination Handling logic to support custom use-cases like you're talking about. I don't think it would make sense from a maintainability perspective to separate the two into separate projects, but I definitely agree that they need to be separated logically into separate controllers and be very loosely coupled from one another.

The Termination CRD is an interesting idea. I was thinking this would be a little heavy for the use-case and was thinking more along the lines of applying a label to the node (this is how Karpenter currently operates). The Termination CRD may give a looser coupling from Interruption Events to Termination, since the CRD could contain the exact drain configuration and node set for a workflow-style drain (i.e. handle PDBs within the drain group, potentially drain LoadBalancers, support AMI upgrade strategies, etc.).
@bwagner5 would this enable NTH to support multiple monitoring controllers and maybe be cloud agnostic?
I was thinking a CRD would make it easier to handle termination lifecycles; it would allow multiple termination events against a single node, with a status for the termination to be captured and K8s events to be generated. I was still expecting the termination logic to come from a separate CRD and be mapped from the termination CRD (e.g. SPOT vs NODE_REFRESH).
Yes!
I think we're on the same page here. The setup of interruption events to the termination actions would still be in the Terminator CRD (main NTH CRD). The Termination CRD would contain the exact termination process defined for the workflow from the Terminator CRD. We can explore how this works in more detail as we get a more detailed design document.
Just a couple of further thoughts and obviously the actual design still needs to be done, but using a termination CRD named for the node would allow termination triggers to be aggregated and the status of the termination to be captured and monitored. It'd also be great if the terminator CRD could be selected based on the termination CRD metadata (e.g. termination type) and node metadata (e.g. labels). The following termination patterns could then be supported and combined.
As an example, a capacity rebalance termination trigger might start a cordon-and-wait pattern, but if a terminate-immediately trigger was added for that node, then the cordon-and-wait could be skipped and the default pattern attempted immediately. Equally, it should be possible to stop a termination by flagging the CRD as cancelled (or deleting it).
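The override-and-cancel behaviour described above can be sketched as a toy trigger resolver. This is plain Python with every name hypothetical; in practice this logic would live in the terminator controller and operate on CRD objects rather than dicts.

```python
# Toy model (hypothetical, not NTH code): aggregate termination triggers
# for one node and resolve them to a single termination pattern.

# Pattern precedence: higher wins, so a terminate-immediately trigger
# overrides a gentler cordon-and-wait that is already in progress.
PATTERNS = {"CORDON_AND_WAIT": 0, "DRAIN": 1, "IMMEDIATE": 2}

def resolve(triggers):
    """triggers: list of dicts like {"pattern": ..., "cancelled": bool}."""
    active = [t for t in triggers if not t.get("cancelled")]
    if not active:
        return None  # all triggers cancelled: stop the termination
    return max(active, key=lambda t: PATTERNS[t["pattern"]])["pattern"]

# A capacity-rebalance trigger starts cordon-and-wait...
triggers = [{"pattern": "CORDON_AND_WAIT"}]
assert resolve(triggers) == "CORDON_AND_WAIT"
# ...then a terminate-immediately trigger arrives and takes over.
triggers.append({"pattern": "IMMEDIATE"})
assert resolve(triggers) == "IMMEDIATE"
# Cancelling every trigger stops the termination entirely.
for t in triggers:
    t["cancelled"] = True
assert resolve(triggers) is None
```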
This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this PR to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.
@bwagner5 has there been any progress on the designs for v2? |
@bwagner5 I've been thinking about the separate terminator component discussion further, and I can't see how it would make the system more complex; if anything it would simplify everything by decoupling the development. In this model, if you have something like Karpenter installed on a MNG, you don't need NTH, just the terminator (or just MNGs). By separate terminator I mean a system that watches for:

```yaml
apiVersion: v1
kind: TerminationRequest
metadata:
  name: nth-terminate-node-1-20220209000000
spec:
  node: node1
  source:
    name: aws-node-termination-handler
    reason: INSTANCE_REFRESH
```

The service-to-service contract for the terminator service would just be the `TerminationRequest` resource.

Also, most importantly, I'd suggest the name Krowbar for this service: "Karpenters can use krowbars, but so can anyone else"!
Couple of comments:

* NTH is not necessary on EKS Spot managed node groups, because they have their own managed termination handler.
* PDBs are respected with NTH v1 and will continue to be with v2, since we are using the k8s eviction API.
```yaml
SPOT_ITN: ["CORDON", "DRAIN"]
SPOT_REBALANCE: ["NO_ACTION"]
ASG_TERMINATION: ["CORDON", "DRAIN"]
```
Why would a user want to configure these? Are there sensible defaults they can use instead?
This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this PR to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.
This PR was closed because it has become stale with no activity. |
Issue #, if available:
N/A
Description of changes:
To view with markdown rendering: https://github.com/bwagner5/aws-node-termination-handler/blob/v2-proposal/designs/v2.md
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.