Initial NTH v2 design proposal #556
Big fan of the approach 👍
* IMDS — (EC2) Instance Metadata Service
* k8s — Kubernetes
* CRD — Custom Resource Definitions
* Spot ITN — Spot Interruption Termination Event
nit: v1 - Current/existing implementation of NTH
NTH Queue-Processor mode has accumulated a wealth of configuration options, which can make it cumbersome to install. Configuration options and different modes of operation have caused the Helm chart to become large and complex.
nit: example of helm installation step with a realistic config of nth
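For illustration only (a sketch, not part of the proposal): the value names below follow the public aws-node-termination-handler v1 Helm chart for queue-processor mode, but the queue URL, account, region, and webhook values are placeholders.

```yaml
# values.yaml for the v1 aws-node-termination-handler Helm chart,
# queue-processor mode (illustrative; URLs and region are placeholders)
enableSqsTerminationDraining: true
awsRegion: us-east-1
queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/nth-queue
enableSpotInterruptionDraining: true
enableRebalanceMonitoring: true
checkASGTagBeforeDraining: false
webhookURL: https://hooks.example.com/nth
webhookTemplate: '{"text":"[NTH] {{ .Kind }} on {{ .NodeName }}"}'
```

Installed with something like `helm install aws-node-termination-handler eks/aws-node-termination-handler -n kube-system -f values.yaml`, and this is still only a subset of the available options.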
```yaml
template: |
  {"text":"[NTH][Instance Interruption] EventID: {{ .EventID }} - Kind: {{ .Kind }} - Instance: {{ .InstanceID }} - Node: {{ .NodeName }} - Description: {{ .Description }} - Start Time: {{ .StartTime }}"}
...
```
related to comment above, a comparison of an equivalent QP/IMDS NTH config to this would make some of the benefits clearer
Logical Terminators allow users to customize termination of specific node groups or provisioner-managed nodes using a node selector. For example, a logical Terminator could be configured to respond to events on nodes with the label `training-group-1`. Nodes in `training-group-1` may need some extra time to drain, which can be configured on the Terminator resource via the `pod-termination-grace-period` and the `node-termination-grace-period`. Other groups could use a second logical Terminator resource with more aggressive grace periods.
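A two-group setup like the one described above might look roughly as follows. This is only a sketch: the API group/version, label key, and field layout are assumptions; only the two grace-period names come from the proposal text.

```yaml
# Hypothetical sketch: a slow-draining Terminator for training-group-1...
apiVersion: nth.k8s.aws/v1alpha1    # assumed group/version
kind: Terminator
metadata:
  name: training-group-1-terminator
spec:
  nodeSelector:
    node-group: training-group-1    # assumed label key
  pod-termination-grace-period: 10m
  node-termination-grace-period: 15m
---
# ...and a second Terminator with more aggressive grace periods
apiVersion: nth.k8s.aws/v1alpha1
kind: Terminator
metadata:
  name: default-terminator
spec:
  nodeSelector:
    node-group: default
  pod-termination-grace-period: 30s
  node-termination-grace-period: 1m
```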
Could be useful to show both of these configurations in the example below, to make it more clear how the different node groups are selected and handled.
This approach looks great, but have you considered splitting the monitoring and termination workloads? A rough suggestion would be a new project containing a termination controller and CRDs for termination events and termination configuration. The controller would watch for termination event CRDs being created and process them with the relevant termination configuration. The termination controller would be abstracted and wouldn't be tied to any specific implementation for creating the termination events. The AWS NTH v2 could then just be a producer of termination events, as could Karpenter. Alternatively, with the same logic as above, the AWS NTH v2 could combine the monitor controller and termination controller, which would still support Karpenter.
We've definitely thought about breaking out the Interruption Monitoring logic and Termination Handling logic to support custom use-cases like you're talking about. I don't think it would make sense from a maintainability perspective to separate the two into separate projects, but I definitely agree that they need to be separated logically into separate controllers and be very loosely coupled from one another.

The Termination CRD is an interesting idea. I was thinking this would be a little heavy for the use-case and was thinking more along the lines of applying a label to the node (this is how Karpenter currently operates). The Termination CRD may give a looser coupling from Interruption Events to Termination, since the CRD could contain the exact drain configuration and node set for a workflow-style drain (i.e. handle PDBs within the drain group, potentially drain LoadBalancers, support AMI upgrade strategies, etc.).
@bwagner5 would this enable NTH to support multiple monitoring controllers and maybe be cloud agnostic?
I was thinking a CRD would make it easier to handle termination lifecycles; it would allow multiple termination events against a single node, with a status for the termination to be captured and K8s events to be generated. I was still expecting the termination logic to come from a separate CRD and be mapped from the termination CRD (e.g. SPOT vs NODE_REFRESH).
Yes!
I think we're on the same page here. The setup of interruption events to the termination actions would still be in the Terminator CRD (main NTH CRD). The Termination CRD would contain the exact termination process defined for the workflow from the Terminator CRD. We can explore how this works in more detail as we get a more detailed design document.
Just a couple of further thoughts and obviously the actual design still needs to be done, but using a termination CRD named for the node would allow termination triggers to be aggregated and the status of the termination to be captured and monitored. It'd also be great if the terminator CRD could be selected based on the termination CRD metadata (e.g. termination type) and node metadata (e.g. labels). The following termination patterns could then be supported and combined.
As an example, a capacity rebalance termination trigger might start a cordon-and-wait pattern, but if a terminate-immediately trigger was added for that node, then the cordon-and-wait could be skipped and the default pattern attempted immediately. Equally, it should be possible to stop a termination by flagging the CRD as cancelled (or deleting it).
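The override-and-cancel behaviour described above can be sketched as a toy trigger resolver. This is plain Python with every name hypothetical; in practice this logic would live in the terminator controller and operate on CRD objects rather than dicts.

```python
# Toy model (hypothetical, not NTH code): aggregate termination triggers
# for one node and resolve them to a single termination pattern.

# Pattern precedence: higher wins, so a terminate-immediately trigger
# overrides a gentler cordon-and-wait that is already in progress.
PATTERNS = {"CORDON_AND_WAIT": 0, "DRAIN": 1, "IMMEDIATE": 2}

def resolve(triggers):
    """triggers: list of dicts like {"pattern": ..., "cancelled": bool}."""
    active = [t for t in triggers if not t.get("cancelled")]
    if not active:
        return None  # all triggers cancelled: stop the termination
    return max(active, key=lambda t: PATTERNS[t["pattern"]])["pattern"]

# A capacity-rebalance trigger starts cordon-and-wait...
triggers = [{"pattern": "CORDON_AND_WAIT"}]
assert resolve(triggers) == "CORDON_AND_WAIT"
# ...then a terminate-immediately trigger arrives and takes over.
triggers.append({"pattern": "IMMEDIATE"})
assert resolve(triggers) == "IMMEDIATE"
# Cancelling every trigger stops the termination entirely.
for t in triggers:
    t["cancelled"] = True
assert resolve(triggers) is None
```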
This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this PR to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.
@bwagner5 has there been any progress on the designs for v2? |
@bwagner5 I've been thinking about the separate terminator component discussion further, and I can't see how it would make the system more complex; if anything it would simplify everything by decoupling the development. In this model, if you have something like Karpenter installed on a MNG, you don't need NTH, just the terminator (or just MNGs). By separate terminator I mean a system that watches for:

```yaml
apiVersion: v1
kind: TerminationRequest
metadata:
  name: nth-terminate-node-1-20220209000000
spec:
  node: node1
  source:
    name: aws-node-termination-handler
    reason: INSTANCE_REFRESH
```

The service-to-service contract for the terminator service would just be the `TerminationRequest` resource.

Also, most importantly, I'd suggest the name Krowbar for this service: "Karpenters can use krowbars, but so can anyone else"!
Couple of comments:

* NTH is not necessary on EKS Spot managed node groups, because they have their own managed termination handler.
* PDBs are respected with NTH v1 and will continue to be with v2, since we are using the k8s eviction API.
```yaml
SPOT_ITN: ["CORDON", "DRAIN"]
SPOT_REBALANCE: ["NO_ACTION"]
ASG_TERMINATION: ["CORDON", "DRAIN"]
```
Why would a user want to configure these? Are there sensible defaults they can use instead?
This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want this PR to never become stale, please ask a maintainer to apply the "stalebot-ignore" label.
This PR was closed because it has become stale with no activity. |
Issue #, if available:
N/A
Description of changes:
To view with markdown rendering: https://github.com/bwagner5/aws-node-termination-handler/blob/v2-proposal/designs/v2.md
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.