This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Add drain-delay flag to wait before eviction #13

Merged (1 commit) on Feb 16, 2021

Conversation

@rtim75 (Contributor) commented Feb 9, 2021

This flag allows specifying a period of time for which the drainer should
wait after cordoning a node. This ensures that, for example, after an instance
refresh in EKS, pods don't sit in 'Pending' waiting for
the new node to come up.

@rtim75 (Contributor, Author) commented Feb 13, 2021

@pawelprazak could you take a look and let me know what you think?

@pawelprazak (Contributor)

Sounds useful, and backwards compatible. Thank you for your contribution.

@@ -45,6 +45,7 @@ func serveCmd() *cobra.Command {
Timeout: 60 * time.Second,
DeleteLocalData: true,
IgnoreAllDaemonSets: true,
DrainDelay: 0 * time.Second,
Contributor:

I like the fact it's zero for backward compatibility

@@ -39,6 +39,7 @@ type Options struct {
DeleteLocalData bool
Selector string
PodSelector string
DrainDelay time.Duration `mapstructure:"drain-delay"`
Contributor:

Sorry, I don't understand the significance of the mapstructure:"drain-delay" part. The code is not fresh in my mind like it used to be, but AFAIR there is no serialisation here. Happy to learn more.

@rtim75 (Contributor, Author) replied Feb 16, 2021:

I hope I got it right.
In cmd/kubedrainer/serve.go#181 you call settings.Parse(o.Drainer), which unmarshals the flags using mapstructure:

err := viper.Unmarshal(target, func(config *mapstructure.DecoderConfig) {
	config.ErrorUnused = false
	config.ZeroFields = false
})

I tested it without a mapstructure tag and it couldn't parse the flags properly: the value of DrainDelay stayed at the default, 0s.

@rtim75 (Contributor, Author) added on Feb 16, 2021:

In fact, I believe some other flags, such as pod-selector, cannot be parsed properly either.

I added a simple debug output at the end of the serve/Parse() function to check:

// Parse parses all flags and settings to options
func (o *ServeOptions) Parse(cmd *cobra.Command) error {
	settings.Bind(cmd.Flags()) // needs to be run inside the command and before any viper usage for flags to be visible

	if debug {
		log.Debug().Msgf("All keys: %+v", viper.AllKeys())
		log.Debug().Msgf("All settings: %+v", viper.AllSettings())
		cmd.Flags().VisitAll(func(flag *pflag.Flag) {
			log.Debug().Msgf("'%s' -> flag: '%+v' | setting: '%+v'", flag.Name, flag.Value, viper.Get(flag.Name))
		})
		log.Debug().Msgf("Settings: %+v", *o)
	}
	if err := settings.Parse(o.Kubernetes); err != nil {
		return err
	}
	if err := settings.Parse(o.Drainer); err != nil {
		return err
	}
	if err := settings.Parse(o.AWS); err != nil {
		return err
	}
	log.Debug().Msgf("Parsed settings: %+v", *o) // <-- this one
	return nil
}

and I see the following behaviour when starting kubedrainer with ./kubedrainer -d serve --pod-selector checking:

  • without a mapstructure tag for PodSelector:
{"level":"debug","time":1613463410,"message":"Settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00003a6e0}"}

{"level":"debug","time":1613463410,"message":"Parsed settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00003a6e0}"}

You can see that PodSelector (near the end of each JSON log line) doesn't change.

  • with the mapstructure tag mapstructure:"pod-selector":
{"level":"debug","time":1613463080,"message":"Settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00029e5f0}"}

{"level":"debug","time":1613463080,"message":"Parsed settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector:checking DrainDelay:0s} AWS:0xc00029e5f0}"}

You can verify it yourself, and if you want, I can work on this issue as well.

Contributor:

You're absolutely right, thank you for looking at the other fields as well.

@rtim75 (Contributor, Author) replied Feb 16, 2021:

Do you want me to continue in this PR, or should we merge this one and open a new PR to fix the other fields?

Contributor:

I'll merge this one, and if you would be so kind, we'll gladly accept the fix for the other fields (I was about to start working on it) :)

@rtim75 (Contributor, Author):

Sure, I can fix the other fields.

@pawelprazak pawelprazak merged commit 2f4909b into VirtusLab:master Feb 16, 2021
@pawelprazak (Contributor):

FYI: https://github.com/VirtusLab/kubedrainer/releases/tag/v0.0.10 was released
