Skip to content
This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

Add drain-delay flag to wait before eviction #13

Merged
merged 1 commit into from
Feb 16, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions cmd/kubedrainer/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -112,6 +112,7 @@ func drainerFlags(options *drainer.Options) *pflag.FlagSet {
flags.Duration("timeout", options.Timeout, "The length of time to wait before giving up, zero means infinite")
flags.StringP("selector", "l", options.Selector, "Selector (label query) to filter on")
flags.StringP("pod-selector", "", options.PodSelector, "Label selector to filter pods on the node")
flags.String("drain-delay", options.DrainDelay.String(), "For how long to wait before draining a node")
return flags
}

Expand Down
1 change: 1 addition & 0 deletions cmd/kubedrainer/serve.go
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ func serveCmd() *cobra.Command {
Timeout: 60 * time.Second,
DeleteLocalData: true,
IgnoreAllDaemonSets: true,
DrainDelay: 0 * time.Second,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the fact it's zero for backward compatibility

},
AWS: &autoscaling.Options{
LoopSleepTime: 10 * time.Second,
Expand Down
7 changes: 7 additions & 0 deletions pkg/drainer/drainer.go
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ type Options struct {
DeleteLocalData bool
Selector string
PodSelector string
DrainDelay time.Duration `mapstructure:"drain-delay"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I don't understand the significance of the mapstructure:"drain-delay" part, the code is not fresh in my mind like it used to be, but AFAIR there is no serialisation here, happy to learn more

Copy link
Contributor Author

@rtim75 rtim75 Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope I got it right.
In cmd/kubedrainer/serve.go#181 you call settings.Parse(o.Drainer) which unmarshalls flags using mapstructure:

err := viper.Unmarshal(target, func(config *mapstructure.DecoderConfig) {
	config.ErrorUnused = false
	config.ZeroFields = false
})

I tested it without a mapstructure tag and it couldn't parse flags properly: the value of DrainDelay stayed default - 0s.

Copy link
Contributor Author

@rtim75 rtim75 Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I believe, that some other flags, like for example pod-selector cannot be parsed properly as well:

I added simple debug output at the end of serve/Parse() function to check it:

// Parse parses all flags and settings to options
func (o *ServeOptions) Parse(cmd *cobra.Command) error {
	settings.Bind(cmd.Flags()) // needs to be run inside the command and before any viper usage for flags to be visible

	if debug {
		log.Debug().Msgf("All keys: %+v", viper.AllKeys())
		log.Debug().Msgf("All settings: %+v", viper.AllSettings())
		cmd.Flags().VisitAll(func(flag *pflag.Flag) {
			log.Debug().Msgf("'%s' -> flag: '%+v' | setting: '%+v'", flag.Name, flag.Value, viper.Get(flag.Name))
		})
		log.Debug().Msgf("Settings: %+v", *o)
	}
	if err := settings.Parse(o.Kubernetes); err != nil {
		return err
	}
	if err := settings.Parse(o.Drainer); err != nil {
		return err
	}
	if err := settings.Parse(o.AWS); err != nil {
		return err
	}
	log.Debug().Msgf("Parsed settings: %+v", *o) # <--- this one
	return nil
}

and I see next behaviour when starting kubedrainer with ./kubedrainer -d serve --pod-selector checking:

  • without mapstructure tag for PodSelector:
{"level":"debug","time":1613463410,"message":"Settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00003a6e0}"}

{"level":"debug","time":1613463410,"message":"Parsed settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00003a6e0}"}

you can see that PodSelector(at the end of JSONs) doesn't change

  • with mapstructure tag mapstructure:"pod-selector":
{"level":"debug","time":1613463080,"message":"Settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector: DrainDelay:0s} AWS:0xc00029e5f0}"}

{"level":"debug","time":1613463080,"message":"Parsed settings: {Kubernetes:{ConfigFlags:{CacheDir:/Users/rtim/.kube/http-cache KubeConfig: ClusterName: AuthInfoName: Context: Namespace:<nil> APIServer: Insecure:false CertFile: KeyFile: CAFile: BearerToken: Impersonate:<nil> ImpersonateGroup:<nil> Username:<nil> Password:<nil> Timeout:0 clientConfig:<nil> lock:{state:0 sema:0} usePersistentConfig:true}} Drainer:{Node: Force:false GracePeriodSeconds:-1 IgnoreAllDaemonSets:true Timeout:1m0s DeleteLocalData:true Selector: PodSelector:checking DrainDelay:0s} AWS:0xc00029e5f0}"}

You can verify it yourself and if you want I can work on this issue as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're absolutely right, thank you for looking at other fields as well

Copy link
Contributor Author

@rtim75 rtim75 Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to me to continue in this PR then or should we merge this one and open a new PR to fix other fields?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this one, and if you would be son kind we'll gladly accept the fix for the other fields (was about to start working on it) :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I can fix other fields

}

func (o *Options) String() string {
Expand Down Expand Up @@ -72,6 +73,7 @@ func New(client kubernetes.Interface, options *Options) Drainer {
ErrOut: errOut,
Out: out,
},
drainDelay: options.DrainDelay,
drainer: &drain.Helper{
Client: client,
ErrOut: errOut,
Expand Down Expand Up @@ -142,6 +144,9 @@ func (o *drainCmdOptions) Drain(nodeName string) error {
_ = printObj(n, o.Out)
}

log.Info().Msgf("Sleep for %v before starting to evict", o.drainDelay.String())
time.Sleep(o.drainDelay)

return o.deleteOrEvictPodsSimple(nodeName)
}

Expand All @@ -151,6 +156,8 @@ type drainCmdOptions struct {
PrintFlags *genericclioptions.PrintFlags
ToPrinter func(string) (printers.ResourcePrinterFunc, error)

drainDelay time.Duration

drainer *drain.Helper
nodes *node.Node

Expand Down