-
Notifications
You must be signed in to change notification settings - Fork 391
Add --poll-snapshot-period for periodic readiness requeue check when creating a VolumeSnapshotContent #1284
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
base: master
Are you sure you want to change the base?
Add --poll-snapshot-period for periodic readiness requeue check when creating a VolumeSnapshotContent #1284
Conversation
…creating a VolumeSnapshotContent
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: pwschuurman The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @sunnylovestiramisu |
Question:
|
@@ -93,6 +93,7 @@ var ( | |||
metricsAddress = flag.String("metrics-address", "", "(deprecated) The TCP network address where the prometheus metrics endpoint will listen (example: `:8080`). The default is empty string, which means metrics endpoint is disabled. Only one of `--metrics-address` and `--http-endpoint` can be set.") | |||
httpEndpoint = flag.String("http-endpoint", "", "The TCP network address where the HTTP server for diagnostics, including metrics and leader election health check, will listen (example: `:8080`). The default is empty string, which means the server is disabled. Only one of `--metrics-address` and `--http-endpoint` can be set.") | |||
metricsPath = flag.String("metrics-path", "/metrics", "The HTTP path where prometheus metrics will be exposed. Default is `/metrics`.") | |||
pollSnapshotPeriod = flag.Duration("poll-snapshot-period", 10*time.Second, "Poll interval to check if a snapshot is ready. Default is 10 seconds.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you set the default to 0 so that drivers can opt-in to this new behavior?
What type of PR is this?
/kind feature
What this PR does / why we need it:
Today, the csi-snapshotter runs with a reconciler pattern when processing VolumeSnapshotContent resources. This follows the familiar
requeue, err
pattern that is followed by other controller frameworks, like controller-runtime.requeue
field is useful, as the reconciler can periodically attempt to check the status of a snapshot.The problem with the current requeue implementation is that it relies on the same exponential backoff rate limiter queue as the error handling scenario. This is a problem, as it can lead to non-deterministic ready times, and can lead to significantly higher effective snapshot ready latencies for snapshots that take longer to process (eg: larger snapshots).
To fix this, the requeue logic should use a constant retry period, to allow for more reliable snapshot reconciliation. This PR adds a new command line argument
--poll-snapshot-period
to the csi-snapshotter, with a default of10 seconds
.Which issue(s) this PR fixes:
Fixes #1282
Special notes for your reviewer:
Does this PR introduce a user-facing change?: