Add scaleup step size feature #132

Merged
merged 30 commits into from
Jan 9, 2024
Commits
162054e Replace Duration document because of an expired link (kazuki-hanai, Oct 25, 2023)
01b621d Add scaleup parameters to crd-reference (kazuki-hanai, Oct 25, 2023)
b193983 [WIP] Implement scaleup (kazuki-hanai, Nov 2, 2023)
14a6477 fmt (kazuki-hanai, Nov 2, 2023)
4a07c8e Add scaleup interval test (kazuki-hanai, Nov 9, 2023)
172fb2e fix (kazuki-hanai, Nov 9, 2023)
a5577e3 add generated deepcopy (kazuki-hanai, Nov 9, 2023)
43e5d1b fix (kazuki-hanai, Nov 10, 2023)
a065979 Update api/v1beta1/spannerautoscaler_types.go (kazuki-hanai, Nov 10, 2023)
e78d4af Update docs/crd-reference.md (kazuki-hanai, Nov 10, 2023)
3aa0a50 Update api/v1beta1/spannerautoscaler_webhook.go (kazuki-hanai, Nov 16, 2023)
7f9f2b7 Fix scaleup when default value (kazuki-hanai, Nov 16, 2023)
9b33ac6 Fix scaleupInterval logic (kazuki-hanai, Nov 28, 2023)
dd009a9 Fix scaleupInterval default (kazuki-hanai, Nov 28, 2023)
15972a1 Update cmd/main.go (kazuki-hanai, Nov 29, 2023)
8434469 Update cmd/main.go (kazuki-hanai, Dec 5, 2023)
09f507a Update internal/controller/spannerautoscaler_controller_test.go (kazuki-hanai, Dec 6, 2023)
0d848d8 Update internal/controller/spannerautoscaler_controller.go (kazuki-hanai, Dec 11, 2023)
44fca5c Update internal/controller/spannerautoscaler_controller_test.go (kazuki-hanai, Dec 11, 2023)
87783d8 Set default `ScaleUpInterval` as 10 seconds (kazuki-hanai, Dec 18, 2023)
e7f95ab Fix small (kazuki-hanai, Dec 18, 2023)
ae47fad Set default `ScaleUpInterval` as 60 seconds (kazuki-hanai, Dec 18, 2023)
46809d5 Fix suStepSize validation (kazuki-hanai, Dec 20, 2023)
a85f59e Fix test if scaledown and scaleup intervals function works (kazuki-hanai, Dec 20, 2023)
c4ef133 Fix default value of scaleupStepSize in scaledownStepSize test (kazuki-hanai, Dec 20, 2023)
b453448 Make test for getOrConvertTimeDuration simple (kazuki-hanai, Dec 20, 2023)
19ad040 Remove no required setup block (kazuki-hanai, Dec 25, 2023)
bb399c5 Update cmd/main.go (kazuki-hanai, Dec 25, 2023)
2117743 Fix ScaleupStepSize test (kazuki-hanai, Dec 26, 2023)
8adf381 Merge branch 'master' into add-scaleup (kazuki-hanai, Dec 28, 2023)
8 changes: 8 additions & 0 deletions api/v1beta1/spannerautoscaler_types.go
@@ -108,6 +108,14 @@ type ScaleConfig struct {
	// The cool down period between two consecutive scaledown operations. If this option is omitted, the value of the `--scale-down-interval` command line option is taken as the default value.
	ScaledownInterval *metav1.Duration `json:"scaledownInterval,omitempty"`

	// The maximum number of processing units which can be added in one scale-up operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise.
	// +kubebuilder:default=0
	ScaleupStepSize int `json:"scaleupStepSize,omitempty"`

Review comment by @tkuchiki (Contributor), Nov 15, 2023:

For compatibility, could you please make the default value zero and allow unlimited scaling?


	// How often autoscaler is reevaluated for scale up.
	// The warm up period between two consecutive scaleup operations. If this option is omitted, the value of the `--scale-up-interval` command line option is taken as the default value.
	ScaleupInterval *metav1.Duration `json:"scaleupInterval,omitempty"`

Review comment (Contributor):

Same as ScaleupStepSize

Reply (Contributor Author):

I set the default to 10 seconds, because the scale-up interval was fixed at 10 seconds before this scale-up feature was introduced.
https://github.com/mercari/spanner-autoscaler/pull/132/files#diff-1984e42ef0a0d5e7edf140b3ae6decf3fd6e1b9bd4ee351d4dcd8cf7213bb537L488


	// The CPU utilization which the autoscaling will try to achieve. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority)
	TargetCPUUtilization TargetCPUUtilization `json:"targetCPUUtilization"`
}
12 changes: 12 additions & 0 deletions api/v1beta1/spannerautoscaler_webhook.go
@@ -200,6 +200,18 @@ func (r *SpannerAutoscaler) validateScaleConfig() *field.Error {
			"must be a multiple of 100 for values which are less than 1000")
	}

	if sc.ScaleupStepSize > 1000 && sc.ScaleupStepSize%1000 != 0 {
		return field.Invalid(
			field.NewPath("spec").Child("scaleConfig").Child("scaleupStepSize"),
			sc.ScaleupStepSize,
			"must be a multiple of 1000 for values which are greater than 1000")
	} else if sc.ScaleupStepSize < 1000 && sc.ScaleupStepSize%100 != 0 {
		return field.Invalid(
			field.NewPath("spec").Child("scaleConfig").Child("scaleupStepSize"),
			sc.ScaleupStepSize,
			"must be a multiple of 100 for values which are less than 1000")
	}

	return nil
}
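
To make the accepted values concrete, here is a minimal standalone sketch of the same two checks. The function name `validStepSize` is hypothetical and not part of the PR; it only mirrors the conditions in the hunk above, without the `field.Invalid` error reporting.

```go
package main

import "fmt"

// validStepSize is a hypothetical helper (not in the PR) that applies the same
// two conditions as the webhook hunk: values above 1000 must be multiples of
// 1000, values below 1000 must be multiples of 100. Exactly 1000 passes both.
func validStepSize(n int) bool {
	if n > 1000 && n%1000 != 0 {
		return false
	}
	if n < 1000 && n%100 != 0 {
		return false
	}
	return true
}

func main() {
	// 0 is the CRD default and is accepted because 0%100 == 0.
	for _, n := range []int{0, 100, 250, 900, 1000, 1500, 2000} {
		fmt.Printf("%d -> %v\n", n, validStepSize(n))
	}
}
```

With these inputs, 250 and 1500 are rejected and the rest are accepted.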

5 changes: 5 additions & 0 deletions api/v1beta1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

3 changes: 3 additions & 0 deletions cmd/main.go
@@ -60,6 +60,7 @@ var (
	enableLeaderElection = flag.Bool("leader-elect", false, "Enable leader election for controller manager. Enabling this will ensure there is only one active controller manager.")
	leaderElectionID = flag.String("leader-elect-id", "", "Lease name for leader election.")
	scaleDownInterval = flag.Duration("scale-down-interval", 55*time.Minute, "The scale down interval.")
	scaleUpInterval = flag.Duration("scale-up-interval", 60*time.Second, "The scale up interval.")
	configFile = flag.String("config", "", "The controller will load its initial configuration from this file. "+
		"Omit this flag to use the default configuration values. Command-line flags override configuration from this file.")
)
@@ -87,6 +88,7 @@ func main() {
		"metricsAddr", metricsAddr,
		"probeAddr", probeAddr,
		"scaleDownInterval", scaleDownInterval,
		"scaleUpInterval", scaleUpInterval,
	)

	cfg, err := config.GetConfig()
@@ -139,6 +141,7 @@ func main() {
		log,
		controller.WithLog(log),
		controller.WithScaleDownInterval(*scaleDownInterval),
		controller.WithScaleUpInterval(*scaleUpInterval),
	)
	if err := sar.SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "unable to create controller", "controller", "SpannerAutoscaler")
12 changes: 12 additions & 0 deletions config/crd/bases/spanner.mercari.com_spannerautoscalers.yaml
@@ -361,6 +361,18 @@ spec:
    be deleted in one scale-down operation. It can be a multiple
    of 100 for values < 1000, or a multiple of 1000 otherwise.
  type: integer
scaleupInterval:
  description: How often autoscaler is reevaluated for scale up.
    The warm up period between two consecutive scaleup operations.
    If this option is omitted, the value of the `--scale-up-interval`
    command line option is taken as the default value.
  type: string
scaleupStepSize:
  default: 0
  description: The maximum number of processing units which can
    be added in one scale-up operation. It can be a multiple of
    100 for values < 1000, or a multiple of 1000 otherwise.
  type: integer
targetCPUUtilization:
  description: 'The CPU utilization which the autoscaling will try
    to achieve. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority)'
4 changes: 3 additions & 1 deletion docs/crd-reference.md
@@ -92,7 +92,9 @@ _Appears in:_
 | `nodes` _[ScaleConfigNodes](#scaleconfignodes)_ | If `nodes` are provided at the time of resource creation, then they are automatically converted to `processing-units`. So it is recommended to use only the processing units. Ref: [Spanner Compute Capacity](https://cloud.google.com/spanner/docs/compute-capacity#compute_capacity) |
 | `processingUnits` _[ScaleConfigPUs](#scaleconfigpus)_ | ProcessingUnits for scaling of the Spanner instance. Ref: [Spanner Compute Capacity](https://cloud.google.com/spanner/docs/compute-capacity#compute_capacity) |
 | `scaledownStepSize` _integer_ | The maximum number of processing units which can be deleted in one scale-down operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise. |
-| `scaledownInterval` _[Duration](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.22/#duration-v1-meta)_ | How often autoscaler is reevaluated for scale down. The cool down period between two consecutive scaledown operations. If this option is omitted, the value of the `--scale-down-interval` command line option is taken as the default value. |
+| `scaledownInterval` _Duration_ | How often autoscaler is reevaluated for scale down. The cool down period between two consecutive scaledown operations. If this option is omitted, the value of the `--scale-down-interval` command line option is taken as the default value. A duration string is a possibly signed sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300m", "1.5h" or "2h45m". |
+| `scaleupStepSize` _integer_ | The maximum number of processing units which can be added in one scale-up operation. It can be a multiple of 100 for values < 1000, or a multiple of 1000 otherwise. |
+| `scaleupInterval` _Duration_ | How often autoscaler is reevaluated for scale up. The warm up period between two consecutive scaleup operations. If this option is omitted, the value of the `--scale-up-interval` command line option is taken as the default value. A duration string is a possibly signed sequence of decimal numbers, each with an optional fraction and a unit suffix, such as "300m", "1.5h" or "2h45m". |
 | `targetCPUUtilization` _[TargetCPUUtilization](#targetcpuutilization)_ | The CPU utilization which the autoscaling will try to achieve. Ref: [Spanner CPU utilization](https://cloud.google.com/spanner/docs/cpu-utilization#task-priority) |


34 changes: 33 additions & 1 deletion internal/controller/spannerautoscaler_controller.go
@@ -65,6 +65,7 @@ type SpannerAutoscalerReconciler struct {
	schedulers map[types.NamespacedName]schedulerpkg.Scheduler

	scaleDownInterval time.Duration
	scaleUpInterval time.Duration

	clock utilclock.Clock
	log logr.Logger
@@ -147,6 +148,19 @@ func (o withScaleDownInterval) applySpannerAutoscalerReconciler(r *SpannerAutosc
	r.scaleDownInterval = o.scaleDownInterval
}

// Add scale-up-interval option for the autoscaler-reconciler
func WithScaleUpInterval(scaleUpInterval time.Duration) Option {
	return withScaleUpInterval{scaleUpInterval: scaleUpInterval}
}

type withScaleUpInterval struct {
	scaleUpInterval time.Duration
}

func (o withScaleUpInterval) applySpannerAutoscalerReconciler(r *SpannerAutoscalerReconciler) {
	r.scaleUpInterval = o.scaleUpInterval
}

// NewSpannerAutoscalerReconciler returns a new SpannerAutoscalerReconciler.
func NewSpannerAutoscalerReconciler(
	ctrlClient ctrlclient.Client,
@@ -165,6 +179,7 @@ func NewSpannerAutoscalerReconciler(
		schedulers: make(map[types.NamespacedName]schedulerpkg.Scheduler),
		crons: make(map[types.NamespacedName]*cronpkg.Cron),
		scaleDownInterval: 55 * time.Minute,
		scaleUpInterval: 60 * time.Second,
		clock: utilclock.RealClock{},
		log: logger,
	}
@@ -485,7 +500,7 @@ func (r *SpannerAutoscalerReconciler) needUpdateProcessingUnits(log logr.Logger,
 		log.Info("no need to scale", "currentPU", currentProcessingUnits, "currentCPU", sa.Status.CurrentHighPriorityCPUUtilization)
 		return false
 
-	case desiredProcessingUnits > currentProcessingUnits && r.clock.Now().Before(sa.Status.LastScaleTime.Time.Add(10*time.Second)):
+	case currentProcessingUnits < desiredProcessingUnits && r.clock.Now().Before(sa.Status.LastScaleTime.Time.Add(getOrConvertTimeDuration(sa.Spec.ScaleConfig.ScaleupInterval, r.scaleUpInterval))):
 		log.Info("too short to scale up since last scale-up event",
 			"timeGap", now.Sub(sa.Status.LastScaleTime.Time).String(),
 			"now", now.String(),
@@ -532,18 +547,35 @@ func calcDesiredProcessingUnits(sa spannerv1beta1.SpannerAutoscaler) int {
	}

	sdStepSize := sa.Spec.ScaleConfig.ScaledownStepSize
	suStepSize := sa.Spec.ScaleConfig.ScaleupStepSize

	// round up the scaledownStepSize to avoid intermediate values
	// for example: 8000 -> 7000 instead of 8000 -> 7400
	if sdStepSize < 1000 && sa.Status.CurrentProcessingUnits > 1000 {
		sdStepSize = 1000
	}

	// round up the scaleupStepSize to avoid intermediate values
	// for example: 7000 -> 8000 instead of 7000 -> 7600
	// To keep compatibility, check if scaleupStepSize is not 0
	if suStepSize != 0 && suStepSize < 1000 && sa.Status.CurrentProcessingUnits+suStepSize > 1000 {
		suStepSize = 1000
	}

	// in case of scaling down, check that we don't scale down beyond the ScaledownStepSize
	if scaledDownPU := (sa.Status.CurrentProcessingUnits - sdStepSize); desiredPU < scaledDownPU {
		desiredPU = scaledDownPU
	}

	// in case of scaling up, check that we don't scale up beyond the ScaleupStepSize
	if scaledUpPU := (sa.Status.CurrentProcessingUnits + suStepSize); suStepSize != 0 && scaledUpPU < desiredPU {
		desiredPU = scaledUpPU

		if 1000 < desiredPU && desiredPU%1000 != 0 {
			desiredPU = ((desiredPU / 1000) + 1) * 1000
		}
	}

	// keep the scaling between the specified min/max range
	minPU := sa.Spec.ScaleConfig.ProcessingUnits.Min
	maxPU := sa.Spec.ScaleConfig.ProcessingUnits.Max
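
To see how the scale-up cap behaves with concrete numbers, the scale-up branch can be isolated into a small function. `capScaleUp` is a hypothetical name; the logic is copied from the hunk above, but the scale-down branch and the final min/max clamp are left out, so the results are illustrative only.

```go
package main

import "fmt"

// capScaleUp is a hypothetical extraction of the scale-up branch of
// calcDesiredProcessingUnits. A stepSize of 0 means "no limit".
func capScaleUp(currentPU, desiredPU, stepSize int) int {
	if stepSize == 0 {
		return desiredPU
	}
	// Round the step up to 1000 when the capped target would cross 1000 PU.
	if stepSize < 1000 && currentPU+stepSize > 1000 {
		stepSize = 1000
	}
	if scaledUpPU := currentPU + stepSize; scaledUpPU < desiredPU {
		desiredPU = scaledUpPU
		// Above 1000 PU only multiples of 1000 are valid, so round up.
		if desiredPU > 1000 && desiredPU%1000 != 0 {
			desiredPU = (desiredPU/1000 + 1) * 1000
		}
	}
	return desiredPU
}

func main() {
	fmt.Println(capScaleUp(7000, 9000, 600)) // 8000: step rounded up to 1000
	fmt.Println(capScaleUp(300, 2000, 400))  // 700: stays below 1000, step unchanged
	fmt.Println(capScaleUp(800, 3000, 400))  // 2000: 1800 rounded up to a multiple of 1000
	fmt.Println(capScaleUp(1000, 5000, 0))   // 5000: step size 0 disables the cap
}
```

The third case shows that the effective step can exceed the configured `scaleupStepSize` when the instance crosses the 1000 PU boundary.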