Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add resync-period flag for k8s native informers #4047

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

sfc-gh-raravena
Copy link

This is to optionally help with cache inconsistencies

What type of PR is this?

/kind feature

What this PR does / why we need it:

  • Add resync-period flag for k8s native informers
  • This is to optionally help with cache inconsistencies

Which issue(s) this PR fixes:

Fixes # #4046

Special notes for your reviewer:

Does this PR introduce a user-facing change?


This is to optinally help with cache inconsistencies

Signed-off-by: Ricardo Aravena <ricardo.aravena@snowflake.com>
@volcano-sh-bot
Copy link
Contributor

Welcome @sfc-gh-raravena!

It looks like this is your first PR to volcano-sh/volcano.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot volcano-sh-bot added kind/bug Categorizes issue or PR as related to a bug. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. labels Feb 27, 2025
@volcano-sh-bot volcano-sh-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Feb 27, 2025
@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign hwdef
You can assign the PR to them by writing /assign @hwdef in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added kind/feature Categorizes issue or PR as related to a new feature. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. kind/flake Categorizes issue or PR as related to a flaky test. area/scheduling area/controllers area/cli area/dependency Issues or PRs related to dependency changes area/webhooks area/deploy Issues or PRs related to deploy/helm/build/scripts changes area/documentation documentation of design or user-guide area/performance Issues or PRs related to performance size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/test CI and test related Issues or PRs labels Feb 27, 2025
@@ -32,6 +32,7 @@ import (
const (
defaultSchedulerName = "volcano"
defaultSchedulerPeriod = time.Second
defaultResyncPeriod = 5 * time.Minute
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give more details about why we need this resync perion, as I didn't see that the resync is set in upstream kubernetes components: )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is too small and will cause scheduler performance issue.

@@ -633,7 +634,7 @@ func newSchedulerCache(config *rest.Config, schedulerNames []string, defaultQueu
}

func (sc *SchedulerCache) addEventHandler() {
informerFactory := informers.NewSharedInformerFactory(sc.kubeClient, 0)
informerFactory := informers.NewSharedInformerFactory(sc.kubeClient, sc.resyncPeriod)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sc.resyncPeriod is not set.

@@ -32,6 +32,7 @@ import (
const (
defaultSchedulerName = "volcano"
defaultSchedulerPeriod = time.Second
defaultResyncPeriod = 5 * time.Minute
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is too small and will cause scheduler performance issue.

@Monokaix Monokaix removed kind/bug Categorizes issue or PR as related to a bug. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. area/scheduling area/controllers area/cli area/dependency Issues or PRs related to dependency changes kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. labels Feb 28, 2025
@Monokaix Monokaix removed kind/flake Categorizes issue or PR as related to a flaky test. kind/documentation Categorizes issue or PR as related to documentation. area/test CI and test related Issues or PRs area/webhooks area/deploy Issues or PRs related to deploy/helm/build/scripts changes area/documentation documentation of design or user-guide area/performance Issues or PRs related to performance labels Feb 28, 2025
@Monokaix Monokaix requested a review from Copilot February 28, 2025 06:40
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR adds a new resync-period flag for Kubernetes native informers to help mitigate cache inconsistencies.

  • Adds a new flag and default value in options.go
  • Passes the resync-period parameter through to the scheduler and cache constructors
  • Updates the informer factory initialization in scheduler cache to use the new resync period

Reviewed Changes

File Description
cmd/scheduler/app/options/options.go Adds a new command-line flag for resync period
pkg/scheduler/scheduler.go Passes the resync period parameter to the cache constructor
pkg/scheduler/cache/cache.go Updates the cache interface and informer factory initialization

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

@@ -542,7 +543,7 @@ func newDefaultAndRootQueue(vcClient vcclient.Interface, defaultQueue string) {
}
}

func newSchedulerCache(config *rest.Config, schedulerNames []string, defaultQueue string, nodeSelectors []string, nodeWorkers uint32, ignoredProvisioners []string) *SchedulerCache {
func newSchedulerCache(config *rest.Config, schedulerNames []string, defaultQueue string, nodeSelectors []string, nodeWorkers uint32, ignoredProvisioners []string, resyncPeriod time.Duration) *SchedulerCache {
Copy link
Preview

Copilot AI Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new field 'resyncPeriod' in SchedulerCache is declared but not assigned within newSchedulerCache; ensure you set sc.resyncPeriod = resyncPeriod to properly propagate the flag value.

Copilot is powered by AI, so mistakes are possible. Review output carefully before use.

Positive Feedback
Negative Feedback

Provide additional feedback

Please help us improve GitHub Copilot by sharing more details about this comment.

Please select one or more of the options
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants