Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

rework configuration using custom resources #164

Merged
merged 30 commits into from
Nov 24, 2023

Conversation

klihub
Copy link
Collaborator

@klihub klihub commented Oct 18, 2023

This patch set reworks how configuration is handled, switching from a single dynamically constructed map[string]interface{}/ConfigMap to multiple/per-plugin statically defined custom resources. In particular the patch set,

  • defines a v1alpha1 configuration API (per (resource policy) plugin custom resource for configuration)
  • implements a new agent for monitoring configuration, delivering configuration data
  • updates resource manager to wait for initial configuration before starting up
  • updates resource manager to reconfigure the various components for subsequent configuration changes
  • updates the policy plugins to use the new CR-based configuration
  • reworks the helm charts to deal with custom resources instead of ConfigMaps
  • reworks the e2e test infra to use helm/charts instead of deployment files
  • updates the test cases for the new configuration mechanism and format(s)

Note: Although already due to its mere size this is still work in progress, I've given up the idea that we'd be able to get this in final shape in a single go.

TODO:

  • externally visible changes could use a consistency check by reviewers (config data type, names, etc. also wrt. to other existing user-visible interfaces, like namespace used for annotations, etc.)
  • large parts of the code could use a cleanup (but maybe other than the worst could be addressed with subsequent PRs?)
  • nuking a bunch of old code probably would now allow further structural and other simplifications (maybe in subsequent PRs?)

@klihub klihub force-pushed the devel/cr-based-config branch 3 times, most recently from 59554d7 to 83bfbf9 Compare October 18, 2023 18:19
@klihub klihub force-pushed the devel/cr-based-config branch 2 times, most recently from 47da7a4 to 44a8887 Compare October 24, 2023 13:49
@klihub
Copy link
Collaborator Author

klihub commented Oct 24, 2023

One way to give this a test using the Helm charts is to generate the images, copy and import them to the test node (podman or ctr depending on your runtime), then use a Helm override like this with the name/tag matching your build and how you imported the image...

# Test overrides for TA Helm Chart. You can use an almost identical one for balloons.
#   - assume/use pre-imported plugin image
#   - never try to pull
#   - turn on full debugging
---
image:
  name: localhost/nri-resource-policy-topology-aware
  tag: v0.1.0-138-g0c402fe
  pullPolicy: Never

policyConfig:
  reservedResources:
    cpu: 750m

commonConfig:
  control:
    enable:
      - cpu
  log:
    debug:
      "*": true
    logSource: true
    klog:
      skip_headers: true
  instrumentation:
    reportPeriod: 60s
    samplingRatePerMillion: 0

hostPort: 8891

resources:
  cpu: 500m
  memory: 512Mi

nri:
  patchRuntimeConfig: false

initContainerImage:
  name: ghcr.io/containers/nri-plugins/nri-config-manager
  # If not defined Chart.AppVersion will be used
  #tag: unstable
  pullPolicy: Always

tolerations: []

@klihub klihub force-pushed the devel/cr-based-config branch from 44a8887 to 09db8f0 Compare October 24, 2023 17:14
@klihub klihub force-pushed the devel/cr-based-config branch 11 times, most recently from bd58e42 to 43e33d8 Compare November 2, 2023 19:57
@klihub klihub force-pushed the devel/cr-based-config branch 7 times, most recently from 8a2f1cf to cd821ab Compare November 7, 2023 09:53
Implement bootstrapping and configuration using custom resources.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove old ConfigMap-based configuration infra.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add updated sample configuration for the balloons and
topology-aware policy plugins. Remove old ConfigMaps.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update Helm chart for custom resource based configuration.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update Helm chart for custom resource based configuration.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add a helm chart for the template policy plugin.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Use a generic 'nri-plugin' template variable instead of the
per chart '$PLUGIN_NAME[-plugin]' multiple variants, which
AFAICT are neither necessary nor useful.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/cr-based-config branch 2 times, most recently from bfc5408 to 6c77937 Compare November 23, 2023 13:51
Update playbooks and e2e test infra to use helm for launching
and terminating test plugins.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update topology-aware tests for helm-based deployment and
configuration by custom resources.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update balloons tests for helm-based deployment and configuration
by custom resources.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove old launch and terminate script API functions which were
used to start/stop test plugins using daemonset deployment files.
These are obsoleted by the recently introduced helm versions and
should be unused.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Vagrant automatically configures sudo with !requiretty, so
it should be safe to enable ansible pipelining by default.
Additionally, enable persistent connections for ssh.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove obsolete deployment templates and related Makefile rules.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Do minimal editorial updates to the documentation. Remove
references to ConfigMap-based common configuration. Refer
to configuration custom resources instead.

Remove a few other obsolete bits spotted and add a single-
paragraph description about the template policy.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Fix the easiest bunch of complaints ansible-lint gave for
playbook/provision.yaml.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the devel/cr-based-config branch from 6c77937 to 3a6704b Compare November 23, 2023 15:44
Copy link
Collaborator

@marquiz marquiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @klihub for this huge effort 👏 I think the CRD API now looks nice and clean. And everything else (if we find something) we can easily change later without touching the UI (I hope) 😇

LGTM

Copy link
Collaborator

@askervin askervin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this huge work, @klihub!

LGTM.

@askervin askervin merged commit f75d68b into containers:main Nov 24, 2023
@klihub klihub deleted the devel/cr-based-config branch November 24, 2023 13:58
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 27, 2023
Follow the same group name as the new CRDs that are introduced by
Krisztian Litkey in containers/nri-plugins#164.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 27, 2023
Follow the same group name as the new CRDs that are introduced by
Krisztian Litkey in containers/nri-plugins#164.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 27, 2023
Also follow the same group name config.nri as the new CRDs that are
introduced by Krisztian Litkey in containers/nri-plugins#164.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 27, 2023
Also follow the same group name config.nri as the new CRDs that are
introduced by Krisztian Litkey in containers/nri-plugins#164.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 28, 2023
This commit adds the following improvments on CRD:
1. Add a finalizer to the Custom Resource to prevent its deletion
until the associated Helm chart of the plugin has been removed.

2. Prevent mutation of the spec.pluginName field to ensure proper
handling of plugin updates. If a user intends to deploy a different
plugin than the one currently deployed, they must follow a two-step
process:
  - Delete the existing Custom Resource (CR) instance.
  - Deploy a new CR instance with the updated plugin name.

This restriction is necessary because the controller reconciliation
process cannot distinguish between object creation and editing events.
Allowing mutation could lead to reconciliation issues, where the
controller may not recognize the need to uninstall the current plugin
before installing the new one. This could potentially result in
attempting to install two plugins in the same namespace, leading to
errors.

3. Follow the same group name config.nri as the new CRDs that are
introduced by Krisztian Litkey in containers/nri-plugins#164.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
fmuyassarov added a commit to fmuyassarov/nri-plugins-operator that referenced this pull request Nov 28, 2023
This commit adds the following improvments on CRD:
1. Add a finalizer to the Custom Resource to prevent its deletion
until the associated Helm chart of the plugin has been removed.

2. Prevent mutation of the spec.pluginName field to ensure propely
handling of plugin updates. If a user intends to deploy a different
plugin than the one currently deployed, they must follow a two-step
process:
  - Delete the existing Custom Resource (CR) instance.
  - Deploy a new CR instance with the updated plugin name.

This restriction is necessary because the controller reconciliation
process cannot distinguish between object creation and editing events.
Allowing mutation could lead to reconciliation issues, where the
controller may not recognize the need to uninstall the current plugin
before installing the new one. This could potentially result in
attempting to install two plugins in the same namespace, leading to
errors.

3. Follows the same group name config.nri as the new CRDs that are
introduced by Krisztian Litkey in containers/nri-plugins#164.

4. Tightens RBAC rules for better security

5. Cleanups auto-generated comments and boiler plate code

6. Allow users to customize Helm charts by extending the CR with
spec.values. This field is optional (omitempty), and when specified,
the values are passed to the Helm chart during plugin installation

7. Introduce an enumeration with predefined values for the
'spec.pluginName' and 'spec.state' fields, preventing users from
setting incorrect values.

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@intel.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants