stackitcloud/gardener-extension-acl

Gardener ACL Extension

TL;DR: The Gardener ACL extension lets you restrict access to shoot clusters using an allow-list mechanism. In short, it looks like this:

# in the shoot object
spec:
  extensions:
  - type: acl
    providerConfig:
      rule:
        action: ALLOW
        type: remote_ip
        cidrs:
          - "1.2.3.4/24"
          - "10.250.0.0/16"
          - ...

The extension also supports multiple ingress namespaces, e.g. when using Gardener ExposureClasses or deploying Highly Available Control Planes (see ADR03 for more information).

Please read on for more information.

Installation

Set your KUBECONFIG variable to the Garden cluster.

kubectl apply -f deploy/extension/base/controller-registration.yaml

Background, Functionality & Limitations

Gardener introduced Shoot API Server SNI with GEP08.

Using Istio, Gardener configures a single ingress gateway per seed to proxy traffic to all API servers on this seed based on criteria such as the SNI name. At its core, Istio configures an Envoy proxy using a set of Kubernetes CRDs. We can hook into this mechanism and insert additional configuration that further limits access to a specific cluster.

Broadly speaking, there are three different external traffic flows:

  1. Kubernetes API Listener (via SNI name)
  2. Kubernetes Service Listener (internal flow)
  3. VPN Listener

These flows are described in more detail in the aforementioned GEP. Essentially, each of the three is represented by a specific Envoy listener with filters. The extension needs to hook into each of these filters (and their filter chains) to implement the desired behavior. Unfortunately, each of the three types of access has to be handled in its own way.

Listener Overview

  1. SNI Access - The most straightforward approach. We can deploy one additional EnvoyFilter per shoot that has the ACL extension enabled. It contains a filter patch that matches on the shoot SNI name and specifies an ALLOW rule with the provided IPs.
  2. Internal Flow - Gardener creates one EnvoyFilter per shoot that defines this listener. Unfortunately, it doesn't have any criteria we could use to match it with an additional EnvoyFilter spec on a per-shoot basis, and we've tried a lot of things to make it work. On top of that, a behavior that we see as a bug in Istio prevents us from working with priorities here, which was the closest we got to making it work. Instead, the extension now deploys a MutatingWebhook that intercepts creations and updates of EnvoyFilter resources whose names start with shoot-- (which is their only common feature). We then insert our rules. To make this work with updates to Extension objects, the controller dealing with 1) also updates a hash annotation on these EnvoyFilter resources every time the respective ACL extension object is updated.
  3. VPN Access - All VPN traffic moves through the same listener. This means we can create only a single EnvoyFilter for VPN, and it has to contain the rules of all shoots that have the extension enabled. Conversely, we need to make sure that traffic of all shoots that don't have the extension enabled can still pass through this filter unhindered. We achieve this by creating not only a policy for every shoot with ACL enabled, but also an "inverted" policy that matches all shoots that don't have ACL enabled. All these policies are then put in a single EnvoyFilter patch.
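The VPN case (item 3) can be illustrated with a small Python sketch. It is only a model of the idea, not the extension's code: the real policies are Envoy configuration generated by the extension, and the dictionary layout, the function names, and the shoot identifiers used here are hypothetical. It shows one policy per ACL-enabled shoot plus a single "inverted" policy that lets all other shoots pass.

```python
import ipaddress


def build_vpn_policies(acl_shoots):
    """Build the policy list for the shared VPN listener.

    acl_shoots maps a (hypothetical) shoot identifier to its allowed CIDRs.
    """
    policies = [
        {"shoot": shoot, "cidrs": list(cidrs)}
        for shoot, cidrs in sorted(acl_shoots.items())
    ]
    # Inverted policy: matches every shoot that is NOT ACL-enabled.
    policies.append({"not_shoots": sorted(acl_shoots)})
    return policies


def policy_permits(policy, shoot, remote_ip):
    """Return True if this single policy lets the connection through."""
    if "not_shoots" in policy:
        return shoot not in policy["not_shoots"]
    if policy["shoot"] != shoot:
        return False
    addr = ipaddress.ip_address(remote_ip)
    # strict=False tolerates CIDRs with host bits set, e.g. "1.2.3.4/24".
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in policy["cidrs"]
    )


def vpn_allowed(policies, shoot, remote_ip):
    """ALLOW semantics: traffic passes if any policy matches."""
    return any(policy_permits(p, shoot, remote_ip) for p in policies)
```

With `policies = build_vpn_policies({"shoot--a--one": ["10.250.0.0/16"]})`, traffic for `shoot--a--one` passes only from addresses inside `10.250.0.0/16`, while traffic for any shoot that is not in the mapping passes regardless of its source address.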

Because of the last point, we currently see no way of allowing the user to define multiple rules of different action types (ALLOW or DENY). Instead, we only support a single ALLOW rule per shoot, which is in our opinion the best trade-off to efficiently secure Kubernetes API servers.
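To make the semantics of a single ALLOW rule concrete, here is a minimal Python sketch of the check that such a rule expresses. The actual enforcement happens inside Envoy; the function name `is_allowed` is hypothetical and only models the behavior for the `remote_ip` rule type.

```python
import ipaddress


def is_allowed(remote_ip, cidrs):
    """Return True if remote_ip falls into at least one allow-listed CIDR."""
    addr = ipaddress.ip_address(remote_ip)
    # strict=False accepts entries with host bits set, such as "1.2.3.4/24",
    # and treats them as the enclosing network (here 1.2.3.0/24).
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]
    return any(addr in network for network in networks)
```

For the CIDRs from the example configuration, `is_allowed("1.2.3.77", ["1.2.3.4/24", "10.250.0.0/16"])` returns `True`, while `is_allowed("8.8.8.8", ["1.2.3.4/24", "10.250.0.0/16"])` returns `False`. Anything that is not covered by the allow-list is rejected, which is why a single ALLOW rule is enough to lock down an API server.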

See ADR02 for a more in-depth discussion of the challenges we had.
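The webhook mechanism used for the internal flow (item 2 of the listener overview) can be sketched as follows. This is a simplified Python model under stated assumptions, not the extension's implementation: the annotation key `HASH_ANNOTATION`, the hashing scheme, and the function names are hypothetical. It shows the name-prefix check and a hash annotation that changes whenever the rule changes, so that the EnvoyFilter object is updated and the update reaches the webhook.

```python
import copy
import hashlib
import json

SHOOT_PREFIX = "shoot--"
# Hypothetical annotation key, chosen for illustration only.
HASH_ANNOTATION = "acl.example/rule-hash"


def should_mutate(envoy_filter_name):
    """The webhook only handles EnvoyFilters whose names start with shoot--."""
    return envoy_filter_name.startswith(SHOOT_PREFIX)


def rule_hash(rule):
    """Derive a stable hash from the rule, independent of key order."""
    canonical = json.dumps(rule, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def with_hash_annotation(envoy_filter, rule):
    """Return a copy of the EnvoyFilter dict carrying the current rule hash."""
    updated = copy.deepcopy(envoy_filter)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[HASH_ANNOTATION] = rule_hash(rule)
    return updated
```

Because the hash depends only on the rule content, an unchanged rule yields an unchanged annotation, while any change to the CIDR list produces a new value and therefore an update of the resource.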

Cloud specific settings

OpenStack

In order for the internal VPN traffic to work, the router IP addresses of the shoot OpenStack projects have to be allowlisted in the ACL extension.

Healthchecks

Gardener provides a Health Check Library that we can use to monitor the health of resources that our extension is responsible for. Example: If the extension controller deploys a Gardener ManagedResource, we can define a health check on the extension that checks for the health of this ManagedResource. This lets the extension reflect the state of the resources it is responsible for. This is expressed by status conditions in the extension resource itself (one per health check).

Generating ControllerRegistration and ControllerDeployment

Extensions are installed on a Gardener cluster by deploying a ControllerRegistration and a ControllerDeployment object to the garden cluster. In this repository, you can find an example of both resources in the deploy/extension/base/controller-registration.yaml file.

The ProviderConfig.chart field contains the entire Helm chart for your extension as a gzipped and then base64-encoded string. After altering this Helm chart in the charts/gardener-extension-acl directory, run make generate to re-create this value.
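The encoding described above (gzip first, then base64) can be reproduced with a few lines of Python, which may help when inspecting the generated value by hand. This is a sketch and not a replacement for make generate: the function names are hypothetical, and it assumes the input is the chart already packaged into a single byte string (for example, a tar archive).

```python
import base64
import gzip


def encode_chart(packaged_chart):
    """Gzip the packaged chart bytes, then base64-encode the result."""
    return base64.b64encode(gzip.compress(packaged_chart)).decode("ascii")


def decode_chart(encoded):
    """Reverse the encoding: base64-decode first, then gunzip."""
    return gzip.decompress(base64.b64decode(encoded))
```

Decoding is the exact reverse of encoding, so `decode_chart(encode_chart(data))` returns the original bytes.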

Tests

To run the test suite, execute:

make test

Place all required Gardener CRDs in the upstream-crds directory; they are installed automatically in the envtest cluster.

See actuator_test.go for a minimal test case example.

Local deployment

Set up a Gardener local-setup.

There are two ways to install the extension:

make extension-up installs the acl-extension into the local Gardener environment.

make extension-dev also installs the acl-extension into the local Gardener environment, but it rebuilds and redeploys whenever you press any key in the terminal.

Local debugging

This can only be done with the Gardener local-setup.

After your local Gardener is ready, you can start the controller as follows.

Install the extension:

make extension-up

Disable reconciliation of the managed resource:

kubectl annotate managedresource acl-XXXXXX resources.gardener.cloud/ignore="true"

Scale down acl-extension:

kubectl scale deployment -n extension-acl-XXXXXXX --replicas=0 gardener-extension-acl

Now you can run the acl-extension locally to debug it.

make run