Pod-level identity improvements #237

unexge · 2024-08-07T14:20:41Z

This PR adds the following improvements:

Ensure we don't log sensitive information from csi.NodePublishVolumeRequest
Disable caching with pod-level credentials
Ensure to disable all other credential providers except STS provider
Add a test case to make sure we're not using secrets if pod-level identity is enabled

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

identity is enabled

…Request`

muddyfish · 2024-08-08T12:18:54Z

tests/e2e-kubernetes/testsuites/credentials.go

+			killCSIDriverPods(ctx, f)
+		}
+
+		updateDriverLevelCredentials := func(ctx context.Context, policyARN string) {


Can we call this something like updateDriverLevelK8sSecret? The current name doesn't make it obvious

It was getting skipped if we `Skip` some tests and it was causing cluster to stay in an invalid state.

* Add a test case to make sure we're not using secrets if pod-level identity is enabled * Ensure to disable all other credential providers except STS provider * Add an utility function to extract arguments from Mountpoint args * Disable caching with pod-level credentials * Ensure we don't log sensitive information from `csi.NodePublishVolumeRequest` * Move cluster-wide cleanup to `AfterEach` to ensure its always called It was getting skipped if we `Skip` some tests and it was causing cluster to stay in an invalid state.

* Add a test case to make sure we're not using secrets if pod-level identity is enabled * Ensure to disable all other credential providers except STS provider * Add an utility function to extract arguments from Mountpoint args * Disable caching with pod-level credentials * Ensure we don't log sensitive information from `csi.NodePublishVolumeRequest` * Move cluster-wide cleanup to `AfterEach` to ensure its always called It was getting skipped if we `Skip` some tests and it was causing cluster to stay in an invalid state. Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Initial implementation of multicredential support (#223) Implements a proof of concept of pod level identity Co-authored-by: Simon Beal <simobeal@amazon.com> Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Add cluster role to helm service account definition (#226) Co-authored-by: Simon Beal <simobeal@amazon.com> Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Add initial set of end-to-end credential tests (#231) * Initial e2e test for pod level identity * Temporarily remove other e2e tests In WIP state where configuration options for pod level identity arent yet complete. * Add some initial set of credential tests * Revert `tests/e2e-kubernetes/scripts/eksctl-patch.yaml` to original * Pass created files seed to `expectReadToSucceed` function * Use updated SA object while restoring the override * Improve assertions --------- Co-authored-by: Simon Beal <simobeal@amazon.com> Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Create credential provider and refactor pod-level e2e tests (#235) * Create credential provider and refactor pod-level e2e tests * Enable all end-to-end tests * Fix sanity tests * Update CI scripts * Clean resources in reverse order similar to how `defer` works * Delete pod with `gracePeriod=0` in `expectFailToMount` * Increase end-to-end test timeout to 30 minutes * Implement STS region detection * Cleanup service account tokens before and after mount * Separate pod-level tests for ease of review * Add pod-level tests back Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Ensure service account token files are unique per mount (#236) * Add `WriteFileAtomic` function This is copied from Tailscale codebase. * Ensure service account token files are unique per mount We're using Pod ID + Volume ID to ensure uniqueness. * Fix race condition in unit tests * Use `github.com/google/renameio` for atomic writes to a file * Fix test coverage report in CI * Replace spaces with tab in `Makefile` Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Pod-level identity improvements (#237) * Add a test case to make sure we're not using secrets if pod-level identity is enabled * Ensure to disable all other credential providers except STS provider * Add an utility function to extract arguments from Mountpoint args * Disable caching with pod-level credentials * Ensure we don't log sensitive information from `csi.NodePublishVolumeRequest` * Move cluster-wide cleanup to `AfterEach` to ensure its always called It was getting skipped if we `Skip` some tests and it was causing cluster to stay in an invalid state. Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Feature/pod level identity cache key (#245) * Support `UNSTABLE_MOUNTPOINT_CACHE_KEY` when using pod level identity When using pod level identity, use send a cache key to MP, which separates caches even if the cache location is the same. This is frequent when using the PLI feature. * Enable cache with PLI Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Clean up tokens by parsing target path in `NodeUnpublishVolume` (#251) * Add `ParseTargetPath` function Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Clean up tokens by parsing target path in `NodeUnpublishVolume` We need Pod ID and Volume ID to clean up any previously created tokens. `NodeUnpublishVolume` call receives Volume ID but not Pod ID. We were keeping the mapping from target path to Pod ID in memory before but this is not ideal as it's not persistent and we were losing this data when CSI driver restarts. With this change, we're parsing target path to get Pod ID and use that to clean up tokens without storing any state in the CSI driver. Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Use a RegExp to parse target path Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Update target path RegExp Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> --------- Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Add authentication source to user-agent (#253) * Add authentication source to user-agent Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Add authentication source to user-agent unconditionally Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> --------- Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Pod-level identity improvements (#257) * Update Go to `1.22.7` Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Escape Pod ID in `tokenFilename` function Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Set permissions of Unix socket explicitly Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> --------- Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Pod-level identity improvements (#258) * Set max receive message size to 2MB for the gRPC server Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Set `readOnlyRootFilesystem: true` and `allowPrivilegeEscalation: false` for containers Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Check device type before unmounting the mount point Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Ensure to escape newlines from log entries Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Return pointer of `NodePublishVolumeRequest` to use nice formatting Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Add tests for `IsMountPoint` method Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> --------- Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> * Feature/pod level identity docs (#255) PLI docs - Add diagrams for the various approaches - Add descriptions for the driver and pod level approaches - Migrate configuration instructions from install.md to new configuration file * Remove unneeded nodes in docs diagrams (#259) Add notes to say that only IRSA is supported with PLI, and also that driver level credential sources will be ignored. * Update Kustomization manifest (#261) * Update `deploy` directory for Pod Level Identity * Move resources around in `deploy` directory Rendered with `helm template --debug charts/aws-mountpoint-s3-csi-driver` --------- Co-authored-by: Simon Beal <simobeal@amazon.com> * Enable `podInfoOnMount` when using k8s>=1.30 (#262) * Enable `podInfoOnMount` when using k8s>=1.30 * Add warning about needing to pass in special config for PLI on clusters <1.30 Add instructions on configuring STS regions * Use `.Capabilities.KubeVersion.Version` rather than GitVersion * Add `podInfoOnMount: true` to kustomize install --------- Signed-off-by: Burak Varlı <burakvar@amazon.co.uk> Co-authored-by: Burak <burakvar@amazon.co.uk>

unexge added 5 commits August 6, 2024 17:09

Add a test case to make sure we're not using secrets if pod-level

c9e1f37

identity is enabled

Ensure to disable all other credential providers except STS provider

fc45f07

Add an utility function to extract arguments from Mountpoint args

a1429f5

Disable caching with pod-level credentials

f527463

Ensure we don't log sensitive information from `csi.NodePublishVolume…

058f2dd

…Request`

muddyfish reviewed Aug 8, 2024

View reviewed changes

Move cluster-wide cleanup to AfterEach to ensure its always called

bb7d6a5

It was getting skipped if we `Skip` some tests and it was causing cluster to stay in an invalid state.

unexge merged commit 598d260 into feature/pod-level-identity Aug 8, 2024
8 checks passed

unexge deleted the unexge/pod-level-identity branch August 8, 2024 14:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pod-level identity improvements #237

Pod-level identity improvements #237

unexge commented Aug 7, 2024

muddyfish Aug 8, 2024

Pod-level identity improvements #237

Pod-level identity improvements #237

Conversation

unexge commented Aug 7, 2024

muddyfish Aug 8, 2024

Choose a reason for hiding this comment