Feature/pod level identity #263

Merged
merged 15 commits into from
Sep 23, 2024

Conversation

muddyfish
Contributor

Issue #, if available: #111

Description of changes:

  • Adds support for volume 'authentication sources'. Currently supported authentication sources include 'pod' and 'driver', with 'driver' being the default.
  • Adds documentation for configuring the CSI Driver, including authentication sources.
  • Pod authentication source only supports IRSA credentials.
  • CSI Driver service account now requires permissions to list all service accounts in the cluster.
  • Adds an example of usage to static_provisioning examples.
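
As a rough illustration of the "driver by default, pod on request" behaviour described above, the selection could be sketched as follows. The attribute key `authenticationSource`, the constants, and the function name are assumptions made for this sketch and are not confirmed against the driver's source.

```go
package main

import "fmt"

// Hypothetical names: the attribute key and constants below are illustrative
// and may not match the identifiers used in the driver.
const (
	authenticationSourceDriver = "driver"
	authenticationSourcePod    = "pod"
	attrAuthenticationSource   = "authenticationSource" // assumed volume attribute key
)

// authenticationSource returns the source selected in the volume attributes.
// An unset or empty attribute falls back to "driver"; anything other than
// "driver" or "pod" is rejected.
func authenticationSource(volumeAttributes map[string]string) (string, error) {
	switch src := volumeAttributes[attrAuthenticationSource]; src {
	case "", authenticationSourceDriver:
		return authenticationSourceDriver, nil
	case authenticationSourcePod:
		return authenticationSourcePod, nil
	default:
		return "", fmt.Errorf("unknown authentication source %q, expected %q or %q",
			src, authenticationSourceDriver, authenticationSourcePod)
	}
}

func main() {
	src, err := authenticationSource(map[string]string{"authenticationSource": "pod"})
	fmt.Println(src, err)
}
```

Rejecting unknown values, rather than silently falling back to the default, keeps a typo in the volume configuration from quietly granting driver-level credentials.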

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

muddyfish and others added 15 commits September 11, 2024 15:31
Implements a proof of concept of pod level identity

Co-authored-by: Simon Beal <simobeal@amazon.com>
Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
Co-authored-by: Simon Beal <simobeal@amazon.com>
Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Initial e2e test for pod level identity

* Temporarily remove other e2e tests

In WIP state where configuration options for pod level identity aren't yet complete.

* Add some initial set of credential tests

* Revert `tests/e2e-kubernetes/scripts/eksctl-patch.yaml` to original

* Pass created files seed to `expectReadToSucceed` function

* Use updated SA object while restoring the override

* Improve assertions

---------

Co-authored-by: Simon Beal <simobeal@amazon.com>
Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Create credential provider and refactor pod-level e2e tests

* Enable all end-to-end tests

* Fix sanity tests

* Update CI scripts

* Clean resources in reverse order similar to how `defer` works

* Delete pod with `gracePeriod=0` in `expectFailToMount`

* Increase end-to-end test timeout to 30 minutes

* Implement STS region detection

* Cleanup service account tokens before and after mount

* Separate pod-level tests for ease of review

* Add pod-level tests back

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
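
The "clean resources in reverse order" item above mirrors how Go runs deferred calls last-in, first-out. A minimal sketch of that idea, using a hypothetical `cleanupStack` helper that is not taken from the test suite:

```go
package main

import "fmt"

// cleanupStack is a hypothetical helper: it collects cleanup functions and
// runs them in reverse registration order, the same order `defer` would use.
type cleanupStack struct {
	funcs []func()
}

// Add registers a cleanup function to run later.
func (c *cleanupStack) Add(f func()) {
	c.funcs = append(c.funcs, f)
}

// Run calls the registered functions last-in, first-out and then clears
// the stack so a second Run is a no-op.
func (c *cleanupStack) Run() {
	for i := len(c.funcs) - 1; i >= 0; i-- {
		c.funcs[i]()
	}
	c.funcs = nil
}

func main() {
	var cs cleanupStack
	cs.Add(func() { fmt.Println("delete service account") }) // created first, removed last
	cs.Add(func() { fmt.Println("delete pod") })             // created last, removed first
	cs.Run()
}
```

Tearing down in reverse order means a resource is removed before the things it depends on, for example a pod before the service account it uses.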
* Add `WriteFileAtomic` function

This is copied from Tailscale codebase.

* Ensure service account token files are unique per mount

We're using Pod ID + Volume ID to ensure uniqueness.

* Fix race condition in unit tests

* Use `github.com/google/renameio` for atomic writes to a file

* Fix test coverage report in CI

* Replace spaces with tab in `Makefile`
Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
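
To make the two ideas in this commit concrete (a token filename derived from Pod ID + Volume ID, and an atomic write), here is a standard-library-only sketch. It does not use `github.com/google/renameio`; it substitutes the plain write-to-temp-file-then-rename pattern. The function bodies, the `.token` suffix, and the use of `url.PathEscape` for escaping are assumptions, not the driver's actual code.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
	"path/filepath"
)

// tokenFilename builds a per-mount filename from the Pod ID and Volume ID.
// Both parts are escaped so a "/" in either cannot turn the name into a path.
// The ".token" suffix and the escaping function are assumptions.
func tokenFilename(podID, volumeID string) string {
	return url.PathEscape(podID) + "-" + url.PathEscape(volumeID) + ".token"
}

// writeFileAtomic writes data to a temporary file in the destination
// directory and renames it into place, so readers see either the old
// content or the new content, never a partial write.
func writeFileAtomic(path string, data []byte, perm os.FileMode) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// On any failure, remove the temporary file instead of leaving it behind.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Chmod(perm); err != nil {
		return err
	}
	if err = tmp.Sync(); err != nil {
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	dir, err := os.MkdirTemp("", "tokens")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, tokenFilename("pod-uid", "vol-id"))
	if err := writeFileAtomic(path, []byte("token-contents"), 0o400); err != nil {
		panic(err)
	}
	fmt.Println(filepath.Base(path))
}
```

The temporary file is created in the same directory as the target because a rename is only atomic within a single filesystem.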
* Add a test case to make sure we're not using secrets if pod-level
identity is enabled

* Ensure to disable all other credential providers except STS provider

* Add a utility function to extract arguments from Mountpoint args

* Disable caching with pod-level credentials

* Ensure we don't log sensitive information from `csi.NodePublishVolumeRequest`

* Move cluster-wide cleanup to `AfterEach` to ensure it's always called

It was getting skipped if we `Skip` some tests and it was causing
cluster to stay in an invalid state.

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Support `UNSTABLE_MOUNTPOINT_CACHE_KEY` when using pod level identity

When using pod level identity, send a cache key to Mountpoint so that caches
stay separate even if the cache location is the same. Sharing a cache location
is common when using the PLI feature.

* Enable cache with PLI

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Add `ParseTargetPath` function

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Clean up tokens by parsing target path in `NodeUnpublishVolume`

We need Pod ID and Volume ID to clean up any previously created
tokens. `NodeUnpublishVolume` call receives Volume ID but not Pod ID.
We were keeping the mapping from target path to Pod ID in memory
before but this is not ideal as it's not persistent and we were losing
this data when CSI driver restarts. With this change, we're parsing
target path to get Pod ID and use that to clean up tokens without
storing any state in the CSI driver.

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Use a RegExp to parse target path

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Update target path RegExp

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

---------

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
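
The stateless cleanup described above depends on recovering the Pod ID from the target path. A minimal sketch with a regular expression follows, assuming the usual kubelet layout `.../pods/<pod-uid>/volumes/kubernetes.io~csi/<volume-name>/mount`. The struct, field names, and exact pattern are illustrative and are not the ones merged in this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// TargetPath holds the pieces recovered from a CSI target path.
// The type and its field names are hypothetical.
type TargetPath struct {
	PodID      string
	VolumeName string
}

// targetPathRegexp assumes the conventional kubelet layout. The kubelet root
// directory is left unanchored because it can be configured per cluster.
var targetPathRegexp = regexp.MustCompile(
	`/pods/([^/]+)/volumes/kubernetes\.io~csi/([^/]+)/mount$`)

// ParseTargetPath extracts the Pod ID and volume name from a target path,
// or returns an error when the path does not match the assumed layout.
func ParseTargetPath(path string) (TargetPath, error) {
	m := targetPathRegexp.FindStringSubmatch(path)
	if m == nil {
		return TargetPath{}, fmt.Errorf("unexpected target path %q", path)
	}
	return TargetPath{PodID: m[1], VolumeName: m[2]}, nil
}

func main() {
	tp, err := ParseTargetPath(
		"/var/lib/kubelet/pods/46efe8aa-75d9-4b12-8fdd-0ce0c2cabd99/volumes/kubernetes.io~csi/s3-pv/mount")
	fmt.Println(tp.PodID, tp.VolumeName, err)
}
```

Because everything needed is derived from the path passed to `NodeUnpublishVolume`, nothing has to survive a driver restart.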
* Add authentication source to user-agent

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Add authentication source to user-agent unconditionally

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

---------

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Update Go to `1.22.7`

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Escape Pod ID in `tokenFilename` function

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Set permissions of Unix socket explicitly

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

---------

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
* Set max receive message size to 2MB for the gRPC server

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Set `readOnlyRootFilesystem: true` and `allowPrivilegeEscalation:
false` for containers

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Check device type before unmounting the mount point

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Ensure to escape newlines from log entries

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Return pointer of `NodePublishVolumeRequest` to use nice formatting

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

* Add tests for `IsMountPoint` method

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>

---------

Signed-off-by: Burak Varlı <burakvar@amazon.co.uk>
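
One of the hardening items above is escaping newlines in log entries, which stops request-supplied values from forging extra log lines. A minimal sketch of that idea follows. The helper name and the choice of `strings.NewReplacer` are assumptions, not the approach confirmed in the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// newlineEscaper rewrites raw line breaks as their visible escape sequences.
// A *strings.Replacer is safe for concurrent use by multiple goroutines.
var newlineEscaper = strings.NewReplacer("\n", `\n`, "\r", `\r`)

// escapeNewlines is a hypothetical helper that keeps a value on a single
// log line by replacing carriage returns and line feeds.
func escapeNewlines(s string) string {
	return newlineEscaper.Replace(s)
}

func main() {
	untrusted := "bucket-name\nI0101 fake log entry"
	fmt.Println("mounting volume: " + escapeNewlines(untrusted))
}
```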
PLI docs

- Add diagrams for the various approaches
- Add descriptions for the driver and pod level approaches
- Migrate configuration instructions from install.md to new configuration file
Add notes to say that only IRSA is supported with PLI, and also that driver level credential sources will be ignored.
* Update `deploy` directory for Pod Level Identity

* Move resources around in `deploy` directory

Rendered with `helm template --debug charts/aws-mountpoint-s3-csi-driver`

---------

Co-authored-by: Simon Beal <simobeal@amazon.com>
* Enable `podInfoOnMount` when using k8s>=1.30

* Add warning about needing to pass in special config for PLI on clusters <1.30

Add instructions on configuring STS regions

* Use `.Capabilities.KubeVersion.Version` rather than GitVersion

* Add `podInfoOnMount: true` to kustomize install
@muddyfish muddyfish requested a review from unexge September 23, 2024 12:51
@muddyfish muddyfish merged commit bad6cd5 into main Sep 23, 2024
16 checks passed
@unexge unexge deleted the feature/pod-level-identity branch September 23, 2024 14:18
2 participants