Document how to manage certificates in multi master HA setup #2560

Closed
gbarton opened this issue Nov 20, 2020 · 2 comments
Labels
kind/documentation, status/stale

Comments

@gbarton

gbarton commented Nov 20, 2020

Is your feature request related to a problem? Please describe.
We are having challenges with certificate management in k3s, specifically around cycling baked k3os AWS AMIs.
From this post we learned to generate our own CAs and bake them into the images: #1868 (comment). This successfully allows a new node built from the same AMI to join an existing cluster, so we can scale appropriately and everything works as intended.
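For anyone else hitting this, the baking step looks roughly like the sketch below. It is only a sketch: the file names follow the custom-CA layout under /var/lib/rancher/k3s/server/tls discussed in #1868, the exact set may differ by k3s version, and BUILD_ROOT is a made-up name for wherever the AMI filesystem gets staged.

```python
#!/usr/bin/env python3
"""Rough sketch: pre-generate k3s CA material and drop it into the AMI build
root so every image shares the same CAs. Adjust the paths and file set for
your k3s version; BUILD_ROOT is a placeholder for the staged filesystem."""
import subprocess
from pathlib import Path

BUILD_ROOT = Path("./image-root")  # placeholder: wherever the AMI root is staged
TLS_DIR = BUILD_ROOT / "var/lib/rancher/k3s/server/tls"

# CA pairs k3s looks for on first start (service-account key handled below).
CA_NAMES = [
    "client-ca",
    "server-ca",
    "request-header-ca",
    "etcd/peer-ca",
    "etcd/server-ca",
]

for name in CA_NAMES:
    crt = TLS_DIR / f"{name}.crt"
    key = TLS_DIR / f"{name}.key"
    crt.parent.mkdir(parents=True, exist_ok=True)
    # Self-signed CA with a 10-year validity; k3s signs its leaf certs off these.
    subprocess.run(
        [
            "openssl", "req", "-x509", "-nodes",
            "-newkey", "ec", "-pkeyopt", "ec_paramgen_curve:prime256v1",
            "-keyout", str(key), "-out", str(crt),
            "-days", "3650", "-subj", f"/CN=k3s-{name.replace('/', '-')}",
        ],
        check=True,
    )

# Service-account token signing key; it must be identical on every server,
# or existing token secrets are invalidated (the havoc described above).
subprocess.run(
    ["openssl", "genrsa", "-out", str(TLS_DIR / "service.key"), "2048"],
    check=True,
)
```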

The problem is that when we create a new AMI (usually when k3os updates k3s), the other certs (not listed in the post above) are regenerated and cause havoc when we spin the new AMI up into an existing cluster. The net result is that the existing cluster secrets are invalidated and nothing can talk to the k3s APIs, so the cluster grinds to a halt. We have to cycle all nodes onto the new AMI, delete all the affected secrets, let them regenerate, and then manually kill pretty much every pod so they grab new tokens. This obviously causes outages, which we would like to avoid.
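For reference, our manual cleanup is roughly the sketch below. It assumes kubectl already points at the affected cluster, and it is deliberately destructive: it deletes every service-account token secret and then every pod, which is exactly the outage we want to avoid.

```python
#!/usr/bin/env python3
"""Rough sketch of the manual cleanup: delete the now-invalid service-account
token secrets and then kill pods so they remount freshly signed tokens.
Destructive by design; assumes kubectl points at the affected cluster."""
import json
import subprocess

def kubectl(*args: str) -> str:
    """Run kubectl and return stdout, raising on any error."""
    return subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    ).stdout

# 1. Delete every service-account token secret; the controller recreates
#    them, signed with the new service-account key.
secrets = json.loads(kubectl(
    "get", "secrets", "--all-namespaces",
    "--field-selector", "type=kubernetes.io/service-account-token",
    "-o", "json",
))
namespaces = set()
for item in secrets["items"]:
    ns = item["metadata"]["namespace"]
    name = item["metadata"]["name"]
    namespaces.add(ns)
    kubectl("delete", "secret", "-n", ns, name)

# 2. Kill every pod in those namespaces so they pick up the new tokens.
for ns in sorted(namespaces):
    kubectl("delete", "pods", "--all", "-n", ns)
```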

Describe the solution you'd like
I would like to see the k3s HA docs amended to cover certificate management: specifically, which certificates and keys need to be shared or kept consistent among master nodes so that an upgrade does not cause havoc in a running cluster. It is clearly more than the CA files mentioned above. This should include how to generate any subsequent certificates and the correct certificate details, such as SANs.

Describe alternatives you've considered
None; I am at a loss.

Additional context
We cut AMIs so that we can bake in all the core containers we use (aws-ebs-controller, traefik, filebeat, etc.) as well as our registry configuration. We have no intention of restarting or otherwise managing nodes once they are added to a cluster; we simply kill and cycle them, the way one normally manages EC2 instances.

Any guidance is appreciated. Thank you for an extremely well-made k8s implementation!

@radixCSgeek

Through a lot of trial and error we discovered that if you copy service.key and serving-kubelet.key from your running master and bake them into all of your future AMIs, everything appears to keep communicating. This doesn't necessarily seem like the right way to do it, though.
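Concretely, the baking step is roughly the sketch below. The source paths are what a default k3s layout appears to use, so double-check them on your own master; BUILD_ROOT is a made-up name for the staged image filesystem.

```python
#!/usr/bin/env python3
"""Sketch of the workaround above: copy the two keys off a running master
into the AMI build root so every future image shares them. Source paths are
from a default k3s layout; double-check them on your own nodes."""
import shutil
from pathlib import Path

BUILD_ROOT = Path("./image-root")  # placeholder: staged AMI filesystem

KEYS = [
    # service-account token signing key
    "/var/lib/rancher/k3s/server/tls/service.key",
    # kubelet serving key (location may differ; ours sits under the agent dir)
    "/var/lib/rancher/k3s/agent/serving-kubelet.key",
]

for src in KEYS:
    dst = BUILD_ROOT / src.lstrip("/")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)  # preserves the restrictive file modes
```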

davidnuzik added the kind/documentation label Nov 20, 2020
davidnuzik added this to the Documentation Backlog milestone Nov 20, 2020
@stale

stale bot commented Jul 30, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
