Injector failure mode prevents Pod deletion #500
Comments
Hello, we were wondering over here why even watch for UPDATE events in the operator? What is it accounting for? Thanks!
@komapa You're probably right that only watching for CREATE events would be a cleaner fix for this problem. It would be nice to hear from the vault-k8s maintainers.
I don't think the fix #501 is bulletproof. Let's assume the pod from step 3 fails to schedule. The user realizes this and attempts to delete the Pod (or Job). At this point, the Pod still has [...]
The more appropriate fix is to simply NOT add a container if the webhook is called for an UPDATE operation.
/reopen
Sorry for letting this linger, and thanks for the reports. I'm going to work on getting hashicorp/vault-helm#783 merged, which will fix this properly as per the above comments.
#619 is also related for anyone using the deployment yaml instead of the helm chart.
#783 is merged, hopefully this fix will stick with the next release of the helm chart.
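The suggestion in the comments above, leaving the Pod untouched unless the webhook is called for a CREATE operation, can be sketched roughly as follows. This is a minimal illustration and not the vault-k8s implementation (the injector itself is written in Go). The function name `review_response` and the `patch_ops` parameter are hypothetical; the `operation`, `uid`, `allowed`, `patchType`, and `patch` fields come from the `admission.k8s.io/v1` AdmissionReview format.

```python
import base64
import json


def review_response(review: dict, patch_ops: list) -> dict:
    """Build an AdmissionReview response that only patches on CREATE.

    `patch_ops` is a hypothetical list of JSON Patch operations that the
    injector would otherwise apply (for example, adding sidecar containers).
    """
    request = review["request"]
    response = {"uid": request["uid"], "allowed": True}

    # Guard proposed in the thread: UPDATE (and DELETE/CONNECT) requests are
    # allowed through unchanged, so an existing Pod never gains containers.
    if request.get("operation") == "CREATE" and patch_ops:
        encoded = base64.b64encode(json.dumps(patch_ops).encode()).decode()
        response["patchType"] = "JSONPatch"
        response["patch"] = encoded

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

With a guard like this, an UPDATE that only removes a finalizer is returned as allowed with no patch, regardless of which annotations the Pod carries.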
Describe the bug
If a Pod has the agent-inject annotation yet gets created without the injected sidecars, then any future update to the Pod will trigger the injector to add the sidecars. If the Pod has already been created, these attempts to modify `spec.containers` or `spec.initContainers` will fail, thus causing the Pod update to fail. If the Pod has a finalizer, it will be impossible to remove the finalizer, and therefore it will be impossible to remove the Pod.
It's easy to enter this failure mode if there's a temporary connectivity error between the Kubernetes apiserver and the vault agent injector.
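The failure mode described above can be reproduced in a toy model, which may help show why a harmless update such as removing a finalizer gets rejected. Everything here is a simplified stand-in and an assumption for illustration: `inject` mimics a webhook that adds containers whenever the annotation is present, `validate_update` mimics the API server rule that an existing Pod's container lists cannot change, and the container names are placeholders.

```python
import copy

ANNOTATION = "vault.hashicorp.com/agent-inject"


class Forbidden(Exception):
    """Stand-in for the API server rejecting an update."""


def inject(pod: dict) -> dict:
    """Toy mutating webhook: add sidecars if annotated and not yet injected."""
    out = copy.deepcopy(pod)
    annotations = out["metadata"].get("annotations", {})
    names = [c["name"] for c in out["spec"].get("containers", [])]
    if annotations.get(ANNOTATION) == "true" and "vault-agent" not in names:
        out["spec"].setdefault("initContainers", []).append({"name": "vault-agent-init"})
        out["spec"]["containers"].append({"name": "vault-agent"})
    return out


def validate_update(old: dict, new: dict) -> None:
    """Toy validation: container lists of an existing Pod are immutable."""
    for field in ("initContainers", "containers"):
        if len(old["spec"].get(field, [])) != len(new["spec"].get(field, [])):
            raise Forbidden(
                f"spec.{field}: Forbidden: pod updates may not add or remove containers"
            )


def apply_update(stored: dict, desired: dict) -> dict:
    """An UPDATE passes through the webhook first, then validation."""
    mutated = inject(desired)
    validate_update(stored, mutated)
    return mutated
```

In this model, a Pod stored without sidecars but with the annotation fails every `apply_update`, including one whose only change is clearing `metadata.finalizers`. Removing the annotation in the same update lets it through.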
To Reproduce
Steps to reproduce the behavior:
1. `spec.failurePolicy: Ignore` and `spec.timeoutSeconds: 30`.
2. `kubectl scale deploy --replicas=0`.
3. `sleep 300`.
4. `kubectl scale deploy --replicas=2`.
5. `batch.kubernetes.io/job-tracking` finalizer, but it will fail with `Pod "pod-name" is invalid: spec.initContainers: Forbidden: pod updates may not add or remove containers`.
6. `kubectl edit pod` will fail with the same error.
Expected behavior
The vault-agent-injector shouldn't block Pods from being finalized.
Environment
Workaround
To delete a Pod that's stuck in this state, use `kubectl edit pod` to delete the `vault.hashicorp.com/agent-inject` annotation.
Fix
Even though this behavior is surprising, and the Kubernetes error message isn't super helpful, I think Kubernetes is actually doing the right thing. I think the injector could be modified to address this problem, though. If it's being asked to mutate a Pod, and the Pod's `status.phase` is a string other than `Pending`, then it should do nothing. In other words, if a Pod has already been created, the injector shouldn't try to add containers because the Pod's `spec.initContainers` and `spec.containers` are immutable.
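As a rough sketch of the phase check proposed here, the decision could look like the function below. The name `should_inject` is hypothetical, and two details are assumptions rather than anything stated in the issue: the annotation value is compared against the literal string `"true"`, and a missing or empty `status.phase` is treated the same as `Pending`, on the guess that a Pod under creation may not have a phase set yet.

```python
ANNOTATION = "vault.hashicorp.com/agent-inject"


def should_inject(pod: dict) -> bool:
    """Return True only for annotated Pods that have not started yet."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    if annotations.get(ANNOTATION) != "true":
        return False

    # Assumption: an absent or empty phase means the Pod is still being created.
    phase = pod.get("status", {}).get("phase", "")
    return phase in ("", "Pending")
```

Note that a Pod stuck in `Pending` (for example, one that fails to schedule) still passes this check, which is the gap raised in the comments on this issue.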