Make it possible to view and debug the lifecycle of a secret populated by the kubernetes_secrets provider #6187

Closed
cmacknz opened this issue Dec 2, 2024 · 4 comments · Fixed by #6841
Assignees
Labels
Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@cmacknz
Member

cmacknz commented Dec 2, 2024

The agent kubernetes_secrets provider has an internal cache that updates periodically and includes a TTL. https://www.elastic.co/guide/en/fleet/current/kubernetes_secrets-provider.html

The secrets provider includes limited logging about when individual secrets are updated or expired. The values populated by this provider should also be considered secrets in diagnostics and be redacted, making it challenging or impossible to tell if a secret updated when it was supposed to.

For example, when the cache updates today, the provider logs only a single line saying that the cache changed, without indicating which secret was updated or deleted:

updatedCache := p.updateCache()
if updatedCache {
	p.logger.Info("Secrets cache was updated, the agent will be notified.")
	comm.Signal()
}

We recently had an internal case where a JWT populated as a Kubernetes secret in the agent policy did not rotate as expected, and the limited logging in the agent made it impossible to tell whether the agent was involved in the root cause.

Add logging that would allow us to verify that agent is updating secrets at the expected times with the correct values in the policy, without leaking the actual secret values.

@cmacknz cmacknz added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Dec 2, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@adamkasztenny

Maybe the hash of the secret could be logged? Elasticsearch does something similar for the JWKS here (although admittedly that's not necessarily secret data, the JWKS could be publicly accessible). This could be an optional setting that is disabled by default.
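To make the hashing suggestion concrete, here is a minimal sketch of an opt-in fingerprint. The function name `fingerprint` and the `enabled` flag are hypothetical and not part of the agent's code; the idea is only that two log lines can be compared without printing the secret itself.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint returns a short SHA-256 based identifier for a secret value.
// It is a hypothetical helper: "enabled" stands in for an opt-in setting
// that is off by default, in which case nothing is derived from the value.
func fingerprint(value string, enabled bool) string {
	if !enabled {
		return ""
	}
	sum := sha256.Sum256([]byte(value))
	// Keep only the first 8 bytes (16 hex characters): enough to notice
	// that a value changed between two log lines.
	return hex.EncodeToString(sum[:8])
}

func main() {
	before := fingerprint("old-token", true)
	after := fingerprint("new-token", true)
	fmt.Println("secret changed:", before != after)
}
```

One caveat with this approach: an unsalted hash of a short or guessable secret can be brute-forced offline, so even a truncated fingerprint reveals something about the value.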

@pkoutsovasilis
Contributor

@adamkasztenny 👋 instead of hashing actual secret data, which I am not entirely comfortable with doing 😅, would logging the resource version result in the same benefits as hashing?

@adamkasztenny

@pkoutsovasilis Yep that's a good idea, I think that's better than using the hash and should achieve the same thing. We just care if the secret has changed, not the contents.
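A rough sketch of what resource-version logging could look like, assuming each cache entry stores the Kubernetes `metadata.resourceVersion` of the Secret it came from. The `cachedSecret` type, the `describeChanges` function, and the example keys are hypothetical and do not reflect the provider's actual implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// cachedSecret is a hypothetical cache entry. Value is never logged;
// ResourceVersion is the opaque version string Kubernetes assigns to the
// Secret object and changes whenever the object is modified.
type cachedSecret struct {
	Value           string
	ResourceVersion string
}

// describeChanges compares two cache snapshots and returns one log-safe
// line per key that was added, removed, or whose resourceVersion changed.
// Lines are sorted so the output is stable across runs.
func describeChanges(old, updated map[string]cachedSecret) []string {
	var lines []string
	for key, cur := range updated {
		prev, ok := old[key]
		switch {
		case !ok:
			lines = append(lines, fmt.Sprintf("secret %q added (resourceVersion=%s)", key, cur.ResourceVersion))
		case prev.ResourceVersion != cur.ResourceVersion:
			lines = append(lines, fmt.Sprintf("secret %q updated (resourceVersion %s -> %s)", key, prev.ResourceVersion, cur.ResourceVersion))
		}
	}
	for key, prev := range old {
		if _, ok := updated[key]; !ok {
			lines = append(lines, fmt.Sprintf("secret %q removed (last resourceVersion=%s)", key, prev.ResourceVersion))
		}
	}
	sort.Strings(lines)
	return lines
}

func main() {
	old := map[string]cachedSecret{
		"kubernetes_secrets.default.jwt.token": {Value: "aaa", ResourceVersion: "1001"},
		"kubernetes_secrets.default.db.pass":   {Value: "bbb", ResourceVersion: "900"},
	}
	updated := map[string]cachedSecret{
		"kubernetes_secrets.default.jwt.token": {Value: "ccc", ResourceVersion: "1042"},
	}
	for _, line := range describeChanges(old, updated) {
		fmt.Println(line)
	}
}
```

Because only the key and the resource version appear in the output, such lines could be matched against the Secret's history in the cluster to check whether a rotation reached the agent, without exposing the secret contents.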


6 participants