KEP 4447: Promote PolicyReport API to Kubernetes SIG API #4448
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. This can overwhelm etcd.
Suggested change:
Based on the producer and usage, it is possible to create lots of report objects.
For example, if a policy engine has 20 policy rules and a namespace has 1000 pods,
an implementation may produce 20,000 reports. If a cluster operator deploys PolicyReport
into their cluster, using this API can overwhelm etcd.
(is this a risk? We already let people deploy any CRD they like.)
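For reference, the 20,000 figure in the text under review is simply rules × pods, which assumes one report object per (rule, pod) pair. The Go sketch below reproduces that arithmetic and compares it with two coarser groupings. The `Granularity` type and its values are hypothetical names used only for illustration; they are not part of the PolicyReport API, and a real producer may group results differently.

```go
package main

import "fmt"

// Granularity is a hypothetical knob describing how a producer groups
// results into report objects. It is NOT part of the PolicyReport API.
type Granularity int

const (
	PerRuleAndResource Granularity = iota // one report per (rule, pod) pair
	PerResource                           // one report per pod, holding all rule results
	PerNamespace                          // one report for the whole namespace
)

// ReportCount estimates how many report objects a producer would create
// for a single namespace under the given grouping.
func ReportCount(rules, pods int, g Granularity) int {
	switch g {
	case PerRuleAndResource:
		return rules * pods
	case PerResource:
		return pods
	case PerNamespace:
		return 1
	default:
		return 0
	}
}

func main() {
	// The example from the KEP text: 20 rules, 1000 pods.
	fmt.Println(ReportCount(20, 1000, PerRuleAndResource)) // 20000
	fmt.Println(ReportCount(20, 1000, PerResource))        // 1000
	fmt.Println(ReportCount(20, 1000, PerNamespace))       // 1
}
```

Coarser grouping lowers the object count but makes each object larger, so it trades object count against per-object size rather than removing the storage cost.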
It has definitely been reported as an issue for users of the API. Whether that constitutes a risk or not is a good question, but this should be highlighted somewhere
We have 2 projects we can link to and reference - perhaps in the docs?
This is my core concern with the concept of reporting findings via the API server.
I think events are another example of a high-volume object? One salient difference between this and events is that events are understood to be subject to throttling/sampling. Security reports may not have the same luxury.
I like the idea of reports-server, but IMO it would need to be an expectation that all clusters have a similar scalable backend solution before reports could be reliably enabled without risking cluster stability.
I should also add that there is a difference between "users can deploy any CRD they like" and "K8s accepts using KRM/the API server this way as a valid practice"; the second statement has much stronger implications around supportability.
@JimBugwadia can you comment on the cluster reliability and performance concerns brought up here?
Hi @ritazh - can you please help clarify what exactly is expected?
The proposal is for a uniform API for reporting, and reliability or performance will depend heavily on implementations. For example, the API as a contract between consumers and producers can be used as a bounded log for the last N results.
We can help document best practices, but seems like a number of those may be applicable to any other API as well. For example, the standard size limits would apply, and resource limits can be configured.
Is there any prior work, done to test performance and reliability impacts of other APIs, that we can reference?
If there are specific tests or measurements that are recommended, happy to help capture the data.
We need approvals from the following stakeholders:
[TBD]
Will we target this API at a release?
No, it's decoupled from Kubernetes releases.
- Add `policy-report-api` as a new project under kubernetes-sigs, i.e. `github.com/kubernetes-sigs/policy-report-api`
- Provide guidance on building consumers and producers
Do we want to publish official artefacts for the API?
- YAML manifest?
- OCI image of Helm chart?
- something else?
Here's what I suggest:
- Golang client set to reuse in producers and consumers
- Generated YAMLs
- API spec
- Docs
/assign @ritazh
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
Co-authored-by: Andy Suderman <andy@suderman.dev>
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
not need to be as detailed as the proposal, but should include enough
information to express the idea and why it was not acceptable.
-->
- Adopt PolicyReport as an official, in-tree Kubernetes API
I get that this proposal is not about putting the API in-tree, but I'd like to register my opposition to putting it in-tree.
From a scalability/stability point of view, reporting seems like a secondary concern compared to running an application, and better targeted for a datastore that is not serving traffic (i.e. not the cluster running the actual workloads being audited). From a security operations point of view, I’d be concerned about the veracity of this API if it was coming from a cluster that is hosting the workload being reported on. I’d really want security reporting information to be stored in a separate domain (in this case, cluster) from the source of the data. If it were in the same cluster, any Kubernetes CVE/authorization misconfiguration becomes that much worse and calls the authenticity of report data into question. To anyone relying on reports for compliance, that becomes a real business-impacting issue. Given all that, it feels like making such an API part of Kubernetes core would naturally lead people to adopt anti-patterns in both (stability & security) cases, making the cluster a self-contained unit where source data is gathered and reports are stored.
@micahhausler Yes, this proposal is not for adding the API in-tree. Instead, it is for a uniform API for reporting. I have listed in-tree as an alternative that was considered, but ruled out as policy reports are best managed as a Custom Resource. Perhaps I should clarify that.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: anusha94 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Closing this PR in favor of the OpenReports proposal.
/sig auth
/wg policy
cc @JimBugwadia