Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add support for dynamically creating and mounting Persistent Volume Claims #144

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fonaid
Copy link

@fonaid fonaid commented Jan 10, 2025

No description provided.

@fonaid
Copy link
Author

fonaid commented Jan 10, 2025

Thank you in advance for taking a look at this proposal! I would be interested in your view on the rejected idea, too. If you think the implementation would be feasible and more elegant, I am happy to rework the proposal.

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal. Managing the persistent volumes claims can be very complex. Just look at all the logic needed to revert unallowed changes, resizing of the volumes etc. I'm not sure I understand the motivation for adding all this complexity to the additional volumes. But this is in any case missing in the proposal.

The NodePools were designed exactly to handle a situations like this. So at least in Kafka clusters, using their functionality would be the right way to go I think.

@fonaid
Copy link
Author

fonaid commented Jan 10, 2025

@scholzj Thank you for checking on this. With all the complex logic already implemented for the data volumes, my initial thought was that parsing some more PVCs from the resource and adding them to the PVC list (e.g. in KafkaCluster) would not introduce too much additional complexity but I am not familiar with the codebase to such extent.

As for the motivation behind this, we have a use case where each broker needs a unique and separate persistent volume, which is mounted in additional volumes. Currently we need to create PVCs by hand and more importantly, a separate KafkaNodePool for each broker. This way cluster scaling requires more manual work as we can not increase the replicas in the KafkaNodePool. For example, the resources look like for one broker:

kind: PersistentVolumeClaim
metadata:
  name: pvc-0
spec: ...
---
kind: KafkaNodePool
...
spec:
  replicas: 1
  template:
    pod:
      volumes:
        - name: manual-pvc
          persistentVolumeClaim:
            claimName: pvc-0

The above resources need to be created for each broker. If we want to scale up, we need to define another PVC-NodePool pair instead of just increasing the number of replicas.

I am not aware of any feature in KafkaNodePool which would simplify the above use case but maybe I am missing something.

@scholzj
Copy link
Member

scholzj commented Jan 10, 2025

@scholzj Thank you for checking on this. With all the complex logic already implemented for the data volumes, my initial thought was that parsing some more PVCs from the resource and adding them to the PVC list (e.g. in KafkaCluster) would not introduce too much additional complexity but I am not familiar with the codebase to such extent.

That is a completely different part of the code base. It would either require significant redesign and re-architecture of the code to handle it in the same place by the existing logic or a significant code duplication. So I think it would introduce a lot of additional complexity. But in any case, this would need to be covered by the proposal.

As for the motivation behind this, we have a use case where each broker needs a unique and separate persistent volume, which is mounted in additional volumes. Currently we need to create PVCs by hand and more importantly, a separate KafkaNodePool for each broker. This way cluster scaling requires more manual work as we can not increase the replicas in the KafkaNodePool. For example, the resources look like for one broker:

One of the purposes of the proposal is to explain the use cases and explain why this is something that is a generally applicable feature that will find multiple users. You have to explain and convince people why is it worth for the community to take this feature and commit to its development and maintenance effort for you.

So a start is to not say we have a use case but explain the use case. If you fail to explain the use case, and why it is applicable to everyone / useful to others, to be honest, the "manual work" is what might be the best approach for unique requirements (plus I'm pretty sure it can be handled by some outside automation if needed).

@ppatierno
Copy link
Member

I agree with Jakub. The proposal should explain why this stuff would be important to support for more general use cases than just one specific (your) use case. Adding something like this is not just developing the feature but also having commitment to maintainance, test and fix (if any bug is found). It's not a one shot contribution, so it's worth thinking through it more.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants