Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

chore: Change iceberg into workspace #15

Merged
merged 1 commit into from
Aug 1, 2023
Merged

Conversation

Xuanwo
Copy link
Member

@Xuanwo Xuanwo commented Aug 1, 2023

This PR will change iceberg into workspace. I feel like it's better to split iceberg features into different crates so that we can work (and release) concurrently instead of waiting for the release of a whole iceberg.

For example, we can add iceberg rest api client first which could be generated from OpenAPI Specs and less related to iceberg core itself.

Signed-off-by: Xuanwo <github@xuanwo.io>
@Xuanwo
Copy link
Member Author

Xuanwo commented Aug 1, 2023

cc @Fokko, @JanKaul, @liurenjie1024 & @ZENOTME to take a look. Thanks.

@Xuanwo
Copy link
Member Author

Xuanwo commented Aug 1, 2023

The problem is the release process.

  • Can we release iceberg-rest-catalog-client seperately?
  • Is it a heavy work?
  • Or it's better to release iceberg as a whole instead of bunch of crates?

Need more input from @Fokko.

Copy link
Collaborator

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, we will eventually have different feature in different crate, so separating it early is better

@liurenjie1024
Copy link
Collaborator

it's better to release iceberg as a whole instead of bunch of crates?

I prefer small crates have several advantages:

  1. Putting things in different crates helps to clarify dependencies relationships.
  2. Faster compilation speed.

For example I want to put datafusion integration in another crate.

Can we release iceberg-rest-catalog-client seperately?

I think it's better to make it part of core crate?

@Xuanwo
Copy link
Member Author

Xuanwo commented Aug 1, 2023

I think it's better to make it part of core crate?

Both LGTM. My motivation is to release it sooner, allowing us to involve more users and developers in our project.

@Fokko
Copy link
Contributor

Fokko commented Aug 1, 2023

Releasing takes some effort, but if everything is well automated, and there are folks around to test and validate the release, then it is a quick process. Even if it is not a lot of effort, it takes 3 days because everyone needs to be able to chime in and test the release. It would be great to set up a how-to-release guide, examples for Java and Python can be used for inspiration :)

What I think would be easiest, but correct me if I'm wrong here. Just release all the crates every time with the same version, so the versions of the crates don't diverge. WDYT?

@liurenjie1024
Copy link
Collaborator

liurenjie1024 commented Aug 1, 2023

What I think would be easiest, but correct me if I'm wrong here. Just release all the crates every time with the same version, so the versions of the crates don't diverge. WDYT?

Agree.

@Xuanwo
Copy link
Member Author

Xuanwo commented Aug 1, 2023

Just release all the crates every time with the same version, so the versions of the crates don't diverge. WDYT?

I'm not sure about that. Maybe we can release iceberg-rest-catalog-client with 0.0.1 and iceberg with 0.2.0?

If we are going to release everything in the same version, it's better to put them in the same crate with diffent feature flag instead.

NOTE: feature flag in rust is similiar to the addon in pip install package[addon].

@liurenjie1024
Copy link
Collaborator

If we are going to release everything in the same version, it's better to put them in the same crate with different feature flag instead.

There is a discussion about feature vs crates here: https://users.rust-lang.org/t/guidance-on-optional-functionality-crates-vs-features/85694

Personally, I prefer to put standalone things in crates for better structure and maintainability, also better documentation.

@JanKaul
Copy link
Collaborator

JanKaul commented Aug 1, 2023

I'm definitely in favor of small crates compared to using features. From my perspective features can get really messy when there is a hierarchy of crates depending on each other (Crate A uses feature 1 of iceberg and crate B wants to use crate A without feature 1 and so on). Plus compile times should be much better.

There is one downside. You cannot implement traits for structs defined outside of your crate. So if you want to implement Datafusions TableProvider for example, you have to create a wrapper struct like so:

struct DatafusionTable(IcebergTable);

Reagrding the release cadence, I have no preference. We could release the small crates all with the same version number or release them separately.

Copy link
Collaborator

@JanKaul JanKaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thank you.

@Fokko Fokko merged commit 5776a66 into apache:main Aug 1, 2023
7 checks passed
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants