feat: Expose Iceberg table statistics in DataFusion interface(s) #869

Open
gruuya opened this issue Jan 3, 2025 · 3 comments · May be fixed by #880

@gruuya
Contributor

gruuya commented Jan 3, 2025

At present the two key DataFusion interfaces for Iceberg lack statistics information, as they rely on default (i.e. missing/unknown) implementations for TableProvider::statistics and ExecutionPlan::statistics.

These can be quite important (particularly the latter) during join planning: DataFusion applies a number of heuristics based on these stats when planning joins, so missing statistics can hurt query performance.
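As a rough illustration of what non-default statistics could look like, here is a minimal sketch that sums per-file metadata into a table-level summary. It does not use the real DataFusion or iceberg-rust APIs: `Precision` and `TableStats` are simplified local stand-ins for DataFusion's `Precision` and `Statistics` types, while `DataFileSummary` and `summarize` are hypothetical names, assuming only that each Iceberg data file exposes the spec's `record_count` and `file_size_in_bytes` fields.

```rust
// Simplified local stand-in for DataFusion's `Precision` type
// (assumption: the real one is generic and lives in the `datafusion` crates).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

// Simplified stand-in for DataFusion's `Statistics` (column stats omitted).
#[derive(Debug, PartialEq)]
struct TableStats {
    num_rows: Precision,
    total_byte_size: Precision,
}

impl TableStats {
    // What a default (missing/unknown) implementation effectively reports.
    fn unknown() -> Self {
        TableStats {
            num_rows: Precision::Absent,
            total_byte_size: Precision::Absent,
        }
    }
}

// Hypothetical per-file summary carrying two fields from the Iceberg spec.
struct DataFileSummary {
    record_count: u64,
    file_size_in_bytes: u64,
}

// Sum per-file metadata into table-level statistics.
// Assumption: when delete files are present, the summed row count is only
// an upper bound, so it is reported as `Inexact`.
fn summarize(files: &[DataFileSummary], has_deletes: bool) -> TableStats {
    let rows: usize = files.iter().map(|f| f.record_count as usize).sum();
    let bytes: usize = files.iter().map(|f| f.file_size_in_bytes as usize).sum();
    TableStats {
        num_rows: if has_deletes {
            Precision::Inexact(rows)
        } else {
            Precision::Exact(rows)
        },
        // On-disk file size is only a proxy for the size of the scanned
        // data, so it is always reported as inexact here.
        total_byte_size: Precision::Inexact(bytes),
    }
}
```

In this sketch, a planner comparing two join inputs would see a concrete row count from `summarize` rather than the `Absent` values returned by `TableStats::unknown()`, which is the kind of information join-ordering heuristics need.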

@gruuya
Contributor Author

gruuya commented Jan 3, 2025

I'd be happy to work on developing this.

@Xuanwo
Member

Xuanwo commented Jan 3, 2025

Thank you a lot for working on this!

@liurenjie1024
Contributor

Thanks @gruuya for doing this; let's continue the discussion in the PR.
