-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
feat: Support metadata table "Manifests" #861
Conversation
Signed-off-by: xxchan <xxchan22f@gmail.com>
Signed-off-by: xxchan <xxchan22f@gmail.com>
Signed-off-by: xxchan <xxchan22f@gmail.com>
#822 has been merged, let's move on! |
7693dd5
to
940ddc1
Compare
@Xuanwo merged the main branch, PTAL 🫡 |
crates/iceberg/src/metadata_scan.rs
Outdated
/// Get the manifests table. | ||
pub fn manifests(&self) -> ManifestsTable { | ||
ManifestsTable { | ||
metadata_table: self, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I think we can simply use Table
here, which suggests that MetadataTable
is merely a wrapper and doesn't implement any additional API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 9fe6bd0, PTAL
crates/iceberg/src/metadata_scan.rs
Outdated
) | ||
.await?; | ||
for manifest in manifest_list.entries() { | ||
content.append_value(manifest.content.clone() as i8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit unusual to see something that can use as u8
but still requires clone
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may derive Copy
for ManifestContentType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 4c6e338
crates/iceberg/src/metadata_scan.rs
Outdated
] | ||
} | ||
|
||
fn schema(&self) -> Schema { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to make this pub
, so engines can get the schema first without fetching the data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 83e8811
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @flaneur2020 for this, and thank you @xxchan's review, let's move!
} | ||
|
||
/// Returns the schema of the manifests table. | ||
pub fn schema(&self) -> Schema { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return iceberg schema here, and user could easily convert it to arrow schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return iceberg schema here, and user could easily convert it to arrow schema.
Would you like to create an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: #868
Re #823. Extends @xxchan's #822 to add support for the Manifests metadata table. (I'll rebase and update this PR once #822 merges.)
This PR also made two small changes to make it possible to pass the test about manifests table:
io
toMetadataScan
MetadataTable
an async trait to allow us load manifests in the impl.there're related comments in #823, we can rebase this pr after #823 updated.
Reference implementations:
Java
PyIceberg