feat: Introduce schema definition. #19
    }

    /// Set the field's initial default value.
    pub fn with_write_default(mut self, value: impl ToString) -> Self {
It seems write_default and initial_default should each be a value?
Yes, but currently Value is not defined yet 🤪
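To make the suggestion concrete, here is a minimal sketch of what typed defaults could look like once a value type exists. The `Value` enum, its variants, and this `Field` struct are hypothetical stand-ins invented for illustration; they are not the types from this PR.

```rust
/// Hypothetical literal type (an assumption): the thread notes that
/// `Value` is not defined yet, so these variants are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Boolean(bool),
    Int(i32),
    Long(i64),
    String(String),
}

/// Simplified stand-in for a schema field (an assumption, not the PR's type).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: i32,
    pub name: String,
    pub initial_default: Option<Value>,
    pub write_default: Option<Value>,
}

impl Field {
    pub fn new(id: i32, name: impl Into<String>) -> Self {
        Self {
            id,
            name: name.into(),
            initial_default: None,
            write_default: None,
        }
    }

    /// Set the default used when reading rows written before the field existed.
    pub fn with_initial_default(mut self, value: Value) -> Self {
        self.initial_default = Some(value);
        self
    }

    /// Set the default used when a writer does not supply the field.
    pub fn with_write_default(mut self, value: Value) -> Self {
        self.write_default = Some(value);
        self
    }
}
```

With a typed parameter, a call such as `Field::new(1, "count").with_write_default(Value::Int(0))` keeps the default's type visible to the compiler, which `impl ToString` cannot do.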
Mostly LGTM! Although, I feel that the usage of once_cell here is a bit over-designed. Is it possible that we may need to add new fields into StructType, for example, in schema evolution?
The Python and Java implementations treat the schema as an immutable data structure, and I think we should follow that. Mutability makes things complicated, especially when we have indexes, for example the name index that follows.
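A minimal sketch of why immutability and a lazily built index fit together. It uses `std::sync::OnceLock` from the standard library rather than the `once_cell` crate so that it compiles on its own. The `Field` and `StructType` definitions and the `with_field` helper are simplified assumptions, not the code from this PR.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Simplified stand-in for a nested field (an assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: i32,
    pub name: String,
}

/// Immutable struct type: the field list is fixed at construction time.
#[derive(Debug)]
pub struct StructType {
    fields: Vec<Field>,
    /// Built on the first lookup; it can never go stale because
    /// `fields` is never modified afterwards.
    name_index: OnceLock<HashMap<String, usize>>,
}

impl StructType {
    pub fn new(fields: Vec<Field>) -> Self {
        Self {
            fields,
            name_index: OnceLock::new(),
        }
    }

    /// Look up a field by name, building the name index on first use.
    pub fn field_by_name(&self, name: &str) -> Option<&Field> {
        let index = self.name_index.get_or_init(|| {
            self.fields
                .iter()
                .enumerate()
                .map(|(pos, f)| (f.name.clone(), pos))
                .collect()
        });
        index.get(name).map(|&pos| &self.fields[pos])
    }

    /// Hypothetical helper: "adding" a field returns a new value,
    /// which starts with a fresh, empty index.
    pub fn with_field(&self, field: Field) -> Self {
        let mut fields = self.fields.clone();
        fields.push(field);
        Self::new(fields)
    }
}
```

If the type were mutable, every change to `fields` would also have to invalidate or rebuild `name_index`. Returning a new `StructType` avoids that bookkeeping.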
Got it, makes sense now.
Should we also include identifier-field-ids?
Just took a look at it. We need the visitor pattern to verify them (for example, their types), so I want to postpone it until after we introduce the schema visitor.
I don't entirely understand. Could you elaborate why we would need the visitor pattern for the identifier-field-ids? I was thinking about something similar to the serialized representation.
I mean the verification check; you can find the implementation in Java: https://github.com/apache/iceberg/blob/1bb853191fd378fb1dfda5a5cb297475b7fc204b/api/src/main/java/org/apache/iceberg/Schema.java#L104 cc @Fokko @JanKaul Why I don't include
Some small comments, but looks good 👍🏻
    r#struct: StructType,
    schema_id: i32,
    highest_field_id: i32,
}
I think the identifier-field-ids are missing: https://iceberg.apache.org/spec/#identifier-field-ids
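For context, a rough sketch of the kind of verification discussed above, limited to top-level fields. The rules it checks (the field must exist, be a required primitive, and not be float or double) follow the linked spec section as I read it. The `FieldType` and `Field` types and the `validate_identifier_field_ids` function are hypothetical, not part of this PR. Checking fields nested inside structs needs a traversal of the schema, which is where a visitor would come in.

```rust
use std::collections::HashMap;

/// Simplified type tag (an assumption); the real type system is richer.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FieldType {
    Int,
    Long,
    Float,
    Double,
    String,
    Struct,
}

/// Simplified stand-in for a top-level schema field (an assumption).
#[derive(Debug, Clone)]
pub struct Field {
    pub id: i32,
    pub name: String,
    pub required: bool,
    pub field_type: FieldType,
}

/// Hypothetical check over top-level fields only.
pub fn validate_identifier_field_ids(fields: &[Field], ids: &[i32]) -> Result<(), String> {
    let by_id: HashMap<i32, &Field> = fields.iter().map(|f| (f.id, f)).collect();
    for id in ids {
        let field = by_id
            .get(id)
            .ok_or_else(|| format!("identifier field {id} does not exist"))?;
        if field.field_type == FieldType::Struct {
            return Err(format!("identifier field {} is not a primitive type", field.name));
        }
        if !field.required {
            return Err(format!("identifier field {} is not required", field.name));
        }
        if matches!(field.field_type, FieldType::Float | FieldType::Double) {
            return Err(format!("identifier field {} must not be float or double", field.name));
        }
    }
    Ok(())
}
```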
Initial schema definition. SchemaVisitor, name indexes will come later.