-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Added support for dynamically getting delta table features #428
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delayed response, I've added some comments about places where you can add some tests. Thanks for taking up the patch!
@@ -0,0 +1,3 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the .gitignore
with .vscode/*
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely!
InternalSchema.MetadataValue.MICROS); | ||
metadata = DEFAULT_TIMESTAMP_PRECISION_METADATA; | ||
break; | ||
case "timestamp_ntz": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test class should also get a small update https://github.com/apache/incubator-xtable/blob/main/core/src/test/java/org/apache/xtable/delta/TestDeltaSchemaExtractor.java
@@ -268,9 +281,23 @@ private void commitTransaction() { | |||
} | |||
|
|||
private Map<String, String> getConfigurationsForDeltaSync() { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can test this here
I would expect we can add the timestamp_ntz and see that the delta table has the correct versions set for min reader/writer?
// a private variable to of deltaTable | ||
|
||
// limit the results to the attributes needed | ||
Dataset<Row> record = deltaTable.detail().select("minWriterVersion", "minReaderVersion"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you get this information from deltaLog.snapshot().metadata()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think so. I looked through the api for the deltaLog.snapshot().metadata()
, and did not find a referance to it. The only place i found it at was here https://docs.delta.io/latest/delta-utility.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to see if we can just let the Delta Lake library automatically set the versions for us. Can you try that as well? If I remember correctly, it looked like it would auto update them for us.
The test creates a new df with its partition field set as the timestamp ntz type. IF this is set, then the table should be upgraded to a reader and writer version of 3,7.
@@ -57,7 +57,7 @@ | |||
import org.junit.jupiter.params.ParameterizedTest; | |||
import org.junit.jupiter.params.provider.Arguments; | |||
import org.junit.jupiter.params.provider.MethodSource; | |||
|
|||
import org.apache.spark.sql.delta.DeltaConfigs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ForeverAngry Can you remove these changes to this file if they are no longer required?
@ForeverAngry we've gone through some module renaming, can you pull in the latest changes from |
Sure. Also, I'm not really versed in writing java test. So I'm flying a bit
blind here, in terms if those are correct or not. Any pointers would be
appreciated.
…On Sun, May 26, 2024, 8:00 PM Tim Brown ***@***.***> wrote:
@ForeverAngry <https://github.com/ForeverAngry> we've gone through some
module renaming, can you pull in the latest changes from main to pick up
the path changes?
—
Reply to this email directly, view it on GitHub
<#428 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AOXHQZAFKJ5YM6FMSGFNLUTZEJZQZAVCNFSM6AAAAABG65VQ36VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMZSGQ2DANBTGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@the-other-tim-brown sure. Also, I don't have a lot of experience writing java test. Can you point out what I need to fix, to please the bot :)? |
There were some issues with the test infra but those should be resolved now. After pushing your next commit, it should kick off a new run and I'll take a look at the results |
Important Read
To address: Issue #354
What is the purpose of the pull request
This pull request implements dynamically capturing the min reader and writer version of a target delta table, based on its table details.
Brief change log
Verify this pull request
This pull request is already covered by existing tests, i think...
This is my first PR to an open source project - i was a bit unsure of how to do the unit test for this change. Any pointers or insight would be welcomed :) !