When there are 2 or more records with the same hoodie key in a single parquet file, only one of the records gets updated in the Upsert flow #333
Comments
We discussed this f2f. FWIW, this is the correct and expected behavior: we don't expect a key to be present multiple times in the partition.
Who's picking this up?
I have made the update to the code and added tests. Will send a pull request soon.
Closing this since the PR has been inactive for a while. Please reopen if needed.
There may be situations where multiple records with the same hoodie key exist in a single parquet file. Assume a scenario with 3 parquet files, each containing a record with the same hoodie key, where 1 of the 3 files contains multiple records with that key. When a new record with the same hoodie key is upserted, the record in each of the 2 files holding a single copy is updated, but only 1 of the duplicate records is updated in the 3rd file.
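To make the scenario concrete, here is a minimal toy model in Python. It is not Hudi's merge code: `merge_file`, the file names, and the record layout are all hypothetical, and the sketch simply assumes that each incoming record is consumed by the first matching row in a file, which is one way the described behavior could arise.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (hoodie_key, value)
Record = Tuple[str, str]


def merge_file(existing: List[Record], incoming: Dict[str, str]) -> List[Record]:
    """Toy merge (assumption, not Hudi's implementation): each incoming
    record is applied to at most one existing row per file."""
    pending = dict(incoming)  # copy, so every file sees the full upsert batch
    merged: List[Record] = []
    for key, value in existing:
        if key in pending:
            # The first row with this key consumes the incoming update.
            merged.append((key, pending.pop(key)))
        else:
            # Later rows with the same key keep their old value.
            merged.append((key, value))
    return merged


# Three files share key "k1"; file3 holds it twice.
files = {
    "file1": [("k1", "old")],
    "file2": [("k1", "old")],
    "file3": [("k1", "old"), ("k1", "old"), ("k2", "old")],
}
upsert = {"k1": "new"}

result = {name: merge_file(rows, upsert) for name, rows in files.items()}
for name, rows in result.items():
    print(name, rows)
```

In this model, `file1` and `file2` end up fully updated, while `file3` keeps one stale copy of `k1` next to the updated one.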
@ovj @vinothchandar @jianxu @n3nash