Fix Delta Uniform to support converting Delta null partition values to Iceberg

The existing Delta-to-Iceberg conversion does not handle null partition values: it writes the literal string "null" into the partition path, and "null" cannot be converted to numeric types. The fix substitutes a special marker from the Iceberg library, so that Iceberg recognizes the null value and converts it correctly.

GitOrigin-RevId: 667e795ead753803565340abcc23ae01d9738a2c
lzlfred authored and allisonport-db committed Sep 11, 2023
1 parent b30a903 commit 9a5eeb7
Showing 1 changed file with 6 additions and 1 deletion.
@@ -150,6 +150,7 @@ object IcebergTransactionUtils
       .withFormat(FileFormat.PARQUET)
 
     if (partitionSpec.isPartitioned) {
+      val ICEBERG_NULL_PARTITION_VALUE = "__HIVE_DEFAULT_PARTITION__"
       val partitionPath = partitionSpec
         .fields()
         .asScala
@@ -158,7 +159,11 @@ object IcebergTransactionUtils
           // The Iceberg Schema and PartitionSpec all use the column logical names.
           // Delta FileAction::partitionValues, however, uses physical names.
           val physicalPartKey = logicalToPhysicalPartitionNames(logicalPartCol)
-          s"$logicalPartCol=${f.partitionValues(physicalPartKey)}"
+
+          // ICEBERG_NULL_PARTITION_VALUE is referred in Iceberg lib to mark NULL partition value
+          val partValue = Option(f.partitionValues(physicalPartKey))
+            .getOrElse(ICEBERG_NULL_PARTITION_VALUE)
+          s"$logicalPartCol=$partValue"
         }
         .mkString("/")
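
For illustration only, the null-handling pattern in this change can be sketched outside of Delta. This is not the code in IcebergTransactionUtils: the object name, the `partitionPath` helper, and its parameters are hypothetical, and plain Scala maps stand in for the Delta file action and the logical-to-physical name mapping. Only the marker string is taken from the diff.

```scala
// Hypothetical stand-alone sketch; only the marker string comes from the commit.
object NullPartitionPathSketch {
  // Marker that, per the commit message, the Iceberg library treats as a NULL partition value.
  val IcebergNullPartitionValue = "__HIVE_DEFAULT_PARTITION__"

  // Builds a path such as "a=1/b=2" from logical column names.
  // partitionValues is keyed by physical column name and may contain null values.
  def partitionPath(
      logicalCols: Seq[String],
      logicalToPhysical: Map[String, String],
      partitionValues: Map[String, String]): String =
    logicalCols
      .map { logicalCol =>
        val physicalKey = logicalToPhysical(logicalCol)
        // Option(null) is None, so a null value falls back to the marker
        // instead of being interpolated as the string "null".
        val value = Option(partitionValues(physicalKey))
          .getOrElse(IcebergNullPartitionValue)
        s"$logicalCol=$value"
      }
      .mkString("/")
}
```

Under these assumptions, calling `partitionPath(Seq("date", "id"), Map("date" -> "col-1", "id" -> "col-2"), Map("col-1" -> "2023-09-11", "col-2" -> null))` yields `date=2023-09-11/id=__HIVE_DEFAULT_PARTITION__`, whereas interpolating the value directly would have produced `id=null`.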
