From fddc6c30d237e6102fd647865dbb9d830948f13e Mon Sep 17 00:00:00 2001
From: Jacek Laskowski
Date: Tue, 2 Jul 2024 23:45:04 +0200
Subject: [PATCH] DeltaCDFRelation

---
 docs/change-data-feed/DeltaCDFRelation.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/docs/change-data-feed/DeltaCDFRelation.md b/docs/change-data-feed/DeltaCDFRelation.md
index d6884ea76e..764cffb31e 100644
--- a/docs/change-data-feed/DeltaCDFRelation.md
+++ b/docs/change-data-feed/DeltaCDFRelation.md
@@ -13,7 +13,7 @@
 
 `DeltaCDFRelation` is created when:
 
-* `CDCReaderImpl` is requested for a [CDF-aware BaseRelation](CDCReaderImpl.md#getCDCRelation)
+* `CDCReaderImpl` is requested for a [CDF-aware BaseRelation](CDCReaderImpl.md#getCDCRelation) and [emptyCDFRelation](CDCReaderImpl.md#emptyCDFRelation)
 
 ## Building Distributed Scan { #buildScan }
 
@@ -32,3 +32,15 @@
 `buildScan` does column pruning with the `requiredColumns` defined (using `Dataset.select` operator).
 
 In the end, `buildScan` converts the `DataFrame` to `RDD[Row]` (using `DataFrame.rdd` operator).
+
+## Schema
+
+??? note "BaseRelation"
+
+    ```scala
+    schema: StructType
+    ```
+
+    `schema` is part of the `BaseRelation` ([Spark SQL]({{ book.spark_sql }}/BaseRelation/#schema)) abstraction.
+
+`schema` is the [cdcReadSchema](CDCReaderImpl.md#cdcReadSchema) for the [schema](../Metadata.md#schema) of the delta table (based on the [Metadata](../Snapshot.md#metadata) of the [snapshotForBatchSchema](#snapshotForBatchSchema)).