Support passing Hadoop configurations via DeltaTable
This PR makes DeltaTable support reading Hadoop configurations. It adds a new public API to DeltaTable in both Scala and Python:

```
def forPath(
    sparkSession: SparkSession,
    path: String,
    hadoopConf: scala.collection.Map[String, String])
```

Along with the API change, it adds the changes needed to make the following operations on `DeltaTable` work:

```
def as()
def alias()
def toDF()
def optimize()
def upgradeTableProtocol()
def vacuum(...)
def history()
def generate(...)
def update(...)
def updateExpr(...)
def delete(...)
def merge(...)
def clone(...)
def cloneAtVersion(...)
def restoreToVersion(...)
```

Before this change, some commands such as Merge, Vacuum, and restoreToVersion did not pick up the Hadoop configurations even though they were passed to DeltaTableV2 through the new forPath(..., options) API. With the change in this PR, the above functions work and are verified in a new unit test. The unit test was written first and confirmed to fail without the change and pass with it.

AFFECTED VERSIONS: Delta 2.2

PROBLEM DESCRIPTION: Similar to DataFrame, the DeltaTable API in both Scala and Python supports custom Hadoop file system options to access the underlying storage system. Example:

```
val myCredential = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "...",
  "fs.azure.account.oauth2.client.secret" -> "...",
  "fs.azure.account.oauth2.client.endpoint" -> "..."
)
val deltaTable = DeltaTable.forPath(spark, "/path/to/table", myCredential)
```

Before this PR, there was no way to pass these Hadoop configurations through DeltaTable. DeltaTable only picks up options whose keys start with `fs.` or `dfs.` and uses them to create the Hadoop Configuration object that accesses the storage, the same way DataFrame options work for Delta.

We avoid picking up other options because:
- We don't want unrelated options to be passed into Delta's underlying constructs such as DeltaLog.

GitOrigin-RevId: 89cfb1a3465d30081a14f74ae6aa80a4c48f9e56
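The prefix rule described in the message can be illustrated with a small standalone sketch. This is not the code from the PR: the object name `HadoopConfFilter`, the method name `fileSystemOptions`, and the sample keys in `main` are hypothetical, and the only behavior taken from the message is that keys starting with `fs.` or `dfs.` are kept and everything else is dropped.

```scala
// Hypothetical helper, not part of Delta: illustrates the prefix rule only.
object HadoopConfFilter {
  // Prefixes the commit message says are picked up.
  private val AllowedPrefixes = Seq("fs.", "dfs.")

  /** Keep only the entries whose key starts with one of the allowed prefixes. */
  def fileSystemOptions(
      options: scala.collection.Map[String, String]): Map[String, String] =
    options.filter { case (key, _) =>
      AllowedPrefixes.exists(prefix => key.startsWith(prefix))
    }.toMap

  def main(args: Array[String]): Unit = {
    // Sample input (assumed values): two file system keys and one unrelated key.
    val requested = Map(
      "fs.azure.account.auth.type" -> "OAuth",
      "dfs.replication" -> "1",
      "spark.sql.shuffle.partitions" -> "8"
    )
    // Only the fs. and dfs. entries survive; the spark.sql entry is dropped.
    fileSystemOptions(requested).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

Filtering by key prefix keeps unrelated options out of the storage configuration without requiring callers to split their option maps themselves.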
Showing 19 changed files with 572 additions and 52 deletions.