Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[SPARK-47341][SQL] Fix inaccurate documentation of RuntimeConfig#get #48264

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

xi-db
Copy link
Contributor

@xi-db xi-db commented Sep 26, 2024

What changes were proposed in this pull request?

The existing documentation of RuntimeConfig.get() is misleading:

  • get(key: String) method will not throw any exception if the key is not set as long as the config entry has a default value, instead, it will just return the defaultValue of the ConfigEntry. An NoSuchElementException will only be thrown if there is no default value for the config entry.
  • get(key: String, default: String) method will ignore the defaultValue of its ConfigEntry, and return the given param default if unset. 
  • getOption(key: String) method will return the defaultValue of its ConfigEntry if the config is not set, instead of None.

 

An example to demonstrate the behaviour:

image

The first line makes sure the config is not set.

scala> spark.conf.unset("spark.sql.session.timeZone") 

The following code returns Etc/UTC, which doesn't throw any exception.

scala> spark.conf.get("spark.sql.session.timeZone") 
res1: String = "Etc/UTC"

The following code returns Some("Etc/UTC"), instead of None.

scala> spark.conf.getOption("spark.sql.session.timeZone") 
res2: Option[String] = Some(value = "Etc/UTC")

The following code returns Europe/Berlin, ignoring the default value. However, the documentation only says it returns the value, without mentioning ignoring the default value of the entry when the config is not explicitly set.

scala> spark.conf.get("spark.sql.session.timeZone", "Europe/Berlin") 
res3: String = "Europe/Berlin"

In this PR, the documentation is fixed and a new test case is added.

Why are the changes needed?

The incorrect documentation is likely to mislead users to weird behaviours if they rely on the documentation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New test case in RuntimeConfigSuite.

Was this patch authored or co-authored using generative AI tooling?

No.

@xi-db xi-db changed the title [SPARK-47341] Fix inaccurate documentation of RuntimeConfig.get [SPARK-47341][SQL] Fix inaccurate documentation of RuntimeConfig.get Sep 26, 2024
@xi-db
Copy link
Contributor Author

xi-db commented Sep 26, 2024

@HyukjinKwon Could you take a look at this documentation improvement on RuntimeConfig#get? Thanks.

@xi-db xi-db changed the title [SPARK-47341][SQL] Fix inaccurate documentation of RuntimeConfig.get [SPARK-47341][SQL] Fix inaccurate documentation of RuntimeConfig#get Sep 26, 2024
Copy link
Contributor

@vicennial vicennial left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the documentation improvement + clear examples!

@HyukjinKwon
Copy link
Member

Let's make CI happy and green though :-).

@xi-db
Copy link
Contributor Author

xi-db commented Sep 27, 2024

Hi @HyukjinKwon , I reformated the code, but the Documentation Generation still fails. Do you have any ideas how I can fix it?

@@ -54,17 +54,20 @@ abstract class RuntimeConfig {
}

/**
* Returns the value of Spark runtime configuration property for the given key.
* Returns the value of Spark runtime configuration property for the given key. If the key is
* not set yet, return `defaultValue` in its [[ConfigEntry]].
Copy link
Member

@HyukjinKwon HyukjinKwon Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think real error is this:

[error] Generating /__w/spark/spark/target/javaunidoc/org/apache/spark/sql/RuntimeConfig.html...
[error] /__w/spark/spark/sql/api/target/java/org/apache/spark/sql/RuntimeConfig.java:13:1:  error: reference not found
[error]    * not set yet, return <code>defaultValue</code> in its {@link ConfigEntry}.
[error]                                                                  ^
[error] /__w/spark/spark/sql/api/target/java/org/apache/spark/sql/RuntimeConfig.java:25:1:  error: reference not found
[error]    * {@link ConfigEntry} is not the desired one.
[error]             ^

Let's just switch [[ConfigEntry]] to something else or remove it. ConfigEntry isn't technically an API so there is no way for the end users to see.

# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants