-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Use storageAssighmentPolicy for casts in Delta DML commands #1938
Use storageAssighmentPolicy for casts in Delta DML commands #1938
Conversation
spark/src/test/scala/org/apache/spark/sql/delta/ImplicitDMLCastingSuite.scala
Outdated
Show resolved
Hide resolved
spark/src/test/scala/org/apache/spark/sql/delta/ImplicitDMLCastingSuite.scala
Outdated
Show resolved
Hide resolved
spark/src/test/scala/org/apache/spark/sql/delta/ImplicitDMLCastingSuite.scala
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
case class CheckOverflowInTableWrite(child: Expression, columnName: String) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be almost the same as CheckOverflowInTableInsert
in Spark, could we inherit it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was indeed looking into it, but there are some problems with that
- This case is a case class and CheckOverflowInTableInsert is one as well, so Scala does not like that
- I cannot refactor CheckOverflowInTableInsert because it is part of the OSS Spark imports. This also inhibits me from making methods like getCast protected or the error throwing function configurable
- Even if I could overwrite methods, with doGenCodeWithBetterErrorMsg and eval I would have to overwrite the main ones so most of the code would not be shared (this is more of a minor thing though)
Oh wow! We were letting overflows happen silently. This is bad. So glad you caught it and fixed it. This is definitely a tiny bit breaking change...but for the better. I really think we should get this into Delta 3.0 since major releases are good opportunities to do such breaking changes. |
delta-io#1938 changed the casting behavior in MERGE and UPDATE to follow the value of the `storeAssignmentPolicy` config instead of the `ansiEnabled` one, making the behavior consistent with INSERT. This change breaks MERGE and UPDATE operations that contain a cast when `storeAssignmentPolicy` is set to `STRICT`, throwing an internal error during analysis. The cause is the `UpCast` expression(s) added by `PreprocessTableMerge` and `PreprocessTableUpdate` when processing assignments. `UpCast` is meant to be replaced by a regular after performing checks by the `ResolveUpCast` rule that runs during the resolution phase **before** `PreprocessTableMerge` and `PreprocessTableUpdate` introduce the expression. The fix is to run the `ResolveUpCast` rule once more after `PreprocessTableMerge` and `PreprocessTableUpdate` have run. Missing tests covering cast behavior for the different values of`storeAssignmentPolicy` for UPDATE and MERGE are added, covering: - Invalid implicit cast (string -> boolean), valid implicit cast (string -> int), upcast (int -> long) - UPDATE, MERGE - storeAssignmentPolicy = LEGACY, ANSI, STRICT Closes delta-io#1998 GitOrigin-RevId: 5b18006b9cb8efa522cda0a4cfa6e981a0b40d28
Which Delta project/connector is this regarding?
Description
Follow spark.sql.storeAssignmentPolicy instead of spark.sql.ansi.enabled for casting behaviour in Delta UPDATE and MERGE. This will by default error out at runtime when an overflow happens because the default for storeAssignmentPolicy ins ANSI while ANSI is not enabled by default.
How was this patch tested?
Tests for casting behavior added
Does this PR introduce any user-facing changes?
The UPDATE and MERGE command will error out by default if a cast that is on the write path overflows. The spark.databricks.delta.updateMergeLegacyCasting.enabled can be used to revert to the old behaviour.