Update JsonToStructs and ScanJson to have white space normalization #10575
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
build
build
build
Minor comment, overall lgtm.
-  def deepTransformView(cv: ColumnView, dt: Option[DataType] = None)
+  def deepTransformView(cv: ColumnView, dt: Option[DataType] = None,
+      nestedMismatchHandler: Option[(ColumnView, DataType) =>
+          (Option[ColumnView], ArrayBuffer[AutoCloseable])] = None)
Does the handler need to return a mutable ArrayBuffer? I think the handler could return an immutable Seq given how it's being used, and that seems more flexible and less error-prone than forcing an ArrayBuffer here.
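To illustrate the suggestion, here is a minimal sketch of a handler type that returns an immutable Seq while the caller keeps the only mutable buffer. It is not the plugin's code: `Resource`, `Handler`, and `transform` are hypothetical stand-ins for `ColumnView`, the `nestedMismatchHandler` type, and `deepTransformView`.

```scala
import scala.collection.mutable.ArrayBuffer

object HandlerReturnSketch {
  // Hypothetical stand-in for a closeable resource such as a column view.
  final class Resource(val name: String) extends AutoCloseable {
    var closed = false
    override def close(): Unit = { closed = true }
  }

  // The handler's result uses Seq, so an implementation can return a List,
  // a Vector, or Seq.empty without allocating a mutable buffer.
  type Handler = Resource => (Option[Resource], Seq[AutoCloseable])

  // The caller owns the single mutable buffer and appends whatever the
  // handler reports as needing to be closed later.
  def transform(
      input: Resource,
      handler: Option[Handler]): (Resource, ArrayBuffer[AutoCloseable]) = {
    val toClose = ArrayBuffer.empty[AutoCloseable]
    val result = handler.map { h =>
      val (updated, needsClosing) = h(input)
      toClose ++= needsClosing
      updated.getOrElse(input)
    }.getOrElse(input)
    (result, toClose)
  }
}
```

With this shape the handler cannot mutate a buffer that the caller is still appending to, and callers that have nothing to close can simply return `Seq.empty`.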
build
@jlowe please take another look
Looks like the ArrayBuffer -> Seq change only happened halfway, suggested some updates.
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuJsonReadCommon.scala (4 threads, outdated and resolved)
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ColumnCastUtil.scala (1 thread, outdated and resolved)
build
@jlowe please take another look
This also contributes to #10491 in a very small way by adding in a few more tests.
Mostly it turns on white space normalization and tries to verify that it is doing the right thing, but there were some errors, so I filed more issues.
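For readers unfamiliar with the term, the sketch below shows one plausible reading of white space normalization for JSON text: dropping the insignificant whitespace (space, tab, line feed, carriage return) that appears outside string literals while leaving string contents untouched. The PR does not spell out the exact rules, so this is an assumption, and `JsonWhitespaceSketch.normalize` is a hypothetical CPU-side helper, not the plugin's or cuDF's implementation.

```scala
object JsonWhitespaceSketch {
  // Removes JSON whitespace that sits outside of string literals.
  // Characters inside quoted strings, including escaped quotes, are kept.
  def normalize(json: String): String = {
    val out = new StringBuilder(json.length)
    var inString = false // currently inside a "..." literal
    var escaped = false  // previous character was a backslash inside a string
    json.foreach { c =>
      if (inString) {
        out.append(c)
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else if (c == '"') {
        inString = true
        out.append(c)
      } else if (!(c == ' ' || c == '\t' || c == '\n' || c == '\r')) {
        out.append(c)
      }
    }
    out.toString
  }
}
```

For example, `JsonWhitespaceSketch.normalize("{ \"a\" : \"b c\" }")` keeps the space inside `"b c"` and removes the rest.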