[FEA] Support Databricks 11.3 ML LTS #6879

Closed
3 of 4 tasks
sameerz opened this issue Oct 21, 2022 · 2 comments · Fixed by #7152

sameerz (Collaborator) commented Oct 21, 2022

Is your feature request related to a problem? Please describe.
Add support for Databricks 11.3 ML LTS in the 22.12 release

Describe the solution you'd like
Create a shim layer for Databricks 11.3 ML LTS so that users of the plugin can run on Databricks.
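
For context, the plugin selects a runtime-specific shim implementation at startup by probing the version of Spark it is running on. Below is a minimal sketch of that idea only; the object and method names are hypothetical (the real plugin wires this through its own shim service-provider mechanism), and the conf key is the one Databricks clusters commonly expose:

```scala
// Sketch only: names here are hypothetical, not the plugin's actual API.
object DatabricksShimSelector {
  // Databricks clusters typically expose the runtime version under this key.
  private val DbrVersionKey = "spark.databricks.clusterUsageTags.sparkVersion"

  // True when running on a Databricks 11.3 (Spark 3.3.0-based) runtime.
  def isDatabricks113(sparkConf: Map[String, String]): Boolean =
    sparkConf.get(DbrVersionKey).exists(_.startsWith("11.3"))
}
```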

Describe alternatives you've considered

Additional context
https://docs.databricks.com/release-notes/runtime/11.3ml.html

#6879 (comment)

@sameerz added the feature request and ? - Needs Triage labels on Oct 21, 2022
@sameerz changed the title from [FEA] Support Databricks 11.3 LTS to [FEA] Support Databricks 11.3 ML LTS on Oct 21, 2022
@mattahrens removed the ? - Needs Triage label on Oct 25, 2022
mattahrens (Collaborator) commented:

We will also want to remove support for Databricks 9.1, and we will need to ensure that the recently added AQE + DPP support continues to work on 11.3.

@amahussein self-assigned this and unassigned NVnavkumar on Oct 28, 2022
amahussein (Collaborator) commented Nov 2, 2022

DB 11.3 backports many of the Spark 3.4 commits, which breaks the build.
I added a couple of fixes until I got stuck, as the task became dependent on having the Spark 3.4 shims working.

My current active branch: https://github.com/amahussein/spark-rapids/tree/rapids-6879-b

  • I fixed ParquetStringPredShims because Databricks uses parquetFilterPushDownStringPredicate
  • DB 3.3.0 removes checkForNumericExpr
  • PromotePrecision is removed as well
  • CastBase is removed:
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:32: object CastBase is not a member of package org.apache.spark.sql.catalyst.expressions
[ERROR] import org.apache.spark.sql.catalyst.expressions.{Cast, CastBase, Expression, NullIntolerant, TimeZoneAwareExpression}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:40: not found: type CastBase
[ERROR] final class CastExprMeta[INPUT <: CastBase](
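
Since CastBase no longer exists on this runtime (Spark 3.4 folds it back into Cast, and DB 11.3 backports that change), one common workaround is a per-shim type alias so shared code keeps compiling against a stable name. A rough sketch with a hypothetical object and alias name, not necessarily the fix used on the branch:

```scala
package com.nvidia.spark.rapids.shims

import org.apache.spark.sql.catalyst.expressions.Cast

// Hypothetical alias holder: on builds where CastBase was removed,
// point the shared name at Cast itself.
object CastShims {
  type CastBaseAlias = Cast
}
```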

There is a long list of compilation errors. Aside from the exception-handling refactoring, some blockers depend on the Spark 3.4 shim layers.

Full compilation errors:

[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:32: object CastBase is not a member of package org.apache.spark.sql.catalyst.expressions
[ERROR] import org.apache.spark.sql.catalyst.expressions.{Cast, CastBase, Expression, NullIntolerant, TimeZoneAwareExpression}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:40: not found: type CastBase
[ERROR] final class CastExprMeta[INPUT <: CastBase](
[ERROR]                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320+/scala/com/nvidia/spark/rapids/shims/Spark320PlusShims.scala:251: not found: value GpuWindowInPandasExec
[ERROR]             GpuWindowInPandasExec(
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuDataSourceScanExec.scala:20: object ShimLeafExecNode is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.ShimLeafExecNode
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuDataSourceScanExec.scala:32: not found: type ShimLeafExecNode
[ERROR] trait GpuDataSourceScanExec extends ShimLeafExecNode with GpuExec {
[ERROR]                                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/GpuBatchScanExec.scala:39: not found: type ShimDataSourceV2ScanExecBase
[ERROR]     extends ShimDataSourceV2ScanExecBase with GpuBatchScanExecMetrics {
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:20: object AnsiCast is not a member of package org.apache.spark.sql.catalyst.expressions
[ERROR] import org.apache.spark.sql.catalyst.expressions.{AnsiCast, Expression}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:25: not found: type AnsiCast
[ERROR]     GpuOverrides.expr[AnsiCast](
[ERROR]                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:82: not found: type AnsiCast
[ERROR]       (cast, conf, p, r) => new CastExprMeta[AnsiCast](cast, ansiEnabled = true, conf = conf,
[ERROR]                                              ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:82: not found: value ansiEnabled
[ERROR]       (cast, conf, p, r) => new CastExprMeta[AnsiCast](cast, ansiEnabled = true, conf = conf,
[ERROR]                                                              ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:83: not found: value parent
[ERROR]         parent = p, rule = r, doFloatToIntCheck = true, stringToAnsiDate = true))
[ERROR]         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:83: not found: value rule
[ERROR]         parent = p, rule = r, doFloatToIntCheck = true, stringToAnsiDate = true))
[ERROR]                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:83: not found: value doFloatToIntCheck
[ERROR]         parent = p, rule = r, doFloatToIntCheck = true, stringToAnsiDate = true))
[ERROR]                               ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:83: not found: value stringToAnsiDate
[ERROR]         parent = p, rule = r, doFloatToIntCheck = true, stringToAnsiDate = true))
[ERROR]                                                         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/com/nvidia/spark/rapids/shims/Spark320until340Shims.scala:20: Unused import
[ERROR] import org.apache.spark.sql.catalyst.expressions.{AnsiCast, Expression}
[ERROR]                                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/320until340-all/scala/org/apache/spark/sql/execution/datasources/rapids/DataSourceStrategyUtils.scala:26: value translateRuntimeFilter is not a member of object org.apache.spark.sql.execution.datasources.DataSourceStrategy
[ERROR]     DataSourceStrategy.translateRuntimeFilter(expr)
[ERROR]                        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/GpuBatchScanExec.scala:97: not found: value groupPartitions
[ERROR]           groupPartitions(newPartitions).get.map(_._2)
[ERROR]           ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/GpuBatchScanExec.scala:105: not found: value partitions
[ERROR]       partitions
[ERROR]       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/GpuBatchScanExec.scala:137: not found: value redact
[ERROR]     redact(result)
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/Spark330PlusShims.scala:39: not found: type Spark320PlusNonDBShims
[ERROR] trait Spark330PlusShims extends Spark321PlusShims with Spark320PlusNonDBShims {
[ERROR]                                                        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330+/scala/com/nvidia/spark/rapids/shims/Spark330PlusShims.scala:49: not enough arguments for constructor FileScanRDD: (sparkSession: org.apache.spark.sql.SparkSession, readFunction: org.apache.spark.sql.execution.datasources.PartitionedFile => Iterator[org.apache.spark.sql.catalyst.InternalRow], filePartitions: Seq[org.apache.spark.sql.execution.datasources.FilePartition], debugWriter: Option[com.databricks.sql.catalyst.BadFilesWriter], asyncIOSafe: Boolean, fileNotFoundHint: String => String, repeatedReadsMetrics: Option[com.databricks.sql.execution.RepeatedReadsMetrics], fileSystemMetrics: Option[com.databricks.sql.execution.FileSystemMetrics], fileScanMetrics: com.databricks.sql.execution.FileScanMetrics, readSchema: org.apache.spark.sql.types.StructType, metadataColumns: Seq[org.apache.spark.sql.catalyst.expressions.AttributeReference], options: org.apache.spark.sql.catalyst.FileSourceOptions)org.apache.spark.sql.execution.datasources.FileScanRDD.
Unspecified value parameter readSchema.
[ERROR]     new FileScanRDD(sparkSession, readFunction, filePartitions, readDataSchema, metadataColumns)
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/execution/datasources/parquet/rapids/shims/ParquetCVShims.scala:34: not enough arguments for constructor ParquetColumnVector: (x$1: org.apache.spark.sql.execution.datasources.parquet.ParquetColumn, x$2: org.apache.spark.sql.execution.vectorized.WritableColumnVector, x$3: Int, x$4: org.apache.spark.memory.MemoryMode, x$5: java.util.Set[org.apache.spark.sql.execution.datasources.parquet.ParquetColumn], x$6: Boolean, x$7: Any)org.apache.spark.sql.execution.datasources.parquet.ParquetColumnVector.
Unspecified value parameters x$6, x$7.
[ERROR]     new ParquetColumnVector(column, vector, capacity, memoryMode, missingColumns)
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:31: value mapKeyNotExistError is not a member of object org.apache.spark.sql.errors.QueryExecutionErrors
[ERROR]     QueryExecutionErrors.mapKeyNotExistError(key, keyType, origin.context)
[ERROR]                          ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:37: not enough arguments for method invalidElementAtIndexError: (index: Int, numElements: Int, context: org.apache.spark.sql.catalyst.trees.SQLQueryContext)ArrayIndexOutOfBoundsException.
Unspecified value parameter context.
[ERROR]       QueryExecutionErrors.invalidElementAtIndexError(index, numElements)
[ERROR]                                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:39: not enough arguments for method invalidArrayIndexError: (index: Int, numElements: Int, context: org.apache.spark.sql.catalyst.trees.SQLQueryContext)ArrayIndexOutOfBoundsException.
Unspecified value parameter context.
[ERROR]       QueryExecutionErrors.invalidArrayIndexError(index, numElements)
[ERROR]                                                  ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:47: type mismatch;
 found   : String
 required: org.apache.spark.sql.catalyst.trees.SQLQueryContext
[ERROR]     QueryExecutionErrors.arithmeticOverflowError(message, hint, errorContext)
[ERROR]                                                                 ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:55: type mismatch;
 found   : String
 required: org.apache.spark.sql.catalyst.trees.SQLQueryContext
[ERROR]       value, toType.precision, toType.scale, context
[ERROR]                                              ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:61: type mismatch;
 found   : String
 required: org.apache.spark.sql.catalyst.trees.SQLQueryContext
[ERROR]       "Overflow in integral divide", "try_divide", context
[ERROR]                                                    ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/RapidsErrorUtils.scala:69: not enough arguments for constructor SparkDateTimeException: (errorClass: String, errorSubClass: Option[String], messageParameters: Array[String], context: Array[org.apache.spark.QueryContext], summary: String)org.apache.spark.SparkDateTimeException.
Unspecified value parameters messageParameters, context, summary.
[ERROR]     new SparkDateTimeException(errorClass, Array(infOrNan) ++ messageParameters)
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/330until340/scala/org/apache/spark/sql/rapids/shims/SparkUpgradeExceptionShims.scala:27: overloaded method constructor SparkUpgradeException with alternatives:
  (errorClass: String,errorSubClass: String,messageParameters: Array[String],cause: Throwable)org.apache.spark.SparkUpgradeException <and>
  (errorClass: String,errorSubClass: Option[String],messageParameters: Array[String],cause: Throwable)org.apache.spark.SparkUpgradeException
 cannot be applied to (String, Array[String], Throwable)
[ERROR]     new SparkUpgradeException(
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala:26: object ShimLeafExecNode is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.{ShimLeafExecNode, ShimSparkPlan, ShimUnaryExecNode}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:32: object ShimBroadcastExchangeLike is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.{ShimBroadcastExchangeLike, ShimUnaryExecNode, SparkShimImpl}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:314: not found: type ShimBroadcastExchangeLike
[ERROR]     child: SparkPlan) extends ShimBroadcastExchangeLike with ShimUnaryExecNode with GpuExec {
[ERROR]                               ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuBroadcastJoinMeta.scala:34: fruitless type test: a value of type org.apache.spark.sql.execution.exchange.Exchange cannot also be a org.apache.spark.sql.rapids.execution.GpuBroadcastExchangeExec
[ERROR]               .child.isInstanceOf[GpuBroadcastExchangeExec]
[ERROR]                                  ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuBroadcastJoinMeta.scala:35: fruitless type test: a value of type org.apache.spark.sql.execution.exchange.Exchange cannot also be a org.apache.spark.sql.rapids.execution.GpuBroadcastExchangeExec
[ERROR]       case reused: ReusedExchangeExec => reused.child.isInstanceOf[GpuBroadcastExchangeExec]
[ERROR]                                                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuBroadcastJoinMeta.scala:46: fruitless type test: a value of type org.apache.spark.sql.execution.exchange.Exchange cannot also be a org.apache.spark.sql.rapids.execution.GpuBroadcastExchangeExec
[ERROR]                   .child.isInstanceOf[GpuBroadcastExchangeExec]
[ERROR]                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuBroadcastJoinMeta.scala:47: fruitless type test: a value of type org.apache.spark.sql.execution.exchange.Exchange cannot also be a org.apache.spark.sql.rapids.execution.GpuBroadcastExchangeExec
[ERROR]       case reused: ReusedExchangeExec => reused.child.isInstanceOf[GpuBroadcastExchangeExec]
[ERROR]                                                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCanonicalize.scala:58: not found: type CastBase
[ERROR]     case c: CastBase if c.timeZoneId.nonEmpty && !c.needsTimeZone =>
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCanonicalize.scala:58: value timeZoneId is not a member of org.apache.spark.sql.catalyst.expressions.Expression
[ERROR]     case c: CastBase if c.timeZoneId.nonEmpty && !c.needsTimeZone =>
[ERROR]                           ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCanonicalize.scala:58: value needsTimeZone is not a member of org.apache.spark.sql.catalyst.expressions.Expression
[ERROR]     case c: CastBase if c.timeZoneId.nonEmpty && !c.needsTimeZone =>
[ERROR]                                                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCanonicalize.scala:59: value withTimeZone is not a member of org.apache.spark.sql.catalyst.expressions.Expression
[ERROR]       c.withTimeZone(null)
[ERROR]         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:58: value child is not a member of type parameter INPUT
[ERROR]   val fromType: DataType = cast.child.dataType
[ERROR]                                 ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:59: value dataType is not a member of type parameter INPUT
[ERROR]   val toType: DataType = toTypeOverride.getOrElse(cast.dataType)
[ERROR]                                                        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:158: value timeZoneId is not a member of type parameter INPUT
[ERROR]     GpuCast(child, toType, ansiEnabled, cast.timeZoneId, legacyCastToString,
[ERROR]                                              ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuCast.scala:32: Unused import
[ERROR] import org.apache.spark.sql.catalyst.expressions.{Cast, CastBase, Expression, NullIntolerant, TimeZoneAwareExpression}
[ERROR]                                                         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:27: object AQEUtils is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.{AQEUtils, DeltaLakeUtils, GpuBatchScanExec, GpuHashPartitioning, GpuRangePartitioning, GpuSpecifiedWindowFrameMeta, GpuTypeShims, GpuWindowExpressionMeta, OffsetWindowFunctionMeta, SparkShimImpl}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:30: object GpuShuffleExchangeExec is not a member of package org.apache.spark.rapids.shims
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:65: object shims is not a member of package org.apache.spark.sql.rapids.execution.python
[ERROR] import org.apache.spark.sql.rapids.execution.python.shims.GpuFlatMapGroupsInPandasExecMeta
[ERROR]                                                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:569: not found: value AQEUtils
[ERROR]             AQEUtils.newReuseInstance(sqse, newOutput)
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:899: not found: type PromotePrecision
[ERROR]     expr[PromotePrecision](
[ERROR]          ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:903: not found: type PromotePrecision
[ERROR]       (a, conf, p, r) => new UnaryExprMeta[PromotePrecision](a, conf, p, r) {
[ERROR]                                            ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:934: not found: type PromotePrecision
[ERROR]             case p: PromotePrecision if p.child.isInstanceOf[CastBase] &&
[ERROR]                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:934: value child is not a member of Any
[ERROR]             case p: PromotePrecision if p.child.isInstanceOf[CastBase] &&
[ERROR]                                           ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:934: not found: type CastBase
[ERROR]             case p: PromotePrecision if p.child.isInstanceOf[CastBase] &&
[ERROR]                                                              ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:935: value child is not a member of Any
[ERROR]                 p.child.dataType.isInstanceOf[DecimalType] =>
[ERROR]                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:936: value child is not a member of Any
[ERROR]               val c = p.child.asInstanceOf[CastBase]
[ERROR]                         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:936: not found: type CastBase
[ERROR]               val c = p.child.asInstanceOf[CastBase]
[ERROR]                                            ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:2087: recursive value x$15 needs type
[ERROR]           val Seq(boolExpr, trueExpr, falseExpr) = childExprs.map(_.convertToGpu())
[ERROR]                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:2596: value failOnError is not a member of org.apache.spark.sql.catalyst.expressions.GetMapValue
[ERROR]           GpuGetMapValue(map, key, in.failOnError)
[ERROR]                                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala:640: not found: type ShimLeafExecNode
[ERROR]     extends ShimLeafExecNode with GpuExec {
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:3837: not found: value GpuShuffleExchangeExec
[ERROR]                 GpuShuffleExchangeExec(
[ERROR]                 ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:3943: recursive value x$42 needs type
[ERROR]           val Seq(left, right) = childPlans.map(_.convertIfNeeded())
[ERROR]                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:4087: not found: type GpuFlatMapGroupsInPandasExecMeta
[ERROR]       (flatPy, conf, p, r) => new GpuFlatMapGroupsInPandasExecMeta(flatPy, conf, p, r)),
[ERROR]                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:4400: not found: value DeltaLakeUtils
[ERROR]       case f: FileSourceScanExec if DeltaLakeUtils.isDatabricksDeltaLakeScan(f) =>
[ERROR]                                     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:4426: not found: value AQEUtils
[ERROR]         !AQEUtils.isAdaptiveExecutionSupportedInSparkVersion(plan.conf) =>
[ERROR]          ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:4430: not found: value AQEUtils
[ERROR]         !AQEUtils.isAdaptiveExecutionSupportedInSparkVersion(plan.conf) =>
[ERROR]          ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:27: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.{AQEUtils, DeltaLakeUtils, GpuBatchScanExec, GpuHashPartitioning, GpuRangePartitioning, GpuSpecifiedWindowFrameMeta, GpuTypeShims, GpuWindowExpressionMeta, OffsetWindowFunctionMeta, SparkShimImpl}
[ERROR]                                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:27: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.{AQEUtils, DeltaLakeUtils, GpuBatchScanExec, GpuHashPartitioning, GpuRangePartitioning, GpuSpecifiedWindowFrameMeta, GpuTypeShims, GpuWindowExpressionMeta, OffsetWindowFunctionMeta, SparkShimImpl}
[ERROR]                                                 ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:30: Unused import
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala:65: Unused import
[ERROR] import org.apache.spark.sql.rapids.execution.python.shims.GpuFlatMapGroupsInPandasExecMeta
[ERROR]                                                           ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/aggregate.scala:28: object AggregationTagging is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.{AggregationTagging, ShimUnaryExecNode}
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/aggregate.scala:870: not found: value AggregationTagging
[ERROR]     if (AggregationTagging.mustReplaceBoth) {
[ERROR]         ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/aggregate.scala:28: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.{AggregationTagging, ShimUnaryExecNode}
[ERROR]                                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala:26: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.{ShimLeafExecNode, ShimSparkPlan, ShimUnaryExecNode}
[ERROR]                                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/limit.scala:26: object GpuShuffleExchangeExec is not a member of package org.apache.spark.rapids.shims
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/limit.scala:177: not found: value GpuShuffleExchangeExec
[ERROR]       GpuShuffleExchangeExec(
[ERROR]       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/limit.scala:26: Unused import
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuDataSourceScanExec.scala:20: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.ShimLeafExecNode
[ERROR]                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuFileSourceScanExec.scala:78: illegal inheritance; superclass Object
 is not a subclass of the superclass SparkPlan
 of the mixin trait GpuExec
[ERROR]     extends GpuDataSourceScanExec with GpuExec {
[ERROR]                                        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuInMemoryTableScanExec.scala:21: object ShimLeafExecNode is not a member of package com.nvidia.spark.rapids.shims
[ERROR] import com.nvidia.spark.rapids.shims.ShimLeafExecNode
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuInMemoryTableScanExec.scala:79: not found: type ShimLeafExecNode
[ERROR]    @transient relation: InMemoryRelation) extends ShimLeafExecNode with GpuExec {
[ERROR]                                                   ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuInMemoryTableScanExec.scala:21: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.ShimLeafExecNode
[ERROR]                                      ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:399: not found: value promise
[ERROR]             promise.success(broadcasted)
[ERROR]             ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:407: not found: value promise
[ERROR]               promise.failure(ex)
[ERROR]               ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:411: not found: value promise
[ERROR]               promise.failure(ex)
[ERROR]               ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:414: not found: value promise
[ERROR]               promise.failure(e)
[ERROR]               ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuBroadcastExchangeExec.scala:32: Unused import
[ERROR] import com.nvidia.spark.rapids.shims.{ShimBroadcastExchangeLike, ShimUnaryExecNode, SparkShimImpl}
[ERROR]                                       ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuShuffleExchangeExecBase.scala:27: object GpuShuffleExchangeExec is not a member of package org.apache.spark.rapids.shims
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]        ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuShuffleExchangeExecBase.scala:111: not found: value GpuShuffleExchangeExec
[ERROR]     GpuShuffleExchangeExec(
[ERROR]     ^
[ERROR] /home/ubuntu/spark-rapids/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuShuffleExchangeExecBase.scala:27: Unused import
[ERROR] import org.apache.spark.rapids.shims.GpuShuffleExchangeExec
[ERROR]                                      ^
[ERROR] 90 errors found
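
Many of the RapidsErrorUtils failures above boil down to one change: the QueryExecutionErrors helpers now take an org.apache.spark.sql.catalyst.trees.SQLQueryContext where the shim passes a String or omits the argument. A minimal sketch of adapting one call to the signature the compiler reports, assuming a hypothetical wrapper object:

```scala
import org.apache.spark.sql.catalyst.trees.SQLQueryContext
import org.apache.spark.sql.errors.QueryExecutionErrors

// Hypothetical wrapper; the three-argument signature below is the one the
// compiler reports on DB 11.3.
object RapidsErrorUtilsSketch {
  def invalidArrayIndex(index: Int, numElements: Int,
      context: SQLQueryContext): ArrayIndexOutOfBoundsException =
    QueryExecutionErrors.invalidArrayIndexError(index, numElements, context)
}
```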

gerashegalov added a commit that referenced this issue Dec 13, 2022
* introduce non330db directories
* ShimExtractValue
* GpuPredicateHelper now extends and shims PredicateHelper (see the sketch after this commit summary)
* Allow passing TEST_PARALLEL to test.sh to be able to run integration tests on a small instance
* No need to override getSparkShimVersion using the same implementation in every shim
 
Fixes #6879 

Signed-off-by: Gera Shegalov <gera@apache.org>

Signed-off-by: Ahmed Hussein (amahussein) <a@ahussein.me>
Co-authored-by: Ahmed Hussein (amahussein) <a@ahussein.me>

Signed-off-by: Niranjan Artal <nartal@nvidia.com>
Co-authored-by: Niranjan Artal <nartal@nvidia.com>
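
The GpuPredicateHelper change in the commit summary above follows the plugin's usual shim pattern: common code extends one stable trait name, and each shim binds that name to whatever the underlying Spark build provides. A minimal sketch, assuming hypothetical trait names:

```scala
import org.apache.spark.sql.catalyst.expressions.PredicateHelper

// Hypothetical sketch: the shim trait pins the upstream trait so common
// code compiles unchanged across Spark and Databricks versions.
trait ShimPredicateHelper extends PredicateHelper
trait GpuPredicateHelper extends ShimPredicateHelper
```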