Improve aggregation performance of average on DECIMAL128 columns [databricks] #4776

Merged
merged 2 commits into NVIDIA:branch-22.04 from dec128-avg-perf
Feb 16, 2022

Conversation

jlowe
Contributor

@jlowe jlowe commented Feb 14, 2022

Fixes #4722.

Accelerates DECIMAL128 average aggregations using the same technique employed in #4688. This allows the use of the cudf hash-based algorithm instead of the slower sort-based algorithm for these types of aggregations.

Signed-off-by: Jason Lowe <jlowe@nvidia.com>
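
For readers unfamiliar with the technique mentioned above, the following is a minimal sketch of one way a 128-bit sum can be decomposed so that each partial sum fits in a narrower integer type. It is an illustration only and is not taken from this PR's diff: the assumption is that each unscaled 128-bit value is split into four 32-bit chunks, each chunk is summed per group independently, and the chunk sums are reassembled before dividing by the row count. The names `split_chunks`, `assemble`, and `grouped_average` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

MASK32 = 0xFFFFFFFF


def split_chunks(value):
    """Split a signed 128-bit integer into four 32-bit chunks, lowest first.

    The three low chunks are unsigned; the top chunk carries the sign.
    """
    assert -(1 << 127) <= value < (1 << 127), "value must fit in 128 bits"
    return (
        value & MASK32,
        (value >> 32) & MASK32,
        (value >> 64) & MASK32,
        value >> 96,  # arithmetic shift keeps the sign in the top chunk
    )


def assemble(chunk_sums):
    """Recombine per-chunk sums into the full sum (inverse of split_chunks)."""
    return sum(c << (32 * i) for i, c in enumerate(chunk_sums))


def grouped_average(rows):
    """Average unscaled decimal values per key using chunked sums.

    rows is an iterable of (key, unscaled_value) pairs. Each chunk sum stays
    within 64 bits as long as a group has fewer than 2**31 rows (assumption
    of this sketch; Python integers themselves never overflow).
    """
    sums = defaultdict(lambda: [0, 0, 0, 0])
    counts = defaultdict(int)
    for key, unscaled in rows:
        for i, chunk in enumerate(split_chunks(unscaled)):
            sums[key][i] += chunk
        counts[key] += 1
    return {key: Fraction(assemble(sums[key]), counts[key]) for key in sums}
```

Because every per-chunk sum is an ordinary fixed-width integer sum, an aggregation of this shape can in principle be handled by a hash-based group-by rather than a sort-based one, which is the benefit the description refers to.
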
@jlowe jlowe added the "performance" label (A performance related task/issue) Feb 14, 2022
@jlowe jlowe added this to the Feb 14 - Feb 25 milestone Feb 14, 2022
@jlowe jlowe self-assigned this Feb 14, 2022
@jlowe
Contributor Author

jlowe commented Feb 14, 2022

build

@jlowe
Contributor Author

jlowe commented Feb 15, 2022

CI failed due to recent cudf string split API breakage:

[2022-02-14T17:43:11.360Z] [ERROR] [Error] /home/jenkins/agent/workspace/jenkins-rapids_premerge-github-3972/sql-plugin/src/main/scala/org/apache/spark/sql/rapids/stringFunctions.scala:1371: overloaded method value stringSplitRecord with alternatives:
[2022-02-14T17:43:11.361Z]   (x$1: String,x$2: Int)ai.rapids.cudf.ColumnVector <and>
[2022-02-14T17:43:11.361Z]   (x$1: String,x$2: Boolean)ai.rapids.cudf.ColumnVector
[2022-02-14T17:43:11.361Z]  cannot be applied to (ai.rapids.cudf.Scalar, Int)

@jlowe
Contributor Author

jlowe commented Feb 15, 2022

build

@jlowe
Contributor Author

jlowe commented Feb 15, 2022

CI failed because it could not find available resources:

[2022-02-15T15:28:14.617Z] Waited 60 times already, stopping

@jlowe
Contributor Author

jlowe commented Feb 15, 2022

build

@jlowe jlowe merged commit 342cbca into NVIDIA:branch-22.04 Feb 16, 2022
@jlowe jlowe deleted the dec128-avg-perf branch February 16, 2022 00:22
Labels
performance A performance related task/issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize DECIMAL128 average aggregations
3 participants