Fix test failures in collection_ops_test.py #11011
mythrocks added a commit to mythrocks/spark-rapids that referenced this issue on Aug 30, 2024
Fixes NVIDIA#11011.

This commit fixes the failures in `collection_ops_tests` on Spark 4.0.

On all versions of Spark, when a Sequence is collected with rows that exceed MAX_INT, an exception is thrown indicating that the collected Sequence/array is larger than permissible. The contents of the exception message vary across Spark versions.

On Spark 4, the error message contains more information than in all prior versions, including:
1. The name of the op causing the error
2. The errant sequence size

This commit introduces a shim to make this new information available in the exception.

Note that this shim does not fit cleanly in RapidsErrorUtils, because there are differences within major Spark versions. For instance, Spark 3.4.0 and 3.4.1 have a different message than 3.4.2 and 3.4.3. Likewise, the message differs among 3.5.0, 3.5.1, and 3.5.2.
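To illustrate why the message has to be selected per exact Spark release rather than per major or minor line, here is a minimal Python sketch of a version-keyed lookup. Everything in it is hypothetical: the names `SHIMS`, `variant_a`, `variant_b`, `detailed`, and `too_long_sequence_message` are invented for this example, the message strings are placeholders rather than Spark's real text, and the 3.5.x releases are left out because the commit message does not say how they group. It is not the shim added by the commit.

```python
from typing import Callable, Dict, Tuple

Version = Tuple[int, int, int]
Formatter = Callable[[str, int], str]


# Placeholder texts only; these are NOT Spark's real error messages.
def variant_a(op_name: str, size: int) -> str:
    return "placeholder A: sequence too long"


def variant_b(op_name: str, size: int) -> str:
    return "placeholder B: sequence too long"


def detailed(op_name: str, size: int) -> str:
    # Spark 4 style: the message carries the op name and the errant size.
    return f"placeholder: {op_name} produced a sequence of {size} elements"


# Keyed by exact release, because the message changes between patch
# releases of the same minor line (3.4.0/3.4.1 versus 3.4.2/3.4.3).
# The 3.5.x entries are omitted: the source does not say how they split.
SHIMS: Dict[Version, Formatter] = {
    (3, 4, 0): variant_a,
    (3, 4, 1): variant_a,
    (3, 4, 2): variant_b,
    (3, 4, 3): variant_b,
    (4, 0, 0): detailed,
}


def parse_version(text: str) -> Version:
    # Assumes a plain "major.minor.patch" string with no suffix.
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def too_long_sequence_message(spark_version: str, op_name: str, size: int) -> str:
    version = parse_version(spark_version)
    if version not in SHIMS:
        raise ValueError(f"no shim registered for Spark {spark_version}")
    return SHIMS[version](op_name, size)
```

A test could then build its expected message from the running Spark version, for example `too_long_sequence_message("4.0.0", "sequence", 2**31)`, instead of hard-coding one string for every release.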
Signed-off-by: MithunR <mithunr@nvidia.com>