Fix test failures in hash_aggregate_test.py #11018

Closed
Tracked by #11004
razajafri opened this issue Jun 8, 2024 · 1 comment · Fixed by #11219
Labels: bug, Spark 4.0+

Comments

@razajafri
Collaborator

FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_array
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_map
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_agg_nested_struct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_arithmetic_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_computation_in_grpby_columns
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_count_distinct_with_nan_floats
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_decimal128_count_group_by
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_decimal128_count_reduction
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_distinct_count_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_distinct_float_count_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_exceptAll
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_generic_reductions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_force_pre_sort
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_with_nan_keys
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_agg_with_struct_keys
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_aggregate_complete_with_grouping_expressions
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_avg_nulls_partial_only
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_count_with_filter
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_byte_scalar
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal128_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal32_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_decimal64_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_approx_percentile_long_single
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_partial_replace_with_distinct_fallback
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_set
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_with_multi_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_collect_with_single_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_groupby_single_distinct_collect
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_avg
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_avg_nulls
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum_count_action
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_grpby_sum_full_decimal
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_filters
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_grpby_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_mode_query
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_multiple_mode_query_avg_distincts
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_pivot_groupby_duplicates_fallback
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_query_max_with_multiple_distincts
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_query_multiple_distincts_with_non_distinct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_avg_nulls
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_collect_set
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_decimal_overflow_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_pivot
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum_count_action
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_hash_reduction_sum_full_decimal
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_intersectAll
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_array
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_map
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_reduction_nested_struct
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_struct_cast_groupby_count
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_struct_count_distinct_cast
FAILED ../../../../integration_tests/src/main/python/hash_aggregate_test.py::test_subquery_in_agg
razajafri added the bug and ? - Needs Triage labels Jun 8, 2024
razajafri added the Spark 4.0+ label Jun 8, 2024
mattahrens removed the ? - Needs Triage label Jun 11, 2024
mythrocks self-assigned this Jun 11, 2024
@mythrocks
Collaborator

mythrocks commented Jun 11, 2024

Trying to tackle the biggish ones first. It looks like the majority of the failures here occur with `spark.sql.ansi.enabled=true`. The tests pass with ANSI mode disabled:

=============== 1661 passed, 435 warnings in 1137.02s (0:18:57) ================
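
As a rough illustration of the comparison described above, the sketch below builds a Spark configuration override that pins ANSI mode on or off. Only the key `spark.sql.ansi.enabled` comes from this thread; the helper name `with_ansi_mode` and the `base_conf` dictionary are hypothetical, and how the resulting dictionary reaches the test session depends on the test harness.

```python
ANSI_KEY = "spark.sql.ansi.enabled"  # config key quoted in the comment above


def with_ansi_mode(base_conf, enabled):
    """Return a copy of base_conf with ANSI mode pinned on or off.

    Hypothetical helper: config values are passed as strings here,
    and the input dictionary is left unmodified.
    """
    conf = dict(base_conf)
    conf[ANSI_KEY] = "true" if enabled else "false"
    return conf


# Hypothetical base configuration; the extra key is only for illustration.
base_conf = {"spark.sql.shuffle.partitions": "4"}
ansi_off = with_ansi_mode(base_conf, enabled=False)
ansi_on = with_ansi_mode(base_conf, enabled=True)
```

Returning a copy rather than mutating `base_conf` keeps the two variants independent, which matters when the same base settings are reused across many tests.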

mythrocks added a commit to mythrocks/spark-rapids that referenced this issue Jul 16, 2024
Fixes NVIDIA#11018.

This commit fixes the hash aggregate tests that fail with ANSI enabled.

These tests fail most visibly on Spark 4.0, where ANSI mode is enabled by default.

Signed-off-by: MithunR <mithunr@nvidia.com>
mythrocks added a commit that referenced this issue Jul 18, 2024

* Fix hash-aggregate tests failing in ANSI mode

Fixes #11018.  

This commit fixes the tests in `hash_aggregate_test.py` so that they run correctly with ANSI enabled. This is essential for running the tests on Spark 4.0, where ANSI mode is on by default.

The vast majority of the tests here exercise aggregations such as `SUM`, `COUNT`, and `AVG`, which fall back to the CPU on account of #5114. These tests have been marked with `@disable_ansi_mode` so that they run to completion correctly. They may be revisited after #5114 has been addressed.
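
To make the marker idea concrete, here is a minimal, self-contained sketch of how a decorator like `@disable_ansi_mode` could tag a test so that a harness forces ANSI off for it. This is not the project's implementation: the attribute name `_ansi_disabled`, the function `effective_conf`, and both example tests are assumptions made for illustration.

```python
ANSI_KEY = "spark.sql.ansi.enabled"


def disable_ansi_mode(test_fn):
    """Hypothetical stand-in for the marker: tag the test function."""
    test_fn._ansi_disabled = True
    return test_fn


def effective_conf(test_fn, session_conf):
    """Return the config a harness would use for test_fn.

    Tagged tests always get ANSI off; untagged tests inherit the
    session setting unchanged.
    """
    conf = dict(session_conf)
    if getattr(test_fn, "_ansi_disabled", False):
        conf[ANSI_KEY] = "false"
    return conf


@disable_ansi_mode
def test_sum_example():
    """Placeholder for a test whose aggregation falls back to the CPU."""


def test_other_example():
    """Placeholder for a test that keeps the session default."""


# Session where ANSI is on, as it is by default on Spark 4.0.
spark4_defaults = {ANSI_KEY: "true"}
marked_conf = effective_conf(test_sum_example, spark4_defaults)
unmarked_conf = effective_conf(test_other_example, spark4_defaults)
```

In this sketch the marker only records intent; the decision to override the setting lives in one place (`effective_conf`), so the tags can be dropped later without touching the test bodies.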

In cases where #5114 does not apply, the tests have been modified to run with ANSI on and off.
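
Running a test with ANSI both on and off can be sketched as a loop over the two settings. The version below is a plain-Python approximation with hypothetical names (`run_in_both_ansi_modes`, `toy_count`); in a pytest suite the same effect is typically achieved with `pytest.mark.parametrize`.

```python
ANSI_KEY = "spark.sql.ansi.enabled"


def run_in_both_ansi_modes(check, base_conf=None):
    """Call check(conf) once with ANSI on and once with ANSI off.

    Returns a dict mapping the ANSI setting ("true"/"false") to the
    value returned by check, so callers can compare the two runs.
    """
    results = {}
    for ansi in ("true", "false"):
        conf = dict(base_conf or {})
        conf[ANSI_KEY] = ansi
        results[ansi] = check(conf)
    return results


def toy_count(conf):
    """Toy stand-in for a query; a real test would run Spark with conf."""
    assert conf[ANSI_KEY] in ("true", "false")
    rows = [1, None, 3]
    # COUNT-style aggregation: count the non-null values.
    return sum(1 for r in rows if r is not None)


results = run_in_both_ansi_modes(toy_count)
# The toy aggregation does not depend on ANSI mode, so both runs agree.
assert results["true"] == results["false"]
```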

---------

Signed-off-by: MithunR <mithunr@nvidia.com>