
Don't distinguish between types of ArithmeticException for Spark 3.2.x #5483

Merged

Conversation

@HaoYang670 (Collaborator) commented May 13, 2022

Signed-off-by: remzi 13716567376yh@gmail.com

Close #5474

In the test test_mod_pmod_by_zero, Spark 3.1.x always throws java.lang.ArithmeticException, Spark 3.3.x always throws SparkArithmeticException, but Spark 3.2.x may throw either of them. So we don't distinguish between the two for Spark 3.2.x.
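One way to tolerate either exception on Spark 3.2.x is to assert only on the "ArithmeticException" substring that both class names share. A minimal, Spark-free sketch of that idea (the `run_div_by_zero` helper and the error messages are hypothetical stand-ins for the real Spark behavior):

```python
def run_div_by_zero(error_name):
    # Hypothetical stand-in for the Spark job under test: on a real cluster,
    # Spark 3.1.x raises java.lang.ArithmeticException, 3.3.x raises
    # SparkArithmeticException, and 3.2.x may raise either.
    raise ArithmeticError(error_name + ": divide by zero")

def raises_arithmetic(error_name):
    # Check only the shared "ArithmeticException" substring, so both
    # exception variants pass the same assertion.
    try:
        run_div_by_zero(error_name)
    except ArithmeticError as e:
        return "ArithmeticException" in str(e)
    return False

assert raises_arithmetic("java.lang.ArithmeticException")
assert raises_arithmetic("org.apache.spark.SparkArithmeticException")
```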

Signed-off-by: remzi <13716567376yh@gmail.com>
@HaoYang670 (Collaborator, Author)

build

@HaoYang670 HaoYang670 self-assigned this May 13, 2022
@tgravescs (Collaborator) left a comment

When you say it throws both exceptions, is it for different operations? It would be better to split the test apart by operation in that case.

@jlowe jlowe changed the title from "Don't distinguish butween types of ArithmeticException for spark32X." to "Don't distinguish between types of ArithmeticException for Spark 3.2.x" May 13, 2022
@jlowe jlowe added this to the May 2 - May 20 milestone May 13, 2022
@sameerz sameerz added the bug Something isn't working label May 13, 2022
@HaoYang670 (Collaborator, Author) commented May 14, 2022

When you say it throws both exceptions, is it for different operations? It would be better to split the test apart by operation in that case.

For Spark 3.2.x, this test will throw

java.lang.ArithmeticException: Decimal(expanded,-12,2,0}) cannot be represented as Decimal(7, 7).
  at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotChangeDecimalPrecisionError(QueryExecutionErrors.scala:97)

and

SparkArithmeticException(errorClass = "DIVIDE_BY_ZERO",..)

Do you mean to move pmod(cast(-12 as {}), cast(0 as {})) and cast(-12 as {}) % cast(0 as {}) with the Decimal(7, 7) data gen into a new test?

@nartal1 (Collaborator) commented May 17, 2022

Looks like it throws java.lang.ArithmeticException for DecimalType when precision=scale, and SparkArithmeticException for all other cases. Can we have a separate test for Spark 3.2 which shows this difference?

@tgravescs (Collaborator)

Yes, what @nartal1 said: move the ones that throw java.lang.ArithmeticException to a different test than the ones that throw SparkArithmeticException, so that we can properly check the full exception type.

@HaoYang670 (Collaborator, Author)

Yes, what @nartal1 said: move the ones that throw java.lang.ArithmeticException to a different test than the ones that throw SparkArithmeticException, so that we can properly check the full exception type.

I could do that, but I am afraid the test code would be less readable, because I would have to skip the cases that cast -12 to decimal(7, 7). Maybe something like this:

@pytest.mark.parametrize('data_gen', _arith_data_gens_no_neg_scale, ids=idfn)
@pytest.mark.parametrize('overflow_exp', [
    'pmod(a, cast(0 as {}))',
    'pmod(cast(-12 as {}), cast(0 as {}))',
    'a % (cast(0 as {}))',
    'cast(-12 as {}) % cast(0 as {})'], ids=idfn)
def test_mod_pmod_by_zero(data_gen, overflow_exp):
    # -12 cannot cast to decimal(7, 7), we will test this case in test_cannot_cast_to_decimal
    if (data_gen is _decimal_gen_7_7) and ("cast(-12 as {})" in overflow_exp):
        pass
    else:
        ...
        assert_gpu_and_cpu_error(
            ...
            java.lang.ArithmeticException if < 3.2.0 else
            org.apache.spark.SparkArithmeticException
            ...
        )

@pytest.mark.parametrize('data_gen', _arith_data_gens_no_neg_scale, ids=idfn)
@pytest.mark.parametrize('overflow_exp', [
    'pmod(cast(-12 as {}), cast(0 as {}))',
    'cast(-12 as {}) % cast(0 as {})'], ids=idfn)
def test_cannot_cast_to_decimal(data_gen, overflow_exp):
      assert_gpu_and_cpu_error(
          ...
          java.lang.ArithmeticException if < 3.3.0 else
          org.apache.spark.SparkArithmeticException
          ...
      )
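The version dispatch shared by the two sketches above could be factored into a small helper. A Spark-free illustration (the helper name is made up; the version thresholds mirror the ones in the sketches, 3.2.0 for divide-by-zero and 3.3.0 for the cast overflow):

```python
def expected_exception_name(spark_version, threshold=(3, 2, 0)):
    # Versions below the threshold throw the plain JVM exception;
    # the threshold and later throw Spark's own subclass.
    if spark_version < threshold:
        return "java.lang.ArithmeticException"
    return "org.apache.spark.SparkArithmeticException"

# The two tests would differ only in the threshold they compare against:
assert expected_exception_name((3, 1, 2)) == "java.lang.ArithmeticException"
assert expected_exception_name((3, 2, 1)) == "org.apache.spark.SparkArithmeticException"
assert expected_exception_name((3, 2, 1), threshold=(3, 3, 0)) == "java.lang.ArithmeticException"
```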

@HaoYang670 (Collaborator, Author)

cc @res-life. What's your opinion?

@res-life (Collaborator)

From the name test_mod_pmod_by_zero, it's testing division by zero, so casting -12 to decimal(7, 7) is actually another test case, about cast.
In my opinion, just remove decimal(7,7) from _arith_data_gens_no_neg_scale:

@pytest.mark.parametrize('data_gen', _arith_data_gens_no_neg_scale.remove(`decimal(7,7)`) , ids=idfn) 
test_mod_pmod_by_zero

And add a cast case
test_cast_ArithmeticException ...

Is this OK, @HaoYang670?

@HaoYang670 (Collaborator, Author)

just remove decimal(7,7) from _arith_data_gens_no_neg_scale

I am concerned this would decrease the test coverage, as pmod(a, cast(0 as decimal(7, 7))) and a % cast(0 as decimal(7, 7)) would also be removed.

@jlowe (Contributor) commented May 19, 2022

Tests should not normally be special-casing their parametric inputs. That implies there should be a separate test case potentially with separate parametric inputs.

If _arith_data_gens_no_neg_scale is problematic for pmod, then create a new one for pmod. You can build one off the other, e.g.:

_arith_data_gens_no_neg_scale_for_pmod = ...
_arith_data_gens_no_neg_scale = _arith_data_gens_no_neg_scale_for_pmod + ...

Names aren't great, but you get the idea. Let's not complicate test code by checking for special inputs -- that's a different test case. If there's too much duplication between tests at that point, refactor the common parts into a utility method that multiple tests can leverage.

Signed-off-by: remzi <13716567376yh@gmail.com>
@HaoYang670 (Collaborator, Author)

build

@tgravescs (Collaborator) left a comment

Looks like we did drop a couple of tests like pmod(cast(-12 as {}), but I'm not sure that's necessary, since it's the cast that is throwing.

@tgravescs tgravescs merged commit c9f3595 into NVIDIA:branch-22.06 May 20, 2022
@HaoYang670 HaoYang670 deleted the spark321_test_mod_pmod_by_zero_bug branch May 20, 2022 21:51

Successfully merging this pull request may close these issues.

[BUG] Spark 3.2.1 arithmetic_ops_test failures
6 participants