Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Safeguard against potential inexact row count being smaller than exact null count #9007

Merged
merged 2 commits into from
Jan 27, 2024

Conversation

gruuya
Copy link
Contributor

@gruuya gruuya commented Jan 26, 2024

Which issue does this PR close?

Closes #9006.

Rationale for this change

Sometimes an ineact row count can be smaller than the exact null count which leads to a panic.

What changes are included in this PR?

Do a checked subtraction to get the max distinct count estimate, and if it over(under)flows return Inexact(0).

Are these changes tested?

They are tested against the case presented in #9006, and there's a unit test added.

Are there any user-facing changes?

Avoid panic during execution.

@gruuya gruuya force-pushed the max-distinct-count-overflow branch from be142d5 to 81781ff Compare January 26, 2024 11:32
// To safeguard against inexact number of rows (e.g. 0) being smaller than
// an exact null count we need to do a checked subtraction.
match count.checked_sub(*stats.null_count.get_value().unwrap_or(&0)) {
None => Precision::Inexact(0),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made more sense than Precision::Absent.

Also, I'm not sure whether this can happen below as well, i.e. an inexact null count being larger than an exact row count.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree -- the use of Statistics::get_value() I think may also have other bugs as get_value() may be exact or inexact but there are some places in the code that treat it as though it were always exact (like here)

I have hopes to improve statistics in general (see #8227) but other higher priority things have kept me busy. I think @berkaysynnada was also working on this item for a while -- I am not sure if they have any short term plans

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue has been out of my focus for a while. I can help those who wish to take it on and make progress. Unfortunately, addressing this issue is not in my short-term plans.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @gruuya -- this looks good to me (and is well tested 👌 )

// To safeguard against inexact number of rows (e.g. 0) being smaller than
// an exact null count we need to do a checked subtraction.
match count.checked_sub(*stats.null_count.get_value().unwrap_or(&0)) {
None => Precision::Inexact(0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree -- the use of Statistics::get_value() I think may also have other bugs as get_value() may be exact or inexact but there are some places in the code that treat it as though it were always exact (like here)

I have hopes to improve statistics in general (see #8227) but other higher priority things have kept me busy. I think @berkaysynnada was also working on this item for a while -- I am not sure if they have any short term plans

@@ -1670,133 +1677,156 @@ mod tests {
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked that this code fails without the code change in this PR 👍

attempt to subtract with overflow
thread 'joins::utils::tests::test_inner_join_cardinality_single_column' panicked at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/arith.rs:217:1:
attempt to subtract with overflow
stack backtrace:
   0: rust_begin_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:127:5
   3: <usize as core::ops::arith::Sub>::sub
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/arith.rs:217:1
   4: <&usize as core::ops::arith::Sub<&usize>>::sub
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/internal_macros.rs:55:17
   5: datafusion_physical_plan::joins::utils::max_distinct_count
             at ./src/joins/utils.rs:960:40
   6: datafusion_physical_plan::joins::utils::estimate_inner_join_cardinality
             at ./src/joins/utils.rs:911:33
   7: datafusion_physical_plan::joins::utils::tests::test_inner_join_cardinality_single_column
             at ./src/joins/utils.rs:1818:17
   8: datafusion_physical_plan::joins::utils::tests::test_inner_join_cardinality_single_column::{{closure}}
             at ./src/joins/utils.rs:1666:55
   9: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
  10: core::ops::function::FnOnce::call_once
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: test failed, to rerun pass `--lib`
error: 1 target failed:
    `--lib`

@alamb alamb merged commit a7a74fa into apache:main Jan 27, 2024
22 checks passed
@alamb
Copy link
Contributor

alamb commented Jan 27, 2024

Thanks again @gruuya

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attempt to subtract with overflow panic in max_distinct_count
3 participants