
Substrait support for propagating TableScan.filters to Substrait ReadRel.filter #14194


Contributor

@jamxia155 jamxia155 commented Jan 19, 2025

Which issue does this PR close?

Closes #14193.

Rationale for this change

The Substrait producer currently does not propagate TableScan.filters into the Substrait ReadRel, so Substrait consumers lose the filter predicate pushdown information.

What changes are included in this PR?

The conjunction of exact filters in TableScan.filters is saved to Substrait ReadRel.filter.
The conjunction of inexact filters in TableScan.filters is saved to Substrait ReadRel.best_effort_filter.
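
As a rough illustration of what "conjunction" means here, the sketch below folds a list of predicates into one AND-ed expression and yields nothing for an empty list, in which case the corresponding ReadRel field would presumably stay unset. The `Expr` enum and the `conjunction` function are simplified stand-ins written for this example; they are not the DataFusion or Substrait types.

```rust
// Simplified stand-in for a logical expression (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Fold a list of predicates into a single AND-ed expression.
/// Returns `None` when the list is empty.
fn conjunction(filters: Vec<Expr>) -> Option<Expr> {
    filters
        .into_iter()
        .reduce(|acc, next| Expr::And(Box::new(acc), Box::new(next)))
}

/// Hypothetical helper that builds the predicate `column > value`.
fn gt(column: &str, value: i64) -> Expr {
    Expr::Gt(
        Box::new(Expr::Column(column.to_string())),
        Box::new(Expr::Literal(value)),
    )
}
```

With the two filters `gt("a", 1)` and `gt("b", 2)`, `conjunction` returns a single `And` node wrapping both; with no filters it returns `None`.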

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the substrait Changes to the substrait crate label Jan 19, 2025
@jamxia155
Contributor Author

Hi @jonahgao, I've fixed the previous workflow failure. Would you be able to approve the latest workflow, please? Or is there a set of checks you would recommend I run offline first? Thanks.

@jonahgao
Member

> Hi @jonahgao, I've fixed the previous workflow failure. Would you be able to approve the latest workflow please? Or would you recommend a set of checks I should do offline first? Thanks.

Done. The most commonly used offline checks are cargo clippy and cargo test.

cargo clippy --all-targets --workspace --features avro,pyarrow -- -D warnings
cargo test --lib --tests --bins --features avro,json

@Blizzara
Contributor

would it be possible to add the matching support to consumer as well?

@jamxia155 jamxia155 changed the title Substrait support for propagating TableScan.filters to Substrait ReadRel.best_effort_filter Substrait support for propagating TableScan.filters to Substrait ReadRel.filter Jan 22, 2025
@jamxia155 jamxia155 changed the title Substrait support for propagating TableScan.filters to Substrait ReadRel.filter Substrait support for propagating TableScan.filters to Substrait ReadRel.filter and ReadRel.best_effort_filter Jan 23, 2025
@jamxia155 jamxia155 force-pushed the jamxia_substrait_logical_producer_ReadRel_best_effort_filter branch from f09b057 to 6c6c74c Compare February 4, 2025 02:03
@jamxia155
Contributor Author

jamxia155 commented Feb 4, 2025

> would it be possible to add the matching support to consumer as well?

Hi @Blizzara, I finally got the time to add the matching support to the consumer. Please share any comments if you get a chance. Thanks.

Contributor

@Blizzara Blizzara left a comment

Thanks for adding the consumer part! I left some nits, but overall the logic seems sane to me now

@jamxia155 jamxia155 force-pushed the jamxia_substrait_logical_producer_ReadRel_best_effort_filter branch from 6c6c74c to 43717c4 Compare February 14, 2025 19:16
@alamb alamb mentioned this pull request Feb 16, 2025
@alamb alamb mentioned this pull request Feb 24, 2025
jamxia155 added 11 commits March 1, 2025 08:41
Propagate information in datafusion::logical_expr::TableScan.filters
to substrait::proto::ReadRel.best_effort_filter.
Use TableScan.source.supports_filters_pushdown() to determine if each
filter in TableScan.filters should be included in ReadRel.filter or
ReadRel.best_effort_filter
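
A minimal sketch of the routing these commit messages describe, assuming the table source reports one pushdown level per filter: exact filters go to `filter` and inexact ones to `best_effort_filter`. `PushDown`, `RoutedFilters`, and `route_filters` are hypothetical names for illustration (the enum loosely mirrors DataFusion's `TableProviderFilterPushDown`), predicates are plain strings, and dropping unsupported filters is an assumption of this sketch rather than something the PR states.

```rust
/// Stand-in for the pushdown level a table source reports for one filter.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    Exact,
    Inexact,
    Unsupported,
}

/// Hypothetical holder for the predicates destined for each ReadRel field.
#[derive(Debug, Default, PartialEq)]
struct RoutedFilters {
    filter: Vec<String>,
    best_effort_filter: Vec<String>,
}

/// Route each filter according to the pushdown level reported for it.
fn route_filters(filters: &[&str], support: &[PushDown]) -> RoutedFilters {
    assert_eq!(filters.len(), support.len(), "one pushdown level per filter");
    let mut routed = RoutedFilters::default();
    for (f, level) in filters.iter().zip(support) {
        match level {
            // The source guarantees the predicate is fully applied.
            PushDown::Exact => routed.filter.push((*f).to_string()),
            // The source may apply the predicate only partially.
            PushDown::Inexact => routed.best_effort_filter.push((*f).to_string()),
            // Assumption of this sketch: unsupported filters are not emitted.
            PushDown::Unsupported => {}
        }
    }
    routed
}
```

Each list could then be AND-ed into a single expression before being stored in the corresponding field.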
@jamxia155 jamxia155 force-pushed the jamxia_substrait_logical_producer_ReadRel_best_effort_filter branch from 43717c4 to f803557 Compare March 1, 2025 16:42
@jamxia155 jamxia155 changed the title Substrait support for propagating TableScan.filters to Substrait ReadRel.filter and ReadRel.best_effort_filter Substrait support for propagating TableScan.filters to Substrait ReadRel.filter and ReadRel.filter Mar 2, 2025
@alamb alamb mentioned this pull request Mar 3, 2025
Contributor

@vbarua vbarua left a comment

This looks good to me from a Substrait perspective ✨

Thanks for following through on this @jamxia155 🙇

@jamxia155 jamxia155 changed the title Substrait support for propagating TableScan.filters to Substrait ReadRel.filter and ReadRel.filter Substrait support for propagating TableScan.filters to Substrait ReadRel.filter Mar 6, 2025
Contributor

@alamb alamb left a comment

Thanks @jamxia155 and @vbarua

I am sorry I missed this. Please do feel free to mention me or one of the other committers when a PR is ready for review.

@jamxia155
Contributor Author

Hi @alamb, thanks for reviewing. Could you please approve the workflows?

@alamb alamb merged commit 04d823b into apache:main Mar 11, 2025
24 checks passed
@alamb
Contributor

alamb commented Mar 11, 2025

Thanks again @jamxia155 @vbarua @Blizzara and everyone else!

Development

Successfully merging this pull request may close these issues.

Substrait support for propagating TableScan.filters to Substrait ReadRel
7 participants