-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Improve push down filter of join #13184
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Improve push down filter of join #13184
Conversation
Thanks everyone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR has correctness issues see: #13211
99 NULL NULL NULL | ||
|
||
query ITIT | ||
SELECT * FROM t1 LEFT JOIN t2 ON t1.t1_id = t2.t2_id AND t2.t2_name = 'z' ORDER BY t1_id, t1_name, t2_id, t2_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the predicates here are "to simple" and a lot of the outer joins here will be converted to Inner
joins by eliminate_outer_join rule and therefore these tests do not provide the desired coverage.
let columns = predicate.column_refs(); | ||
macro_rules! restrict_null { | ||
() => {{ | ||
let predicate_cloned = predicate.clone(); | ||
let cols = columns.iter().cloned(); | ||
is_restrict_null_predicate(predicate_cloned, cols).unwrap_or(false) | ||
}}; | ||
} | ||
|
||
if checker.left_only(&columns) && (left_preserved || restrict_null!()) { | ||
left_push.push(predicate); | ||
} else if right_preserved && checker.is_right_only(&predicate) { | ||
} else if checker.right_only(&columns) && (right_preserved || restrict_null!()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems incorrect. Why would a predicate being restrict null allow us to push it? It filtering it out null seems to be a argument against why it can be pushed down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is wrong. I neglected to modify JoinType here. Thank you @eejbyfeldt for your help in finding this problem.
This reverts commit 7ae1ccb.
For anyone following along, this PR appears to have had some correctness issues so @eejbyfeldt reverted it #13229 @JasonLi-cn are you willing to create a new PR that we can work through the correctness issues |
I think for this pushdown to be correct, the join type can be changed. I think this can be done in two phases:
|
Just adding to it, I think supporting this would be relatively simple. Currently eliminate_outer_join only supports columns that are non nullable, we can support expressions that filter out any nulls like most binary operators.
|
Hm actually, the transformation is already supported by |
I tracked this here #13232 |
I'm sorry that this PR has introduced bugs. I will continue to follow up this issue, and then please continue to help review. 🙏 |
Which issue does this PR close?
Closes #.
Rationale for this change
In the current version, the push down filter of Join is not very complete. For example:
Expect
BTW: If
eliminate_outer_join
can convert Left Join to Inner Join in this query, the filterabs(t2.c1) > Int32(5)
can also be pushed down. So the next plan is to improveeliminate_outer_join
rule. It's a little difficult.What changes are included in this PR?
Modify the
push_down_all_join
functionin
push_down_filter` rule.Are these changes tested?
yes
Are there any user-facing changes?