Determine automatically if push join to table scan #6818

Member

@losipiuk losipiuk commented Feb 4, 2021

On top of: #6752

Review last commit only.

@cla-bot cla-bot bot added the cla-signed label Feb 4, 2021
@losipiuk losipiuk requested a review from findepi February 4, 2021 13:04
@losipiuk losipiuk force-pushed the lo/oportunistic-join-pushdown-cost-based branch from 5513f17 to 92dbbb0 Compare February 4, 2021 13:18
PlanNodeStatsEstimate joinStats = context.getStatsProvider().getStats(joinNode);
PlanNodeStatsEstimate leftStats = context.getStatsProvider().getStats(joinNode.getLeft());
PlanNodeStatsEstimate rightStats = context.getStatsProvider().getStats(joinNode.getRight());
if (joinStats.isOutputRowCountUnknown() || leftStats.isOutputRowCountUnknown() || rightStats.isOutputRowCountUnknown()) {
Member

@raunaqmorarka raunaqmorarka Feb 4, 2021

What if one of the left or right output row counts is known and larger than the join output row count? Why not push down the join in that case as well?

Member Author

Yeah, we could, though it is a strictly theoretical case: if we do not know either the left or the right size, we would not know the join size :)

Member

@raunaqmorarka raunaqmorarka Feb 4, 2021

Ah right, I missed that. Any particular reason for basing this on row count instead of size?

Member Author

Not really. Probably size would be more appropriate. I will see how painful it is to change that.
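
To illustrate why the two measures can disagree, here is a minimal sketch with made-up numbers. It assumes a size estimate of row count times average row width; the class, the helper names, and the byte widths are all hypothetical and not taken from the PR.

```java
class RowCountVsSizeSketch {
    // Hypothetical size estimate: rows times an assumed average row width in bytes.
    static double estimatedBytes(double rowCount, double avgRowBytes) {
        return rowCount * avgRowBytes;
    }

    // True when the join output is no larger than both inputs combined.
    static boolean joinIsSmaller(double join, double left, double right) {
        return join <= left + right;
    }

    public static void main(String[] args) {
        // Made-up numbers: joined rows carry columns from both sides, so they are wider.
        double leftRows = 1000, rightRows = 1000, joinRows = 1500;
        double leftBytes = estimatedBytes(leftRows, 20);
        double rightBytes = estimatedBytes(rightRows, 30);
        double joinBytes = estimatedBytes(joinRows, 50);

        System.out.println("by row count: " + joinIsSmaller(joinRows, leftRows, rightRows));
        System.out.println("by size:      " + joinIsSmaller(joinBytes, leftBytes, rightBytes));
    }
}
```

With these numbers the row-count comparison favors pushdown while the size comparison does not, which is the kind of case where the choice of measure matters.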

@findepi
Member

findepi commented Feb 5, 2021

> On top of: #6752

I plan to review this once that one is merged.

@losipiuk losipiuk force-pushed the lo/oportunistic-join-pushdown-cost-based branch from 92dbbb0 to 0e8b453 Compare February 9, 2021 15:03
/**
* Determine automatically if push join to connector
*/
AUTOMATIC,
Member

it is safe to make it the default

PlanNodeStatsEstimate joinStats = context.getStatsProvider().getStats(joinNode);
PlanNodeStatsEstimate leftStats = context.getStatsProvider().getStats(joinNode.getLeft());
PlanNodeStatsEstimate rightStats = context.getStatsProvider().getStats(joinNode.getRight());
if (joinStats.isOutputRowCountUnknown() || leftStats.isOutputRowCountUnknown() || rightStats.isOutputRowCountUnknown()) {
Member

Since stats calculation can be costly (e.g. it can involve a trip to the metastore), short-circuit the calculation as early as you can.
To keep this readable, please extract the condition to a separate method.
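
A minimal sketch of the short-circuit idea, assuming each statistic is fetched lazily and that NaN stands in for "row count unknown". The class, the method name `anyRowCountUnknown`, and the use of `DoubleSupplier` are illustrative assumptions, not the PR's code.

```java
import java.util.function.DoubleSupplier;

class ShortCircuitStatsSketch {
    // Hypothetical extracted condition. Each supplier stands in for a
    // potentially expensive stats lookup; NaN means "unknown" (assumption).
    static boolean anyRowCountUnknown(DoubleSupplier join, DoubleSupplier left, DoubleSupplier right) {
        // || evaluates left to right and stops at the first true operand,
        // so later suppliers are never invoked once an unknown is found.
        return Double.isNaN(join.getAsDouble())
                || Double.isNaN(left.getAsDouble())
                || Double.isNaN(right.getAsDouble());
    }

    public static void main(String[] args) {
        int[] lookups = {0};
        DoubleSupplier unknown = () -> { lookups[0]++; return Double.NaN; };
        DoubleSupplier known = () -> { lookups[0]++; return 100.0; };

        boolean result = anyRowCountUnknown(unknown, known, known);
        System.out.println(result + " after " + lookups[0] + " lookup(s)");
    }
}
```

Because the first lookup already reports an unknown, the other two suppliers are not called at all.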

@@ -114,6 +115,19 @@ public Result apply(JoinNode joinNode, Captures captures, Context context)
return Result.empty();
}

if (getJoinPushdownMode(context.getSession()) == JoinPushdownMode.AUTOMATIC) {
Member

ideally use a switch, and make it exhaustive, future proofing for the case when we add something like AUTOMATIC_EAGER (which we don't have to add yet, but we may want to add in the future)

Member

nvm, in this case it doesn't matter -- this is the only place the enum is used, so there is no way it gets forgotten and not updated
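
The suggestion was withdrawn, but for reference an exhaustive switch over the mode could look roughly like the sketch below. The enum here is a local stand-in with only the two values visible in this thread, and `requiresCostCheck` is a hypothetical name.

```java
class ExhaustiveSwitchSketch {
    // Local stand-in for the real enum; only values seen in this thread are listed.
    enum JoinPushdownMode { DISABLED, AUTOMATIC }

    // Hypothetical helper: does this mode need the cost-based check?
    static boolean requiresCostCheck(JoinPushdownMode mode) {
        // No default branch: with missing-case compiler or linter checks enabled,
        // adding a new enum value flags this switch for an update.
        switch (mode) {
            case DISABLED:
                return false;
            case AUTOMATIC:
                return true;
        }
        throw new IllegalArgumentException("Unsupported join pushdown mode: " + mode);
    }

    public static void main(String[] args) {
        for (JoinPushdownMode mode : JoinPushdownMode.values()) {
            System.out.println(mode + " -> " + requiresCostCheck(mode));
        }
    }
}
```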

return Result.empty();
}

if (joinStats.getOutputRowCount() > leftStats.getOutputRowCount() + rightStats.getOutputRowCount()) {
Member

While this is not rocket science, it'd be nice to add a comment, e.g. explaining why we're choosing + over max.
From my perspective it was a 'random thought from findepi' (and I don't feel strongly), but still, let's save future readers the suffering and try to word some explanation.

Member Author

I added some reasoning. Not sure if helpful

@losipiuk losipiuk force-pushed the lo/oportunistic-join-pushdown-cost-based branch from 0e8b453 to 70680d7 Compare February 10, 2021 16:10
@losipiuk
Member Author

ac

@@ -135,7 +135,7 @@
private DataSize filterAndProjectMinOutputPageSize = DataSize.of(500, KILOBYTE);
private int filterAndProjectMinOutputPageRowCount = 256;
private int maxGroupingSets = 2048;
- private JoinPushdownMode joinPushdownMode = JoinPushdownMode.DISABLED;
+ private JoinPushdownMode joinPushdownMode = JoinPushdownMode.AUTOMATIC;
Member

see conversation about code level documentation in the other pr

Member Author

Added comment as a separate commit before introducing AUTOMATIC mode.

@@ -114,6 +117,10 @@ public Result apply(JoinNode joinNode, Captures captures, Context context)
return Result.empty();
}

if (getJoinPushdownMode(context.getSession()) == JoinPushdownMode.AUTOMATIC && !shouldProceedWithPushDown(joinNode, context)) {
Member

I think the getJoinPushdownMode should be consulted inside shouldProceedWithPushDown
(or you'd want to rename the method to indicate it's appropriate for "automatic" mode only)

Member Author

Renamed the method to skipJoinPushdownBasedOnCost (reversing true/false return value semantics), and moved getJoinPushdownMode(context.getSession()) == JoinPushdownMode.AUTOMATIC inside.
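
A rough sketch of what the renamed check could look like once the mode test lives inside it. The method name and the `join > left + right` comparison come from this thread; the plain `double` parameters, NaN as "unknown", and the local enum are simplifying assumptions rather than the actual signature.

```java
class CostBasedSkipSketch {
    // Local stand-in for the real enum; only values seen in this thread are listed.
    enum JoinPushdownMode { DISABLED, AUTOMATIC }

    // Returns true when pushdown should be skipped for cost reasons.
    // Sizes are plain doubles here; NaN stands in for "unknown" (assumption).
    static boolean skipJoinPushdownBasedOnCost(
            JoinPushdownMode mode, double joinSize, double leftSize, double rightSize) {
        if (mode != JoinPushdownMode.AUTOMATIC) {
            // The cost check only applies in automatic mode; other modes
            // are assumed to be handled by the caller.
            return false;
        }
        if (Double.isNaN(joinSize) || Double.isNaN(leftSize) || Double.isNaN(rightSize)) {
            // Without statistics there is no basis for pushing the join down.
            return true;
        }
        // Skip when the join is expected to produce more than both inputs combined.
        return joinSize > leftSize + rightSize;
    }

    public static void main(String[] args) {
        JoinPushdownMode auto = JoinPushdownMode.AUTOMATIC;
        System.out.println(skipJoinPushdownBasedOnCost(auto, 150, 100, 100));
        System.out.println(skipJoinPushdownBasedOnCost(auto, 300, 100, 100));
        System.out.println(skipJoinPushdownBasedOnCost(auto, Double.NaN, 100, 100));
    }
}
```

Keeping the mode test inside the method means the call site reduces to a single condition, which is what the review comment asked for.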

Add "automatic" mode of join pushdown operation. In that mode a join will
only be pushed down into the table scan if statistics are available for the
join node and both source table scan nodes, and if the expected number of rows
coming out of the join is less than the total number of rows from both sources.
@losipiuk losipiuk force-pushed the lo/oportunistic-join-pushdown-cost-based branch from 70680d7 to 2e3ebdf Compare February 11, 2021 11:43
@@ -135,16 +135,7 @@
private DataSize filterAndProjectMinOutputPageSize = DataSize.of(500, KILOBYTE);
Member

@sopel39 sopel39 Feb 11, 2021

Even if the number of rows after pushdown is smaller than without pushdown, it could significantly increase the CPU overhead of the underlying source (table scans might be much cheaper than a join). I think it would be great to determine the impact of pushdown on underlying connectors. It could be that join pushdown is beneficial only when joins are very non-selective and users don't want the CPU usage of the underlying connector to increase significantly.

Member Author

Agreed. Still, I would assume that you will be able to disable pushdown at the per-connector level in configuration, as well as per query using the session.

Member

Totally -- #6874 provides both catalog level config and session toggle.

return true;
}

if (joinOutputSize > leftOutputSize + rightOutputSize) {
Member

Consider adding some factor here, e.g. a pushed-down join should produce 2x fewer rows than in Trino. Such a factor might need to be established empirically.

Member

So you mean to replace left + right with max(left, right) * 0.5? Works for me, given that the current formula is not very scientifically determined.
I think we should do "something reasonable" & iterate.

Member Author

Yeah, I find an initial factor value of 1.0 as good as 0.5.
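
For comparison, the factor-based variant discussed above could be sketched as follows. The `max(left, right) * factor` form is taken from the comment; the class, the method name, and the sample numbers are hypothetical, and the factor value itself would still need to be established empirically.

```java
class PushdownFactorSketch {
    // Hypothetical check: push down only if the join output is at most
    // `factor` times the larger of the two inputs.
    static boolean pushdownLooksBeneficial(double joinSize, double leftSize, double rightSize, double factor) {
        return joinSize <= factor * Math.max(leftSize, rightSize);
    }

    public static void main(String[] args) {
        // Made-up sizes: the join output is 70% of the larger input.
        double join = 70, left = 100, right = 60;
        System.out.println("factor 1.0: " + pushdownLooksBeneficial(join, left, right, 1.0));
        System.out.println("factor 0.5: " + pushdownLooksBeneficial(join, left, right, 0.5));
    }
}
```

With these sample sizes a factor of 1.0 allows the pushdown and a factor of 0.5 rejects it, so the factor acts as a tunable margin on how much reduction is required.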

@losipiuk losipiuk closed this Feb 24, 2021
4 participants