Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[SPARK-39551][SQL] Add AQE invalid plan check #36953

Closed
wants to merge 2 commits into from

Conversation

maryannxue
Copy link
Contributor

@maryannxue maryannxue commented Jun 22, 2022

What changes were proposed in this pull request?

This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

Why are the changes needed?

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., BroadcastExchangeExec can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added UT.

@github-actions github-actions bot added the SQL label Jun 22, 2022
case BuildLeft => (b.left, b.right)
case BuildRight => (b.right, b.left)
}
if (!buildPlan.isInstanceOf[BroadcastQueryStageExec]) {
Copy link
Contributor

@ulysses-you ulysses-you Jun 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the BroadcastHashJoinExec is from a sort merge join, it's direct child should be BroadcastExchangeExec ? for example, two joins, the first join is BroadcastHashJoinExec and the second join is sort merge join at first then after re-optimze it changes to broadcast join.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, got it. It checks the BroadcastQueryStageExec can only appear at the direct child of broadcast join but not check the direct child must be the BroadcastQueryStageExec.

@ulysses-you
Copy link
Contributor

It seems the different issue with #36845, this pr checks the plan itself for broadcast-like join. And #36845 check the whole plan (e,g. subquery with DDP's) for broadcast-like join. We should take both of them..

@maryannxue
Copy link
Contributor Author

@ulysses-you You can just add your specific check patterns into ValidateSparkPlan so the main AQE flow looks clean.

@cloud-fan
Copy link
Contributor

Yea this is a framework and we can add more checks to it.

@cloud-fan
Copy link
Contributor

thanks, merging to master/3.3!

@cloud-fan cloud-fan closed this in 58b91b1 Jun 22, 2022
cloud-fan pushed a commit that referenced this pull request Jun 22, 2022
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes #36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@dongjoon-hyun
Copy link
Member

I made a backport to branch-3.2 for the comment, #37087 (comment) .

@dongjoon-hyun
Copy link
Member

This will be a part of Apache Spark 3.2.2.

dongjoon-hyun pushed a commit that referenced this pull request Jul 7, 2022
### What changes were proposed in this pull request?

This is a backport of #36953

This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

### Why are the changes needed?

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added UT.

Closes #37108 from dongjoon-hyun/SPARK-39551.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
### What changes were proposed in this pull request?

This is a backport of apache#36953

This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

### Why are the changes needed?

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added UT.

Closes apache#37108 from dongjoon-hyun/SPARK-39551.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit be891ad)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
lwz9103 pushed a commit to lwz9103/spark that referenced this pull request May 10, 2024
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes apache#36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3cf3048)
lwz9103 pushed a commit to Kyligence/spark that referenced this pull request May 13, 2024
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes apache#36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3cf3048)
lwz9103 pushed a commit to Kyligence/spark that referenced this pull request Jun 5, 2024
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes apache#36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3cf3048)
lwz9103 pushed a commit to lwz9103/spark that referenced this pull request Jul 3, 2024
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes apache#36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3cf3048)
lwz9103 pushed a commit to Kyligence/spark that referenced this pull request Jul 4, 2024
This PR adds a check for invalid plans in AQE replanning process. The check will throw exceptions when it detects an invalid plan, causing AQE to void the current replanning result and keep using the latest valid plan.

AQE logical optimization rules can lead to invalid physical plans and cause runtime exceptions as certain physical plan nodes are not compatible with others. E.g., `BroadcastExchangeExec` can only work as a direct child of broadcast join nodes, but it could appear under other incompatible physical plan nodes because of empty relation propagation.

No.

Added UT.

Closes apache#36953 from maryannxue/validate-aqe.

Authored-by: Maryann Xue <maryann.xue@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 58b91b1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 3cf3048)
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants