Fix ArrayAgg schema mismatch issue #8055

Merged
merged 10 commits into from
Nov 9, 2023

jayzhan211
Contributor

@jayzhan211 jayzhan211 commented Nov 5, 2023

Which issue does this PR close?

Closes #8032.

Rationale for this change

The main issue in #8032 is, as the error says, that we end up with two different schemas: result.schema, which comes from the logical expression, and results.collect()[0].schema, which comes from the physical expression.

is_nullable should be the same for the logical expression and the physical expression.

In the logical expression, the nullability of the List is set with expr.nullable(schema):
https://github.com/apache/arrow-datafusion/blob/3469c4e09a3d32381949dd0c0f626f406c00c6ad/datafusion/expr/src/expr_schema.rs#L289-L305
and the nullability of the List element comes from the return type of AggregateFunction::ArrayAgg:
https://github.com/apache/arrow-datafusion/blob/3469c4e09a3d32381949dd0c0f626f406c00c6ad/datafusion/expr/src/aggregate_function.rs#L286-L290
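
To illustrate why a single nullability flag is enough to make two schemas compare as different, here is a minimal sketch using simplified stand-in types. `DataType`, `Field`, `list_field`, `schemas_match`, and the column name used below are all hypothetical and written for this example; they are not the arrow-rs or DataFusion APIs. The sketch assumes that schema equality compares every field, including the flag on the list itself and the flag on the list element.

```rust
// Simplified stand-in types; these are NOT the arrow-rs or DataFusion APIs.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    // A list carries its element field, including the element's nullability.
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type,
            nullable,
        }
    }
}

/// Build a list-typed output field (hypothetical helper).
/// `list_nullable` is the flag on the list column itself,
/// `item_nullable` is the flag on the element field inside the list.
fn list_field(name: &str, list_nullable: bool, item_nullable: bool) -> Field {
    let item = Field::new("item", DataType::Int64, item_nullable);
    Field::new(name, DataType::List(Box::new(item)), list_nullable)
}

/// Two schemas match only if every field matches, nullability flags included.
fn schemas_match(logical: &[Field], physical: &[Field]) -> bool {
    logical == physical
}
```

With these definitions, `schemas_match(&[list_field("agg", false, true)], &[list_field("agg", true, true)])` is false even though the names and data types agree, and the same holds when only the element flag differs.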

What changes are included in this PR?

Fix schema is_nullable
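
The review below mentions plumbing an is_expr_nullable flag through. As a rough sketch of what such plumbing could look like, assuming a simplified model rather than the actual DataFusion code: the planner reads the nullability of the logical expression and hands it to the physical aggregate, which then uses it when reporting its output field. `LogicalColumn`, `PhysicalArrayAgg`, `plan_array_agg`, and this `Field` are hypothetical names invented for the example.

```rust
// Simplified stand-ins; not the actual DataFusion types or signatures.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Hypothetical logical expression: a column whose nullability is known.
struct LogicalColumn {
    name: String,
    nullable: bool,
}

impl LogicalColumn {
    /// Output field as the logical plan would describe it.
    fn to_field(&self) -> Field {
        Field {
            name: self.name.clone(),
            nullable: self.nullable,
        }
    }
}

/// Hypothetical physical aggregate that receives the flag at construction
/// instead of assuming a fixed nullability.
struct PhysicalArrayAgg {
    name: String,
    is_expr_nullable: bool,
}

impl PhysicalArrayAgg {
    fn new(name: &str, is_expr_nullable: bool) -> Self {
        PhysicalArrayAgg {
            name: name.to_string(),
            is_expr_nullable,
        }
    }

    /// Output field as the physical plan would describe it.
    fn field(&self) -> Field {
        Field {
            name: self.name.clone(),
            nullable: self.is_expr_nullable,
        }
    }
}

/// Hypothetical planner step: carry the logical nullability into the
/// physical expression so both sides report the same field.
fn plan_array_agg(expr: &LogicalColumn) -> PhysicalArrayAgg {
    PhysicalArrayAgg::new(&expr.name, expr.nullable)
}
```

Because the flag is passed in rather than hard-coded, `plan_array_agg(&col).field()` equals `col.to_field()` for both nullable and non-nullable inputs in this model. The cost is an extra constructor argument at every call site.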

Are these changes tested?

I did not test end to end as in #8032; I only checked that the two schemas are the same.

Are there any user-facing changes?

Signed-off-by: jayzhan211 <jayzhan211@gmail.com>
@github-actions bot added the physical-expr (Changes to the physical-expr crates) and core (Core DataFusion crate) labels Nov 5, 2023
@jayzhan211 jayzhan211 marked this pull request as draft November 5, 2023 03:56
@jayzhan211 jayzhan211 marked this pull request as ready for review November 5, 2023 07:56
Contributor

@alamb alamb left a comment

Thank you @jayzhan211 -- it would be nice if there was some way to avoid having to plumb the is_expr_nullable flag through, but I don't think there is

Contributor

@alamb alamb left a comment

Thank you @jayzhan211 for the contribution and working on this area of the code

Contributor

@alamb alamb left a comment

Thanks again @jayzhan211

Labels
core Core DataFusion crate physical-expr Changes to the physical-expr crates
Development

Successfully merging this pull request may close these issues.

array_agg with pyarrow errors with ArrowInvalid: Schema at index 0 was different
2 participants