Skip to content

Implementation of array_intersect #8081

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 2 commits into from
Nov 11, 2023
Merged

Conversation

Veeupup
Copy link
Contributor

@Veeupup Veeupup commented Nov 7, 2023

Which issue does this PR close?

Closes #6978

Rationale for this change

Using RowConverter to implement array_intersect, no duplicate code for different data types,

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates core Core DataFusion crate labels Nov 7, 2023
@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 7, 2023

hi @edmondop @jayzhan211 , I have a draft implementation for array_intersect, I use a hashmap to store the first array values and find values that appear in the hashmap. And it works just fine and looks like a general approach?

But I encounter a problem for Float32 and Float64, we can not just use HashMap<f32/f64> because std::hash::Hash is not implemented for float types, so I want to fix it by introducing ordered-float to make it hashable, is it too heavy for this case ?
Or do you have any better ideas to make it? Very happy to see your advice !

@tustvold
Copy link
Contributor

tustvold commented Nov 7, 2023

Perhaps you could use https://docs.rs/arrow-row/latest/arrow_row/? This would have the added benefit of supporting more complex types, e.g. lists of lists, and not adding additional codegen (#7988). The notes I wrote on #6981 (comment) might be helpful for this

@Veeupup Veeupup changed the title Initial Implementation of array_intersect Implementation of array_intersect Nov 8, 2023
@Veeupup Veeupup marked this pull request as ready for review November 8, 2023 16:38
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Nov 9, 2023
@Veeupup Veeupup force-pushed the array_interact branch 2 times, most recently from 6b7198f to f12525c Compare November 9, 2023 12:42
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roughly LGTM

@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 9, 2023

@alamb PTAL : )

@Veeupup Veeupup requested a review from Dandandan November 9, 2023 14:28
@xudong963
Copy link
Member

@Veeupup Triggered the ci, please fix it.

Signed-off-by: veeupup <code@tanweime.com>
Signed-off-by: veeupup <code@tanweime.com>

x
@Veeupup
Copy link
Contributor Author

Veeupup commented Nov 10, 2023

@xudong963 hi try fixing ci locally, maybe you can help me trigger it again or review it? : )

@Veeupup Veeupup requested a review from tustvold November 10, 2023 16:01
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice implementation to me -- thank you @Veeupup @Dandandan @tustvold @xudong963 and @Dandandan -- what a team effort ❤️

Copy link
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, welcome to arrow-datafusion

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement array_intersect function
6 participants