Skip to content

Fix: generate_series function support string type #12002

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 11 commits into from
Aug 19, 2024

Conversation

getChan
Copy link
Contributor

@getChan getChan commented Aug 15, 2024

Which issue does this PR close?

Closes #11922.

Rationale for this change

generate_series function does not accept string arguments.
However, the as-is test provide string arguments.

I think it would be better if the type signature of generate_series was changed.
https://github.com/apache/datafusion/blob/main/datafusion/functions-nested/src/range.rs#L129

What changes are included in this PR?

Only sqllogictest

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Aug 15, 2024
@jayzhan211
Copy link
Contributor

@getChan I think what we need is to support string type but not modifying the test to other already supported type

D create table date_table(start date, stop date, step interval);
D insert into date_table values (DATE '1992-01-01', DATE '1993-01-02', INTERVAL '1' MONTH);
D select generate_series(start, '1993-03-01', INTERVAL '1 year') from date_table;
┌────────────────────────────────────────────────────────────────────┐
│ generate_series("start", '1993-03-01', CAST('1 year' AS INTERVAL)) │
│                            timestamp[]                             │
├────────────────────────────────────────────────────────────────────┤
│ [1992-01-01 00:00:00, 1993-01-01 00:00:00]                         │
└────────────────────────────────────────────────────────────────────┘

We should able to return the same result as Duckdb's

@getChan
Copy link
Contributor Author

getChan commented Aug 15, 2024

In that case, I guess we should extend the above issue to "Add string type support to generate_series function".

@jayzhan211
Copy link
Contributor

jayzhan211 commented Aug 15, 2024

I expect you get the same answer as if you explicitly cast the second argument to date:

The goal is make the query return the same result as the one you explicitly cast the argument.

In that case, I guess we should extend the above issue to "Add string type support to generate_series function".

Yes, I think what we need is to support string type, or cast the string to date. Not sure what is the exact solution. But I expect we should get the correct result for that query

@alamb
Copy link
Contributor

alamb commented Aug 15, 2024

Yes, I think what we need is to support string type, or cast the string to date. Not sure what is the exact solution. But I expect we should get the correct result for that query

The type coercsion system is supposed to handle adding this kind of cast.

I wonder if we can could remove the Any(3) signature a new signature to generate_series :

TypeSignature::Exact(vec![Int64]),
TypeSignature::Exact(vec![Int64, Int64]),
TypeSignature::Exact(vec![Int64, Int64, Int64]),
TypeSignature::Exact(vec![Date32, Date32, Interval(MonthDayNano)]),
TypeSignature::Any(3),

I think that would then ensure DataFusion actually casts the arguments to the supported types

@alamb alamb marked this pull request as draft August 15, 2024 22:45
@alamb
Copy link
Contributor

alamb commented Aug 15, 2024

Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look

@getChan getChan force-pushed the generate-series-sqllogictest-bug branch from 4957a1d to bccdc2e Compare August 17, 2024 17:49
@github-actions github-actions bot added the logical-expr Logical plan and expressions label Aug 17, 2024
@getChan
Copy link
Contributor Author

getChan commented Aug 17, 2024

I wonder if we can could remove the Any(3) signature a new signature to generate_series :

removed Any(3) signature. and fix some type coerce code for range, generate_series

@getChan getChan changed the title Fix: sqllogictest - generate_series function invalid argument type Fix: generate_series function support string type Aug 17, 2024
@getChan getChan marked this pull request as ready for review August 17, 2024 18:13
@getChan getChan marked this pull request as draft August 18, 2024 11:09
@github-actions github-actions bot removed the logical-expr Logical plan and expressions label Aug 18, 2024
@getChan getChan marked this pull request as ready for review August 18, 2024 12:30
Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @getChan

@jayzhan211 jayzhan211 merged commit 574dfeb into apache:main Aug 19, 2024
24 checks passed
@@ -5804,7 +5804,7 @@ select generate_series(5),
----
[0, 1, 2, 3, 4, 5] [2, 3, 4, 5] [2, 5, 8] [1, 2, 3, 4, 5] [5, 4, 3, 2, 1] [10, 7, 4] [1992-09-01, 1992-10-01, 1992-11-01, 1992-12-01, 1993-01-01, 1993-02-01, 1993-03-01] [1993-02-01, 1993-01-31, 1993-01-30, 1993-01-29, 1993-01-28, 1993-01-27, 1993-01-26, 1993-01-25, 1993-01-24, 1993-01-23, 1993-01-22, 1993-01-21, 1993-01-20, 1993-01-19, 1993-01-18, 1993-01-17, 1993-01-16, 1993-01-15, 1993-01-14, 1993-01-13, 1993-01-12, 1993-01-11, 1993-01-10, 1993-01-09, 1993-01-08, 1993-01-07, 1993-01-06, 1993-01-05, 1993-01-04, 1993-01-03, 1993-01-02, 1993-01-01] [1989-04-01, 1990-04-01, 1991-04-01, 1992-04-01]

query error DataFusion error: Execution error: unsupported type for range. Expected Int64 or Date32, got: Timestamp\(Nanosecond, None\)
query error DataFusion error: Execution error: Cannot generate date range less than 1 day\.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will file an ticket for this issue

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you 🙏

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Internal error generate_series: Internal error: could not cast value to .....arrow_array::types::Date32Type`
3 participants