
Support to unparse Time scalar value to String #11121

Merged: 3 commits into apache:main on Jun 26, 2024

Conversation

goldmedal (Contributor)

Which issue does this PR close?

Part of #9494

Rationale for this change

What changes are included in this PR?

Are these changes tested?

yes

Are there any user-facing changes?

github-actions bot added the sql (SQL Planner) label on Jun 25, 2024
alamb (Contributor) left a comment:

Thank you @goldmedal -- looks good to me ❤️

@@ -1241,6 +1268,22 @@ mod tests {
)),
r#"CAST('1970-01-01 08:00:00.000010001 +08:00' AS TIMESTAMP)"#,
),
(
alamb (Contributor):

I double checked and the actual types are slightly different (so this code won't round-trip), but I can't think of anything better here.

> select CAST('00:00:00.010001' AS TIME);
+-------------------------+
| Utf8("00:00:00.010001") |
+-------------------------+
| 00:00:00.010001         |
+-------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('00:00:00.010001' AS TIME));
+---------------------------------------+
| arrow_typeof(Utf8("00:00:00.010001")) |
+---------------------------------------+
| Time64(Nanosecond)                    |
+---------------------------------------+
1 row(s) fetched.
Elapsed 0.005 seconds.

goldmedal (Contributor, Author):

I think the root cause is that we always use nanosecond precision to construct the TIME or TIMESTAMP value. This issue also occurs for TIMESTAMP:

> select arrow_typeof(CAST('1970-01-01 00:00:00.000010001' AS TIMESTAMP));
+-----------------------------------------------------+
| arrow_typeof(Utf8("1970-01-01 00:00:00.000010001")) |
+-----------------------------------------------------+
| Timestamp(Nanosecond, None)                         |
+-----------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('1970-01-01 00:00:10.001' AS TIMESTAMP));
+-----------------------------------------------+
| arrow_typeof(Utf8("1970-01-01 00:00:10.001")) |
+-----------------------------------------------+
| Timestamp(Nanosecond, None)                   |
+-----------------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

I tested some ways to construct a TIME value:

> select arrow_typeof(TIME '00:00:01.001');
+------------------------------------+
| arrow_typeof(Utf8("00:00:01.001")) |
+------------------------------------+
| Time64(Nanosecond)                 |
+------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> select arrow_typeof(CAST('00:00:01' AS TIME));
+--------------------------------+
| arrow_typeof(Utf8("00:00:01")) |
+--------------------------------+
| Time64(Nanosecond)             |
+--------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

I think that if we want to round-trip the result, we should generate the TIME or TIMESTAMP AST node with a specific precision. However, I found that DataFusion doesn't support constructing a time or timestamp value with a specified precision:

> select arrow_typeof(TIME(3) '00:00:01.001');
This feature is not implemented: Unsupported SQL type Time(Some(3), None)
> select arrow_typeof(CAST('00:00:01.001' AS TIME(3)));
This feature is not implemented: Unsupported SQL type Time(Some(3), None)
> select arrow_typeof(CAST('00:00:01.001' AS TIME(6)));
This feature is not implemented: Unsupported SQL type Time(Some(6), None)
> select arrow_typeof(TIMESTAMP(3) '1970-01-01 00:00:10.001');
This feature is not implemented: Unsupported SQL type Timestamp(Some(3), None)

I think this could be tracked as a separate issue, but it is required for full expression round-tripping.
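To make the precision loss concrete, here is a minimal standalone sketch. This is not DataFusion's actual unparser code; the function name and structure are hypothetical, showing only how a Time64(Nanosecond) value maps onto the CAST('...' AS TIME) string form discussed in this PR. Because the fractional part is always rendered at nanosecond width, the original precision of the source literal cannot be recovered from the output:

```rust
// Hypothetical sketch (names are illustrative, not DataFusion's code):
// unparse a Time64(Nanosecond) value, i.e. nanoseconds since midnight,
// into a SQL CAST expression string.
fn unparse_time64_nanos(nanos: i64) -> String {
    let secs = nanos / 1_000_000_000;
    let frac = nanos % 1_000_000_000;
    let (h, m, s) = (secs / 3600, (secs % 3600) / 60, secs % 60);
    let time = if frac == 0 {
        format!("{h:02}:{m:02}:{s:02}")
    } else {
        // always 9 fractional digits: the nanosecond precision is baked in
        format!("{h:02}:{m:02}:{s:02}.{frac:09}")
    };
    format!("CAST('{time}' AS TIME)")
}

fn main() {
    // 00:00:00.010001 expressed as nanoseconds since midnight
    println!("{}", unparse_time64_nanos(10_001_000));
}
```

Note how a value that originated as a millisecond or microsecond time still comes back with nanosecond-width fractional digits, which is why the cast does not round-trip to the original type.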

alamb (Contributor):

> I think the root cause is that we always use nanosecond precision to construct the TIME or TIMESTAMP value. This issue also occurs for TIMESTAMP:

I agree with this analysis.

It is not clear to me that fully preserving the type of timestamps / times etc. is necessary for Expr --> text conversion. Thus I think we shouldn't make the code more complicated until we have an actual use case for preserving the time type.

Ok(ast::Expr::Cast {
kind: ast::CastKind::Cast,
expr: Box::new(ast::Expr::Value(SingleQuotedString(time))),
data_type: ast::DataType::Time(None, TimezoneInfo::None),
goldmedal (Contributor, Author):

We can assign the precision here like

Suggested change
data_type: ast::DataType::Time(None, TimezoneInfo::None),
data_type: ast::DataType::Time(Some(6), TimezoneInfo::None),

It can then be unparsed to a String correctly:

CAST('02:46:41' AS TIME(6))
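The effect of the suggested change can be sketched with a small standalone helper. This is hypothetical code, not DataFusion's unparser: it formats a Time32(Second) value and qualifies the cast target with an explicit precision, mirroring the suggested ast::DataType::Time(Some(6), ..) change, so the example above falls out of the second value 10001:

```rust
// Hypothetical sketch (not DataFusion's actual unparser): format a
// Time32(Second) value (seconds since midnight) and qualify the cast
// target with an explicit precision, e.g. TIME(6).
fn unparse_time32_secs(secs: i32, precision: u8) -> String {
    let (h, m, s) = (secs / 3600, (secs % 3600) / 60, secs % 60);
    format!("CAST('{h:02}:{m:02}:{s:02}' AS TIME({precision}))")
}

fn main() {
    // 10001 seconds after midnight is 02:46:41
    println!("{}", unparse_time32_secs(10001, 6));
}
```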

@alamb alamb merged commit 82f7bf4 into apache:main Jun 26, 2024
23 checks passed
alamb (Contributor) commented Jun 26, 2024:

Thanks @goldmedal

goldmedal (Contributor, Author):

Thanks @alamb

@goldmedal goldmedal deleted the feature/unparse-time branch June 26, 2024 14:08
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* support to unparse Time scalar value to string

* fix typo

* cargo fmt
Labels: sql SQL Planner
2 participants