Skip to content

to_timestamp for negative values gives wrong result #7802

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Closed
comphead opened this issue Oct 12, 2023 · 7 comments · Fixed by #7844
Closed

to_timestamp for negative values gives wrong result #7802

comphead opened this issue Oct 12, 2023 · 7 comments · Fixed by #7844
Assignees
Labels
bug Something isn't working

Comments

@comphead
Copy link
Contributor

Describe the bug

In Datafusion to_timestamp builtin scalar function gives wrong result for negative values


❯ select to_timestamp(-62125747200);
+-----------------------------------+
| to_timestamp(Int64(-62125747200)) |
+-----------------------------------+
| 1969-12-31T23:58:57.874252800     |
+-----------------------------------+
1 row in set. Query took 0.035 seconds.

PG gives correct result

0001-04-25T00:00:00.000Z

To Reproduce

Run

❯ select to_timestamp(-62125747200);

Expected behavior

Query results should match

Additional context

No response

@comphead comphead added the bug Something isn't working label Oct 12, 2023
@Weijun-H
Copy link
Member

/// to_timestamp SQL function
pub fn to_timestamp(args: &[ColumnarValue]) -> Result<ColumnarValue> {
    handle::<TimestampNanosecondType, _, TimestampNanosecondType>(
        args,
        string_to_timestamp_nanos_shim,
        "to_timestamp",
    )
}

The problem is that the function 'to_timestamp' treats all input as 'nano', which is causing an issue.

@comphead
Copy link
Contributor Author

it might be a bigger problem.....

 select to_timestamp_millis(-621257472);
+----------------------------------------+
| to_timestamp_millis(Int64(-621257472)) |
+----------------------------------------+
| 1969-12-24T19:25:42.528                |
+----------------------------------------+
1 row in set. Query took 0.002 seconds.

❯ select to_timestamp_micros(-621257472);
+----------------------------------------+
| to_timestamp_micros(Int64(-621257472)) |
+----------------------------------------+
| 1969-12-31T23:49:38.742528             |
+----------------------------------------+
1 row in set. Query took 0.005 seconds.

❯ select to_timestamp_micros(-621257472000);
+-------------------------------------------+
| to_timestamp_micros(Int64(-621257472000)) |
+-------------------------------------------+
| 1969-12-24T19:25:42.528                   |
+-------------------------------------------+
1 row in set. Query took 0.002 seconds.

@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I agree the difference with postgres seems related to the fact that to_timestamp takes an argument in terms of second for postgres, and nanoseconds for datafusion:

https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TABLE

to_timestamp ( double precision ) → timestamp with time zone

Convert Unix epoch (seconds since 1970-01-01 00:00:00+00) to timestamp with time zone

to_timestamp(1284352323) → 2010-09-13 04:32:03+00

@comphead
Copy link
Contributor Author

comphead commented Oct 13, 2023

@alamb @Weijun-H
I agree

❯ select to_timestamp_seconds(-62125747200);
+-------------------------------------------+
| to_timestamp_seconds(Int64(-62125747200)) |
+-------------------------------------------+
| 0001-04-25T00:00:00                       |
+-------------------------------------------+
1 row in set. Query took 0.002 seconds.

I'll prepare a fix to make to_timestamp to be in sync with PG and make and input as seconds, do you think we need a to_timestamp_nanos instead?

@comphead comphead self-assigned this Oct 13, 2023
@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I'll prepare a fix to make to_timestamp to be in sync with PG and make and input as seconds, do you think we need a to_timestamp_nanos instead?

Yes I think this is important.

Also I was recently looking for a reference to the docs where we say "datafusion aims to have the postgres dialect" but could not find it 🤔 Do you know where it is?

@comphead
Copy link
Contributor Author

postgres

https://github.com/apache/arrow-datafusion/blob/main/datafusion/sqllogictest/README.md#running-tests-postgres-compatibility
This is the only thing I was able to find.

I plan to renew a bit a Datafusion main page to be more user-friendly and easier to start so we can include the PG compatibility statement there as well.

@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I plan to renew a bit a Datafusion main page to be more user-friendly and easier to start so we can include the PG compatibility statement there as well.

That would be awesome -- thank you @comphead

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
bug Something isn't working
Projects
None yet
3 participants