Is your feature request related to a problem? Please describe.
This depends on https://github.com/NVIDIA/spark-rapids/issues/6831 and #6832.
First we need to verify that we can use the existing `hour` cudf code for timestamps in different time zones and get the correct answer. It should work because the timestamps are all relative to an epoch in the given time zone, and the offsets should have been taken into account properly, so there should be no values that actually fall in the gaps that occur when daylight saving time starts. `GpuHour` should be updated so that if the `ZoneId` is set and is not UTC, we first convert the timestamps to the desired time zone and then extract the hour from them.
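As a rough illustration of that verification step, a minimal PySpark sketch could compare `hour()` results across session time zones. The zone ids and timestamps below are only examples (one deliberately falls in the US spring-forward gap), not the test matrix:

```python
# Minimal sketch, assuming a local Spark session: hour() is evaluated in
# the session time zone, so the same epoch value can yield different hours
# per zone. The zones and timestamps here are illustrative only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, hour

spark = SparkSession.builder.appName("hour-tz-check").getOrCreate()

for zone in ["UTC", "America/New_York", "Asia/Shanghai"]:
    spark.conf.set("spark.sql.session.timeZone", zone)
    df = spark.createDataFrame(
        # 2022-03-13 02:30:00 falls inside the US daylight saving "gap",
        # so it is not a valid wall-clock time in America/New_York.
        [("2022-03-13 02:30:00",), ("2022-11-06 01:30:00",)],
        ["ts_str"],
    ).select(col("ts_str").cast("timestamp").alias("ts"))
    df.select(col("ts"), hour(col("ts")).alias("h")).show(truncate=False)
```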
The hardest part will be testing this. We should have a list of zone ids that are supported and a list of ones that are not, then parameterize the tests over both and set `spark.sql.session.timeZone` accordingly as each test runs (a rough sketch follows below). We may want or need to include some outliers for really old and really new dates that we know Python cannot handle, but Spark can.

This is separate from many of the other operators because it is one of the first ones, and we want to be sure that it works properly before we plan out and execute a lot of other operators.
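For the parameterization, a hedged sketch in the style of the spark-rapids integration tests might look like the following. The helper names (`assert_gpu_and_cpu_are_equal_collect`, `unary_op_df`, `TimestampGen`) mirror the `integration_tests` framework, but the zone lists are placeholders and the exact signatures are assumptions to check against the real code:

```python
# Sketch only: parameterize the hour() test over a placeholder list of
# supported zone ids, setting spark.sql.session.timeZone per run. The
# helpers are assumed to match those in spark-rapids' integration_tests.
import pytest
from datetime import datetime, timezone

from asserts import assert_gpu_and_cpu_are_equal_collect
from data_gen import TimestampGen, unary_op_df

# Placeholder lists: the real tests would enumerate the zone ids the
# plugin actually supports, plus a few it does not.
supported_time_zones = ["UTC", "America/New_York", "Asia/Shanghai"]

@pytest.mark.parametrize("time_zone", supported_time_zones)
def test_hour_with_session_time_zone(time_zone):
    # Cover very old and very new dates; Spark supports a wider range than
    # Python's datetime, so the extremes here are bounded by what the
    # generator itself can represent.
    gen = TimestampGen(start=datetime(1, 1, 3, tzinfo=timezone.utc),
                       end=datetime(9999, 12, 30, tzinfo=timezone.utc))
    assert_gpu_and_cpu_are_equal_collect(
        lambda spark: unary_op_df(spark, gen).selectExpr("hour(a)"),
        conf={"spark.sql.session.timeZone": time_zone})
```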