-
Notifications
You must be signed in to change notification settings - Fork 1k
chore(parquet/record/field): dont truncate timestamps on display #8266
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Open
Erigara
wants to merge
1
commit into
apache:main
Choose a base branch
from
Erigara:dont_truncate_ts
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+22
−29
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -928,29 +928,22 @@ fn convert_date_to_string(value: i32) -> String { | |
format!("{}", dt.format("%Y-%m-%d")) | ||
} | ||
|
||
/// Helper method to convert Parquet timestamp into a string. | ||
/// Input `value` is a number of seconds since the epoch in UTC. | ||
/// Datetime is displayed in local timezone. | ||
#[inline] | ||
fn convert_timestamp_secs_to_string(value: i64) -> String { | ||
let dt = Utc.timestamp_opt(value, 0).unwrap(); | ||
format!("{}", dt.format("%Y-%m-%d %H:%M:%S %:z")) | ||
} | ||
|
||
/// Helper method to convert Parquet timestamp into a string. | ||
/// Input `value` is a number of milliseconds since the epoch in UTC. | ||
/// Datetime is displayed in local timezone. | ||
#[inline] | ||
fn convert_timestamp_millis_to_string(value: i64) -> String { | ||
convert_timestamp_secs_to_string(value / 1000) | ||
let dt = Utc.timestamp_millis_opt(value).unwrap(); | ||
format!("{}", dt.format("%Y-%m-%d %H:%M:%S%.3f %:z")) | ||
} | ||
|
||
/// Helper method to convert Parquet timestamp into a string. | ||
/// Input `value` is a number of microseconds since the epoch in UTC. | ||
/// Datetime is displayed in local timezone. | ||
#[inline] | ||
fn convert_timestamp_micros_to_string(value: i64) -> String { | ||
convert_timestamp_secs_to_string(value / 1000000) | ||
let dt = Utc.timestamp_micros(value).unwrap(); | ||
format!("{}", dt.format("%Y-%m-%d %H:%M:%S%.6f %:z")) | ||
} | ||
|
||
/// Helper method to convert Parquet time (milliseconds since midnight) into a string. | ||
|
@@ -1278,44 +1271,44 @@ mod tests { | |
|
||
#[test] | ||
fn test_convert_timestamp_millis_to_string() { | ||
fn check_datetime_conversion(y: u32, m: u32, d: u32, h: u32, mi: u32, s: u32) { | ||
fn check_datetime_conversion(y: u32, m: u32, d: u32, h: u32, mi: u32, s: u32, milli: u32) { | ||
let datetime = chrono::NaiveDate::from_ymd_opt(y as i32, m, d) | ||
.unwrap() | ||
.and_hms_opt(h, mi, s) | ||
.and_hms_milli_opt(h, mi, s, milli) | ||
.unwrap(); | ||
let dt = Utc.from_utc_datetime(&datetime); | ||
let res = convert_timestamp_millis_to_string(dt.timestamp_millis()); | ||
let exp = format!("{}", dt.format("%Y-%m-%d %H:%M:%S %:z")); | ||
let exp = format!("{}", dt.format("%Y-%m-%d %H:%M:%S%.3f %:z")); | ||
assert_eq!(res, exp); | ||
} | ||
|
||
check_datetime_conversion(1969, 9, 10, 1, 2, 3); | ||
check_datetime_conversion(2010, 1, 2, 13, 12, 54); | ||
check_datetime_conversion(2011, 1, 3, 8, 23, 1); | ||
check_datetime_conversion(2012, 4, 5, 11, 6, 32); | ||
check_datetime_conversion(2013, 5, 12, 16, 38, 0); | ||
check_datetime_conversion(2014, 11, 28, 21, 15, 12); | ||
check_datetime_conversion(1969, 9, 10, 1, 2, 3, 4); | ||
check_datetime_conversion(2010, 1, 2, 13, 12, 54, 42); | ||
check_datetime_conversion(2011, 1, 3, 8, 23, 1, 27); | ||
check_datetime_conversion(2012, 4, 5, 11, 6, 32, 0); | ||
check_datetime_conversion(2013, 5, 12, 16, 38, 0, 15); | ||
check_datetime_conversion(2014, 11, 28, 21, 15, 12, 59); | ||
} | ||
|
||
#[test] | ||
fn test_convert_timestamp_micros_to_string() { | ||
fn check_datetime_conversion(y: u32, m: u32, d: u32, h: u32, mi: u32, s: u32) { | ||
fn check_datetime_conversion(y: u32, m: u32, d: u32, h: u32, mi: u32, s: u32, micro: u32) { | ||
let datetime = chrono::NaiveDate::from_ymd_opt(y as i32, m, d) | ||
.unwrap() | ||
.and_hms_opt(h, mi, s) | ||
.and_hms_micro_opt(h, mi, s, micro) | ||
.unwrap(); | ||
let dt = Utc.from_utc_datetime(&datetime); | ||
let res = convert_timestamp_micros_to_string(dt.timestamp_micros()); | ||
let exp = format!("{}", dt.format("%Y-%m-%d %H:%M:%S %:z")); | ||
let exp = format!("{}", dt.format("%Y-%m-%d %H:%M:%S%.6f %:z")); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it would be nice if we could change these tests to have hard coded values rather than just programatically checking consistency -- with just these changes, I don't have any sense of how different the output is after this change. |
||
assert_eq!(res, exp); | ||
} | ||
|
||
check_datetime_conversion(1969, 9, 10, 1, 2, 3); | ||
check_datetime_conversion(2010, 1, 2, 13, 12, 54); | ||
check_datetime_conversion(2011, 1, 3, 8, 23, 1); | ||
check_datetime_conversion(2012, 4, 5, 11, 6, 32); | ||
check_datetime_conversion(2013, 5, 12, 16, 38, 0); | ||
check_datetime_conversion(2014, 11, 28, 21, 15, 12); | ||
check_datetime_conversion(1969, 9, 10, 1, 2, 3, 4); | ||
check_datetime_conversion(2010, 1, 2, 13, 12, 54, 42); | ||
check_datetime_conversion(2011, 1, 3, 8, 23, 1, 27); | ||
check_datetime_conversion(2012, 4, 5, 11, 6, 32, 0); | ||
check_datetime_conversion(2013, 5, 12, 16, 38, 0, 15); | ||
check_datetime_conversion(2014, 11, 28, 21, 15, 12, 59); | ||
} | ||
|
||
#[test] | ||
|
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The input is
i64
, so it will never attach a time zone, and the output will always be' +00:00'. Do we need to add "%:z" in the output?From
Utc.timestamp_millis_opt
, it seems that it will always return a UTC timezone value, so maybe we need to change the comment for this function(the comment said the result will be in local timezone).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question not related to the current PR: this function
convert_timestamp_millis_to_string
will be used inField
, andField
contains ani64
forTimestampMillis/TimestampMicros
(the milliseconds/microseconds from unix epoch), does output the raw timestamp(thei64
value) here is enough? and translate the timestamp todt.format
value when displaying something likeLogicalType::Timestamp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think saying that "Datetime is displayed in local timezone." is misleading in this case.
Maybe we indeed can remove "%:z" and fix comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue as I see it is that the record api only uses
ConvertedType
, so the timestamps are normalized to UTC. I would suggest keeping the%:z
and simply update the documentation.I also think this may be a breaking change since it will change the JSON output value. What do you think @alamb?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have much insight into this particular API and if anyone would treat this as an API change
Since the existing code uses almost the same format string it seems like not a big change to me, but I don't really know