Optimize date_part Minute by avoiding unnecessary computation #14043

Open
jayzhan211 opened this issue Jan 8, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@jayzhan211
Contributor

Is your feature request related to a problem or challenge?

Opening this issue to track the status of this optimization.

Related
#13449
apache/arrow-rs#6746

Describe the solution you'd like

I guess there are some changes in arrow-rs left to do.

Describe alternatives you've considered

No response

Additional context

impl ExtractDatePartExt for PrimitiveArray<TimestampSecondType> {
    fn date_part(&self, part: DatePart) -> Result<Int32Array, ArrowError> {
        // TimestampSecond only encodes number of seconds, so these will always be 0
        let array =
            if let DatePart::Millisecond | DatePart::Microsecond | DatePart::Nanosecond = part {
                Int32Array::new(vec![0; self.len()].into(), self.nulls().cloned())
            } else if let Some(tz) = get_tz(self.data_type())? {
                let map_func = get_date_time_part_extract_fn(part);
                self.unary_opt(|d| {
                    timestamp_s_to_datetime(d)
                        .map(|c| Utc.from_utc_datetime(&c).with_timezone(&tz))
                        .map(map_func)
                })
            } else {
                let map_func = get_date_time_part_extract_fn(part);
                self.unary_opt(|d| timestamp_s_to_datetime(d).map(map_func))
            };
        Ok(array)
    }
}

If I remember correctly, we need to switch from timestamp_s_to_datetime to timestamp_s_to_time and extract the minute from the resulting time value, so that the date portion is never computed.
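To illustrate why the date portion is unnecessary for this part, here is a minimal standalone sketch. It assumes the no-time-zone branch only, and both function names are hypothetical: they are not part of arrow-rs and are not the proposed change itself. The point is that the minute of the hour depends only on the seconds within the current hour, so it can be derived without building a full datetime.

```rust
/// Hypothetical helper (not part of arrow-rs): minute-of-hour for a
/// timestamp in seconds since the Unix epoch, with no time zone applied.
fn minute_from_timestamp_s(secs: i64) -> i32 {
    // rem_euclid keeps the remainder in 0..3600 for negative (pre-1970)
    // timestamps as well, so no calendar date has to be constructed.
    (secs.rem_euclid(3_600) / 60) as i32
}

/// Hypothetical stand-in for the `unary_opt` call in the snippet above:
/// maps each non-null value and passes nulls through unchanged.
fn minutes(values: &[Option<i64>]) -> Vec<Option<i32>> {
    values
        .iter()
        .map(|v| v.map(minute_from_timestamp_s))
        .collect()
}
```

For example, minute_from_timestamp_s(3_661) gives 1 (01:01:01), and minute_from_timestamp_s(-1) gives 59 (23:59:59 on the previous day). The time-zone branch is different: offsets such as +05:30 shift the minute, so the offset still has to be resolved there before extracting the part.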

@jayzhan211 jayzhan211 added the enhancement New feature or request label Jan 8, 2025
@jayzhan211
Contributor Author

I think this is a good first issue for getting familiar with optimization and benchmarking code.

@samsond

samsond commented Jan 8, 2025

Take
