[FEAT] Cross-temporal forecasting #309

Open
wants to merge 34 commits into
base: main

@elephaint elephaint commented Nov 27, 2024

Temporal reconciliation strategy

  • Aggregate temporally using aggregate_temporal, a minimal wrapper around aggregate that operates along the temporal dimension according to a user-provided temporal spec
  • Reconcile using our current reconcilers (except for in_sample reconcilers)
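
To make the aggregation step concrete, here is a minimal pure-Python sketch of what temporal aggregation does conceptually. It is not the aggregate_temporal implementation; the function name, the toy values, and the spec format (level name mapped to the number of bottom-level steps it spans) are assumptions for illustration only.

```python
def aggregate_temporal_sketch(y, spec):
    """Sum consecutive blocks of bottom-level values for each temporal level.

    `spec` maps a level name to the number of bottom-level steps it spans
    (an assumed format for this sketch, not necessarily the library's).
    """
    n = len(y)
    out = {}
    for level, steps in spec.items():
        if n % steps != 0:
            raise ValueError(f"series length {n} is not a multiple of {steps}")
        # One aggregated value per block of `steps` consecutive observations.
        out[level] = [sum(y[i:i + steps]) for i in range(0, n, steps)]
    return out


# Eight quarters of toy data (hypothetical values).
y_quarterly = [10, 20, 30, 40, 50, 60, 70, 80]
spec = {"year": 4, "semiannual": 2, "quarter": 1}

levels = aggregate_temporal_sketch(y_quarterly, spec)

# Coherence check: every temporal level adds up to the same grand total.
totals = {level: sum(values) for level, values in levels.items()}
```

Once the series is laid out like this, the temporal levels form a hierarchy just like a cross-sectional one, which is why the existing reconcilers can be reused on it.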

See the two new examples that showcase temporal reconciliation.

Limitations

  • Temporal reconciliation doesn't support in_sample reconcilers: there are no in-sample residuals for the temporal aggregations in the test set, so we'd need an estimate for them, which seems like good future work
  • Limited to temporal aggregations defined by datetime features, which I think is fine.

Open issues

  • First example notebook doesn't work yet; the main issue to solve is how the temporal unique ids can be matched back to the forecast df. Update: example works!
  • Include an easier evaluate function for cross-temporal forecasts; it currently requires running a somewhat complex double for-loop with itertools.
  • Unit tests for aggregate_temporal
  • Unit tests for make_future_dataframe
  • Add temporal-only example
  • Add temporal-probabilistic example
  • ufe.time_features cannot handle non-timestamp ds columns. Update: input should be timestamp or integer
  • Add cross-temporal tags creation unit test
  • Fix failures on date features that don't naturally convert to dates
  • Add more unit tests on weird aggregation combinations
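
For context on the evaluation item, this is roughly the shape of the double loop over cross-sectional and temporal levels. It is a minimal sketch with toy numbers: the level names, the dictionaries keyed by (cross-sectional level, temporal level), and the mae helper are all hypothetical, not the code from the example notebooks.

```python
from itertools import product


def mae(actual, predicted):
    """Mean absolute error for two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical level names and toy values keyed by (cross-sectional, temporal).
cross_sectional_levels = ["Country", "Country/State"]
temporal_levels = ["year", "quarter"]

actuals = {
    ("Country", "year"): [100.0],
    ("Country", "quarter"): [20.0, 30.0, 25.0, 25.0],
    ("Country/State", "year"): [60.0, 40.0],
    ("Country/State", "quarter"): [10.0, 15.0, 20.0, 15.0],
}
forecasts = {
    ("Country", "year"): [110.0],
    ("Country", "quarter"): [22.0, 28.0, 25.0, 27.0],
    ("Country/State", "year"): [55.0, 45.0],
    ("Country/State", "quarter"): [12.0, 15.0, 18.0, 15.0],
}

# One score per combination of cross-sectional and temporal level.
scores = {
    (cs, t): mae(actuals[(cs, t)], forecasts[(cs, t)])
    for cs, t in product(cross_sectional_levels, temporal_levels)
}
```

An easier evaluate function would mostly hide this product over the two sets of levels behind a single call.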

@elephaint elephaint linked an issue Nov 27, 2024 that may be closed by this pull request

@elephaint elephaint linked an issue Feb 11, 2025 that may be closed by this pull request
@elephaint elephaint marked this pull request as ready for review February 17, 2025 10:36

elephaint commented Feb 18, 2025

Date features other than year or day, which don't convert easily to datetimes, still seem to be an unsolved issue 🤔

This seems solved.

Not quite yet. The aggregated ids are sorted lexicographically, which messes up the assignment of the dates.
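
A minimal illustration of the sorting problem, assuming ids of the form "<level>-<index>" (a hypothetical format chosen for this sketch). It also shows one generic way to deduplicate ids while keeping their first-seen order.

```python
# Hypothetical aggregated temporal ids, listed in chronological order.
ids_in_time_order = [f"month-{i}" for i in range(1, 13)]

# Strings compare character by character, so "month-10" lands before "month-2".
lexicographic = sorted(ids_in_time_order)

# Deduplicate while keeping first-seen (chronological) order instead of sorting.
observed = ["month-1", "month-1", "month-2", "month-10", "month-10"]
ordered_unique = list(dict.fromkeys(observed))
```

Any step that sorts or groups on the id strings can silently reorder the periods this way, so dates assigned by position end up on the wrong rows.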

@elephaint elephaint marked this pull request as draft February 19, 2025 06:13
@elephaint elephaint marked this pull request as ready for review February 19, 2025 14:58
@elephaint elephaint marked this pull request as draft February 19, 2025 16:06

elephaint commented Feb 19, 2025

Fixed by maintaining order throughout the aggregation. Issues remaining:

  • Assignment of bottom dates to aggregated levels is still iffy. A better direction may be to exhaustively derive the frequencies from the (limited set of) attributes, then construct a date range from a starting point and a number of periods, both of which are known at each level.
  • Fix handling of series of different lengths. Currently the aggregation can handle them, but the timestamp assignment fails and assigns NaNs.
  • Add unit test on series of different lengths.
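
The date-range direction from the first bullet could look roughly like the sketch below. It is standard-library only and assumes month-based levels and series that start on the first day of a month; the month_range helper and the level-to-months mapping are hypothetical. In pandas, the same idea would presumably map onto pd.date_range with start, periods, and freq.

```python
from datetime import date


def month_range(start, periods, step_months):
    """Return `periods` dates, each `step_months` months after the previous one.

    Assumes `start` falls on the first day of a month.
    """
    out = []
    for k in range(periods):
        months = (start.month - 1) + k * step_months
        out.append(date(start.year + months // 12, months % 12 + 1, 1))
    return out


# Hypothetical mapping from temporal level to its length in months.
level_months = {"year": 12, "quarter": 3, "month": 1}
n_bottom = 24  # two years of monthly data

# Start point and number of periods are both known at every level.
ranges = {
    level: month_range(date(2023, 1, 1), n_bottom // m, m)
    for level, m in level_months.items()
}
```

Because each series would carry its own start and its own number of periods, this approach may also sidestep the NaN assignment for series of different lengths.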

@elephaint

Bottom dates are now assigned to aggregated levels by taking the observed unique timestamps at evenly spaced positions (a linspace), where the step is determined by the number of steps in the temporal aggregation. Also not ideal, but the best option for the moment.
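
A minimal sketch of that idea, under the assumption that each aggregated period is labelled with the last bottom-level timestamp it covers (the PR may pick a different position). The function name and the string timestamps are hypothetical.

```python
def assign_aggregate_timestamps(bottom_ds, steps):
    """Pick one bottom-level timestamp per aggregated period.

    Takes the evenly spaced positions steps-1, 2*steps-1, ..., i.e. the last
    bottom-level timestamp inside each block of `steps` observations.
    A trailing incomplete block gets no timestamp.
    """
    return bottom_ds[steps - 1::steps]


# Twelve observed monthly timestamps (toy strings instead of real datetimes).
bottom_ds = [f"2024-{m:02d}" for m in range(1, 13)]

quarter_ds = assign_aggregate_timestamps(bottom_ds, 3)
year_ds = assign_aggregate_timestamps(bottom_ds, 12)
```

This only relies on the observed timestamps being in chronological order, which is why preserving order during aggregation matters.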

@elephaint elephaint marked this pull request as ready for review February 24, 2025 16:13
id_col: str = "unique_id",
time_col: str = "ds",
id_time_col: str = "temporal_id",
target_cols: list[str] = ["y"],
Member

I think we should make this immutable, e.g.

from collections.abc import Sequence

...

target_cols: Sequence[str] = ("y",),  # one-tuple

Contributor Author

Does this work when we have multiple targets? (which is the reason it's a list)
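
For reference on the multiple-targets question, a small standalone sketch: Sequence[str] is only an annotation, so a tuple with several names and a plain list both pass through unchanged. The two helper functions are hypothetical and exist only to show the behaviour; they are not part of the PR.

```python
from collections.abc import Sequence


def select_targets(row: dict, target_cols: Sequence[str] = ("y",)) -> list:
    """Hypothetical helper: read the target columns out of one row."""
    return [row[col] for col in target_cols]


def append_with_list_default(col: str, target_cols: list = ["y"]) -> list:
    """Shows the pitfall: the default list is created once and shared across calls."""
    target_cols.append(col)
    return target_cols


row = {"y": 1.0, "y2": 2.0}
single = select_targets(row)                    # default one-tuple
multi_tuple = select_targets(row, ("y", "y2"))  # several targets as a tuple
multi_list = select_targets(row, ["y", "y2"])   # a list still works

# The mutable default keeps growing between calls.
first = list(append_with_list_default("a"))
second = list(append_with_list_default("b"))
```

One caveat with the Sequence[str] annotation: a bare string is itself a sequence of strings, so passing "y" instead of ("y",) would not be flagged by a type checker.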

Successfully merging this pull request may close these issues.

Cross-termporal forecasting Add temporal hierarchies
2 participants