theme | layout | highlighter | colorSchema | favicon | title |
---|---|---|---|---|---|
default | cover | shiki | light | favicon/url | How we used Polars to build functime, a next gen ML forecasting library |
How we used Polars and global forecasting to build a next-generation ML forecasting library
👤 Luca Baggi
💼 ML Engineer @xtream
🛠️ Maintainer @functime
A new paradigm to evaluate the forecasting process
"We spend far too many resources generating, reviewing, adjusting, and approving our forecasts, while almost invariably failing to achieve the level of accuracy desired." (source)
Mike Gilliland
Board of Directors of the International Institute of Forecasters
A new paradigm to evaluate the forecasting process
"The focus needs to change. We need to shift our attention from esoteric model building to the forecasting process itself – its efficiency and its effectiveness." (source)
Mike Gilliland
Board of Directors of the International Institute of Forecasters
Reframe the problem
Make forecasting just work at a reasonable scale (~90% of use cases).
- Forecast thousands of time series without distributed systems (e.g. PySpark).
- Feature-engineering and diagnostics API compatible with panel datasets.
- Smoothly translate from experimentation to production.
This can be achieved with two ingredients: Polars and global forecasting.
A brief description
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
- A dataframe frontend: work with a Python object and not a SQL table.
- Utilises all cores on your machine, efficiently (more on this later).
- Uses 50+ years of relational database research to optimise the query.
- In-process, like SQLite (OLTP), DuckDB (OLAP) and LanceDB (vector) — see the sketch below.
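To make the "dataframe frontend on top of a query engine" idea concrete, here is a minimal sketch; the file and column names are made up for illustration:

```python
import polars as pl

# Hypothetical file and column names, purely for illustration.
lazy = (
    pl.scan_parquet("sales.parquet")       # lazy scan: nothing is read yet
    .filter(pl.col("country") == "IT")     # predicate can be pushed down to the scan
    .group_by("store_id")
    .agg(pl.col("revenue").sum())
)
df = lazy.collect()  # the optimised plan runs here, in-process, on all cores
```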
What makes it so fast
- Efficient data representation and I/O with Apache Arrow
- Work stealing, AKA efficient multithreading.
- Query optimisations through lazy evaluation (e.g. `DataFrame.sort("col1").head(5)` in pandas vs Polars), as sketched below.
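A minimal sketch of that last point, with a made-up dataset and column name: in eager pandas, `sort_values("col1").head(5)` sorts the whole frame before taking five rows, whereas Polars' lazy engine sees the full query and can rewrite it into a top-k selection.

```python
import polars as pl

# Hypothetical dataset and column name, just to show the query plan.
lf = pl.scan_parquet("data.parquet")

query = lf.sort("col1").head(5)
print(query.explain())   # inspect the optimised plan before running it
top5 = query.collect()   # only the top 5 rows are materialised
```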
A lesson from forecasting competitions
Global forecasting simply means fitting a single model on all the time series in your panel dataset.
This approach proved successful in multiple forecasting competitions, most notably M4 (1 2) and M5 (1).
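As a rough sketch of the idea (toy data; scikit-learn is used only for illustration and is not part of the talk's stack): every series lives in the same panel dataframe, identified by a series id, and one model is fit on the stacked rows of all series, rather than one model per series.

```python
import numpy as np
import polars as pl
from sklearn.linear_model import Ridge

# Toy panel dataset: three series, 100 observations each (all names illustrative).
rng = np.random.default_rng(42)
panel = pl.DataFrame({
    "series_id": np.repeat(["a", "b", "c"], 100),
    "t": np.tile(np.arange(100), 3),
    "y": rng.normal(size=300).cumsum(),
})

# One lag feature computed per series, then a SINGLE model fit on every series at once.
# Contrast with the "local" approach, which would fit one model per series_id.
train = (
    panel.with_columns(pl.col("y").shift(1).over("series_id").alias("lag_1"))
    .drop_nulls()
)
global_model = Ridge().fit(train.select("lag_1").to_numpy(), train["y"].to_numpy())
```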
A lesson from forecasting competitions
Gradient boosted regression trees secured the top spots, but linear models work well too, provided some thoughtful and deliberate feature engineering.
Here's the recipe to make functime: a powerful query engine to perform blazingly fast feature engineering, followed by a single `model.fit()`.
It doesn't have to be the best model, but it must be fast to iterate on and scale to thousands of time series on your laptop.
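A minimal sketch of that recipe, with made-up file and column names and scikit-learn standing in for the regressor (this is not functime's actual API, just the idea behind it): lazy Polars does the group-wise feature engineering across the whole panel, then a single fit covers every series.

```python
import polars as pl
from sklearn.ensemble import HistGradientBoostingRegressor

# Hypothetical panel file with columns: series_id, timestamp, y.
features = (
    pl.scan_parquet("panel.parquet")
    .sort("series_id", "timestamp")
    .with_columns(
        [pl.col("y").shift(i).over("series_id").alias(f"lag_{i}") for i in (1, 2, 3)]
    )
    .with_columns(
        pl.col("y").shift(1).rolling_mean(window_size=7).over("series_id").alias("rolling_mean_7")
    )
    .drop_nulls()
    .collect()  # feature engineering runs once, in parallel, over the whole panel
)

X = features.drop("series_id", "timestamp", "y").to_numpy()
y = features["y"].to_numpy()

# One model, one fit, for thousands of series.
model = HistGradientBoostingRegressor().fit(X, y)
```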
Time for some dangerous live coding 🥶
What I could not show
- Prediction intervals with conformal predictions.
- Hyperparameter tuning with `flaml`.
- Advanced feature extraction.
- Censored forecasts.
- LLM data analysis.
A deep dive into the Arrow ecosystem and Polars internals
- Apache Arrow and Substrait, the secret foundations of Data Engineering - Alessandro Molina @EuroPython 2023
- Polars: DataFrames in the multi-core era - Ritchie Vink @PyData NYC 2023
- Is the great dataframe showdown finally over? Enter: Polars - Luca Baggi (me) @PyConIt 2023
More PyData Global 2023 talks
- Polars and time zones: everything you need to know by Marco Gorelli
- Practical showcase on how to use Polars to master datetimes and time-zones
- We rewrote tsfresh in Polars and why you should too by Chris Lo and Mathieu Cayssol
- 90-minute workshop to dive deeper into functime internals, Polars integration and benchmarking (thanks to Polars-Rust plugins!)
Documentation and communities
- Polars website
- Polars discord server
- functime.ai website and docs
- functime.ai discord server
Please share your feedback! My address is lucabaggi [at] duck.com