Improve the univariate statsforecast function in EvaDB #1081

xzdandy · 2023-09-08T08:11:35Z

Search before asking

I have searched the EvaDB issues and found no similar feature requests.

Description

1. The univariate statsforecast function trains and predicts on the exact same input relation, so there is no need for a separate training procedure. Currently, `SELECT Forecast(12) FROM AirData;` does not make sense.
2. The timeseries column is not handled properly. statsforecast requires a specific format for the timeseries column: https://nixtla.github.io/statsforecast/docs/getting-started/getting_started_short.html
3. The univariate statsforecast function expects a fixed schema for the input dataframe. Renaming the columns is not handled properly at the moment.
4. Update the documentation with all available parameters.
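To make items 2 and 3 concrete, here is a minimal sketch of the renaming step, assuming the `unique_id` / `ds` / `y` column names described in the statsforecast getting-started guide linked above. The helper `to_statsforecast_schema`, the table columns `saledate` and `price`, and the sample values are all hypothetical and only serve as illustration; this is not EvaDB's actual implementation.

```python
import pandas as pd


def to_statsforecast_schema(df, time_col, target_col, id_col=None):
    """Return a new frame with the unique_id / ds / y columns (hypothetical helper)."""
    out = pd.DataFrame(index=df.index)
    # A single univariate series has no id column, so fall back to a constant id.
    out["unique_id"] = df[id_col] if id_col is not None else "series_0"
    out["ds"] = df[time_col]
    out["y"] = df[target_col]
    return out


# Made-up table with customized column names.
sales = pd.DataFrame(
    {
        "saledate": ["2007-09-30", "2007-12-31"],
        "price": [100.0, 110.0],
    }
)

renamed = to_statsforecast_schema(sales, time_col="saledate", target_col="price")
print(renamed)
```

The helper builds a new frame rather than renaming in place, so the user's original relation keeps its own column names.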

Use case

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!


americast · 2023-09-08T15:08:21Z

> The univariate statsforecast function train and predicts on the exact same input relation, so there is no need for a separate training procedure. Currently SELECT Forecast(12) FROM AirData; does not make sense.

I believe we can simply do `SELECT Forecast(12);`. The FROM part is a little redundant, but I am not sure whether dropping it is in line with SQL syntax.

@xzdandy I'll take care of 2, you can assign it to me. Thanks!

xzdandy · 2023-09-08T15:42:49Z

> > The univariate statsforecast function train and predicts on the exact same input relation, so there is no need for a separate training procedure. Currently SELECT Forecast(12) FROM AirData; does not make sense.
>
> I believe we can simply do SELECT Forecast(12);. The FROM part is a little redundant, but I am not sure if that is in line with SQL syntax.
>
> @xzdandy I'll take care of 2, you can assign it to me. Thanks!

Thanks @americast! The `SELECT Forecast(12);` syntax is not supported yet, but we can add that. I will handle 3 first, since it currently prevents me from forecasting on tables with customized column names.

The time data type can be tricky. For example, I am using the House Property Sales Time Series data set, where the saledate column holds values like 30/09/2007, which is not the default pandas date format. We need to support some kind of date type and conversion here. Do you have any ideas?

americast · 2023-09-08T15:47:41Z

> I will handle 3 first, which does not allow me to doing forecast on tables with customized column names.

@xzdandy I added some support for customized column names in #969. It's handled by the id and time variables. Are they not working for you?

xzdandy · 2023-09-08T15:51:16Z

> > I will handle 3 first, which does not allow me to doing forecast on tables with customized column names.
>
> @xzdandy I had added some support for customized column names in #969 . It's handled by the id and time variables. Are they not working for you?

It is not working, for two reasons: 1) The change is applied to aggregated_batch instead of data. This can be easily fixed. 2) The output object of the UDF is not bound correctly, so in the projection we look for a non-existent column.

xzdandy · 2023-09-09T16:54:15Z

From the warning message, `/home/zxu330/eva/evadb-venv-test/lib/python3.10/site-packages/statsforecast/core.py:691: UserWarning: Parsing dates in %d/%m/%Y format when dayfirst=False (the default) was specified. Pass dayfirst=True or specify a format to silence this warning.`, it seems we can specify a time format for parsing. We can explore this option.
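As a rough illustration of the two options the warning mentions, the sketch below parses day-first strings with pandas, either by passing an explicit `format` or by setting `dayfirst=True`. The sample values only mimic the 30/09/2007 style from the saledate column, and this is not necessarily how the fix was eventually implemented in EvaDB.

```python
import pandas as pd

# Made-up day-first date strings in the same style as the saledate column.
raw = pd.Series(["30/09/2007", "31/12/2007", "31/03/2008"])

# Option 1: state the format explicitly, so nothing has to be guessed.
parsed = pd.to_datetime(raw, format="%d/%m/%Y")

# Option 2: keep format inference, but tell pandas the day comes first.
parsed_dayfirst = pd.to_datetime(raw, dayfirst=True)

print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

An explicit format is the stricter choice: a value that does not match raises an error instead of being silently misread, which matters for dates such as 01/02/2008 that are valid in both day-first and month-first order.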

Addressing item 3 in #1081:
* [x] In `evadb/executor/create_function_executor.py`, we rename the input relation to the [fixed schema](https://nixtla.github.io/statsforecast/docs/getting-started/getting_started_short.html) required by statsforecast.
* [x] Rename the output column so it is in sync with the binder. This is a temporary fix. We will reconsider the de# #1017
* [x] Update test cases to test the column rename feature.

- Addressing `Update documentation with all available parameters.` in #1081.
- Adding documentation for
  * MODEL
  * ID
  * TIME
  * PREDICT
  * FREQUENCY

Fixes #1081 pt 2.

Address the change from `SELECT Forecast(12) FROM AirData;` to `SELECT Forecast(12);` in #1081.
- [x] Update the parser, binder, optimizer, and executor to allow a projection without children.
- [x] Update the forecasting test cases and documentation.
- [x] Add a unit test and a short integration test for `SELECT expr;`.
- [x] Add documentation stating that we support `SELECT expr;`.

xzdandy added the Feature Request ✨ and High Priority ⚡️ labels Sep 8, 2023

xzdandy added this to the v0.3.5 milestone Sep 8, 2023

xzdandy self-assigned this Sep 8, 2023

This was referenced Sep 8, 2023

Add documentation on all the parameters CREATE FUNCTION TYPE Forecasting support #1064

Closed

Forecast model crash when converting date into float #1065

Closed

xzdandy assigned americast Sep 8, 2023

xzdandy mentioned this issue Sep 9, 2023

Fix column name related issue for Forecast functions #1084

Merged

3 tasks

This was referenced Sep 10, 2023

Update parameters documentation for forecast #1086

Merged

Support SELECT expr; which does not require FROM table #1087

Merged

jiashenC pushed a commit that referenced this issue Sep 10, 2023

Update parameters documentation for forecast (#1086)

52b4d8e

- Addressing `Update documentation with all available parameters.` in #1081.
- Adding documentation for
  * MODEL
  * ID
  * TIME
  * PREDICT
  * FREQUENCY

americast mentioned this issue Sep 11, 2023

Fixes date and frequency issues in forecasting #1094

Merged

americast closed this as completed in #1094 Sep 12, 2023

americast added a commit that referenced this issue Sep 12, 2023

Fixes date and frequency issues in forecasting (#1094)

788d65e

Fixes #1081 pt 2.
