[Issue]: How does FLAML handle missing values #1358

lizhuoq · 2024-09-22T04:42:13Z

I looked through the FLAML documentation and couldn't find how FLAML handles missing values in regression and classification tasks for the different estimators. It would be very helpful if the documentation described, for each learning algorithm and task, how FLAML handles missing values in categorical and continuous variables. Thank you!

dannycg1996 · 2024-11-04T10:49:45Z

Hi @lizhuoq, FLAML doesn't appear to do any preprocessing to handle missing values - it leaves this to the estimators themselves.

To test this, I fitted an LRL1 estimator on the Titanic dataset (which contains missing data), and an error was raised.

Some estimators can't handle missing values, whilst others (like CatBoost) can. My code for generating the above error can be found below. If we change the estimator list to "estimator_list": ['catboost'], no error is raised.

import seaborn as sns
from flaml import AutoML

# Load the Titanic dataset, which contains missing values (e.g. in "age").
titanic_df = sns.load_dataset("titanic")
titanic_df = titanic_df.drop(columns=["deck"])

# Convert to NumPy arrays; the NaNs are kept in X_train.
X_train = titanic_df.drop(columns=["survived"]).to_numpy()
y_train = titanic_df["survived"].to_numpy()

automl_settings = {
    "time_budget": 20,  # in seconds
    "metric": "accuracy",
    "estimator_list": ["lrl1"],  # swap for ["catboost"] to avoid the error
    "task": "classification",
    "log_file_name": "titanic_test.log",
    "n_splits": 10,
    "split_type": "uniform",
}

automl = AutoML()
automl.fit(X_train, y_train, **automl_settings)
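If you need an estimator that can't handle missing values, one option is to fill them in yourself before calling fit. Below is a minimal sketch using only pandas, not something FLAML does for you: the helper name impute_missing, the "missing" sentinel and the toy frame are all hypothetical, and the median/sentinel strategy is just one reasonable choice.

```python
import numpy as np
import pandas as pd


def impute_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with NaNs filled in.

    Hypothetical helper: numeric columns get their median,
    all other columns get the sentinel string "missing".
    """
    out = df.copy()
    for col in out.columns:
        # Leave columns without missing values untouched.
        if not out[col].isna().any():
            continue
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            # Cast to object so the sentinel is accepted even for
            # categorical columns that do not list it as a category.
            out[col] = out[col].astype(object).fillna("missing")
    return out


# Toy frame standing in for a couple of Titanic columns (made-up values).
toy = pd.DataFrame({
    "age": [22.0, np.nan, 30.0],
    "embarked": ["S", None, "C"],
})
clean = impute_missing(toy)
print(clean)
```

This only deals with the NaNs. Depending on the estimator, string columns may still need to be encoded as numbers before fitting.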

I hope that helps!

lizhuoq changed the title ~~[Issue]: FLAML~~ [Issue]: How does FLAML handle missing values Sep 22, 2024

thinkall added the documentation and help wanted labels Oct 24, 2024
