
[Bug]: Sometimes the optimal results of non-optimal estimators are not saved #1388

Open
flippercy opened this issue Dec 16, 2024 · 8 comments
Labels
bug Something isn't working

Comments

@flippercy

Describe the bug

Hi:

I've created two customized LightGBM estimators for AutoML:

class MyMonotonicLightGBMGBDTClassifier(BaseEstimator):

    def __init__(self, task = 'binary:logistic', n_jobs = num_cores, **params):
        super().__init__(task, **params)
        self.estimator_class = LGBMClassifier

        # convert to int for integer hyperparameters
        self.params = {
            'n_jobs': params['n_jobs'] if 'n_jobs' in params else num_cores,
            'boosting_type': params['boosting_type'] if 'boosting_type' in params else 'gbdt',
            'colsample_bytree': params['colsample_bytree'],
            'n_estimators': int(params['n_estimators']),
            'random_state': params['random_state'] if 'random_state' in params else randomseed,
            'monotone_constraints': params['monotone_constraints'] if 'monotone_constraints' in params else monotone,
        }

    @classmethod
    def search_space(cls, data_size, task):
        space = {
            'n_estimators': {'domain': tune.uniform(lower = 50, upper = 500), 'init_value': 200, 'low_cost_init_value': 200},
            'colsample_bytree': {'domain': tune.uniform(lower = 0.5, upper = 1), 'init_value': 0.9, 'low_cost_init_value': 0.9},
        }
        return space

automl.add_learner(learner_name = 'MonotonicLightGBMGBDT', learner_class = MyMonotonicLightGBMGBDTClassifier)

class MyMonotonicLightGBMDartClassifier(BaseEstimator):

    def __init__(self, task = 'binary:logistic', n_jobs = num_cores, **params):
        super().__init__(task, **params)
        self.estimator_class = LGBMClassifier

        self.params = {
            'n_jobs': params['n_jobs'] if 'n_jobs' in params else num_cores,
            'boosting_type': params['boosting_type'] if 'boosting_type' in params else 'dart',
            'colsample_bytree': params['colsample_bytree'],
            'n_estimators': int(params['n_estimators']),
            'drop_rate': params['drop_rate'],
            'random_state': params['random_state'] if 'random_state' in params else randomseed,
            'monotone_constraints': params['monotone_constraints'] if 'monotone_constraints' in params else monotone,
        }

    @classmethod
    def search_space(cls, data_size, task):
        space = {
            'n_estimators': {'domain': tune.uniform(lower = 50, upper = 500), 'init_value': 200, 'low_cost_init_value': 200},
            'colsample_bytree': {'domain': tune.uniform(lower = 0.5, upper = 1), 'init_value': 0.9, 'low_cost_init_value': 0.9},
            'drop_rate': {'domain': tune.uniform(lower = 0.1, upper = 0.4), 'init_value': 0.2, 'low_cost_init_value': 0.2},
        }
        return space

automl.add_learner(learner_name = 'MonotonicLightGBMDart', learner_class = MyMonotonicLightGBMDartClassifier)

Then I run AutoML with these two estimators using the settings below:

from flaml import AutoML
from flaml.automl.model import BaseEstimator, LRL1Classifier
from xgboost.sklearn import XGBClassifier
from lightgbm.sklearn import LGBMClassifier

estimator_list= [ 'MonotonicLightGBMDart', 'MonotonicLightGBMGBDT']

settings = {
    "keep_search_state": True,
    "time_budget": flaml_time_budget,
    'max_iter': 15,
    'mem_thres': flaml_mem_thres,
    "metric": 'roc_auc',
    "task": 'classification',
    "estimator_list": estimator_list,
    "log_file_name": logfilename,
    "log_type": 'all',
    "seed": randomseed,
    "model_history": True
}
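
The search itself is then launched by passing these settings to automl.fit. A minimal sketch (the data frame and weight names here are placeholders; the actual ones appear in the full code further below):

automl.fit(
    X_train=X_train, y_train=y_train, sample_weight=w_train,    # training data (placeholders)
    X_val=X_val, y_val=y_val, sample_weight_val=w_val,          # holdout validation data (placeholders)
    **settings,
)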

The process usually runs well; however, I noticed one issue: sometimes the best result of an estimator that is not the overall optimal one is not saved. For example, after the search I want to retrieve the best models of both MonotonicLightGBMDart and MonotonicLightGBMGBDT. If the overall optimal model is built by MonotonicLightGBMDart, then sometimes the best model from MonotonicLightGBMGBDT is not saved (automl.best_model_for_estimator('MonotonicLightGBMGBDT')._model returned an empty model).
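
For reference, the retrieval I am doing is essentially this (a minimal sketch using the learner names registered above):

for name in ['MonotonicLightGBMDart', 'MonotonicLightGBMGBDT']:
    best = automl.best_model_for_estimator(name)
    # best._model is sometimes empty for the learner that did not win overall
    print(name, best, getattr(best, '_model', None))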

What confuses me even more is that it does not happen every time and is not always repeatable. Sometimes, if I restart the kernel and re-run the process, the issue disappears.

Could anyone check my code and tell me what causes this problem?

Thank you.

Steps to reproduce

No response

Model Used

No response

Expected Behavior

No response

Screenshots and logs

No response

Additional Information

No response

@flippercy flippercy added the bug Something isn't working label Dec 16, 2024
@thinkall
Collaborator

Hi @flippercy , thank you for reporting the issue. It happens when one estimator is never trained. Could you check the detailed logs to confirm that?

@flippercy
Author

Hi @thinkall:

Unfortunately, that is not the reason. Based on the logs and the results from other functions (such as automl._search_states.items()), all the estimators have been trained. I can even retrieve the optimal results (such as AUC) for each learner without any issue; the only problem is that the best model itself of a non-optimal learner sometimes cannot be saved. It does not happen every time but just randomly, which makes it harder to troubleshoot.
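
The kind of inspection I mean looks roughly like this (a sketch; _search_states is an internal FLAML attribute, so names such as best_loss may differ between versions):

for name, state in automl._search_states.items():
    # each learner shows a finished search with a recorded best loss ...
    print(name, 'best loss:', state.best_loss)
    # ... yet the wrapped model is sometimes missing for the non-optimal learner
    best = automl.best_model_for_estimator(name)
    print(name, 'fitted model:', getattr(best, '_model', None))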

Is it due to my customized learners? Could you check them quickly please?

Thank you.

@thinkall
Collaborator

Hi @flippercy , I guess it happens when the non-optimal learner was never fully trained. Would you mind sharing the full log history and a code snippet for reproducing it?

@flippercy
Author

Hi @thinkall:

Thank you for the response. Unfortunately, the related log file was deleted; however, I can guarantee that under-training is not the reason for this issue. With the example above, when I ran the search for 100 iterations, the usage ratio between the two estimators was usually about 6:4.

@thinkall
Collaborator

Hi @flippercy , it would be helpful if you could share a complete code snippet for reproducing the issue. Thanks.

@flippercy
Author

flippercy commented Dec 19, 2024

@thinkall:

Thank you for the response. I am afraid that the issue is not easily repeatable because first of all, it happens RANDOMLY. As I said, our current "solution" is simply restart the kernel and rerun the whole process; most of the time the issue will be gone; if not, we will repeat until it disappears......; in addition, I am not sure whether it will happen with default learners. I suspect that probably it is related with my customized learners but cannot find any clue.

No matter what, below are the codes I used:

predictors_to_use_for_FLAML = predictors_i + RawModelingVariables

df = pd.DataFrame(data_dev_balanced_B_WtCor, index=[targetVariable])
monotone_values = (df[predictors_to_use_for_FLAML] / df[predictors_to_use_for_FLAML].abs()).astype(int).values.tolist()
predictors_to_use_for_FLAML_monotone = []
for sublist in monotone_values:
    predictors_to_use_for_FLAML_monotone.extend(sublist)

data_dev_balanced_B_flaml = data_dev_balanced_B.loc[:, IndexVariables + [targetVariable, weightVariable] + predictors_to_use_for_FLAML]
data_dev_balanced_B_flaml[targetVariable] = data_dev_balanced_B_flaml[targetVariable].astype(int)

data_val_balanced_B_flaml = data_val_balanced_B.loc[:, IndexVariables + [targetVariable, weightVariable] + predictors_to_use_for_FLAML]
data_val_balanced_B_flaml[targetVariable] = data_val_balanced_B_flaml[targetVariable].astype(int)

logfilename = outputDir + '/model_result.txt'

flaml_estimator_list= ['MonotonicLightGBMGBDT', 'MonotonicLightGBMDart']

flaml_time_budget = int(3600 * 24 * 3) # seconds
flaml_max_iter = 250
flaml_mem_thres = 1024 * 1024 * 1024 * 60 # bytes
randomNumberSeed = int(randomNumberSeed)

import flaml as flaml
import pickle
import random
import numpy as np
import time as time
import pandas as pd
from flaml import tune
from flaml import AutoML
from flaml.automl.model import BaseEstimator, LRL1Classifier
from xgboost.sklearn import XGBClassifier
from lightgbm.sklearn import LGBMClassifier

automl = AutoML()
print(flaml.__version__)

num_cores = numCores
randomseed = randomNumberSeed

predictors_to_consider_for_FLAML = predictors_to_use_for_FLAML

monotone=tuple(predictors_to_use_for_FLAML_monotone)

data_dev_balanced_B_X=data_dev_balanced_B_flaml[data_dev_balanced_B_flaml.columns.intersection(predictors_to_consider_for_FLAML)]
data_dev_balanced_B_y=data_dev_balanced_B_flaml[targetVariable].values.ravel()
data_dev_balanced_B_w=data_dev_balanced_B_flaml[weightVariable].values.ravel()

data_val_balanced_B_X=data_val_balanced_B_flaml[data_val_balanced_B_flaml.columns.intersection(predictors_to_consider_for_FLAML)]
data_val_balanced_B_y=data_val_balanced_B_flaml[targetVariable].values.ravel()
data_val_balanced_B_w=data_val_balanced_B_flaml[weightVariable].values.ravel()

class MyMonotonicLightGBMGBDTClassifier(BaseEstimator):

    def __init__(self, task = 'binary:logistic', n_jobs = num_cores, **params):
        super().__init__(task, **params)
        self.estimator_class = LGBMClassifier

        # convert to int for integer hyperparameters
        self.params = {
            'n_jobs': params['n_jobs'] if 'n_jobs' in params else num_cores,
            'boosting_type': params['boosting_type'] if 'boosting_type' in params else 'gbdt',
            'colsample_bytree': params['colsample_bytree'],
            'n_estimators': int(params['n_estimators']),
            'random_state': params['random_state'] if 'random_state' in params else randomseed,
            'monotone_constraints': params['monotone_constraints'] if 'monotone_constraints' in params else monotone,
        }

    @classmethod
    def search_space(cls, data_size, task):
        space = {
            'n_estimators': {'domain': tune.uniform(lower = 50, upper = 500), 'init_value': 200, 'low_cost_init_value': 200},
            'colsample_bytree': {'domain': tune.uniform(lower = 0.5, upper = 1), 'init_value': 0.9, 'low_cost_init_value': 0.9},
        }
        return space

automl.add_learner(learner_name = 'MonotonicLightGBMGBDT', learner_class = MyMonotonicLightGBMGBDTClassifier)

class MyMonotonicLightGBMDartClassifier(BaseEstimator):

    def __init__(self, task = 'binary:logistic', n_jobs = num_cores, **params):
        super().__init__(task, **params)
        self.estimator_class = LGBMClassifier

        self.params = {
            'n_jobs': params['n_jobs'] if 'n_jobs' in params else num_cores,
            'boosting_type': params['boosting_type'] if 'boosting_type' in params else 'dart',
            'colsample_bytree': params['colsample_bytree'],
            'n_estimators': int(params['n_estimators']),
            'drop_rate': params['drop_rate'],
            'random_state': params['random_state'] if 'random_state' in params else randomseed,
            'monotone_constraints': params['monotone_constraints'] if 'monotone_constraints' in params else monotone,
        }

    @classmethod
    def search_space(cls, data_size, task):
        space = {
            'n_estimators': {'domain': tune.uniform(lower = 50, upper = 500), 'init_value': 200, 'low_cost_init_value': 200},
            'colsample_bytree': {'domain': tune.uniform(lower = 0.5, upper = 1), 'init_value': 0.9, 'low_cost_init_value': 0.9},
            'drop_rate': {'domain': tune.uniform(lower = 0.1, upper = 0.4), 'init_value': 0.2, 'low_cost_init_value': 0.2},
        }
        return space

automl.add_learner(learner_name = 'MonotonicLightGBMDart', learner_class = MyMonotonicLightGBMDartClassifier)

estimator_list= flaml_estimator_list

settings = {
    "keep_search_state": False,
    "time_budget": flaml_time_budget,
    'max_iter': 15,
    'mem_thres': flaml_mem_thres,
    "metric": 'roc_auc',
    "task": 'classification',
    "estimator_list": estimator_list,
    "log_file_name": logfilename,
    "log_type": 'all',
    "seed": randomseed,
    "model_history": True,
}

automl.fit(X_train = data_dev_balanced_B_X, y_train = data_dev_balanced_B_y, sample_weight = data_dev_balanced_B_w,
           X_val = data_val_balanced_B_X, y_val = data_val_balanced_B_y, sample_weight_val = data_val_balanced_B_w, **settings)

for x in estimator_list:
    automl_best_model = automl.best_model_for_estimator(x)
    if automl_best_model is not None:
        automl_best_model.model.booster_.save_model(dataDir + '/Best_' + x)

@flippercy
Author

model_result.txt

This is the log file from a recent search. After that run, the best model from the learner "MonotonicLightGBMDart" could not be saved.

@thinkall
Collaborator

Hi @flippercy , this is tricky. I'm not sure what the root cause is. But as you've mentioned:

the best model by MonotonicLightGBMGBDT is not saved (returned an empty model when I tried automl.best_model_for_estimator('MonotonicLightGBMGBDT')._model)

You got an empty model (a dummy model?) instead of None. Maybe you've hit a resource limitation. Just some thoughts.

FLAML/flaml/automl/model.py

Lines 278 to 287 in 6d53929

except (MemoryError, TimeoutError) as e:
    logger.warning(f"{e.__class__} {e}")
    if self._task.is_classification():
        model = DummyClassifier()
    else:
        model = DummyRegressor()
    X_train = self._preprocess(X_train)
    model.fit(X_train, y_train)
self._model = model
train_time = time.time() - start_time
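
If that fallback is what happened, it should be detectable from the returned estimator. A quick check along these lines (a sketch; .model is the same attribute used in your saving loop above):

from sklearn.dummy import DummyClassifier

# If fit() hit a MemoryError or TimeoutError, the excerpt above replaces the
# real LightGBM model with a DummyClassifier; this check would expose that case.
est = automl.best_model_for_estimator('MonotonicLightGBMGBDT')
if est is not None and isinstance(est.model, DummyClassifier):
    print('This learner fell back to a dummy model, likely due to mem_thres or the time budget.')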
