[Bug]: Sometimes the optimal results of non-optimal estimators are not saved #1388
Comments
Hi @flippercy , thank you for reporting the issue. It happens when an estimator was never trained. Could you check the detailed logs to confirm that?
Hi @thinkall: Unfortunately that is not the reason. Based on the logs and on results from other functions (such as automl._search_states.items()), all the estimators have been trained. I can even retrieve the optimal results (such as AUC) for each learner without any issue; the only problem is that the best model itself of a non-optimal learner sometimes cannot be saved. It does not happen every time, just randomly, which makes it harder to troubleshoot. Could it be due to my customized learners? Could you take a quick look at them, please? Thank you.
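(For reference, a minimal sketch of how per-learner results can be inspected after `fit` through public attributes rather than the internal `_search_states`; it assumes `best_loss_per_estimator` and `best_model_for_estimator` behave as in recent FLAML releases.)

```python
# Inspect each learner's best loss and whether its best model was saved.
# Assumes `automl` is an already-fitted flaml.AutoML instance.
for name, loss in automl.best_loss_per_estimator.items():
    best = automl.best_model_for_estimator(name)
    inner = getattr(best, "_model", None) if best is not None else None
    print(f"{name}: best loss = {loss}, model saved = {inner is not None}")
```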
Hi @flippercy , I guess it happens when the non-optimal learner was never fully trained. Do you mind sharing the full log history and a code snippet for reproducing it?
Hi @thinkall: Thank you for the response. Unfortunately the related log file was deleted; however, I can guarantee that undertraining is not the reason for this issue. With the example above, when I ran the search for 100 iterations, the ratio of usage between the two estimators was usually about 6:4.
Hi @flippercy , it would be helpful if you could share a complete code snippet for reproducing the issue. Thanks.
Thank you for the response. I am afraid the issue is not easily reproducible: first of all, it happens RANDOMLY. As I said, our current "solution" is simply to restart the kernel and rerun the whole process; most of the time the issue is then gone, and if not, we repeat until it disappears. In addition, I am not sure whether it happens with the default learners; I suspect it is related to my customized learners but cannot find any clue. In any case, below is the code I used:

```python
predictors_to_use_for_FLAML = predictors_i + RawModelingVariables
df = pd.DataFrame(data_dev_balanced_B_WtCor, index=[targetVariable])
data_dev_balanced_B_flaml = data_dev_balanced_B.loc[
    :, IndexVariables + [targetVariable, weightVariable] + predictors_to_use_for_FLAML
]
data_val_balanced_B_flaml = data_val_balanced_B.loc[
    :, IndexVariables + [targetVariable, weightVariable] + predictors_to_use_for_FLAML
]
logfilename = outputDir + '/model_result.txt'
flaml_estimator_list = ['MonotonicLightGBMGBDT', 'MonotonicLightGBMDart']
flaml_time_budget = int(3600 * 24 * 3)  # seconds

import flaml as flaml
automl = AutoML()
num_cores = numCores
predictors_to_consider_for_FLAML = predictors_to_use_for_FLAML
monotone = tuple(predictors_to_use_for_FLAML_monotone)
data_dev_balanced_B_X = data_dev_balanced_B_flaml[
    data_dev_balanced_B_flaml.columns.intersection(predictors_to_consider_for_FLAML)
]
data_val_balanced_B_X = data_val_balanced_B_flaml[
    data_val_balanced_B_flaml.columns.intersection(predictors_to_consider_for_FLAML)
]

class MyMonotonicLightGBMGBDTClassifier(BaseEstimator):
    def __init__(self, task='binary:logistic', n_jobs=num_cores, **params):
        # ... truncated ...
        }

    @classmethod
    # ... truncated ...

automl.add_learner(learner_name='MonotonicLightGBMGBDT',
                   learner_class=MyMonotonicLightGBMGBDTClassifier)

class MyMonotonicLightGBMDartClassifier(BaseEstimator):
    def __init__(self, task='binary:logistic', n_jobs=num_cores, **params):
        # ... truncated ...
        }

    @classmethod
    # ... truncated ...

automl.add_learner(learner_name='MonotonicLightGBMDart',
                   learner_class=MyMonotonicLightGBMDartClassifier)

estimator_list = flaml_estimator_list
settings = {
    # ... truncated; the full settings dict appears in the issue body below ...
}
automl.fit(X_train=data_dev_balanced_B_X,
           y_train=data_dev_balanced_B_y,
           sample_weight=data_dev_balanced_B_w,
           # ... remaining arguments truncated ...
           )
for x in estimator_list:
    # ... truncated ...
```
This is the log file from a recent search. After that search, the optimal model from the learner "MonotonicLightGBMDart" could not be saved.
Hi @flippercy , this is tricky. I'm not sure what the root cause is. But as you've mentioned, you got an empty model (a dummy model?) instead of a trained one; the relevant code is at lines 278 to 287 in 6d53929.
Describe the bug
Hi:
I've created two customized LightGBM estimators for automl:
```python
class MyMonotonicLightGBMGBDTClassifier(BaseEstimator):
    # ... class body omitted in the report ...

automl.add_learner(learner_name='MonotonicLightGBMGBDT',
                   learner_class=MyMonotonicLightGBMGBDTClassifier)

class MyMonotonicLightGBMDartClassifier(BaseEstimator):
    # ... class body omitted in the report ...

automl.add_learner(learner_name='MonotonicLightGBMDart',
                   learner_class=MyMonotonicLightGBMDartClassifier)
```
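(Since the class bodies are omitted above, here is a minimal sketch of what such a monotonic LightGBM learner could look like, following FLAML's documented custom-estimator pattern. The class name, the tiny search space, the placeholder `MONOTONE` tuple, and the `config2params` override are illustrative assumptions, not the reporter's actual code.)

```python
from flaml import tune
from flaml.automl.model import SKLearnEstimator
from lightgbm import LGBMClassifier

# Placeholder: one +1/0/-1 entry per feature column (hypothetical values).
MONOTONE = (1, 0, -1)

class MonotonicLGBMSketch(SKLearnEstimator):
    """Illustrative monotonic LightGBM learner for FLAML (not the reporter's code)."""

    def __init__(self, task="binary", **config):
        super().__init__(task, **config)
        self.estimator_class = LGBMClassifier  # the underlying sklearn-style model

    @classmethod
    def search_space(cls, data_size, task):
        # Deliberately tiny search space, for illustration only.
        return {
            "n_estimators": {
                "domain": tune.lograndint(lower=4, upper=1024),
                "init_value": 4,
                "low_cost_init_value": 4,
            },
            "num_leaves": {
                "domain": tune.lograndint(lower=4, upper=256),
                "init_value": 4,
            },
        }

    def config2params(self, config: dict) -> dict:
        # Map the tuned config onto LGBMClassifier params and pin the
        # monotone constraints; a Dart variant would set boosting_type="dart".
        params = super().config2params(config)
        params["monotone_constraints"] = list(MONOTONE)
        params["boosting_type"] = "gbdt"
        return params
```

Registration would then follow the same pattern as above: `automl.add_learner(learner_name='MonotonicLGBMSketch', learner_class=MonotonicLGBMSketch)`.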
Then I run AutoML with these two estimators and the settings below:
```python
from flaml import AutoML
from flaml.automl.model import BaseEstimator, LRL1Classifier
from xgboost.sklearn import XGBClassifier
from lightgbm.sklearn import LGBMClassifier

estimator_list = ['MonotonicLightGBMDart', 'MonotonicLightGBMGBDT']
settings = {
    "keep_search_state": True,
    "time_budget": flaml_time_budget,
    "max_iter": 15,
    "mem_thres": flaml_mem_thres,
    "metric": 'roc_auc',
    "task": 'classification',
    "estimator_list": estimator_list,
    "log_file_name": logfilename,
    "log_type": 'all',
    "seed": randomseed,
    "model_history": True,
}
```
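(With those settings, the training call presumably looks like the sketch below; the data-frame and weight variable names are taken from the snippet earlier in the thread, and expanding `settings` with `**` is an assumption about how the dict is consumed.)

```python
# Sketch of the training call; **settings expands the dict above into
# keyword arguments of flaml.AutoML.fit.
automl.fit(
    X_train=data_dev_balanced_B_X,
    y_train=data_dev_balanced_B_y,
    sample_weight=data_dev_balanced_B_w,
    **settings,
)
```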
The process usually runs well; however, I noticed one issue: sometimes the best result of a non-optimal estimator is not saved. For example, after the search I want to retrieve the best models of both MonotonicLightGBMDart and MonotonicLightGBMGBDT. If the overall optimal model returned was built by MonotonicLightGBMDart, then sometimes the best model from MonotonicLightGBMGBDT is not saved (automl.best_model_for_estimator('MonotonicLightGBMGBDT')._model returns an empty model).
What makes me more confused is that it does not happen every time and is not always reproducible; sometimes restarting the kernel and re-running the process makes the issue disappear.
Could anyone check my codes and tell me the reason for this problem?
Thank you.
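(A possible diagnostic-and-workaround sketch, under the assumption that `best_config_per_estimator` returns a plain config dict whose keys map onto the learner's constructor; the refit step is illustrative, not a confirmed fix.)

```python
# If the per-estimator best model comes back empty, recover its best
# config and retrain that learner by hand (illustrative only).
name = 'MonotonicLightGBMGBDT'
best = automl.best_model_for_estimator(name)
if best is None or getattr(best, '_model', None) is None:
    cfg = automl.best_config_per_estimator[name]
    print(f'{name}: saved model missing, retraining with config {cfg}')
    refit = MyMonotonicLightGBMGBDTClassifier(task='binary:logistic', **cfg)
    refit.fit(data_dev_balanced_B_X, data_dev_balanced_B_y)
```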
Steps to reproduce
No response
Model Used
No response
Expected Behavior
No response
Screenshots and logs
No response
Additional Information
No response