GradientTreeBoost : OnlineRegression #691

Open
olivbrau opened this issue Aug 13, 2021 · 2 comments

olivbrau commented Aug 13, 2021

Is your feature request related to a problem? Please describe.
GradientTreeBoost is a powerful machine learning algorithm, but finding good parameters is difficult and painful. We have to make multiple attempts, which can be slow.
But one parameter could be analysed differently and more efficiently: ntrees (the number of trees).

Describe the solution you'd like
It would be nice to adapt the fitting method so that the caller can test the model at each iteration, comparing the evolution of the RMSE (for example) on the training dataset and the validation dataset to see the effect of ntrees and to detect when the model starts overfitting.
This would avoid testing with ntrees = 100, then ntrees = 200, etc., which is not efficient.
So, in Smile vocabulary, it consists of making GradientTreeBoost an OnlineRegression with an update method.

This mechanism could also let the caller monitor the progress of training (UI with a progress bar, etc.) and stop it if it takes too long.

@haifengl
Owner

It is more about early stopping than online learning.

@olivbrau
Author

olivbrau commented Aug 15, 2021

Yes, I was wrong, it is early stopping. Since early stopping is not possible with Gradient Tree Boost, I thought that OnlineRegression could let users build their own mechanism.
We can do this with a neural network (MLP): the user runs their own iterations. I think it is very useful. It also lets the user stop learning if something is wrong, since Gradient Tree Boost can take a long time if not carefully parametrized.
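
To make the request concrete, here is a minimal sketch of the caller-side logic being asked for: record the validation RMSE after each added tree and stop once it has not improved for a given number of iterations. The `EarlyStopping` class, its `patience` parameter, and every method name below are hypothetical and are not part of Smile. The per-tree hook into GradientTreeBoost is precisely what this issue asks for, so the sketch works on a plain array of validation errors instead.

```java
/** Hypothetical helper (not a Smile class): patience-based early stopping on a validation error. */
public final class EarlyStopping {
    private final int patience;                           // iterations tolerated without improvement
    private double bestScore = Double.POSITIVE_INFINITY;  // lowest validation error seen so far
    private int bestIteration = 0;                        // 1-based iteration (number of trees) of bestScore
    private int iteration = 0;                            // number of updates received

    public EarlyStopping(int patience) {
        if (patience < 1) {
            throw new IllegalArgumentException("patience must be >= 1");
        }
        this.patience = patience;
    }

    /** Records the validation error after one more tree; returns true when training should stop. */
    public boolean update(double validationError) {
        iteration++;
        if (validationError < bestScore) {
            bestScore = validationError;
            bestIteration = iteration;
        }
        return iteration - bestIteration >= patience;
    }

    public int bestIteration() {
        return bestIteration;
    }

    public double bestScore() {
        return bestScore;
    }

    /** Root mean squared error between true and predicted values. */
    public static double rmse(double[] truth, double[] prediction) {
        if (truth.length != prediction.length || truth.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < truth.length; i++) {
            double d = truth[i] - prediction[i];
            sum += d * d;
        }
        return Math.sqrt(sum / truth.length);
    }

    /** Replays a validation curve (entry i = error with i + 1 trees) and returns the best tree count. */
    public static int bestNumberOfTrees(double[] validationCurve, int patience) {
        EarlyStopping monitor = new EarlyStopping(patience);
        for (double error : validationCurve) {
            if (monitor.update(error)) {
                break; // no improvement for `patience` consecutive iterations
            }
        }
        return monitor.bestIteration();
    }
}
```

With a per-iteration hook, the caller would compute `rmse` on the validation set after each new tree, pass it to `update`, and stop fitting as soon as it returns `true`; `bestIteration()` then gives the ntrees value to keep. The same loop is a natural place to refresh a progress bar or to honour a cancel request from the user.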
