EBM Loss functions #196
What loss functions are used for boosting the EBMs, for both regression and classification? I searched the repo and could only find `_merged_pair_score_fn`, but wasn't sure how it relates to boosting.

Additionally, how can one train the model with non-standard metrics? E.g. assuming RMSE is used, how can we use MAE, Huber, or a custom loss function? This is a major feature in other GBM/boosting packages, and having it would make EBM even more competitive.

Many thanks!! :D

Comments

Any info on the types of loss functions used to start with would be really useful. Thanks.
Hi, I am also looking to implement a custom loss function. However, after reading the source, I think the loss functions are implemented in the C++ library.
Yes. Unfortunately, I cannot read C++, so I cannot even figure out what the current loss functions are. They also don't seem to be documented anywhere. Any info or insight on this would be great.
Maybe off topic, but I think you may find this useful: they claim it beats EBM in most of their cases with a similar setting (GA2M).
Yes, as you've pointed out, the loss function code is all implemented in C++ currently. We would like to expose this in Python someday, but we're not clear at the moment how that will look or what the performance costs might be. For classification we currently use log loss, and for regression we use MSE. The EbmStats.h file contains most of these functions, but if you're not familiar with C++ it might be difficult to change them currently. -InterpretML team
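For readers unfamiliar with these losses, here is a minimal sketch (a plain NumPy illustration, not the library's actual C++ code) of the two losses named above, written out with the per-sample gradient and hessian that a gradient-boosting update would consume:

```python
# Illustrative only: log loss (classification) and MSE (regression)
# with their first and second derivatives w.r.t. the model's score.
import numpy as np

def mse_grad_hess(y_true, y_pred):
    """MSE for regression: L = 0.5 * (y_pred - y_true)**2 per sample."""
    grad = y_pred - y_true           # dL/dy_pred
    hess = np.ones_like(y_pred)      # second derivative is constant
    return grad, hess

def logloss_grad_hess(y_true, raw_score):
    """Log loss on a raw (pre-sigmoid) score, with y_true in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-raw_score))  # predicted probability
    grad = p - y_true                      # dL/draw_score
    hess = p * (1.0 - p)                   # d2L/draw_score^2
    return grad, hess
```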
Thank you! My assumption was MSE/RMSE, but it is nice to have that confirmed. Unfortunately, I cannot write C++, otherwise I would be submitting PRs for this. In my opinion, being able to write custom objective functions and custom evaluation/scoring functions in Python is really important. The other boosting libraries should have some good examples of how they do this, and I don't notice any major slowdown when using custom functions in them (these days). Thanks again for the amazing work and also the really helpful replies!!
Hi @JoshuaC3 -- We've been looking at this question in more depth over the last few days, and had a look at how XGBoost and LightGBM handle this internally. Exposing it at the Python level probably won't happen for a while, because their method relies on having an interface for accessing their internal dataset (DMatrix for XGBoost), and we don't yet have a clean separation of that concept in InterpretML. We do think, though, that in the shorter term it would be possible to make some changes in the C++ so that adding a new loss function requires replacing only a few lines of code. We continue to investigate this and will update this thread if/once that change happens; if it does, we'll also write up a description here of how to change it. -InterpretML team
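To make the mechanism being discussed concrete, the sketch below shows the pattern XGBoost uses: a custom objective in Python is a callable that receives the raw predictions and the internal DMatrix, and returns a per-sample gradient and hessian. This is an illustration of XGBoost's interface, not anything EBM supports at the time of this thread; the pseudo-Huber loss, the `delta` parameter, and the toy data are arbitrary choices for the example.

```python
# Sketch of XGBoost's custom-objective hook. The objective callable
# receives (raw predictions, DMatrix) and returns (grad, hess).
import numpy as np
import xgboost as xgb

def pseudo_huber(preds, dtrain, delta=1.0):
    # delta is a hypothetical smoothing parameter for this example
    residual = preds - dtrain.get_label()
    scale = 1.0 + (residual / delta) ** 2
    grad = residual / np.sqrt(scale)        # dL/dpred
    hess = 1.0 / (scale * np.sqrt(scale))   # d2L/dpred^2
    return grad, hess

# Toy regression data just to make the example runnable.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + rng.normal(scale=0.1, size=200)
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3}, dtrain, num_boost_round=50,
                    obj=pseudo_huber)
```

LightGBM follows the same (grad, hess) convention, which is why the point about needing a dataset abstraction like DMatrix applies to both libraries.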
That's really useful to know. I always wondered why XGB, LGB, and I think CB used their own dataset objects. I guess this is one motivation. Thank you interpret-ml!!! :D
Eagerly awaiting progress on this front.
Closing this to consolidate issues. We'll track updates regarding custom losses in the duplicate issue #281 |