[feature request] L1, L2 regularization of weights #160
Both of these are pretty easy to do:

using Flux
m = Chain(Dense(10, 5, σ), Dense(5, 2), softmax)
l1(x) = sum(abs.(x))
l2(x) = sum(x .^ 2)
sum(l1, params(m)) # Add this to your loss

We should probably add a note to the docs so it's more obvious that we have it, though :)
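To make the "add this to your loss" step concrete, here is a minimal sketch of a regularised loss, assuming the `params` and `Flux.crossentropy` API above; the `λ` weight and the `penalty`/`loss` names are just illustrative:

```julia
using Flux

m = Chain(Dense(10, 5, σ), Dense(5, 2), softmax)

l1(x) = sum(abs.(x))            # L1 (lasso) penalty on one parameter array
penalty() = sum(l1, params(m))  # summed over every parameter of the model

λ = 0.01                        # illustrative regularisation strength
loss(x, y) = Flux.crossentropy(m(x), y) + λ * penalty()
```

Swapping `l1` for `l2` (sum of squares) gives an L2 penalty instead.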
Oh, awesome! But as my Stack Overflow question suggests, I think the L1 regularization needs the soft-threshold operator to bring the coefficients to exactly zero when they are close enough.
Just skimming over this right now, but it looks like soft-threshold is a scalar function that you'd broadcast? In which case it should be trivial to define it, and broadcasting it will just work with our AD. Happy to help if you can't get that working.
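For concreteness, here is a sketch of such a scalar soft-threshold (the proximal operator of λ|x|); the `soft_threshold` name and the 0.01 value are just illustrative:

```julia
# Shrink x towards zero by λ; anything within λ of zero maps to exactly zero.
soft_threshold(x, λ) = sign(x) * max(abs(x) - λ, 0)

W = randn(5, 10)
soft_threshold.(W, 0.01)   # broadcasts elementwise over the weight array
```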
By the way, is it possible to integrate
Possibly, although it looks like an unnecessarily heavy API around
I've addressed this for now by documenting how to do it in the manual. I'm happy to help with the soft-thresholding thing as well, if I can get some more detail on it.
@SimonEnsemble
The ability to L1- or L2-regularize the weights of a neural network would be very helpful for my research.
L2-regularization is easy, but L1-regularization, I think, requires the soft-thresholding operator. I started a discussion here suggesting that simply adding the absolute values of the weights to the loss is not appropriate (it won't drive them to exactly zero), but I'm not 100% sure.
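For context, the standard way to get exact zeros is proximal gradient descent (ISTA): take a plain gradient step on the un-penalised loss, then apply the soft-threshold to the weights. Below is a rough sketch of one such update, not Flux API; the names, step size `η`, and penalty weight `λ` are illustrative:

```julia
soft_threshold(x, λ) = sign(x) * max(abs(x) - λ, 0)

# One ISTA-style (proximal gradient) update for a single weight array W:
# gradient step on the data-fit loss only, then soft-threshold the result,
# which is the step that actually sets small weights to exactly zero.
function prox_step!(W, gradW, η, λ)
    W .= soft_threshold.(W .- η .* gradW, η * λ)
    return W
end
```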
[Hopefully it's okay for users to request features; I am too eager to start using Flux.jl so I can do all of my research in Julia!]