-
-
Notifications
You must be signed in to change notification settings - Fork 613
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Add orthogonal kernel initialization for recurrent cells #1171
Conversation
src/utils.jl
Outdated
See [Exact solutions to the nonlinear dynamics of learning in deep linear neural networks](http://arxiv.org/abs/1312.6120) | ||
""" | ||
orthogonal(dim) = Matrix(qr(randn(Float32, dim, dim)).Q) | ||
orthogonal(dim1, dim2) = vcat([orthogonal(dim2) for n in 1:(dim1 ÷ dim2)]...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a warning or error when dim1 < dim2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to align with some other implementations here. For example:
If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the QR decomposition of a matrix of random numbers drawn from a normal distribution. If the matrix has fewer rows than columns then the output will have orthogonal rows. Otherwise, the output will have orthogonal columns
https://www.tensorflow.org/api_docs/python/tf/keras/initializers/Orthogonal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added orthogonal_maxtrix
as an internal method for orthogonal
. If orthogonal_maxtrix
is called orthogonal
then it will conflict with the use of kernel_init
and I cannot think of a better name.
Maybe checkout the way we have implemented Flux.convfilter for the higher level API too. Good to mirror and have a consistent API |
@dhairyagandhi96 If we initialize recurrent cells just like what |
My point is related to the API, not the keyword. The keyword can be ideated on separately |
This has been merged in #1496. |
No description provided.