Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add orthogonal kernel initialization for recurrent cells #1171

Closed
wants to merge 7 commits into from

Conversation

AStupidBear
Copy link
Contributor

No description provided.

src/utils.jl Outdated
See [Exact solutions to the nonlinear dynamics of learning in deep linear neural networks](http://arxiv.org/abs/1312.6120)
"""
orthogonal(dim) = Matrix(qr(randn(Float32, dim, dim)).Q)
orthogonal(dim1, dim2) = vcat([orthogonal(dim2) for n in 1:(dim1 ÷ dim2)]...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a warning or error when dim1 < dim2?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to align with some other implementations here. For example:

If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the QR decomposition of a matrix of random numbers drawn from a normal distribution. If the matrix has fewer rows than columns then the output will have orthogonal rows. Otherwise, the output will have orthogonal columns

https://www.tensorflow.org/api_docs/python/tf/keras/initializers/Orthogonal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added orthogonal_maxtrix as an internal method for orthogonal. If orthogonal_maxtrix is called orthogonal then it will conflict with the use of kernel_init and I cannot think of a better name.

@DhairyaLGandhi
Copy link
Member

Maybe checkout the way we have implemented Flux.convfilter for the higher level API too. Good to mirror and have a consistent API

@AStupidBear
Copy link
Contributor Author

@dhairyagandhi96 If we initialize recurrent cells just like what Conv does, the keyword arguments will be too long.

@DhairyaLGandhi
Copy link
Member

My point is related to the API, not the keyword. The keyword can be ideated on separately

@darsnack
Copy link
Member

This has been merged in #1496.

@darsnack darsnack closed this Jan 27, 2022
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants