Weight Initialisation might be a cause for loss NaN #12

Open
xiaoxuqi-ms opened this issue Dec 27, 2017 · 1 comment

Comments

xiaoxuqi-ms commented Dec 27, 2017

With the weights initialized as in lines 70-75 of FFMWithAdag.scala, the standard deviation of z = W * X + b is roughly sqrt(mn/2). Once z is larger than about 706, exp(z) overflows and the loss becomes NaN. A proper weight initialization should make the Gaussian distribution narrower. Perhaps coef = sqrt(1/(mnk))?
[Screenshot attached: 2017-12-27, 5:54 PM]
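
For readers following the thread, here is a minimal sketch of the overflow described above. It is not taken from FFMWithAdag.scala. The logistic form `exp(z) / (1 + exp(z))`, the helper names, and `num_terms` are assumptions made for illustration, and `num_terms` is not claimed to equal any particular combination of m, n, and k. Python's `math.exp` raises on overflow, so `exp_ieee` mimics the IEEE-754 behaviour of JVM `math.exp`, which returns Infinity.

```python
import math
import sys


def exp_ieee(z: float) -> float:
    """exp(z) with IEEE-754 overflow semantics: return inf instead of raising."""
    try:
        return math.exp(z)
    except OverflowError:
        # Python raises here; JVM math.exp would return Infinity.
        return math.inf


# Largest z for which exp(z) is still a finite float64 (about 709.78).
OVERFLOW_THRESHOLD = math.log(sys.float_info.max)


def naive_logistic(z: float) -> float:
    """exp(z) / (1 + exp(z)); becomes inf / inf = nan once exp overflows."""
    e = exp_ieee(z)
    return e / (1.0 + e)


def stable_logistic(z: float) -> float:
    """Same function, rearranged so exp is only called on non-positive values."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def std_of_sum(weight_std: float, num_terms: int) -> float:
    """Std of a sum of num_terms independent zero-mean weights (inputs fixed at 1)."""
    return weight_std * math.sqrt(num_terms)


if __name__ == "__main__":
    print(OVERFLOW_THRESHOLD)          # threshold for float64 exp overflow
    print(naive_logistic(710.0))       # nan
    print(stable_logistic(710.0))      # finite
    # Scaling the weight std by sqrt(1 / num_terms) keeps std(z) near 1.
    print(std_of_sum(1.0, 10_000), std_of_sum(math.sqrt(1 / 10_000), 10_000))
```

Under these assumptions there are two independent remedies: narrow the initial weight distribution so that z stays far from the threshold, or rearrange the loss so that `exp` never receives a large positive argument.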

VinceShieh (Owner) commented

Thanks for raising this issue. Yes, proper initialization is quite important, but I don't quite get your point here. Note that k is normally set to a small number.
