The weights are initialized in lines 70 to 75 of FFMWithAdag.scala. With that initialization, the standard deviation of z = W * X + b is roughly sqrt(mn/2). Once z grows past about 709 (the double-precision limit for exp), exp(z) overflows and the result becomes NaN. A proper initialization should make the Gaussian distribution narrower. Perhaps coef = sqrt(1/(mnk))?
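To make the scaling argument concrete, here is a minimal Python sketch. It is not the code from FFMWithAdag.scala: it assumes a simplified linear stand-in where z is a sum of `n_terms` independent Gaussian weights (all inputs equal to 1, no bias), and the helper names `exp_overflows` and `empirical_std_of_z` are made up for illustration. Under that assumption the standard deviation of z is `weight_std * sqrt(n_terms)`, so scaling the weights by `sqrt(1 / n_terms)` keeps z near unit scale.

```python
import math
import random


def exp_overflows(z):
    """Return True if exp(z) does not fit in a double.

    Python raises OverflowError here; on the JVM, Math.exp returns
    Infinity instead, which can turn into NaN in later arithmetic.
    """
    try:
        math.exp(z)
        return False
    except OverflowError:
        return True


def empirical_std_of_z(n_terms, weight_std, n_trials=400, seed=0):
    """Estimate std(z) for z = sum of n_terms weights drawn from N(0, weight_std^2).

    Assumption for illustration only: every input x_i is 1 and there is no bias.
    """
    rng = random.Random(seed)
    samples = [
        sum(rng.gauss(0.0, weight_std) for _ in range(n_terms))
        for _ in range(n_trials)
    ]
    mean = sum(samples) / n_trials
    variance = sum((s - mean) ** 2 for s in samples) / (n_trials - 1)
    return math.sqrt(variance)


n_terms = 2500

# Unit-variance weights: std(z) grows like sqrt(n_terms).
wide = empirical_std_of_z(n_terms, weight_std=1.0)

# Weights scaled by sqrt(1 / n_terms): std(z) stays near 1.
narrow = empirical_std_of_z(n_terms, weight_std=math.sqrt(1.0 / n_terms))

print("std(z), unit-variance init:", wide)
print("std(z), scaled init:       ", narrow)
print("exp(709) overflows:", exp_overflows(709.0))
print("exp(710) overflows:", exp_overflows(710.0))
```

With unit-variance weights the spread of z grows with the square root of the number of terms, so a large enough model will eventually push z past the overflow point. The scaled variant keeps z far from it regardless of size.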
Thanks for raising this issue. Yes, proper initialization is quite important, but I don't get your point here. Note that k is normally set to a small number.