
[Potential NAN bug] Loss may become NAN during training #383

Open
Justobe opened this issue Jul 27, 2020 · 1 comment

Comments


Justobe commented Jul 27, 2020

Hello~

Thank you very much for sharing the code!

I tried to use my own dataset (with the same shape as MNIST) with the code. After some iterations, the training loss becomes NaN. After carefully checking the code, I found that the following line may produce NaN in the loss:

In TensorFlow-Examples/examples/2_BasicModels/logistic_regression.py:

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))

If pred contains an exact 0 (softmax can underflow to 0), tf.log(pred) evaluates to -inf, since log(0) is undefined. Wherever the corresponding entry of y is 0, the product y*tf.log(pred) is then 0 * -inf = NaN, which propagates through the sum and turns the loss into NaN.
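For concreteness, here is a minimal NumPy sketch (not the repo's code) of how an exact zero in pred turns this loss into NaN:

import numpy as np

y    = np.array([[1.0, 0.0]])   # one-hot label for class 0
pred = np.array([[1.0, 0.0]])   # softmax output with an exact zero

log_pred = np.log(pred)         # log(0) -> -inf (NumPy emits a divide warning)
print(y * log_pred)             # [[ 0. nan]]  because 0 * -inf = nan
cost = np.mean(-np.sum(y * log_pred, axis=1))
print(cost)                     # nan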

It could be fixed with either of the following changes:

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred + 1e-10), reduction_indices=1))

or

cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(tf.clip_by_value(pred,1e-10,1.0)), reduction_indices=1))
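A third option, at the cost of restructuring the example slightly, is to let TensorFlow compute the cross-entropy directly from the pre-softmax activations, which sidesteps the explicit log entirely. A sketch against the TF 1.x API, assuming logits names the activations that logistic_regression.py feeds into tf.nn.softmax:

logits = tf.matmul(x, W) + b    # pre-softmax activations
pred = tf.nn.softmax(logits)    # kept only for prediction/accuracy
# The fused op computes log-softmax in a numerically stable way,
# so it never materializes log(0).
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=logits))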

Hope to hear from you ~

Thanks in advance! : )


Justobe commented Jan 13, 2021

@aymericdamien
