train loss is always 0, is this a bug? #5

Open
zhangbububu opened this issue Jul 24, 2023 · 6 comments

Comments

@zhangbububu

line 168 in solve.py:

loss = prior_loss - series_loss

train loss is always 0, is this a bug?

@maheshyadav007

> line 168 in solve.py:
> loss = prior_loss - series_loss
> train loss is always 0, is this a bug?

This is happening for me too. I think line 168 should be loss = prior_loss + series_loss.

@yyysjz1997
Contributor

> line 168 in solve.py:
> loss = prior_loss - series_loss
> train loss is always 0, is this a bug?
>
> This is happening for me too. I think line 168 should be loss = prior_loss + series_loss.

Thanks for pointing that out, it's not a bug. You can replace this with "+" to see what effect it has on the results, especially for datasets with low anomaly ratios.

@tianzhou2011
Contributor

Thank you all for bringing this to our attention. We must admit that we overlooked this aspect initially. Upon further investigation, the loss does indeed seem to evaluate to zero. However, when we ran the MSL experiment with the loss changed to 0*(prior_loss - series_loss), we observed a significant drop of over 10% in F1. Hence, either the actual training loss is not zero but rather a very small number, or our understanding of Torch is not as comprehensive as it should be. For example, if you keep the loss as prior_loss - series_loss and print prior_loss and series_loss during training, you will see that they keep updating; however, if you set the loss to 0*(prior_loss - series_loss), both losses remain unchanged.
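
A quick way to see why 0*(prior_loss - series_loss) behaves differently: scaling the loss by zero also scales every gradient to zero, so the optimizer never updates any parameters. A minimal standalone PyTorch check (a hypothetical example, not the repository's solve.py):

```python
# Hypothetical standalone check: multiplying a loss by 0 also zeroes its
# gradients, so no parameter would ever be updated during training.
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (x ** 2).sum()

(0 * loss).backward()
print(x.grad)  # tensor([0., 0.]) -- the gradients are scaled to zero along with the loss
```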

@tianzhou2011
Contributor

An updated response: the value is in fact exactly zero. However, training is still effective because of the distinct stop-gradients we assign to the two terms. We must acknowledge that this went unnoticed during our research, since we focused solely on monitoring the test F1 metrics. We sincerely thank all of you, and the other researchers who noticed this, for bringing it to our attention; it is indeed an intriguing process.
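
For anyone following along, the mechanism can be reproduced outside the repository: with a symmetric KL term and .detach() applied to opposite branches, the two loss terms are numerically identical, so their difference prints as exactly zero, yet backward() still produces non-zero gradients for each branch. A minimal sketch assuming a PyTorch-style setup (variable names are illustrative, not the ones in solve.py):

```python
# Minimal sketch (not the repository code) of a stop-gradient loss that
# evaluates to exactly zero while still producing non-zero gradients.
import torch
import torch.nn.functional as F

def kl(p, q):
    # KL(p || q) over the last dimension, averaged over the batch
    return (p * (torch.log(p + 1e-8) - torch.log(q + 1e-8))).sum(-1).mean()

logits_a = torch.randn(4, 8, requires_grad=True)  # stands in for the "series" branch
logits_b = torch.randn(4, 8, requires_grad=True)  # stands in for the "prior" branch

p = F.softmax(logits_a, dim=-1)
q = F.softmax(logits_b, dim=-1)

# Each term stops the gradient through the other branch (stop-gradient).
series_loss = kl(p, q.detach()) + kl(q.detach(), p)
prior_loss = kl(q, p.detach()) + kl(p.detach(), q)

loss = prior_loss - series_loss
print(loss.item())  # 0.0 exactly: both terms have the same forward value

loss.backward()
print(logits_a.grad.abs().sum().item())  # non-zero: gradient flows through -series_loss
print(logits_b.grad.abs().sum().item())  # non-zero: gradient flows through +prior_loss
```

This also explains the earlier observation: the printed loss is exactly zero, but the gradients pushing the two branches apart (or together) are not.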

@herozen97

Why is the loss not "prior_loss + series_loss" as depicted in equation (9) from the paper?

@herozen97

Another question: my "prior_loss" is always exactly equal to "series_loss".
