Acc drop significantly during the last epoch of stage1 #16
Comments
Hi, |
I tried another 3 times. The training accuracy is approximately 0.87 during the last epoch (the 60th), but the validation accuracy changes on every run and is always lower than 0.50. The validation accuracy is around 0.80 at the 55th epoch, so there seems to be a sudden drop during the last epoch. I also notice that the training loss gets slightly higher during the last epoch. |
IIRC, your repo sets the batch size to 1. If that is the case, it's not really a PyTorch bug. BN running statistics estimated with batch size = 1 are inherently unstable. |
Thanks for the suggestion! The training batch size is 6 and testing is 1. When testing, eval() mode is on and the batch size does not affect the computation. |
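To illustrate the eval() point above, here is a minimal pure-Python sketch of batch normalization on a single feature. It is not the repo's code or PyTorch's implementation; the function name batch_norm and the stored statistics 0.5 and 2.0 are made up for the example. In training mode the batch's own mean and variance are used, while in eval mode the stored running statistics are used, so a sample's output does not depend on what else is in the batch.

```python
import math

def batch_norm(batch, running_mean, running_var, training, eps=1e-5):
    """Normalize a list of scalars the way a 1-feature BN layer would."""
    if training:
        # Training mode: statistics come from the current batch.
        mean = sum(batch) / len(batch)
        var = sum((x - mean) ** 2 for x in batch) / len(batch)
    else:
        # Eval mode: statistics are the stored running estimates.
        mean, var = running_mean, running_var
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

# Hypothetical running statistics accumulated during training.
running_mean, running_var = 0.5, 2.0

# Eval mode: the same sample gives the same output at batch size 1 or 3.
alone = batch_norm([3.0], running_mean, running_var, training=False)
in_batch = batch_norm([3.0, -1.0, 7.0], running_mean, running_var, training=False)
eval_matches = alone[0] == in_batch[0]

# Training mode: the output depends on the other samples in the batch.
train_batch = batch_norm([3.0, -1.0, 7.0], running_mean, running_var, training=True)
train_differs = train_batch[0] != alone[0]
```

The flip side is that eval-mode outputs are only as good as the running estimates, which is why the training batch size still matters.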
I see. 6 is still too small though. People usually use >128 with BN.
|
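The batch-size concern can be sketched with a small pure-Python simulation. It assumes the usual exponential-moving-average update with momentum 0.1 (the PyTorch BatchNorm default) and synthetic Gaussian data; the helper names and the step and trial counts are arbitrary choices for illustration, and this does not reproduce the accuracy drop reported in this issue. It only shows that the running mean left over at the end of training is noisier when batches are small.

```python
import random
import statistics

def running_mean_after(batch_size, steps=100, momentum=0.1, seed=0):
    """BN-style running mean after `steps` batches drawn from N(0, 1)."""
    rng = random.Random(seed)
    running = 0.0
    for _ in range(steps):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        batch_mean = sum(batch) / len(batch)
        # Exponential moving average of the per-batch mean.
        running = (1 - momentum) * running + momentum * batch_mean
    return running

def spread(batch_size, trials=100):
    """Std of the final running mean across independent runs."""
    finals = [running_mean_after(batch_size, seed=s) for s in range(trials)]
    return statistics.pstdev(finals)

spread_small = spread(6)    # training batch size mentioned in this thread
spread_large = spread(128)  # batch size suggested above

print(f"batch size 6:   spread of running mean = {spread_small:.4f}")
print(f"batch size 128: spread of running mean = {spread_large:.4f}")
```

The true mean is 0 in both cases, so any spread is pure estimation noise. Since the noise of a batch mean scales roughly with 1/sqrt(batch size), the batch-size-6 estimate should be several times noisier.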
Hi all, |
Got it. Thanks. |
Oh, I'd still like this issue to stay open while we wait for better solutions... |
Sure! My bad. |
@ssnl @xingyizhou Does this bug still exist with pytorch >= 1.0? |
@wangg12 I am doing experiments to observe if the bug exists in pytorch >= 1.0. |
Do you still hit this error with pytorch >= 1.0? |
@ujsyehao Hello, may I ask how your experiments turned out? |
torch.backends.cudnn.enabled = False: if I have followed this step, I don't need to modify main.py, right? |
Hi Xingyi,
After training the 2D hourglass component for 50+ epochs, the accuracy is approximately 83%, but after the 60th epoch, the accuracy suddenly drops to 43%.
Here's the log.