Hi!
I'm trying to move my previously working model (based on OpenNMT) over to pytorch-lightning. It's basically a BART fine-tuning model.
Here are some key callbacks:
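For context, here is a minimal sketch of what the key LightningModule callbacks typically look like for BART fine-tuning with Hugging Face `transformers`. The checkpoint name, batch field names, and default values are illustrative assumptions, not the original code:

```python
import pytorch_lightning as pl
import torch
from transformers import BartForConditionalGeneration

class BartFineTuner(pl.LightningModule):
    def __init__(self, lr=1.0, warmup_steps=4000, d_model=768):
        super().__init__()
        self.save_hyperparameters()
        # hypothetical checkpoint; any BART variant is loaded the same way
        self.model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

    def training_step(self, batch, batch_idx):
        # batch is assumed to hold tokenized ids, attention masks, and labels
        out = self.model(input_ids=batch["input_ids"],
                         attention_mask=batch["attention_mask"],
                         labels=batch["labels"])
        self.log("train_loss", out.loss)
        return out.loss

    def validation_step(self, batch, batch_idx):
        out = self.model(input_ids=batch["input_ids"],
                         attention_mask=batch["attention_mask"],
                         labels=batch["labels"])
        self.log("val_loss", out.loss)

    def configure_optimizers(self):
        optimizer = torch.optim.AdamW(self.parameters(), lr=self.hparams.lr)

        # 'noam' schedule: LambdaLR multiplies the optimizer lr by this factor,
        # so lr=1.0 above makes the factor the effective learning rate
        def noam(step):
            step = max(step, 1)
            return (self.hparams.d_model ** -0.5) * min(
                step ** -0.5, step * self.hparams.warmup_steps ** -1.5)

        scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam)
        return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```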
Here is the initialization of pytorch-lightning trainer:
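Again as a sketch only (single GPU on one node; the flags and values below are placeholders, not the actual configuration, and `gpus=1` is the single-GPU API of pytorch-lightning versions of that era):

```python
trainer = pl.Trainer(
    gpus=1,                 # one GPU on one node
    max_epochs=30,
    gradient_clip_val=1.0,  # illustrative; clipping is optional
)
# train_loader / val_loader are hypothetical DataLoaders over the toy dataset
trainer.fit(BartFineTuner(), train_loader, val_loader)
```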
and here are some of my hyper-parameters:
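The kind of settings in play, with placeholder values (these are not the actual numbers):

```python
hparams = dict(
    lr=1.0,             # base lr; scaled by the noam factor at every step
    warmup_steps=4000,
    d_model=768,        # BART-base hidden size
    batch_size=16,
    max_epochs=30,
)
```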
I was training with one GPU on one node.
The loss on a toy training dataset (1,000 examples) went down from 5 to roughly 1.7 in the first couple of epochs, then grew back to 2.5 later.
I also tried running without the 'noam' scheduler but with the same learning rate; the loss followed 9+ -> 2+ -> 6+.
I also tried decreasing the learning rate, but the same pattern appeared, just stretched over more time (more epochs).
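For reference, the 'noam' schedule from "Attention Is All You Need" warms the learning rate up linearly for `warmup_steps` steps and then decays it with the inverse square root of the step count:

$$\mathrm{lr}(s) = d_{\text{model}}^{-0.5} \cdot \min\!\left(s^{-0.5},\ s \cdot \text{warmup\_steps}^{-1.5}\right)$$

so the peak rate is reached exactly at step $s = \text{warmup\_steps}$ and depends on both $d_{\text{model}}$ and the warmup length.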
Could you please point me to some potential directions to investigate?
Cheers,
Xinnuo