How long did you train ELECTRA-Small-OWT?
In the expected results section of README.md, you mention "OWT is the OpenWebText-trained model from above (it performs a bit worse than ELECTRA-Small due to being trained for less time and on a smaller dataset)". How many steps did you train for? And AFAIK OpenWebText should be larger than WikiBooks; does that mean you used only part of the data?
How were the scores in the expected results obtained?
You also mention "The below scores show median performance over a large number of random seeds." Does that mean the listed scores come from models pretrained from scratch with different random seeds, each fine-tuned for 10 runs with random seeds, or from one pretrained model fine-tuned for 10 runs with many random seeds?
Did you use double_unordered when training the models for the expected results?
It was trained for 1 million steps. I'm actually not sure how many epochs over the dataset that works out to, but the (public) OWT dataset is only about 50% bigger than WikiBooks, I believe.
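The epoch count can be back-of-the-envelope estimated from the step count. A minimal sketch, assuming ELECTRA-Small's published batch size and sequence length (128 each) and a placeholder token count for the corpus; none of these numbers are confirmed in this thread:

```python
def epochs_seen(steps: int, batch_size: int, seq_len: int, dataset_tokens: float) -> float:
    """Approximate number of passes over the dataset during pre-training."""
    tokens_processed = steps * batch_size * seq_len
    return tokens_processed / dataset_tokens

# 1M steps at batch 128 with 128-token sequences processes ~16.4B tokens;
# 8e9 is a hypothetical corpus size, chosen only for illustration.
print(epochs_seen(1_000_000, 128, 128, 8e9))  # ~2 epochs under these assumptions
```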
They are from the same pre-trained checkpoint, with different random seeds for fine-tuning. The number of runs was at least 10, but much higher (I think 100) for some tasks; I left the eval jobs running for a while and took the median of all the results.
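The protocol described above can be sketched as: fine-tune one checkpoint once per seed and report the median score. `finetune_and_eval` here is a hypothetical stand-in for the real fine-tuning job, not part of the ELECTRA codebase:

```python
from statistics import median

def median_score(finetune_and_eval, checkpoint: str, seeds) -> float:
    """Fine-tune `checkpoint` once per seed and return the median dev score."""
    scores = [finetune_and_eval(checkpoint, seed=seed) for seed in seeds]
    return median(scores)

# Toy usage with a fake eval that varies deterministically with the seed:
fake_eval = lambda ckpt, seed: 80.0 + (seed % 5) * 0.1
print(median_score(fake_eval, "electra_small_owt", range(10)))  # -> 80.2
```

The median (rather than the mean) makes the reported number robust to the occasional diverged fine-tuning run.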