Different Embeddings for [E1] [/E1] [E2] [/E2] tokens #30

carlosrokk3r · 2020-10-14T19:45:02Z

Hi, I just trained my model locally and compared its results against the ones in the README. They are different. I believe this is because the embeddings of the tokens in the title ([E1], [/E1], [E2], [/E2]) change every time the model is instantiated. For instance, if I instantiate the model and predict on a phrase, the output differs from what I get the next time I instantiate the model and predict on the same phrase.

I believe that in the __init__ method of the infer_from_trained class, the call to resize_token_embeddings() at line 83 of infer.py extends the embeddings to cover the 4 extra tokens, but the new embeddings are initialized randomly, and this causes the results to vary.
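To make the suspicion concrete, here is a minimal sketch in plain PyTorch, not the repository's code. The helper `resize_embedding` is hypothetical and only imitates the general idea of a resize step: existing rows are copied over, and the added rows get fresh random values. The toy sizes (10 original tokens, 4 added) are assumptions for illustration.

```python
import torch
import torch.nn as nn


def resize_embedding(old: nn.Embedding, new_num_tokens: int) -> nn.Embedding:
    """Hypothetical stand-in for a resize step: keep old rows, randomly init new ones."""
    new = nn.Embedding(new_num_tokens, old.embedding_dim)  # all rows start random
    with torch.no_grad():
        # Copy the existing rows; the rows past the old vocabulary size stay random
        new.weight[: old.num_embeddings] = old.weight
    return new


torch.manual_seed(0)
base = nn.Embedding(10, 4)            # toy "pretrained" table: 10 tokens, dim 4
first = resize_embedding(base, 14)    # 4 extra rows, like the 4 marker tokens
second = resize_embedding(base, 14)   # a second, independent instantiation

old_rows_match = torch.equal(first.weight[:10], second.weight[:10])
new_rows_match = torch.equal(first.weight[10:], second.weight[10:])
print("old rows identical:", old_rows_match)
print("new rows identical:", new_rows_match)
```

If nothing overwrites the added rows afterwards, each instantiation would see different vectors for the marker tokens, which would be consistent with predictions varying between runs.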

Am I understanding it correctly? Or am I mistaken? Any help would be appreciated.

plkmo · 2020-11-07T00:06:15Z

I have never encountered this issue. After resize_token_embeddings(), the trained model weights are loaded with load_state, which restores the trained embeddings, so there is no reason for them to change on every load.
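One way to check this on a given setup is sketched below in plain PyTorch. It is an illustration under assumptions, not the repository's load_state: `build_model` is a hypothetical stand-in for constructing the model and resizing its embeddings, and the checkpoint is kept in memory rather than read from disk. It shows that loading a state dict overwrites the randomly initialized rows, and that the value returned by `load_state_dict` reports any keys that were not restored.

```python
import torch
import torch.nn as nn

VOCAB, EXTRA, DIM = 10, 4, 4  # toy sizes; EXTRA stands in for the 4 marker tokens


def build_model() -> nn.Embedding:
    # Hypothetical stand-in for "instantiate the model, then resize its embeddings"
    return nn.Embedding(VOCAB + EXTRA, DIM)


# Pretend this instance was trained, and snapshot its weights as a checkpoint
trained = build_model()
checkpoint = {k: v.clone() for k, v in trained.state_dict().items()}

# A fresh instantiation starts from different random values
reloaded = build_model()
differs_before = not torch.equal(reloaded.weight, trained.weight)

# Loading the checkpoint overwrites every row, including the added ones
result = reloaded.load_state_dict(checkpoint, strict=False)
matches_after = torch.equal(reloaded.weight, trained.weight)

print("differs before load:", differs_before)
print("matches after load:", matches_after)
print("missing keys:", result.missing_keys)
```

If the embedding weight showed up under `missing_keys` for a real checkpoint, or the checkpoint was never loaded, the added rows would keep their random values, so inspecting that return value is a quick way to tell the two situations apart.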
