Different Embeddings for [E1] [/E1] [E2] [/E2] tokens #30

Open
carlosrokk3r opened this issue Oct 14, 2020 · 1 comment
carlosrokk3r commented Oct 14, 2020

Hi, I just trained my model locally and checked the results against the ones in the README, and I found that they are different. I believe this is because the embeddings of the previously mentioned tokens change every time the model is instantiated. For instance, with the same phrase, if I instantiate the model and predict, the output is different from the next time I instantiate the model and predict on that same phrase.

I believe that in the __init__ method of the infer_from_trained class, the call to resize_token_embeddings() at line 83 of infer.py extends the embedding matrix to hold the 4 extra tokens, but the new embeddings are initialized randomly, and this causes the results to vary.
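
To illustrate what I mean, here is a minimal sketch (not the actual code from infer.py, just the standard transformers calls I am assuming it uses):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Add the entity marker tokens and grow the embedding matrix accordingly.
tokenizer.add_tokens(["[E1]", "[/E1]", "[E2]", "[/E2]"])
model.resize_token_embeddings(len(tokenizer))

# The last 4 rows of the embedding matrix are freshly initialized at random,
# so they differ on every run unless trained weights are loaded on top of them.
print(model.get_input_embeddings().weight[-4:])
```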

Am I understanding it correctly? Or am I mistaken? Any help would be appreciated.

plkmo (Owner) commented Nov 7, 2020

I have never encountered this issue. After resize_token_embeddings(), the trained model weights are loaded with load_state, which loads the trained embeddings, so there is no reason for them to change on every load.
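
For reference, the order of operations being described is roughly this (a sketch assuming a standard transformers model; the checkpoint file name and state_dict key are placeholders, not the exact names used in the repo):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

tokenizer.add_tokens(["[E1]", "[/E1]", "[E2]", "[/E2]"])
model.resize_token_embeddings(len(tokenizer))  # new rows are random here...

# ...but loading the saved checkpoint overwrites the whole embedding matrix,
# including the rows for the 4 added tokens, so inference is deterministic.
checkpoint = torch.load("trained_checkpoint.pth.tar", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])
model.eval()
```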
