intersect_word2vec_format lock-factor is not triggered when the data is not in the binary format #2918
Comments
A link to some cut & paste source code elsewhere isn't very relevant to current project source code. Does the problem exist in either current-released, or current |
That's my bad, the issue has been resolved in the current release.
Glad to hear that. FYI, |
Thank you kindly for the answer. I must admit I find it quite unusual that changing the weights of the Word2Vec model is rather difficult. While I do not have a particular interest in NLP, the Word2Vec model is also used in representation learning on graphs, such as in the DeepWalk, Node2Vec, and HARP methods. The issue arose when I was implementing the HARP method, in which one needs to initialize the model with specific weights. I hope that in the future the process of weight initialization will become more intuitive, if possible.
I'm not familiar with the HARP method, but I would highlight that because you can directly tamper with any part of the model, especially after the Shortcuts & helper methods for less-common or research-techniques, different from classic word2vec steps, could certainly be added, but require a knowledgeable champion or contributor to arrive in a clean & maintainable way. (When gensim has collected wishlist requests for certain features, but then had rookie/non-practitioner/temporary contributors implement those features, it's often been a functionality & maintenance disaster.) |
Problem description
The method intersect_word2vec_format does not update the lock-factor (lockf) when the data is not in binary format.
Steps/code/corpus to reproduce
I base my issue on reading the source code of intersect_word2vec_format: https://tedboy.github.io/nlps/_modules/gensim/models/word2vec.html#Word2Vec.intersect_word2vec_format
In the [if binary: ... else: ... ] statement the lock is triggered only in the "if" clause:
However, when the data is not stored in binary format, that is, when the default binary=False value is passed, the lock is not triggered:
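To make the reported asymmetry concrete, here is a minimal, self-contained sketch of what the text-format branch would look like if it applied the lock-factor. This is not gensim's source code: the function name `intersect_text_vectors` and the arguments `vocab_index`, `vectors`, and `lockf_array` are hypothetical stand-ins, plain Python lists replace the model's arrays, and the header line of the text format is assumed to have been consumed already.

```python
def intersect_text_vectors(lines, vocab_index, vectors, lockf_array, lockf=0.0):
    """Toy stand-in for a text-format (binary=False) intersect step.

    lines:        iterable of "word v1 v2 ..." strings (header already skipped)
    vocab_index:  dict mapping word -> row index (hypothetical structure)
    vectors:      list of per-word weight lists, overwritten in place
    lockf_array:  list of per-word lock-factors, overwritten in place
    lockf:        value stored for every overlapping word
    Returns the number of words found in both the file and the vocabulary.
    """
    overlap_count = 0
    for line_no, line in enumerate(lines):
        parts = line.rstrip().split(" ")
        word, weights = parts[0], [float(x) for x in parts[1:]]
        if word not in vocab_index:
            continue  # words missing from the vocabulary are ignored
        idx = vocab_index[word]
        if len(weights) != len(vectors[idx]):
            raise ValueError("invalid vector on line %d" % line_no)
        vectors[idx] = weights
        # The step this issue reports as missing in the non-binary branch:
        lockf_array[idx] = lockf
        overlap_count += 1
    return overlap_count


# Usage: "a" overlaps with the vocabulary, "c" does not.
vocab_index = {"a": 0, "b": 1}
vectors = [[0.0, 0.0], [0.0, 0.0]]
lockf_array = [1.0, 1.0]
count = intersect_text_vectors(
    ["a 0.5 0.25", "c 1.0 1.0"], vocab_index, vectors, lockf_array, lockf=0.0
)
```

In this sketch the vector and the lock-factor are updated together, so an imported word cannot end up with new weights but an unchanged lock.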
Versions
I base my issue solely on the source code provided in the documentation.