Word embeddings file #24

Open
aniket-sen opened this issue May 24, 2018 · 3 comments

Comments

@aniket-sen

Could you share how the word embedding file was created, i.e. what procedure was used? Also, if I want to run this algorithm on my own dataset, how should I create a word embedding file for it?

@Mrlyk423
Member

See the README. The pre-trained word vectors were learned from the New York Times Annotated Corpus (LDC Data LDC2008T19), which should be obtained from LDC (https://catalog.ldc.upenn.edu/LDC2008T19). We also provide the word embedding file 'vec.bin' used in the experiments in data.zip.

@aniket-sen
Author

You didn't answer my last question: how do I create a word embedding file for my own dataset?

@ghost

ghost commented Jun 1, 2018

You can use gensim to train word vectors on your own dataset.
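
A rough sketch of what that could look like, in case it helps. It assumes gensim 4.x (where the dimension argument is `vector_size`) and that the project's loader reads the word2vec binary format, which this thread does not confirm. The helper names `tokenize_corpus` and `train_and_save`, the default dimension of 50, and the other hyperparameters are placeholders for illustration, not the settings used to produce 'vec.bin'.

```python
from typing import Iterable, List


def tokenize_corpus(lines: Iterable[str]) -> List[List[str]]:
    """Turn raw text lines into lists of lowercase whitespace tokens.

    Blank lines are skipped. Real corpora usually need a better tokenizer;
    this one is deliberately minimal.
    """
    sentences = []
    for line in lines:
        tokens = line.lower().split()
        if tokens:
            sentences.append(tokens)
    return sentences


def train_and_save(sentences: List[List[str]], out_path: str, dim: int = 50):
    """Train skip-gram/CBOW vectors with gensim and write them to out_path.

    Assumption: gensim >= 4.0 is installed. The import is kept local so that
    tokenize_corpus stays usable without gensim.
    """
    from gensim.models import Word2Vec

    model = Word2Vec(
        sentences=sentences,
        vector_size=dim,  # embedding dimension (placeholder value)
        window=5,         # context window size (placeholder value)
        min_count=1,      # keep every word; raise this for large corpora
        workers=1,
    )
    # Write the vectors in the word2vec binary format.
    model.wv.save_word2vec_format(out_path, binary=True)
    return model
```

With a plain-text corpus that has one sentence per line, something like `train_and_save(tokenize_corpus(open("my_corpus.txt", encoding="utf-8")), "my_vec.bin")` would produce the file; both file names here are hypothetical. Check the embedding dimension the code expects before swapping the file in.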
