sent2vec

A Python/Keras program that jointly generates sentence embeddings and word embeddings. It is implemented for Turkish text only; to support other languages, modify the tokenize function.
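As a rough illustration of what a language-specific tokenize function involves, and of how the cited paper composes a sentence embedding by averaging unigram and n-gram vectors, here is a minimal sketch. It is not this repository's code: the function bodies, the `ngrams` and `sentence_embedding` helpers, the `_` bigram separator, and the toy `vectors` dictionary are all assumptions made for the example.

```python
import re


def tokenize(text):
    """Hypothetical tokenizer: Turkish-aware lowercasing, then word splitting."""
    # Plain str.lower() turns "I" into "i" and "İ" into "i" plus a combining dot,
    # which is wrong for Turkish, so map both letters explicitly first.
    text = text.replace("İ", "i").replace("I", "ı").lower()
    # \w matches Unicode letters, so ş, ğ, ü, ö, ç and ı stay inside tokens.
    return re.findall(r"\w+", text)


def ngrams(tokens, n=2):
    """Join each run of n consecutive tokens into one feature string."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_embedding(tokens, vectors, dim):
    """Average the vectors of all unigrams and bigrams found in `vectors`."""
    features = [f for f in tokens + ngrams(tokens) if f in vectors]
    if not features:
        return [0.0] * dim
    return [sum(vectors[f][k] for f in features) / len(features) for k in range(dim)]


if __name__ == "__main__":
    tokens = tokenize("Işık İstanbul'da!")
    print(tokens)
    # Toy 2-dimensional vectors, invented for the example.
    vectors = {"ışık": [1.0, 0.0], "istanbul": [0.0, 1.0], "ışık_istanbul": [1.0, 1.0]}
    print(sentence_embedding(tokens, vectors, dim=2))
```

To adapt a tokenizer like this to another language, the part most likely to change is the lowercasing step, since the `I`/`ı` and `İ`/`i` mapping is specific to Turkish.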

Status: work in progress.

Paper:

Pagliardini, Matteo, Prakhar Gupta, and Martin Jaggi. "Unsupervised learning of sentence embeddings using compositional n-gram features." Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 2018.

Original implementation, based on the FastText library:

https://github.com/epfml/sent2vec