This is a full implementation of data preprocessing for a CNN and a b-LSTM. Some of the code is based on Amit Mandelbaum's code.
With this code you can reproduce almost all of the baseline results, as well as the results we present in our paper.
- Python (2.7)
- NumPy
- NLTK
- Pandas
Download Google's word embeddings binary file from https://code.google.com/p/word2vec/, extract it, and place it under the data/ folder.
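
To check that the extracted embeddings file is readable, a minimal sketch like the one below may help. It is not part of this repository's code: the function name `load_word2vec_bin` and its `vocab` argument are hypothetical, and it assumes the standard word2vec binary layout (a text header with the vocabulary size and dimension, followed by each word, a space, and its float32 values). Pass it the path of whichever file you placed under data/.

```python
import numpy as np


def load_word2vec_bin(path, vocab=None):
    """Read a word2vec-format binary file into a {word: vector} dict.

    Hypothetical helper, not the repository's own loader.
    If `vocab` is given, only words contained in it are kept.
    """
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <dim>\n"
        vocab_size, dim = map(int, f.readline().split())
        row_bytes = np.dtype(np.float32).itemsize * dim
        for _ in range(vocab_size):
            # Each entry is: word bytes, one space, then `dim` float32 values.
            chars = []
            while True:
                ch = f.read(1)
                if ch == b" " or ch == b"":
                    break
                if ch != b"\n":  # entries may be separated by a newline
                    chars.append(ch)
            data = f.read(row_bytes)
            if len(data) < row_bytes:
                break  # truncated file: stop instead of storing a partial row
            word = b"".join(chars).decode("utf-8", errors="ignore")
            if vocab is None or word in vocab:
                vectors[word] = np.frombuffer(data, dtype=np.float32).copy()
    return vectors
```

Passing the set of words that occur in your dataset as `vocab` keeps memory use low, since only the matching vectors are stored.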
Most of the datasets can be downloaded from https://github.com/harvardnlp/sent-conv-torch/tree/master/data