This project implements a movie review sentiment (positive/negative) classifier with a CNN in TensorFlow.
- The project follows the great blog post on CNN classification.
- It largely re-implements the paper: Convolutional Neural Networks for Sentence Classification
- The best dev precision with randomly initialized word embeddings is 75.21%, while the best result of the baseline model (BiLSTM) is 68.67%.
The existing data set is the movie review data from Rotten Tomatoes, which is pretty small but convenient for tuning the model on CPUs. Adaptation to other datasets (such as SST) is under development.
- Python 2.7
- TensorFlow 1.3.0
- Numpy
- Set the hyper-parameters in `config.py`.
- Then run with the existing dataset: `python train.py`
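As a rough illustration of what setting hyper-parameters might look like, here is a minimal sketch of a configuration class. Every name and value in it (`Config`, `embedding_dim`, `filter_sizes`, and so on) is a hypothetical placeholder chosen for illustration; the actual contents of `config.py` in this project may differ.

```python
# Hypothetical sketch of a config.py; names and values are assumptions,
# not the project's actual settings.
class Config(object):
    # Model
    embedding_dim = 128          # size of each word embedding vector
    filter_sizes = (3, 4, 5)     # convolution window widths, in words
    num_filters = 100            # feature maps per window width
    dropout_keep_prob = 0.5      # probability of keeping a unit during training
    l2_reg_lambda = 0.0          # weight of the L2 penalty (0 disables it)

    # Training
    batch_size = 64
    num_epochs = 20
    learning_rate = 1e-3


if __name__ == "__main__":
    # Print the settings in a stable order for a quick sanity check.
    for name in sorted(n for n in dir(Config) if not n.startswith("_")):
        print("%s = %r" % (name, getattr(Config, name)))
```

Keeping all tunable values in one place like this means a training script only has to import a single object to pick up a changed setting.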
- Add a BiLSTM baseline model
- Add TensorBoard visualization
- Add exponential learning rate decay to improve generalization
- Initialize the embeddings with pre-trained word vectors (word2vec, GloVe)
- Add ways to prevent overfitting (L2 regularization, a higher dropout rate, ...)
- Add interactive evaluation
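The exponential learning rate decay mentioned in the list above can be sketched in plain Python. The function below mirrors the formula used by `tf.train.exponential_decay` in TensorFlow 1.x, `initial_lr * decay_rate ^ (global_step / decay_steps)`. The function name and the example values are assumptions for illustration, not this project's actual settings.

```python
def exponential_decay(initial_lr, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Return the decayed learning rate at a given training step.

    With staircase=True the exponent is truncated to an integer, so the
    rate drops in discrete jumps every `decay_steps` steps.
    """
    if staircase:
        exponent = global_step // decay_steps
    else:
        exponent = global_step / float(decay_steps)
    return initial_lr * decay_rate ** exponent


if __name__ == "__main__":
    # Illustrative values only: halve the rate every 1000 steps.
    for step in (0, 500, 1000, 2000):
        lr = exponential_decay(1e-3, step, decay_steps=1000, decay_rate=0.5)
        print("step %d: lr = %.6f" % (step, lr))
```

A smaller `decay_rate` or `decay_steps` shrinks the step size faster, which can help the model settle late in training but may also stall learning if the rate decays too early.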
- Convolutional Neural Networks for Sentence Classification
- author's Theano code
- Denny Britz's TensorFlow implementation