- Large Movie Review Dataset
- Source: the IMDB dataset as provided by torchtext
- Downsampled to 2,500 examples: 1,250 for training and 1,250 for validation.
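
The downsampling step above could be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes uniform random sampling with a fixed seed, and the `reviews` list is a hypothetical stand-in for the examples loaded through torchtext.

```python
import random


def downsample_and_split(examples, total=2500, seed=0):
    """Draw `total` examples at random and split them evenly in two.

    Assumption: uniform sampling without replacement; the source does not
    say whether the real subset was stratified by label.
    """
    if len(examples) < total:
        raise ValueError("not enough examples to downsample")
    rng = random.Random(seed)
    subset = rng.sample(examples, total)
    half = total // 2
    return subset[:half], subset[half:]


# Hypothetical stand-in for the labelled reviews (label, text).
reviews = [("pos" if i % 2 == 0 else "neg", f"review text {i}") for i in range(50000)]
train, valid = downsample_and_split(reviews)
```

Fixing the seed keeps the 1,250/1,250 split reproducible across runs, which matters when comparing attacks on the same validation examples.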
- Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572, 2014.
- Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
- Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z Berkay Celik, and Ananthram Swami. The limitations of deep learning in adversarial settings. In Security and Privacy (EuroS&P), 2016 IEEE European Symposium on, pp. 372–387. IEEE, 2016.
- Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In ICLR 2014, 2014.
- Yonatan Belinkov and Yonatan Bisk. Synthetic and natural noise both break neural machine translation. In Proceedings of ICLR, 2018.
- Adversarial Examples for Natural Language Classification Problems
- HotFlip: White-Box Adversarial Examples for Text Classification
- Adversarial Examples for Evaluating Reading Comprehension Systems
- Nicolas Papernot, Patrick McDaniel, Ananthram Swami, and Richard Harang. Crafting adversarial input sequences for recurrent neural networks. In Military Communications Conference (MILCOM), 2016 IEEE, pp. 49–54. IEEE, 2016.
- Deceiving Google’s Perspective API Built for Detecting Toxic Comments
- Towards Crafting Text Adversarial Samples
- Deep Text Classification Can be Fooled
- Generating Natural Language Adversarial Examples
- Yoon Kim. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014.
- Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
- Naive Bayes
- CNN
- Bidirectional RNN
- Bidirectional LSTM
- Accuracy, precision, recall
- Loss function analysis
- Increase in loss
- Change in classifier confidence (the probability given by the sigmoid output)
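
The two loss-analysis quantities above could be computed per example roughly as follows. This sketch assumes a binary classifier trained with binary cross-entropy on a single logit; the function names and the example logits are hypothetical and only illustrate the bookkeeping.

```python
import math


def sigmoid(z):
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(prob, label, eps=1e-12):
    """Binary cross-entropy for one example (assumed loss, not confirmed by the source)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def attack_effect(logit_clean, logit_adv, label):
    """Compare the classifier output before and after an adversarial edit."""
    p_clean, p_adv = sigmoid(logit_clean), sigmoid(logit_adv)
    return {
        "loss_increment": bce_loss(p_adv, label) - bce_loss(p_clean, label),
        "confidence_change": p_adv - p_clean,
    }


# Hypothetical logits for a positive review before and after word replacement.
effect = attack_effect(logit_clean=2.0, logit_adv=-0.5, label=1)
```

A successful attack on a positive example should give a positive loss increment and a negative confidence change, as in the example above.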
- Imperceptibility analysis
- Human evaluation
- Quantitative measurement: thought vectors
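
One way to turn the thought-vector measurement above into a number is the cosine similarity between the sentence embeddings of the original and the perturbed text. The sketch below assumes that choice of metric, and the vectors are hypothetical placeholders for the output of whichever sentence encoder is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical sentence vectors for an original and an adversarial sentence.
original_vec = [0.2, 0.7, -0.1, 0.4]
adversarial_vec = [0.25, 0.65, -0.05, 0.35]
similarity = cosine_similarity(original_vec, adversarial_vec)
```

A similarity close to 1 would suggest that the replacement left the sentence meaning largely intact, which can then be cross-checked against the human evaluation.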
- Sentence error analysis
- Syntactic error: the word replacement introduces a grammatical error
- Semantic error: the meaning of the sentence changes after word replacement
- Counterfactual error: a fact stated in the sentence becomes incorrect after word replacement
- Research question, data collection, related work (adversarial learning), experiment (@Erica)
- Related work (adversarial learning for NLP x5), experiment (@Alicia)
- Related work (adversarial learning for NLP x3), evaluation, experiment, future work (@Tobey)